Saturday, October 5, 2013

Apache Hadoop – Introduction

Today’s world is obsessed with data. Data has grown to such an extent that handling it has become a major challenge for everyone. This large-scale data is known as ‘Big Data’.
What is ‘Big Data’?
Big Data is a term applied to data sets whose size is beyond the ability of commonly used software tools to capture, manage, and process within a tolerable elapsed time. Big Data typically refers to data whose size is in petabytes or more (1 PB = 1,000 TB).
Internet giants like Facebook, Yahoo, Google, Amazon, and eBay all struggle to handle Big Data, and the well-known traditional solutions are very expensive. To address this problem, Yahoo invested heavily in Hadoop. Hadoop is an open-source framework hosted by the Apache Software Foundation (http://hadoop.apache.org), so it is also known as Apache Hadoop. It was created by Doug Cutting.
Features of Hadoop
  • Accessible – Hadoop runs on large clusters of commodity machines or on cloud computing services such as Amazon’s Elastic Compute Cloud (EC2).
  • Robust – Because it is intended to run on commodity hardware, Hadoop is architected with the assumption of frequent hardware malfunctions. It can gracefully handle most such failures.
  • Scalable – Hadoop scales linearly to handle larger data by adding more nodes to the cluster.
  • Simple – Hadoop allows users to quickly write efficient parallel code.
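To give a feel for the “simple parallel code” point above, here is a minimal sketch of the MapReduce word-count idea in plain Java. Note this uses only standard Java streams to imitate the map (emit word pairs) and reduce (sum per word) phases; it is an illustration of the programming model, not the actual Hadoop `Mapper`/`Reducer` API, and the class and method names are my own.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.stream.Collectors;

public class WordCountSketch {

    // Map phase: split each input line into words.
    // Reduce phase: group identical words and count them.
    public static Map<String, Long> wordCount(String[] lines) {
        return Arrays.stream(lines)
                .parallel() // loose analogue of Hadoop spreading work across nodes
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\s+"))) // "map"
                .filter(word -> !word.isEmpty())
                .collect(Collectors.groupingBy(w -> w, Collectors.counting()));   // "reduce"
    }

    public static void main(String[] args) {
        String[] lines = { "big data big challenge", "hadoop handles big data" };
        System.out.println(wordCount(lines)); // e.g. {big=3, data=2, hadoop=1, ...}
    }
}
```

In real Hadoop, the map and reduce functions run on different machines and the framework handles shuffling intermediate (word, count) pairs between them; the stream pipeline above collapses all of that into one process.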

