Hadoop 1 vs Hadoop 2 vs Hadoop 3

Hadoop 3, released in 2017, brought notable changes to how clusters store data and manage resources. The Hadoop framework is at the core of the entire Hadoop ecosystem, and many other libraries depend on it. The table below describes the differences between Hadoop 1.x, Hadoop 2.x and Hadoop 3.x.

| Hadoop 1.x | Hadoop 2.x | Hadoop 3.x |
| --- | --- | --- |
| Limited to 4000 nodes per cluster | Supports up to 10000 nodes per cluster | Supports more than 10000 nodes per cluster |
| Does not support Microsoft Windows | Added support for Microsoft Windows | Supports Microsoft Windows |
| Works on the concept of slots | Works on the concept of containers | Works on the concept of containers |
| No standby NameNode | Supports only one standby NameNode | Supports two or more standby NameNodes |
| Has a single point of failure, the NameNode | Has a standby NameNode to overcome the single point of failure, so it recovers automatically when the NameNode fails | Has features to overcome the single point of failure, so it recovers automatically when the NameNode fails |
| No backward compatibility needed; MapReduce API programs run without any additional files | MapReduce API is compatible, so Hadoop 1.x programs can run on Hadoop 2.x | MapReduce API remains compatible, so Hadoop 1.x programs can run on Hadoop 3.x |
| Data processing was a problem because MapReduce also had to handle resource management | YARN (Yet Another Resource Negotiator) provides a central resource manager that lets multiple applications share common cluster resources, but it has some scalability issues | A newer version of YARN makes better use of resources and improves the scalability and reliability of the Timeline Service |
| Multi-tenancy is not supported | Multi-tenancy was introduced | Multi-tenancy is supported |
| Replication factor is 3 | Uses a 3x replication scheme | Default replication factor is still 3; erasure coding is available as an alternative to replication |
| Fault tolerance is provided via replication | Fault tolerance is also provided via replication | Fault tolerance can also be provided via erasure coding |
| Default HDFS block size is 64 MB | Default HDFS block size is 128 MB | Default HDFS block size is 128 MB |
| Released in December 2011 (1.0.0) | Alpha in 2012, generally available in 2013 | Alpha in 2016, generally available in 2017 |
| HDFS has 200% storage overhead (3x replication) | HDFS has 200% storage overhead (3x replication) | HDFS has only about 50% storage overhead when erasure coding is used |
| MapReduce performed all tasks | HDFS balancer is used for data balancing | Intra-DataNode balancer, invoked via the HDFS disk balancer CLI, is used for data balancing |
| Manual intervention is needed | Manual intervention is needed | Manual intervention is not needed |
| The user needs to configure HADOOP_HEAPSIZE | The user needs to configure HADOOP_HEAPSIZE | Provides auto-tuning of the heap |
| Licensed under the Apache 2.0 License | Licensed under the Apache 2.0 License | Licensed under the Apache 2.0 License |
| Supports only one namespace per cluster for managing the HDFS filesystem | Supports multiple namespaces per cluster for managing the HDFS filesystem | Supports multiple namespaces per cluster for managing the HDFS filesystem |
| Supports only one programming model, MapReduce | Supports multiple programming models through YARN, such as MapReduce, Hive, Pig, Giraph, HBase, and other Hadoop tools | Supports multiple programming models through YARN, such as MapReduce, Hive, Pig, Tez, Hama, Giraph, HBase, Spark, Storm, and other Hadoop tools |
| Some default ports lie within the ephemeral port range | Some default ports lie within the ephemeral port range | No default port lies within the ephemeral port range |
| Supported file systems are HDFS (default) and the FTP file system | Supported file systems are HDFS (default), the FTP file system, the Amazon S3 file system, and the Windows Azure Storage Blobs file system | Supported file systems are HDFS (default), the FTP file system, the Amazon S3 file system, the Windows Azure Storage Blobs file system, and the Microsoft Azure Data Lake file system |
| DataNode resources are dedicated to MapReduce | DataNode resources are not dedicated to MapReduce and can be used by other applications | DataNode resources are not dedicated to MapReduce and can be used by other applications |
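
The 200% and 50% storage overhead figures in the table follow from simple arithmetic, which the short Python sketch below works through. It is only an illustration: the function name `storage_overhead` is made up for this example, and the erasure coding layout of 6 data blocks plus 3 parity blocks (Reed-Solomon 6+3) is an assumption, since the table does not name a specific policy.

```python
def storage_overhead(data_units: int, redundancy_units: int) -> float:
    """Return the extra storage needed, as a fraction of the original data size."""
    return redundancy_units / data_units


# 3x replication: every block is stored once plus two extra copies.
replication_overhead = storage_overhead(data_units=1, redundancy_units=2)

# Erasure coding, assuming a Reed-Solomon layout of 6 data blocks + 3 parity blocks.
# (Assumed policy for illustration; the table above does not specify one.)
erasure_overhead = storage_overhead(data_units=6, redundancy_units=3)

print(f"3x replication overhead: {replication_overhead:.0%}")
print(f"RS(6,3) erasure coding overhead: {erasure_overhead:.0%}")
```

Under these assumptions the script prints 200% for replication and 50% for erasure coding, matching the table. The trade-off is that a block lost under erasure coding has to be rebuilt from the surviving data and parity blocks, which costs more CPU and network traffic than simply reading another replica.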
