While it may have a funny name, Hadoop offers the key elements that make an open-source software framework a powerhouse for those who need speed, strength and dependability. Here's a breakdown of its key features.
Origins of Apache and Hadoop
The Apache Software Foundation is a nonprofit organization set up to support open-source software projects for the common good. The foundation hosts over 350 open-source projects and initiatives that provide different technological solutions, and Hadoop is one of them. Hadoop is a Java-based programming framework geared to support the storage of extremely large data sets in a distributed computing environment.
Hadoop was created by Doug Cutting and Mike Cafarella in 2005. Their aim was to return web search results faster by distributing data and calculations across different computers so that multiple tasks could be carried out simultaneously. Another search engine project, Google, was also in progress during this period, and both were based on the same concept: storing and processing data in a distributed way so that relevant web search results could be returned faster.
In 2006, Cutting joined Yahoo and took with him Nutch, the open-source web search engine project he had been building with Cafarella, as well as the ideas based on Google’s early work on automating distributed data storage and processing. The Nutch project was divided into two parts: the web crawler portion remained Nutch, while the distributed computing and processing portion became Hadoop, named after a toy elephant that belonged to Cutting’s son. In 2008, Yahoo released Hadoop as an open-source project. Today, Hadoop’s framework and ecosystem of technologies are maintained by the nonprofit Apache Software Foundation (ASF), a global community of software developers and contributors.
Hadoop is known for the power of its distributed processing: it can handle vast volumes of structured and unstructured data more efficiently than the traditional enterprise data warehouse. This means that the initial cost savings with Hadoop can be dramatic, and the system can continue to grow as the organization's data grows.
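To make the idea of distributed processing more concrete, here is a minimal sketch in Python. It is not Hadoop code and does not use any Hadoop API. It only illustrates the general pattern described above: split the data into chunks, let several workers process the chunks at the same time, then combine the partial results. The function names (count_words, split_into_chunks, distributed_word_count) are made up for this illustration, and threads on one machine stand in for the separate computers of a real cluster.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk):
    """Count words in one chunk of lines (the work one worker would do)."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def split_into_chunks(lines, n_chunks):
    """Split the input into roughly equal chunks, about one per worker."""
    size = max(1, -(-len(lines) // n_chunks))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def distributed_word_count(lines, n_workers=3):
    """Process chunks in parallel, then merge the partial counts."""
    chunks = split_into_chunks(lines, n_workers)
    # Threads stand in for cluster nodes in this sketch (an assumption).
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = list(pool.map(count_words, chunks))
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


if __name__ == "__main__":
    sample = ["big data big", "data grows", "big clusters"]
    print(distributed_word_count(sample, n_workers=2))
```

Because each chunk is counted independently, adding more workers lets the same approach keep up as the data grows, which is the property the paragraph above attributes to Hadoop.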
Four Essential Components
The major features that interest developers in a platform include flexibility, scalability and the ability to process huge amounts of data across multiple applications. Hadoop has all of these, which has made the platform a great asset to coders. The fact that data can be retrieved and revised easily in the event of an error also endears the platform to many. This is unlike other open-source platforms, which can struggle to run processes and come with long, complicated setup procedures. With Hadoop, the idea is to make programming accessible and enjoyable even to novices. The fact that the platform is also affordable is just icing on the cake.