Wednesday, February 27, 2013

Open Source Hadoop Book

Apache Hadoop is a well-known open-source software framework that supports data-intensive distributed applications and is licensed under the Apache License, Version 2.0. The Apache™ Hadoop® project develops open-source software for reliable, scalable, distributed computing.

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than relying on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, thereby delivering a highly available service on top of a cluster of computers, each of which may be prone to failure.
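To give a feel for what a "simple programming model" can look like, here is a minimal sketch of the map, shuffle, and reduce steps behind MapReduce, the model most commonly associated with Hadoop. It runs in a single Python process and does not use any Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration, and a real cluster would run the map and reduce steps on many machines in parallel.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all counts collected for one word into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over in-memory lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["Hadoop scales out", "hadoop handles failures"]))
```

Because each call to map_phase looks at only one line and each call to reduce_phase looks at only one key, the work can in principle be split across machines, which is the property the paragraph above describes.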

To make learning Hadoop easier, Mark Kerzner and Sujee Maniyam have released an open-source Hadoop book for Hadoop fans. The book is titled "Hadoop Illuminated" and can be read in multi-page and single-page formats.

The authors want to make learning about Hadoop and its ecosystem fun and engaging. The book is accompanied by its own project on GitHub. It is a work in progress, currently in the alpha stage, and the authors will keep updating and adding to it.

Feedback from all stakeholders is welcome.
