Apache
Hadoop is a well known name with great utility. Apache
Hadoop is an open-source software framework that supports
data-intensive distributed applications, licensed under the Apache v2
license. The Apache™
Hadoop® project develops open-source software for
reliable, scalable, distributed computing.
The Apache Hadoop software library is a framework
that allows for the distributed processing of large data sets across
clusters of computers using simple programming models. It is designed
to scale up from single servers to thousands of machines, each
offering local computation and storage. Rather than rely on hardware
to deliver high-availability, the library itself is designed to
detect and handle failures at the application layer, so delivering a
highly-available service on top of a cluster of computers, each of
which may be prone to failures.
To make this experience even better, Mark
Kerzner and Sujee Maniyam have released an open source
Hadoop book for Hadoop fans. The book is titled "Hadoop
illuminated” and it can be read in multi page and single
page formats.
The authors want to make learning about Hadoop and
its ecosystem fun and engaging. The book is accompanied by its
project on GitHub. The book is work in progress and is in alpha
stage. The authors will be updating and adding to it.
The feedback of various stakeholders is welcome.