What is Hadoop?
Hadoop is an open-source software framework for storing data and running applications on clusters of commodity hardware. It provides massive storage for any kind of data, enormous processing power and the ability to handle virtually limitless concurrent tasks or jobs.
Let’s take a look at what some of those terms mean.
- Open-source software. Open-source software is created and maintained by a network of developers from around the globe. It’s free to download, use and contribute to, though more and more commercial versions of Hadoop are becoming available.
- Framework. In this case, it means that everything you need to develop and run software applications is provided – programs, connections, etc.
- Massive storage. The Hadoop framework breaks big data into blocks, which are stored on clusters of commodity hardware.
- Processing power. Hadoop concurrently processes large amounts of data using multiple low-cost computers for fast results.
What are the benefits of Hadoop?
One of the top reasons that organizations turn to Hadoop is its ability to store and process huge amounts of data – any kind of data – quickly. With data volumes and varieties constantly increasing, especially from social media and the Internet of Things, that’s a key consideration. Other benefits include:
- Computing power. Its distributed computing model quickly processes big data. The more computing nodes you use, the more processing power you have.
- Flexibility. Unlike traditional relational databases, you don’t have to preprocess data before storing it. You can store as much data as you want and decide how to use it later. That includes unstructured data like text, images and videos.
- Fault tolerance. Data and application processing are protected against hardware failure. If a node goes down, jobs are automatically redirected to other nodes to make sure the distributed computing does not fail. And it automatically stores multiple copies of all data.
- Low cost. The open-source framework is free and uses commodity hardware to store large quantities of data.
- Scalability. You can easily grow your system simply by adding more nodes. Little administration is required. Read more about HADOOP.
FUN FACT: “Hadoop” was the name of a yellow toy elephant owned by the son of one of its inventors.