If a study claims that Spark is eating Hadoop, you should hardly believe it. That sounds like saying SQL is eating RDBMSes or HEMIs are eating trucks. Hadoop ETL developers would point out that Spark is one more execution engine on an overall platform built from several tools and parts.

Major Hadoop vendors say Hadoop is not an EDW (Enterprise Data Warehouse) solution, nor is it an alternative to EDW solutions. They say this because they wish to co-sell with EDW vendors.


Hadoop is not quite as efficient at running SQL over large data sets. Yet the gap is closing: the latest Hive on Tez does the job faster than Hive used to over text files with MapReduce. At the same time, Spark blows almost everything else in the Hadoop ecosystem out of the water. The new ‘in-memory’ world is cheaper than it was yesterday.

You cannot treat Hadoop as another Teradata or Netezza. A modern Hadoop distribution is different. It’s a distributed execution platform on which several tools are available to the development team.

It is cheaper by the cluster, which means the buy-in is low. You can deploy on Amazon with zero capital investment, or on your preferred commodity hardware for a small investment.

Spark and Tez are narrowing the performance gap with each passing day. Many companies have already replaced Teradata or Netezza with Hadoop. Others keep Teradata or Netezza for older projects that were designed for those technologies, and use Hadoop and/or Spark for new projects with good results. These new projects sometimes involve data streams or social media, and sometimes they are run-of-the-mill BI projects.

Hadoop is now the first choice for many companies. Teradata and Netezza were too pricey for many small IT companies and mid-size firms; with Hadoop, they can use Tableau or Pentaho to plumb their data lake.

Data is growing at pace, and it is clear that companies are shifting to Hadoop and hiring Hadoop developers to get the best results and analysis reports.

