Spicule - Data Processing Experts

What will be the impact of the Cloudera / Hortonworks merger?

It’s been a couple of weeks since Cloudera and Hortonworks announced their humungous $5.2bn merger, a titanic double-teaming that some commentators have already described as being like Coke buying up Pepsi. But what will this merger mean for Hadoop, open source and the future of smaller software vendors? We thought we’d share our predictions.

First, let’s start with some background…

Is Cloudera/Hortonworks a merger of equals?

Cloudera and Hortonworks certainly think so, at least according to their initial announcement. But the reality is that Cloudera’s shareholders will own 60 percent of the merged company and take an extra seat on the Board, which means Cloudera will ultimately be running the show whenever major decisions are made. It’s also notable that the new company will be known solely as Cloudera, even though Hortonworks’ CTO Scott Gnau has said the two companies are complementary and will continue to build upon each other’s strengths.1

In their press release, the new Cloudera says it “will create the world’s leading next generation data provider, spanning multi-cloud, on-premises and the Edge. The combination establishes the industry standard for hybrid cloud data management, accelerating customer adoption, community development and partner engagement.”2

That all sounds positive, but is it really going to be so easy to achieve? After all, even though vast chunks of Cloudera and Hortonworks are aligned, they each have key differentiators. How will their first release of a unified stack look? Will customers be happy to migrate? And how will Cloudera and Hortonworks’ different management tools come together to form a single operational platform?

Here’s something else to consider: if there’s a consolidation of the Hadoop market, how will smaller vendors like MapR respond?

The MapR perspective

Even though MapR competes with Cloudera and Hortonworks in the Apache Hadoop market, MapR’s CEO John Schroeder believes MapR have “run a different play than Hortonworks and Cloudera” and doesn’t see how the merger is going to result in better value for customers. He’s also sceptical about the clash of cultures resulting from the new merger. “Hortonworks and Cloudera used to be enemies,” Schroeder said during a recent interview. “There could be religious wars inside the (new) company.”3

Although it’s obvious that MapR will need new innovations to differentiate itself from Cloudera and its other competitors, Cloudera/Hortonworks could have a problem as well. According to one report, they are already a similar size to Amazon EMR in terms of users. Will they be able to continue growing, driving adoption and keeping their investors satisfied?

Is Hadoop even necessary?

A few years ago, everyone wanted to climb aboard the Hadoop bandwagon. It was a trendy thing to do. But, now, many businesses accept that other systems can meet their needs just as well, and sometimes far better.

Unstructured data will always have to be processed efficiently, but that processing doesn’t have to be Hadoop-based. There are many alternatives available, including Python tools and the Spark processing framework. We work with a lot of unstructured data at JPL and it doesn’t involve Hadoop.
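To give a flavour of what "Python tools" can mean in practice, here is a minimal sketch that pulls the most frequent terms out of free-text log lines using nothing but the standard library. The function name `top_terms` and the sample log lines are our own inventions for illustration; this is not code from any particular project.

```python
import re
from collections import Counter


def top_terms(lines, n=3):
    """Return the n most common lowercase words across an iterable of text lines.

    `lines` can be a list or an open file handle, so the input is read
    one line at a time rather than loaded into memory in one go.
    """
    counts = Counter()
    for line in lines:
        # Treat any run of letters or apostrophes as a word.
        counts.update(re.findall(r"[a-z']+", line.lower()))
    return counts.most_common(n)


# Hypothetical sample of unstructured log text.
sample = [
    "Disk full on node-3, disk check failed",
    "Node-7 restarted after disk error",
]

print(top_terms(sample, 2))
```

For data that fits on one machine, a script like this may be all that is needed. Larger volumes are where a framework such as Spark would typically come in.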

We’re not saying that Hadoop will go away, but many businesses are realising they don’t need to run a massive server 24/7 to get the results they’re looking for. If they only process data at certain times, it’s much more efficient and cost-effective to spin up components as and when they need them.

That’s why we developed the ANSSR platform in collaboration with Canonical, so that companies can comfortably handle large amounts of data but only pay for what they use. Spinning up, processing and then shutting down is a far better (and significantly cheaper) strategy than using a full-stack vendor to run large Hadoop clusters 24/7.
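The arithmetic behind "only pay for what you use" is easy to sketch. The example below compares an always-on cluster with one that only runs while jobs are processing. Every number in it (the hourly rate, the node count, the four hours of processing per day) is a made-up assumption for illustration, not ANSSR pricing or any vendor's actual rates.

```python
def always_on_cost(hourly_rate, nodes, days=30):
    """Cost of keeping every node running 24 hours a day."""
    return hourly_rate * nodes * 24 * days


def on_demand_cost(hourly_rate, nodes, job_hours_per_day, days=30):
    """Cost when nodes only run while jobs are being processed."""
    return hourly_rate * nodes * job_hours_per_day * days


# Hypothetical figures, chosen only to make the sums easy to follow.
HOURLY_RATE = 0.50   # cost per node per hour
NODES = 10           # nodes in the cluster
JOB_HOURS = 4        # hours of actual processing per day

always_on = always_on_cost(HOURLY_RATE, NODES)
on_demand = on_demand_cost(HOURLY_RATE, NODES, JOB_HOURS)
saving = 1 - on_demand / always_on

print(f"Always on: {always_on:.2f}")
print(f"On demand: {on_demand:.2f}")
print(f"Saving:    {saving:.0%}")
```

Under these assumptions the on-demand approach costs a sixth of the always-on bill. Real savings depend on how long jobs run and on costs this sketch ignores, such as storage and start-up time.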

Adoption is driven by innovation

Even though software companies are investing millions of pounds and dollars (although other currencies are available!) into their sales pipelines, that doesn’t mean innovation is making it into the marketplace. But, if customers are going to adopt, innovation is important. That could be problematic for Hadoop in more ways than one.

Hadoop is an Apache Software Foundation project and, so far, Hortonworks have been particularly good at pushing patches, new software etc. back upstream. However, Cloudera don’t tend to be quite so forthcoming. Now that they have 60 percent of the merged business, there’s the possibility that Cloudera could keep their Hadoop innovations closed source so that competitors like EMR can’t share the improvements, which would have a knock-on effect on all the other Apache software committers. Apache Cassandra proved there are ways around this obstacle, but if the new Cloudera makes a business decision not to upstream changes, there will inevitably be reverberations in the open source world.

Final thoughts

The merger between Cloudera and Hortonworks isn’t as daunting as it seems. In fact, it could be enormously beneficial to smaller vendors. Why? Because a lot of businesses will prefer to swerve big corporations like Cloudera and, instead, work with smaller vendors who offer more choice and innovation. The larger the corporation, the slower it moves, whereas smaller vendors (like ourselves) have flexibility, creativity and faster response times on our side… without all the corporate red tape. Our new ANSSR platform is just one indicator of the exciting future that lies ahead.

At Spicule, we offer solutions that aren’t directly related to Cloudera/Hortonworks and we also do things differently to EMR. We can still run your data in private cloud or on physical infrastructure, but we believe there has to be a sweet spot between what EMR and GCP (Google Cloud Platform) offer and what more traditional companies like Cloudera and Hortonworks can give you. That’s the middle ground we fill, and it offers our clients the best of all worlds.

If you’d like to find out more, or if you’re interested in using Hadoop or any of the other unstructured data processing methodologies Spicule provides, give us a call on 01603 327762 or email info@spicule.co.uk. We’d love to discuss the possibilities with you.

1 https://www.computerweekly.com/news/252450674/Hortonworks-CTO-weighs-in-on-Cloudera-merger

2 https://seekingalpha.com/article/4211723-hold-cloudera-hortonworks

3 https://digitizingpolaris.com/mapr-ceo-sees-opportunity-in-cloudera-hortonworks-merger-7acd34ded053

