Hadoop and Big Data Solutions

Hadoop and Big Data solutions open up a wide range of alternatives for handling big data. Although Cloudera released the first commercial package based on Hadoop, today there is a very large number of commercial distributions that seek to simplify its configuration and installation.

As a guide, a recent Analyze Future report titled “Hadoop: Industry growth trends and forecasts to 2020” names the following ten companies as the leading players in the Hadoop market:

  • Amazon Web Services
  • Cisco Systems
  • Cloudera Inc
  • Datameer, Inc
  • Hortonworks, Inc
  • Karmasphere, Inc.
  • MapR Technologies
  • Pentaho Corporation
  • Teradata Corporation
  • MarkLogic

Even so, the landscape is constantly changing. According to the same report, the number of Hadoop distributors is expected to increase and, with it, the range of bundled software on offer. In fact, this trend is already under way: alongside these companies, smaller providers are constantly emerging and gaining ground thanks to their agility, forcing the large companies to innovate.

Specifically, the Hadoop packaged software market is expected to register a compound annual growth rate of 62.9% over the period analyzed, between 2013 and 2020.
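
To get a feel for what a compound annual rate of that size implies, the short sketch below compounds it over the forecast period. The function name and the choice of 2013 as the base year (giving seven compounding periods) are assumptions made for illustration; the report's own base year and methodology are not stated here.

```python
def compound_growth_multiple(annual_rate: float, periods: int) -> float:
    """Return the total growth multiple after compounding annual_rate over periods."""
    return (1.0 + annual_rate) ** periods


# Rate quoted from the report: 62.9% per year.
CAGR = 0.629

# Assumption: 2013 is the base year, so 2013 to 2020 gives 7 compounding periods.
PERIODS = 2020 - 2013

multiple = compound_growth_multiple(CAGR, PERIODS)
print(f"Market size multiple after {PERIODS} years: {multiple:.1f}x")
```

Under these assumptions, the market at the end of the period would be roughly thirty times its starting size, which shows how quickly a rate above 60% compounds.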

Distributors help their customers manage data through Hadoop, open source software that can categorize and analyze large volumes of Big Data. In short, the aim is to simplify data analysis by adding value to the original Apache Hadoop framework, the common base they all share.

The Hadoop ecosystem

Although it goes by a single name, Hadoop is actually a family of open source technologies overseen by the Apache Software Foundation, so some of its products can be combined in various ways and turn up in commercial packages.

According to Philip Russom, director of data management research at The Data Warehousing Institute, the Hadoop library includes, “in order of BI priority: Hadoop Distributed File System (HDFS), MapReduce, Pig, Hive, HBase, HCatalog, Ambari, Mahout, Flume, among others”.

In addition, the Hadoop community is constantly creating new projects. Although these can be combined in several ways, in the expert's opinion a practical technology stack would be HDFS and MapReduce (perhaps with Pig, Hive and HBase) for business intelligence (BI), data warehousing (DW), data integration (DI) and advanced analytics applications.

Apache Hadoop or a commercial distribution?

The comparative advantages among distributors lie in their different approaches to implementation and ease of administration, although open source BI solutions can meet business needs very well, according to a Forrester Research report.

Each distribution is therefore different, yet they all share the same core, although some vendors offer their own MapReduce applications. Thus, alongside the new generations of tools and the different commercial options available, Apache Hadoop remains open to anyone who wants to use it to store and process large amounts of disparate data.

The fact that Apache Hadoop is both freely available as open source and offered through vendors raises the inevitable question of which option is more suitable. Besides comparing the technological alternatives before choosing, which is key, you need to pay attention to the economic cost, whether administrative tools are included and needed, and equally decisive aspects such as maintenance and technical support.
