BigQuery, increasingly attractive to Hadoop users


Google has announced improvements to BigQuery, the big data management tool the Internet giant launched in 2010 as a service capable of analyzing large datasets both inside and outside the Apache Hadoop ecosystem. With these substantial updates, the cloud service strengthens its standalone operation and seeks to attract users of Hadoop, the current leader in big data analytics.

While the second generation of Hadoop tries to overcome weaknesses such as slow performance and complexity while reinforcing its well-known strengths, Google is steering BigQuery's development toward commercializing the service as an alternative to it.

Although BigQuery is compatible with Hadoop, and both products were created directly or indirectly by Google, their paths no longer seem destined to cross. Where they have overlapped so far, Google now appears intent on separating them further to strengthen its competitive position on all fronts, including its rivalry with AWS Kinesis.

In reality, BigQuery aims to be a viable alternative to the open source option represented by MapReduce and the Hadoop Distributed File System (HDFS). With this major update, which among other improvements makes it possible to combine query results from multiple data tables, Google intends to exploit the speed and real-time analysis provided by Dremel, the product on which BigQuery's design is based.

Cloud data analysis

Conceived as a service for fast queries in the cloud once the user has submitted data to Google through the BigQuery API, the update keeps its focus on SQL-style queries. The new version adds several capabilities, most notably the aforementioned ability to join data from multiple tables in a single query via a new JOIN clause, with no limit on data size.

Until now, BigQuery could only handle data groups of up to 8 MB. The release also adds features to import timestamps from other systems, query datetime data, add columns to existing tables, and receive automatic email notifications when users are granted access to new datasets.
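The announcement does not include sample queries, so the following is purely an illustrative sketch of what joining data from multiple tables in a single query looks like. The table and column names are hypothetical, and Python's built-in sqlite3 stands in for BigQuery only to make the example runnable; BigQuery's own SQL dialect and scale differ.

```python
import sqlite3

# Hypothetical tables standing in for two datasets; sqlite3 is used
# only to make the JOIN runnable locally, not as BigQuery's actual API.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE visits (user_id INTEGER, page TEXT);
    CREATE TABLE users  (user_id INTEGER, country TEXT);
    INSERT INTO visits VALUES (1, '/home'), (1, '/docs'), (2, '/home');
    INSERT INTO users  VALUES (1, 'ES'), (2, 'US');
""")

# One query combining results from multiple tables via a JOIN clause --
# the kind of operation the BigQuery update now supports at scale.
rows = conn.execute("""
    SELECT u.country, COUNT(*) AS page_views
    FROM visits AS v
    JOIN users  AS u ON v.user_id = u.user_id
    GROUP BY u.country
    ORDER BY u.country
""").fetchall()

print(rows)  # [('ES', 2), ('US', 1)]
```

In BigQuery itself, the same style of query would be submitted through the BigQuery API rather than a local database connection.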

In the words of product manager Ju-kay Kwek, the changes translate into more speed, simplicity, and ease of use:

Today, with BigQuery, business insights can be obtained directly through SQL-like queries, with less effort and at far greater speed than was previously feasible. Joining terabyte-scale tables has traditionally been a difficult task for analysts, since until now it required sophisticated MapReduce development skills, powerful hardware, and a great deal of time.

Its use is entirely independent of the yellow-elephant framework, and Google counts dispensing with Hadoop as one more advantage of the product. Google argues that instead of installing Hadoop, using BigQuery saves money: customers pay only per query rather than bearing the IT cost of the infrastructure required to run Hadoop. Even so, Hadoop was itself created from technologies such as MapReduce and the Google File System to process large amounts of data at very low cost.

Microsoft SQL and Hadoop technology

For its part, Microsoft has recently presented its cloud big data solutions aimed at the Internet of Things. Built on a single platform for data management and analysis, its use of Hadoop is part of its major innovations: a faster SQL Server 2014, the Intelligent Systems Service (ISS), and the Analytics Platform System (APS).

The latest version of APS keeps costs low by combining Hadoop and Microsoft SQL technology to offer a data warehouse that stores and manages traditional data alongside next-generation data.

Also presented as a new Azure service was the Microsoft Azure Intelligent Systems Service (ISS), a tool designed to operate from any operating system in order to take early advantage of the information generated by different sources such as machines, sensors, and devices. In addition, tools like Power BI for Office 365 make it possible to combine on-premises data and cloud data in a complementary way, enabling rapid information management.

