SQL basics | SQL commands and uses


If you know 10 people who have been in data science for more than 5 years, every one of them has probably used SQL at some point in some way! Such is the influence SQL has had on anything to do with structured data.

In this post, we will learn the basics of SQL and focus on SQL for RDBMS. As you will see, SQL is pretty easy to learn and understand.

What is SQL?

SQL stands for Structured Query Language. It is a standard programming language for accessing relational databases. It was designed for data management in relational database management systems (RDBMS) such as Oracle, MySQL, MS SQL Server and IBM DB2.

SQL was one of the first commercial languages for Edgar F. Codd's relational model, described in his influential 1970 paper, "A Relational Model of Data for Large Shared Data Banks."

For a generation of information technology professionals, SQL was the de facto language. This was because data stores almost always consisted of one RDBMS or another. The simplicity and beauty of the language enabled data warehousing professionals to query data and provide it to business analysts.

However, the problem with RDBMSs is that they are often suitable only for structured information. For unstructured information, newer databases such as MongoDB and HBase (from the Hadoop ecosystem) prove to be more suitable. Part of this is a trade-off in database design, which follows from the CAP theorem.

What is the CAP theorem?

The CAP theorem states that a distributed database can, at best, guarantee two of the following three properties. CAP stands for:

Consistency – This means that the data in the database remains consistent after the execution of an operation.

Availability – This means that the database system is always up and responds to every request.

Partition tolerance – This means that the system continues to function even if communication between the servers is unreliable.

The various databases and their relationships with the CAP theorem are shown below:

[Figure: NoSQL visual guide]

Database properties:

Regardless of these trade-offs, a transaction in a relational database is expected to be ACID compliant. ACID stands for atomic, consistent, isolated and durable, as explained below:

Atomic: A transaction must complete with all of its data modifications applied, or with none of them.

Consistent: At the end of the transaction, all data must be left consistent.

Isolated: Data modifications made by one transaction must be independent of other transactions.

Durable: At the end of the transaction, the effects of the modifications made by the transaction must be permanent in the system.
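The atomic property above can be tried out with a small sketch. It uses Python's built-in sqlite3 module with an in-memory database; the account table, the names and the balances are hypothetical and exist only for illustration.

```python
import sqlite3

# Hypothetical table and values, used only to illustrate atomicity.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance REAL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('alice', 100.0), ('bob', 50.0)")
conn.commit()

try:
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute(
            "UPDATE account SET balance = balance + 500 WHERE name = 'bob'"
        )
        # This debit violates the CHECK constraint, so the transfer fails.
        conn.execute(
            "UPDATE account SET balance = balance - 500 WHERE name = 'alice'"
        )
except sqlite3.IntegrityError:
    pass  # the credit to bob was rolled back together with the failed debit

balances = dict(conn.execute("SELECT name, balance FROM account"))
print(balances)
```

Because the second statement fails, neither modification survives: both balances stay at their starting values.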

In contrast to ACID, eventually consistent services provide BASE properties (basically available, soft state, eventual consistency).

Command set in SQL

SELECT – The following is an example of a SELECT query that returns a list of cheap books. The query retrieves all the rows from the Library table in which the price column contains a value less than 10.00. The result is sorted in ascending order by price. The asterisk in the select list indicates that all columns of the Library table should be included in the result set.

SELECT *
 FROM  Library
 WHERE price < 10.00
 ORDER BY price;
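To see the SELECT query run end to end, here is a minimal sketch using Python's built-in sqlite3 module. The article only names the Library table and its price column, so the title column and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema and rows; only the table name and price column come from the text.
conn.execute("CREATE TABLE Library (title TEXT, price REAL)")
conn.executemany(
    "INSERT INTO Library VALUES (?, ?)",
    [("Book A", 12.50), ("Book B", 7.99), ("Book C", 4.25), ("Book D", 10.00)],
)

# Same query as in the text: all columns, price below 10.00, cheapest first.
cheap_books = conn.execute(
    "SELECT * FROM Library WHERE price < 10.00 ORDER BY price"
).fetchall()
print(cheap_books)
```

Book D is excluded because its price is exactly 10.00, and the comparison is strictly less than.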

UPDATE –

This statement modifies existing rows in the tables of a database. In addition, you can combine the SELECT query with the GROUP BY clause to aggregate statistics of a numeric variable by a categorical variable.
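Both ideas can be sketched with Python's built-in sqlite3 module. The genre column and the sample rows below are hypothetical; they stand in for "a categorical variable" and "a numeric variable".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: genre is the categorical variable, price the numeric one.
conn.execute("CREATE TABLE Library (title TEXT, genre TEXT, price REAL)")
conn.executemany(
    "INSERT INTO Library VALUES (?, ?, ?)",
    [
        ("Book A", "fiction", 12.0),
        ("Book B", "fiction", 8.0),
        ("Book C", "science", 20.0),
    ],
)

# UPDATE: change the price of every row that matches the WHERE condition.
cursor = conn.execute("UPDATE Library SET price = 15.0 WHERE genre = 'science'")
rows_changed = cursor.rowcount

# SELECT with GROUP BY: one summary row per genre.
price_by_genre = conn.execute(
    "SELECT genre, COUNT(*), AVG(price) FROM Library "
    "GROUP BY genre ORDER BY genre"
).fetchall()
print(rows_changed, price_by_genre)
```

Without a WHERE clause, UPDATE would change every row in the table, so it is worth checking the condition with a SELECT first.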

JOINS –

SQL is widely used not only for querying data, but also for joining the data returned by queries or held in tables. Merging data in SQL is done through 'joins'. The following infographic is often used to explain SQL joins:

[Figure: How to use join in SQL]
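As a small illustration of joins, the sketch below compares an INNER JOIN with a LEFT JOIN using Python's built-in sqlite3 module. The author and book tables and their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: each book points to an author through author_id.
conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE book (title TEXT, author_id INTEGER)")
conn.executemany("INSERT INTO author VALUES (?, ?)", [(1, "Ann"), (2, "Raj")])
conn.executemany(
    "INSERT INTO book VALUES (?, ?)",
    [("Book A", 1), ("Book B", 1), ("Book C", None)],
)

# INNER JOIN keeps only the rows that have a match in both tables.
inner_rows = conn.execute(
    "SELECT a.name, b.title FROM author a "
    "INNER JOIN book b ON b.author_id = a.id ORDER BY b.title"
).fetchall()

# LEFT JOIN keeps every author, filling in NULL where no book matches.
left_rows = conn.execute(
    "SELECT a.name, b.title FROM author a "
    "LEFT JOIN book b ON b.author_id = a.id ORDER BY a.name, b.title"
).fetchall()
print(inner_rows)
print(left_rows)
```

Raj has no books, so he appears only in the LEFT JOIN result, paired with None (SQL NULL). Book C has no author and appears in neither result.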

CASE – SQL has the CASE / WHEN / THEN / ELSE / END expression. It works like if-then-else in other programming languages:

CASE WHEN n > 0
 THEN 'positive'
 WHEN n < 0
 THEN 'negative'
 ELSE 'zero'
 END
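The CASE expression above is only a fragment; it needs to sit inside a query. A minimal sketch with Python's built-in sqlite3 module follows, where the numbers table and its values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with a single integer column n.
conn.execute("CREATE TABLE numbers (n INTEGER)")
conn.executemany("INSERT INTO numbers VALUES (?)", [(5,), (-3,), (0,)])

# CASE is evaluated per row; the first matching WHEN branch wins.
labelled = conn.execute(
    """
    SELECT n,
           CASE WHEN n > 0 THEN 'positive'
                WHEN n < 0 THEN 'negative'
                ELSE 'zero'
           END AS sign
    FROM numbers
    ORDER BY n
    """
).fetchall()
print(labelled)
```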


Nested subqueries – Queries can be nested so that the results of one query can be used in another query via a relational operator or an aggregate function. A nested query is also known as a subquery.
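A subquery combined with an aggregate function can be sketched as follows, using Python's built-in sqlite3 module. The Library rows and the title column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema and rows, for illustration only.
conn.execute("CREATE TABLE Library (title TEXT, price REAL)")
conn.executemany(
    "INSERT INTO Library VALUES (?, ?)",
    [("Book A", 12.0), ("Book B", 8.0), ("Book C", 4.0)],
)

# The inner query computes the average price (8.0 for these rows);
# the outer query compares each row against it with the > operator.
above_average = conn.execute(
    "SELECT title, price FROM Library "
    "WHERE price > (SELECT AVG(price) FROM Library) "
    "ORDER BY price"
).fetchall()
print(above_average)
```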

Where do we use SQL?

SQL has been used extensively for decades to retrieve data, merge data, run grouped queries and handle nested cases. It has been widely adopted for data science as well. Some examples of SQL in specific analytics tools are shown below:

  1. In SAS, PROC SQL lets us write SQL queries to query, update and manipulate data.
  2. In R, the sqldf package can be used to run SQL queries on data frames.
  3. In Python, the pandasql library enables you to query pandas DataFrames using SQL syntax.

Does SQL also influence other languages?

The downside of relational databases is that they handle unstructured data poorly. To cope with the emergence of such data, new databases have appeared under the name NoSQL as an alternative to the relational DBMS. But SQL is not dead yet. See also:

A mapping from SQL to MongoDB

Here are some languages in which SQL has a significant influence:

Hive – Apache Hive provides a mechanism to project structure onto data in Hadoop and query that data using a SQL-like language called HiveQL (HQL). It is a data warehouse infrastructure built on Apache Hadoop to provide data summarization, ad hoc queries and analysis of large data sets. HQL, although it is a query language for Hadoop, borrows heavily from SQL. You can find out more here.

SQL-MapReduce – Teradata's Aster database combines SQL with MapReduce for large data sets in the age of Big Data. SQL-MapReduce® is a framework created by Teradata Aster to allow developers to write powerful and highly expressive SQL-MapReduce functions in languages such as Java, C#, Python, C++ and R and push them into the discovery platform for high-performance analytics. Analysts can then invoke SQL-MapReduce functions using standard SQL or R via the Aster database.

Spark SQL – The Apache Spark project is for real-time, in-memory, parallelized processing of Hadoop data. Spark SQL builds on it to allow SQL queries to be written against the data. In Cloudera's Impala, data stored in HDFS or HBase can be queried, and the SQL syntax is the same as that of Apache Hive.

See also: Learn more about ways to query Hadoop using SQL here.

Final notes

In this post we discussed SQL, its uses, the CAP theorem and the influence of SQL on other languages. A basic knowledge of SQL is very relevant in today's world, where Python, R and SAS are the dominant languages in data science. SQL is still relevant in the age of Big Data. The beauty of the language remains its elegant and simple structure. Thinkpot:

Do you think SQL has become an inevitable weapon for data management? Would you recommend any other database languages?

Share your views, opinions and feedback with us in the comment section below. We would love to hear from you! If you like what you have just read and want to continue learning about analytics, subscribe to our emails, follow us on Twitter or like our Facebook page.
