8 Charts You Must Know To Excel In The Art Of Data Visualization!

Introduction

Humans are among the most creative species on this planet. From time immemorial, art has taken various forms, from paleolithic cave paintings to modern art. For instance, the cave paintings of Bhimbetka gave a lot of information about the life of the people back then. The genesis of visual art dates back to the stone age.

Now, as part of the generation living through the fourth industrial revolution, which has witnessed art and creativity in various fields and forms, we have various tools and programming languages coming to our rescue to solve complex business problems using the art of visualization.

Today's businesses use various visualization techniques to understand and gain insights from data and to make data-driven business decisions. There are many visualization tools available, such as Tableau, Power BI, Looker, Qlik Sense and many more. In this article, we will cover various types of charts using Python.

The need for data visualization

Data makes more sense and is easy to understand when presented in a simple, visualized format, as it is difficult for the human eye to decipher the pattern, trend and seasonality from raw data. Therefore, data is visualized to understand how different parameters behave.

Various types of charts and their uses

[Figure: types of charts]

1. Bar and column charts

Bar and column charts are among the simplest charts for understanding how a quantitative field performs across categories. They are used for comparison.

[Figure: column chart of sales by category]

In the column chart above, we can see that technology sales are the highest and office supplies sales are the lowest.

[Figure: bar chart]

The graph shown above is a bar chart showing which categories perform best.
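As a rough sketch of the difference between the two, the snippet below draws the same data once as a column chart (vertical bars) and once as a bar chart (horizontal bars) with Matplotlib. The category names and sales figures are made up for illustration and are not taken from the dataset used later in this article.

```python
import matplotlib.pyplot as plt

# Illustrative values only (assumption); not taken from the article's dataset
categories = ["Furniture", "Office Supplies", "Technology"]
sales = [740, 720, 835]

fig, (ax_col, ax_bar) = plt.subplots(1, 2, figsize=(10, 3))

# Column chart: categories on the x-axis, vertical bars
ax_col.bar(categories, sales, color="seagreen")
ax_col.set_title("Column chart")
ax_col.set_ylabel("Sales")

# Bar chart: the same data drawn with horizontal bars
ax_bar.barh(categories, sales, color="seagreen")
ax_bar.set_title("Bar chart")
ax_bar.set_xlabel("Sales")

fig.tight_layout()
plt.show()
```

Horizontal bars tend to be easier to read when there are many categories or when the labels are long.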

2. Scatter chart and bubble chart

Scatter and bubble charts help us understand how the data is spread across the range considered. They can be used to identify patterns, the presence of outliers and the relationship between two variables.

[Figure: scatter plot of profit against discount]

We can see that profits decrease as discounts increase.

[Figure: bubble chart]

The graph shown above is a bubble chart.
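A bubble chart can be thought of as a scatter chart in which the size of each marker carries a third variable. The sketch below illustrates this with synthetic data; the variable names and the numbers are assumptions chosen only to mimic a falling profit as the discount grows.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic data (assumption): profit falls as discount rises, plus random noise
rng = np.random.default_rng(0)
discount = rng.uniform(0.0, 0.8, size=50)
profit = 100 - 200 * discount + rng.normal(0, 20, size=50)
quantity = rng.integers(1, 15, size=50)  # third variable for the bubble chart

fig, (ax_scatter, ax_bubble) = plt.subplots(1, 2, figsize=(10, 4))

ax_scatter.scatter(discount, profit, color="seagreen")
ax_scatter.set_title("Scatter chart")

# Bubble chart: the marker area (s) encodes the third variable
ax_bubble.scatter(discount, profit, s=quantity * 20, alpha=0.5, color="seagreen")
ax_bubble.set_title("Bubble chart")

for ax in (ax_scatter, ax_bubble):
    ax.set_xlabel("Discount")
    ax.set_ylabel("Profit")

fig.tight_layout()
plt.show()
```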

3. Line graph

A line graph is preferred when time-dependent data must be presented. It is best suited to analyzing trends.

[Figure: line graph of monthly sales]

In the graph above, we can see that sales increase over the months, but there is a sudden drop in July, and sales are highest in November.
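A minimal line graph along these lines might look as follows. The monthly figures are invented so that the shape resembles the description above (a dip in July and a peak in November); they are not the real sales numbers.

```python
import matplotlib.pyplot as plt

months = list(range(1, 13))
# Made-up monthly sales (assumption), shaped to dip in July and peak in November
monthly_sales = [20, 22, 30, 28, 35, 40, 25, 42, 60, 50, 75, 70]

fig, ax = plt.subplots(figsize=(8, 3))
ax.plot(months, monthly_sales, marker="o", color="seagreen")
ax.set_xticks(months)
ax.set_xlabel("Month")
ax.set_ylabel("Sales")
ax.set_title("Monthly sales")
plt.show()

# The month with the highest sales
peak_month = months[monthly_sales.index(max(monthly_sales))]
print(peak_month)
```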

4. Histogram

A histogram is a frequency graph: it splits the range of a numeric variable into intervals (bins) and records how many values of the data set fall into each one. It is useful when you want to understand the distribution of a series.

[Figure: histogram]
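To make the idea of counting per bin concrete, the sketch below computes the bin counts with NumPy and then draws the same histogram with Matplotlib. The series is randomly generated and the name `order_values` is only an illustrative assumption.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic series (assumption): 1000 values centred around 100
rng = np.random.default_rng(1)
order_values = rng.normal(loc=100, scale=15, size=1000)

# np.histogram returns the number of values per bin and the bin edges
counts, edges = np.histogram(order_values, bins=10)
print(counts)

fig, ax = plt.subplots(figsize=(6, 3))
ax.hist(order_values, bins=10, color="seagreen", edgecolor="white")
ax.set_xlabel("Value")
ax.set_ylabel("Frequency")
plt.show()
```

With 10 bins there are 11 edges, and the counts add up to the number of values in the series.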

5. Box plot

Box plots are effective for summarizing the spread of large data sets. They use percentiles to divide the data range, which helps us see how much of the data lies below or above a chosen value. They also help us identify outliers in the data.

[Figure: box plot]

The box plot summarizes the data with three elements:

* Median value: divides the data into two equal halves.

* IQR (interquartile range): the range between the 25th and 75th percentiles.

* Outliers: data points that differ significantly from the rest and lie outside the whiskers.

[Figure: box plot with outliers]

The circles in the graph above show the presence of outliers.
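The quantities listed above can be worked out by hand. The sketch below does so with NumPy on a small made-up sample, assuming the common convention (also Matplotlib's default for `boxplot`) that whiskers extend at most 1.5 times the IQR beyond the box and anything further out is drawn as an outlier.

```python
import numpy as np

# Small made-up sample (assumption); 95 is planted as an outlier
values = np.array([12, 15, 14, 10, 18, 16, 13, 17, 11, 95])

# 25th, 50th and 75th percentiles
q1, median, q3 = np.percentile(values, [25, 50, 75])
iqr = q3 - q1

# Points beyond 1.5 * IQR from the box are treated as outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = values[(values < lower_fence) | (values > upper_fence)]

print("median:", median)
print("IQR:", iqr)
print("outliers:", outliers)
```

Passing the same array to `plt.boxplot(values)` would draw the corresponding box, whiskers and outlier marker.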

6. Subplots

Sometimes it is better to draw several plots on the same grid to understand and compare the data.

[Figure: subplots of sales over time by region]

Here, a single figure lets us understand sales over a period of time in different regions.
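One way to build such a grid is Matplotlib's `plt.subplots`, which returns a figure and an array of axes. The sketch below uses invented yearly sales for four regions; the region names and all numbers are assumptions for illustration.

```python
import matplotlib.pyplot as plt

years = [2015, 2016, 2017, 2018]
# Made-up yearly sales per region (assumption)
regional_sales = {
    "West": [150, 140, 190, 250],
    "East": [130, 155, 180, 210],
    "Central": [100, 105, 150, 145],
    "South": [105, 70, 95, 120],
}

# A 2 x 2 grid of axes that share both scales, so the panels are comparable
fig, axes = plt.subplots(2, 2, figsize=(10, 6), sharex=True, sharey=True)

for ax, (region, values) in zip(axes.flat, regional_sales.items()):
    ax.plot(years, values, marker="o", color="seagreen")
    ax.set_title(region)
    ax.set_xticks(years)

fig.suptitle("Sales by region")
fig.tight_layout()
plt.show()
```

Sharing the axes keeps the scales identical across panels, which makes the regions directly comparable.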

7. Donut, pie charts and stacked column charts

When we want to show the composition of the data, the charts named above are the best choice.

[Figure: donut chart]

The donut chart above shows the sales composition of different product categories.

[Figure: pie chart]

The pie chart above shows the percentage of sales in different years.

[Figure: stacked column chart]

The stacked column chart above shows the sales of two products in different quarters.
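The three composition charts can be sketched with Matplotlib as shown below. All names and numbers are made up for illustration. The donut is drawn as a pie whose wedges are given a width smaller than the radius, and the stacked columns use the `bottom` argument so that the second series starts where the first one ends.

```python
import matplotlib.pyplot as plt

# Made-up shares and quarterly sales (assumption)
categories = ["Furniture", "Office Supplies", "Technology"]
share = [32, 31, 37]

quarters = ["Q1", "Q2", "Q3", "Q4"]
product_a = [10, 14, 12, 18]
product_b = [6, 9, 11, 7]

fig, (ax_pie, ax_donut, ax_stack) = plt.subplots(1, 3, figsize=(12, 4))

ax_pie.pie(share, labels=categories, autopct="%1.1f%%")
ax_pie.set_title("Pie chart")

# Donut: wedges narrower than the radius leave a hole in the middle
ax_donut.pie(share, labels=categories, wedgeprops={"width": 0.4})
ax_donut.set_title("Donut chart")

# Stacked columns: product B is drawn on top of product A
ax_stack.bar(quarters, product_a, label="Product A")
ax_stack.bar(quarters, product_b, bottom=product_a, label="Product B")
ax_stack.set_title("Stacked column chart")
ax_stack.legend()

fig.tight_layout()
plt.show()
```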

8. Heat maps

A heat map is the preferred chart when we want to check whether there is any correlation between variables.

[Figure: correlation heat map]

Here, a positive value shows a positive correlation and a negative value shows a negative correlation. The color indicates the strength of the correlation: the darker the color, the stronger the positive correlation, and the lighter the color, the stronger the negative correlation.
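As a sketch of what sits behind such a chart, the snippet below builds a small synthetic table, computes the pairwise correlation matrix with pandas and colors it with plain Matplotlib (`imshow`) rather than Seaborn. The column names and the way the data is generated are assumptions chosen so that profit rises with sales and falls with discount.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic data (assumption): profit rises with sales and falls with discount
rng = np.random.default_rng(2)
discount = rng.uniform(0, 0.8, 200)
sales_amount = rng.uniform(10, 500, 200)
data = pd.DataFrame({
    "Discount": discount,
    "Sales": sales_amount,
    "Profit": 0.3 * sales_amount - 150 * discount + rng.normal(0, 10, 200),
})

# Pairwise Pearson correlation; every value lies between -1 and 1
corr = data.corr()

fig, ax = plt.subplots(figsize=(5, 4))
image = ax.imshow(corr.to_numpy(), cmap="RdYlGn", vmin=-1, vmax=1)

positions = list(range(len(corr.columns)))
ax.set_xticks(positions)
ax.set_xticklabels(corr.columns)
ax.set_yticks(positions)
ax.set_yticklabels(corr.columns)

# Write each coefficient into its cell
for i in positions:
    for j in positions:
        ax.text(j, i, f"{corr.iloc[i, j]:.2f}", ha="center", va="center")

fig.colorbar(image, ax=ax)
plt.show()
```

Fixing the color scale to the range from -1 to 1 keeps the colors comparable between different heat maps.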

Understand visualization with Python

Python offers several libraries for understanding data graphically, such as Matplotlib and Seaborn. Let's start our journey into the world of visualization.

Anubhav is a product-based company that sells different types of products. Let's explore the data to find its sales over a period, which category and product sub-category generate the highest sales, and how profit relates to an increase in discount.

1. Let's import the relevant libraries first.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

import warnings

# Suppress library warnings to keep the output readable
warnings.filterwarnings('ignore')


2. The next step would be to load the dataset.

sales = pd.read_excel('Maven Supplies Raw.xlsx', skiprows=3)
sales.head(2)

[Output: first two rows of the dataframe]

3. With the dataset loaded, let's explore the data.

# Check the number of rows and columns in the dataframe
sales.shape
# Output: (9994, 24)

# Check the column-wise info of the dataframe
sales.info()

# Check the summary for the numeric columns
sales.describe()

# List the column names
sales.columns

4. Now that we understand the available data better, let's visualize it.

– First, explore the composition of sales by category, as a percentage.

# Pie chart of total sales per category, labelled with percentages
sales.groupby(['Category'])['Sales'].sum().plot(kind='pie', autopct="%1.2f%%")
plt.title("Share of sales by category")
plt.ylabel("")  # hide the default y-axis label
plt.show()

[Figure: pie chart of sales share by category]

We can see that Technology is performing better than the other categories.

– There are a lot of sub-categories within the data. Let's see how the different sub-categories are performing.

plt.figure(figsize=(10, 3))
# Total sales per sub-category, sorted from highest to lowest
sales.groupby(['Sub-Category'])['Sales'].sum().sort_values(ascending=False).plot(kind='bar', color="seagreen")
plt.show()

[Figure: bar chart of sales by sub-category]

We can see that phone sales are the highest, followed by chairs and so on.

– Sales will differ between regions. Let's see how.

sales.groupby(['Region'])['Sales'].sum().sort_values(ascending=False).plot(kind='bar', color="seagreen")
plt.show()

[Figure: bar chart of sales by region]

Sales are highest in the West region and lowest in the South region.

– Now let's see how the regions perform in terms of profits.

sales.groupby(['Region'])['Profit'].sum().sort_values(ascending=False).plot(kind='bar', color="seagreen")
plt.show()

[Figure: bar chart of profit by region]

The South region, the worst performer in terms of sales, earns more profit than the Central region.
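Comparisons like this are often easier to read when both measures sit side by side. The sketch below sums two columns in a single `groupby` call and plots them as grouped bars. It runs on a tiny made-up table (the rows and numbers are assumptions), reusing only the `Region`, `Sales` and `Profit` column names from the code above.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up rows standing in for the sales table (assumption)
orders = pd.DataFrame({
    "Region": ["West", "West", "South", "South", "Central", "Central"],
    "Sales": [500.0, 300.0, 150.0, 200.0, 400.0, 250.0],
    "Profit": [90.0, 60.0, 30.0, 25.0, 20.0, 10.0],
})

# One groupby call sums both columns at once
by_region = orders.groupby("Region")[["Sales", "Profit"]].sum()
by_region = by_region.sort_values("Sales", ascending=False)
print(by_region)

# Plotting a two-column frame as a bar chart gives one group of bars per region
ax = by_region.plot(kind="bar", figsize=(8, 3))
ax.set_ylabel("Amount")
plt.show()
```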

– Heat maps give us a better understanding of how different variables are correlated with each other.

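plt.figure(figsize=(10, 5))
# numeric_only=True (pandas 1.5 or later) skips text and date columns, which recent pandas versions no longer drop silently
sns.heatmap(sales.corr(numeric_only=True), annot=True, cmap="summer")
plt.show()

[Figure: correlation heat map of the numeric columns]

Clearly, discounts are negatively correlated with profit.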

– Let's figure out how profit is affected by increased discounts.

sns.scatterplot(x='Discount', y='Profit', data=sales, color="seagreen")
plt.show()

[Figure: scatter plot of profit against discount]

We can see that profit decreases as the discount increases.

– Sales are not constant; they increase or decrease based on various factors. Let's see how sales perform in the different months.

sales.groupby(['month'])['Sales'].sum().plot(kind='line', color="seagreen")
plt.show()

[Figure: line graph of sales by month]

As mentioned earlier, sales show a pattern, with the highest sales in November and the lowest in February.
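The grouping above relies on a `month` column, and a later plot relies on a `year` column. If your copy of the data does not have them, they could be derived from a date column with the pandas `.dt` accessor. The sketch below shows this on a tiny made-up table; the column name `Order Date` is an assumption and may differ in the actual file.

```python
import pandas as pd

# Tiny stand-in for the sales table; 'Order Date' is an assumed column name
orders = pd.DataFrame({
    "Order Date": pd.to_datetime(
        ["2015-02-10", "2015-11-03", "2016-11-21", "2017-07-15"]
    ),
    "Sales": [120.0, 300.0, 250.0, 90.0],
})

# The .dt accessor exposes the parts of a datetime column
orders["month"] = orders["Order Date"].dt.month
orders["year"] = orders["Order Date"].dt.year

# Total sales per calendar month, summed across all years
monthly_sales = orders.groupby("month")["Sales"].sum()
print(monthly_sales)
```

Note that grouping by month alone adds together the same month from different years, which is useful for spotting seasonality but hides year-to-year growth.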

– High sales do not necessarily mean high profit, possibly because of the sale of discounted products, as seen in the scatter plot. Let's see how profit changes over time.

sales.groupby(['month'])['Profit'].sum().plot(kind='line', color="seagreen")
plt.show()

[Figure: line graph of profit by month]

We can see that profits are high during September and lower during July.

– Sales can show an increasing or decreasing pattern over the year.

sales.groupby(['year'])['Sales'].sum().plot(kind='line',color="seagreen")
plt.xticks([2015,2016,2017,2018])
plt.show()
[Figure: line graph of sales by year]

We can see that sales dip in 2016, whereas they grow in all the other years.

From this data set, we were able to understand that phones generated the highest sales and that the West region contributed the highest sales and profits. Sales increased over time, but profit fell as discounts increased. We also saw that there were particular months in which higher sales and profits were recorded.

Therefore, we can say that visualization speaks volumes: there is always a story to tell that helps companies make data-driven decisions.

Conclusion

In this article, we talked about various types of charts and their uses. We worked with a data set to understand how to use Python libraries to visualize the data and make sense of it. Through visualization, it is easy to decipher a hidden pattern or trend in the data. With some examples, we saw that charts help with comparison and, most importantly, are easy to understand.

Final notes

Thank you for reading!!!

I hope you have enjoyed reading the article and have increased your knowledge of various types of charts and their uses.

If I have missed anything or if you want to share your thoughts, feel free to comment below in the comment section.

About the Author

Sruthi ER

I am a data science enthusiast with an interest in data analysis and visualization, and I am currently pursuing the Postgraduate Certification in Data Science at IIIT-Bangalore. I have a background in civil engineering, with 4 years of experience in the construction industry.

Feel free to contact me on LinkedIn.

The media shown in this article is not the property of DataPeaker and is used at the author's discretion.
