Create interactive data visualizations with Plotly

Introduction

In today's world, the volume of data grows every second. Data visualization has become necessary to understand that data quickly and extract information from it.

For instance, consider a case where you are asked to illustrate crucial sales aspects (such as sales performance, targets, revenue, acquisition cost, etc.) from a large amount of sales data. Which would you prefer?

  1. Explore the data using Excel (or spreadsheets) and track every aspect of sales manually.
  2. Explore data using different types of sales charts and tables.

Most of us would prefer charts and tables. Data visualization therefore plays a key role in data exploration and analysis.

Data visualization is the technique of representing data or information in a pictorial or graphical format. It enables stakeholders and decision makers to analyze and explore data visually and to discover deep insights.

“Visualization gives you answers to questions you didn't know you had.” – Ben Shneiderman

Benefits of data visualization

  • Helps with data analysis and exploration, and makes data more understandable.
  • Summarizes complex quantitative information in a small space.
  • Helps discover the latest trends and hidden patterns in the data.
  • Identifies relationships and correlations between variables.
  • Helps examine areas that need attention or improvement.

Why Plotly?

There are several data visualization libraries available in Python, such as Matplotlib, Seaborn, etc. But they typically render charts as static images, and because of this, many crucial details get lost in the visualization. Wouldn't it be amazing if we could interact with the charts by hovering over them or zooming in? Plotly allows us to do exactly that.

  • Plotly is an open source data visualization library for creating interactive, publication-quality charts.
  • Plotly offers implementations of many chart types, such as line charts, scatter plots, area charts, histograms, box plots, bar charts, etc.
  • Plotly supports interactive plotting in commonly used programming languages such as Python, R, MATLAB, JavaScript, etc.

In this post, we will cover the most commonly used chart types using Plotly. Let's get started with the Cars93 dataset available on Kaggle.

The dataset contains 27 car parameters (such as manufacturer, make, price, horsepower, engine size, weight, cylinders, airbags, passengers, etc.) for 93 different cars.

The dataset looks like this:

[Figure: first rows of the Cars93 dataset]
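As a rough sketch of the loading step, the dataset can be read into a pandas DataFrame. The file name `Cars93.csv` and the helper `load_cars` are assumptions for illustration, and the inline sample rows are placeholders in the Cars93 column layout, not real values from the dataset.

```python
import io

import pandas as pd

# Placeholder rows using a subset of the Cars93 columns (values are made up).
SAMPLE_CSV = """Manufacturer,Model,Type,Price,MPG.city,AirBags,Horsepower
Maker A,Model X,Small,15.9,25,Driver only,140
Maker B,Model Y,Midsize,33.9,18,Driver & Passenger,200
"""


def load_cars(source):
    """Read the dataset from a file path or a file-like object."""
    return pd.read_csv(source)


# With the downloaded file, pass its path instead,
# e.g. load_cars("Cars93.csv") (hypothetical file name).
cars = load_cars(io.StringIO(SAMPLE_CSV))
print(cars.head())   # first rows, as in the figure above
print(cars.shape)    # (rows, columns)
```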

Additional note: To access all of the Python code, see the Kaggle kernel here (https://www.kaggle.com/vikashrajluhaniwal/interactive-visualizations-using-plotly).

Plotly installation

To install Plotly, run the following command in a terminal.

pip install plotly

Plotly comes with a few modules for creating visualizations, and we can choose whichever suits the task:

  • express: A high-level interface for creating quick visualizations. It is a wrapper around Plotly's graph_objects module.
  • graph_objects: A low-level interface to figures, traces, and layouts. It is highly customizable for different charts.
  • figure_factory: Figure factories are dedicated functions for creating very specific types of charts. They predate Plotly Express, and some are therefore now deprecated as “legacy”.

Now that we know what Plotly is and have installed it, let's draw different charts with it.

1. Box plot

  • A box plot (or box-and-whisker plot) is a standardized way of displaying the distribution of quantitative data based on a five-number summary: minimum, first quartile (Q1), median (Q2), third quartile (Q3), and maximum.
  • The box extends from the first quartile (Q1) to the third quartile (Q3), while the whiskers extend from the edges of the box by at most 1.5 * IQR, where IQR = Q3 - Q1.
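The quantities in this definition can be computed by hand. The sketch below uses NumPy on made-up prices; note that quartile conventions differ between tools, so the values Plotly displays on hover may differ slightly from these.

```python
import numpy as np

# Made-up prices, chosen so that one value lies far above the rest.
prices = np.array([8, 10, 12, 14, 16, 18, 20, 60])

q1, median, q3 = np.percentile(prices, [25, 50, 75])
iqr = q3 - q1

# Points beyond these fences are drawn individually as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = prices[(prices < lower_fence) | (prices > upper_fence)]

print(q1, median, q3, iqr)
print(lower_fence, upper_fence, outliers)
```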

Now let's create a box plot of the cars' Price feature.

[Figure: box plot of Price]

The best thing about this visualization is that we can interact with it by hovering over it to see the quartile values.

Similarly, we can customize it as required. For instance, we can draw a box plot of Price for each AirBags type.

[Figure: box plot of Price across AirBags types]

2. Histogram

  • A histogram is an accurate representation of the distribution of numerical data.
  • To construct a histogram, follow these steps:
    • Bin (or bucket) the range of values: divide the entire range of values into a series of intervals.
    • Count how many values fall into each interval.
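The two steps can be sketched with NumPy. The horsepower values and bin edges below are made up for illustration.

```python
import numpy as np

# Made-up horsepower values.
horsepower = np.array([55, 70, 85, 90, 110, 140, 160, 200, 300])

# Step 1: divide the range into intervals (explicit bin edges here).
edges = [50, 100, 150, 200, 250, 300]

# Step 2: count how many values fall into each interval.
counts, _ = np.histogram(horsepower, bins=edges)
print(counts)
```

Each interval includes its left edge and excludes its right edge, except the last one, which includes both.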

Let's draw a histogram of the cars' Horsepower feature.

[Figure: histogram of Horsepower]

Here, the x-axis shows the bin ranges of Horsepower, while the y-axis shows the frequency (count) in each bin.

3. Density plot

  • A density plot is a variation of a histogram in which the y-axis shows the values of the PDF (probability density function) instead of the frequency.
  • It is useful for visually determining the skewness of a variable.
  • It is also useful for evaluating the importance of a continuous variable for a classification problem.

The density plot of Horsepower by AirBags type is shown below.

[Figure: density plot of Horsepower by AirBags type]

4. Bar chart

  • A bar chart represents categorical data with rectangular bars whose heights are proportional to the values they represent.
  • A bar chart shows comparisons between discrete categories.

The bar chart of the Type feature is shown below.

[Figure: bar chart of Type]

Similarly, we can customize it to show the mean of MPG.city on the y-axis instead of the count.

[Figure: bar chart of mean MPG.city by Type]

5. Pie chart

  • A pie chart represents the numerical proportions of the data as slices of a circle.
  • The entire area of the chart represents 100% of the data, and the arc length of each slice represents its relative share of the whole.

The pie chart of the Type feature is shown below.

[Figure: pie chart of Type]

6. Scatter plot

  • A scatter plot uses points to represent values for two different numerical variables.
  • It is really useful to observe the relationship between two numerical variables.

Let's draw a scatter plot to evaluate the relationship between Horsepower and MPG.city.

[Figure: scatter plot of Horsepower vs. MPG.city]

From this plot, we can observe that as Horsepower increases, MPG.city decreases.

Plotly also provides a way to draw 3D scatter plots. Let's draw one using the Horsepower, MPG.city, and Price features.

[Figure: 3D scatter plot of Horsepower, MPG.city, and Price]

Similarly, we can draw a scatter plot matrix (a grid of scatter plots) to evaluate pairwise relationships for each combination of variables.

[Figure: scatter plot matrix]

7. Line chart

  • A line chart is a type of chart that displays information as a series of data points, called markers, connected by straight line segments.
  • It is similar to a scatter plot, except that the measurement points are ordered (typically by their x-axis value) and joined with straight line segments.
  • Line charts are generally used to find relationships between two numeric variables or to visualize a trend in time series data.

Let's draw a line chart to evaluate the relationship between Horsepower and MPG.city.

[Figure: line chart of Horsepower vs. MPG.city]

8. Heat map

  • A heat map is a two-dimensional graphical representation of data in which the matrix values are represented by different shades of color.
  • A heat map is intended to provide a color-coded visual summary of data / information.
  • Seaborn also allows annotated heat maps.

Let's draw a heat map to represent the correlation matrix of the Cars93 data.

[Figure: correlation heat map]

9. Violin plot

  • Violin plots are similar to box plots, except that they also show the probability density of the data at different values. In other words, a violin plot is a combination of a box plot and a density plot.
  • Wider sections of the violin plot indicate a higher probability, while narrower sections indicate a lower probability.

The violin plot of the Price feature is shown below.

[Figure: violin plot of Price]

Similarly, we can customize it with Plotly to display the box and all data points.

[Figure: violin plot of Price with box and all points]

10. Word cloud

  • A word cloud is a visualization technique for representing the frequency of words within a given text.
  • The size of a word indicates how often it appears in the text: the larger the word, the higher its frequency, and the smaller the word, the lower its frequency.
  • Word clouds are often used to represent the frequency of words in text documents, reports, website data, public speeches, etc.

The word cloud of a chosen text document is shown below.

[Figure: word cloud]
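The article does not say which tool rendered this word cloud. The sketch below covers only the step that determines word sizes, counting word frequencies with the Python standard library on a made-up sentence; the rendering itself would be handled by a separate library (the third-party wordcloud package is one common choice, named here as an assumption).

```python
import re
from collections import Counter

# Made-up text standing in for the chosen document.
text = (
    "Plotly makes interactive charts. "
    "Interactive charts make data exploration easier."
)

# Lowercase the text and split it into words.
words = re.findall(r"[a-z']+", text.lower())

# Frequency of each word; more frequent words would be drawn larger.
frequencies = Counter(words)
print(frequencies.most_common(3))
```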

Final notes

In this article, we discussed different types of charts using Plotly and Python. Plotly is highly recommended for creating interactive visualizations.

The media shown in this article is not the property of DataPeaker and is used at the author's discretion.
