Introduction
In today's world, the volume of data grows every second. To understand data quickly and extract information from it, data visualization becomes necessary.
For instance, consider a case where you are asked to illustrate key aspects of sales (such as sales performance, target, revenue, acquisition cost, etc.) from a large amount of sales data. Which would you prefer:
- Explore the data using Excel (or spreadsheets) and track every aspect of sales manually.
- Explore data using different types of sales charts and tables.
Obviously, I would prefer graphs and tables. Therefore, data visualization plays a key role in data exploration and analysis.
Data visualization is the technique of representing data/information in a pictorial or graphical format. It enables stakeholders and decision makers to analyze and explore data visually and to uncover deep insights.
“Visualization gives you answers to questions you didn't know you had.” – Ben Shneiderman
Benefits of data visualization
- Helps with data analysis and data exploration, and makes data more understandable.
- Summarizes complex quantitative information in a small space.
- Helps discover the latest trends and hidden patterns in the data.
- Identifies relationships/correlations between variables.
- Helps identify areas that need attention or improvement.
Why Plotly?
Python has several data visualization libraries, such as Matplotlib and Seaborn. However, they render only static images of the charts/plots, so many crucial details get lost in the visualization. Wouldn't it be amazing if we could interact with the charts by hovering over them or zooming in? Plotly lets us do exactly that.
- Plotly is an open-source data visualization library for creating interactive, publication-quality charts/graphs.
- Plotly offers implementations of many chart types, such as line plot, scatter plot, area plot, histogram, box plot, bar chart, etc.
- Plotly supports interactive plotting in commonly used programming languages such as Python, R, MATLAB, JavaScript, etc.
In this post, we will cover the most commonly used chart types using Plotly. Let's start with the Cars93 data set, available on Kaggle.
The data set contains 27 car parameters (such as manufacturer, make, price, horsepower, engine size, weight, cylinders, airbags, passengers, etc.) for 93 different cars.
The dataset looks like this:
Additional note: the full Python code is available in this Kaggle kernel: https://www.kaggle.com/vikashrajluhaniwal/interactive-visualizations-using-plotly
Plotly installation
To install Plotly, run the following command in a terminal.
pip install plotly
Plotly comes with a few modules for creating visualizations: express, graph_objects, and figure_factory.
- express: A high-level interface for creating quick visualizations. It is a wrapper around the Plotly graph_objects module.
- graph_objects: A low-level interface to figures, traces, and layouts. It is highly customizable for different charts/plots.
- figure_factory: Figure factories are dedicated functions for creating very specific types of plots. They were available before Plotly Express existed, and some of them are now deprecated as “legacy”.
Now that we know Plotly and have installed it, let's draw different charts/plots with it.
1. Box plot
- A box plot (or box-and-whisker plot) is a standardized way of displaying the distribution of quantitative data based on a five-number summary: minimum, first quartile (Q1), median (Q2), third quartile (Q3), and maximum.
- The box extends from Q1 to Q3, while the whiskers extend from the edges of the box out to 1.5 * IQR, where IQR = Q3 – Q1.
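The quantities behind a box plot can be worked out by hand. The sketch below uses made-up prices and assumes the common convention that whiskers stop at the most extreme observation within 1.5 * IQR of the box; plotting libraries may use a slightly different quartile method, so their values can differ marginally.

```python
from statistics import quantiles

# Illustrative prices (not Cars93 values), already sorted for readability.
prices = [8, 10, 12, 14, 16, 18, 20, 22, 40]

# The "inclusive" method interpolates linearly between the sorted data points.
q1, median, q3 = quantiles(prices, n=4, method="inclusive")
iqr = q3 - q1

# Fences mark the furthest distance a whisker may reach.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers end at the most extreme observations inside the fences.
whisker_low = min(p for p in prices if p >= lower_fence)
whisker_high = max(p for p in prices if p <= upper_fence)

# Anything outside the fences is drawn as an individual outlier point.
outliers = [p for p in prices if p < lower_fence or p > upper_fence]

print(q1, median, q3, iqr)
print(whisker_low, whisker_high, outliers)
```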
Now let's create a box plot for the cars' Price feature.
The best thing about this visualization is that we can interact with it by hovering over it to see the quartile values.
In the same way, we can customize it as required. For instance, we can draw a box plot of Price for each AirBags type.
2. Histogram
- A histogram is an accurate representation of the distribution of numerical data.
- To construct a histogram, follow these steps:
- Bin the range of values: divide the entire range of values into a series of intervals.
- Count how many values fall into each interval.
Let's draw a histogram for the cars' Horsepower feature.
Here, the x-axis shows the bin ranges of Horsepower, while the y-axis shows the frequency/count in each bin.
3. Density plot
- A density plot is a variation of a histogram in which the y-axis shows the values of the PDF (probability density function) instead of the frequency.
- It is useful for visually assessing the skewness of a variable.
- It is also useful for evaluating the importance of a continuous variable for a classification problem.
The density plot of Horsepower by AirBags type is shown below.
4. Bar chart
- A bar chart represents categorical data with rectangular bars whose lengths are proportional to the values they represent.
- A bar chart shows comparisons between discrete categories.
The bar chart of the Type feature is shown below.
In the same way, we can customize it to show the mean MPG.city on the y-axis instead of the count.
5. Pie chart
- A pie chart represents the numerical proportions of the data as slices of a circle.
- The entire area of the chart represents 100% of the data, and the arc length of each slice represents its relative share of the whole.
The pie chart of the Type feature is shown below.

6. Scatter plot
- A scatter plot uses points to represent values for two different numerical variables.
- It is really useful to observe the relationship between two numerical variables.
Let's draw a scatter plot to evaluate the relationship between Horsepower and MPG.city.
From this plot, we can observe that as Horsepower increases, MPG.city decreases.
Plotly also provides a way to draw 3D scatter plots. Let's draw one using the Horsepower, MPG.city, and Price features.
Similarly, we can draw a scatter plot matrix (a grid of scatter plots) to evaluate pairwise relationships for each combination of variables.
7. Line chart
- A line chart is a type of chart that displays information as a series of data points called “markers” connected by straight line segments.
- It is similar to a scatter plot, except that the measurement points are ordered (usually by their x-axis value) and joined with straight line segments.
- Line charts are generally used to find relationships between two numeric variables or to visualize a trend in time series data.
Let's draw a line chart to evaluate the relationship between Horsepower and MPG.city.
8. Heat map
- A heat map is a two-dimensional graphical representation of data in which the values of a matrix are represented as different shades of color.
- A heat map is intended to provide a color-coded visual summary of data / information.
- Seaborn also allows annotated heat maps.
Let's draw a heat map to represent the correlation matrix of the Cars93 data.
9. Violin plot
- Violin plots are similar to box plots, except that they also show the probability density of the data at different values. In other words, a violin plot is a combination of a box plot and a density plot.
- Wider sections of the violin plot indicate a higher probability, while narrow sections indicate a lower probability.
The violin plot of the Price feature is shown below.
In the same way, we can customize it using Plotly to display the box and all data points.
10. Word cloud
- A word cloud is a visualization technique for representing the frequency of words within a given piece of text.
- The size of a word indicates how often it appears in the text: the larger the word, the higher its frequency, and the smaller the word, the lower its frequency.
- Word clouds are often used to represent the frequency of words in text documents, reports, website data, public speeches, etc.
The word cloud of a chosen text document is shown below.
Final notes
In this article, we discussed different types of charts/plots using Plotly and Python. Plotly is highly recommended for creating interactive visualizations.
The media shown in this article is not the property of DataPeaker and is used at the author's discretion.