Key Python Packages for Data Science

Introduction

Python libraries for data science and machine learning are interesting, easy to understand, and something you can apply immediately to get a feel for the data and to visualize the nature of a dataset.

Even complex algorithms can be implemented in two or three lines of code, because the major math concepts are already implemented inside the packages.

This is different from, and more interesting than, the other programming libraries I have seen so far. I think this simplicity and robustness is the main reason Python plays a vital role in the AI space. I understood it thoroughly and enjoyed it.

What is a package in Python? A package is a collection of Python modules bundled together. Once you import it in a notebook cell, you can start using its classes, methods, attributes, and so on. Before that, though, you need to install the package and import it into your file or notebook.
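As a small illustration of the import step, the sketch below uses the statistics and math modules from Python's standard library. They are chosen here only because they need no installation; they are not among the packages covered in this article, and the sample values are made up.

```python
# Import a whole module under a short alias
import statistics as st

# Import a single function from a module
from math import sqrt

values = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up sample data

mean_value = st.mean(values)        # call a function through the alias
std_value = st.pstdev(values)       # population standard deviation
root = sqrt(16)                     # call the directly imported function

print(mean_value, std_value, root)
```

The same two import styles (import ... as ... and from ... import ...) are the ones used for pandas, NumPy and scikit-learn in the rest of the article.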

Let's look at the key Python packages for data science and machine learning.

Pandas
NumPy
Scikit-learn
Matplotlib
Seaborn

Pandas

Pandas is used primarily for manipulating and operating on structured data. It offers powerful data processing capabilities; I have not seen such wonderful features anywhere else in my IT journey. It provides high-performance, easy-to-use data structures and data analysis tools.

How do you install the pandas library? It is very simple: run the following command in a Jupyter Notebook cell.

!pip install pandas

The pandas library will install successfully. What's next? Play with the library.

The syntax to import pandas into your notebook

import pandas as pd

Your notebook is now ready to use all the functions in pandas. Let's do a few things here.

Pandas has the following capabilities.

A. Series and DataFrame

The main components of pandas are Series and DataFrame. Let's take a quick look at them. A Series is a one-dimensional labeled array that behaves much like a dictionary (index label to value), and a DataFrame is a collection of Series that share an index. We can build a DataFrame by combining Series; the following sample should make this clearer.

Code that creates Series and a DataFrame

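import pandas as pd

# Employee numbers and names as plain Python lists
Eno = [100, 101, 102, 103, 104, 105]
Empname = ['Raja', 'Babu', 'Kumar', 'Karthik', 'Rajesh', 'xxxxx']

# Wrap each list in a Series
Eno_Series = pd.Series(Eno)
Empname_Series = pd.Series(Empname)

# Combine the Series into a DataFrame, one column per Series
frame = {'Eno': Eno_Series, 'Empname': Empname_Series}
employee = pd.DataFrame(frame)
employee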
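To make the dictionary comparison concrete, here is a minimal sketch of a Series built from a dict and of two Series lining up as columns of a DataFrame. The salary and age figures are invented for illustration.

```python
import pandas as pd

# A Series built from a dict uses the dict keys as its index labels
salary = pd.Series({"Raja": 5000, "Babu": 6000, "Kumar": 5500})  # made-up values

print(salary["Babu"])           # label-based lookup, like a dictionary
print(salary.index.tolist())    # the labels themselves

# Series that share an index line up as columns of a DataFrame
age = pd.Series({"Raja": 30, "Babu": 35, "Kumar": 28})           # made-up values
staff = pd.DataFrame({"salary": salary, "age": age})
print(staff)
```

Because both Series carry the same labels, pandas aligns them row by row without any extra work.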

B. Load data into a data frame object

cereal_df = pd.read_csv("cereal.csv")

cereal_df.head(5)
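If you do not have cereal.csv at hand, the same call can be tried on in-memory text. The sketch below is only an assumption about the file's layout: it reuses the column names that appear in this article (name, type, shelf, rating), and the rows are invented.

```python
import io
import pandas as pd

# Stand-in for a file on disk; the rows are invented for illustration
csv_text = """name,type,shelf,rating
Alpha Flakes,C,1,70.5
Bran Bites,C,3,45.2
Corn Crunch,H,2,68.0
"""

# read_csv accepts a path or any file-like object
sample_df = pd.read_csv(io.StringIO(csv_text))

print(sample_df.head(2))     # first two rows
print(sample_df.dtypes)      # column types inferred by pandas
print(sample_df.shape)       # (rows, columns)
```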

C. Drop column from data frame object

cereal_df.drop(["type"], axis = 1, inplace = True)

cereal_df.head(5)

D. Select rows from data frame object

cereal_df_filtered = cereal_df[cereal_df['rating'] >= 68]
cereal_df_filtered.head()
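The expression inside the brackets is a boolean mask, so several conditions can be combined. A minimal sketch on a small made-up table (not the real cereal data):

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Alpha Flakes", "Bran Bites", "Corn Crunch", "Date Loops"],
    "shelf": [1, 3, 3, 2],
    "rating": [70.5, 45.2, 68.0, 72.1],
})  # invented rows

# Combine conditions with & (and) or | (or); wrap each condition in parentheses
high_on_shelf_3 = df[(df["rating"] >= 68) & (df["shelf"] == 3)]

# The same filter expressed with DataFrame.query
same_rows = df.query("rating >= 68 and shelf == 3")

print(high_on_shelf_3)
```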

E. Group column in data frame

cereal_df_groupby = cereal_df.groupby('shelf')

#print the first entries
cereal_df_groupby.first()
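Calling first() on the grouped object returns the first row of each group. More often you will want an aggregate per group; the sketch below shows that on made-up numbers.

```python
import pandas as pd

df = pd.DataFrame({
    "shelf": [1, 1, 3, 3, 2],
    "rating": [70.0, 50.0, 40.0, 60.0, 65.0],
})  # invented values

# One aggregate per group
mean_rating = df.groupby("shelf")["rating"].mean()

# Several aggregates at once
summary = df.groupby("shelf")["rating"].agg(["count", "mean", "max"])

print(mean_rating)
print(summary)
```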

F. Extract a value from the data frame

# return the value in row 0 of the 'name' column
result = cereal_df.loc[0,'name']
result

Up to now, we have discussed multiple functionalities in the pandas library. There are many more.

NumPy

NumPy is considered one of the most popular libraries for machine learning work in Python. Its best and most important feature is its N-dimensional array interface and the array manipulations built on it.

Afraid of the math when implementing your data science or machine learning model? Do not worry: NumPy turns complex mathematical operations into simple function calls. But remember to understand your requirements and use the package accordingly.

The syntax to import NumPy into your notebook

import numpy as np

Let's break down a few things here and see how NumPy works its magic on the given data.

A. Simple matrix formation using NumPy (1-D, 2-D and 3-D)

import numpy as np

# 1-D array
arr1 = np.array([1, 2, 3, 4, 5])
print("1-D Array")
print(arr1)
print("===================")

# 2-D array
print("2-D Array")
arr2 = np.array([[1, 2, 3], [4, 5, 6]])
print(arr2)
print("===================")

# 3-D array
print("3-D Array")
arr3 = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])
print(arr3)
print("===================")

Output

1-D Array
[1 2 3 4 5]
===================
2-D Array
[[1 2 3]
 [4 5 6]]
===================
3-D Array
[[[1 2 3]
  [4 5 6]]

 [[1 2 3]
  [4 5 6]]]
===================

B. Array slicing using NumPy

# Slicing takes the elements in an index range: [start:end] or [start:end:step].
# The element at index end is not included.
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print("Slicing from index 1 up to (not including) index 5")
print(arr[1:5])

Output

Slicing from index 1 up to (not including) index 5
[2 3 4 5]

arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[4:])

Output

[5 6 7]

We also have negative slicing :). It is just as simple: use negative indices, [-x:-y], which count from the end of the array.

Why don't you try your own?
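If you want something to compare your attempt against, here is a minimal sketch of negative and stepped slicing on the same seven-element array.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

# Negative indices count from the end: -1 is the last element
last_three = arr[-3:]       # elements 5, 6, 7
middle = arr[-5:-2]         # elements 3, 4, 5 (index -2 is excluded)

# A third value sets the step
every_other = arr[::2]      # elements 1, 3, 5, 7
reversed_arr = arr[::-1]    # the whole array, back to front

print(last_three, middle, every_other, reversed_arr)
```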

C. Matrix Shaping and Reshaping Using NumPy

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print("================================")
print("Shape of the array")
print(arr.shape)
print("================================")

Output

================================
Shape of the array
(2, 4)
================================

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
print("Before Reshape the array")
print(arr)
print("================================")
newarr = arr.reshape(4, 3)
print("After Reshape the array")
print(newarr)
print("================================")

Output

Before Reshape the array
[ 1  2  3  4  5  6  7  8  9 10 11 12]
================================
After Reshape the array
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
================================
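A reshape only works when the new shape holds exactly the same number of elements. The sketch below, an illustration rather than part of the original walkthrough, shows the -1 placeholder that lets NumPy work out one dimension, and what happens when the sizes do not match.

```python
import numpy as np

arr = np.arange(1, 13)          # the 12 values 1..12

# -1 asks NumPy to infer that dimension from the total size
auto = arr.reshape(3, -1)       # becomes 3 rows by 4 columns
flat = auto.reshape(-1)         # back to one dimension

# 12 elements cannot fill a 5 x 3 grid, so NumPy raises ValueError
try:
    arr.reshape(5, 3)
    reshape_failed = False
except ValueError:
    reshape_failed = True

print(auto.shape, flat.shape, reshape_failed)
```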

D. Array splitting using NumPy

arr = np.array([1, 2, 3, 4, 5, 6])
print("Splitting NumPy Arrays into 3 Arrays")
print("================================")
newarr = np.array_split(arr, 3)
print(newarr[0])
print(newarr[1])
print(newarr[2])
print("================================")

Output

Splitting NumPy Arrays into 3 Arrays
================================
[1 2]
[3 4]
[5 6]
================================
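The example above divides evenly. As a small additional sketch, np.array_split also accepts a length that does not divide evenly, in which case the earlier pieces get the extra elements.

```python
import numpy as np

arr = np.arange(1, 8)               # seven values: 1..7

# Seven elements into three pieces: sizes 3, 2 and 2
parts = np.array_split(arr, 3)

for part in parts:
    print(part)

sizes = [len(part) for part in parts]
```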

E. Sorting arrays using NumPy

arr = np.array(['banana', 'cherry', 'apple'])
print("Sorting a NumPy array")
print("================================")
print(np.sort(arr))
print("================================")

Output

Sorting a NumPy array
================================
['apple' 'banana' 'cherry']
================================
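Sorting works the same way on numbers. The following sketch, using made-up values, also shows that np.sort returns a sorted copy, how to get descending order, and how np.argsort gives the positions instead of the values.

```python
import numpy as np

scores = np.array([30, 10, 20])         # made-up values

ascending = np.sort(scores)             # sorted copy; scores is unchanged
descending = np.sort(scores)[::-1]      # reverse the sorted copy
order = np.argsort(scores)              # indices that would sort the array

print(ascending, descending, order, scores)
```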

Once you have started to play with data using NumPy, you will certainly need more and more time to understand the concepts. Everything is extremely well organized in this package, trust me!

Scikit-learn

Scikit-learn is one of the richest libraries in the Python ecosystem. It contains a large number of machine learning algorithms along with supporting utilities, and it lets users perform a wide range of specific machine learning tasks. Internally it works together with the SciPy and NumPy libraries, so keep that in mind. Here are a few of the areas it covers.

Regression
Classification
Clustering
Model selection
Dimensionality reduction

The syntax to import scikit-learn into your notebook

from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
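To show what these two imports are typically used for, here is a minimal sketch that fits a linear regression on synthetic data. The data (y = 3x + 2 with no noise), the 25% test split and the random_state value are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data: one feature, target follows y = 3x + 2 exactly
X = np.arange(20, dtype=float).reshape(-1, 1)
y = 3.0 * X.ravel() + 2.0

# Hold out 25% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit on the training rows, then score on the held-out rows
model = LinearRegression()
model.fit(X_train, y_train)
r2 = model.score(X_test, y_test)    # R^2 on the test set

print(model.coef_, model.intercept_, r2)
```

On real data the fit will not be perfect, and the test-set score is what tells you how well the model generalizes.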

Python visualization packages

Matplotlib and Seaborn Libraries

Python provides 2D plotting through the Matplotlib library. It is simple and easy to understand: you can produce a basic chart in one or two lines. 3D visualization is supported as well.

The syntax for importing Matplotlib and Seaborn into your notebook

import matplotlib.pyplot as plt

import seaborn as sns
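As a rough sketch of how little code a basic Matplotlib chart needs, the example below plots a few made-up points. The Agg backend and the in-memory buffer are assumptions that let it run without a display; in a notebook you would drop them and call plt.show() instead.

```python
import io

import matplotlib
matplotlib.use("Agg")               # draw off-screen; not needed in a notebook
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [value ** 2 for value in x]     # made-up data: squares of x

fig, ax = plt.subplots()
ax.plot(x, y, marker="o")           # the line chart itself
ax.set_xlabel("x")
ax.set_ylabel("x squared")
ax.set_title("A basic line chart")

# Render to an in-memory PNG instead of a window
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```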

You have probably built various charts in Excel and other BI tools. In Python, these visualization packages produce very high-quality charts and plots.

Matplotlib and Seaborn

Matplotlib is one of the main, basic visualization packages. It provides histograms (the frequency distribution of a dataset, counted in intervals or "bins"), bar graphs (univariate and bivariate plotting), scatter plots (the relationship between two numerical variables, useful for spotting groupings and correlations), and more.

Seaborn is a rich data visualization library built on Matplotlib. It provides a high-level interface for drawing attractive and informative statistical charts: box plots (data distribution with the median, quartiles and outliers), violin plots (data distribution and probability density), bar charts (comparisons between categorical features), and heat maps (a color-coded matrix representation of features). Word clouds (a visual representation of text data) are usually drawn with the separate wordcloud package rather than with Seaborn itself.

Seaborn – Histogram

import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('iris')
# histplot replaces distplot, which is deprecated in recent Seaborn releases
sb.histplot(df['petal_length'], kde=False)
plt.show()

Seaborn – Box plot

df = sb.load_dataset('iris')
sb.boxplot(x = "species", y = "petal_length", data = df)
plt.show()

Seaborn – Violinplot

df = sb.load_dataset('tips')
sb.violinplot(x="day", y="total_bill", data=df)
plt.show()
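Heat maps were mentioned above but not shown. The sketch below draws one from a correlation matrix. The data is randomly generated for illustration (columns a, b and c are hypothetical), so that the example does not depend on downloading a dataset, and the Agg backend is only there to let it run without a display.

```python
import matplotlib
matplotlib.use("Agg")               # draw off-screen; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sb

# Invented data: b is closely tied to a, c is independent noise
rng = np.random.default_rng(0)
base = rng.normal(size=100)
data = pd.DataFrame({
    "a": base,
    "b": base * 2 + rng.normal(scale=0.1, size=100),
    "c": rng.normal(size=100),
})

# Pairwise correlations between the columns, drawn as a colored grid
corr = data.corr()
fig, ax = plt.subplots()
sb.heatmap(corr, annot=True, cmap="coolwarm", vmin=-1, vmax=1, ax=ax)
ax.set_title("Correlation heat map")
plt.close(fig)
```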

All of these libraries help us build a good model and play with the data!

But always remember: before using an individual package, you need to understand what it is for and what it requires, and only then import it into your file or notebook and play with it.

I hope you now have a feel for, and some level of detail about, the Python packages for data science. We will look at more detailed concepts in the coming days. Thanks for your time!

The media shown in this article is not the property of DataPeaker and is used at the author's discretion.
