Python training | Python for data science

Journey from a Python Newbie to a Kaggler in Python

So you want to become a data scientist, or maybe you already are one and want to expand your toolkit. You have landed in the right place. The goal of this page is to provide a comprehensive learning path for people new to Python for data science. This path gives a complete overview of the steps you need to learn to use Python for data science. If you already have some background, or you don't need all the components, feel free to adapt your own path and let us know how you made the changes along the way.

You can also check the mini version of this learning path: Infographic: Quick Guide to Learning Data Science in Python.

Reading this in 2019? We have designed an updated learning path for you! Check it out on our course portal and start your data science journey today.

Step 0: Warming up

Before starting your journey, the first question to answer is:

Why use Python?

or

How would Python be useful?

Watch the first 30 minutes of this talk by Jeremy, founder of DataRobot, at PyCon 2014, Ukraine, to get an idea of how useful Python can be.

Step 1: Setting up your machine

Now that you have made up your mind, it's time to set up your machine. The simplest way to proceed is to download Anaconda from Continuum.io. It comes packaged with most of the things you will need. The main disadvantage of taking this route is that you will have to wait for Continuum to update its packages, even when an update is already available for the underlying libraries. If you are a beginner, that shouldn't matter.

If you face any challenges during installation, you can find more detailed instructions for various operating systems here.
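Once the installation finishes, a quick check like the one below can confirm that the libraries used in the later steps are available. This is only a minimal sketch: the package list and the helper name `installed_versions` are assumptions made for illustration, not part of Anaconda itself.

```python
from importlib import metadata

# Libraries this learning path relies on (names as published on PyPI).
# The list is an assumption for illustration; adjust it to your needs.
PACKAGES = ["numpy", "scipy", "matplotlib", "pandas", "scikit-learn"]


def installed_versions(packages):
    """Return {package: version string, or None when the package is missing}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(PACKAGES)
for name, version in report.items():
    print(f"{name:15s} {version or 'NOT INSTALLED'}")
```

If any line reports NOT INSTALLED, revisit the installation instructions for your operating system before moving on.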

Step 2: Learn the basics of the Python language

You should start by understanding the basics of the language, its libraries and its data structures. The free DataPeaker course in Python is one of the best places to start your journey. This course focuses on how to get started with Python for data science and, by the end, you should be comfortable with the basics of the language.

Assignment: Take the awesome free Python course from DataPeaker

Alternative resources: If interactive coding is not your style of learning, you can also look at the Google Class for Python. It is a 2-day class series and also covers some of the parts discussed later.
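To give a feel for what "the basics" means here, the sketch below uses only built-in data structures (list, set, dict, tuple) to count words in a sentence. The sample sentence and variable names are made up for illustration.

```python
# Word counts using only built-in data structures.
sentence = "the quick brown fox jumps over the lazy dog the end"

words = sentence.split()      # list of strings
unique_words = set(words)     # set: duplicates removed

counts = {}                   # dict: word -> number of occurrences
for word in words:
    counts[word] = counts.get(word, 0) + 1

# List comprehension: keep the words that occur more than once.
repeated = [word for word, n in counts.items() if n > 1]

# Each item is a (word, count) tuple; sort by count, highest first.
top = sorted(counts.items(), key=lambda pair: pair[1], reverse=True)[:3]

print(len(words), len(unique_words), repeated, top[0])
```

If every line here reads naturally to you, you are ready for the next steps.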

Step 3: Learn regular expressions in Python

You will need to use them a lot for data cleansing, especially if you are working with text data. The best way to learn regular expressions is to go through the Google class and keep this cheat sheet handy.
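As a small taste of regular expressions for data cleansing, here is a sketch built on Python's standard `re` module. The sample string is invented, and the email and phone patterns are deliberately simple illustrations rather than complete validators.

```python
import re

# Made-up messy input, for illustration only.
raw = "  Contact: John.Doe@Example.com ,  phone: (555) 123-4567!!  Visit   us.  "

# Collapse runs of whitespace into one space and trim both ends.
clean = re.sub(r"\s+", " ", raw).strip()

# A simplified email pattern; real-world addresses can be more complex.
emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.]+", clean)

# Find a phone number shaped like (ddd) ddd-dddd, then keep its digits only.
phone_match = re.search(r"\(\d{3}\)\s*\d{3}-\d{4}", clean)
phone_digits = re.sub(r"\D", "", phone_match.group()) if phone_match else None

print(clean)
print(emails, phone_digits)
```

The three calls used here (`re.sub`, `re.findall`, `re.search`) cover a large share of everyday cleaning tasks.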

Assignment: Do the baby names exercise

If you still need more practice, follow this tutorial on text cleaning. It will challenge you on the various steps involved in data wrangling.

Step 4: Learn scientific libraries in Python: NumPy, SciPy, Matplotlib and Pandas

This is where the fun begins!! Here is a short introduction to various libraries. Let's start practicing some common operations.

  • Practice the NumPy tutorial thoroughly, especially NumPy arrays. This will form a good foundation for things to come.
  • Next, look at the SciPy tutorials. Go through the introduction and the basics, and do the rest according to your needs.
  • If you guessed the Matplotlib tutorials come next, you are wrong! They are too comprehensive for our needs here. Instead, look at this IPython notebook up to line 68 (that is, up to the animations).
  • Finally, let's take a look at Pandas. Pandas provides DataFrame functionality (like R) for Python. This is also where you should spend a good amount of time practicing. Pandas will become the most effective tool for all mid-size data analysis. Start with a short introduction, 10 minutes to pandas. Then move on to a more detailed tutorial on pandas.
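To make the NumPy and Pandas items above more concrete, here is a minimal sketch of the kind of operations the tutorials cover: vectorised array arithmetic, boolean masks, and a DataFrame group-by. The data and column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# NumPy: arithmetic applies to the whole array at once.
heights_cm = np.array([150.0, 160.0, 170.0, 180.0])
heights_m = heights_cm / 100      # element-wise division
tall_mask = heights_cm > 165      # boolean array, one flag per element

# pandas: a small, invented table built from the array above.
df = pd.DataFrame({
    "city": ["Delhi", "Delhi", "Mumbai", "Mumbai"],
    "height_cm": heights_cm,
})
df["is_tall"] = tall_mask

# Group rows by city and average the heights within each group.
mean_by_city = df.groupby("city")["height_cm"].mean()

print(heights_m)
print(df)
print(mean_by_city)
```

Notice that no explicit loop is needed: both libraries operate on whole columns or arrays, which is the habit worth building at this step.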

You can also view exploratory data analysis with Pandas and data analysis with Pandas.

Additional Resources:

  • If you need a book on Pandas and NumPy, try "Python for Data Analysis" by Wes McKinney.
  • There are many tutorials as part of the Pandas documentation. You can take a look at them here.

Assignment: Solve this assignment from Harvard's CS109 course.

Step 5: Effective data visualization

Go through this lecture from CS109. You can skip the first 2 minutes, but what follows is awesome! Follow up this lecture with this assignment.

Step 6: Learn Scikit-learn and Machine Learning

Now we come to the heart of this whole process. Scikit-learn is the most useful library in Python for machine learning. Here is a brief overview of the library. Go through lecture 10 to lecture 18 of the Harvard CS109 course. You will go through an overview of machine learning, supervised learning algorithms such as regression, decision trees and ensemble modeling, and unsupervised learning algorithms such as clustering. Follow up each lecture with its assignments.
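As a rough illustration of the supervised and unsupervised workflows mentioned above, the sketch below fits a decision tree and a k-means clustering on the iris dataset bundled with scikit-learn. The dataset choice, hyperparameters and variable names are assumptions made for this example and are not taken from the CS109 material.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Small example dataset that ships with scikit-learn.
X, y = load_iris(return_X_y=True)

# Supervised learning: hold out 30% of the rows to estimate accuracy.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X_train, y_train)
accuracy = tree.score(X_test, y_test)  # fraction of correct test predictions

# Unsupervised learning: cluster the same rows without using the labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(X)

print(f"decision tree test accuracy: {accuracy:.2f}")
print("cluster sizes:", [int((cluster_labels == k).sum()) for k in range(3)])
```

Most scikit-learn estimators follow this same pattern of creating the model, calling `fit`, then calling `predict` or `score`, so the pattern carries over to the other algorithms in the lectures.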

You should also check out the 'Introduction to Data Science' course to give yourself a boost in your quest for a data scientist role.

Step 7: Practice, practice and practice

Congratulations, you made it!

You now have all the technical skills you need. It is now a matter of practice, and what better place to practice than competing with fellow data scientists on the DataHack platform? Go ahead, dive into one of the live competitions currently running on DataHack and Kaggle, and try out everything you have learned.

Step 8: Deep learning

Now that you have learned most of the machine learning techniques, it's time to give deep learning a chance. Chances are you already know what deep learning is, but if you still need a short introduction, here it is.

I'm new to deep learning myself, so take these suggestions with a pinch of salt. The most comprehensive resource is deeplearning.net. You will find everything there: lectures, datasets, challenges, tutorials. You can also try Geoff Hinton's course to understand the basics of neural networks.

Get started with Python: A Complete Tutorial for Learning Data Science with Python from Scratch

P.S. In case you need to use Big Data libraries, give Pydoop and PyMongo a try. They are not included here, since the Big Data learning path is a complete topic in itself.
