How To Get Started With Python for Data Science?

Whether you want to become a data scientist or you already are one and want to expand your toolkit, you have landed at the right place. This article lays out a learning path for people who wish to learn Python for data science: a sequence of steps to work through in order.



First of all, ask yourself how Python will help you in data science.

The next step, once you have made up your mind, is to set up your machine. The easiest way to start is to download Anaconda, which bundles most of the packages you will ever need. The main drawback is that you have to wait for Continuum to update its packages. If you are just starting out, that will not matter.
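
Once the installation is done, a quick sanity check can confirm that the libraries used later in this path are importable. The snippet below is a minimal sketch using only the standard library; the `PACKAGES` list and the `find_missing` helper are illustrative names chosen here, not part of Anaconda.

```python
import importlib.util

# Import names of the libraries covered later in this article.
# This list is an assumption for illustration; adjust it as needed.
PACKAGES = ["numpy", "scipy", "pandas", "sklearn"]


def find_missing(packages):
    """Return the names in `packages` that cannot be found for import."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


missing = find_missing(PACKAGES)
print("Missing packages:", missing or "none")
```

If the list comes back empty, the environment has everything the later steps rely on.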

Ace the Basics of the Python Language


If you are new to programming, it is advisable to start by understanding the basics of the language, its libraries, and its data structures. You should be comfortable with the core concepts at a beginner level before moving on.
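
As a rough picture of what "the basics" covers, here is a small sketch touching the built-in data structures, a function, and a list comprehension. All the values are toy data invented for illustration.

```python
# Toy data, invented for illustration only.
scores = [72, 88, 95, 88, 61]                  # list: ordered and mutable
point = (3, 4)                                 # tuple: ordered and immutable
unique_scores = set(scores)                    # set: duplicates are dropped
student = {"name": "Ada", "scores": scores}    # dict: maps keys to values


def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


# List comprehension: keep only the scores at or above the mean.
above_average = [s for s in scores if s >= mean(scores)]

print(mean(scores))            # 80.8
print(above_average)           # [88, 95, 88]
print(sorted(unique_scores))   # [61, 72, 88, 95]
```

If every line here reads naturally to you, you are ready for the next steps.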

Learn Regular Expressions in Python

Regular expressions are used heavily for data cleansing when you work with text data. An easy way to learn regular expressions is to go through Google's Python class.
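
To give a feel for the kind of cleansing meant here, the following sketch uses Python's built-in `re` module. The helper names `clean_text` and `extract_numbers` and the sample strings are made up for this example; real cleaning rules depend on your data.

```python
import re


def clean_text(raw):
    """Lowercase the text, replace symbols with spaces, collapse whitespace."""
    text = raw.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)   # punctuation and symbols -> space
    text = re.sub(r"\s+", " ", text)           # runs of whitespace -> one space
    return text.strip()


def extract_numbers(raw):
    """Return every run of digits found in the text, as strings."""
    return re.findall(r"\d+", raw)


print(clean_text("  Hello,   World!! Price: $42 "))   # hello world price 42
print(extract_numbers("Order 66 shipped in 3 days"))  # ['66', '3']
```

`re.sub` replaces every match of a pattern, while `re.findall` returns all matches as a list; these two calls cover a large share of everyday text cleaning.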


Learn Scientific Libraries in Python

This is where the real work begins! Work through the NumPy tutorial, paying particular attention to NumPy arrays. Next, look at the SciPy tutorials: go through the introduction and the basics. Finally, get to know pandas, which provides DataFrame functionality (like R) for Python. This is where you should practice a lot, because pandas will become your most effective tool for mid-size data analysis. Start with the short introduction, 10 minutes to pandas, then move on to a more detailed pandas tutorial.
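
As a small taste of both libraries, the sketch below shows a vectorised NumPy operation and a pandas group-by. It assumes NumPy and pandas are installed; the column names and numbers are toy values invented for illustration.

```python
import numpy as np
import pandas as pd

# NumPy: arithmetic applies to the whole array at once, with no explicit loop.
heights_cm = np.array([150.0, 160.0, 170.0, 180.0])
heights_m = heights_cm / 100
print(heights_m.mean())

# pandas: a DataFrame is a table with labelled columns.
df = pd.DataFrame({
    "city": ["A", "B", "A", "B"],   # toy labels
    "sales": [10, 20, 30, 40],      # toy numbers
})

# Group the rows by city and total the sales within each group.
totals = df.groupby("city")["sales"].sum()
print(totals)
```

The group-by pattern (split the rows, apply an aggregation, combine the results) is one you will reach for constantly in exploratory analysis.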

You can also have a look at Exploratory Data Analysis with Pandas and Data munging with Pandas. Additional resources: if you need a good book on pandas and NumPy, try "Python for Data Analysis" by Wes McKinney. There are also many tutorials in the pandas documentation, and you can go through any of them.

Learn Scikit-Learn and Machine Learning

Now we come to the essential part of the entire process. Scikit-learn is the most useful Python library for machine learning. You need an overview of machine learning: supervised learning algorithms such as regression, decision trees, and ensemble modelling, and unsupervised learning algorithms such as clustering.
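
To make the two families concrete, here is a minimal sketch with scikit-learn: a decision tree as the supervised example and k-means as the unsupervised one. It assumes scikit-learn is installed; the iris dataset, the 25% test split, and the parameter values are choices made for this illustration only.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Small built-in dataset: 150 flowers, 4 features, 3 species.
X, y = load_iris(return_X_y=True)

# Supervised learning: fit on labelled data, then score on held-out data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print("Decision tree test accuracy:", accuracy)

# Unsupervised learning: group the same rows without using the labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(X)
print("First ten cluster assignments:", cluster_labels[:10])
```

Most scikit-learn estimators follow this same pattern of creating the object, calling `fit`, and then calling `predict` or `score`, so the habit transfers directly to regression and ensemble models.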

Practice, Practice, and Practice

Congratulations, you are almost there!

You now have all the technical skills you need at a beginner's level. What remains is to practice your newly learned skills.

Deep Learning

Now that you have learned most machine learning techniques, you can move on to deep learning. There is a good chance that you already know what deep learning is, but if you still need a brief introduction, you can log on to Global Tech Council.