Python for Data Science

This is an introductory course on using Python for data science applications. In recent years, Python has become extremely popular in the data science community, mainly due to its ease of use, its open-source nature and the fact that it is completely free. This course aims to introduce newcomers to the most popular packages used today: numpy, pandas and matplotlib. Note that it assumes basic knowledge of Python (i.e. lists, dicts, indexing).
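As a rough self-check of the assumed background, the short sketch below uses only lists, dicts and indexing. The variable names and values are made up for illustration and do not come from the course notebooks. If every line reads naturally to you, you likely have the basics the course expects.

```python
# Illustrative self-check: hypothetical data, not taken from the course.
temperatures = [12.5, 14.0, 9.5, 11.0]   # a list of readings

first = temperatures[0]       # indexing starts at 0
last = temperatures[-1]       # negative indices count from the end
middle = temperatures[1:3]    # slicing returns a new list (end index excluded)

# A dict maps keys to values; here one value is the list above.
city = {"name": "Edinburgh", "readings": temperatures}
average = sum(city["readings"]) / len(city["readings"])

print(first, last, middle)
print(city["name"], average)
```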

The course is entirely self-contained and self-paced. You can do it in your own time, and it shouldn't take more than 6 hours to go through all of the material. However, this course is by no means a complete guide to using Python for data science applications. It serves as an introduction to the world of data analysis and aims to make you comfortable with looking at seemingly random numbers and trying to extract meaning from them.

The course is based on the wonderful Jupyter Notebooks which you can install from here. Alternatively, if you are from the University of Edinburgh, you can access the programming environment using Noteable, which is available through the accompanying Learn course (search for the Python for Data Science course on Learn).

Setting up in Noteable

If you are using Noteable, the easiest way to get the files needed for the course is to run the following command in a notebook within Noteable:

!git clone https://git.ecdf.ed.ac.uk/digital-skills/python-data-science

Structure

The course is split into 7 notebooks, numbered 0 to 6:

Notebook 0

Short intro to Jupyter Notebooks.

Notebook 1

Python basic concepts refresher.

Notebook 2

Challenges you to do sentiment analysis on a sample text.

Notebook 3

Introduction to vectorised computing and dealing with large data.

Notebook 4

Fun exercise to test your understanding of how data flows through Python.

Notebook 5

Introduction to the numpy package and structuring large data.

Notebook 6

Introduction to plotting in Python using matplotlib.
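To give a flavour of the vectorised computing mentioned in the notebook list above, here is a minimal sketch that assumes numpy is installed. The arrays and names are hypothetical and are not taken from the notebooks. It compares an explicit Python loop with the equivalent whole-array operation.

```python
import numpy as np

# Hypothetical data for illustration only.
prices = np.array([2.0, 4.0, 6.0, 8.0])
quantities = np.array([1, 2, 3, 4])

# Loop version: multiply one pair of elements at a time.
totals_loop = [p * q for p, q in zip(prices, quantities)]

# Vectorised version: one expression operates on the whole arrays at once.
totals = prices * quantities
grand_total = totals.sum()

print(totals_loop)
print(totals, grand_total)
```

Both versions compute the same values. The vectorised form is shorter and, on large arrays, is typically much faster because the element-wise work is done in compiled code rather than in the Python interpreter.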

Authors

This course was developed by Ignat Georgiev and Patrick Kinnear from the Digital Skills team at Information Services at the University of Edinburgh.

Credits

The development of this course wouldn't have been possible without the help and materials provided by:

  • Alisdair Tullo, School of Philosophy, Psychology and Language Sciences, the University of Edinburgh
  • Magnus Hagdorn, School of Geosciences, the University of Edinburgh