Python for Data Science

This is an introductory course on using Python for data science applications. In recent years Python has become extremely popular within the data science community, mainly due to its ease of use, its open-source nature and the fact that it is completely free. This course aims to introduce newcomers to the most popular packages used today: numpy, pandas and matplotlib. Note that it assumes basic knowledge of Python (i.e. lists, dicts, indexing).
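As a rough self-check of the assumed background, the short sketch below uses only lists, dicts and indexing. The data and variable names are made up for illustration and are not taken from the course notebooks. If every line reads naturally to you, you should be ready to start.

```python
# Hypothetical sample data, invented for this self-check.
temperatures = [12.5, 14.0, 9.5, 11.0]

# List indexing and slicing
first = temperatures[0]       # first element
last = temperatures[-1]       # last element
middle = temperatures[1:3]    # elements at positions 1 and 2

# Dict lookup and insertion
capitals = {"Scotland": "Edinburgh", "France": "Paris"}
capital = capitals["Scotland"]   # look up a value by its key
capitals["Norway"] = "Oslo"      # add a new key-value pair

print(first, last, middle)
print(capital, len(capitals))
```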

The course is entirely self-contained and self-paced. You can do it in your own time and it shouldn't take more than 6 hours to go through all of the material. However, this course is by no means a complete guide to using Python for data science applications. It serves as an introduction to the world of data analysis and aims to make you comfortable with looking at seemingly random numbers and trying to extract meaning from them.

The course is based on the wonderful Jupyter Notebooks which you can install from here. Alternatively, if you are from the University of Edinburgh, you can access the programming environment using Noteable, which is available through the accompanying Learn course (search for the Python for Data Science course on Learn).

Setting up in Noteable

If you are using Noteable, the easiest way to get the files needed for the course is to run the following command in a notebook within Noteable:

!git clone https://git.ecdf.ed.ac.uk/digital-skills/python-data-science

Structure

The course is split into 6 different notebooks:

Notebook 0

Short intro to Jupyter Notebooks.

Notebook 1

Python basic concepts refresher and some text analysis exercises.

Notebook 2

Introduction to vectorised computing and dealing with large data with numpy.
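To give a flavour of what "vectorised" means, here is a minimal sketch that assumes numpy is installed. The temperature readings are hypothetical and not taken from the notebook. The same conversion is written once as an element-by-element loop and once as a single array expression.

```python
import numpy as np

# Hypothetical temperature readings in degrees Celsius.
celsius = np.array([12.5, 14.0, 9.5, 11.0])

# Loop version: convert one element at a time.
fahrenheit_loop = [c * 9 / 5 + 32 for c in celsius]

# Vectorised version: the arithmetic is applied to the whole array at once.
fahrenheit = celsius * 9 / 5 + 32

print(fahrenheit)
```

Both versions produce the same values. The vectorised form is shorter and, on large arrays, typically much faster because the loop runs inside numpy rather than in Python.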

Notebook 3

Introduction to plotting in Python with matplotlib.

Notebook 4

Introduction to pandas and dealing with tabular data.

Extra notebooks

At the end of the course, there are notebooks whose names start with extra, which cover a wide range of applied data science topics. Usually, students are expected to do one of their choice, but feel free to go through all of them.

Authors

This course was developed by Ignat Georgiev and Patrick Kinnear from the Digital Skills team at Information Services at the University of Edinburgh.

Credits

The development of this course wouldn't have been possible without the help of, and the materials provided by:

  • Alisdair Tullo, School of Philosophy, Psychology and Language Sciences, the University of Edinburgh
  • Magnus Hagdorn, School of Geosciences, the University of Edinburgh