Physics

Introduction to Data Science

By Jonathon Sumner
Cohort 2019-2020

The importance of data to the work of scientists and engineers can hardly be overstated. In a time of pandemic, it is even more crucial for the scientific community to digest complex ideas and to communicate them to policymakers and the general public alike in ways that are easily understood.

As budding scientists and engineers, we need to practice and develop this skill. So, we shall deviate slightly from our initial course plan and take a look at how we can use Python to work with data.

Working with data using pandas

The first step is to get some data and have a look at it. Let’s use pandas to import and explore some of the available data on COVID-19. Link
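
As a rough idea of what importing and exploring can look like, here is a minimal sketch. It is not the course notebook: the column names (date, region, new_cases) and all the numbers are made-up placeholders, and the CSV is read from an in-memory string so the example runs without downloading anything.

```python
import io

import pandas as pd

# Made-up placeholder numbers, NOT real COVID-19 data.
# The column names are assumptions chosen for illustration only.
csv_text = """date,region,new_cases
2020-04-01,A,10
2020-04-01,B,4
2020-04-02,A,15
2020-04-02,B,6
2020-04-03,A,12
2020-04-03,B,9
"""

# Import: read_csv accepts a file path, a URL, or a file-like object.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# Explore: first rows, column types, and summary statistics.
print(df.head())
print(df.dtypes)
print(df["new_cases"].describe())

# Aggregate: total new cases per region.
totals = df.groupby("region")["new_cases"].sum()
print(totals)
```

With a real dataset, the in-memory string would simply be replaced by the path or URL of the CSV file; the exploration calls stay the same.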

Solutions to the scavenger hunt

A review of the solutions to the data scavenger hunt from the previous meeting. Link

Visualizing data using matplotlib

We delve deeper into data science and start using matplotlib to make pretty pictures. Link
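
For a flavour of what a first plot might involve, here is a minimal sketch of a labelled line chart. The values and the region label are invented placeholders, not data from the course.

```python
import matplotlib.pyplot as plt

# Made-up placeholder values, NOT real data.
days = [1, 2, 3, 4, 5]
new_cases = [10, 15, 12, 20, 18]

# Create a figure with a single set of axes and draw one line on it.
fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(days, new_cases, marker="o", label="Region A (hypothetical)")

# Label the axes and add a title and legend so the plot explains itself.
ax.set_xlabel("Day")
ax.set_ylabel("New cases")
ax.set_title("Daily new cases (placeholder data)")
ax.legend()
fig.tight_layout()

# In a script, call plt.show() to display the figure;
# a Jupyter notebook renders it automatically.
```

Working through the axes object (ax.plot, ax.set_xlabel) rather than the plt shortcuts is one common choice; it tends to scale better once a figure has several panels.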

Solutions to making pretty pictures

A review of the solutions to the visualization tasks from the previous meeting. Link

Infographic challenge

Ask a question, explore some relevant data, and make a visualization that reveals the answer. Link

Building models from data using scikit-learn

Can we use data to make predictions? Link
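
One way to make the question concrete is the sketch below: fit a model on part of the data, then check how well it predicts the part it has not seen. The data here are synthetic (generated from a known straight line plus noise), and linear regression is chosen purely for illustration, not because it is the model used in the course.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data: y = 3x + 2 plus random noise (an assumption for illustration).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 1.0, size=200)

# Hold out a quarter of the samples to test predictions on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit on the training set only.
model = LinearRegression().fit(X_train, y_train)

# score() returns the coefficient of determination (R^2) on the test set.
r2 = model.score(X_test, y_test)
print("slope:", model.coef_[0])
print("intercept:", model.intercept_)
print("R^2 on held-out data:", r2)

# Predict the target for a new, unseen input value.
print("prediction at x = 4:", model.predict(np.array([[4.0]]))[0])
```

Because the data were generated from a known line, the fitted slope and intercept should land close to 3 and 2; with real data there is no such ground truth, which is why the held-out score matters.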

Solutions to predicting car accidents in Montreal

A review of the solutions to the machine learning tasks from the previous meeting. Link

Bixi challenge

Try to predict daily Bixi usage with machine learning. Link