STA 199: Introduction to Data Science and Statistical Thinking

This page outlines the topics, content, and assignments for the semester. The schedule, including the timeline of topics and assignments, may be updated as the semester progresses.

WEEK DATE PREPARE TOPIC MATERIALS DUE
1 Wed, Jan 10

Lab 0: Hello, World and STA 199!

💻 lab 0



Thu, Jan 11

Welcome to STA 199

🖥️ slides 00
⌨️ ae 00


2 Mon, Jan 15

No lab - Martin Luther King Jr. Day holiday




Tue, Jan 16

📗 r4ds - intro
📘 ims - chp 1

Meet the toolkit

🖥️ slides 01
⌨️ ae 01



Thu, Jan 18

📗 r4ds - chp 1
🎥 Data and visualization
🎥 Visualising data with ggplot2

Grammar of graphics

🖥️ slides 02
⌨️ ae 02


3 Mon, Jan 22

📗 r4ds - chp 2

Lab 1: Data visualization

💻 lab 1



Tue, Jan 23

📘 ims - chp 4
📘 ims - chp 5
🎥 Visualizing numerical data
🎥 Visualizing categorical data

Visualizing various types of data

🖥️ slides 03
⌨️ ae 02 (cont.)



Thu, Jan 25

📘 ims - chp 6

Data visualization overview

🖥️ slides 04
⌨️ ae 03


4 Mon, Jan 29

🎥 Grammar of data wrangling
📗 r4ds - chp 3.1-3.5

Lab 2: Data wrangling

💻 lab 2

Lab 1 at 8 am


Tue, Jan 30

🎥 Working with a single data frame
📗 r4ds - chp 3.6-3.7
📗 r4ds - chp 4

Grammar of data wrangling

🖥️ slides 05
⌨️ ae 04



Thu, Feb 1

🎥 Tidying data
📗 r4ds - chp 5

Tidying data

🖥️ slides 06
⌨️ ae 05


5 Mon, Feb 5

🎥 Working with multiple data frames

Lab 3: Data tidying and joining

💻 lab 3

Lab 2 at 8 am


Tue, Feb 6

📗 r4ds - chp 19.1-19.3

Working with multiple data frames




Thu, Feb 8

🎥 Data types
🎥 Data classes
📗 r4ds - chp 16

Data types and classes



6 Mon, Feb 12

Work on Exam 1 Review


Lab 3 at 8 am


Tue, Feb 13

Importing and recoding data




Thu, Feb 15

Exam 1 - In-class + take-home released



7 Mon, Feb 19

Project milestone 1 - Working collaboratively


Exam 1 take-home at 8 am


Tue, Feb 20

Web scraping




Thu, Feb 22

Functions + iteration



8 Mon, Feb 26

Lab 4: Topic TBA

💻 lab 4

Project milestone 1 at 8 am


Tue, Feb 27

Data science ethics - Misrepresentation




Thu, Feb 29

Data science ethics - Algorithmic bias + data privacy



9 Mon, Mar 4

Lab 5: Topic TBA

💻 lab 5

Lab 4 at 8 am


Tue, Mar 5

The language of models




Thu, Mar 7

Linear regression with a single predictor



10 Mon, Mar 11

🌴 No lab - Spring Break




Tue, Mar 12

🌴 No lecture - Spring Break




Thu, Mar 14

🌴 No lecture - Spring Break



11 Mon, Mar 18

Project milestone 2 - Project proposals


Lab 5 at 8 am


Tue, Mar 19

Linear regression with multiple predictors




Thu, Mar 21

Overfitting and other concerns



12 Mon, Mar 25

Lab 6: Topic TBA

💻 lab 6

Project milestone 2 at 8 am


Tue, Mar 26

Logistic regression




Thu, Mar 28

Modeling overview



13 Mon, Apr 1

Lab 7: Topic TBA

💻 lab 7

Lab 6 at 8 am


Tue, Apr 2

Quantifying uncertainty with bootstrap intervals




Thu, Apr 4

Making decisions with randomization tests



14 Mon, Apr 8

Work on Exam 2 Review


Lab 7 at 8 am


Tue, Apr 9

Inference overview




Thu, Apr 11

Exam 2 - In-class + take-home released



15 Mon, Apr 15

Project milestone 3 - Peer review


Exam 2 take-home at 8 am
Project milestone 3 at the end of lab session


Tue, Apr 16

Communicating data science results effectively




Thu, Apr 18

Customizing Quarto reports and presentations



16 Mon, Apr 22

Project milestone 4 - Project presentations


Project presentations + writeup at the beginning of lab session


Tue, Apr 23

Looking further: Topic TBA