STATS 191 (Autumn 2019, Stanford)

Description

Course link.

Syllabus

Course Overview

Statistical tools for modern data analysis. Topics include regression and prediction, elements of the analysis of variance, bootstrap, and cross-validation. Emphasis is on conceptual rather than theoretical understanding. Student assignments require use of the software package R.

Expected outcomes

By the end of the course, students should be able to:

  • Enter tabular data using R.
  • Plot data using R, to help in exploratory data analysis.
  • Formulate regression models for the data, while understanding some of the limitations and assumptions implicit in using these models.
  • Fit models using R and interpret the output.
  • Test for associations in a given model.
  • Use diagnostic plots and tests to assess the adequacy of a particular model.
  • Find confidence intervals for the effects of different explanatory variables in the model.
  • Use some basic model selection procedures, as found in R, to find a best model in a class of models.
  • Fit simple ANOVA models in R, treating them as special cases of multiple regression models.
  • Fit simple logistic and Poisson regression models.
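
As a rough illustration of the outcomes listed above, the following minimal sketch walks through the same steps in R. It assumes the built-in `mtcars` and `warpbreaks` data sets as stand-ins for course data, and the model formulas and object names (`fit`, `aov_fit`, and so on) are chosen only for illustration; `step()` is shown as one of R's model selection procedures, not necessarily the one emphasized in class.

```r
# Built-in data sets stand in for data you would normally load with read.csv().
data(mtcars)
data(warpbreaks)

# Exploratory plot (only drawn in an interactive session).
if (interactive()) {
  plot(mtcars$wt, mtcars$mpg, xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
}

# Fit a multiple regression model and inspect the output.
fit <- lm(mpg ~ wt + hp, data = mtcars)
fit_summary <- summary(fit)        # coefficient table with t-tests
fit_ci <- confint(fit, level = 0.95)  # confidence intervals for the effects

# Diagnostic plots: residuals vs. fitted, normal Q-Q, scale-location, leverage.
if (interactive()) {
  par(mfrow = c(2, 2))
  plot(fit)
}

# Basic model selection: stepwise search by AIC from a larger model.
full_fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)
step_fit <- step(full_fit, trace = 0)

# One-way ANOVA treated as a regression on a factor.
aov_fit <- lm(mpg ~ factor(cyl), data = mtcars)
aov_table <- anova(aov_fit)

# Logistic regression for a 0/1 response (am: transmission type).
logit_fit <- glm(am ~ wt, family = binomial, data = mtcars)

# Poisson regression for a count response (breaks: number of warp breaks).
pois_fit <- glm(breaks ~ wool + tension, family = poisson, data = warpbreaks)
```

Printing `fit_summary`, `fit_ci`, or `aov_table` at the console shows the output that the interpretation outcomes refer to.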

Course Information

  • Term: Autumn 2019
  • Units: 3

Prerequisites

An introductory statistics course, such as STATS 60, STATS 110, or STATS 141.

Textbook

Software

  • In this course, we will use R for computing and R Markdown for producing lecture slides and homework solutions. Writing your homework solutions in R Markdown is highly recommended. Install the following software:
    • R (required): https://www.r-project.org/.
    • RStudio is highly recommended for syntax highlighting, package management, document generation, and more: https://www.rstudio.com/.
      • The newest version of RStudio is highly recommended.
    • LaTeX, which enables you to create PDFs directly from R Markdown in RStudio.
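
One way to check the setup from the R console is sketched below. It assumes that knitting in RStudio relies on the `rmarkdown` package and that a LaTeX installation exposes `pdflatex` on the system path; neither detail is stated above, so treat both checks as assumptions rather than course requirements.

```r
# Report the installed R version.
cat(R.version.string, "\n")

# Assumption: R Markdown documents are rendered with the rmarkdown package.
has_rmarkdown <- requireNamespace("rmarkdown", quietly = TRUE)

# Assumption: a working LaTeX installation puts pdflatex on the PATH.
has_latex <- nzchar(Sys.which("pdflatex"))

cat("rmarkdown installed:", has_rmarkdown, "\n")
cat("pdflatex found:", has_latex, "\n")
```

If the first check reports `FALSE`, `install.packages("rmarkdown")` installs the package from CRAN.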

Evaluation

The final letter grade for this course will be determined by each method of assessment weighted as follows:

  • 7 weekly homework assignments (55%)
  • Midterm examination (15%, Wednesday, 10/23/2019)
  • Final examination (30%, according to Stanford calendar: Wednesday, 12/11/2019 @ 3:30 PM, location TBD)
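
The weighting above amounts to a weighted average. The sketch below assumes each component is scored on a 0-100 scale and that the homework score is the average over the 7 assignments; the scores are hypothetical, and the mapping from the weighted score to a letter grade is not specified here.

```r
# Weights taken from the list above.
weights <- c(homework = 0.55, midterm = 0.15, final = 0.30)

# Hypothetical component scores on a 0-100 scale (illustration only).
scores <- c(homework = 90, midterm = 80, final = 85)

# Weighted course score: sum of weight times score, matched by name.
course_score <- sum(weights * scores[names(weights)])
print(course_score)
```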

Lecture Notes

Course Schedule

Date | Week | Topic | Reading | Notes
09/23/2019 | Week 1 | Lecture 1: Course introduction and review | Syllabus |
09/25/2019 | Week 1 | Lecture 2: Review | CH: 1 |
09/27/2019 | Week 1 | Lecture 3: Some tips on R | | Homework 1 posted
09/30/2019 | Week 2 | Lecture 4: Simple linear regression 1 (introduction, correlation, model, estimation) | CH: 2.1-2.4 |
10/02/2019 | Week 2 | Lecture 5: Simple linear regression 2 (inference and prediction) | CH: 2.5-2.8 |
10/04/2019 | Week 2 | Lecture 6: Diagnostics for simple linear regression | CH: 2.9 | Homework 2 posted, Homework 1 due
10/07/2019 | Week 3 | Lecture 7: Multiple linear regression 1 (introduction, model, estimation, geometry of least squares) | CH: 3.1-3.5 |
10/09/2019 | Week 3 | Lecture 8: Multiple linear regression 2 (interpretation, matrix formulation, estimation, inference) | CH: 3.6-3.9 |
10/11/2019 | Week 3 | Lecture 9: Multiple linear regression 3 (prediction, contrasts, testing) | CH: 3.10-3.11 | Homework 3 posted, Homework 2 due
10/14/2019 | Week 4 | Lecture 10: Diagnostics in multiple linear regression (types of residuals, influence) | CH: 4 |
10/16/2019 | Week 4 | Lecture 11: Diagnostics in multiple linear regression (outlier detection, residual plots) | CH: 4 |
10/18/2019 | Week 4 | Lecture 12: Interactions and qualitative variables (interactions) | CH: 5 | Homework 4 posted, Homework 3 due
10/21/2019 | Week 5 | Lecture 13: Interactions and qualitative variables (visualization, ANOVA) | CH: 5 |
10/23/2019 | | Midterm examination | |
10/25/2019 | Week 5 | Lecture 14: ANOVA models (one-way ANOVA, testing, contrasts) | CH: 5 |
10/28/2019 | Week 6 | Lecture 15: ANOVA models (two-way ANOVA, testing, contrasts, mixed effects model) | CH: 5 |
10/30/2019 | Week 6 | Lecture 16: Transformations and weighted least squares | CH: 6, 7 |
11/01/2019 | Week 6 | Lecture 17: Correlated errors | CH: 8, 9 | Homework 5 posted, Homework 4 due
11/04/2019 | Week 7 | Lecture 18: Correlated errors | CH: 8, 9 |
11/06/2019 | Week 7 | Lecture 19: Bootstrapping regression | An Introduction to the Bootstrap by Bradley Efron and Robert Tibshirani, Chapter 9 |
11/08/2019 | Week 7 | Lecture 20: Model selection | CH: 11 | Homework 6 posted, Homework 5 due
11/11/2019 | Week 8 | Lecture 21: Selection | CH: 11 |
11/13/2019 | Week 8 | Lecture 22: Selection | CH: 11 |
11/15/2019 | Week 8 | Lecture 23: Penalized regression | CH: 10 | Homework 7 posted, Homework 6 due
11/18/2019 | Week 9 | Lecture 24: Penalized regression | CH: 10 |
11/20/2019 | Week 9 | Lecture 25: Penalized regression | CH: 10 |
11/22/2019 | Week 9 | Lecture 26: Logistic regression | CH: 12 | Homework 7 due
11/25/2019 | | Thanksgiving Recess (no classes) | |
11/27/2019 | | Thanksgiving Recess (no classes) | |
11/29/2019 | | Thanksgiving Recess (no classes) | |
12/02/2019 | Week 10 | Lecture 27: Logistic regression | CH: 12 |
12/04/2019 | Week 10 | Lecture 28: Poisson regression | CH: 13.3 |
12/06/2019 | Week 10 | Lecture 29: Final review | | Review will be posted
12/11/2019 | | End-Quarter examinations | |

R Markdown files

The R Markdown files used to create the lecture slides and PDFs are available at https://github.com/PratheepaJ/STATS191.