scott cunningham

ben h. williams professor of economics
baylor university

Gov 51: Data Analysis and Politics

Spring 2026 — Harvard University

Instructor: Scott Cunningham

Email: anthony_cunningham@fas.harvard.edu

Office: CGIS Knafel Building, Room H402

Office Hours: Tue/Thu 3:00–5:00 PM (Calendly)

Lectures: Tue/Thu 12:00–1:15 PM, Sever 103

Sections:

  • D001: Tue 1:30–2:45 PM, CGIS S001
  • D002: Thu 3:00–4:15 PM, Sever 302

Teaching Fellow: George Yean (gyean@fas.harvard.edu)

TF Office Hours: Thu 2:00–3:00 PM, CGIS K455 (Calendly)

Course Assistant: Harrison Huang (Office Hours)

Course Goals

The goal of this course is to give you the ability to understand, explain, and perform social science research, with a special focus on data analysis and causal reasoning. By the end of the semester, students will be able to:

  • Evaluate claims about causality in social science research
  • Summarize and visualize data effectively
  • Apply linear regression to analyze political and social data
  • Understand and quantify uncertainty in data analysis
  • Use professional tools including R, RStudio, git, and GitHub

You will be able to read and understand the methodology of most academic articles in the social sciences, and have a foot in the door of the data science world.

Course Format

The course consists of two 75-minute lectures per week and one required weekly discussion section led by Teaching Fellows. Lectures introduce key ideas in statistical inference and causal reasoning, grounding them in practical social science research. Students will learn to program in R, work with real-world datasets, and build both intuitive and technical understanding of modern empirical methods.

Discussion sections provide hands-on practice with statistical software and space to work through problem sets under Teaching Fellow guidance. Students should expect an interactive learning environment that moves between conceptual foundations, implementation, and interpretation.

Reminder: You can attend any section in a given week regardless of which one you're officially registered for.

Required Text

Either edition is fine:

  • Imai, Kosuke and Nora Webb Williams. Quantitative Social Science: An Introduction in tidyverse. Princeton University Press, 2022. Publisher | Amazon
  • Imai, Kosuke. Quantitative Social Science: An Introduction. Princeton University Press, 2018. Publisher | Amazon

Supplemental Texts

If you're seeking extra help:

Assignments and Grading

| Component | Weight | Description |
| --- | --- | --- |
| Problem Sets (4) | 40% | Four applied data analysis assignments using real-world datasets. Due Thursdays at 11:59 PM via Gradescope. |
| Midterm Exams (2) | 40% | Two in-class exams (no notes, no computers). |
| Final Project | 20% | Independent data analysis on a topic of your choice. Individual or groups up to 3. |

Late Policy: Late submissions lose 10% per day (e.g., 1 day late = 90% max score). After 7 days, late work receives a zero. This applies to both problem sets and final project milestones.
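
The penalty arithmetic in the late policy can be sketched in a few lines of Python. This is an illustrative sketch, not an official grading tool. The function name `max_score_percent` is hypothetical, and the sketch rests on two assumptions the policy does not state explicitly: any part of a day counts as a full day late, and "after 7 days" means more than 7 days late (so exactly 7 days late still allows up to 30%).

```python
import math


def max_score_percent(days_late: float) -> int:
    """Maximum attainable score (percent of full credit) for a late submission."""
    if days_late <= 0:
        return 100  # on time: no penalty
    # Assumption: a partial day counts as a full day late.
    whole_days = math.ceil(days_late)
    # Assumption: "after 7 days" means more than 7 days late.
    if whole_days > 7:
        return 0
    return 100 - 10 * whole_days  # lose 10 percentage points per day late


# Print the cap for a few lateness values.
for d in (0, 1, 3, 7, 8):
    print(f"{d} day(s) late: max {max_score_percent(d)}%")
```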

Schedule

Course Roadmap

| Part | Topic | Approximate Timing | QSS Chapters |
| --- | --- | --- | --- |
| I | R and Data Skills | Weeks 1–2 | Chapter 1 |
| II | Statistical Foundations | Weeks 3–4 | Chapters 3, 5–6 |
| III | Inference and Regression | Weeks 5–7 | Chapters 4, 7 |
| | Spring Break (Mar 14–22) | | |
| IV | Prediction and Machine Learning | Weeks 8–10 | Chapter 4 |
| V | Causal Inference | Weeks 11–13 | Chapter 2 |

Schedule

Topics for future weeks will be posted as we progress through the course.

| Dates | Topic | Reading | Slides | R Script | Assignment |
| --- | --- | --- | --- | --- | --- |
| Part I: R and Data Skills | | | | | |
| Jan 27, 29 | Introduction to R | QSS 1.1–1.4 | Tue, Thu | R Script | |
| Feb 3, 5 | Data Visualization; Descriptive Statistics | QSS 1.3, 3.1–3.3 | Thu | R Script | |
| Part II: Statistical Foundations | | | | | |
| Feb 10, 12 | Text as Data; Covariance and Correlation | Card et al. (PNAS 2022); QSS 3.5–3.6 | Tue, Thu | | PS 1 (due Thu Feb 12) |
| Feb 17, 19 | Sampling and Uncertainty; When Data Lies | QSS 3.1–3.6; LaCour & Green (2014, retracted); Broockman, Kalla & Aronow (2015); Broockman & Kalla (2016) | Tue, Thu | | |
| Part III: Inference and Regression | | | | | |
| Feb 24, 26 | Hypothesis Testing: p-values, t-statistics, and Standard Errors | QSS 6.1–6.3, 7.1–7.2 | Slides | | |
| Mar 3, 5 | Bivariate Regression | QSS 4.2–4.3 | | | PS 2 (Thu Mar 5). Data: gay.csv, gayreshaped.csv, ccap2012.csv |
| Mar 10, 12 | Multivariate Regression and Review | | | | Exam 1 (Thu Mar 12) |
| Mar 14–22 | Spring Recess (No Classes) | | | | |
| Part IV: Prediction and Machine Learning | | | | | |
| Mar 24, 26 | Prediction in Social Science: Overfitting and Underfitting | QSS 4.1–4.2 | | | Proposal (Thu Mar 26) |
| Mar 31, Apr 2 | Regularization: LASSO, Ridge, and the Bias-Variance Tradeoff | Supplemental | | | PS 3 (Thu Apr 2) |
| Apr 7, 9 | Forecasting and the Bridge to Causation | Supplemental | | | Draft Analyses (Thu Apr 9) |
| Part V: Causal Inference | | | | | |
| Apr 14, 16 | Experiments, Omitted Variable Bias, and Instrumental Variables | QSS 2.1–2.6 | | | PS 4 (Thu Apr 16) |
| Apr 21, 23 | Difference-in-Differences and Review | | | | Exam 2 (Thu Apr 23) |
| Apr 28 | Project Presentations | | | | Presentations |
| May 11 | Final Report Due (Exam Period) | | | | |

Sections

Weekly sections provide hands-on practice with R and reinforce concepts from lecture. Attendance is expected.

| Week | Topic | Slides | R Script |
| --- | --- | --- | --- |
| Feb 4 | Git Setup, Installing R, IPUMS Data Download | | |
| Feb 11 | Basic Statistics, Making Figures, Quarto | Slides | R Script |
| Feb 18 | Correlation and Sampling | Slides | R Script |
| Feb 25 | Detecting Fraud and Testing Hypotheses | Slides | |

Final Project

You will select a dataset and research question, then conduct an independent data analysis applying methods from the course. Projects may be completed individually or in groups of up to three.

| Milestone | Description | Due |
| --- | --- | --- |
| Proposal | Short project proposal with evidence of dataset or data collection plan | Mar 26 |
| Draft Analyses | Preliminary analysis and at least one visualization | Apr 9 |
| Presentations | Brief in-class presentation of findings | Apr 28 |
| Final Report | Polished report with all components | May 11 |

Course Policies

Technology Policy

  • No phones in class (put them away, turned off)
  • No laptops for taking notes
  • Tablets with stylus are permitted, as is analog note-taking
  • Laptops may be used when we code, but otherwise should be put away
  • Accommodations are available for students who need them

AI Policy

See the AI Policy page on Canvas. The goal of this course is for you to learn to think with data. Using AI to generate answers defeats that purpose and will leave you unprepared for exams, which are completed in class without AI assistance.

Submission and Grading

All assignments and exams will be submitted through Gradescope, accessed directly through Canvas. Regrade requests must include a clear explanation and be submitted through Gradescope within one week of grades being posted.

Getting Help

If you're struggling, please reach out early. Come to office hours, attend TF sections, and use the course discussion board. Learning statistics is hard—confusion is normal and expected. What matters most to me is what you actually learn.

Resources