Master’s level. University of Copenhagen. Department of Political Science. 2025
Here you can find the slides and other material used for the different lectures.
Learning the basics
We will spend our first weeks familiarizing ourselves with R and linear models (OLS).
Week 1: Introduction
Hermansen 2023, ch. 1-4, p. 19-70
We start by familiarizing ourselves with statistical models for situations in which the main assumptions underlying linear models (OLS) do not hold. We then move on to R.
Slides:
The best way to follow the class is to code along. You can find info on how to install R (the statistical software) and RStudio (the interface) here or here.
Week 2: Descriptive statistics and graphical display
Hermansen 2023, ch. 5-6, p. 73-119
You have had time to familiarize yourselves with base R, functions and objects. Let’s piece this together and explore two new dialects: ggplot2 for graphical display and tidyverse for data manipulation/recoding.
Notebooks:
- Data manipulation: dialects and tidyverse pipes
- Descriptive statistics: numeric summaries and visuals using ggplot2
If you haven’t installed the RiPraksis package, you may fetch the data here: kap6.rda.
Week 3: Linear models and non-linear effects
Hermansen 2023, ch. 7-9, p. 123-194; Gelman and Hill 2007, ch. 3-4, p. 29-79; Berry, Golder and Milton 2012, p. 653-671; King, Tomz and Wittenberg 2000
We will spend the week familiarizing ourselves with non-linear effects in linear models (interaction effects) and how to interpret model results.
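As a minimal sketch of what an interaction effect means in a linear model, consider an outcome y with two predictors x and z. The symbols are illustrative notation, not taken from the course material:

```latex
\begin{align}
y_i &= \beta_0 + \beta_1 x_i + \beta_2 z_i + \beta_3 x_i z_i + \varepsilon_i \\
\frac{\partial y_i}{\partial x_i} &= \beta_1 + \beta_3 z_i
\end{align}
```

The second line shows why the coefficients cannot be read in isolation: the effect of x depends on the value of z, and β₁ on its own is the effect of x only when z = 0.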
Slides:
R-notebook:
Problem set:
The data we will be working on are a subset of the replication data for “Blurred Lines between electoral and parliamentary representation: The use of constituency staff among Members of the European Parliament”, European Union Politics (2023).
You can download the data here: MEP2014.rda
When data is structured
Week 4-5: Hierarchical/multilevel models
Gelman and Hill (2007), ch 11-13, p. 235-299
We start by going through the assumptions of the linear model in order to transition to instances where observations share common characteristics (they are not i.i.d.). We then go through the opportunities offered by hierarchical models: varying intercepts, varying slopes, 2-level regression and how these models pool information.
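A minimal sketch of a varying-intercept, varying-slope model with one group-level predictor, loosely following the notation style of Gelman and Hill (2007). Here i indexes observations, j[i] is the group of observation i, x is an individual-level predictor and u a group-level predictor; all symbols are illustrative assumptions, and the correlation between intercept and slope errors is left out for simplicity:

```latex
\begin{align}
y_i &= \alpha_{j[i]} + \beta_{j[i]} x_i + \varepsilon_i, & \varepsilon_i &\sim \mathrm{N}(0, \sigma_y^2) \\
\alpha_j &= \gamma_0^{\alpha} + \gamma_1^{\alpha} u_j + \eta_j^{\alpha}, & \eta_j^{\alpha} &\sim \mathrm{N}(0, \sigma_\alpha^2) \\
\beta_j &= \gamma_0^{\beta} + \gamma_1^{\beta} u_j + \eta_j^{\beta}, & \eta_j^{\beta} &\sim \mathrm{N}(0, \sigma_\beta^2)
\end{align}
```

The group-level variances govern how much information is pooled: as σ_α approaches 0 the intercepts collapse to a single common value (complete pooling), and as σ_α grows large each group is estimated on its own (no pooling).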
Slides:
- Day 1: Assumptions of the linear model and grouped variation
- Day 2-3: Overview of hierarchical models
R-notebook:
Problem set:
- Day 3: Problem set
- Day 4: Problem set
You can download the data I use to exemplify linear assumptions (MEP2014.rda) and hierarchical structures (MEP.rda) here.
Complete syllabus
Please familiarize yourself with the syllabus.
Course plan
| Week | Topic | Date | Reading |
|---|---|---|---|
| 1 | Introduction to R as statistical software | 03.02; 05.02 | Hermansen 2023, ch. 1-4, p. 19-70 |
| 2 | Descriptive statistics and graphical display | 10.02; 12.02 | Hermansen 2023, ch. 5-6, p. 73-119 |
| 3 | Linear regression | 17.02; 19.02 | Hermansen 2023, ch. 7-9, p. 123-194 |
| | | | Gelman and Hill 2007, ch. 3-4, p. 29-79 |
| | | | Berry, Golder and Milton 2012 |
| | | | King, Tomz and Wittenberg 2000 |
| 4-5 | Hierarchical data structures | 24.02; 26.02 | Gelman and Hill 2007, ch. 11-12, p. 235-278 |
| | | 03.03; 05.03 | Gelman and Hill 2007, ch. 13, p. 279-300 |
| | | | Gelman and Hill 2007, ch. 14-15, p. 301-342 |
| 6 | Binary outcomes (logistic regression) | 10.03; 12.03 | Ward and Ahlquist 2018, ch. 3, p. 43-78 |
| | | | Ward and Ahlquist 2018, ch. 6, p. 119-132 |
| | | | Gelman and Hill 2007, ch. 6, p. 109-134 (supplementary reading) |
| 7-8 | Categorical outcomes (multinomial and ordered logistic regression) | 17.03; 19.03; 24.03; 26.03 | Ward and Ahlquist 2018, ch. 8-9, p. 141-189 |
| | Assignment 1 is given | 26.03 | |
| 9 | Workshop week | 31.03; 02.04 | Assignment 1 presentation, assignment helpdesk, dynamic reporting |
| 10-11 | Count outcomes (Poisson, negative binomial and hurdle models) | 07.04; 09.04 | Linear Digressions: podcast on the Poisson distribution |
| | | | Ward and Ahlquist 2018, ch. 10, p. 190-216 |
| | | | Gelman and Hill 2007, ch. 6, p. 109-134 (supplementary reading) |
| | Assignment 1 due (optional) | 11.04 | |
| | Spring break | | |
| 11-12 | Event history data (survival models) | 23.04; 28.04; 30.04 | Ward and Ahlquist 2018, ch. 11, p. 190-216 |
| | Assignment 2 is given | 30.04 | |
| 13 | Workshop week | 05.05; 07.05 | Practitioner visit (Epinion); helpdesk postponed (subject to change) |
| 14 | Missing data | 12.05; 14.05 | Ward and Ahlquist 2018, ch. 12, p. 249-270 |
| | | | Gelman and Hill 2007, ch. 25, p. 529-545 |
| | Assignment 2 is due | 16.05 | |
| 15 | Recap | 19.05 | |
| | Deadline portfolio exam | 01.06 | |
Literature
Berry, William D., Matt Golder, and Daniel Milton. 2012. “Improving Tests of Theories Positing Interaction.” The Journal of Politics 74 (3): 653–71.
Gelman, Andrew, and Jennifer Hill. 2007. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge: Cambridge University Press.
Hermansen, Silje Synnøve Lyder. 2023. R i praksis - en introduktion for samfundsvidenskaberne. 1st ed. Copenhagen: DJØF Forlag.
King, Gary, Michael Tomz, and Jason Wittenberg. 2000. “Making the Most of Statistical Analyses: Improving Interpretation and Presentation.” American Journal of Political Science 44 (2): 347–61.
Ward, Michael D., and John S. Ahlquist. 2018. Maximum Likelihood for Social Science: Strategies for Analysis. Analytical Methods for Social Research. Cambridge: Cambridge University Press.