Statistical models beyond linear regression (2026)

2026-02-02

Master’s level. University of Copenhagen. Department of Political Science. 2026

Here you can find the slides and other material used for the different lectures.

Learning the basics

We will spend our first weeks familiarizing ourselves with R and linear models (OLS).

Week 1: Introduction

Hermansen 2023, ch. 1-4, p. 19-70

We start by familiarizing ourselves with statistical models for situations where the main assumptions underlying linear models (OLS) are not met. We then move on to R.

Slides:

Day 1: Introduction to models beyond linear regressions
Day 2: Introduction to R

The best way to follow the class is to code along. You can find info on how to install R (the statistical software) and RStudio (the interface) here or here.

Week 2: Descriptive statistics and graphical display

Hermansen 2023, ch. 5-6, p. 73-119

You have had time to familiarize yourself with base R, functions and objects. Let’s piece this together and explore two new dialects: ggplot2 for graphical display and tidyverse for data manipulation/recoding.

Notebooks:

If you haven’t installed the RiPraksis package, you may fetch the data frame directly here: kap6.rda.

Week 3: Linear models and non-linear effects

Hermansen 2023, ch. 7-9, p. 123-194; Gelman and Hill 2007, ch. 3-4, p. 29-79; Berry, Golder and Milton 2012, p. 653-671; King, Tomz and Wittenberg 2000

We will spend the week familiarizing ourselves with non-linear effects in linear models (interaction effects) and with how to interpret model results.
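
As a small worked illustration of why interaction effects need care when interpreted, consider a linear model with two predictors and their product term. The symbols below (x, z and the β coefficients) are assumed notation for this sketch and are not taken from the course material.

```latex
\begin{align}
y_i &= \beta_0 + \beta_1 x_i + \beta_2 z_i + \beta_3 x_i z_i + \varepsilon_i \\
\frac{\partial\, \mathrm{E}[y_i \mid x_i, z_i]}{\partial x_i} &= \beta_1 + \beta_3 z_i \\
\mathrm{Var}\!\left(\hat\beta_1 + \hat\beta_3 z\right) &= \mathrm{Var}(\hat\beta_1) + z^2\, \mathrm{Var}(\hat\beta_3) + 2z\, \mathrm{Cov}(\hat\beta_1, \hat\beta_3)
\end{align}
```

The second line shows that the effect of x is not a single number: it changes with z, and β1 alone is only the effect of x when z equals zero. The third line shows that the uncertainty around that effect also depends on z, which is why it is usually reported across a range of z values rather than read off the regression table.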

Slides:

Slides: interpretation and uncertainty

R-notebook:

Problem set:

Problem set: interpretation and uncertainty

The data we will be working on are a subset of the replication data for “Blurred Lines between electoral and parliamentary representation: The use of constituency staff among Members of the European Parliament”, European Union Politics (2023).

You can download the data here: MEP2014.rda

When data is structured

Week 4-5: Hierarchical/multilevel models

Gelman and Hill (2007), ch 11-13, p. 235-299

We start by going through the assumptions of the linear model in order to transition to instances where observations share common characteristics (they are not i.i.d.). We then go through the opportunities offered by hierarchical models: varying intercepts, varying slopes, 2-level regression and how these models pool information.
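
The model variants named above can be written compactly. This is a sketch in the style of the notation used by Gelman and Hill (2007), where j[i] denotes the group that observation i belongs to; the exact symbols are an assumption of this illustration and may differ from the slides.

```latex
\begin{align}
\text{Varying intercepts:}\quad y_i &= \alpha_{j[i]} + \beta x_i + \varepsilon_i, & \alpha_j &\sim \mathrm{N}(\mu_\alpha, \sigma_\alpha^2) \\
\text{Varying intercepts and slopes:}\quad y_i &= \alpha_{j[i]} + \beta_{j[i]} x_i + \varepsilon_i, & \varepsilon_i &\sim \mathrm{N}(0, \sigma_y^2) \\
\text{Partial pooling (no predictors):}\quad \hat\alpha_j &\approx \frac{\frac{n_j}{\sigma_y^2}\,\bar y_j + \frac{1}{\sigma_\alpha^2}\,\bar y_{\text{all}}}{\frac{n_j}{\sigma_y^2} + \frac{1}{\sigma_\alpha^2}}
\end{align}
```

The last line illustrates how these models pool information: the estimate for group j is a weighted average of the group's own mean and the overall mean. Groups with few observations (small n_j) are pulled towards the overall mean, while large groups stay close to their own mean.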

Slides:

Day 1: Assumptions of the linear model and grouped variation
Day 3: Overview of hierarchical models

R-notebook:

Day 1-4: R workflow for hierarchical models

Problem set:

Day 2: Problem set: fixed effects and Simpson's paradox
Day 4: Problem set

You can download the data I use to exemplify linear assumptions (MEP2014.rda) and hierarchical structures (MEP.rda) here.

When outcomes are discrete

Week 6: Binary outcomes

Ward (2018), ch. 3, p. 43-78; ch. 6, p. 119-132

We will be working on how to model binary outcomes using a logit model. Our focus is on two different ways of understanding the logit model: as a regression on a latent variable or as a regression on a recoded dependent variable (the log-odds).
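
The two readings of the logit model can be set side by side. The notation below (y* for the latent variable, p for the probability of a positive outcome) is assumed for this sketch rather than taken from the slides.

```latex
\begin{align}
\text{Latent variable:}\quad y_i^* &= x_i \beta + \varepsilon_i, \quad \varepsilon_i \sim \text{Logistic}(0, 1), \quad y_i = \begin{cases} 1 & \text{if } y_i^* > 0 \\ 0 & \text{otherwise} \end{cases} \\
\text{Log-odds:}\quad \log\frac{p_i}{1 - p_i} &= x_i \beta, \quad p_i = \Pr(y_i = 1 \mid x_i) \\
\text{Both imply:}\quad p_i &= \frac{1}{1 + e^{-x_i \beta}}
\end{align}
```

For example, a linear predictor of 0 corresponds to a probability of 0.5, and a linear predictor of 1 corresponds to a probability of about 0.73. The coefficients are linear on the log-odds scale but not on the probability scale.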

The data I will use as an example are from “Shaping the Bench: The Effect of Ideology and Performance on Judicial Reappointments” (forthcoming). You can find the preprint here.

You can download the data here: Reappointments.rda

Slides:

Day 1-2: Binomial logistic regression: estimation and interpretation

Problem set:

Problem set: Intuitions from the binomial logistic model

R-notebook:

Day 2: R workflow for binomial logistic models

Week 7-8: Categorical outcomes

Ward 2018, ch. 8-9, p. 141-189

Week 10: Count outcomes

Ward 2018, ch. 10, p. 190-216

Week 11 and 12: Duration outcome

Ward (2018), ch. 11, p. 190-216

Week 14: Missing data

Our data frequently contain missing observations. This week, we take the time to explore what kind of missing observations we have and whether they may bias our results. It is useful to decide actively how to address missing information rather than rely on software defaults. We will be using the MEP.rda data on the European Parliament.

Complete syllabus

Please familiarize yourself with the syllabus.

Course plan

Week	Topic	Dates	Reading
1	Introduction to R as a statistics software	02.02-05.02	@Hermansen2023, ch. 1-4, p. 19-70
2	Descriptive statistics and graphical display	09.02-12.02	@Hermansen2023, ch. 5-6, p. 73-119
3	Linear regression	16.02-19.02	@Hermansen2023, ch. 7-9, p. 123-194
			@Gelman2007, ch 3-4, p. 29-79
			@Berry2012
			@King2000

4-5	Hierarchical data structures	23.02-05.03	@Gelman2007, ch 11-12, p. 235-278
			@Gelman2007, ch 13, p. 279-300
			@Gelman2007, ch 14-15, p. 301-342

6-7	Binary outcomes (logistic regression)	09.03-19.03	@Ward2018, ch. 3, p. 43-78
			@Ward2018, ch. 6, p. 119-132
			@Gelman2007, ch. 6, p. 109-134 (supplementary reading)

	Assignment 1 is given	19.03

8	Workshop week	23.03-26.03	Assignment 1 presentation, Assignment helpdesk

	Assignment 1 due (optional)

	Spring break


9-10	Categorical outcomes (multinomial and ordered logistic regression)	09.04-23.04	@Ward2018, ch. 8-9, p. 141-189

11-12	Count outcomes (Poisson, negative binomial and hurdle models)	27.04-07.05	Linear Digressions: podcast on the Poisson distribution
			@Ward2018, ch. 10, p. 190-216
			@Gelman2007, ch. 6, p. 109-134 (supplementary reading)


13	Missing data	11.05-14.05	@Ward2018, ch 12, p. 249-270
			@Gelman2007, ch 25, p. 529-545

	Assignment 2 is given	14.05

14	Workshop week / Recap	18.05-21.05	helpdesk; SUBJECT TO CHANGE

	Deadline portfolio exam	01.06

Literature

Berry, William D., Matt Golder, and Daniel Milton. 2012. “Improving Tests of Theories Positing Interaction.” The Journal of Politics 74 (3): 653–71.

Gelman, Andrew, and Jennifer Hill. 2007. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge: Cambridge University Press.

Hermansen, Silje Synnøve Lyder. 2023. R i praksis - en introduktion for samfundsvidenskaberne. 1st ed. Copenhagen: DJØF Forlag.

King, Gary, Michael Tomz, and Jason Wittenberg. 2000. “Making the Most of Statistical Analyses: Improving Interpretation and Presentation.” American Journal of Political Science 44 (2): 347–61.

Ward, Michael D., and John S. Ahlquist. 2018. Maximum Likelihood for Social Science: Strategies for Analysis. Analytical Methods for Social Research. Cambridge: Cambridge University Press.

Silje Synnøve Lyder Hermansen

Assistant Professor

Silje’s research concerns democratic representation in courts and parliaments. She also teaches various courses in research methods and comparative politics.
