Master’s level. University of Copenhagen. Department of Political Science. 2023
Here you can find the slides and other material used for the different lectures.
Learning R
We will spend our first three weeks familiarizing ourselves with R.
Week 1 (theory)
Hermansen (2023), ch. 1-4, p. 19-70
We start by familiarizing ourselves with statistical models for situations where the main assumptions underlying linear models (OLS) do not hold. We then move on to R.
Slides:
Week 2 (lab week)
Hermansen (2023), ch. 5-6, p. 73-119
You have had time to familiarize yourselves with base R, functions and objects. Let’s piece this together and explore two new dialects: ggplot2 for plotting and tidyverse for data manipulation.
Notebooks:
If you haven’t installed the RiPraksis package, you may fetch the data here: kap6.rda
Week 3 (lab week)
Hermansen (2023), ch. 7-9, p. 123-194; Gelman and Hill (2007), ch. 3-4, p. 29-79
We will be working on linear models (OLS) in R. Our focus is on interpretation. Using the R code and concepts from last week, we will make visual and textual interpretations of different model results.
The data we will be working on are a subset of the replication data for “Blurred Lines between electoral and parliamentary representation: The use of constituency staff among Members of the European Parliament”, European Union Politics (2023).
You can download the data here: MEP2014.rda
Notebook:
Week 4 (theory week): Binary outcomes
Ward (2018), ch. 3, p. 43-78; ch. 6, p. 119-132
We will be working on how to model binary outcomes using a logit model. Our focus is on two different ways of understanding the logit model: as a regression on a latent variable or as a regression on a recoded dependent variable (the log-odds).
The data I will use as an example are a subset of the replication data for “Blurred Lines between electoral and parliamentary representation: The use of constituency staff among Members of the European Parliament”, European Union Politics (2023).
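The two readings of the logit model can be written out side by side. This is a minimal sketch in generic notation; the symbols $y_i^{*}$, $\mathbf{x}_i$, $\boldsymbol{\beta}$, $\varepsilon_i$ and $p_i$ are assumptions chosen for illustration and are not taken from the course slides.

```latex
\begin{align}
% Latent-variable reading: an unobserved continuous outcome with logistic errors
y_i^{*} &= \mathbf{x}_i^{\top}\boldsymbol{\beta} + \varepsilon_i,
  \qquad \varepsilon_i \sim \text{Logistic}(0, 1) \\
% We only observe whether the latent outcome crosses zero
y_i &= \begin{cases} 1 & \text{if } y_i^{*} > 0 \\ 0 & \text{otherwise} \end{cases} \\
% Implied probability of observing a 1
p_i \equiv \Pr(y_i = 1) &= \Pr\!\left(\varepsilon_i > -\mathbf{x}_i^{\top}\boldsymbol{\beta}\right)
  = \frac{1}{1 + e^{-\mathbf{x}_i^{\top}\boldsymbol{\beta}}} \\
% Log-odds reading: the same model is linear in the recoded outcome
\log\frac{p_i}{1 - p_i} &= \mathbf{x}_i^{\top}\boldsymbol{\beta}
\end{align}
```

Both readings imply the same probability $p_i$. A one-unit increase in a predictor shifts the log-odds by its coefficient, whereas the corresponding change in probability depends on where on the curve the observation starts.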
You can download the data here: MEP2016.rda
Slides:
Week 5 (R week)
Ward (2018), ch. 3, p. 43-78; ch. 6, p. 119-132.
Suggested supplementary reading: Hermansen (2023), ch. 10; Gelman and Hill (2007), ch. 5
We will continue our work on binary outcomes using the logit model. The notebook covers R code for model interpretation, but our focus will be on model evaluation. Be prepared to share your answers to the problem set.
Notebook:
Week 6 (theory week): Categorical and ordered outcomes
Ward (2018), ch. 8-9, p. 141-189
Slides/notebook:
Week 7 (theory week): Event count outcomes
Ward (2018), ch. 10, p. 190-216
Suggested supplementary reading: Gelman and Hill (2007), ch. 6.2, p. 110-116
Slides/notebook:
Week 8 (R week): Event count outcomes
Ward (2018), ch. 10, p. 190-216
Suggested supplementary reading: Gelman and Hill (2007), ch. 6.2, p. 110-116
We’ll use yet another data set on MEPs, this time on the number of legislative proposals they handle during their tenure. You can download the data here: df_yoshinaka.rda
Slides/notebook:
Week 9 (theory week): Event history outcomes
Ward (2018), ch. 11, p. 190-216
Slides/notebook:
Week 10 (R week): Event history outcomes
Ward (2018), ch. 11, p. 190-216
We will be working with a classic duration-model dataset from Box-Steffensmeier’s 1996 study of candidates’ campaign funding and the entry of challengers. You can find the R version of the data for the notebook here: warchest.rda
Slides/notebook:
Week 11 (theory week): Hierarchical models
Gelman and Hill (2007), ch. 11-12, p. 235-278
You can find the data for the slides here: MEP.rda
Slides/notebook:
Week 12 (R week): Hierarchical models
Gelman and Hill (2007), ch 11-12, p. 235-278
You can find the data for the slides here: MEP.rda
Slides/notebook:
Week 13 (theory week): Missing data
Ward (2018), ch. 12, p. 249-270; Gelman and Hill (2007), ch. 25, p. 529-545
You can find the data for all our activities here: MEP.rda
Slides/notebook:
Week 14 (R week): Missing data
Ward (2018), ch. 12, p. 249-270; Gelman and Hill (2007), ch. 25, p. 529-545
You can find the data for all our activities here: MEP.rda
Slides/notebook:
Complete syllabus
Please familiarize yourself with the syllabus before our first meeting.
Course plan
Week | Topic | Date | Reading |
---|---|---|---|
1 | Introduction to R as statistical software | 09.02 | Hermansen (2023), ch. 1-4, p. 19-70 |
2 | Descriptive statistics and graphical display | 16.02 | Hermansen (2023), ch. 5-6, p. 73-119 |
3 | Linear regression | 23.02 | Hermansen (2023), ch. 7-9, p. 123-194; Gelman and Hill (2007), ch. 3-4, p. 29-79 |
4-5 | Binary outcomes (logistic regression) | 02.03; 09.03 | Ward (2018), ch. 3, p. 43-78; Ward (2018), ch. 6, p. 119-132 |
6-7 | Categorical outcomes (multinomial and ordered logistic regression) | 09.03; 16.03 | Ward (2018), ch. 8-9, p. 141-189 |
8-9 | Count outcomes (Poisson, negative binomial and hurdle models) | 23.03; 30.03 | Linear Digressions: episode on the Poisson distribution; Ward (2018), ch. 10, p. 190-216 |
10-11 | Event history data (survival models) | 13.04; 20.04 | Ward (2018), ch. 11, p. 190-216 |
12-13 | Hierarchical data structures | 27.04; 04.05 | Gelman and Hill (2007), ch. 11-12, p. 235-278; Gelman and Hill (2007), ch. 13, p. 279-300; Gelman and Hill (2007), ch. 14-15, p. 301-342 |
14-15 | Missing data | 11.05; 16.05 | Ward (2018), ch. 12, p. 249-270; Gelman and Hill (2007), ch. 25, p. 529-545 |
Literature
Gelman, Andrew, and Jennifer Hill. 2007. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge: Cambridge University Press.
Hermansen, Silje Synnøve Lyder. 2023. R i praksis - en introduktion for samfundsvidenskaberne. 1st ed. Copenhagen: DJØF Forlag.
Ward, Michael D., and John S. Ahlquist. 2018. Maximum Likelihood for Social Science: Strategies for Analysis. Analytical Methods for Social Research. Cambridge: Cambridge University Press.