We can think of hierarchical/multilevel models as a way to partition and leverage variation at different levels.

We will work with cross-sectional time-series data on MEPs in the 2012-2017 period.

#libraries
library(lme4) #hierarchical models
library(stargazer) #regression tables
library(dplyr) #data wrangling + pipes
library(ggplot2) #graphics

#download
download.file(url = "https://siljehermansen.github.io/teaching/beyond-linear-models/MEP.rda",
              destfile = "MEP.rda")

#Load in
load("MEP.rda")

#Rename the data for convenience
df <- MEP

We will work on an unbalanced panel data set (the same as in the R notebook). Each Member of the European Parliament (MEP, identified by ID) is observed every six months in the 2012-2017 period.

This time, we’re interested in whether MEPs use their parliamentary resources for electoral purposes. There are two empirical implications of this:

First, do MEPs with higher incentives to cultivate a personal vote have larger teams of local assistants? I measure this in two ways: At the national level (NationalPartyCentered) and at the European level (OpenList).
Second, do they gear up by hiring more assistants before elections? I also measure this at both levels (ProxNatElection and EPElection).

Our dependent variable is the local staff size of each MEP: ShareOfLocalAssistants.

Exercise 1: Descriptive statistics

Let’s start by exploring the data structure.

What is the size of the data? How many individual MEPs are there in the data set? How often are they observed?
What is the relevant variation here? How are the variables measured? What is the within-group and between-group variation?
How could I model this? What would be the advantages and drawbacks? I.e. what variation would I leverage?
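
As a starting point for these questions, here is a minimal sketch of how the panel structure and the two sources of variation could be inspected. It runs on a small simulated panel, not on the MEP data: the `toy` data frame, its `Period` column and all simulated values are assumptions made for illustration. Only the column names ID, OpenList, ProxNatElection and ShareOfLocalAssistants are borrowed from the problem set. Replacing `toy` with `df` should give the corresponding summaries for the real data, provided the columns exist under these names.

```r
library(dplyr)

set.seed(1)
# Hypothetical toy panel (NOT the MEP data): 40 MEPs, up to 10 half-year periods
toy <- expand.grid(ID = 1:40, Period = 1:10)
u <- rnorm(40, sd = 0.15)                      # MEP-specific baseline
toy$OpenList <- as.numeric(toy$ID %% 2 == 0)   # constant within each MEP
toy$ProxNatElection <- runif(nrow(toy))        # changes from period to period
toy$ShareOfLocalAssistants <-
  pmin(pmax(0.5 + u[toy$ID] + rnorm(nrow(toy), sd = 0.1), 0), 1)
toy <- toy[sample(nrow(toy), 350), ]           # drop rows: unbalanced panel

# Size of the data, number of MEPs, and observations per MEP
dim(toy)
n_distinct(toy$ID)
obs_per_mep <- count(toy, ID)
table(obs_per_mep$n)

# Between-MEP variation: spread of the MEP means.
# Within-MEP variation: spread around each MEP's own mean.
by_mep <- toy %>%
  group_by(ID) %>%
  summarise(mean_y = mean(ShareOfLocalAssistants),
            sd_y = sd(ShareOfLocalAssistants),
            sd_openlist = sd(OpenList),
            sd_prox = sd(ProxNatElection))

between_sd <- sd(by_mep$mean_y)
within_sd <- mean(by_mep$sd_y, na.rm = TRUE)
c(between = between_sd, within = within_sd)

# A covariate with zero within-MEP spread has no variation left
# once MEP fixed effects are included
summary(by_mep$sd_openlist)
```

Comparing the within-MEP standard deviation of each covariate with its overall spread is one quick way to see which variables a fixed-effects model can and cannot use.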

Exercise 2: Choice of covariates

I want to model the change in staff size as a function of the electoral calendar. How can I do this?
I’m considering a fixed-effects model at the individual level. Please advise me on the following covariates:
- electoral calendar
- electoral system
- labor cost (LaborCost)
- gender (Female)
- age (Age)
- lag of the dependent variable (ShareOfLocal.lag)
Now, I’m considering a fixed effect on the time period. Would this be a good idea? What would happen to the effects of my two election variables (EPElection; ProxNatElection)?
How do you think the random-effects model would perform here?

Exercise 3: Fit and interpret the model

Fit the following model as a pooled linear regression, a fixed-effects model and a random-effects model (with varying individual intercepts): ShareOfLocalAssistants ~ ProxNatElection + NationalCandidateCentered + EPElection + OpenList + ShareOfLocal.lag. Present the results in stargazer().
What do you find? What happened?
Discuss my choice of variables.
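
The three specifications can be sketched as follows. This is an illustration on simulated data, not a solution: the `toy` panel and its values are hypothetical, and the formula is shortened to one time-varying covariate (ProxNatElection) and one time-invariant covariate (OpenList) so that the contrast between the models is easy to see. The fixed-effects model is written here as an `lm()` with MEP dummies, which is one of several ways to fit it.

```r
library(lme4)
library(stargazer)

set.seed(2)
# Hypothetical toy panel (NOT the MEP data): 40 MEPs, 10 periods each
toy <- expand.grid(ID = 1:40, Period = 1:10)
u <- rnorm(40, sd = 0.15)                      # MEP-specific intercepts
toy$OpenList <- as.numeric(toy$ID %% 2 == 0)   # time-invariant within MEP
toy$ProxNatElection <- runif(nrow(toy))        # time-varying
# Toy outcome (not bounded to the 0-1 range)
toy$ShareOfLocalAssistants <- 0.4 + 0.1 * toy$ProxNatElection +
  0.1 * toy$OpenList + u[toy$ID] + rnorm(nrow(toy), sd = 0.1)

# Pooled model: ignores the grouping of observations within MEPs
mod.pool <- lm(ShareOfLocalAssistants ~ ProxNatElection + OpenList,
               data = toy)

# Fixed effects: one dummy per MEP absorbs all between-MEP variation
mod.fe <- lm(ShareOfLocalAssistants ~ factor(ID) + ProxNatElection + OpenList,
             data = toy)

# Random intercepts: MEP intercepts drawn from a common distribution
mod.re <- lmer(ShareOfLocalAssistants ~ ProxNatElection + OpenList + (1 | ID),
               data = toy)

# The time-invariant covariate is collinear with the MEP dummies
coef(mod.fe)["OpenList"]

# Side-by-side table, hiding the MEP dummies
stargazer(mod.pool, mod.fe, mod.re,
          type = "text",
          omit = "factor\\(ID\\)",
          column.labels = c("Pooled", "Fixed effects", "Random intercepts"))
```

In this sketch the coefficient on OpenList cannot be estimated in the fixed-effects column, while the random-intercept model keeps it. Whether the same pattern appears with the full formula on `df` depends on which covariates vary within MEPs.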

Exercise 4: Interpretation

Interpret the results from the random-effects model.

What is the effect of the two measures of the electoral calendar and the electoral system?

Create two scenarios and interpret either the marginal effect or the first-difference. Justify your choice.
Descriptive statistics from a survey conducted among MEPs show that some 31% claim that they envision staying in Parliament for 10 or more years (i.e. they will seek reelection). How does this change your understanding of the results?
Illustrate the effects. What plot would you opt for?
In your opinion, are the two hypotheses supported? How big are the electoral incentives?
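
One possible way to obtain a first difference with uncertainty from a random-intercept model is to simulate from the sampling distribution of the fixed effects. The sketch below does this on simulated data. The `toy` panel, the scenario values (0 and 1 for ProxNatElection, an open-list system) and the number of draws are all assumptions chosen for illustration, and the real variables may be scaled differently.

```r
library(lme4)
library(MASS)

set.seed(3)
# Hypothetical toy panel (NOT the MEP data): 40 MEPs, 10 periods each
toy <- expand.grid(ID = 1:40, Period = 1:10)
u <- rnorm(40, sd = 0.15)
toy$OpenList <- as.numeric(toy$ID %% 2 == 0)
toy$ProxNatElection <- runif(nrow(toy))
toy$ShareOfLocalAssistants <- 0.4 + 0.1 * toy$ProxNatElection +
  0.1 * toy$OpenList + u[toy$ID] + rnorm(nrow(toy), sd = 0.1)

mod.re <- lmer(ShareOfLocalAssistants ~ ProxNatElection + OpenList + (1 | ID),
               data = toy)

# Two hypothetical scenarios that differ only in ProxNatElection
scen <- data.frame(ProxNatElection = c(0, 1), OpenList = c(1, 1))
X <- model.matrix(~ ProxNatElection + OpenList, data = scen)

# Draw fixed effects from their estimated sampling distribution
draws <- mvrnorm(1000, mu = fixef(mod.re), Sigma = as.matrix(vcov(mod.re)))

# Predicted outcome for a typical MEP (random intercept set to zero)
preds <- draws %*% t(X)          # 1000 draws x 2 scenarios

# First difference: scenario 2 minus scenario 1, draw by draw
fd <- preds[, 2] - preds[, 1]
quantile(fd, c(0.025, 0.5, 0.975))
```

In a linear model without interactions the first difference for a one-unit change equals the marginal effect, so the choice between the two matters mostly for presentation. The simulated differences can be passed to ggplot2 for a density plot or a point with an interval.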

Exercise 5: Interaction effect

I have a third hypothesis: I believe MEPs with higher incentives to cultivate a personal vote are more sensitive to the electoral calendar.

How could I model this?
Can you implement your suggestion using a random-intercept model?
Can you interpret the results following Berry et al.’s recommendation?

What is the effect of:

- national electoral calendar when the electoral system is candidate-centered
- national electoral calendar when the electoral system is party-centered
- national electoral system when elections are far away
- national electoral system when elections are tomorrow
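
The four conditional effects listed above can be computed from an interaction model as sketched here. The example uses simulated data: the `toy` panel, the 0/1 coding of NationalCandidateCentered, the 0-1 range of ProxNatElection and the helper `cond_effect()` are all hypothetical. Because an interaction is symmetric, the sketch reports the effect of each variable at low and high values of the other.

```r
library(lme4)

set.seed(4)
# Hypothetical toy panel (NOT the MEP data): 40 MEPs, 10 periods each
toy <- expand.grid(ID = 1:40, Period = 1:10)
u <- rnorm(40, sd = 0.15)
toy$NationalCandidateCentered <- as.numeric(toy$ID %% 2 == 0)
toy$ProxNatElection <- runif(nrow(toy))
toy$ShareOfLocalAssistants <- 0.4 + 0.05 * toy$ProxNatElection +
  0.05 * toy$NationalCandidateCentered +
  0.10 * toy$ProxNatElection * toy$NationalCandidateCentered +
  u[toy$ID] + rnorm(nrow(toy), sd = 0.1)

# Random-intercept model with an interaction term
mod.int <- lmer(ShareOfLocalAssistants ~
                  ProxNatElection * NationalCandidateCentered + (1 | ID),
                data = toy)

b <- fixef(mod.int)
V <- as.matrix(vcov(mod.int))

# Hypothetical helper: effect of `focal` when the other variable equals `at`.
# Estimate: b_focal + b_int * at
# Variance: Var(b_focal) + at^2 * Var(b_int) + 2 * at * Cov(b_focal, b_int)
cond_effect <- function(focal, at,
                        int = "ProxNatElection:NationalCandidateCentered") {
  est <- unname(b[focal] + b[int] * at)
  se <- sqrt(V[focal, focal] + at^2 * V[int, int] + 2 * at * V[focal, int])
  data.frame(at = at, estimate = est,
             lower = est - 1.96 * se, upper = est + 1.96 * se)
}

# Effect of the electoral calendar in party-centered (0) and
# candidate-centered (1) systems
cond_effect("ProxNatElection", at = c(0, 1))

# Effect of the electoral system at low (0) and high (1) values
# of ProxNatElection
cond_effect("NationalCandidateCentered", at = c(0, 1))
```

Which values of ProxNatElection correspond to elections being far away or imminent depends on how the variable is coded in `df`, so the `at` values should be set after checking its range.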

Problem set 2: Hierarchical modeling

Silje Synnøve Lyder Hermansen

2025-03-05
