Models for nested data: Workflow for hierarchical models

In this notebook, we will go through the workflow for analyzing hierarchical data.

We will familiarize ourselves with:

  • the workflow for structured data analysis
  • descriptive statistics:
    • within-group variation
  • how to estimate hierarchical models using the lme4 package
    • with varying intercepts
    • with varying intercepts and varying slopes
    • cross-level interactions
    • with level-2 variables
  • interpretation

New/repeated R code:

  • dplyr package: lag(), filter(), arrange(), select()
  • ggplot2 package: facet_wrap(), geom_abline()
  • lme4 package: lmer(), ranef(), fixef(), coef()
  • merTools package: REsim(), plotREsim()
  • sjPlot package: plot_model(), plot_models()
  • stargazer package: stargazer(., add.lines = , column.labels = )

We will use the data on Members of the European Parliament (MEPs). This time, the data includes several groupings; it is time-series cross-sectional (unbalanced panel) data in which each MEP is observed every 6 months over a 5-year period. Thus, I have observations of individuals (ID) in up to 10 periods (Period). These individuals also have a nationality (Nationality). In other words, I have at least 3 “groupings” that are relevant for different reasons.

  • accounting for individual groupings allows me to adjust the standard error of my predictors.
  • accounting for nationality could capture potential confounders at the member state level (guarding against Simpson’s paradox)
  • accounting for time may allow me to control away potential periodic changes

I begin by loading in my data. I also rename a few of my variables for ease.

load("MEP.rda")
df <- MEP


#Rename the two main variables
df <-
  df %>%
  mutate(
    #The size of the local staff employed by each MEP
    y = ShareOfLocalAssistants,
    #The share of seats the MEP's national party holds in the national parliament
    x = SeatsNatPal.prop)

I pick my models as part of my research design. That is, I ask myself what the most relevant correlation/variation is, given my theory.

  • in experiments: you can create that variation and randomize the rest (cut out confounders)
  • in observational studies: you’ll have to “hunt” for the variation you want and control away the rest

Workflow

This hunt for variation determines my workflow.

  1. Empirical implications of theory

I start, as any political scientist, by thinking about the data generating process and the causal mechanisms (if that’s what I’m interested in) to identify which variation I want.

My research question this time is whether MEPs compensate for a lack of (national) party funding by hiring local assistants. I don’t have a good measure of party funding, so I instead rely on the size of the MEP’s national party in the national parliament. Often, state funding for parties is determined by the size of the party, although the rules vary across countries.

Ideally, if this hypothesis holds, I would see the same MEP change their hiring practices following national elections that change the party’s seat share in parliament. This is within-MEP (within-group) variation that would hold both national-level rules and individual-level idiosyncrasies constant.

To leverage this variation I would run a varying-intercept model; either as a fixed-effects model (without pooling) or a hierarchical model (with partial pooling).

  2. Feasibility (summary statistics)

My next move is to check if I have enough variation in both my x and my y to actually draw on that type of variation. This means I will spend additional time on the descriptive statistics.

  3. Control for confounders

Last, I’d consider the potential confounders that could bias my results. They can be controlled away either by explicitly observing the variable (including it in the model) or by identifying where the confounders are located; i.e. I’d identify group identities and leverage the within-group variation.

If I draw on within-MEP variation, my biggest threat would come from time-varying events. In this data, there was a European election in 2014 (EPElection) and an administrative reform that capped the local spending implemented in 2016 (and onwards; Reform2016).

  4. Run the regression/estimate your model

The estimation of the model itself is usually less challenging. However, when you have large data and/or many groupings, the estimation may take some time.

  5. Interpretation

The same principles of interpretation apply to hierarchical models as to other models. However, this time there are more parameters in the model, and we are more reliant on careful descriptive statistics for constructing relevant scenarios.

Feasibility: descriptive statistics

Exploring the variables when our data has a hierarchical structure requires a few additional steps. In addition to exploring the distribution of the \(x\) and \(y\) in the pooled (entire) data set, I also explore their distribution within the groups to see whether I have within-group variation.

Univariate statistics

Pooled data

I begin with my usual histograms of x and y. Both my variables of interest are continuous.

df %>%
  ggplot +
  geom_histogram(aes(y)) +
  ggtitle("Pooled distribution of OUTCOME: Size of local staff among MEPs")

df %>%
  ggplot +
  geom_histogram(aes(x)) +
  ggtitle("Pooled distribution of PREDICTOR: Size of national party")+
  xlab("Proportion of seats in national parliament")

Within-group variation: non-pooled analysis

Second, I check the same distribution within my main grouping.

Visual inspection (few groups)

If I have few groups, a visual inspection using a histogram or a barplot of the univariate distribution might be efficient. In ggplot2, the facet_wrap() function allows you to create a grid of plots, one for each group.

df %>%
  ggplot +
  geom_histogram(aes(y)) +
  ggtitle("Size of local staff among MEPs") +
  facet_wrap(~Nationality)

Inspecting the spread on the x-axis, I see that there is at least some (and sometimes a lot of) variation within the member states.

In numbers (few or many groups)

Often, I’ll also be interested in quantifying the variation. This is especially the case when I have many groups, but it also lets me tell the reader how many times a change occurs. In a fixed-effects model – leveraging only within-group variation – the effect of your treatment is driven only by the groups that have experienced a change. In other words, the size of your data set does not reveal the number of observations that you actually study.

Last, the within-group variation would be the most interesting to draw on when you construct scenarios for your interpretation.

A relevant question is whether staff size varies within MEPs. Will I be able to leverage within-individual changes to assess whether MEPs respond to different electoral incentives? Or will I have to rely on cross-sectional variation (between-individuals)?

Size of change I can check the variation by comparing each observation with the previous observation within the group. When the data involves time, I’ll order the data chronologically, group by MEP identity, then calculate the difference between each observation and its lag.

df <-
  df %>%
  #Chronological
  arrange(Period) %>%
  #Within individuals
  group_by(ID) %>%
  #Difference in staff size between y (this observation) and lag of y (previous observation)
  mutate(change_y = y - lag(y),
         #Same for x, the party's seat share
         change_x = x - lag(x)) %>%
  #Ungroup data
  ungroup()

I can now make a frequency table to see the number of times there has been a change (change_x != 0) and the number of times there has not been a change (change_x == 0).

table(df$change_x != 0)
## 
## FALSE  TRUE 
##  5172   623
table(df$change_y != 0)
## 
## FALSE  TRUE 
##  3752  2217

There are more changes in my outcome variable than in my predictor, so a single predictor won’t explain everything that goes on. However, I have plenty of changes, so I can leverage within-group variation.

I can summarize this either as number of changes (count) or size of change (continuous).

Change measured by numbers We can even look at how many times each MEP has experienced a change.

df_change <- 
  df %>%
  #Group by individuals; MEPs have individual ID numbers
  group_by(ID) %>%
  #Sum over the number of changes within each group/individual
  reframe(n_changes_x = sum(change_x != 0, na.rm = T),
          n_changes_y = sum(change_y != 0, na.rm = T))

I can then plot this new data frame as a bar plot.

df_change %>%
  ggplot +
  #Number of occurrences of each n_changes_x
  geom_bar(
    aes(
      n_changes_x
    )
  ) +
  ggtitle("Number of changes in x") +
  xlab("Number of alteration in each MEPs' national party size")

Most of the MEPs that have experienced a change have only experienced it once.

Your turn!

  • can you do the same for the dependent variable?
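For reference, one possible answer is a near copy of the code above, swapping in the outcome variable (a minimal sketch using the n_changes_y column we created):

df_change %>%
  ggplot +
  #Number of occurrences of each n_changes_y
  geom_bar(
    aes(
      n_changes_y
    )
  ) +
  ggtitle("Number of changes in y") +
  xlab("Number of alterations in each MEP's local staff size")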

Change measured by size The size of the change is most useful for constructing scenarios later, because it tells me what the “typical” change might be. Here, I’m mostly interested in the likely size of a change given that there has been a change, so I subset the data to only the observations with a change.

df %>%
  filter(change_x != 0) %>%
  ggplot +
  geom_histogram(aes(change_x )) +
  ggtitle("Typical change in party size after national elections")

#Summary stats in numbers
df %>%
  #Filter out the non-changes
  filter(change_x != 0) %>%
  #I specify which package to draw the select() function from, because several packages have functions with the same name
  dplyr::select(change_x) %>%
  summary()
##     change_x        
##  Min.   :-0.470769  
##  1st Qu.:-0.053869  
##  Median :-0.003333  
##  Mean   :-0.007253  
##  3rd Qu.: 0.052754  
##  Max.   : 0.470769

Sometimes the change in x is positive – the MEP’s party gained more seats in the national parliament – other times it is negative. From the summary statistics, we see that a typical change would be +/- 0.05 (a 5 percentage point change in either direction). This is useful for the scenarios later on, where my interpretation will be “conditional on a national election, the typical change in party size…”.
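If I want that “typical change” as a single number, a minimal sketch is to take the median absolute change among the observations that actually changed (my own shortcut, not part of the original output):

df %>%
  #Keep only the observations where the party's seat share changed
  filter(change_x != 0) %>%
  #The median absolute change is one way to summarize the "typical" shift
  summarize(typical_change = median(abs(change_x), na.rm = TRUE))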

Bivariate statistics

Let’s explore the bivariate relationship in the pooled data first, before we check the within-group correlation. The latter corresponds to a non-pooled analysis; i.e. I analyze the relationship separately for each group.

Pooled data

Here, I plot both a smoothed line and a bivariate linear regression.

df %>%
  ggplot +
  geom_smooth(aes(x = x,
                  y = y))+
  geom_smooth(aes(x = x,
                  y = y),
              method = "lm", 
              lty = 2,
              color = "black") +
  ylab("Size of local staff") +
  xlab("Party size (seat share in national parliament)") +
  ggtitle("Bivariate relationship between party size and local investment")

The loess line indicates first a drop, then an increase, and so on. What is going on? The linear model says I should expect less local investment among MEPs whose parties received large support in the last election; this is in line with the compensation hypothesis.

Non-pooled data

Let’s check how this looks within member states. Member states have different financing laws for parties and vary in labor costs, so we may be tapping into some cross-national variation unrelated to party funding.

With many groups, it is easier to get a sense of the relationship when I don’t plot the uncertainty. The same goes for using linear predictions instead of loess lines (local regression); the larger the data, the longer a loess line takes to calculate.

df %>%
  ggplot +
  geom_smooth(aes(x =x,
                  y = y,
                  color = Nationality),
              se = F, method = "lm") +
  ggtitle("Within-state relationship between party size and local staff size") +
  theme(legend.position = "none")

I can even run a separate regression (draw a separate line) per MEP, then parse them out by nationality.

df %>%
  ggplot +
  geom_smooth(aes(x =x,
                  y = y,
                  group = Lastname),
              color = "black",
              alpha = 0.5, 
              lwd = 0.5,
              se = F, method = "lm") +
  ggtitle("Within-individual relationship between party size and local staff size") +
  #Crop the y-coordinates to zoom in on most of the observations
  coord_cartesian(ylim = c(0,7)) +
  #Remove the legend
  theme(legend.position = "none") + 
  #Create a separate grid per member state
  facet_wrap(~Nationality)

We get a sense that the effect might be negative in general, also within MEPs. However, there are exceptions.

Analysis

Let’s run some models instead. Running the models is usually the part of the research process that takes the least time (although, with large data and many intercepts, R might take a few seconds to arrive at an answer).

Random intercepts

I usually start with a simple pooled analysis with one predictor.

mod.pool <- lm(y ~ x ,
               df)

summary(mod.pool)
## 
## Call:
## lm(formula = y ~ x, data = df)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -2.561 -1.561 -0.519  0.652 40.470 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.56125    0.05743  44.596   <2e-16 ***
## x           -0.44420    0.18926  -2.347    0.019 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.803 on 6946 degrees of freedom
##   (195 observations deleted due to missingness)
## Multiple R-squared:  0.0007924,  Adjusted R-squared:  0.0006486 
## F-statistic: 5.509 on 1 and 6946 DF,  p-value: 0.01895

Already here, I find indications that MEPs hold a smaller local staff when their parties have many seats in the national parliament.

I then add individual intercepts to leverage within-individual variation. To do so, I use the lmer() function from the lme4 package.

mod.id.ran <- lmer(y ~ x + (1|ID),
                   df)

summary(mod.id.ran)
## Linear mixed model fit by REML ['lmerMod']
## Formula: y ~ x + (1 | ID)
##    Data: df
## 
## REML criterion at convergence: 28382.2
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -7.5855 -0.3152 -0.0619  0.2933 25.4528 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  ID       (Intercept) 6.545    2.558   
##  Residual             2.160    1.470   
## Number of obs: 6948, groups:  ID, 1152
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   2.5810     0.1101  23.433
## x            -0.3275     0.3232  -1.013
## 
## Correlation of Fixed Effects:
##   (Intr)
## x -0.707

I get a weaker and insignificant effect. Yet, I leverage the right kind of variation, don’t I?

Between-group variation I can also see from the R output that the spread in the between-MEP distribution is quite high, with a standard deviation of 2.558. It means that the range in staff size among the 95% most “normal” MEPs spans roughly 2.558 × 3.92 ≈ 10 local employees (0 +/- 1.96 standard deviations covers 95% of the between-group distribution).

The “residual variation” is what remains once the between-group variation has been removed. Here, I see that the within-MEP variation is lower (1.47).

We can compare the between- and within-group variation by estimating the intraclass correlation (Gelman and Hill 2007, 259). Numbers closer to 1 indicate that group members are very similar (i.e. each MEP looks like themselves), while numbers closer to 0 mean there is little between-group variation, meaning the model is pooling.

\[\frac{\sigma^2_\alpha}{\sigma^2_\alpha + \sigma^2_y}\] Plugging in the variances from the R output: \[\frac{6.545}{6.545 + 2.160} = 0.752\]

Judging by the intraclass correlation, my groupings make sense; adding information on the personal identity of my observations adds to the estimation of the model.
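Rather than copying numbers from the printout by hand, I can pull the variance components from the model object. A minimal sketch using lme4’s VarCorr():

#Variance components as a data frame (with columns grp, vcov and sdcor)
vc <- as.data.frame(VarCorr(mod.id.ran))

#Intraclass correlation: between-group variance over total variance
vc$vcov[vc$grp == "ID"] / sum(vc$vcov)

#Range covered by +/- 1.96 between-group standard deviations
2 * 1.96 * vc$sdcor[vc$grp == "ID"]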

Fixed effects to leverage within-group variation

I can always go for a fixed-effects model to leverage only within-individual variation (i.e. over time). This is equivalent to running separate regressions for each MEP, then aggregating/averaging the effects. There is no pooling of within- and between-MEP variation, and all observations with within-group variation count equally.

mod.id.fix <- lm(y ~ x + 
                     as.factor(ID),
               df)

stargazer(mod.id.ran, mod.id.fix,
          type = "html",
          #Don't report coefficients from variables containing the characters "as.factor"
          omit = "as.factor")
======================================================================
                                   Dependent variable: y
                          --------------------------------------------
                          linear mixed-effects           OLS
                                  (1)                    (2)
----------------------------------------------------------------------
x                               -0.328                 -0.667
                                (0.323)                (0.467)
Constant                         2.581***               1.731***
                                (0.110)                (0.472)
----------------------------------------------------------------------
Observations                     6,948                  6,948
R2                                                      0.772
Adjusted R2                                             0.727
Log Likelihood               -14,191.110
Akaike Inf. Crit.             28,390.230
Bayesian Inf. Crit.           28,417.610
Residual Std. Error                               1.466 (df = 5795)
F Statistic                                  17.040*** (df = 1152; 5795)
======================================================================
Note: *p<0.1; **p<0.05; ***p<0.01

Nope. The effect is stronger, but still insignificant. If the true within-individual effect is masked, it is masked by something that varies over time (i.e. within groups/within MEPs).

The EP capped local spending in 2016. This induced the MEPs that spent the most locally to hire fewer staffers. As we have theorized, this choice is also connected to party funding, so let’s control for it. A similar argument can be made for the EP election in 2014, since we have also seen that mobilization may vary by electoral system.

Here, I estimate a fixed-effects and a random-effects (both varying-intercept) model with time-dependent covariates.

mod.id.ran.t <- lmer(y ~ x 
                   + EPElection
                   + Reform2016
                   + (1|ID),
               df)


mod.id.fix.t <- lm(y ~ 
                     x +
                     EPElection +
                     Reform2016 +
                     as.factor(ID),
                   df)

stargazer(mod.id.ran.t, mod.id.fix.t,
          type = "html",
          column.labels = c("Random effects", "Fixed effects"),
          omit = "as.factor")
======================================================================
                                   Dependent variable: y
                          --------------------------------------------
                           Random effects           Fixed effects
                          linear mixed-effects           OLS
                                  (1)                    (2)
----------------------------------------------------------------------
x                               -0.650**               -1.218***
                                (0.320)                (0.458)
EPElection                       0.351***               0.340***
                                (0.059)                (0.059)
Reform2016                      -0.693***              -0.696***
                                (0.047)                (0.047)
Constant                         2.756***               1.945***
                                (0.110)                (0.462)
----------------------------------------------------------------------
Observations                     6,948                  6,948
R2                                                      0.782
Adjusted R2                                             0.739
Log Likelihood               -14,063.500
Akaike Inf. Crit.             28,139.010
Bayesian Inf. Crit.           28,180.080
Residual Std. Error                               1.434 (df = 5793)
F Statistic                                  18.010*** (df = 1154; 5793)
======================================================================
Note: *p<0.1; **p<0.05; ***p<0.01

Now, we’re talking! The strictly within-group comparison of the fixed-effects model still gives the strongest estimate. This is because the hierarchical implementation pools the results with between-MEP variation, so the estimate shrinks toward the (between-group) mean. We can address this by adding more controls.

Several groups of random effects

Let’s leverage both within- and between-individual variation (for good measure) using random intercepts. However, I’d like to add yet another group variable: nationality. Nationality controls for other potential confounders. For example, both seats in parliament/party funding and staff size are reasonably linked to nationality: one because of national laws, the other because of labor costs.

It is not a good idea to have this type of nesting in fixed-effects models; the national effects would be non-identified, since variation that is strictly at the national level is subsumed in the MEP fixed effects (i.e. no MEP has two nationalities).

mod.ran <- lmer(y ~ 
                  x
                + Reform2016
                + EPElection
                + (1|ID)
                + (1|Nationality)
                ,
               df)
stargazer(mod.pool,  mod.id.ran.t, mod.id.fix.t, mod.ran,
          omit = "as.fa",
          #Give labels to the columns
          column.labels = c("Pooled", "Random effects", "Fixed effects", "Random effects"),
          #Add lines with additional info: this comes in a list where each vector == the number of columns in the table
          add.lines = list(c("Individual fixed/random effects","no","yes","yes","yes"),
                           c("National fixed/random effects","no","no","no","yes")),
          title = "Random effects (on individual MEPs and their nationality)",
          type = "html")
Random effects (on individual MEPs and their nationality)
==========================================================================================
                                                Dependent variable: y
                                ----------------------------------------------------------
                                  Pooled    Random effects   Fixed effects  Random effects
                                   OLS      linear mixed-         OLS       linear mixed-
                                               effects                         effects
                                   (1)           (2)               (3)           (4)
------------------------------------------------------------------------------------------
x                               -0.444**      -0.650**         -1.218***     -1.489***
                                (0.189)       (0.320)          (0.458)       (0.288)
EPElection                                     0.351***         0.340***      0.351***
                                              (0.059)          (0.059)       (0.059)
Reform2016                                    -0.693***        -0.696***     -0.697***
                                              (0.047)          (0.047)       (0.047)
Constant                         2.561***      2.756***         1.945***      2.776***
                                (0.057)       (0.110)          (0.462)       (0.373)
------------------------------------------------------------------------------------------
Individual fixed/random effects   no            yes              yes           yes
National fixed/random effects     no            no               no            yes
Observations                     6,948         6,948            6,948         6,948
R2                               0.001                          0.782
Adjusted R2                      0.001                          0.739
Log Likelihood                             -14,063.500                     -13,802.100
Akaike Inf. Crit.                           28,139.010                      27,618.210
Bayesian Inf. Crit.                         28,180.080                      27,666.130
Residual Std. Error         2.803 (df = 6946)              1.434 (df = 5793)
F Statistic              5.509** (df = 1; 6946)      18.010*** (df = 1154; 5793)
==========================================================================================
Note: *p<0.1; **p<0.05; ***p<0.01

Interpretation

Often, our primary interest is in the main effects of the model (the “fixed effects”, in the R output’s vocabulary). However, we may also draw information from the varying intercepts (\(\alpha\)).

\[y = a + bx_i + \alpha_j\]

There are two approaches to the interpretation.

Main (non-varying) effects (refer to the grand mean)

Refer to the grand mean (a) and construct a stylized scenario while ignoring the varying intercepts (i.e. treat them as residuals). The interpretation of the effects then follows the exact same rules as in the pooled linear model.

\[y = a + bx_i + \epsilon_i\] The “fixed” (R vocabulary)/non-varying effects of the model can be interpreted and displayed as per usual.

fixef(mod.ran)
## (Intercept)           x  Reform2016  EPElection 
##   2.7758910  -1.4888207  -0.6970583   0.3505659

I can create scenarios and use the predict() function for point predictions, or I can use the ggeffects package, for example.

library(ggeffects)

#The varying part of the scenario; take inspiration from the descriptive statistics
vote_share <- seq(0, 0.5, 0.01)

eff <- ggpredict(mod.ran,
                 #What type of graphic would I like to illustrate: here, the "fixed"/non-varying effects
                 type = "fe",
                 terms = list(x = vote_share)
                 )
#Make the plot
eff %>% 
  plot +
  ylim(c(0,10)) +
  ylab("Local staff size") +
  xlab("Seat share in parliament") +
  ggtitle("Effect of party funding (size of party) on MEP's local spending")

The ggeffects package also includes variation from the groups, so the scenarios we make tend to be surrounded by more uncertainty than if we had run a pooled model.

Varying effects (refer to group intercepts)

We can also make the interpretation in relation to specific groups/intercepts.

Choose one or two groups/varying intercepts to refer to in your scenarios. This is usually most interesting if your groupings are known quantities. In our example, there is little insight to gain from picking a single MEP. However, since my model includes national intercepts, such scenarios may contribute to the reader’s understanding of the findings. This requires us to pull the random effects out of the model object.

We don’t see the random effects in the default R summary of the model, but they are there.

We can ask for just the random (varying) effects. They are reported as deviations from the grand mean.

ranef(mod.ran)$Nationality %>% head

Here, we see that MEPs from Austria have an average staff size 0.44 employees below the grand mean/the mean of means among MEPs.

We can also ask for all the model coefficients. Here, the two intercepts (the grand mean and the varying intercept) are already added together.

coef(mod.ran)$Nationality %>% head

We can then fill in the equation for two national groups: Bulgaria and Sweden, for example. I start by extracting the coefficients/parameters.

coef(mod.ran)$Nationality[c(3,27),]

\[y = (a + \alpha_1) + b_1x_{1i}+ b_2x_{2i}+ b_3x_{3i}\]

\[y_1 = (2.78 + 1.53) - 1.49x_1 - 0.7x_2 + 0.35x_3\] \[y_1 = 4.31 - 1.49x_1 - 0.7x_2 + 0.35x_3\]

I then construct my scenario. I can draw inspiration from the actual data or use the summary statistics.

#Actual data
df %>%
  filter(Nationality == "Bulgaria"| Nationality == "Sweden") %>%
  arrange(ID) %>%
  dplyr::select(Nationality, y, x, EPElection, Reform2016) %>%
  head()

I then fill in and make predictions for two identical scenarios in two separate member states. Here, I use the predict() function instead of filling in by hand.

scenario <- data.frame(
  #Summary statistics showed that 0.05 is a typical change: i.e. here I take a baseline of 10 percent seat share, then increase it to 15 percent (0.15)
  x = c(0.1, 0.15),
  EPElection = 0,
  Reform2016 = 0,
  #Change of x in both countries == 4 scenarios
  Nationality = c("Bulgaria", "Bulgaria", "Sweden", "Sweden"),
  #The model contains MEP intercepts, so I need a variable for that; here, I invent an ID that doesn't exist
  ID = c("new")
)  
scenario
#Predict
preds <- predict(mod.ran, 
                 newdata = scenario,
                 #Since I "invented" an MEP ID, I need to let the prediction draw randomly from the distribution of ID-intercepts 
                 allow.new.levels = T)
preds
##         1         2         3         4 
## 4.1559480 4.0815069 0.6732192 0.5987782
diff(preds)
##           2           3           4 
## -0.07444104 -3.40828773 -0.07444104

In the pre-reform EP, outside of an election year, Bulgarian MEPs from a party holding 10% of the seats in the national parliament had an average predicted staff size of 4.16 local employees, while Swedish MEPs in the same situation held substantially fewer staffers (0.67). The staff size is furthermore predicted to decrease by 0.07 staffers for both MEPs if their party were to gain 5 percentage points more of the seat share in the next national election.

Distributions

Remember, in the hierarchical model each group mean/varying intercept is assumed to be drawn from a normal distribution, and together their means also constitute a normal distribution. In other words, we have 28 such distributions at the member-state level and more than 1,100 at the MEP level.

This is a lot of distributions (and assumptions), so there is some controversy over whether we can reliably estimate them. That’s why the lme4 package doesn’t report the random intercepts with any p-values. However, R has other packages.

  • arm has two twin functions, se.fixef() and se.ranef(), that are useful.

  • merTools allows us to also simulate the random/varying coefficients.

REsim() simulates the distribution from the reported random intercept (point estimate/regression coefficient) and its standard error (within-group variation).

library(merTools)
re <- REsim(mod.ran,
            #Simulate 100 times
            n.sims = 100)
re %>% head

The best way is still to display these intercepts graphically. You can use the simulations to construct your own plots: coefplots or density plots are nice possibilities.
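For example, a density plot of the simulated national intercepts; a minimal sketch, assuming the groupFctr and mean columns that REsim() returns:

re %>%
  #Keep the simulated intercepts for the nationality grouping
  filter(groupFctr == "Nationality") %>%
  ggplot +
  #One simulated mean per member state
  geom_density(aes(mean)) +
  ggtitle("Simulated distribution of the national intercepts")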

plotREsim() is a ready-made function for plotting.

plotREsim(re)

It gives us a sense of the variation between the groups. Here, we see that there is still quite a lot of individual-level variation that we haven’t accounted for with other predictors. The groups marked in black have a 95% confidence interval that does not overlap zero. We can also see that some member states have a higher/lower level of local staff than we might expect given the predictors in the model.

  • sjPlot reports effects from all kinds of models. It also gives a nice labelling of our random intercepts.

Here, I only ask for the varying intercept for nationality.

library(sjPlot)
plot_model(mod.ran,
           type = "re",
           terms = "Nationality")

Red are states below average; blue are states above average. The color scheme is in relation to the point estimate, not the confidence interval.

Level-2 regression

I can add group-level predictors to my model. The random intercepts are a way of clustering the residuals; the group means are then regressed on the level-2 variables.

In R, it looks the same, though. Here, I add electoral system to the mix, so that I regress the MEP-level intercepts on the type of electoral system the MEPs compete in.

mod.ran.t.open <- lmer(y ~ 
                         x 
                       + OpenList
                       + EPElection
                       + Reform2016
                       + (1|ID)
                       ,
                       df)

summary(mod.ran.t.open)
## Linear mixed model fit by REML ['lmerMod']
## Formula: y ~ x + OpenList + EPElection + Reform2016 + (1 | ID)
##    Data: df
## 
## REML criterion at convergence: 28107.9
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -7.8842 -0.3212 -0.0456  0.3052 25.9205 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  ID       (Intercept) 6.414    2.533   
##  Residual             2.067    1.438   
## Number of obs: 6948, groups:  ID, 1152
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  2.38447    0.13602  17.530
## x           -0.56735    0.31900  -1.779
## OpenList     0.71073    0.15431   4.606
## EPElection   0.35047    0.05909   5.931
## Reform2016  -0.69270    0.04722 -14.670
## 
## Correlation of Fixed Effects:
##            (Intr) x      OpnLst EPElct
## x          -0.598                     
## OpenList   -0.591  0.053              
## EPElection -0.041 -0.020 -0.004       
## Reform2016 -0.106  0.055  0.003  0.091

The clustering means that the standard errors are corrected to account for the correlation within groups (i.e. that observations are not i.i.d.). Here, it also means that the uncertainty around the electoral-system effect accounts for the fact that each MEP competed under the same electoral rules throughout the period of study.
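To get a feel for what this correction does, a minimal sketch (my own comparison, not part of the original analysis) contrasts the lmer() standard error on OpenList with a pooled OLS that ignores the clustering:

#Pooled OLS with the same predictors; treats all observations as independent
mod.naive <- lm(y ~ x + OpenList + EPElection + Reform2016, df)

#Compare the standard errors on OpenList across the two models
summary(mod.naive)$coefficients["OpenList", "Std. Error"]
summary(mod.ran.t.open)$coefficients["OpenList", "Std. Error"]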

The interpretation of these models is the same as before, though.

Cross-level interactions: varying effects by predictors

We can make interactions across levels. I could formulate an expectation that MEPs from party-centered systems are more disciplined; i.e. they rely on their parties to give them a safe seat on the electoral list in the next election, and are willing to chip in to the common pot by hiring more local assistants when their party is in need.

mod.ran.t.open.i <- lmer(y ~ 
                           x * OpenList
                         + Reform2016
                         + (1|ID),
                         df)

summary(mod.ran.t.open.i)
## Linear mixed model fit by REML ['lmerMod']
## Formula: y ~ x * OpenList + Reform2016 + (1 | ID)
##    Data: df
## 
## REML criterion at convergence: 28121.6
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -7.8233 -0.3115 -0.0612  0.3035 25.9173 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  ID       (Intercept) 6.258    2.502   
##  Residual             2.083    1.443   
## Number of obs: 6948, groups:  ID, 1152
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  2.72722    0.15456  17.645
## x           -1.75529    0.43690  -4.018
## OpenList     0.08607    0.21583   0.399
## Reform2016  -0.71099    0.04723 -15.055
## x:OpenList   2.61478    0.63564   4.114
## 
## Correlation of Fixed Effects:
##            (Intr) x      OpnLst Rf2016
## x          -0.718                     
## OpenList   -0.711  0.513              
## Reform2016 -0.072  0.016 -0.024       
## x:OpenList  0.490 -0.687 -0.707  0.037

Indeed. There is an interaction effect, but what does it mean? I usually construct scenarios here. Since I am interested in the main regression coefficients (in R lingo, the “fixed effects”), I can use the usual techniques for interpretation. Here, I use ggpredict() to construct my scenario, then plot it.

eff <- ggpredict(mod.ran.t.open.i,
                 terms = list(
                   #Scenarios where x runs from 0 to 60% seat share in increments of 0.01
                   x = seq(0, 0.6, 0.01), 
                   #Party- and candidate-centered electoral systems are "moderators"
                   OpenList = c(0, 1)))

eff %>% 
  plot + 
  ggtitle("Effect of party size conditional on the electoral system")

Considering the plot, we can clearly see that the effect of party funding on MEPs’ local investment differs by electoral system. More party funding/a larger seat share in party-centered systems (red line) leads MEPs to invest less. More party funding in candidate-centered systems leads MEPs to invest more.

According to Berry, Golder, and Milton (2012), I should interpret both of these effects and compare them with my theory. What does this mean? And how robust is it? From the graphic, it seems that – while positive – the blue line is imprecise. Comparing the minimum and maximum of x, we see that the confidence intervals overlap. The effect in the red line is clearly distinct between the two extreme scenarios.

Another way of interpreting is in text and numbers. I can construct scenarios and calculate first differences. When I do that, my focus is less on statistical significance and more on communicating the substantive meaning of my findings.

first_diff <- 
  eff %>% 
  as.data.frame() %>%
  filter(x %in% c(0, 0.05)) %>%
  arrange(group, desc(x)) %>% 
  group_by(group) %>%
  mutate(diff_staff = predicted - lag(predicted),
         eff_MEP = 1/diff_staff)
first_diff

If we consider a typical scenario where the national party loses 5 percentage points of its seats in the national parliament, we’d expect an average increase in MEPs’ local staff size of 0.09 in the party-centered systems in the EU. A reasonable interpretation is that one in 11 MEPs from party-centered systems would react to the party’s defeat by hiring an additional local assistant.

Since the marginal effect of x is not different from zero in the candidate-centered systems, we might point that out here.
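To back that up in numbers, a minimal sketch (my own addition) computes the conditional slope of x when OpenList == 1 and its standard error from the variance-covariance matrix of the fixed effects:

b <- fixef(mod.ran.t.open.i)
V <- vcov(mod.ran.t.open.i)

#Slope of x in candidate-centered systems: b_x + b_x:OpenList
slope_open <- b["x"] + b["x:OpenList"]

#Variance of a sum of coefficients: var(b1) + var(b2) + 2*cov(b1, b2)
se_open <- sqrt(V["x", "x"] + V["x:OpenList", "x:OpenList"] + 2 * V["x", "x:OpenList"])

#t-value for the conditional slope
slope_open / se_open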

Random slopes: varying effects by groups

We can estimate separate slopes for each country. This is a form of interaction effect. However, we specify it differently. It means we will estimate mini-models within each group. It requires a lot of variation (and will “explain” a lot of variation).

Here, I estimate a different slope for party seat share per member state.

mod.slope <- lmer(y ~ 
                  + EPElection
                  + Reform2016
                + (1|ID)
                #x has varying slopes by nationality
                + (x |Nationality),
               df)

summary(mod.slope)
## Linear mixed model fit by REML ['lmerMod']
## Formula: y ~ +EPElection + Reform2016 + (1 | ID) + (x | Nationality)
##    Data: df
## 
## REML criterion at convergence: 27512.8
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -7.3179 -0.3262 -0.0320  0.3163 22.9263 
## 
## Random effects:
##  Groups      Name        Variance Std.Dev. Corr 
##  ID          (Intercept)  3.751   1.937         
##  Nationality (Intercept)  8.821   2.970         
##              x           69.142   8.315    -0.88
##  Residual                 2.006   1.416         
## Number of obs: 6948, groups:  ID, 1152; Nationality, 28
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  2.19920    0.28348   7.758
## EPElection   0.35359    0.05826   6.070
## Reform2016  -0.70453    0.04666 -15.099
## 
## Correlation of Fixed Effects:
##            (Intr) EPElct
## EPElection -0.023       
## Reform2016 -0.030  0.092

We can’t see the random effects, but they are there. Let’s check them out.

Interpretation

The random intercepts are stored, but not immediately reported.

We can always take a look at the coefficients using the coefficients() function. It reports the coefficients for all groupings: their (varying) intercepts and their slopes, regardless of whether the slopes are “fixed” or vary by group.

Here, I look at the coefficients for the first few MEPs.

coefficients(mod.slope)$ID %>% head

And here I can see the varying slopes I estimated.

coefficients(mod.slope)$Nationality %>% head

The intercepts are interpreted as deviations from the (grand) mean. Here, we see that Austria has a higher estimated number of local staff than the adjusted average (grand mean).

I can separate out the “fixed” effects (those that are common for all groups) and the “random” effects (the varying slopes and intercepts) using two functions.

fixef(mod.slope) %>% head
## (Intercept)  EPElection  Reform2016 
##   2.1992021   0.3535862  -0.7045293

And here are the varying ones.

ranef(mod.slope) %>% head
## $ID
##          (Intercept)
## 840      0.451494431
## 988     -0.816094099
## 997     -0.996768158
## 1023     1.558450395
## 1037    -0.125861566
## 1038    -1.457848844
## ... (one intercept per MEP; remaining rows omitted)
## 124886   0.550374294
## 124887   1.878779892
## 124888   0.066370830
## 124889  -1.408044861
## 124890  -4.118155739
## 124892   0.009111791
## 124893  -0.504674569
## 124894   2.177200493
## 124895  -2.133401278
## 124896  -1.046696744
## 124897  -2.638071571
## 124898   0.218021665
## 124899   5.363263345
## 124900   1.246420688
## 124901   0.772233238
## 124902   2.538548611
## 124903  -2.672763271
## 124926  -1.389935413
## 124927   0.419711379
## 124928   2.044593895
## 124929   0.776969279
## 124930  -1.396286014
## 124932   1.111688773
## 124933   0.161633613
## 124934  -0.205573400
## 124935  -1.289617751
## 124936  -0.831146569
## 124937  -0.193528202
## 124939   5.602285301
## 124940  -0.781531433
## 124941  -1.323553609
## 124942  -0.027889510
## 124943  -0.423549899
## 124944  -0.790222586
## 124945  -1.413890638
## 124947  -0.134565621
## 124948   0.474495824
## 124949  -0.609548527
## 124950   2.380264591
## 124951   4.458016264
## 124952  -1.151570703
## 124953  -2.054940995
## 124954  -0.790222586
## 124955   0.212175889
## 124956   0.931505538
## 124957  -0.149172228
## 124958   0.573524006
## 124960  -0.965572075
## 124961  -0.248200410
## 124962   0.293821766
## 124963  -0.790222586
## 124964  -0.058835199
## 124965   2.210664987
## 124966   3.464308941
## 124967   0.663861035
## 124968  -0.316635726
## 124969   1.197192058
## 124970  -1.305926847
## 124971  -0.088947542
## 124972  -0.373094052
## 124973   0.055839023
## 124984  -1.022851094
## 124986   0.061193257
## 124987   0.061193257
## 124988  -0.638671757
## 124989  -0.329160208
## 124990  -0.330324040
## 124991  -0.263113915
## 124992  -0.280959343
## 124993   0.755466059
## 124994  -0.258684066
## 124995  -0.143248904
## 124996   0.940795447
## 124997  -0.333427593
## 125001   1.882937221
## 125002   0.433571841
## 125003  -0.238183160
## 125004   0.484513074
## 125012   7.782616015
## 125013   3.548388312
## 125016  -0.428371327
## 125018  -1.489104082
## 125019  -0.378584924
## 125020  -0.109147255
## 125021  -0.289821314
## 125022   0.371241381
## 125023  -0.376591977
## 125024   2.424959396
## 125025   0.007086704
## 125026   0.411053227
## 125027   0.559374178
## 125028   0.007086704
## 125029  -0.463992308
## 125030  -0.352390373
## 125031   3.831486152
## 125032  -0.886018290
## 125033   0.785454357
## 125035   5.613620434
## 125036   1.179958302
## 125037  -0.388753733
## 125038  -1.180291708
## 125039  -1.276868423
## 125041  -1.279844981
## 125042  -1.232697283
## 125043  -1.279844981
## 125044  -1.279844981
## 125045  -1.089491359
## 125046  -1.279844981
## 125047   2.133230133
## 125048  -0.396206686
## 125049   0.160174710
## 125050   2.313904192
## 125051  -1.256039716
## 125053  -0.700887421
## 125061   0.708569259
## 125062   0.233697386
## 125063   0.520759908
## 125064   0.159411791
## 125065  -1.014969590
## 125067  -0.382610385
## 125068  -0.292273356
## 125070  -0.579719346
## 125071  -0.579719346
## 125072  -0.760393404
## 125090   3.367268973
## 125092   1.611939551
## 125093   3.728617090
## 125099   0.182850264
## 125100   1.746285376
## 125101  -0.300253859
## 125103  -0.003782443
## 125104   0.357565674
## 125105  -0.913291248
## 125107  -0.174121843
## 125109  -1.108639889
## 125110  -0.566617713
## 125112  -1.095274949
## 125113  -1.095274949
## 125128  -1.683631263
## 125192  -0.203905169
## 125193  -0.745927345
## 125214  -7.627438019
## 125237  -1.986571222
## 125670  -0.109983798
## 125684  -0.498899215
## 125997   2.133230133
## 127330   7.601035078
## 128717   0.052456905
## 129073  -0.808579033
## 129164  -0.709294417
## 129256   5.918710300
## 130100  -0.677211584
## 130833   2.108816965
## 131051  -1.080718685
## 131507  -0.755280450
## 131749  -0.203991734
## 131750   0.825206467
## 132366  -0.165968947
## 132925   2.914927606
## 133316  -0.315396220
## 133326   1.583776389
## 134243  -0.150337205
## 135490  -0.186686911
## 135540  -3.679853994
## 135541  -3.679853994
## 136236  -0.347624774
## 183022   0.066675103
## 183338   0.038435984
## 183793  -1.277934867
## 185341  -2.372414842
## 185619  -0.458560677
## 185637  -1.352604471
## 
## $Nationality
##                (Intercept)            x
## Austria        -0.42196300   0.99652054
## Belgium        -1.20648169   1.36397023
## Bulgaria        1.22966358   2.00000973
## Croatia        -2.17067463  11.23974403
## Cyprus          1.43678583  -4.61414807
## Czech Republic  0.10295552   0.21083028
## Denmark        -0.41627716  -2.96761166
## Estonia        -0.58155144  -1.17393695
## Finland        -1.31627516   2.14472569
## France         -0.73521848  -0.06844123
## Germany         0.39346874  -0.60265202
## Greece          0.08568560  -2.71932350
## Hungary        -0.08585265  -1.83779695
## Ireland         0.85080433  -3.23232074
## Italy           1.50832564  -1.30999359
## Latvia          0.79936481  -0.93630177
## Lithuania      13.39751215 -38.29414226
## Luxembourg     -0.70901095  -1.32348608
## Malta           0.38274312   3.79585421
## Netherlands    -1.50258715   0.15533824
## Poland          4.47336409  -0.64471393
## Portugal       -0.58501966  -2.08127264
## Romania         0.55418426  -0.31597943
## Slovakia       -0.37174957  -0.08936310
## Slovenia       -0.25372979  -2.11202960
## Spain           0.30530168  -2.82617070
## Sweden         -1.54829747  -0.07493743
## United Kingdom  1.65142810  -2.99808154
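
A reminder of how these pieces fit together in lme4: fixef() returns the overall (average) coefficients, ranef() the zero-centred group deviations, and coef() their sum, i.e. one regression line per group. A quick consistency check for a single country:

#fixef() + ranef() = coef(), group by group
fixef(mod.slope)["x"]                         #average slope of x (fixed effect)
ranef(mod.slope)$Nationality["Austria", "x"]  #Austria's deviation from that slope
coef(mod.slope)$Nationality["Austria", "x"]   #Austria's own slope: the sum of the two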

Coefplot

Our random slopes can also be displayed as coefficient plots (coefplots), which are useful for hypothesis testing.

We can report the national-level effects of vote share together with their intercepts using sjPlot.

library(sjPlot)
plot_model(mod.slope,
           #Plot random effects
           type = "re",
           #Variable to be plotted
           terms = "Nationality")
## [[1]]
## [[2]]

(two random-effects plots, one per grouping factor)

Some additional R code: we can also build the plot our own way.

#simulate
#REsim() comes from the merTools package
library(merTools)

re <- 
  #Simulate draws of each random effect and summarise them (point estimate and standard deviation)
  REsim(mod.slope) %>%
  #Transform it to a data frame; have a look at the data frame to get an idea
  as.data.frame() %>%
  #Keep the national variation; groupFctr labels the grouping factor
  filter(groupFctr == "Nationality")
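
Before reshaping, it is worth a look at what the simulation returns: REsim() gives one row per group-term pair, with summary statistics of the simulated draws (the columns include groupFctr, groupID, term, mean, median, and sd).

#Inspect the simulated random effects; one row per country-term pair
head(re)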

tab_slope <-
  re %>%
  #Select my variables
  dplyr::select(groupID, term, mean, sd) %>%
  #Pivot to wide format using function from tidyr package
  tidyr::pivot_wider(values_from = c(mean, sd),
                     names_from = term) %>%
  #Reorder the columns so each estimate is followed by its standard error
  dplyr::select(1, 2, 4, 3, 5) %>%
  #Give the variables new names; e.g. variable 2 is now named "alpha"
  rename("alpha" = 2,
         "se(alpha)" = 3,
         "beta" = 4,
         "se(beta)" = 5) %>%
  as.data.frame()

tab_slope
tab_slope %>%
  #Arrange according to size of the effect
  arrange(beta) %>%
  ggplot +
  #Points indicate regression coefficient
  geom_point(aes(y = 1:nrow(tab_slope),
                  x = beta)) +
  #Errorbars indicate the spread/uncertainty of the coefficient; here, 95% conf. interval
  geom_errorbar(aes(y = 1:nrow(tab_slope),
                    xmin = beta - `se(beta)`*1.96,
                    xmax = beta + `se(beta)`*1.96),
                #width of the caps at the whisker ends; 0 removes them
                width = 0
                ) +
  #Add nationality labels along the y axis
  geom_text(aes(y = 1:nrow(tab_slope),
                x = -50,
                label = groupID),
            hjust = 0
                ) +
  #Vertical line indicates the null effect; do the confidence intervals overlap 0?
  geom_vline(aes(xintercept = 0),
             lty = 2) +
  #Remove y axis label
  ylab("") +
  #Title
  ggtitle("Effect of vote share within member state delegations")

Effect plots

We can illustrate the effect implied by the random slopes. However, properly carrying the uncertainty of the estimates into such a scenario is more than this class can cover.

Here, I use the simulated effect of x for Austria (selected from the table with filter()) and plot the effect using geom_abline(). It needs an intercept and a slope to draw a linear effect.

tab_slope %>%
  #Add the grand mean to the data
  mutate(a = fixef(mod.slope)[1]) %>%
  filter(groupID == "Austria") %>%
  ggplot +
  geom_abline(aes(intercept = a + alpha,
                  slope = beta)) +
  #A realistic y axis
  ylim(c(0,10)) +
  ggtitle("Effect of party seat share in Austria")

We can also illustrate the effect in all countries at once, using a colour grouping or a facet plot in ggplot.

tab_slope %>%  
  #Add the grand mean to the data
  mutate(a = fixef(mod.slope)[1]) %>%
  ggplot +
  geom_abline(aes(intercept = a + alpha,
                  slope = beta)) +
  ylim(c(0,10)) +
  #Nationality is stored by REsim() in the groupID variable
  facet_wrap(~groupID) +
  ggtitle("Effect of changes in party seat share on MEPs local staff in different countries")

Since we have simulated the random effects, we also have an estimate of their standard errors, which lets me construct approximate confidence intervals. Note that shifting the intercept and the slope by 1.96 standard errors at the same time ignores the covariance between the two estimates, so the bands below are only a rough approximation.

tab_slope %>%  
  #Add the grand mean to the data
  mutate(a = fixef(mod.slope)[1]) %>%
  ggplot +
  geom_abline(aes(intercept = a + alpha,
                  slope = beta)) +
  #Upper confidence interval
  geom_abline(aes(intercept = a + alpha + 1.96 * `se(alpha)`,
                  slope = beta + 1.96 * `se(beta)`),
              lty = 2) +
  #Lower confidence interval
  geom_abline(aes(intercept = a + alpha - 1.96 * `se(alpha)`,
                  slope = beta - 1.96 * `se(beta)`),
              lty = 2) +
  ylim(c(-10,10)) +
  #Nationality is stored by REsim() in the groupID variable
  facet_wrap(~groupID) +
  ggtitle("Effect of changes in party seat share on MEPs local staff in different countries")

From this plot, we see that even with more than 7000 observations, the varying-slope model yields very uncertain estimates. The model regularly treats a negative staff share as a likely outcome, which is impossible.
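
One way to put a number on that impression (a sketch; predict() on a fitted merMod returns fitted values that include the estimated random effects):

#Share of the model's own fitted values that fall below zero
mean(predict(mod.slope) < 0)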
