Models for nested data: Workflow for hierarchical models
In this notebook, we will go through the workflow for analyzing hierarchical data.
We will familiarize ourselves with:
- the workflow for structured data analysis
- descriptive statistics:
  - within-group variation
- how to estimate hierarchical models using the lme4 package:
  - with varying intercepts
  - with varying intercepts and varying slopes
  - with cross-level interactions
  - with level-2 variables
- interpretation
New/repeated R code:
- dplyr package: lag(), filter(), arrange(), select()
- ggplot2 package: facet_wrap(), geom_abline()
- lme4 package: lmer(), ranef(), fixef(), coef()
- merTools package: REsim(), plotREsim()
- sjPlot package: plot_model(), plot_models()
- stargazer package: stargazer(., add.lines = , column.labels = )
We will use the data on Members of the European Parliament (MEPs). This time, the data includes several groupings; it is a time-series cross-sectional (unbalanced panel) data set in which each MEP is observed every 6 months over a 5-year period. Thus, I have observations of individuals (ID) in up to 10 periods (Period). These individuals also have a nationality (Nationality). I therefore have at least 3 “groupings” that are relevant for different reasons.
- accounting for individual groupings allows me to adjust the standard errors of my predictors.
- accounting for nationality could capture potential confounders at the member state level, including Simpson's paradox.
- accounting for time may allow me to control away potential periodic changes.
I begin by loading in my data. I also rename a few of my variables for ease.
load("MEP.rda")
<- MEP
df
#Rename the two main variables
<-
df %>%
df mutate(
#The number of local assistants employed by each MEP in 2014
y = ShareOfLocalAssistants,
#The vote share (in percentage points) that the national party got last election
x = SeatsNatPal.prop)
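A quick peek at the key columns (a small sketch of my own; the column names are the ones introduced above) confirms the panel structure, with one row per MEP-period:

#Inspect the panel structure: individuals (ID) observed over periods (Period)
df %>%
  dplyr::select(ID, Period, Nationality, y, x) %>%
  head()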
I pick my models as part of my research design. That is, I ask myself what is the most relevant correlation/variation, given my theory.
- in experiments: you can create that variation and randomize the rest (cut out confounders)
- in observational studies: you’ll have to “hunt” for the variation you want and control away the rest
Workflow
This hunt for variation determines my workflow.
- empirical implications of theory: I start as any political scientist would and think about the data generating process and the causal mechanisms (if that's what I'm interested in) to identify which variation I want.
My research question this time is whether MEPs compensate for a lack of (national) party funding by hiring local assistants. I don't have a good measure of party funding, so instead I rely on the MEP's national party's size in the national parliament. Often, state funding for parties is determined by the size of the party, although the rules vary between countries.
Ideally, if this hypothesis holds, I would see that the same MEP would change their hiring practices following national elections that change the party's seat share in parliament. This is a within-MEP (within-group) variation that would hold both national-level rules and individual-level idiosyncrasies constant.
To leverage this variation I would run a varying-intercept model; either as a fixed-effects model (without pooling) or a hierarchical model (with partial pooling).
- feasibility (summary statistics)
My next move is to check if I have enough variation in both my x and my y to actually draw on that type of variation. This means I will spend additional time on the descriptive statistics.
- control for confounders
Last, I'd consider the potential confounders that could bias my results. They can either be controlled away by explicitly observing the variable (including it) or by locating the groups in which the confounders might reside; i.e. I'd identify group identities and leverage the within-group variation.
If I draw on within-MEP variation, my biggest threat would come from time-varying events. In this data, there was a European election in 2014 (EPElection) and an administrative reform, implemented in 2016 (and onwards; Reform2016), that capped local spending.
- Run the regression/estimate your model
The estimation of the model itself is usually less challenging. However, when you have large data and/or many groupings, the estimation may take some time.
- Interpretation
The same principles of interpretation apply to hierarchical models as to other models. However, this time there are more parameters in the model, and we are more reliant on careful descriptive statistics for constructing relevant scenarios.
Feasibility: descriptive statistics
Exploring the variables when our data has a hierarchical structure requires a few additional steps. In addition to exploring the distribution of the \(x\) and \(y\) in the pooled (entire) data set, I also explore their distribution within the groups to see whether I have within-group variation.
Univariate statistics
Pooled data
I begin with my usual histograms of x and y. Both of my variables of interest are continuous.
df %>%
  ggplot() +
  geom_histogram(aes(y)) +
  ggtitle("Pooled distribution of OUTCOME: Size of local staff among MEPs")

df %>%
  ggplot() +
  geom_histogram(aes(x)) +
  ggtitle("Pooled distribution of PREDICTOR: Size of national party") +
  xlab("Proportion of seats in national parliament")
Within-group variation: non-pooled analysis
Second, I check the same distribution within my main grouping.
Visual inspection (few groups)
If I have few groups, then a visual inspection using a histogram or a barplot of the univariate distribution might be efficient. In ggplot2, the facet_wrap() function allows you to create a grid of plots, one for each group.
df %>%
  ggplot() +
  geom_histogram(aes(y)) +
  ggtitle("Size of local staff among MEPs") +
  facet_wrap(~Nationality)
Inspecting the spread on the x-axis, I see that there is at least some (and sometimes a lot) variation within the member states.
In numbers (few or many groups)
Often, I’ll also be interested in quantifying the variation. This is the case when I have many groups, but also to tell the reader how many times there is a change. In a fixed-effects model – leveraging only within-group variation – the effect of your treatment is driven only by the groups that have experienced a change. In other words, the size of your data set does not reveal the number of observations that you actually study.
Last, the within-group variation would be the most interesting to draw on when you construct scenarios for your interpretation.
A relevant question is whether staff size varies within MEPs. Will I be able to leverage within-individual changes to assess whether MEPs respond to different electoral incentives? Or will I have to rely on cross-sectional variation (between-individuals)?
Size of change: I can check the variation by comparing each observation with the previous observation within the group. When the data involves time, I'll order the data chronologically, group by MEP identity, then calculate the difference between each observation and its lag.
df <-
  df %>%
  #Chronological order
  arrange(Period) %>%
  #Within individuals
  group_by(ID) %>%
  #Difference in staff size between y (this observation) and lag of y (previous observation)
  mutate(change_y = y - lag(y),
         #Same for x, the party's seat share
         change_x = x - lag(x)) %>%
  #Ungroup data
  ungroup()
I can now make a frequency table to see the number of times there has been a change (change_x != 0) and the number of times there has not been a change (change_x == 0).
table(df$change_x != 0)
##
## FALSE TRUE
## 5172 623
table(df$change_y != 0)
##
## FALSE TRUE
## 3752 2217
There are more changes in my outcome variable than in my predictor, so my single predictor won't explain everything that goes on. However, I have plenty of changes in x, so I can leverage within-group variation.
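To express this as a share rather than a count, a one-line sketch (my own addition) does the trick:

#Share of non-missing observations that involve a change in x
#From the table above: 623/(5172 + 623), i.e. roughly 11 percent
mean(df$change_x != 0, na.rm = TRUE)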
I can summarize this either as number of changes (count) or size of change (continuous).
Change measured by counts: We can even look at how many times each MEP has experienced a change.
df_change <-
  df %>%
  #Group by individuals; MEPs have individual ID numbers
  group_by(ID) %>%
  #Sum over the number of changes within each group/individual
  reframe(n_changes_x = sum(change_x != 0, na.rm = T),
          n_changes_y = sum(change_y != 0, na.rm = T))
I can then plot this new data frame as a bar plot.
df_change %>%
  ggplot() +
  #Number of occurrences of each value of n_changes_x
  geom_bar(aes(n_changes_x)) +
  ggtitle("Number of changes in x") +
  xlab("Number of alterations in each MEP's national party size")
Most of the MEPs that have experienced a change have only experienced it once.
Your turn!
- can you do the same for the dependent variable?
Change measured by size: The size of the change is most useful for constructing scenarios later, because it tells me what the "typical" change might be. However, I'm mostly interested in the likely size of a change given that there has been a change. I therefore subset the data to only the observations with a change.
df %>%
  filter(change_x != 0) %>%
  ggplot() +
  geom_histogram(aes(change_x)) +
  ggtitle("Typical change in party size after national elections")

#Summary stats in numbers
df %>%
  #Filter out the non-changes
  filter(change_x != 0) %>%
  #I specify which package to draw the select() function from, because several packages have functions with the same name
  dplyr::select(change_x) %>%
  summary()
## change_x
## Min. :-0.470769
## 1st Qu.:-0.053869
## Median :-0.003333
## Mean :-0.007253
## 3rd Qu.: 0.052754
## Max. : 0.470769
Sometimes the change in x is positive – the MEPs’ party gained more seats in the national parliament – other times it is negative. From the summary statistics, we see that a typical change would be +/- 0.05 (5 percentage point change in either direction). This is useful for the scenarios later on where my interpretation will be “conditional on a national election, the typical change in party size…”.
Bivariate statistics
Let's explore the bivariate relationship in the pooled data first, before we check the within-group correlation. The latter corresponds to a non-pooled analysis; i.e. I analyze the relationship separately for each group.
Pooled data
Here, I plot both a smoothed line and a bivariate linear regression.
df %>%
  ggplot() +
  geom_smooth(aes(x = x,
                  y = y)) +
  geom_smooth(aes(x = x,
                  y = y),
              method = "lm",
              lty = 2,
              color = "black") +
  ylab("Size of local staff") +
  xlab("Party size (seat share in national parliament)") +
  ggtitle("Bivariate relationship between party size and local investment")
The loess line indicates first a drop, then an increase, and so on; the pooled relationship is clearly not a tidy straight line. The linear model suggests somewhat less local investment among MEPs whose parties received large support in the last election.
Non-pooled data
Let's check how this looks within member states. Member states have different financing laws for parties and vary in labor costs, so we may be tapping into some cross-national variation unrelated to party funding.
With many groups, it is easier to get a sense of the relationship when I don’t plot the uncertainty. The same goes for linear predictions instead of loess lines (local regression). The larger the data, the longer it will take to calculate a loess line as well.
df %>%
  ggplot() +
  geom_smooth(aes(x = x,
                  y = y,
                  color = Nationality),
              se = F, method = "lm") +
  ggtitle("Within-state relationship between party size and local staff size") +
  theme(legend.position = "none")
I can even run a separate regression (draw a separate line) per MEP, then parse them out by nationality.
df %>%
  ggplot() +
  geom_smooth(aes(x = x,
                  y = y,
                  group = Lastname),
              color = "black",
              alpha = 0.5,
              lwd = 0.5,
              se = F, method = "lm") +
  ggtitle("Within-individual relationship between party size and local staff size") +
  #Crop the y-coordinates to zoom in on most of the observations
  coord_cartesian(ylim = c(0,7)) +
  #Remove the legend
  theme(legend.position = "none") +
  #Create a separate grid per member state
  facet_wrap(~Nationality)
We get a sense that the effect might in general be negative, also within MEPs. However, there are exceptions.
Analysis
Let’s run some models instead. Running the models is usually the part of the research process that takes the least time (although, with large data and many intercepts, R might take a few seconds to arrive with an answer).
Random intercepts
I usually start with a simple pooled analysis with one predictor.
mod.pool <- lm(y ~ x,
               df)
summary(mod.pool)
##
## Call:
## lm(formula = y ~ x, data = df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.561 -1.561 -0.519 0.652 40.470
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.56125 0.05743 44.596 <2e-16 ***
## x -0.44420 0.18926 -2.347 0.019 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.803 on 6946 degrees of freedom
## (195 observations deleted due to missingness)
## Multiple R-squared: 0.0007924, Adjusted R-squared: 0.0006486
## F-statistic: 5.509 on 1 and 6946 DF, p-value: 0.01895
Already here, I find indications that MEPs hold a smaller local staff when their parties have many seats in the national parliament.
I then add individual intercepts to leverage within-individual variation. To do so, I use the lmer() function from the lme4 package.
mod.id.ran <- lmer(y ~ x + (1|ID),
                   df)
summary(mod.id.ran)
## Linear mixed model fit by REML ['lmerMod']
## Formula: y ~ x + (1 | ID)
## Data: df
##
## REML criterion at convergence: 28382.2
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -7.5855 -0.3152 -0.0619 0.2933 25.4528
##
## Random effects:
## Groups Name Variance Std.Dev.
## ID (Intercept) 6.545 2.558
## Residual 2.160 1.470
## Number of obs: 6948, groups: ID, 1152
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 2.5810 0.1101 23.433
## x -0.3275 0.3232 -1.013
##
## Correlation of Fixed Effects:
## (Intr)
## x -0.707
I get a weaker and insignificant effect. Yet, I leverage the right kind of variation, don’t I?
Between-group variation: I can also see from the R output that the spread in the between-MEP distribution is quite high, with a standard deviation of 2.558. It means that I should expect the range in staff size among the 95% most "normal" MEPs to be about 2.558 * 4 ≈ 10.2 local employees (0 +/- 1.96 standard deviations covers 95% of my between-group variation).
The “residual variation” is what remains once the between-group variation has been removed. Here, I see that the within-MEP variation is lower (1.47).
We can compare the between- and within-group variation by estimating the intraclass correlation (Gelman and Hill 2007, 259). Numbers closer to 1 indicate that group-members are very similar (i.e. each MEP look like themselves), while numbers closer to 0 means there is little between-group variation, meaning the model is pooling.
\[\frac{\sigma^2_\alpha}{(\sigma^2_\alpha + \sigma^2_y)}\] \[\frac{6.545}{(6.545 + 2.160)} = 0.752\]
Judging by the intraclass correlation, my groupings make sense; adding information on the personal identity of my observations adds to the estimation of the model.
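Rather than plugging the numbers in by hand, a small sketch can pull the variance components straight from the fitted model with lme4's VarCorr():

#Estimated variance components of mod.id.ran (between-MEP and residual)
vc <- as.data.frame(VarCorr(mod.id.ran))
vc

#Intraclass correlation: between-MEP variance over total variance
vc$vcov[vc$grp == "ID"] / sum(vc$vcov)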
Fixed effects to leverage within-group variation
I can always go for a fixed-effect model to leverage only within-individual variation (i.e. over time). This is equivalent to running separate regressions for each MEP, then aggregating/averaging the effect. There is no pooling of within- and between-MEP variation and all observations with within-group variation count equally.
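To see this logic, here is a sketch (my own check, not part of the original workflow): demeaning y and x within MEPs, the "within transformation", reproduces the fixed-effects slope up to small differences due to missing values and degrees of freedom.

#Within transformation: subtract each MEP's own mean from y and x,
#then regress the demeaned outcome on the demeaned predictor
df %>%
  group_by(ID) %>%
  mutate(y_within = y - mean(y, na.rm = TRUE),
         x_within = x - mean(x, na.rm = TRUE)) %>%
  ungroup() %>%
  lm(y_within ~ x_within, data = .)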
mod.id.fix <- lm(y ~ x +
                   as.factor(ID),
                 df)
stargazer(mod.id.ran, mod.id.fix,
type = "html",
#Don't report coefficients from variables containing the characters "as.factor"
omit = "as.factor")
|                     | Dependent variable: y       |                             |
|                     | linear mixed-effects (1)    | OLS (2)                     |
| x                   | -0.328 (0.323)              | -0.667 (0.467)              |
| Constant            | 2.581*** (0.110)            | 1.731*** (0.472)            |
| Observations        | 6,948                       | 6,948                       |
| R2                  |                             | 0.772                       |
| Adjusted R2         |                             | 0.727                       |
| Log Likelihood      | -14,191.110                 |                             |
| Akaike Inf. Crit.   | 28,390.230                  |                             |
| Bayesian Inf. Crit. | 28,417.610                  |                             |
| Residual Std. Error |                             | 1.466 (df = 5795)           |
| F Statistic         |                             | 17.040*** (df = 1152; 5795) |
| Note:               | *p<0.1; **p<0.05; ***p<0.01 |                             |
Nope. The effect is stronger, but still insignificant. If the true within-individual effect is masked, it is by something that varies over time (i.e. within groups/within MEPs).
The EP capped local spending in 2016. This induced the MEPs that spent the most locally to hire fewer staffers. As we have theorized, this choice is also connected to party funding, so let's control for it. A similar argument can be made for the EP election in 2014, since we have also seen that mobilization may vary by electoral system.
Here, I estimate a fixed-effects and a random-effects (both varying-intercept) model with time-dependent covariates.
mod.id.ran.t <- lmer(y ~ x
                     + EPElection
                     + Reform2016
                     + (1|ID),
                     df)

mod.id.fix.t <- lm(y ~
                     x +
                     EPElection +
                     Reform2016 +
                     as.factor(ID),
                   df)
stargazer(mod.id.ran.t, mod.id.fix.t,
type = "html",
column.labels = c("Random effects", "Fixed effects"),
omit = "as.factor")
|                     | Dependent variable: y       |                             |
|                     | linear mixed-effects        | OLS                         |
|                     | Random effects (1)          | Fixed effects (2)           |
| x                   | -0.650** (0.320)            | -1.218*** (0.458)           |
| EPElection          | 0.351*** (0.059)            | 0.340*** (0.059)            |
| Reform2016          | -0.693*** (0.047)           | -0.696*** (0.047)           |
| Constant            | 2.756*** (0.110)            | 1.945*** (0.462)            |
| Observations        | 6,948                       | 6,948                       |
| R2                  |                             | 0.782                       |
| Adjusted R2         |                             | 0.739                       |
| Log Likelihood      | -14,063.500                 |                             |
| Akaike Inf. Crit.   | 28,139.010                  |                             |
| Bayesian Inf. Crit. | 28,180.080                  |                             |
| Residual Std. Error |                             | 1.434 (df = 5793)           |
| F Statistic         |                             | 18.010*** (df = 1154; 5793) |
| Note:               | *p<0.1; **p<0.05; ***p<0.01 |                             |
Now, we're talking! The strictly within-group comparison of the fixed-effects model still gives the strongest estimate. This is because the hierarchical implementation pools the results with between-MEP variation, so the estimate shrinks toward the (between-group) mean. We can address this by adding more controls.
Several groups of random effects
Let's leverage both within- and between-individual variation (for good measure) using random intercepts. However, I'd like to add yet another group variable: nationality. Nationality controls for other potential confounders. For example, both seats in parliament/party funding and staff size are plausibly linked to nationality: one because of national laws, the other because of labor costs.
It is not a good idea to have this type of nesting in fixed-effects models; the effects would be non-identified. On the other hand, if this variation is strictly at the national level, it is subsumed in the MEP fixed effects (i.e. no MEP has two nationalities).
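A quick sanity check (my own addition) confirms this nesting, i.e. that no MEP has more than one nationality:

#Count distinct nationalities per MEP; every MEP should have exactly 1
df %>%
  group_by(ID) %>%
  summarise(n_nationalities = n_distinct(Nationality)) %>%
  count(n_nationalities)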
mod.ran <- lmer(y ~
                  x
                + Reform2016
                + EPElection
                + (1|ID)
                + (1|Nationality),
                df)
stargazer(mod.pool, mod.id.ran.t, mod.id.fix.t, mod.ran,
omit = "as.fa",
#Give labels to the columns
column.labels = c("Pooled", "Random effects", "Fixed effects", "Random effects"),
#Add lines with additional info: this comes in a list where each vector == the number of columns in the table
add.lines = list(c("Individual fixed/random effects","no","yes","yes","yes"),
c("National fixed/random effects","no","no","no","yes")),
title = "Random effects (on individual MEPs and their nationality)",
type = "html")
Random effects (on individual MEPs and their nationality)

|                                 | Dependent variable: y  |                   |                             |                   |
|                                 | OLS                    | linear mixed      | OLS                         | linear mixed      |
|                                 | Pooled (1)             | Random effects (2)| Fixed effects (3)           | Random effects (4)|
| x                               | -0.444** (0.189)       | -0.650** (0.320)  | -1.218*** (0.458)           | -1.489*** (0.288) |
| EPElection                      |                        | 0.351*** (0.059)  | 0.340*** (0.059)            | 0.351*** (0.059)  |
| Reform2016                      |                        | -0.693*** (0.047) | -0.696*** (0.047)           | -0.697*** (0.047) |
| Constant                        | 2.561*** (0.057)       | 2.756*** (0.110)  | 1.945*** (0.462)            | 2.776*** (0.373)  |
| Individual fixed/random effects | no                     | yes               | yes                         | yes               |
| National fixed/random effects   | no                     | no                | no                          | yes               |
| Observations                    | 6,948                  | 6,948             | 6,948                       | 6,948             |
| R2                              | 0.001                  |                   | 0.782                       |                   |
| Adjusted R2                     | 0.001                  |                   | 0.739                       |                   |
| Log Likelihood                  |                        | -14,063.500       |                             | -13,802.100       |
| Akaike Inf. Crit.               |                        | 28,139.010        |                             | 27,618.210        |
| Bayesian Inf. Crit.             |                        | 28,180.080        |                             | 27,666.130        |
| Residual Std. Error             | 2.803 (df = 6946)      |                   | 1.434 (df = 5793)           |                   |
| F Statistic                     | 5.509** (df = 1; 6946) |                   | 18.010*** (df = 1154; 5793) |                   |
| Note:                           | *p<0.1; **p<0.05; ***p<0.01 |              |                             |                   |
Interpretation
Often, our primary interest is in the main effects (according to the R output, the “fixed effects”) of the model. However, we may also draw information from the varying intercepts (\(\alpha\)).
\[y = a + bx_i + \alpha_j\]
There are two approaches to the interpretation.
Main (non-varying) effects (refer to the grand mean)
Refer to the grand mean (a) and construct a stylized scenario while ignoring the varying intercepts (i.e. treat them as residuals). The interpretation of the effects then follows the exact same rules as in the pooled linear model.
\[y = a + bx_i + \epsilon_j\] The “fixed” (R vocabulary)/non-varying effects of the model can be interpreted and displayed as per usual.
fixef(mod.ran)
## (Intercept) x Reform2016 EPElection
## 2.7758910 -1.4888207 -0.6970583 0.3505659
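As a sketch of this logic, I can fill in the equation by hand with the coefficients above: a party with 10% of the seats, outside an EP election year, pre-reform.

#By-hand point prediction from the non-varying part only
b <- fixef(mod.ran)
b["(Intercept)"] + b["x"]*0.1 + b["Reform2016"]*0 + b["EPElection"]*0
#= 2.776 - 1.489*0.1, i.e. about 2.63 local staffers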
I can create scenarios and use the predict() function for point predictions, or I can use the ggeffects package, for example.
library(ggeffects)

#The varying part of the scenario; take inspiration from the descriptive statistics
vote_share <- seq(0, 0.5, 0.01)

eff <- ggpredict(mod.ran,
                 #What type of effect to illustrate: here, the "fixed"/non-varying effects
                 type = "fe",
                 terms = list(x = vote_share))

#Make the plot
eff %>%
  plot() +
  ylim(c(0,10)) +
  ylab("Local staff size") +
  xlab("Seat share in parliament") +
  ggtitle("Effect of party funding (size of party) on MEP's local spending")
The ggeffects package also includes variation from the groups, so the scenarios we construct tend to be surrounded by more uncertainty than if we had run a pooled model.
Varying effects (refer to group intercepts)
We can also make the interpretation in relation to specific groups/intercepts.
Choose one or two groups/varying intercepts to refer to in your scenarios. This is usually more interesting if your groupings are known quantities. In our example, there is little insight to gain from picking a single MEP. However, when my model includes national intercepts, such scenarios may contribute to the reader's understanding of the findings. This requires us to pull out the random effects from the model object.
We don’t see the random effects in the default R summary of the model, but they are there.
We can ask for just the random (varying) effects. They are reported as deviations from the grand mean.
ranef(mod.ran)$Nationality %>% head
Here, we see that MEPs from Austria have an average staff size 0.44 employees below the grand mean/the mean of means among MEPs.
We can also ask for all the model coefficients. Here, the two intercepts (the grand mean and the varying intercept) are already added together.
coef(mod.ran)$Nationality %>% head
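A small sketch of the bookkeeping: the grand mean plus a country's deviation equals the combined intercept that coef() reports (here for Austria, the first row in the output above).

#Grand mean plus Austria's deviation...
fixef(mod.ran)["(Intercept)"] + ranef(mod.ran)$Nationality["Austria", "(Intercept)"]

#...equals the combined intercept from coef()
coef(mod.ran)$Nationality["Austria", "(Intercept)"]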
We can then fill in the equation for two national groups: Bulgaria and Sweden, for example. I start by extracting the coefficients/parameters.
coef(mod.ran)$Nationality[c(3,27),]
\[y = (a + \alpha_1) + b_1x_{1i} + b_2x_{2i} + b_3x_{3i}\]
\[y_1 = (2.78 + 1.53) - 1.49x_1 - 0.7x_2 + 0.35x_3\] \[y_1 = 4.3 - 1.49x_1 - 0.7x_2 + 0.35x_3\]
I then construct my scenario. I can draw inspiration from the actual data or use the summary statistics.
#Actual data
df %>%
  filter(Nationality == "Bulgaria" | Nationality == "Sweden") %>%
  arrange(ID) %>%
  dplyr::select(Nationality, y, x, EPElection, Reform2016) %>%
  head()
I then fill in and make my predictions for two identical scenarios for two separate MEPs. Here, I use the predict() function instead of filling in by hand.
scenario <- data.frame(
  #Summary statistics showed that 0.05 is a typical change: here, I take a baseline of 10 percent seat share, then increase to 0.15
  x = c(0.1, 0.15),
  EPElection = 0,
  Reform2016 = 0,
  #Change of x in both countries == 4 scenarios
  Nationality = c("Bulgaria", "Bulgaria", "Sweden", "Sweden"),
  #The model contains MEP intercepts, so I need a variable for that; here, I invent an ID that doesn't exist
  ID = c("new")
)

scenario
#Predict
preds <- predict(mod.ran,
                 newdata = scenario,
                 #Since I "invented" an MEP ID, allow.new.levels = T lets predict() treat it as a new group and use the population-level (zero) intercept
                 allow.new.levels = T)

preds
## 1 2 3 4
## 4.1559480 4.0815069 0.6732192 0.5987782
diff(preds)
## 2 3 4
## -0.07444104 -3.40828773 -0.07444104
In the pre-reform EP, outside of an election year, a Bulgarian MEP from a party holding 10% of the seats in the national parliament has an average predicted staff size of 4.16 local employees, while a Swedish MEP in the same situation holds substantially fewer staffers (0.67). The staff size is furthermore predicted to decrease by 0.07 staffers for both MEPs if their party were to gain 5 percentage points more of the seat share in the next national election.
Distributions
Remember, the group means/varying intercepts are assumed to be normally distributed in the hierarchical model. Together, their means also constitute a normal distribution. In other words, we have 28 distributions at the member-state level and more than 1100 at the MEP level.
This is a lot of distributions (and assumptions), so there is some controversy over whether we can reliably estimate them. That's why the lme4 package doesn't report the random intercepts with any p-values. However, R has other packages.
- arm has two twin functions, se.fixef() and se.ranef(), that are useful.
- merTools allows us to also simulate the random/varying coefficients.
REsim() simulates the distribution from the reported random intercept (point estimate/regression coefficient) and its standard error (within-group variation).
library(merTools)

re <- REsim(mod.ran,
            #Simulate 100 times
            n.sims = 100)

re %>% head
The best way is still to display these intercepts graphically. You can use the simulations to construct your own plots: coefplots or density plots are nice possibilities.
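For example, a sketch of a homemade coefplot from the simulations (REsim() returns a data frame with, among others, the columns groupFctr, groupID, median and sd):

#Coefplot of the simulated nationality intercepts with approximate 95% intervals
re %>%
  filter(groupFctr == "Nationality") %>%
  ggplot() +
  geom_pointrange(aes(x = reorder(groupID, median),
                      y = median,
                      ymin = median - 1.96*sd,
                      ymax = median + 1.96*sd)) +
  #Flip the coordinates so the state names are readable
  coord_flip() +
  xlab("Member state") +
  ylab("Simulated deviation from the grand mean")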
plotREsim() is a ready-made function for plotting.

plotREsim(re)
It gives us a sense of the variation between the groups. Here, we see that there is still quite some individual-level variation that we haven’t accounted for with other predictors. The groups marked in black have a 95% confidence interval that does not overlap zero. We can also see there are member states that have higher/lower level of local staff than what we might expect given the predictors in the model.
sjPlot reports effects from all kinds of models. It also gives a nice labelling of our random intercepts. Here, I only ask for the varying intercepts for nationality.
library(sjPlot)
plot_model(mod.ran,
type = "re",
terms = "Nationality")
## [[1]]
##
## [[2]]
Red are states below average; blue are states above average. The color scheme is in relation to the point estimate, not the confidence interval.
Level-2 regression
I can add group-level predictors to my model. The random intercepts are a way to cluster the residuals. The level-two variables are then regressed on these group means.
In R it looks the same, though. Here, I add electoral system to the mix so that I regress the MEP-level residuals on the type of electoral system they compete in.
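In equation form (my notation, in the spirit of the formulation above), the level-2 variable enters a regression on the group intercepts:

\[y_i = a + bx_i + \dots + \alpha_{j[i]} + \epsilon_i\] \[\alpha_j = \gamma\,\text{OpenList}_j + \eta_j, \qquad \eta_j \sim N(0, \sigma^2_\alpha)\]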
mod.ran.t.open <- lmer(y ~
                         x
                       + OpenList
                       + EPElection
                       + Reform2016
                       + (1|ID),
                       df)
summary(mod.ran.t.open)
## Linear mixed model fit by REML ['lmerMod']
## Formula: y ~ x + OpenList + EPElection + Reform2016 + (1 | ID)
## Data: df
##
## REML criterion at convergence: 28107.9
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -7.8842 -0.3212 -0.0456 0.3052 25.9205
##
## Random effects:
## Groups Name Variance Std.Dev.
## ID (Intercept) 6.414 2.533
## Residual 2.067 1.438
## Number of obs: 6948, groups: ID, 1152
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 2.38447 0.13602 17.530
## x -0.56735 0.31900 -1.779
## OpenList 0.71073 0.15431 4.606
## EPElection 0.35047 0.05909 5.931
## Reform2016 -0.69270 0.04722 -14.670
##
## Correlation of Fixed Effects:
## (Intr) x OpnLst EPElct
## x -0.598
## OpenList -0.591 0.053
## EPElection -0.041 -0.020 -0.004
## Reform2016 -0.106 0.055 0.003 0.091
The clustering means that the standard errors are also corrected to account for the correlation within the group (i.e. that observations are not i.i.d.). Here, it also means that the uncertainty around the electoral system variable accounts for the fact that each MEP competed under the same electoral rules throughout the period of study.
The interpretation of these models is the same as before, though.
Cross-level interactions: varying effects by predictors
We can make interactions across levels. I could formulate an expectation that MEPs from party-centered systems are more disciplined, i.e. they rely on their parties to give them a safe seat on the electoral list next election and are willing to chip in to the common pot by hiring more local assistants when their party is in need.
mod.ran.t.open.i <- lmer(y ~
                           x * OpenList
                         + Reform2016
                         + (1|ID),
                         df)
summary(mod.ran.t.open.i)
## Linear mixed model fit by REML ['lmerMod']
## Formula: y ~ x * OpenList + Reform2016 + (1 | ID)
## Data: df
##
## REML criterion at convergence: 28121.6
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -7.8233 -0.3115 -0.0612 0.3035 25.9173
##
## Random effects:
## Groups Name Variance Std.Dev.
## ID (Intercept) 6.258 2.502
## Residual 2.083 1.443
## Number of obs: 6948, groups: ID, 1152
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 2.72722 0.15456 17.645
## x -1.75529 0.43690 -4.018
## OpenList 0.08607 0.21583 0.399
## Reform2016 -0.71099 0.04723 -15.055
## x:OpenList 2.61478 0.63564 4.114
##
## Correlation of Fixed Effects:
## (Intr) x OpnLst Rf2016
## x -0.718
## OpenList -0.711 0.513
## Reform2016 -0.072 0.016 -0.024
## x:OpenList 0.490 -0.687 -0.707 0.037
Indeed. There is an interaction effect, but what does it mean? I usually construct scenarios here. Since I am interested in the main regression coefficients (in R lingo, the “fixed effects”), I can use the usual techniques for interpretation. Here, I use ggpredict() to construct my scenario, then plot it.
eff <- ggpredict(mod.ran.t.open.i,
                 terms = list(
                   #Scenarios where x runs from 0 to 60% seat share in 1 percentage point increments
                   x = seq(0, 0.6, 0.01),
                   #Party- and candidate-centered electoral systems are the "moderators"
                   OpenList = c(0, 1)))

eff %>%
  plot() +
  ggtitle("Effect of party size conditional on the electoral system")
Considering the plot, we can see clearly that the effect of party funding on MEPs' local investment differs depending on the electoral system. More party funding/a larger seat share in party-centered systems (red line) leads MEPs to invest less. More party funding in candidate-centered systems leads MEPs to invest more.
According to Berry, Golder, and Milton (2012), I should interpret both of these effects and compare them with my theory. What does this mean? And how robust is it? From the graphic, it seems that – while positive – the blue line is imprecise. Comparing the minimum and maximum of x, we see that the confidence intervals overlap. The effect in the red line is clearly distinct between the two extreme scenarios.
Another way of interpreting is in text and numbers. I can construct scenarios and calculate first-differences. When I do that, my focus is less on statistical significance; instead, I aim to communicate the substantive meaning of my findings.
first_diff <-
  eff %>%
  as.data.frame() %>%
  filter(x %in% c(0, 0.05)) %>%
  arrange(group, desc(x)) %>%
  group_by(group) %>%
  mutate(diff_staff = predicted - lag(predicted),
         eff_MEP = 1/diff_staff)

first_diff
If we consider a typical scenario where the national party loses 5 percentage points of the seats in the national parliament, we'd expect an average increase in MEPs' local staff size of 0.09 in the party-centered systems in the EU. A reasonable interpretation is that one in eleven MEPs from party-centered systems would react to the party's defeat by hiring an additional local assistant.
Since the marginal effect of x is not distinguishable from zero in the candidate-centered systems, we might point that out here.
Random slopes: varying effects by groups
We can estimate separate slopes for each country. This is a form of interaction effect. However, we specify it differently. It means we will estimate mini-models within each group. It requires a lot of variation (and will “explain” a lot of variation).
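In equation form (again my notation), each nationality j gets its own intercept and its own slope on x, drawn jointly from a bivariate normal distribution; the Corr term in the R output below is the estimated correlation in \(\Sigma\):

\[y_i = \alpha_{j[i]} + \beta_{j[i]} x_i + \dots + \epsilon_i, \qquad \begin{pmatrix} \alpha_j \\ \beta_j \end{pmatrix} \sim N\left( \begin{pmatrix} a \\ b \end{pmatrix}, \Sigma \right)\]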
Here, I estimate a different slope for party seat share per member state.
mod.slope <- lmer(y ~
                  + EPElection
                  + Reform2016
                  + (1|ID)
                  #x has varying slopes by nationality
                  + (x|Nationality),
                  df)
summary(mod.slope)
## Linear mixed model fit by REML ['lmerMod']
## Formula: y ~ +EPElection + Reform2016 + (1 | ID) + (x | Nationality)
## Data: df
##
## REML criterion at convergence: 27512.8
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -7.3179 -0.3262 -0.0320 0.3163 22.9263
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## ID (Intercept) 3.751 1.937
## Nationality (Intercept) 8.821 2.970
## x 69.142 8.315 -0.88
## Residual 2.006 1.416
## Number of obs: 6948, groups: ID, 1152; Nationality, 28
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 2.19920 0.28348 7.758
## EPElection 0.35359 0.05826 6.070
## Reform2016 -0.70453 0.04666 -15.099
##
## Correlation of Fixed Effects:
## (Intr) EPElct
## EPElection -0.023
## Reform2016 -0.030 0.092
We can’t see the random effects, but they are there. Let’s check them out.
Interpretation
The random intercepts are stored, but not immediately reported.
We can always take a look at the coefficients using the coefficients() function. It reports the coefficients for all groupings: their (varying) intercepts and their slopes, regardless of whether the slopes are "fixed" or vary by group.
Here, I look at the coefficients for the first few MEPs.
coefficients(mod.slope)$ID %>% head
And here I can see the varying slopes I estimated.
coefficients(mod.slope)$Nationality %>% head
In the ranef() output below, the intercepts are interpreted as deviations from the (grand) mean; in the coef() output, the grand mean is already added in. Here, for example, Austria has a lower estimated number of local staff than the adjusted average (grand mean).
I can separate out the “fixed” effects (those that are common for all groups) and the “random” effects (the varying slopes and intercepts) using two functions.
fixef(mod.slope) %>% head
## (Intercept) EPElection Reform2016
## 2.1992021 0.3535862 -0.7045293
And here are the varying ones. Printed in full, ranef(mod.slope) runs to more than a thousand lines (one per MEP), so I look at the head of each grouping instead.

ranef(mod.slope)$ID %>% head

## (Intercept)
## 840 0.451494431
## 988 -0.816094099
## 997 -0.996768158
## 1023 1.558450395
## 1037 -0.125861566
## 1038 -1.457848844

ranef(mod.slope)$Nationality %>% head

## (Intercept) x
## Austria -0.42196300 0.99652054
## Belgium -1.20648169 1.36397023
## Bulgaria 1.22966358 2.00000973
## Croatia -2.17067463 11.23974403
## Cyprus 1.43678583 -4.61414807
## Czech Republic 0.10295552 0.21083028
## Poland 4.47336409 -0.64471393
## Portugal -0.58501966 -2.08127264
## Romania 0.55418426 -0.31597943
## Slovakia -0.37174957 -0.08936310
## Slovenia -0.25372979 -2.11202960
## Spain 0.30530168 -2.82617070
## Sweden -1.54829747 -0.07493743
## United Kingdom 1.65142810 -2.99808154
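Reading this output: if the printout comes from ranef(), the country-level values are deviations from the fixed (grand-mean) effects, whereas coef() reports the combined country-specific intercepts and slopes. A minimal sketch of the relationship between the two, assuming mod.slope has fixed effects for the intercept and x only:

#Grand (fixed-effect) intercept and slope
fixef(mod.slope)

#Country-specific deviations from the fixed effects
head(ranef(mod.slope)$Nationality)

#coef() simply adds the two; e.g., the country-specific intercepts
head(fixef(mod.slope)["(Intercept)"] +
       ranef(mod.slope)$Nationality[, "(Intercept)"])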
Coefplot
Our random slopes can also be plotted as coefficient plots ("coefplots"), which are useful for hypothesis testing. We can report the national-level effects of vote shares together with their intercepts using sjPlot.
library(sjPlot)

plot_model(mod.slope,
           #Plot random effects
           type = "re",
           #Variable to be plotted
           terms = "Nationality")
(The call prints a list of two coefficient plots, one per grouping factor.)
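Since plot_model() here returns a list of plots (one per grouping factor), we can store the list and print only the panel we want. A small sketch, assuming the nationality-level plot is the second element; check the printed order in your own session:

#Store all random-effect plots, then print the nationality-level panel only
p <- plot_model(mod.slope, type = "re")
p[[2]]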
Some additional R code: we can also build the plot ourselves.
library(merTools) #REsim() comes from the merTools package

#Simulate the distribution of the random effects from the estimated
#point estimates and their standard errors
re <-
  REsim(mod.slope) %>%
  #Transform it to a data frame; have a look at the data frame to get an idea
  as.data.frame() %>%
  #Select the national variation; the groupFctr variable labels the grouping
  filter(groupFctr == "Nationality")

tab_slope <-
  re %>%
  #Select my variables
  dplyr::select(groupID, term, mean, sd) %>%
  #Pivot to wide format using a function from the tidyr package
  tidyr::pivot_wider(values_from = c(mean, sd),
                     names_from = term) %>%
  #Reorder the columns so each estimate is followed by its standard error
  dplyr::select(1, 2, 4, 3, 5) %>%
  #Give the variables new names; e.g., variable 2 is now named "alpha"
  rename("alpha" = 2,
         "se(alpha)" = 3,
         "beta" = 4,
         "se(beta)" = 5) %>%
  as.data.frame()

tab_slope
tab_slope %>%
  #Arrange according to the size of the effect
  arrange(beta) %>%
  ggplot() +
  #Points indicate the regression coefficient
  geom_point(aes(y = 1:nrow(tab_slope),
                 x = beta)) +
  #Errorbars indicate the uncertainty of the coefficient; here, a 95% conf. interval
  geom_errorbar(aes(y = 1:nrow(tab_slope),
                    xmin = beta - `se(beta)` * 1.96,
                    xmax = beta + `se(beta)` * 1.96),
                #Width of the caps on the whiskers
                width = 0) +
  #Add nationality labels along the y axis
  geom_text(aes(y = 1:nrow(tab_slope),
                x = -50,
                label = groupID),
            hjust = 0) +
  #Vertical line indicates the null effect; do the confidence intervals overlap 0?
  geom_vline(aes(xintercept = 0),
             lty = 2) +
  #Remove the y axis label
  ylab("") +
  #Title
  ggtitle("Effect of vote share within member state delegations")
Effect plots
We can illustrate the effect of the random slopes. However, properly propagating the uncertainty into a full scenario is more than this class can cover.
Here, I take the simulated effect of x in the first group (Austria; I filter the table by groupID) and plot the implied regression line with geom_abline(). The function needs an intercept and a slope to draw a straight line.
tab_slope %>%
  #Add the grand mean to the data
  mutate(a = fixef(mod.slope)[1]) %>%
  filter(groupID == "Austria") %>%
  ggplot() +
  geom_abline(aes(intercept = a + alpha,
                  slope = beta)) +
  #A realistic y axis
  ylim(c(0, 10)) +
  ggtitle("Effect of party seat share in Austria")
We can also illustrate the effect in all countries, either by colour grouping or by a facet plot in ggplot2; the facet version follows, and a colour-grouped sketch comes after it.
tab_slope %>%
  #Add the grand mean to the data
  mutate(a = fixef(mod.slope)[1]) %>%
  ggplot() +
  geom_abline(aes(intercept = a + alpha,
                  slope = beta)) +
  ylim(c(0, 10)) +
  #Nationality is stored by REsim() in the groupID variable
  facet_wrap(~groupID) +
  ggtitle("Effect of changes in party seat share on MEPs' local staff in different countries")
Since we have simulated the random effects, we also have estimates of their standard errors. I can then construct confidence intervals around each country's line.
tab_slope %>%
  #Add the grand mean to the data
  mutate(a = fixef(mod.slope)[1]) %>%
  ggplot() +
  geom_abline(aes(intercept = a + alpha,
                  slope = beta)) +
  #Upper confidence interval
  geom_abline(aes(intercept = a + alpha + 1.96 * `se(alpha)`,
                  slope = beta + 1.96 * `se(beta)`),
              lty = 2) +
  #Lower confidence interval
  geom_abline(aes(intercept = a + alpha - 1.96 * `se(alpha)`,
                  slope = beta - 1.96 * `se(beta)`),
              lty = 2) +
  ylim(c(-10, 10)) +
  #Nationality is stored by REsim() in the groupID variable
  facet_wrap(~groupID) +
  ggtitle("Effect of changes in party seat share on MEPs' local staff in different countries")
From this plot, we see that even with more than 7000 observations, the varying-slope model yields very uncertain country-level results: in several countries the confidence bands comfortably include negative staff sizes, an outcome that is impossible by construction.
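A quick, informal way to quantify this problem is to check how often the model's fitted values themselves fall below zero. A minimal sketch; predict() on a fitted lmer model returns in-sample fitted values, and the comparison with the observed y is just a sanity baseline:

#Share of in-sample fitted values that are negative (an impossible staff size)
mean(predict(mod.slope) < 0)

#Baseline: the observed outcome is never negative
mean(df$y < 0, na.rm = TRUE)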