Statistics in general, and logit models in particular, are based on comparisons. Every statement is made relative to a reference group. Since the GLMs in this class draw from a probability distribution in the exponential family and connect it to the predictors through a non-linear link function, the effect size of one variable also depends on the values of the other variables in the regression. These exercises are intended to help you see that.
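A minimal sketch of that last point, under assumed numbers: the coefficient (1.5) and the three baseline values below are hypothetical and do not come from any fitted model. The same shift on the logodds scale translates into different changes in probability depending on where the other variables have placed the observation.

```r
# Hypothetical coefficient on the logodds scale (not from a fitted model)
beta <- 1.5

# Hypothetical baselines: the logodds implied by all the other variables
baseline <- c(low = -3, middle = 0, high = 3)

# plogis() is base R's inverse logit: it maps logodds to probabilities
p_before <- plogis(baseline)
p_after  <- plogis(baseline + beta)

# Change in probability produced by the same one-unit increase
change <- p_after - p_before
round(cbind(p_before, p_after, change), 3)
```

The change is largest for the baseline closest to a probability of 0.5 and shrinks towards the extremes, even though the coefficient is identical in all three rows.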

We will be working on the likelihood that judges at the Court of Justice of the European Union (CJEU) leave office at the expiry of their mandate. They serve six-year terms that can be renewed. The question is whether their government will want to renew their mandate. The dependent variable exit reports whether a judge leaves (1) or remains for another term (0). The data is the same as in the R notebook.

The first exercise focuses on the interpretation of results from a model that is already estimated (Ward and Ahlquist 2018, ch. 3; Hermansen 2023, ch. 7 and 10). The second and third exercises allow you to see how those estimates are calculated.

mod <- glm(exit ~
             # Political difference between governments
             free_economy_diff
             # Performance of judge
             + performance
             + age
             + tenure
             + attendance
             + court,
           # Recoding strategy (logit transformation) and binomial probability distribution
           family = binomial(link = "logit"),
           # Data
           data = df)

free_economy_diff captures the political distance between the government that appointed the judge at the beginning of the term and the government that might re-appoint the judge today.

performance captures the share of salient/high-impact assignments the judge received in the previous term compared to the share of high-impact assignments on the Court.

**Exit decisions among judges at the CJEU (a binomial logit)**

Dependent variable: exit

| | (1) | (2) |
| --- | --- | --- |
| free_economy_diff | 1.505*** (0.386) | 1.490*** (0.443) |
| performance | -0.184 (0.272) | -0.556* (0.307) |
| age | | 0.132*** (0.027) |
| tenure | | 0.045 (0.036) |
| attendance | | -0.015* (0.009) |
| courtGC | | 0.809** (0.376) |
| Constant | -1.383*** (0.308) | -9.746*** (1.759) |
| Observations | 250 | 235 |
| Log Likelihood | -137.610 | -110.922 |
| Akaike Inf. Crit. | 281.220 | 235.844 |

Note: standard errors in parentheses; *p<0.1; **p<0.05; ***p<0.01
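The coefficients in the table are on the logodds scale. As a rough sketch of how they can be turned into predicted probabilities, the block below plugs the model 1 estimates into the inverse logit for four scenarios. The scenario values (0 and 1 on both predictors) are placeholders chosen for illustration only, not substantively meaningful increments; model 1 is used because it requires no assumptions about the remaining covariates.

```r
# Coefficients copied from model 1 in the table (logodds scale)
b <- c(intercept = -1.383, free_economy_diff = 1.505, performance = -0.184)

# Hypothetical scenario values: every combination of 0 and 1 on both predictors
scenarios <- expand.grid(free_economy_diff = c(0, 1),
                         performance = c(0, 1))

# Linear predictor (logodds) for each scenario
scenarios$logodds <- b[["intercept"]] +
  b[["free_economy_diff"]] * scenarios$free_economy_diff +
  b[["performance"]] * scenarios$performance

# Inverse logit: from logodds to predicted probability of exit
scenarios$prob <- plogis(scenarios$logodds)
scenarios

# First differences: effect of political distance at each level of performance
diff_low_perf  <- scenarios$prob[2] - scenarios$prob[1]
diff_high_perf <- scenarios$prob[4] - scenarios$prob[3]
c(low_performance = diff_low_perf, high_performance = diff_high_perf)
```

The same approach extends to model 2 once you have chosen values for age, tenure, attendance and court.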

Exercise 1: Interpretation (the logistic transformation)

Replicate model 2.

What is the marginal effect of a one-unit increase in each of the following predictors on the probability that a judge is replaced (exit == 1)?
- Ideology: free_economy_diff
- Past performance/influence: performance
What would you say is a good increment for a “high-performing” judge? Can you make a partial scenario and report the marginal effect of judges’ performance on governments’ decision to replace a judge?
Can you illustrate the two effects graphically?
What is the marginal effect of a left-right overturn in government during a judge’s mandate on their probability of being replaced? To find a good increment, draw on the descriptive statistics.
- use the filter() function to find the appointing and reappointing prime minister’s party family (you might want to eyeball the data using the View() function first) (family_id, family_id_ren).
- use the group_by() and reframe() functions to calculate a measure for a “typical” left-right shift
- fill in the partial scenario and make a catchy sentence!
Calculate the first difference: Consider four scenarios and compare the predicted probabilities that a judge exits the Court for high and low levels of both performance and political distance.
- what are the predicted probabilities for the two groups?
- what is the effect of ideology among high-performing judges? What is the effect among low-performing judges?
- who stands to gain the most from performing better?
Explore the compensation threshold: How much better must a judge perform to compensate for a median political distance in government preferences (i.e. free_economy_diff == 0.24)? Given the distribution of performance, how realistic is it for judges to survive based on merit alone?

You can follow the three steps:

Step 1: The logistic regression model

The log-odds of exit are given by:

\[ \log\left(\frac{P(\text{exit} = 1)}{1 - P(\text{exit} = 1)}\right) = \beta_0 + \beta_1 \cdot \text{free_economy_diff} + \beta_2 \cdot \text{performance} + \dots \]

We are interested in finding how much performance is needed to offset a given level of free_economy_diff.

Step 2: Neutralize the effect

To “compensate,” we set the combined effect of free_economy_diff and performance to zero, meaning they cancel each other out:

\[ \beta_1 \cdot \text{free_economy_diff} + \beta_2 \cdot \text{performance} = 0 \]

Step 3: Solve for performance

Rearrange for performance:

\[ \text{performance} = -\frac{\beta_1}{\beta_2} \cdot \text{free_economy_diff} \]

This is the compensation equation: for each unit increase in free_economy_diff, performance must increase by \(-\beta_1 / \beta_2\) to keep the probability of exit unchanged.
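As a sketch of how the three steps might look in R, the block below plugs the model 2 estimates from the table into the compensation equation. The object names are made up for the example, and the point estimates are treated as fixed (their uncertainty is ignored).

```r
# Coefficients copied from model 2 in the table (logodds scale)
beta_diff <- 1.490   # free_economy_diff
beta_perf <- -0.556  # performance

# Median political distance, as given in the exercise text
diff_value <- 0.24

# Compensation equation: performance = -(beta_1 / beta_2) * free_economy_diff
perf_needed <- -(beta_diff / beta_perf) * diff_value
perf_needed

# Check: the two contributions to the logodds cancel out
beta_diff * diff_value + beta_perf * perf_needed
```

Comparing perf_needed to the observed distribution of performance in the data (for example with summary() or quantile()) indicates how realistic such a compensation is.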

The recoding of the dependent variable (logit transformation)

To obtain a continuous and unbounded dependent variable on which to run a regression, we recode the 0s and 1s. We do this by making comparisons between observations that have a successful outcome (1s) and those that have a failure (0s). That is, we sum over the number of successes and the number of failures, then compare them. All the regression coefficients are in fact the result of such comparisons.

In this exercise, you will calculate the regression coefficients of two simple binomial logistic regressions. Remember that all coefficients are reported as logodds (the intercept) and changes in logodds (the slopes, i.e. the log of an odds ratio).

Download the data from my website: https://siljehermansen.github.io/teaching/beyond-linear-models/Reappointments.rda

Exercise 2: Base-line model without predictors.

We’ll start out by calculating the intercept in an intercept-only model. \(y\) reports whether each judge exited the Court or not, while \(z\) reports the logodds that the judge exited.

\[y = \alpha\] \[z = \alpha\]

\[ z = \text{logit}(p) = \log\left(\frac{p}{1-p}\right) \]
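A short sketch of the logit transformation, taken here as the natural log of the odds, and of its inverse. The probabilities are arbitrary example values; qlogis() and plogis() are base R's logit and inverse logit.

```r
# Arbitrary example probabilities
p <- c(0.1, 0.25, 0.5, 0.9)

# From probability to odds, then to logodds
odds <- p / (1 - p)
logodds <- log(odds)

# qlogis() computes the same logit in one step
all.equal(logodds, qlogis(p))

# plogis() reverses the transformation: from logodds back to probabilities
all.equal(plogis(logodds), p)
```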

In an intercept-only model, there is no change between groups, so the only comparison is between the number of successes and failures in the data.

a. Calculate the intercept

Calculate the probability, then the odds, then the logodds that a judge will exit (exit). To do so, you will have to calculate the number of judges that exited (successes) and the number of judges that remained on the Court (failures).

sum over how many exited the court
sum over how many did not exit the court
divide the number of exits by the number of non-exits (this gives you the odds)
logtransform

You have your logodds!

b. Run a binomial regression with only an intercept.

How do your logodds compare with the regression coefficients?

mod0 <- glm(exit ~ 1,
            family = "binomial",
            data = df)
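To see the logic without the real data, here is a sketch on a toy data set. The counts (30 exits, 70 non-exits) and the names toy and mod0_toy are invented for illustration and have nothing to do with the CJEU data.

```r
# Toy data, invented for illustration: 30 exits and 70 non-exits
toy <- data.frame(exit = c(rep(1, 30), rep(0, 70)))

# Count successes (exits) and failures (non-exits)
n_exit <- sum(toy$exit == 1)
n_stay <- sum(toy$exit == 0)

# Probability, odds and logodds by hand
p_hand <- n_exit / (n_exit + n_stay)
odds_hand <- n_exit / n_stay
logodds_hand <- log(odds_hand)

# Intercept-only logit model on the same toy data
mod0_toy <- glm(exit ~ 1,
                family = binomial(link = "logit"),
                data = toy)

# The two numbers should agree up to numerical precision
c(by_hand = logodds_hand, glm = unname(coef(mod0_toy)))
```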

Exercise 3: Model with one predictor

Now, let’s expand the analysis to a binary predictor. We want to describe the likelihood that a judge will exit the Court as a function of whether the judge has passed the age threshold recorded in the binary variable age60.

\(y = \alpha + \beta x\)

In a model with predictors, we add a second comparison between the groups. In this example, we have a binary predictor (age60).

a. Calculate the odds of exiting the court among young and old judges

divide the data in two groups according to the values of the predictor (age60).
calculate the odds of exiting the court in each group separately in the same way as you did in exercise 2.

b. Calculate the intercept.

The intercept (\(\alpha\)) is defined as the value of \(y\) when all the \(x\)s are 0. In our model, it reports the logodds when age60 == 0.

logtransform the odds of exiting the court among the age60 == 0 group

c. Calculate the slope parameter.

calculate the ratio between the two odds (odds of age60 == 1 vs. odds of age60 == 0).
logtransform.

d. Run the model in R and compare.

Did you find the same thing?

mod1 <- glm(exit ~
              age60,
            family = "binomial",
            data = df)
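The same comparison can be sketched on toy data. The group sizes below, the helper odds() and the names toy and mod1_toy are all invented for the example; only the variable names exit and age60 mirror the exercise.

```r
# Toy data, invented for illustration:
# age60 == 0: 20 exits, 60 non-exits; age60 == 1: 30 exits, 30 non-exits
toy <- data.frame(
  exit  = c(rep(1, 20), rep(0, 60), rep(1, 30), rep(0, 30)),
  age60 = c(rep(0, 80), rep(1, 60))
)

# Hypothetical helper: odds = number of successes / number of failures
odds <- function(y) sum(y == 1) / sum(y == 0)

# Odds of exiting within each group
odds_young <- odds(toy$exit[toy$age60 == 0])
odds_old   <- odds(toy$exit[toy$age60 == 1])

# Intercept: logodds in the reference group (age60 == 0)
alpha_hand <- log(odds_young)

# Slope: log of the odds ratio (age60 == 1 vs. age60 == 0)
beta_hand <- log(odds_old / odds_young)

# Logit model with the binary predictor on the same toy data
mod1_toy <- glm(exit ~ age60,
                family = binomial(link = "logit"),
                data = toy)

# The rows should agree up to numerical precision
rbind(by_hand = c(alpha_hand, beta_hand), glm = unname(coef(mod1_toy)))
```

Because the model has one parameter per group, the maximum likelihood estimates reproduce the group-wise logodds.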

Literature

Hermansen, Silje Synnøve Lyder. 2023. R i praksis - en introduktion for samfundsvidenskaberne. 1st ed. Copenhagen: DJØF Forlag. https://www.djoef-forlag.dk/book-info/r-i-praksis.

Ward, Michael D., and John S. Ahlquist. 2018. Maximum Likelihood for Social Science: Strategies for Analysis. Analytical Methods for Social Research. Cambridge: Cambridge University Press. https://doi.org/10.1017/9781316888544.

Problem set: Back and forth in the logistic regression

Silje Synnøve Lyder Hermansen

2025-03-12
