How to Specify Individual-Level (Random) Effects in Hierarchical Modeling?
Gang Chen
Preface
Hierarchical (also known as mixed-effects or multilevel) modeling is a powerful analytical tool designed to handle complex data structures. However, its power is often matched by the challenges it poses at both the conceptual and specification levels. This blog post is motivated by the common practice of specifying hierarchical models with a simple varying (random) intercept at the individual level. That approach can be problematic, especially when multiple within-individual variables are involved. In this post, we explore several population-level scenarios in neuroimaging data analysis that can serve as valuable templates. We plan to expand this list with additional scenarios as typical patterns emerge.
We wish to underscore the following key points:
Within-individual variables. The term "within-individual variables" (or "repeated-measures") is a conventional notion used in the context of ANOVA. It denotes situations where the measurement unit, such as an experimental participant, experiences all levels of a categorical variable (factor) or multiple instances of a quantitative variable (e.g., time) in longitudinal studies.

Counterbalance. The models outlined here for scenarios involving one or more within-individual factors are not the most intricate ones that could be employed; more sophisticated models could capture more nuanced variance-covariance patterns. However, we believe that the parsimonious models introduced here strike a balance between accounting for data variability and managing computational complexity, making them well-suited for many practical scenarios.

Between-individual variables. The treatment of between-individual variables requires careful consideration. The discussion here primarily concerns within-individual variables, particularly categorical variables (factors). Variables such as sex, patient/control status, and age, which capture differences between individuals, can also be included in a model. However, variable selection is intricate and will be addressed separately.

Implementation. The specifications provided here are especially relevant for population-level analyses conducted using the AFNI program 3dLMEr. This program is flexible and is preferred over its predecessor, 3dLME. It is also more adaptable than the ANCOVA program 3dMVM.
In the scenarios discussed below, we assume:
y is the response variable (e.g., BOLD response) in a hierarchical data structure;
Subj serves as the unit of measurement (e.g., experiment participant);
A, B, and so forth signify within-individual factors (e.g., emotion, congruency, and more). These categorical variables are commonly used in designed experiments to investigate their causal effects on neural response.
1) One within-individual factor
For situations with just a single within-individual (or repeated-measures) factor A, the process is relatively straightforward. Assuming indices a and i represent the factor levels and individuals respectively, we typically define the following hierarchical model for data y_{ai}:
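A form consistent with the symbol definitions that follow is sketched here. It assumes Gaussian individual effects and residuals, takes \tau^2 as the variance of the individual effects \delta_i and \sigma^2 as the variance of the residuals, and uses \epsilon_{ai} as an assumed name for the residual term:

```latex
y_{ai} = m_a + \delta_i + \epsilon_{ai}, \qquad
\delta_i \sim \mathcal{N}(0, \tau^2), \qquad
\epsilon_{ai} \sim \mathcal{N}(0, \sigma^2)
```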
Here, m_a denotes the population-level effect associated with the a-th factor level, \delta_i signifies the effect linked with the i-th individual (commonly known as a random effect), and \sigma^2 and \tau^2 represent the population- and individual-level variances respectively. This formulation is commonly referred to as a linear mixed-effects model with random (or varying) intercepts.
Mapping this model into the program 3dLMEr is straightforward:
-model 'A+(1|Subj)'
In this case, A corresponds to m_a, while (1|Subj) corresponds to \delta_i.
(1.A) Incorporation of within-individual quantitative variables
The above model can be extended to include individual-level slopes. For example, when a rating score r_{ai} is available for the i-th individual at the a-th level of the factor, we may modify the model to
where s_a and \theta_i are the slopes at the population and individual levels, respectively. This formulation is commonly referred to as a linear mixed-effects model with both random intercepts and random slopes.
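Written out, the model just described may take the following form. This is a sketch that assumes the individual-level intercept and slope are jointly Gaussian; the covariance symbol \Sigma and the residual name \epsilon_{ai} are assumed here, not taken from the text:

```latex
\begin{aligned}
y_{ai} &= m_a + s_a\, r_{ai} + \delta_i + \theta_i\, r_{ai} + \epsilon_{ai}, \\
(\delta_i, \theta_i)^{\top} &\sim \mathcal{N}(\mathbf{0}, \Sigma_{2 \times 2}), \qquad
\epsilon_{ai} \sim \mathcal{N}(0, \sigma^2)
\end{aligned}
```

The level-specific intercepts m_a and slopes s_a together form the population-level part, while \delta_i and \theta_i are the individual-level intercept and slope.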
The specification in 3dLMEr is now updated to
-model 'A*rating+(1+rating|Subj)'
To improve the interpretability of differences among the levels of the factor, it may be essential to center the variable rating within each level of the factor.
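Centering within each factor level can be done on the data table before it is passed to the analysis. The following is a minimal Python sketch; the function name `center_within_levels`, the row layout, and the toy values are all hypothetical and serve only to illustrate the arithmetic:

```python
from collections import defaultdict
from statistics import fmean


def center_within_levels(rows, factor="A", qvar="rating"):
    """Return copies of rows with qvar centered at its mean within each factor level."""
    # Collect the values of the quantitative variable per factor level.
    by_level = defaultdict(list)
    for row in rows:
        by_level[row[factor]].append(row[qvar])
    # Mean of the quantitative variable at each level.
    level_means = {level: fmean(values) for level, values in by_level.items()}
    # Subtract the level-specific mean from each observation.
    return [{**row, qvar: row[qvar] - level_means[row[factor]]} for row in rows]


# Hypothetical toy data: two individuals, two levels of factor A.
rows = [
    {"Subj": "s1", "A": "pos", "rating": 2.0},
    {"Subj": "s2", "A": "pos", "rating": 4.0},
    {"Subj": "s1", "A": "neg", "rating": 5.0},
    {"Subj": "s2", "A": "neg", "rating": 9.0},
]
centered = center_within_levels(rows)
```

After centering, the mean of rating is zero at every level of A, so differences among the levels of A are evaluated at each level's average rating rather than at a rating of zero.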
(1.B) Incorporation of multiple samples
Another extension to the hierarchical model above with one within-individual factor A is the scenario with multiple samples. Suppose that the response variable (e.g., BOLD response) is measured across N samples (e.g., scanning runs). With an extra index n for samples (n=1,2,...,N), the original model is now extended to
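A form consistent with the 3dLMEr specification that follows is sketched here. The variance symbols \tau_A^2 and the residual name \epsilon_{ain} are assumptions made for illustration:

```latex
\begin{aligned}
y_{ain} &= m_a + \delta_i + \alpha_{ai} + \epsilon_{ain}, \\
\delta_i &\sim \mathcal{N}(0, \tau^2), \qquad
\alpha_{ai} \sim \mathcal{N}(0, \tau_A^2), \qquad
\epsilon_{ain} \sim \mathcal{N}(0, \sigma^2)
\end{aligned}
```

The term \alpha_{ai} is estimable here because each individual contributes several samples per factor level; with a single observation per cell it would be indistinguishable from the residual.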
This extended model can be implemented through 3dLMEr as
-model 'A+(1|Subj)+(1|A:Subj)'
2) Two within-individual factors
Now, let us extend our discussion to scenarios with two within-individual (or repeated-measures) factors, say A and B. Here, indices a, b, and i denote the levels of factors A and B and individuals, respectively. In such cases, a model with only a random intercept would not be appropriate, because it assumes the same correlation between every pair of conditions within an individual. Instead, we consider the following hierarchical model for the data y_{abi}:
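A form consistent with the 3dLMEr specification that follows is sketched here. The name \alpha_{ai} appears later in the text; \beta_{bi}, the variance symbols \tau_A^2 and \tau_B^2, and the residual name \epsilon_{abi} are assumptions:

```latex
\begin{aligned}
y_{abi} &= m_{ab} + \delta_i + \alpha_{ai} + \beta_{bi} + \epsilon_{abi}, \\
\delta_i &\sim \mathcal{N}(0, \tau^2), \quad
\alpha_{ai} \sim \mathcal{N}(0, \tau_A^2), \quad
\beta_{bi} \sim \mathcal{N}(0, \tau_B^2), \quad
\epsilon_{abi} \sim \mathcal{N}(0, \sigma^2)
\end{aligned}
```

Here m_{ab} is the population-level effect at the combination of the a-th level of A and the b-th level of B, while \alpha_{ai} and \beta_{bi} let each individual deviate separately at each level of A and of B.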
To map this model into the program 3dLMEr, we use the following specification:
-model 'A*B+(1|Subj)+(1|A:Subj)+(1|B:Subj)'
If a within-individual quantitative variable like rating is available across all the levels of factors A and B, consider the following specification:
-model 'A*B*rating+(1+rating|Subj)+(1+rating|A:Subj)+(1+rating|B:Subj)'
Again, proper centering might be essential for the interpretability of some effects.
3) Three within-individual factors
Extending our approach from two within-individual factors to three, the case involving factors A, B, and C should come naturally. Indices a, b, c, and i represent the levels of factors A, B, and C and individuals, respectively. We adopt the following hierarchical model for the data y_{abci}:
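A form consistent with the 3dLMEr specification that follows is sketched here. Apart from \delta_i and \alpha_{ai}, every symbol for the individual-specific terms is an assumed name chosen for illustration:

```latex
\begin{aligned}
y_{abci} = {} & m_{abc} + \delta_i + \alpha_{ai} + \beta_{bi} + \gamma_{ci} \\
& + (\alpha\beta)_{abi} + (\alpha\gamma)_{aci} + (\beta\gamma)_{bci} + \epsilon_{abci}
\end{aligned}
```

Each individual-specific term corresponds to one random-intercept term in the specification: \delta_i to Subj, the three single-factor terms to A:Subj, B:Subj, and C:Subj, and the three two-factor terms to A:B:Subj, A:C:Subj, and B:C:Subj.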
The distribution assumptions for individual-specific effects like \delta_i, \alpha_{ai}, etc., are similar to those in the case of two within-individual factors, and are thus not repeated here.
The mapping for this model to the program 3dLMEr becomes:
-model 'A*B*C+(1|Subj)+(1|A:Subj)+(1|B:Subj)+(1|C:Subj)+(1|A:B:Subj)+(1|A:C:Subj)+(1|B:C:Subj)'
What if there are more than three within-individual factors?
In this scenario, we would like to emphasize two key points. First, an investigator who plans to design an experiment with this level of complexity should anticipate, and plan strategies for managing, the resulting model complexity. Second, extending the modeling process beyond the three-factor case demonstrated above is not significantly more challenging, although it may involve increased technical intricacy and computational cost.
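The one-, two-, and three-factor specifications above share a pattern: a full-factorial population-level part, a random intercept for Subj, and a random intercept for the interaction of Subj with every combination of fewer than all factors. The Python sketch below generates a model string by that pattern. The helper name `lmer_model_spec` is hypothetical, and applying the pattern to four or more factors is an extrapolation, not a specification given in this post:

```python
from itertools import combinations


def lmer_model_spec(factors, subj="Subj"):
    """Build a model string following the pattern of the templates above.

    Assumption: one observation per individual per cell, so the interaction
    of all factors with subj is left out (it would coincide with the residual).
    """
    # Population-level part: full factorial of the within-individual factors.
    fixed = "*".join(factors)
    # Individual-level part: intercepts for subj and for each proper
    # subset of factors crossed with subj.
    random_terms = [f"(1|{subj})"]
    for size in range(1, len(factors)):
        for combo in combinations(factors, size):
            random_terms.append(f"(1|{':'.join(combo)}:{subj})")
    return "+".join([fixed] + random_terms)


# Hypothetical four-factor design.
spec4 = lmer_model_spec(["A", "B", "C", "D"])
print(spec4)
```

For four factors this yields 15 individual-level terms (1 + 4 + 6 + 4), which illustrates how quickly the specification grows. The sketch does not cover random slopes or the multiple-sample case in (1.B).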