Modeling Twin Data in 3dLME with nesting? [q for Gang]

Hi Gang,

I have a question about nesting in 3dLME, which is a challenge I am currently facing trying to perform second-level analysis of a twin dataset in 3dLME. My dataset is composed of 134 MZ twins, 50 same sex DZ twins, 70 different sex DZ twins, and 78 subjects for whom we don’t have scans of their twins. Participants completed the MID task with three event conditions of interest (cue5, cue1, cue0), and we have the first level analysis buckets, which we computed using 3dREMLfit.

I am interested in two things.

First, I want to know whether the task “worked”, that is, if the MID contrast for cue5-cue0 shows the same pattern of significant clusters in our sample as has been shown previously in the literature, but controlling for non-independence of twins.

Second, I want to know whether differences in activation between cue5 and cue0 are related to age (quantitative covariate) and other covariates of interest.

What I am struggling with the most is goal #1, more specifically, on how to tell 3dLME to account for two levels of nesting. My thinking is: Level 1 would be each individual person (since we have repeated measures of different conditions from each person), and then Level 2 would be the twin pair (because each person may be very closely genetically related to another participant in the sample, such that the observations are nonindependent in this sense). I know that the Subj column accounts for one level, but I’m not fully sure of how to account for the second. Let me show an abbreviated example of what I currently have:

model: cond*age+sex+TwinType
raneff: ‘~1’
glts I want:

  1. ‘cond : 1cue5 -1cue0’ [goal #1]
  2. 'cond : 1cue5 -1cue0 age : ’ [goal #2]

data:
Subj ID cond sex age TwinType InputFile
aa 001 cue0 M 27 MZ /data/stats.sub-aaTwin1_REML+tlrc’[CueZero#0_Coef]’
aa 001 cue1 M 27 MZ /data/stats.sub-aaTwin1_REML+tlrc’[CueOne#0_Coef]’
aa 001 cue5 M 27 MZ /data/stats.sub-aaTwin1_REML+tlrc’[CueFive#0_Coef]’
aa 002 cue0 M 27 MZ /data/stats.sub-aaTwin2_REML+tlrc’[CueZero#0_Coef]’
aa 002 cue1 M 27 MZ /data/stats.sub-aaTwin2_REML+tlrc’[CueOne#0_Coef]’
aa 002 cue5 M 27 MZ /data/stats.sub-aaTwin2_REML+tlrc’[CueFive#0_Coef]’
bb 003 cue0 F 24 Single /data/stats.sub-bbTwin2_REML+tlrc’[CueZero#0_Coef]’
bb 003 cue1 F 24 Single /data/stats.sub-bbTwin2_REML+tlrc’[CueOne#0_Coef]’
bb 003 cue5 F 24 Single /data/stats.sub-bbTwin2_REML+tlrc’[CueFive#0_Coef]’
cc 004 cue0 M 24 DZ_DiffSex /data/stats.sub-ccTwin1_REML+tlrc’[CueZero#0_Coef]’
cc 004 cue1 M 24 DZ_DiffSex /data/stats.sub-ccTwin1_REML+tlrc’[CueOne#0_Coef]’
cc 004 cue5 M 24 DZ_DiffSex /data/stats.sub-ccTwin1_REML+tlrc’[CueFive#0_Coef]’
cc 005 cue0 F 24 DZ_DiffSex /data/stats.sub-ccTwin2_REML+tlrc’[CueZero#0_Coef]’
cc 005 cue1 F 24 DZ_DiffSex /data/stats.sub-ccTwin2_REML+tlrc’[CueOne#0_Coef]’
cc 005 cue5 F 24 DZ_DiffSex /data/stats.sub-ccTwin2_REML+tlrc’[CueFive#0_Coef]’ \

In the Subj column, I put the twin pair id (Level 2). Then, in the ID column, I differentiate paritcipants (Level 1). Currently, it’s just generating a random intercept for twin pair right? But given the need for nesting, how do I specify the model and random effects in 3dLME?

Thank you very, very much!!

I need to think about this a little bit.

Are you only interested in the contrast of cue5-cue0? In other words, cue1 is irrelevant at the group level?

Hi Gang,

Thank you very much for taking the time to read my question and think about it, I really appreciate it. I would say that the Cue5-Cue0 contrast would be the most relevant to our hypothesis, since it’s likely the strongest contrast (high reward vs no reward). We would probably be interested in doing this same analysis with a 0.5Cue5 +0.5Cue1 -1*Cue0 contrast (so, some reward vs. no reward) and see what that looks like as well. But, if that is too complicated, we would be ok with just Cue5-Cue0.

Thank you once again!! We’ve been kinda stumped with this analysis for a while.

I assume that you’re not interested in things such as heritability among different twin types. So, I would just simplify the situation by treating/pretending each pair of twins as one subject and using their effect estimates as duplicates. You can use 3dMVM, and the effects from each twin pair are averaged by 3dMVM if each twin pair shares the same subject ID. Does this sound like a reasonable approach to you?

Hi Gang,

Thank you so much for your response!!

I assume that you’re not interested in things such as heritability among different twin types.

For the purposes of this specific analysis - that is, looking at whether the task worked as it was supposed to, and whether it’s correlated to age, no. However, I did read your recent HBM paper on ICC (http://onlinelibrary.wiley.com/doi/10.1002/hbm.23909/abstract) and am definitely interested in looking into this later. I was actually planning to ask you at some point for some clarification on how to run ICC(1,1) using the best current available AFNI tools, haha. But for now, yes, it’s ok to ignore that.

So, I would just simplify the situation by treating/pretending each pair of twins as one subject and using their effect estimates as duplicates. You can use 3dMVM, and the effects from each twin pair are averaged by 3dMVM if each twin pair shares the same subject ID. Does this sound like a reasonable approach to you?

Thanks for the suggestion! I have a few questions though:

  1. Do you imply that I should use 3dcalc or something like that to compute the average coefficient map for each twin pair, for each condition; and then feed these average coefficient maps into 3dMVM? So, for each pair, there would be 3 rows in the data table (one for each condition)? Or, do you mean that I can just leave the data table as it is, and 3dMVM will average the repeated observations automatically?

  2. By averaging the twins, am I not losing degrees of freedom / power? That is, this approach would allow me the same degrees of freedom as splitting the sample in half (such that no twin pair is in the same group), and then running the same analyses on each cohort, correct?

A clarification is in order here. So, my incentive with looking into a LME approach here was to attempt to maximize power while accounting for non-independence in the data. One thing I did previously, for looking at the interaction and age specifically (Goal #2), was feeding the first-level contrasts (e.g., cue5-cue0) as the observed variables, such that each twin pair had two rows and the non-independence was accounted for by the random intercept (specified as ~1, i.e., by “subj” column). This approach models age as a main effect on the contrast, and is straightforward to specify since there’s only one “level” to account for. Although, it seems like I’d have to do a test for significance of the intercept in this approach, which doesn’t seem possible using 3dLME (at least from my knowledge).

  1. I guess what I am wondering here then is, will I only be able to confirm that the task is “doing what it’s supposed to” (Goal #1) using the 3dMVM/averaged approach, or if there is some sort of way to do it in a more sophisticated fashion using a LME model. Like, for example, specifying nested random effects like ‘(1|subj)+(1|subj:ID)’ [[ i.e., ‘(1|twinpair)+(1+|twinpair:individual)’ ]]

Thank you so, so much!!!

Do you imply that I should use 3dcalc or something like that to compute the average coefficient map
for each twin pair, for each condition; and then feed these average coefficient maps into 3dMVM?

No, just directly provide input files for each pair to 3dMVM: two rows per condition.

for each pair, there would be 3 rows in the data table (one for each condition)?

6 rows per pair.

By averaging the twins, am I not losing degrees of freedom / power?

Not really. You have enough number of twins, so I don’t think that DF is an issue.

That is, this approach would allow me the same degrees of freedom as splitting the sample in half
(such that no twin pair is in the same group), and then running the same analyses on each cohort, correct?

Right.

my incentive with looking into a LME approach here was to attempt to maximize power while
accounting for non-independence in the data.

I’d just go with 3dMVM, which is more straightforward.