F-test, post-hoc testing

Dear experts,

We have two questions:

  1. In our 3dLME model (see below) we have two factors, Age and Time (2 Age groups and 3 time points; coded as Group and Session in the model), plus their interaction. From this model we are interested in the main effect of Time and in the Age * Time interaction. Each group contains 15 subjects.

When writing the results in our manuscript, co-authors comment on the fact that a certain glt-test is a post-hoc test and should only be performed on significant brain regions in the main effect. Of course it is possible to perform a two-stage approach, and include the outcome of the main effect as a mask for the glt-tests. However, this feels rather artificial to us because normally, glt-tests are already specified in the same model, similar to for instance SPM.

What is your opinion on this matter, and how should we reply/tackle this question?

  2. Related to this: when doing the post-hoc test at the whole-brain level, we see in AFNI that the resulting F-map of the main effect compared to a Z-map from a corresponding glt test, shows much smaller regions in the F-test (at the same p-value). We find this somewhat counterintuitive. Is this related to the way we should perform post-hoc testing (question 1), or could this be due to a more conservative way the F-test is calculated?

Many thanks in advance,

Jan-Bernard and Kelly

Model specification: Please note that we deleted some irrelevant glts for this thread, which is why the glt numbering below is not consecutive.

3dLME -prefix 2LVL_AgeByTime \
-jobs 24 \
-model "Group*Session" \
-SS_type 3 \
-ranEff '~1' \
-num_glt 29 \
-gltLabel 1 '(YPostVsPre)Vs(OPostVsPre)' -gltCode 1 'Group : 1*young -1*old Session : -1*pre 1*post' \
-gltLabel 5 '(YRetVsPost)Vs(ORetVsPost)' -gltCode 5 'Group : 1*young -1*old Session : -1*post 1*retention' \
-gltLabel 13 'YoungPreVsPost' -gltCode 13 'Group : 1*young Session : 1*pre -1*post' \
-gltLabel 15 'OldPreVsPost' -gltCode 15 'Group : 1*old Session : 1*pre -1*post' \
-gltLabel 18 'YoungRetVsPost' -gltCode 18 'Group : 1*young Session : 1*retention -1*post' \
-gltLabel 20 'OldRetVsPost' -gltCode 20 'Group : 1*old Session : 1*retention -1*post' \
-dataTable \
Subj Group Session InputFile \
O1 old pre /data/local_dir3/motorlearning/MixedEffects/input_AgeByTime/OlderAdults/O1/Results_afterART/con_0004.nii \
O1 old post /data/local_dir3/motorlearning/MixedEffects/input_AgeByTime/OlderAdults/O1/Results_afterART/con_0006.nii \
O1 old retention /data/local_dir3/motorlearning/MixedEffects/input_AgeByTime/OlderAdults/O1/Results_afterART/con_0008.nii

Jan-Bernard and Kelly,

we see in AFNI that the resulting F-map of the main effect compared to a Z-map from a corresponding glt test,
shows much smaller regions in the F-test (at the same p-value). We find this somewhat counterintuitive.

Which main effect and which GLT?

It seems that you have a simple two-way ANOVA data structure, so it would be more straightforward to use 3dMVM; that is, replace the following lines

3dLME -prefix 2LVL_AgeByTime \
-jobs 24 \
-model "Group*Session" \
-SS_type 3 \
-ranEff '~1' \

with

3dMVM -prefix 2LVL_AgeByTime2 \
-jobs 24 \
-bsVars "Group" \
-wsVars "Session" \

When writing the results in our manuscript, co-authors comment on the fact that a certain glt-test is
a post-hoc test and should only be performed on significant brain regions in the main effect. Of course
it is possible to perform a two-stage approach, and include the outcome of the main effect as a mask
for the glt-tests. However, this feels rather artificial to us because normally, glt-tests are already
specified in the same model, similar to for instance SPM.

What is your opinion on this matter, and how should we reply/tackle this question?

Your co-authors' opinion is more aligned with the common recommendation in the broader statistical field for post hoc testing in simple situations. In practice, however, the difference is usually subtle and mostly negligible for neuroimaging data analysis: you may sometimes see a statistically significant effect in a post hoc test that does not correspond to a statistically significant main effect, or vice versa. I would not worry too much about this for two reasons: 1) such scenarios are rare, and when they do occur they tend to be marginal cases; 2) the small impact of such subtleties is usually overwhelmed by the correction step at the cluster level.
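
As a toy illustration of how such disagreements can arise (this is not part of the analysis above), the sketch below compares a 2-df omnibus test with a single 1-df contrast at two made-up voxels. It rests on several simplifying assumptions: the three-level factor is represented by two orthogonal, standardized contrasts; a large-sample approximation is used, so the sum of the two squared z-values is treated as chi-square with 2 degrees of freedom instead of a proper F-statistic with finite denominator degrees of freedom; and the z-values and the threshold `ALPHA` are hypothetical.

```python
import math


def contrast_p(z):
    """Two-sided p-value for a single 1-df contrast with z-statistic z."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def omnibus_p(z1, z2):
    """p-value of a 2-df omnibus test built from two orthogonal contrasts.

    Large-sample assumption: z1**2 + z2**2 follows a chi-square distribution
    with 2 degrees of freedom, whose survival function is exp(-x / 2).
    """
    return math.exp(-(z1 ** 2 + z2 ** 2) / 2.0)


ALPHA = 0.01  # hypothetical voxel-wise threshold


def classify(z1, z2, alpha=ALPHA):
    """Return (omnibus significant?, strongest single contrast significant?)."""
    best_contrast_p = min(contrast_p(z1), contrast_p(z2))
    return omnibus_p(z1, z2) < alpha, best_contrast_p < alpha


# Hypothetical voxels: (description, z for contrast 1, z for contrast 2)
voxels = [
    ("effect concentrated in one contrast", 2.8, 0.0),
    ("moderate effect in both contrasts", 2.3, 2.3),
]

for description, z1, z2 in voxels:
    omnibus_sig, contrast_sig = classify(z1, z2)
    print(f"{description}: omnibus p = {omnibus_p(z1, z2):.4f} "
          f"(significant: {omnibus_sig}), "
          f"best contrast p = {min(contrast_p(z1), contrast_p(z2)):.4f} "
          f"(significant: {contrast_sig})")
```

Under these assumptions, the first voxel passes the contrast threshold but not the omnibus one, because the omnibus test spreads its sensitivity over two degrees of freedom; the second voxel shows the reverse pattern. The first case is also one way a contrast map can look more extensive than the omnibus map at the same p-value.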