Questions on Statistical Significance of Results in 3dttest++ (GLA)

Hello!

Group Level Analysis is fairly new to me, and although I have read some AFNI documentation regarding its use, there are still some concepts that I don’t understand well enough. Below my questions I’ll describe the paradigm that I’m using, to provide context.

1: What are the differences, if any, of p & q values in 1st level analysis versus GLA?
2: How strict must one be when choosing p/q values to interpret results in GLA?
3: Do p/q values have different meanings when transitioning from 1st level analysis to GLA?

I’m currently working on a project with a paradigm consisting of three conditions, and am performing EEG-informed fMRI analysis on a small sample of subjects, for now. I’m using 9 regressors in my GLM. Three regressors describe the onset timing of an event, one per condition. The other 6 regressors are produced by AM2 for the timing of an EEG feature some time after onset, coupled with the value of that EEG feature (mean & modulated amplitude, per condition). Subject-wise, I can visually see some decent trends in activation areas across subjects when viewing the AM2 regressor t-scores (however, for the majority of subjects, visualization works best when thresholding by p-values rather than q-values). That is what I see when looking back on the proc output for each subject. However, when I take the Coefficients from GLM into 3dttest++ to find common areas of amplitude modulation, the lowest q values available range from 0.8 to 0.9, so once again I’m forced to visualize results with only p < 0.05.

As a side note, performing GLA on the regressors that do not involve the EEG feature (just the onset timing), low q values are achieved and the results definitely make sense, which should validate the process.

Thank you to all who take the time to read this.

What are the differences, if any, of p & q values in 1st level analysis versus GLA?

The voxel-wise p-value is tied to the false positive rate (type I error) under the conventional paradigm of Null Hypothesis Significance Testing: it is the probability of seeing a statistic at least as extreme as the observed one if there were no effect at that voxel. The voxel-wise q-value refers to the False Discovery Rate (FDR), the expected proportion of false positives among the voxels that survive the threshold. FDR is one of the few approaches to handling the multiple testing issue (see details here: https://en.wikipedia.org/wiki/False_discovery_rate). From an interpretation perspective, their meaning is the same at the individual and group levels.
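To make the p versus q relationship concrete, here is a minimal sketch of the standard Benjamini-Hochberg procedure in Python with NumPy. The function name `bh_qvalues` and the five p-values are made up for illustration, and this is not meant to reproduce exactly how AFNI computes its q-values; it only shows how each q depends on the whole set of p-values being tested.

```python
import numpy as np

def bh_qvalues(p):
    """Benjamini-Hochberg q-values for a 1-D array of p-values (illustrative helper)."""
    p = np.asarray(p, dtype=float)
    m = p.size
    order = np.argsort(p)                          # indices that sort p ascending
    ranked = p[order] * m / np.arange(1, m + 1)    # m * p_(k) / k for rank k
    # Enforce monotonicity: q_(k) is the minimum of ranked values at rank k or higher
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.clip(ranked, 0.0, 1.0)           # put q-values back in input order
    return q

# Hypothetical p-values standing in for a handful of voxels
p_values = np.array([0.001, 0.008, 0.039, 0.041, 0.60])
q_values = bh_qvalues(p_values)
print(q_values)  # approximately [0.005, 0.02, 0.05125, 0.05125, 0.6]
```

Note that the same p-value can map to a different q-value in a different dataset, because q depends on how many tests there are and how the other p-values are distributed.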

How strict must one be when choosing p/q values to interpret results in GLA?

The FDR q-value is meant to handle the multiple testing issue at the whole-brain level, so you can directly use a reasonable threshold (e.g., a q-value of 0.05) to report the results if you’re happy with them. With the voxel-wise p-values, on the other hand, you would still have to correct for multiple testing, and there are a few approaches to dealing with that. See the AFNI class material for more details.

Do p/q values have different meanings when transitioning from 1st level analysis to GLA?

No, their meaning does not change regardless of the context.

when I take the Coefficients from GLM into 3dttest++ to find common areas of amplitude modulation, the
lowest q values available range from 0.8 to 0.9, so once again I’m forced to visualize results with only p < 0.05.

It could be that you don’t have enough subjects, or that you don’t have many voxels with strong statistical evidence. In addition, you don’t have to use the FDR q-values; they are only one way of handling the multiple testing issue.
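As a rough illustration of why the lowest q-value can sit near 0.8 when the evidence is weak, the sketch below computes the smallest Benjamini-Hochberg q-value attainable from a set of p-values. Everything here is an assumption for illustration: the helper name `smallest_bh_q`, the voxel count of 10,000, and the toy p-value sets are invented, and AFNI's own FDR computation may differ in its details.

```python
import numpy as np

def smallest_bh_q(p):
    """Smallest Benjamini-Hochberg q-value attainable from a set of p-values."""
    p = np.sort(np.asarray(p, dtype=float))
    m = p.size
    return float(min(1.0, np.min(p * m / np.arange(1, m + 1))))

m = 10_000                                # hypothetical number of voxels in the mask
null_like = np.arange(1, m + 1) / m       # evenly spread p-values, as with no effect

weak = null_like.copy()
weak[:10] = 8e-4                          # only 10 voxels with modest evidence

strong = null_like.copy()
strong[:100] = 1e-6                       # 100 voxels with strong evidence

q_null = smallest_bh_q(null_like)         # about 1.0
q_weak = smallest_bh_q(weak)              # about 0.8
q_strong = smallest_bh_q(strong)          # about 1e-4
print(q_null, q_weak, q_strong)
```

In this toy setup, a few voxels at p = 0.0008 would look fine under an uncorrected p < 0.05 threshold, yet no voxel gets a q below roughly 0.8 because so few voxels stand out from the null-like background.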

performing GLA on the regressors that do not involve the EEG feature (just the onset timing), low q values
are achieved and the results definitely make sense, which should validate the process.

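The interpretation is different for different events in the experiment, so strong results for the onset-timing regressors do not by themselves tell you what to expect for the EEG-modulated regressors.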