# Questions on Statistical Significance of Results in 3dttest++ (GLA)

Group Level Analysis is fairly new to me, and although I have read some AFNI documentation regarding its use, there are still some concepts that I don’t understand well enough. Below my questions I’ll describe the paradigm that I’m using, to provide context.

1: What are the differences, if any, of p & q values in 1st level analysis versus GLA?
2: How strict must one be when choosing p/q values to interpret results in GLA?
3: Do p/q values have different meanings when transitioning from 1st level analysis to GLA?

I’m currently working on a project with a paradigm consisting of three conditions, and am performing EEG-informed fMRI analysis on a small sample of subjects, for now. I’m using 9 regressors in my GLM. 3 regressors describe the onset timing of an event, per condition. The other 6 regressors are produced by AM2 for the time-value of an EEG feature some time after onset, coupled with the value of that EEG feature (mean & modulated amplitude, per condition). Subject-wise, I can visually see some decent trends in activation areas across subjects when viewing the AM2 regressor t-scores (however, visualization is best when using p-values for the majority of subjects). Looking back on the output for each proc per subject, however, when I take the Coefficients from GLM into 3dttest++ to find common areas of amplitude modulation, the lowest q values available range from 0.8 to 0.9, so once again I’m forced to visualize results with only p < 0.05.

As a side note, performing GLA on the regressors that do not involve the EEG feature (just the onset timing), low q values are achieved and the results definitely make sense, which should validate the process.

> What are the differences, if any, of p & q values in 1st level analysis versus GLA?

The voxel-wise p-value is the probability, under the null hypothesis, of observing a statistic at least as extreme as the one obtained; thresholding at a given p-value controls the false positive rate (type I error) at that voxel under the conventional paradigm of Null Hypothesis Significance Testing. The voxel-wise q-value is the smallest False Discovery Rate (FDR) at which that voxel would still be declared significant; FDR is one of a few approaches to handling the multiple testing issue (see details here: https://en.wikipedia.org/wiki/False_discovery_rate). From an interpretation perspective, their meaning is the same at the individual and group levels.
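To make the relationship between p and q concrete, here is a minimal sketch of the standard Benjamini-Hochberg adjustment in Python with NumPy. The function name `bh_qvalues` and the five example p-values are made up for illustration, and AFNI's own FDR computation may differ in its details, so treat this as a picture of the idea rather than a reproduction of what AFNI does.

```python
import numpy as np

def bh_qvalues(p):
    """Benjamini-Hochberg q-values for a 1-D array of p-values (illustrative helper)."""
    p = np.asarray(p, dtype=float)
    m = p.size
    order = np.argsort(p)                          # indices that sort p ascending
    ranked = p[order] * m / np.arange(1, m + 1)    # p_(k) * m / k for rank k
    # Enforce monotonicity: q_(k) = min over j >= k of p_(j) * m / j
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.minimum(ranked, 1.0)             # put back in original order, cap at 1
    return q

# Hypothetical p-values standing in for five voxels
p_vals = np.array([0.001, 0.008, 0.039, 0.041, 0.60])
q_vals = bh_qvalues(p_vals)
print(q_vals)
```

Note that each q-value depends on the whole set of p-values being tested, not only on the voxel's own p-value, which is why the same p can map to very different q values in different datasets.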

> How strict must one be when choosing p/q values to interpret results in GLA?

The FDR q-value is designed to handle the multiple testing issue at the whole-brain level, so you can directly use a reasonable threshold (e.g., a q-value of 0.05) to report the results if you’re happy with them. Voxel-wise p-values, on the other hand, are uncorrected, so you would have to correct for multiple testing yourself, and there are a few approaches to doing that. See the AFNI class material for more details.

> Do p/q values have different meanings when transitioning from 1st level analysis to GLA?

No, their meaning does not change regardless of the context.

> when I take the Coefficients from GLM into 3dttest++ to find common areas of amplitude modulation, the lowest q values available range from 0.8 to 0.9, so once again I’m forced to visualize results with only p < 0.05.

It could be that you don’t have enough subjects, or that few voxels carry strong statistical evidence. In addition, you don’t have to use the FDR q-values; other multiple-testing corrections, such as cluster-based thresholding, are also available.
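As a rough illustration of why the smallest q-value can sit near 1 even when many voxels pass p < 0.05, the sketch below compares two made-up sets of p-values using a standard Benjamini-Hochberg adjustment. The voxel count, the evenly spaced "idealized null" p-values, the 2500 strong voxels, and the helper name `bh_qvalues` are all assumptions chosen for illustration; none of this is taken from your data or from AFNI's implementation.

```python
import numpy as np

def bh_qvalues(p):
    """Benjamini-Hochberg q-values for a 1-D array of p-values (illustrative helper)."""
    p = np.asarray(p, dtype=float)
    m = p.size
    order = np.argsort(p)
    ranked = p[order] * m / np.arange(1, m + 1)
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.minimum(ranked, 1.0)
    return q

n_vox = 50_000  # hypothetical number of voxels in the mask

# Case A: idealized null, p-values evenly spread over (0, 1) with no true effect
p_null = np.arange(1, n_vox + 1) / (n_vox + 1)

# Case B: same, but the 2500 largest p-values are replaced by very small ones,
# standing in for voxels with strong statistical evidence
p_mixed = p_null.copy()
p_mixed[-2500:] = 1e-6

q_null = bh_qvalues(p_null)
q_mixed = bh_qvalues(p_mixed)

for label, p, q in [("null only", p_null, q_null),
                    ("with strong voxels", p_mixed, q_mixed)]:
    print(f"{label}: voxels with p < 0.05 = {(p < 0.05).sum()}, "
          f"smallest q = {q.min():.5f}, voxels with q < 0.05 = {(q < 0.05).sum()}")
```

In the null-only case, about 5% of voxels pass p < 0.05 by construction, yet no voxel has a q-value anywhere near 0.05. That is the pattern you would expect when the evidence is weak relative to the number of tests.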

> performing GLA on the regressors that do not involve the EEG feature (just the onset timing), low q values are achieved and the results definitely make sense, which should validate the process.

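The interpretation is different for different events in the experiment. The onset-timing regressors and the EEG-modulated regressors capture different effects, so strong group results for the former do not by themselves imply that the latter should be equally strong.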