Computation of R^2 and F-stats in 3dDeconvolve considering vs not considering a baseline model

Hi all,
I have a quick question about how 3dDeconvolve computes the T statistic and the F statistic of a regressor and the R^2 of the model.
In particular, is there any difference between having a set of regressors specified as a baseline model (i.e. with -ortvec or -stim_base) versus specifying all the regressors as stimuli (i.e. with -stim_file but without -stim_base)?

As an example, I have a stimulus of interest in a file “A.1D” and then six motion regressors (in a file “M.par”).
I could set up 3dDeconvolve specifying a baseline model as:

3dDeconvolve -input data.nii.gz -num_stimts 1 \
-ortvec M.par motion_params \
-stim_file 1 A.1D -stim_label 1 stim_A

Or without a baseline model, as:

# AFNI column selectors are 0-based, so the six columns of M.par are [0]..[5];
# the quotes keep the shell from interpreting the square brackets.
3dDeconvolve -input data.nii.gz -num_stimts 7 \
-stim_file 1 A.1D -stim_label 1 stim_A \
-stim_file 2 'M.par[0]' -stim_label 2 motion_params_1 \
-stim_file 3 'M.par[1]' -stim_label 3 motion_params_2 \
-stim_file 4 'M.par[2]' -stim_label 4 motion_params_3 \
-stim_file 5 'M.par[3]' -stim_label 5 motion_params_4 \
-stim_file 6 'M.par[4]' -stim_label 6 motion_params_5 \
-stim_file 7 'M.par[5]' -stim_label 7 motion_params_6

Would there be a difference in R^2 between these two calls, as well as in the F and T statistics of the first regressor (“stim_A”)?
I suspect that there would not be a difference in R^2, but there would be a difference in F and T - but I would like to double check.

Of note, I know that the recommended way is to specify motion parameters in the baseline model. However, I’m asking about this difference because, when the motion parameters and the signal of interest are highly collinear, it would not be unreasonable to evaluate a model fit that treats all the regressors in the same way, since it is hard to say which part of the variance should be assigned to which regressor.

Thank you,
Stefano Moia

Hola Stefano,

The statistics computed in 3dDeconvolve are between a “signal” model and the “non signal” model (S and N in what I write below).
What regressors are considered to be in S and what are considered to be in N depends on the statistic being computed.
The statistic bricks measure how much adding S to N improved the least squares fit.
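
For reference, this kind of comparison is usually written as a nested-model F test. What follows is the standard textbook form, under the assumption that it matches what is meant here; it is not taken from the 3dDeconvolve source. The symbols are notation introduced for this note: SSE is a residual sum of squares, p_S and p_N are the numbers of regressors in S and N, and n is the number of time points.

```latex
F = \frac{(\mathrm{SSE}_{N} - \mathrm{SSE}_{N+S}) / p_S}
         {\mathrm{SSE}_{N+S} / (n - p_N - p_S)},
\qquad
R^2 = \frac{\mathrm{SSE}_{N} - \mathrm{SSE}_{N+S}}{\mathrm{SSE}_{N}}
```

When S is a single regressor (p_S = 1), this F equals the square of that regressor's t statistic.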

For individual regressors NOT marked as “baseline” (via -stim_base or -ortvec), their individual t (and F and R^2) statistics are computed with S = that regressor and N = all other regressors. So these bricks are what is sometimes called a “marginal” statistic, showing how much this one regressor improved the model when it was added in after all the other regressors.

So your two runs should have the same statistical result for stim_A.
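
The marginal-statistic argument can be checked on synthetic data. The sketch below is plain NumPy, not AFNI: random columns stand in for A.1D and M.par, the baseline is reduced to a constant term, and all names (`t_stat`, `sse`, `X1`, `X2`) are made up for illustration. It fits the same eight columns in the two orderings and compares the t statistic for stim_A, then checks it against the F obtained by adding stim_A after everything else.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200                                  # number of time points (arbitrary)
stim_a = rng.standard_normal(n)          # stand-in for A.1D
motion = rng.standard_normal((n, 6))     # stand-in for the six columns of M.par
const = np.ones((n, 1))                  # constant term only, for simplicity
y = 2.0 + 0.5 * stim_a + motion @ rng.standard_normal(6) + rng.standard_normal(n)


def sse(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid


def t_stat(X, y, col):
    """Ordinary least-squares t statistic for column `col` of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    dof = X.shape[0] - X.shape[1]
    sigma2 = sse(X, y) / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[col] / np.sqrt(cov[col, col])


# Ordering 1: motion listed with the baseline, stim_A last.
X1 = np.column_stack([const, motion, stim_a])
# Ordering 2: stim_A first, motion listed afterwards as ordinary "stimuli".
X2 = np.column_stack([const, stim_a, motion])

t1 = t_stat(X1, y, col=7)
t2 = t_stat(X2, y, col=1)

# Marginal F for stim_A: S = stim_A, N = every other column.
sse_full = sse(X1, y)
sse_without_a = sse(np.column_stack([const, motion]), y)
f_marginal = (sse_without_a - sse_full) / (sse_full / (n - X1.shape[1]))

print(t1, t2, f_marginal)
```

Both orderings span the same column space, so the t value for stim_A should agree to numerical precision, and its square should match the marginal F.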

On the other hand, the Full F statistic is a collective statistic, where S = all regressors not in the baseline model, and N = all regressors in the baseline model. That is, the Full F measures the improvement of model fit (in the least squares sense) when all non-baseline regressors are added to the baseline fit.

So your two runs should have different results for the Full F brick, since in the first run, the only S regressor is stim_A while in the second run the motion regressors are also in S (and will each get their own t brick output – assuming you use the -tout option).
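
The same kind of synthetic check illustrates why the Full F changes. Again this is a NumPy sketch under stated assumptions rather than 3dDeconvolve itself: the data are random, the baseline polynomial is reduced to a constant, and `nested_f` is a hypothetical helper implementing the usual nested-model F. Only the split between S and N differs between the two calls.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200                                  # number of time points (arbitrary)
stim_a = rng.standard_normal(n)          # stand-in for A.1D
motion = rng.standard_normal((n, 6))     # stand-in for the six columns of M.par
const = np.ones((n, 1))                  # constant term only, for simplicity
y = 1.0 + 0.5 * stim_a + motion @ np.full(6, 0.3) + rng.standard_normal(n)


def sse(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid


def nested_f(N, S, y):
    """F statistic for adding the columns of S to a model that already has N."""
    full = np.column_stack([N, S])
    sse_n, sse_full = sse(N, y), sse(full, y)
    p_s = S.shape[1]
    dof = len(y) - full.shape[1]
    return ((sse_n - sse_full) / p_s) / (sse_full / dof)


# Run 1: motion is part of the baseline, so S = {stim_A} and N = {const, motion}.
full_f_run1 = nested_f(np.column_stack([const, motion]), stim_a[:, None], y)

# Run 2: motion entered as stimuli, so S = {stim_A, motion} and N = {const}.
full_f_run2 = nested_f(const, np.column_stack([stim_a, motion]), y)

print(full_f_run1, full_f_run2)
```

In the first split the Full F is by construction the marginal F for stim_A, because stim_A is the only regressor in S. In the second split it tests seven regressors at once, so the value and its degrees of freedom differ.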

You should OF COURSE run the program both ways to be sure that what I’m saying is true. Empirical knowledge wins over trans-Atlantic philosophy.

** bob cox

Hello Bob,
thank you for your answer! I’m going to empirically test the difference between the two cases for sure.
In the meantime, if I run 3dDeconvolve with all my regressors as “signal” and add -polort, i.e.:

# AFNI column selectors are 0-based, so the six columns of M.par are [0]..[5];
# the quotes keep the shell from interpreting the square brackets.
3dDeconvolve -input data.nii.gz -num_stimts 7 -polort 5 \
-stim_file 1 A.1D -stim_label 1 CO2 \
-stim_file 2 'M.par[0]' -stim_label 2 Motion1 \
-stim_file 3 'M.par[1]' -stim_label 3 Motion2 \
-stim_file 4 'M.par[2]' -stim_label 4 Motion3 \
-stim_file 5 'M.par[3]' -stim_label 5 Motion4 \
-stim_file 6 'M.par[4]' -stim_label 6 Motion5 \
-stim_file 7 'M.par[5]' -stim_label 7 Motion6

I get an output similar to the one in the attached picture.
I have two follow-up questions to keep the transatlantic enquiry going:

  1. The first brick in the output is the “Full R^2” of the model. Does that R^2 consider the whole model (signal+baseline), or only the part of the model attributed to signal?
  2. In case the Full R^2 considers only the signal part of the model, would there be a way to include the polynomials in the signal part of the model as well?