censoring data and design matrix column is all zeros

AFNI version info (Version AFNI_25.0.07 'Severus Alexander'):

Hi AFNI experts,

I am running 3dDeconvolve/3dREMLfit for my first-level GLM. I have done some censoring using the -censor flag of 3dDeconvolve, such that 1 TR before and 2 TRs after each flagged TR are also set to 0. I am also using -stim_times_IM for future RSA analyses.
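For intuition, that expansion step can be sketched in Python (a toy illustration, not AFNI's own code; expand_censor is a hypothetical helper, using AFNI's censor convention of 1 = keep, 0 = censored):

```python
import numpy as np

def expand_censor(censor, n_before=1, n_after=2):
    """Given a 0/1 censor vector (0 = censored TR), also censor
    n_before TRs before and n_after TRs after each censored TR."""
    censor = np.asarray(censor)
    out = censor.copy()
    for t in np.where(censor == 0)[0]:
        lo = max(0, t - n_before)
        hi = min(len(censor), t + n_after + 1)
        out[lo:hi] = 0
    return out

# TR index 4 flagged: TRs 3-6 end up censored
print(expand_censor([1, 1, 1, 1, 0, 1, 1, 1]))  # [1 1 1 0 0 0 0 1]
```

With per-event regressors from -stim_times_IM, it only takes a few flagged TRs near one event to zero out that event's entire regressor.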

Because of the aggressive censoring, I think that some stimulus regressors in my design matrix have become all zeros. This is what I get in 3dDeconvolve.err:

*+ WARNING: -------------------------------------------------
*+ WARNING: Problems with the X matrix columns, listed below:
*+ WARNING: !! * Columns 123 [Vis#37] and 191 [Con#37] are (nearly?) collinear!
*+ WARNING: !! * Column 193 [Con#39] is all zeros
*+ WARNING: -------------------------------------------------
*+ WARNING: !! in Signal+Baseline matrix:
 * Largest singular value=2.48161
 * 2 singular values are less than cutoff=2.48161e-07
 * Implies strong collinearity in the matrix columns! 
*+ WARNING: !! in Signal-only matrix:
 * Largest singular value=2.06517
 * 2 singular values are less than cutoff=2.06517e-07
 * Implies strong collinearity in the matrix columns! 

and this in 3dREMLfit.err:

** ERROR: matrix column #193 is all zero!?
** FATAL ERROR: Cannot continue with all zero column without -GOFORIT option!

I am not sure how to proceed. I'm looking forward to your help.

Best,
Deanne

Hi Deanne,

Indeed, when using IM, censoring will very commonly wipe out some regressors. Other regressors might be left with only 1 non-zero time point, which is almost the same as being fully censored, except that you get a perfectly fitting but very noisy beta instead. Such is life with IM. :)

In such a case, you will have to tell 3dDeconvolve (and 3dREMLfit, since that is being run as well) that it should proceed with the regression anyway (all-zero regressors will get betas of zero). The legal documentation for you accepting this heavy responsibility is signed by passing a -GOFORIT option.

Note that 3dDeconvolve and 3dREMLfit take slightly different options: 3dDeconvolve requires a level after -GOFORIT, while 3dREMLfit takes none. I do not see the required level in your text output, but for example, you might use -GOFORIT 10 with 3dDeconvolve but just -GOFORIT with 3dREMLfit. If you are using afni_proc.py, pass these options with something like:

-regress_opts_3dD  -GOFORIT 10    \
-regress_opts_reml -GOFORIT       \

Does that seem reasonable?

-rick

Hi Rick,

Thank you! That was informative.

Following up on this, in the 3dDeconvolve part of the script, I used -allzero_OK in one iteration and -GOFORIT 10 in another, and I didn't see a difference between the two. I assume they do similar things if the only problem is a zero column?

In the output stats file, I am missing some coefficients. I've pasted here just the sub-bricks for the Congruent condition coefficients:

  -- At sub-brick #143 'Con#0_Coef' datum type is float:     -5010.61 to       5685.58
  -- At sub-brick #144 'Con#1_Coef' datum type is float:     -5030.14 to       3045.52
  -- At sub-brick #145 'Con#2_Coef' datum type is float:     -10873.2 to       10566.2
  -- At sub-brick #146 'Con#3_Coef' datum type is float:      -106062 to       90707.4
  -- At sub-brick #147 'Con#4_Coef' datum type is float:      -439268 to        744690
  -- At sub-brick #148 'Con#5_Coef' datum type is float:     -6609.38 to       8371.92
  -- At sub-brick #149 'Con#6_Coef' datum type is float:     -10185.2 to       9097.12
  -- At sub-brick #150 'Con#7_Coef' datum type is float:     -8796.89 to       6687.54
  -- At sub-brick #151 'Con#8_Coef' datum type is float:     -3651.27 to       7387.02
  -- At sub-brick #152 'Con#9_Coef' datum type is float:     -4747.54 to       5230.79
  -- At sub-brick #153 'Con#10_Coef' datum type is float:     -4208.49 to       3498.99
  -- At sub-brick #154 'Con#11_Coef' datum type is float:     -2239.54 to       1865.19
  -- At sub-brick #155 'Con#12_Coef' datum type is float:     -1873.51 to       2428.59
  -- At sub-brick #156 'Con#13_Coef' datum type is float:     -1982.87 to       2177.41
  -- At sub-brick #157 'Con#14_Coef' datum type is float:     -1320.66 to       1377.42
  -- At sub-brick #158 'Con#15_Coef' datum type is float:     -2277.87 to       2410.72
  -- At sub-brick #159 'Con#16_Coef' datum type is float:     -1790.05 to       2825.61
  -- At sub-brick #160 'Con#17_Coef' datum type is float:     -2960.34 to       3289.12
  -- At sub-brick #161 'Con#18_Coef' datum type is float:      -2971.8 to       2464.71
  -- At sub-brick #162 'Con#19_Coef' datum type is float:     -2898.14 to       2387.56
  -- At sub-brick #163 'Con#20_Coef' datum type is float:     -2155.49 to       3777.93
  -- At sub-brick #164 'Con#21_Coef' datum type is float:     -3864.15 to       4228.17
  -- At sub-brick #165 'Con#22_Coef' datum type is float:     -4040.75 to       3531.13
  -- At sub-brick #166 'Con#23_Coef' datum type is float:     -4887.91 to       10267.8
  -- At sub-brick #167 'Con#24_Coef' datum type is float:     -7882.59 to       6568.39
  -- At sub-brick #168 'Con#25_Coef' datum type is float:     -2128.17 to       2510.28
  -- At sub-brick #169 'Con#26_Coef' datum type is float:     -3007.78 to       6493.65
  -- At sub-brick #170 'Con#27_Coef' datum type is float:     -3310.79 to        2276.1
  -- At sub-brick #171 'Con#28_Coef' datum type is float:     -1754.58 to       2066.93
  -- At sub-brick #172 'Con#29_Coef' datum type is float:     -3248.16 to       2913.43
  -- At sub-brick #173 'Con#30_Coef' datum type is float:     -2396.31 to       1781.88
  -- At sub-brick #174 'Con#31_Coef' datum type is float:     -2654.68 to       2962.24
  -- At sub-brick #175 'Con#32_Coef' datum type is float:     -2627.59 to       4011.48
  -- At sub-brick #176 'Con#33_Coef' datum type is float:     -3151.11 to          4238
  -- At sub-brick #177 'Con#34_Coef' datum type is float:     -5433.04 to       11390.2
  -- At sub-brick #178 'Con#35_Coef' datum type is float:     -3294.67 to       5906.48
  -- At sub-brick #179 'Con#36_Coef' datum type is float:      -3645.5 to       5346.52
  -- At sub-brick #180 'Con#37_Coef' datum type is float:      -697402 to        589456
  -- At sub-brick #181 'Con#38_Coef' datum type is float:     -17925.4 to       10478.3
  -- At sub-brick #182 'Con#39_Coef' datum type is float:     -6208.61 to       9459.94
  -- At sub-brick #183 'Con#40_Coef' datum type is float:      -2338.2 to       3511.16
  -- At sub-brick #184 'Con#41_Coef' datum type is float:     -3476.95 to       3350.59
  -- At sub-brick #185 'Con#42_Coef' datum type is float:     -2702.22 to        3074.3
  -- At sub-brick #186 'Con#43_Coef' datum type is float:     -28215.5 to       41779.6
  -- At sub-brick #187 'Con#44_Coef' datum type is float:     -4590.46 to       3167.64
  -- At sub-brick #188 'Con#45_Coef' datum type is float:      -2674.9 to       3191.92
  -- At sub-brick #189 'Con#46_Coef' datum type is float:     -2345.82 to       5059.29
  -- At sub-brick #190 'Con#47_Coef' datum type is float:     -3029.82 to       2801.03
  -- At sub-brick #191 'Con#48_Coef' datum type is float:      -3117.9 to       2315.93
  -- At sub-brick #192 'Con#49_Coef' datum type is float:     -2848.98 to       2665.81
  -- At sub-brick #193 'Con#50_Coef' datum type is float:     -3224.09 to       3259.65
  -- At sub-brick #194 'Con#51_Coef' datum type is float:     -3362.05 to        2658.3
  -- At sub-brick #195 'Con#52_Coef' datum type is float:     -4269.63 to       2650.78
  -- At sub-brick #196 'Con#53_Coef' datum type is float:     -7021.99 to       5965.92
  -- At sub-brick #197 'Con#54_Coef' datum type is float:     -3837.99 to       4706.61
  -- At sub-brick #198 'Con#55_Coef' datum type is float:     -3050.49 to        2701.6
  -- At sub-brick #199 'Con#56_Coef' datum type is float:     -2145.94 to       2684.22
  -- At sub-brick #200 'Con#57_Coef' datum type is float:     -3314.52 to       5293.58
  -- At sub-brick #201 'Con#58_Coef' datum type is float:     -1862.26 to       2846.83
  -- At sub-brick #202 'Con#59_Coef' datum type is float:     -1648.14 to       4042.79
  -- At sub-brick #203 'Con#60_Coef' datum type is float:     -2859.86 to       3143.26
  -- At sub-brick #204 'Con#61_Coef' datum type is float:     -2916.65 to       4627.22
  -- At sub-brick #205 'Con#62_Coef' datum type is float:     -3081.18 to       3883.84
  -- At sub-brick #206 'Con#63_Coef' datum type is float:       -38586 to       46971.5
  -- At sub-brick #207 'Con#64_Coef' datum type is float:     -26195.2 to       20345.7
  -- At sub-brick #208 'Con#65_Coef' datum type is float:      -5713.1 to       6340.82
  -- At sub-brick #209 'Con#66_Coef' datum type is float:  -2.7981e+06 to   2.52047e+06

There should in fact be 68 stimuli, and I am missing Con#67, when I thought I should be missing Con#39 (based on the previous warnings and errors). Is this because Con#39 was dropped and Con#40 shifted into its place?

Also, I noticed some large ranges in the coefficients; see Con#66. Does this mean that those coefficients are unstable and that I need to remove this stimulus?

Thank you so much for your help.

Best,
Deanne

Both -allzero_OK and -GOFORIT XX are intended just to let the program proceed, not to alter anything about the output (except to let it exist :). But I have found that -allzero_OK might be neither sufficient nor required, so I tend not to use it (it was intended to be sufficient in special cases like this one; however, the condition-number checks would still prevent further execution). It certainly won't hurt, though.

Would you be willing to send your X.xmat.1D and X.nocensor.xmat.1D files to me? If so, please send them to afni.bootcamp at gmail. I would like to take a closer look.

The large values are also a (potential) side effect of censoring. The ideal response put into a regressor tends to start and end very close to zero, with a peak around 1, depending on the basis function. If the response gets mostly censored, the regressor might be left with just one tiny value, which inversely scales into the beta for that regressor. If the only value left in the regressor is 0.0001, that will scale the resulting beta by 10,000, for example. And if it is the only non-zero entry in the regressor, it will get an exact fit to the data, which is probably around 100. The result is an output beta of around one million.
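That scaling can be reproduced numerically with a least-squares toy example (a sketch of the arithmetic, not AFNI code; the specific values 0.0001 and 100 are just the illustration above):

```python
import numpy as np

# A mostly-censored IM regressor: one tiny surviving value of 0.0001,
# and the (uncensored) data at that time point is ~100.
x = np.zeros(50)
x[20] = 0.0001          # the lone surviving regressor value
y = np.zeros(50)
y[20] = 100.0           # the data value at that time point

# For a single regressor, beta = (x.y)/(x.x) = 100/0.0001
beta, *_ = np.linalg.lstsq(x[:, None], y, rcond=None)
print(beta[0])          # ~1e6: an exact but wildly scaled fit
```

So an enormous beta range like that of Con#66 is exactly what a nearly-censored regressor would produce.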

I should add a test for this in 1d_tool.py. It pops up often enough, and deserves a warning (though at what limit?).

Anyway, please feel free to send the xmat files.

Thanks,

-rick

Hope you don't mind me chiming in for a sec, but i'm wondering here about the data acquisition and processing contexts, namely:

what's your TR?
what's your censoring level?

i've found in multiple datasets with very fast TR (approx 1 sec or faster; e.g. HCP, ABCD, etc.) that non-aliased respiratory B0 fluctuations cause pseudo-motion effects that result in a ton of censoring at typical or default censoring levels. this can be exacerbated by characteristics of one's population, too (e.g. BMI). let us know; it's possible the censoring criterion simply needs to be relaxed, because i've heard it said that motion regression tends to deal well with such pseudo motion (although i'd have to dig for a good reference for that).

-Sam

Hi Rick and Sam,

Thank you both for your responses!

My TR is 1 second. I was using -stim_times_IM with aggressive censoring on data from quite wiggly children. My censoring threshold was FD > 0.5 mm, also censoring the previous TR and the 2 TRs after each flagged TR. This decision was based on Power et al. (2014). However, I think I needed to be more liberal. The same group had another task-based fMRI paper where they found success censoring just FD > 0.9 mm (without censoring the previous or later volumes; Siegel et al., 2014). I am currently trying this solution and removing runs where >20% of the TRs have been censored. One third of the way in, it looks like the GLMs are being created without zero columns or strong collinearity. Additionally, the coefficients appear more reasonable.
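For reference, the run-exclusion rule I'm applying can be sketched like this (hypothetical helpers, assuming AFNI's censor-file convention of 1 = keep, 0 = censored):

```python
import numpy as np

def run_censor_fraction(censor):
    """Fraction of TRs censored in a run, given a 0/1 censor vector."""
    censor = np.asarray(censor, dtype=float)
    return 1.0 - censor.mean()

def keep_run(censor, max_frac=0.20):
    """Keep the run only if at most max_frac of its TRs are censored."""
    return run_censor_fraction(censor) <= max_frac

# e.g. a 10-TR run with 3 censored TRs (30%) is dropped:
print(keep_run([1, 1, 0, 1, 0, 1, 1, 0, 1, 1]))  # False
```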

I appreciate both of your help!

Best,
Deanne

sounds like we're narrowing in on a solution! if you need another reference re: 0.9 mm censoring, you could also consider J Etzel's 2023 "Efficient evaluation of the Open QC task fMRI dataset".

as for the pseudo motion issue, my 3 fave references are:

Power et al 2019 "Distinctions among real and apparent respiratory motions in human fMRI data"

Fair et al 2020 "Correction of respiratory artifacts in MRI head motion estimates"

Gratton et al 2020 "Removal of high frequency contamination from motion estimates in single-band fMRI saves data without biasing functional connectivity"

-Sam
