Bandpassing in Example 11?

Dear AFNI Gurus,

I hope you are enjoying this special time of the year even though it is a very different year.

I have a question regarding the pre-processing of resting-state data, especially with respect to bandpassing.

This resting-state-note states that the data should be bandpassed if physio recordings are not available. Unfortunately, we did not collect any physio recordings, hence suggesting that we should bandpass. However, the associated data loss is a concern for me - our TR is 2, meaning that we would lose 60% of DoF.

Paul’s answer in the topic quick question about bandpassing stated that to avoid bandpassing and the loss of DoF, highpass filtering can be achieved by regressing out the polynomials. Does that mean “-regress_polort” should be included in the afni_proc.py command?

Additionally, I was wondering whether there is also a possible substitute for the lowpass filtering? After consulting the resting-state note, it seems that processing the data using example 11 might be suitable for e.g. resting-state functional connectivity analysis. However, this pre-processing example includes neither bandpassing nor the regress_polort option. Could you please explain to me why this is still suitable for the resting-state analysis and how high and low frequency noise are removed from the final dataset? Paul’s answer to this post on Conducting bandpassing in one step with versus in two steps states that bandpassing should be done in the regression model, but I am not sure whether it can only be achieved using “-regress_bandpass” or whether a combination of using “-regress_polort” and other options could have a similar effect in removing high and low-frequency noise from the data without having to bandpass it (and hence lose 60% of DoF).

Related to this, we also collected data from the same participants watching short video clips. We used the same scanning parameters, but the main analysis method we would like to use there is Intersubject correlation. Do the same rules regarding bandpassing apply there given that EPIs acquired under naturalistic stimulation are usually pre-processed like resting-state data?

Any further clarification about this would be highly appreciated!

Many thanks in advance for your help and best regards,
Stef

Hi, Stef-

2020… is almost… over… Can’t. Wait.

(back to data)

Well, I think the language is that, without physio recordings, bandpassing RS-FMRI data is default processing in the field. “The Field” often doesn’t seem concerned with degrees of freedom (DFs) lost by bandpassing the data to be between ~0.01-0.1 Hz. In most cases, this does lead to purging 60% of DFs (when TR = 2 s), with that fraction increasing as the TR becomes shorter.

Note: in my (cited answer) about bandpassing, the comment about highpass filtering vs polynomial regress is separate from the RS-FMRI bandpassing question. For any FMRI processing, one typically wants to account for scanner drift in the data, which is veeerrry looooong time scale fluctuations in the signal—or, equivalently, very looow frequency components. You can address this by including either fairly low order polynomials as regressors (the default AFNI way) or by including very low frequency components as regressors (which is the same as high pass filtering, which I believe is what SPM does, purging frequencies <0.001 Hz or something; note, I don’t know whether SPM does this filtering with regressors or as a separate Fourier Transform step). These two methods are not the same, but they are each trying to address the same thing, and in practice they probably end up producing pretty similar results, as far as I know (at least for reasonably short and non-weird data). So, to reiterate: this is not to do anything with the RS-FMRI processing step of bandpassing to ~0.01-0.1 Hz.

Note that by default, a ~low order polynomial is included by default when using afni_proc.py. Looking at the help of “-regress_polort …”, it is worth noting the formula, based on the length of the run:


-regress_polort DEGREE  : specify the polynomial degree of baseline

                e.g. -regress_polort 2
                default: 1 + floor(run_length / 150.0)

            3dDeconvolve models the baseline for each run separately, using
            Legendre polynomials (by default).  This option specifies the
            degree of polynomial.  Note that this will create DEGREE * NRUNS
            regressors.

            The default is computed from the length of a run, in seconds, as
            shown above.  For example, if each run were 320 seconds, then the
            default polort would be 3 (cubic).

            Please see '3dDeconvolve -help' for more information.

You can also see this by looking at the script created by afni_proc.py (probably called proc.*), searching for the 3dDeconvolve command and the “-polort …” opt therein). Again, this is still different than bandpassing for RS-FMRI data.

All specified bandpassing and regression in afni_proc.py’s “regress” block is done as regression. Often, one things of bandpassing as being done separately with the Fourier Transform, but it need not be done so. In fact, as I commented there and as is probably commented in the afni_proc.py help, it would not be correct to separate those processing steps: everything should be taken out of the original data in one single step, which is possible (in brief, regression out the sine+cosine of a given frequency is the same as bandpassing it out with a Fourier Transform).

This also allows us to count the DF loss equivalently:

  • every frequency purged is (essentially) 2 DFs removed,
  • every non-frequency regressor in the model (polynomial, motion, etc.) is 1 DF removed,
  • every censored time point is 1 DF removed (motion-based censoring will lead to 2 time points being removed; outlier-based censoring will lead to 1 time point being removed).
    Each of these contributes to loss of DFs.

I don’t think I would bandpass naturalistic data by default. The processing is done like resting state, and your main output will be the residuals of the model, but here you have activity during that time of varying time scale. Probably there is information at frequency scales >0.1 Hz (which are time scales <10 s) that are of interest to you.

–pt

Hey Paul,

Thank you very much for this detailed answer, this is very informative and helpful.

I was not aware that “-regress_polort” is included by default - I am learning something about my friend afni_proc.py all the time B-) One quick follow up question here: if I don’t specify the DEGREE manually, it seems as if it is computed based on the length of the first run rather than the average run length of all runs, is that correct? Is there a way to determine the DEGREE separately for each run? Our variation in the run length within and across subjects is not large [mean = 759.57s, sd = 24.45s, min = 700s, max = 826.01s], but enough to let the calculated polynomial vary between 5 and 6. When specifying “-regress_polort”, I used the mean run length across all subjects to specify the DEGREE, but if there was a way to be more precise than that, that would of course be even better :slight_smile:

The fact that “-regress_polort” is included by default helps me to understand how low frequency noise is dealt with in Example 11 without specifying “-regress_polort” explicitly and without bandpassing. What I am unfortunately still don’t quite understand yet despite your thorough answer is whether Example 11 also has a clever way to address the problem of high-frequency noise without the necessity of applying bandpassing? I’m sorry if this is a very obvious thing, I am just trying to understand why Example 11 (which is recommended for complete pre-processing) does not apply bandpassing although it is the somehow questionable standard in “The Field”.

Again, many thanks for your help and I hope you will get into 2021 safe and sound (or how it’s said in Germany: Guten Rutsch)
Stef

Hi, Stef-

Yes, I would recommend browsing the script created by afni_proc.py (AP) for extra details and to see what is happening-- it is actually automatically commented, in addition to being organized by block, so it is muuuuch more readable than the scripts that, say, I write.

Yes, the default polort degree is calculated based on the duration (that is, “TR” times “number of time points,” in units of seconds) of the first time series listed. Note if you have several time series, it is the first one in the list input to AP that will determine the degree. There is a comment about having different run lengths:
https://afni.nimh.nih.gov/pub/dist/doc/htmldoc/programs/afni_proc.py_sphx.html#runs-of-different-lengths-note
… namely that the differing degree shouldn’t be a huge problem-- esp. in your case, I think those would all end up with quite similar formulae. If your time series were very different in length, that could be an issue. That might be something to get Rick’s opinion on (I will try to get his opinion on the current run lengths, as well).

Re. high frequency noise:
The goal of most resting state/naturalistic processing is to remove all noise and keep all signal. However, we don’t know what those are exactly (otherwise, life would be too easy). This is true with the removal of frequencies > 0.1 Hz that standard RS-FMRI processing does: the idea is that those frequencies contain a lot of breathing and heartrate information in the signal (based on typically breathing and heartrates), which is basically what we would consider noise because we want to focus on neuronal based signals. However, there is also useful information at those frequencies-- this is what Gohel & Biswal (2015) show, as well as the others in the afni_proc.py help section on bandpassing. So, you the processor have to weigh the risk/reward of just bandpassing everything out there. (If we had shorter TR data, like the 0.1 s TRs that some specialized sequences get, the breathing and heartrate can be more targetedly removed, but most people don’t have that.) The Field doesn’t really consider the degree of freedom (DF) loss, either, which is a problem with purging that much data-- there should be statistical/inferential consequences on the output when it has been so heavily processed.

Danke schoen, and Guten Rutsch to you, as well.

–pt

Hi Stef,

To some degree, if run lengths varied enough to lead to an unreasonable difference between the baseline models, then one might argue that those runs do not belong on the same model in the first place. Also, these runs approach 14 minutes in duration, which makes it generally harder to accurately model the baseline.

Here, there is only a difference of 1 degree in the polorts. It would be reasonable to simply tell afni_proc.py to use the larger polort for everyone (via -regress_polort). But still, 826s is a long run.

Regarding bandpassing being a standard in the field, be aware that the afni_proc.py NOTE that you linked to, while starting with a comment about it being fairly standard (and that was many years ago), is basically a long description about why bandpassing is not a great method. It was not intended to recommend the method, quite the opposite.
Personally, I believe that the main reason it is still mostly standard is because it is applied incorrectly (with a separate model that is not accounted for), and because people are not fully aware of the cost.

For long runs such as yours, it might also be reasonable to model the drifts using sinusoids (i.e., with a high-pass filter). In AFNI, we suggest doing that with -polort 2 (since the sinusoids do not capture that), plus the low-pass filter. A basic example might be something like:

   -regress_polort 2 \
   -regress_bandpass 0.01 1 
  • rick

Hey Paul, hey Rick,

Happy New Year!

Thanks for your response and sorry for taking some time to reply, I needed to think about what you both said.

I think it might be helpful to give you a bit more background on the data: we acquired a 10 min (TR = 2, 300 volumes) resting-state scan [pre-learning rest] before subjects participated in an incidental learning paradigm where we displayed short movie clips and asked participants to rate them. There were three task blocks with slightly varying scan durations due to pseudo-randomization of trials (see previous post, average duration 12min 39 seconds, all 3 runs combined 1140 volumes with TR = 2). After the last task block, we acquired a second 10 min resting-state scan [post-learning rest].

Re. high-frequency noise:
In the resting-state analysis, I am interested in the change in resting-state functional connectivity (RSFC) between two ROIs from pre-learning to post-learning resting-state. I hence calculate the correlation of the signal in both ROIs separately for each run and then calculate the difference between them. This change in RSFC is later also correlated with behavioural measurements like the number of movie clips encoded. Because I am looking at the difference in functional connectivity, I am tempted to say that either
[ul]
[li] (A) high-frequency noise affects pre- and post-learning resting state in the same manner (participants still have to breath) and the effect of high-frequency noise on functional connectivity correlation estimates cancel each other out when looking at the difference
[/li][/ul]
or
[ul]
[li] (B) high-frequency-noise is different between pre- and post-learning rest because participants breath more or less after the incidental encoding task, but in that case, this could also be a function of the task and not necessarily noise per se.
[/li][/ul]
In either case (and especially weighing in the downsides of band-passing), I think it might be better to not bandpass after all. But maybe it would be better to examine this empirically and see whether the change in functional connectivity differs depending on whether or not the data was bandpassed - I have a feeling that this is something a reviewer will say to me at some point…

Re. modelling the drift:
Thanks for sharing that note again, Paul, I thought I had read something about this topic before, but just could not remember where exactly. I have used the larger polort (i.e. 6) for everyone in my pipeline, but I was not aware that the run length makes it difficult to model the baseline accurately. Additionally, I forgot to mention that there is one subject whose second task block was interrupted and who has hence two 3dT+ epi volumes capturing that block, totalling to four 3dT+ task volumes with 363, 169, 219, and 389 volumes respectively. I also haven’t mentioned that we concatenate the fully pre-processed time series (errts*fanaticor+tlrc.) at the end using 3dTcat before we compute the intersubject correlation maps using 3dTcorrelate. This is because the movie clips were displayed in pseudo-randomised order and we need to “re-order” the volumes so that say the first 30 volumes in each subject always cover the same input movie clip regardless of when that movie clip was displayed across the three runs and so on. Sorry that I did not provide this information in my previous message. So taking all these things into consideration, do you still think it would be better to model the drifts using sinusoids (-regress_bandpass 0.01 1) + regress_polort 2 instead of -regress_polort 6? If so, just so clarify: when applying a low-pass filter as suggested, I would not lose 60% of DoF because the Nyquist frequency at TR = 2 is 0.25 Hz and I define 1 as the low pass boundary? And would it also be better to apply sinusoids (-regress_bandpass 0.01 1) + regress_polort 2 to the slightly shorter (10 min, 300 volumes, TR = 2) resting-state runs than using regress_polort 5?

Also, thank you for clarifying again how the resting state note is meant, Rick. I actually watched your talk on the MITCBMM YouTube channel earlier and fully understand now that the note is more meant as a rant about bandpassing than a recommendation to perform bandpassing.

I really appreciate all the input and help from the AFNI gurus, thanks again!

Best wishes,
Stef