Multiple Comparisons Correction Fatcat DTI MVM Analysis

What is the best way to correct for multiple comparisons for the ROI-based post-hoc tests from a script produced with

I am utilizing 3dMVM to compare the numerical DTI parameters (specifically FA and MD) obtained from probabilistic tractography grid files. We have 19 ROIs, 75 ROI pairs, a significant overall group effect (control vs. patient), and 8 ROI-pair based post-hoc tests for group that are significant.

Currently the correction provided by fat_mvm is similar to the conventional ANOVA approach: the post hoc tests are controlled through the omnibus F-tests. I’m currently working on a new approach, which is most likely stronger than the conventional method, but it may take a while to get it fully implemented.

Hi Gang,
May I ask whether there is any update on this topic?
If the post hoc tests are controlled through the omnibus F-tests, does that mean I do not need to adjust the p-values? I have 96 ROIs and 407 pairs of tracts. The 3dMVM results are as follows. The output reports 35 degrees of freedom. I am confused about which number I should use for post hoc p-value adjustment: 96, 407, or 35? Thank you for the help!


2 # Number of effects
# Chisq DF Pr(>Chisq)
100 1 0 # (Intercept)
10.01019 1 0.001556765 # type

RESULTS: Post hoc tests - fNT

465 # Number of tests
# value t-stat DF 2-sided-P
0.00E+00 0 0 0.00E+00 # 201__202–age
0.00E+00 0 0 0.00E+00 # 201__202–dur
1.90E-04 0.408577575 35 6.85E-01 # 201__202–type(+R-T)
2.80E-03 7.656723836 35 5.54E-09 # 201__202–type^^R
2.61E-03 9.149709242 35 8.22E-11 # 201__202–type^^T
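
For anyone who wants to work with these post hoc rows outside the text file, here is a minimal Python sketch that reads them into records. It assumes only the layout visible in the excerpt above (four numeric columns, then `#`, then a label). The function name `parse_posthoc_rows` is hypothetical and not part of AFNI. Treating rows with a DF of 0 as untested placeholders is also an assumption drawn from this excerpt.

```python
# Rows copied from the excerpt above (assumed layout: 4 numbers, '#', label).
EXCERPT = """\
465 # Number of tests
# value t-stat DF 2-sided-P
0.00E+00 0 0 0.00E+00 # 201__202–age
0.00E+00 0 0 0.00E+00 # 201__202–dur
1.90E-04 0.408577575 35 6.85E-01 # 201__202–type(+R-T)
2.80E-03 7.656723836 35 5.54E-09 # 201__202–type^^R
2.61E-03 9.149709242 35 8.22E-11 # 201__202–type^^T
"""


def parse_posthoc_rows(text):
    """Return one dict per data row; header and count lines are skipped."""
    rows = []
    for line in text.splitlines():
        data, sep, label = line.partition("#")
        fields = data.split()
        # A data row has exactly four numeric fields before the '#'.
        if not sep or len(fields) != 4:
            continue
        try:
            value, t_stat, df, p = (float(x) for x in fields)
        except ValueError:
            continue
        rows.append({"label": label.strip(), "value": value,
                     "t": t_stat, "df": int(df), "p": p})
    return rows


rows = parse_posthoc_rows(EXCERPT)
# Assumption: DF == 0 marks a row where no test was actually computed.
tested = [r for r in rows if r["df"] > 0]
for r in tested:
    print(f'{r["label"]}: t={r["t"]:.3f}, df={r["df"]}, p={r["p"]:.3g}')
```

Note that the DF column is the degrees of freedom of each t-statistic, not a count of comparisons.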

I am confused which number should i used in post hoc p-value adjustment.

As discussed previously in this thread, we recommend that you start with the omnibus F-test for each effect of interest, and then look at the associated post hoc tests to find out the specifics regarding that effect. The omnibus F-test offers some degree of false positive control, although it is a little weak.

We’re developing a Bayesian approach as an alternative to model white-matter connectivity data. If you’re interested, contact me offline.

Just adding to Gang’s comments briefly:

Generally, the mindset is to investigate at the network level first (e.g., “does this set of quantities in this network differ between group A and group B?” or, “is this set of quantities in this network associated with measure X for this group?”), taking all the quantities in the network (e.g., the set of all mean FA values) together; the omnibus F-test operates at this level. This is the primary test, the main model of interest.

IF the network level test shows significance in the desired quantity/relation, then it would be of interest to ask: well, now that we see this significant relation between/within the network, which ROI(s) within it is/are driving this most strongly? Is it one or two, or a more diffuse/spread out property? This question is addressed in the post hoc tests, where the same model that was tested at the network level is tested ROI-by-ROI. While there are indeed multiple tests here (N of them if there are N ROIs in the network), we aren’t so concerned with absolute significance here. The main, network level test above has already determined significance; here, we are interested in whether one or more ROIs individually shows high significance. This ROI-level testing is just performed as a follow-up to provide more description about what might be driving the main effect of interest at the network level.
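
The two-step logic described above can be sketched in a few lines of Python. This is only an illustration of the "omnibus test first, post hoc tests as description" mindset, not what fat_mvm or 3dMVM does internally. The function name `gated_posthoc` and the threshold `alpha=0.05` are assumptions for the example. The p-values are the ones shown earlier in this thread for the `type` effect.

```python
def gated_posthoc(omnibus_p, posthoc, alpha=0.05):
    """Describe post hoc tests only if the network-level test passes.

    omnibus_p : p-value of the network-level (omnibus) test for one effect
    posthoc   : list of (label, p) pairs for that effect's follow-up tests
    alpha     : significance threshold for the omnibus test (assumed value)
    """
    if omnibus_p >= alpha:
        # No network-level effect, so there is nothing to follow up on.
        return []
    # No per-test adjustment is applied here: the follow-up tests are
    # ranked to show which ones most strongly drive the effect.
    return sorted(posthoc, key=lambda item: item[1])


# Values taken from the output posted earlier in the thread.
omnibus_p_type = 0.001556765
posthoc_type = [
    ("201__202 type(+R-T)", 6.85e-01),
    ("201__202 type^^R", 5.54e-09),
    ("201__202 type^^T", 8.22e-11),
]

ranked = gated_posthoc(omnibus_p_type, posthoc_type)
for label, p in ranked:
    print(f"{label}: p={p:.3g}")
```

With a non-significant omnibus p-value the function returns an empty list, which mirrors the idea that the post hoc tests are only looked at once the main effect has been established.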

On another note: I notice in your modeling example that the matrix parameter was ‘fNT’, the fractional number of tracts per connection. While several matrices are output by 3dTrackID, some are more for ‘diagnostic’ or ‘informative’ purposes, and not really expected to be used for modeling with/among groups. The numbers of tracts fall into this category. I can’t see any real physical relation between numbers of tractographic tracts and a white matter property; furthermore, trying to normalize this quantity across brain sizes, for example, would be reeeeeally tough. Do we normalize based on volume (because the brain vols differ), surface area of ROIs (because those differ among subjects), or linear distance (because there is a bias against finding connections among targets that are farther apart, as error accumulates in tract progression)? So, NT and fNT probably wouldn’t make good things to compare across groups. Things derived from DTI measures would; perhaps some of the volume-related measures could, as well, depending on how you want to account for differing brain volume.