# inflated 3dclustsim values with no mask?

Hello!

I’m trying to make sure I understand how the different parts of 3dClustSim work.

I noticed that if I don’t specify a grey matter mask (i.e., no -mask flag) in my 3dClustSim analysis, my k values are much lower. Intuitively, I would have expected the opposite. My goal is to include the whole brain in my calculation.

Could you help tease apart why my values vary so dramatically between using a grey matter mask (more stringent p/k value ratios) and no mask? I’m assuming the k values provided with the grey matter mask are correct because they are more stringent, but I just want to make sure I’m understanding why.

Thanks again!

Could you post a pair of example tables from 3dClustSim? Sounds weird to me at first neuronal firing.

Also, what are “k” values? The cluster size threshold?

–pt

Here is what I was running originally

# Set path to the RX directory to save output

rx_path=/nas/longleaf/home/jess1/whole-brain-dev-social/analysis/thresholding

# Set path to the mask used in analyses
# (mask is a 70%-threshold grey matter mask)

# Run 3dClustSim using the average acf outputs from the script "calculate_average_ACF.Rmd"
# ACF parameters were calculated from individual residual files, then averaged together. Note this is across 3 waves.

and sample values.

# bi-sided thresholding
# Grid: 91x109x91 2.00x2.00x2.00 mm^3 (172610 voxels in mask)
#
# CLUSTER SIZE THRESHOLD(pthr,alpha) in Voxels
# -NN 3  | alpha = Prob(Cluster >= given size)
#  pthr  | .10000 .05000 .02000 .01000
# ------ | ------ ------ ------ ------
0.050000  1291.0 1565.2 1889.0 2153.0
0.020000   501.2  594.0  751.8  896.7
0.010000   281.3  343.6  433.0  491.7
0.005000   171.1  210.4  265.7  311.5
0.002000    97.3  120.6  153.2  186.7
0.001000    65.4   82.0  105.9  131.3
0.000500    45.2   57.3   74.5   92.8
0.000200    27.8   35.9   49.2   59.6
0.000100    19.1   25.5   35.8   43.0

Here is what I was running to see the effect of the mask.

# Run 3dClustSim using the average acf outputs from the script "calculate_average_ACF.Rmd"
3dClustSim -acf 0.555380169013242 4.61362680991569 12.3309040184921 > ${rx_path}/threshold_SID_nomask.txt

and sample p/k values.

# 3dClustSim -acf 0.555380169013242 4.61362680991569 12.3309040184921
# bi-sided thresholding
# Grid: 64x64x32 3.50x3.50x3.50 mm^3 (131072 voxels)
#
# CLUSTER SIZE THRESHOLD(pthr,alpha) in Voxels
# -NN 3  | alpha = Prob(Cluster >= given size)
#  pthr  | .10000 .05000 .02000 .01000
# ------ | ------ ------ ------ ------
0.050000   463.0  537.0  624.0  708.0
0.020000   159.8  189.1  223.2  248.6
0.010000    86.6  101.6  123.6  141.3
0.005000    51.9   61.6   75.2   85.7
0.002000    29.4   35.1   43.3   50.6
0.001000    20.3   24.1   30.2   34.8
0.000500    14.5   17.3   21.6   25.4
0.000200     9.6   11.6   14.3   16.7
0.000100     7.1    8.6   10.9   12.6

Again, I’m assuming the second one is wrong, but trying to understand why it was so liberal. The first results also feel overly strict, but I trust those results more than the second version without a mask.

Thank you!!

Hi-

When you provide a mask, you provide both:

1. a grid, with voxel dimensions, and
2. a certain number of voxels (and shape of region)
within which to calculate the noise-like simulations. When a mask is not provided, I don’t see how the program would even know what matrix size or voxel dimensions it should use in the simulations.

In your case of using a mask, the output table notes that the grid (matrix size) and voxel size are:

# Grid: 91x109x91 2.00x2.00x2.00 mm^3 (172610 voxels in mask)

… as well as how many of the total grid voxels (91×109×91 = 902,629) fall within the mask: 172,610. That makes sense.
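As a quick sanity check on those header numbers (a small sketch in Python, just arithmetic on the values reported in the 3dClustSim table header):

```python
# Total number of voxels in the full 91x109x91 grid
nx, ny, nz = 91, 109, 91
total_voxels = nx * ny * nz
print(total_voxels)

# Voxels inside the grey matter mask, from the 3dClustSim header
mask_voxels = 172610
print(mask_voxels / total_voxels)  # fraction of the grid inside the mask
```

Only about 19% of the full grid is inside the mask, so the simulated noise clusters have far less volume available to form in than they would over the whole grid.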

However, in the case of not using a mask, note that the (apparently internal) dset being used for testing has a different grid (matrix size) and voxel size:

# Grid: 64x64x32 3.50x3.50x3.50 mm^3 (131072 voxels)

Soooo, trying to compare output tables is an apples-to-oranges comparison. The total number of voxels in each case is different (though, just by chance here, not by much). But the major difference is that the voxel sizes are totally different between runs: the 2x2x2 mm voxels have a volume of 8 mm^3, while the 3.5 mm iso voxels have a volume of ~43 mm^3. So, the same ACF params (from which the effective smoothness is calculated) cover the noise fields in each volume preettty differently.
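To put rough numbers on that, here is a sketch in Python using the mixed-model ACF that AFNI’s -acf option is based on, ACF(r) = a·exp(-r²/(2b²)) + (1-a)·exp(-r/c), with the a, b, c values from the commands above. Treat this as an illustration of the idea, not AFNI’s exact internal computation:

```python
import math

# Mixed-model ACF, with (a, b, c) taken from the 3dClustSim command above
a, b, c = 0.555380169013242, 4.61362680991569, 12.3309040184921

def acf(r):
    """ACF(r) = a*exp(-r^2/(2*b^2)) + (1-a)*exp(-r/c); equals 1 at r=0."""
    return a * math.exp(-r**2 / (2 * b**2)) + (1 - a) * math.exp(-r / c)

# Effective FWHM: twice the radius at which the ACF falls to 0.5.
# Simple bisection; acf() decreases monotonically in r.
lo, hi = 0.0, 100.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if acf(mid) > 0.5:
        lo = mid
    else:
        hi = mid
fwhm_mm = lo + hi  # = 2 * half-max radius
print(f"effective FWHM ~ {fwhm_mm:.1f} mm")

# The same smoothness, expressed in voxel widths on each grid:
print(f"~{fwhm_mm / 2.0:.1f} voxels wide on the 2 mm grid")
print(f"~{fwhm_mm / 3.5:.1f} voxels wide on the 3.5 mm grid")

# And the per-voxel volumes:
print(2.0**3, "mm^3 vs", 3.5**3, "mm^3 per voxel")
```

The same ~12 mm of smoothness spans about 6 voxels on the 2 mm grid but only about 3.5 voxels on the 3.5 mm default grid, which is a big part of why cluster-size thresholds quoted in voxel units land so far apart between the two tables.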

You cannot compare the effects of running 3dClustSim with and without a mask in this way. If you wanted to, you could make a “mask” that covers your whole volume (i.e., something that is all 1s across the whole grid specified by the input volume; check it in the GUI to verify):

… and run 3dClustSim with that, and compare results to your sub-FOV mask.
(Note that in the present case with such small 2mm iso voxels as you have, and a pretty large FOV, this will be pretty computationally intensive.)

I am actually surprised that 3dClustSim runs at all without any mask given, because then it has to guess at a grid. I will have to look more deeply into the code to see how it picks the grid and voxel size in such a case (the terminal text suggests it depends on the ACF params), but I would not run 3dClustSim without providing a “-mask …” to specify this kind of information, regardless of whether that mask really is a brain mask or a whole-FOV “mask”.

–pt

Thank you so much! This is incredibly helpful! I too was surprised it ran without a mask.

Hello again!

I have a follow-up question regarding the mask. I apologize if this has an obvious answer, but I want to confirm so I can accurately pass this information along to labmates with similar questions.

If our data were collected with a larger voxel size (e.g., 3x3x3 mm) and resampled to 2x2x2 mm, would I want my mask to match the resampled voxel size or the original acquisition voxel size?

Howdy-

The following datasets involved in this should all have the same grid/resolution:

• the 4D dataset used to estimate the ACF parameters with 3dFWHMx
• the mask dataset used for delimiting the brain in 3dFWHMx, and which will be provided to 3dClustSim
• the statistics dataset that will have the clusterizing applied.

In theory, some of those could be on separate grids, because of the stability of resampling within AFNI:
https://arxiv.org/abs/1709.07471
… but in practice, there is typically no reason to have any of those on a separate grid, and it would be very easy to mis-apply parameters or estimated quantities. If you are using afni_proc.py, all of the output dsets involved should be automatically on the same output grid anyways (I think…).

–pt