comparing distributions of statistical values between groups: estimating spatial degrees of freedom

paul.hamilton · July 25, 2016, 3:22pm

Hi all,

Setting up the problem:

We’re using a voxel-wise measure of neural inflammation (NI) to compare depressed and healthy samples. In neural inflammation work we currently lack strong a priori regional predictions; what we have instead are strong general predictions, namely that certain relations should hold fairly diffusely throughout grey matter, and more so in depressed participants than in controls. For example, we expect depressed relative to control samples to show higher positive correlations, diffusely across grey matter, between our voxel-wise NI measure and certain cytokines (CYT) assessed from plasma. We do not, however, expect these differential relations to be so diffuse that we could simply define a single grey matter ROI and assess data from that ROI.

Indeed, it looks like we have an effect: running 3dhistog on the r values (NI-by-CYT) within grey matter shows that the distribution of voxel values is strongly shifted to the right for the depressed versus the control group. The question, then, is how to determine whether this apparent difference in brain-wide r distributions is statistically reliable. I was thinking that the best way to do this would be to use bootstrapping to compute confidence intervals for some descriptive statistic (like Cohen’s d) that summarizes the difference between the depressed and control distributions of r. It also strikes me, though, that I could calculate a single two-sample t score based on the mean and SD of the distributions of r for the depressed and control groups. This would be easy except for determining the degrees of freedom the test requires.

And, finally, to my question:

Is it possible to estimate the spatial df in a (masked) volume based on (I’m assuming) the spatial smoothness of the residuals? I’m happy to fall back on bootstrapping, but let me know if you have a solution.

All best,

Paul

Gang · July 25, 2016, 9:08pm

"I could calculate a single, two-sample t score based on the mean and SD of the distributions of r for the depressed and control groups."

If the total number of voxels within the gray matter is N, each group contributes N r values, so the degrees of freedom for the two-sample t-test would be 2*(N-1). With N that large, you can approximately treat it as a Z-test.
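For anyone following along, here is a minimal Python sketch of the z-test computed from the summary statistics alone. The function name and the numbers in the example call are made up for illustration; they are not from the data discussed in this thread. Note that it treats every voxel as an independent observation, so with spatially correlated voxels the p-value it returns is likely too small.

```python
import math

def two_sample_z(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Two-sample z statistic and two-sided p-value from summary statistics.

    Assumes the n_a and n_b observations (here: voxels) are independent.
    """
    # Standard error of the difference between the two means
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    z = (mean_a - mean_b) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|))
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Hypothetical summary values: mean and SD of voxel-wise r in each group,
# with the same grey matter mask (N voxels) used for both groups.
z, p = two_sample_z(0.12, 0.20, 50000, 0.05, 0.20, 50000)
print(z, p)
```

With tens of thousands of voxels per group, even a tiny shift in mean r gives an enormous z, so the result is only as trustworthy as the independence assumption.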

paul.hamilton · July 25, 2016, 9:51pm

Thanks, Gang. I could see the df being as you say if we assume the voxels are spatially independent, but what if there is non-random spatial structure in the data? Wouldn’t I be violating the assumption of independence?

paul.hamilton · July 25, 2016, 9:51pm

Agreed, though, that with the presumably high N of the distributions, a two-sample z test will suffice. Thanks!

Gang · July 26, 2016, 5:53pm

Yeah, a z-test should give you a rough idea of the difference, but there is no good way to account for the spatial correlations among the voxels other than nonparametric approaches such as bootstrapping.
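One possible reading of the bootstrap suggestion, sketched in Python with NumPy under assumptions that are not spelled out in this thread: subjects (not voxels) are resampled with replacement within each group, the voxel-wise NI-by-CYT r map is recomputed for each resample, and Cohen's d between the two r distributions is recorded. Because each subject's whole map is kept intact, the spatial correlation among voxels is carried through every resample. The function names, array layout (subjects by voxels), and the synthetic data are all hypothetical.

```python
import numpy as np

def voxelwise_r(ni, cyt):
    """Pearson r across subjects between each voxel of ni (subjects x voxels) and cyt."""
    ni_c = ni - ni.mean(axis=0)
    cyt_c = cyt - cyt.mean()
    num = ni_c.T @ cyt_c
    den = np.sqrt((ni_c ** 2).sum(axis=0) * (cyt_c ** 2).sum())
    return num / den

def cohens_d(a, b):
    """Cohen's d for two samples of equal size (same voxel mask in both groups)."""
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2.0)
    return (a.mean() - b.mean()) / pooled_sd

def bootstrap_d(ni_dep, cyt_dep, ni_con, cyt_con, n_boot=2000, seed=0):
    """95% percentile interval for d, resampling subjects within each group."""
    rng = np.random.default_rng(seed)
    ds = np.empty(n_boot)
    for i in range(n_boot):
        # Draw subject indices with replacement, separately per group
        i_dep = rng.integers(0, len(cyt_dep), len(cyt_dep))
        i_con = rng.integers(0, len(cyt_con), len(cyt_con))
        r_dep = voxelwise_r(ni_dep[i_dep], cyt_dep[i_dep])
        r_con = voxelwise_r(ni_con[i_con], cyt_con[i_con])
        ds[i] = cohens_d(r_dep, r_con)
    return np.percentile(ds, [2.5, 97.5])

# Synthetic stand-in data (made up): 20 subjects per group, 300 "voxels".
rng = np.random.default_rng(42)
n_sub, n_vox = 20, 300
cyt_dep = rng.normal(size=n_sub)
cyt_con = rng.normal(size=n_sub)
ni_dep = 0.8 * cyt_dep[:, None] + rng.normal(size=(n_sub, n_vox))  # NI tracks CYT
ni_con = rng.normal(size=(n_sub, n_vox))                           # no relation
ci_low, ci_high = bootstrap_d(ni_dep, cyt_dep, ni_con, cyt_con, n_boot=500)
print(ci_low, ci_high)
```

If the interval excludes zero, that would support a reliable shift between the two r distributions. A percentile interval is only the simplest choice here; other bootstrap intervals or a permutation of group labels could be swapped in.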