Please help, can't get files to load into 3dMVM

Indigo · September 27, 2023, 8:17am

Every time I run the following script, I get the following error.

***** End of data structure information *****
++++++++++++++++++++++++++++++++++++++++++++++++++++

Reading input files now...

** Error:
Problem with input files! Two possibilities: 1) There is a specification error
with either file path or file name. Use shell command 'ls' on the last column in the
data table to find out the problem. 2) At least one of the input files has different dimensions:
either (1) numbers of voxels along X, Y, Z axes are different across files;
or (2) some input files have more than one value per voxel.
Run "3dinfo -header_line -prefix -same_grid -n4 *.HEAD" in the directory
where the files are stored, and pinpoint out which file(s) is the trouble maker.
Replace *.HEAD with *.nii or something similar for other file formats.

I am running the script in the same location as my data. The files look like this...

stats.a087+tlrc.BRIK

Here is my script below...

3dMVM -prefix Exclusion_High_risk \
    -bsVars "grp*age+site" \
    -qVars "age" \
    -qVarsCenters '18.4' \
    -num_glf 2  \
    -glfLabel 1 RISKvHC_exclude  -glfCode 1 'grp : 1*RISK -1*HC' \
    -glfLabel 2 RISKvHC_over -glfCode 2 'grp : 1*RISK -1*HC' \
    -dataTable \
    Subj	grp	age	site	condition	InputFile                                                                            \
    a001	HC	21.1	UT	exclude		stats.a001+tlrc'[13]' \
    a002	HC	18.4	UC	exclude 	stats.a002+tlrc'[13]' \
    a004	HC	14.6	UC	exclude		stats.a004+tlrc'[13]' \
    a017	HC	21.5	UT	exclude 	stats.a017+tlrc'[13]' \
    a024	RISK	14.2	UC	exclude stats.a024+tlrc'[13]' \
    a025	HC	18.9	UT	exclude 	stats.a025+tlrc'[13]' \
    a030	RISK	16.4	UC	exclude	stats.a030+tlrc'[13]' \
    a032	RISK	14.9	UC	exclude stats.a032+tlrc'[13]' \
    a034	RISK	14.2	UC	exclude	stats.a034+tlrc'[13]' \
    a041	RISK	20.0	UT	exclude stats.a041+tlrc'[13]' \
    a042	HC	19.0	UC	exclude		stats.a042+tlrc'[13]' \
    a045	HC	18.5	UT	exclude 	stats.a045+tlrc'[13]' \
    a052	RISK	15.8	UC	exclude	stats.a052+tlrc'[13]' \
    a058	RISK	17.7	UC	exclude stats.a058+tlrc'[13]' \
    a060	RISK	16.0	UC	exclude	stats.a060+tlrc'[13]' \
    a073	RISK	19.7	UT	exclude stats.a073+tlrc'[13]' \
    a087	RISK	19.2	UT	exclude	stats.a087+tlrc'[13]' \
    a101	HC	20.5	UT	exclude 	stats.a101+tlrc'[13]' \
    a105	HC	20.9	UT	exclude		stats.a105+tlrc'[13]' \
    a116	HC	20.5	UC	exclude 	stats.a116+tlrc'[13]' \
    a119	HC	19.8	UT	exclude		stats.a119+tlrc'[13]' \
    a129	HC	20.9	UT	exclude 	stats.a129+tlrc'[13]' \
    a130	HC	19.8	UC	exclude		stats.a130+tlrc'[13]' \
    a131	HC	20.7	UT	exclude 	stats.a131+tlrc'[13]' \
    a134	HC	18.4	UC	exclude		stats.a134+tlrc'[13]' \
    a135	RISK	17.3	UT	exclude stats.a135+tlrc'[13]' \
    a141	RISK	17.7	UT	exclude	stats.a141+tlrc'[13]' \
    a145	HC	21.3	UT	exclude 	stats.a145+tlrc'[13]' \
    a146	HC	15.8	UC	exclude		stats.a146+tlrc'[13]' \
    a147	RISK	18.9	UT	exclude stats.a147+tlrc'[13]' \
    a153	RISK	19.0	UT	exclude	stats.a153+tlrc'[13]' \
    b007	HC	21.9	UT	exclude 	stats.b007+tlrc'[13]' \
    b009	HC	18.8	UT	exclude		stats.b009+tlrc'[13]' \
    b010	HC	18.3	UC	exclude 	stats.b010+tlrc'[13]' \
    b014	RISK	14.4	UC	exclude	stats.b014+tlrc'[13]' \
    b022	RISK	15.4	UC	exclude stats.b022+tlrc'[13]' \
    b029	RISK	18.8	UT	exclude	stats.b029+tlrc'[13]' \
    b048	RISK	18.2	UC	exclude stats.b048+tlrc'[13]' \
    b054	RISK	17.8	UC	exclude	stats.b054+tlrc'[13]' \
    b065	RISK	17.2	UT	exclude stats.b065+tlrc'[13]' \
    b071	RISK	20.3	UT	exclude	stats.b071+tlrc'[13]' \
    b074	RISK	19.5	UC	exclude stats.b074+tlrc'[13]' \
    b075	RISK	19.9	UT	exclude	stats.b075+tlrc'[13]' \
    b083	RISK	14.3	UT	exclude stats.b083+tlrc'[13]' \
    b095	RISK	18.6	UT	exclude	stats.b095+tlrc'[13]' \
    b100	HC	20.2	UC	exclude 	stats.b100+tlrc'[13]' \
    b103	RISK	19.7	UT	exclude	stats.b103+tlrc'[13]' \
    b117	HC	18.6	UT	exclude 	stats.b117+tlrc'[13]' \
    b123	RISK	20.5	UT	exclude	stats.b123+tlrc'[13]' \
    b132	HC	15.2	UC	exclude 	stats.b132+tlrc'[13]' \
    b140	HC	20.3	UC	exclude		stats.b140+tlrc'[13]' \
    b143	RISK	14.1	UT	exclude stats.b143+tlrc'[13]' \
    b144	RISK	19.6	UC	exclude	stats.b144+tlrc'[13]' \
    b148	RISK	14.7	UC	exclude stats.b148+tlrc'[13]' \
    b149	HC	14.0	UT	exclude		stats.b149+tlrc'[13]' \
    b150	RISK	17.5	UC	exclude stats.b150+tlrc'[13]' \
    b154	RISK	17.2	UC	exclude	stats.b154+tlrc'[13]' \
    b155	RISK	17.3	UT	exclude stats.b155+tlrc'[13]' \
    b161	HC	19.7	UT	exclude		stats.b161+tlrc'[13]' \
    b163	HC	20.7	UT	exclude 	stats.b163+tlrc'[13]' \
    b165	HC	20.5	UT	exclude		stats.b165+tlrc'[13]' \
    b167	HC	21.1	UT	exclude 	stats.b167+tlrc'[13]' \
    b173	HC	21.7	UT	exclude		stats.b173+tlrc'[13]'

Could someone please help, I can't get past this error.

Thanks,

Jennifer

ptaylor · September 27, 2023, 11:25am

Hi, Jennifer-

Let's check if the datasets do have the same grids. What is the output of:

3dinfo -same_all_grid -prefix stats.*HEAD

in that directory? (I am assuming that glob will get the exact set of files you want; otherwise, you can run this explicit list:

3dinfo -same_all_grid -prefix \
stats.a001+tlrc stats.a002+tlrc stats.a004+tlrc stats.a017+tlrc stats.a024+tlrc \
stats.a025+tlrc stats.a030+tlrc stats.a032+tlrc stats.a034+tlrc stats.a041+tlrc \
stats.a042+tlrc stats.a045+tlrc stats.a052+tlrc stats.a058+tlrc stats.a060+tlrc \
stats.a073+tlrc stats.a087+tlrc stats.a101+tlrc stats.a105+tlrc stats.a116+tlrc \
stats.a119+tlrc stats.a129+tlrc stats.a130+tlrc stats.a131+tlrc stats.a134+tlrc \
stats.a135+tlrc stats.a141+tlrc stats.a145+tlrc stats.a146+tlrc stats.a147+tlrc \
stats.a153+tlrc stats.b007+tlrc stats.b009+tlrc stats.b010+tlrc stats.b014+tlrc \
stats.b022+tlrc stats.b029+tlrc stats.b048+tlrc stats.b054+tlrc stats.b065+tlrc \
stats.b071+tlrc stats.b074+tlrc stats.b075+tlrc stats.b083+tlrc stats.b095+tlrc \
stats.b100+tlrc stats.b103+tlrc stats.b117+tlrc stats.b123+tlrc stats.b132+tlrc \
stats.b140+tlrc stats.b143+tlrc stats.b144+tlrc stats.b148+tlrc stats.b149+tlrc \
stats.b150+tlrc stats.b154+tlrc stats.b155+tlrc stats.b161+tlrc stats.b163+tlrc \
stats.b165+tlrc stats.b167+tlrc stats.b173+tlrc

)

This will output 6 columns of information: the first 5 columns contain either 1s or 0s, and the last column is the name of the file for that row. The same_all_grid option checks:

   -same_all_grid: Equivalent to listing all of -same_dim -same_delta
                   -same_orient, -same_center, and -same_obl on the 
                   command line.

Each dataset's properties are checked against those of the first dset in the list (and the first dset's list of numbers tells about its similarity to the second dataset).

So, basically, you want to see 1s in all of those first 5 columns. Any file that has a zero has a property that is mismatched with the first dataset; the position of the zero tells you which property is mismatched.
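With many datasets it can be easier to let a small filter pick out the mismatched rows. The following is only a sketch: `flag_grid_mismatches` is a hypothetical helper name, it assumes whitespace-separated output with the five flags followed by the prefix, and the sample rows are copied from the output posted later in this thread.

```shell
#!/bin/sh
# Read `3dinfo -same_all_grid -prefix ...` output on stdin and print, for each
# dataset with at least one 0 flag, which grid properties are mismatched.
flag_grid_mismatches() {
    awk '
    BEGIN {
        # Column order as listed in the 3dinfo help for -same_all_grid
        name[1] = "same_dim";    name[2] = "same_delta"
        name[3] = "same_orient"; name[4] = "same_center"
        name[5] = "same_obl"
    }
    NF >= 6 {
        bad = ""
        for (i = 1; i <= 5; i++)
            if ($i == 0) bad = bad (bad == "" ? "" : ",") name[i]
        if (bad != "") print $6 ": " bad
    }'
}

# Sample rows (assumed layout: 5 flags, then the dataset prefix)
sample='0 0 1 0 1 stats.a001
1 1 1 1 1 stats.a017
0 0 1 0 1 stats.a024'

report=$(printf '%s\n' "$sample" | flag_grid_mismatches)
printf '%s\n' "$report"
```

On real data the function would be fed directly from the command above, e.g. `3dinfo -same_all_grid -prefix stats.*HEAD | flag_grid_mismatches`.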

-pt

Indigo · September 27, 2023, 12:53pm

That's the problem, we have 2 sites in which we are collecting neuroimaging data. It looks like the sites are outputting different "grids" - here is an example of the top.

0  0  1  0  1  stats.a001
0  0  1  0  1  stats.a002
0  0  1  0  1  stats.a004
1  1  1  1  1  stats.a017
0  0  1  0  1  stats.a024
1  1  1  1  1  stats.a025
0  0  1  0  1  stats.a030
0  0  1  0  1  stats.a032
0  0  1  0  1  stats.a034

Thank you so much for your help with this.

Is there any way to fix this - quickly? I needed this analysis out yesterday. Sorry to ask and thank you so much for your help with this again. Spent days trying to figure this out.

jen

ptaylor · September 27, 2023, 1:59pm

Hi, Jen-

Sure, this can probably be addressed in a couple different ways, depending on what the differences are. The main aim will be to address this in a way that does not add blurring to the data---so, hopefully just regridding, if possible.

First question: what was the processing that got the data to this point? Was it afni_proc.py, say, or something else?

In the 3dinfo output here, the 5 columns represent, in order: same_dim, same_delta, same_orient, same_center, same_obl. The zeros occur for:

same_dim (matrix size)
same_delta (voxel size)
same_center (geometric center, likely because of different matrix and voxel size).

Second question: what are the matrix and voxel sizes here? That is, what is the output of, say:

3dinfo -ad3 -n4 -dc3 -prefix stats.a001+tlrc stats.a002+tlrc

... which, judging from the output above, should be 2 datasets with different voxel sizes, matrix dimensions and geometric centers? (If those 2 dsets have the same values in each row, please add a couple of others until we see what is different.)

Unfortunately, the fact that the voxel sizes themselves differ creates a problem: we can't simply zeropad datasets to get the same grid. The regridding process (changing voxel size as well as matrix dimensions) will necessarily add blurring. It would be better not to add a layer of blurring a posteriori to the processing, hence knowing what steps came before is key.

--pt

pmolfese · September 27, 2023, 2:52pm

Not to hijack the thread from getting 3dMVM to work, but I'll also mention that including multiple different sites makes interpretation difficult. Hopefully grid similarity is just a slight hiccup and most of your other acquisition parameters (e.g. slice timing, flip angle, echo times) are the same. Beyond that, you want to make sure that you use the -blur_to_fwhm option in afni_proc.py during preprocessing, since the scanners will all have slight differences in baseline smoothness.

Analysis in 3dMVM should include interaction of site and voxel-wise covariates (-vVars) based on the SFNR for each scan's errts file.

The Glover et al. papers from fBIRN are a good reference. Otherwise I fear you're going to face some harsh reviews. This comes up a lot and should motivate me to finally publish our NIH cross-scanner study.

Indigo · September 28, 2023, 2:17pm

Firstly, let me just say thank you for all your help with this, the prompt response is very much appreciated. This is my first time running a data set on AFNI and I will be doing so in the future. I really appreciate the support of "experts" who know this software.

We would presumably have had similar issues when running this in CONN in SPM (we also acquired resting state data, which seems to have the same grid info), but it didn't throw an error??

So, in a mad dash to get some data out, I ran a 3dresample yesterday to get scans to move to the same grid. It sounds like this isn't an ideal long-term solution though.

3dresample -master master+orig -prefix new.dset -input old+orig

It looks like there were slight differences in scans between sites that might have been exacerbated during preprocessing. Running 3dinfo on the original NIfTI-converted scans, I get this...

SITE 1

2.239583 2.239583 2.499998 96 96 60 745 -1.842270 -28.676575 10.911369 sub-a004_ses-01_task-cyb_dir-AP_bold.nii.gz

SITE 2

2.500000 2.500000 2.500000 86 86 60 519 -6.734261 11.232658 2.573189 sub-a153_ses-01_task-cyb_dir-AP_
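One way to read the two rows above is to multiply the in-plane voxel size by the matrix size to get the field of view. This is a sketch that assumes the columns follow the `-ad3 -n4 -dc3 -prefix` order of the 3dinfo command suggested earlier in the thread (voxel sizes first, then matrix dimensions); `fov_ij` is a hypothetical helper name.

```shell
#!/bin/sh
# In-plane field of view (mm) = voxel size (mm) x number of voxels.
# Input on stdin: "di dj dk ni nj nk ..." per line (assumed column order);
# prints the FOV along i and j, rounded to 0.1 mm.
fov_ij() {
    awk '{ printf "%.1f %.1f\n", $1 * $4, $2 * $5 }'
}

# Values copied from the SITE 1 and SITE 2 rows above
site1_fov=$(echo "2.239583 2.239583 2.499998 96 96 60 745" | fov_ij)
site2_fov=$(echo "2.500000 2.500000 2.500000 86 86 60 519" | fov_ij)

echo "site 1: $site1_fov"
echo "site 2: $site2_fov"
```

Under that column assumption both rows come out at about 215 mm in-plane, which would suggest the same field of view acquired with different matrix sizes (96 vs 86). That is an inference from the numbers, not something stated in the scan protocols.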

In response to pmolfese, all other parameters were the same between sites and we are also correcting for site during second level analysis.

We ran preprocessing with fMRIprep and used a distortion scan (fMRI scan in opposite phase direction). Smoothing and mean centering were done on AFNI. Site 2 did not have slice time correction (we will reprocess with slice time but this was for some initial data).

Here is 3dinfo for the fMRIprep-processed, smoothed, mean-centered scans. When I run the 3dinfo command I get the following (differences in i and j).

This is the resampled scan and 1st level output for SITE 2:

2.500000 2.500000 2.500000 78 93 78 22 0.250000 17.500000 17.750000 stats.b154b

This is the original 1st level output for SITE 2:

2.240000 2.240000 2.500000 87 103 78 22 0.180000 18.260002 17.750000 stats.b154

This is the resampled scan and 1st level output for SITE 1:

2.500000 2.500000 2.500000 78 93 78 22 0.250000 17.500000 17.750000 stats.b173

Gang · September 28, 2023, 2:31pm

3dMVM -prefix Exclusion_High_risk \
    -bsVars "grp*age+site" \
    -qVars "age" \
    -qVarsCenters '18.4' \
    -num_glf 2  \
    -glfLabel 1 RISKvHC_exclude  -glfCode 1 'grp : 1*RISK -1*HC' \
    -glfLabel 2 RISKvHC_over -glfCode 2 'grp : 1*RISK -1*HC' \

Just a couple of comments about the model specification.

The two contrasts would be more informatively specified with t-tests than F-tests:

...
    -num_glt 2  \
    -gltLabel 1 RISKvHC_exclude  -gltCode 1 'grp : 1*RISK -1*HC' \
    -gltLabel 2 RISKvHC_over     -gltCode 2 'grp : 1*RISK -1*HC' \

Your current model assumes that the site effect does not interact with the other predictors. If the nature of the site effect is largely unknown, it might be more prudent to allow for potential interactions:

 -bsVars "grp*site*age" \
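Before fitting the fuller model, it may be worth checking that every group-by-site cell actually contains subjects. The following is a sanity-check sketch that is not part of 3dMVM itself: `count_cells` is a hypothetical helper, and it assumes the column order used in this thread's data table (Subj, grp, age, site, condition, InputFile). The sample rows are the first few from the posted script, with the sub-brick quotes dropped.

```shell
#!/bin/sh
# Count subjects per grp x site cell from data-table rows on stdin.
# Skips the header row; a trailing "\" continuation field is ignored.
count_cells() {
    awk 'NF >= 6 && $1 != "Subj" { n[$2 " " $4]++ }
         END { for (k in n) print k, n[k] }' | sort
}

rows='Subj grp age site condition InputFile
a001 HC 21.1 UT exclude stats.a001+tlrc[13]
a002 HC 18.4 UC exclude stats.a002+tlrc[13]
a004 HC 14.6 UC exclude stats.a004+tlrc[13]
a024 RISK 14.2 UC exclude stats.a024+tlrc[13]
a041 RISK 20.0 UT exclude stats.a041+tlrc[13]'

cells=$(printf '%s\n' "$rows" | count_cells)
printf '%s\n' "$cells"
```

Each output line is a group, a site, and the number of subjects in that cell; a missing or very small cell would be a reason to be cautious about the interaction terms.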

Gang

Indigo · September 28, 2023, 2:56pm

Hi Gang,

Thank you for this. This was my first second level design in AFNI, I'll work on these improvements now.

Could I also ask, I struggled with exactly which contrast to use during my second-level analysis.

During the individual level analysis I output...

task > baseline
task
baseline

For the second level, I added the first level generated task > baseline and compared it between groups. Would it be better to do task > baseline at the second level???

pmolfese · September 28, 2023, 3:29pm

Hi @Indigo -

Not ideal to have different parameters (but it could be listed as a limitation). I'd caution against the 3dresample "fix", as you're interpolating one site's data more than the other's. A better (still not perfect) potential mitigation technique could be to use either:
-volreg_warp_dxyz to specify the output voxel size
or
-volreg_warp_master to specify a master dataset
But either of these would require using afni_proc.py to preprocess your data. fMRIprep may have a similar option to specify the final grid of your output, but none of us are experts on that pipeline tool.

I suspect CONN resampled all your data to the template space dimensions, and hence you didn't get an error. Did you use CONN to preprocess the resting state data? Or fMRIprep?

In response to pmolfese, all other parameters were the same between sites and we are also correcting for site during second level analysis.

The blip data is a good distortion correction, but not necessarily a signal correction; I would be cautious of expecting the second level analysis alone to account for those site/scanner differences. Both the fBIRN papers and my own (granted just a poster that needs to be written up) show this as well. Do you have physio data to regress out? Are the scanners the same make/model/software?

Some folks have had success using COMBAT and related algorithms for adjusting for site/scanner differences as well. Resources below.

Some other potentially useful links:
COMBAT fMRI
COMBAT Resting State
COMBAT + RAVEL
My SfN poster on scanner comparisons

Gang · September 28, 2023, 5:09pm

I added the first level generated task > baseline and compared it between groups.

If your research interest is about the contrast between task and baseline, feeding the contrast as input at the population level is appropriate.

Just noticed that those two contrasts you specified at the population level are the same. You may change the specification from

...
    -num_glt 2  \
    -gltLabel 1 RISKvHC_exclude  -gltCode 1 'grp : 1*RISK -1*HC' \
    -gltLabel 2 RISKvHC_over     -gltCode 2 'grp : 1*RISK -1*HC' \

to

...
    -num_glt 1  \
    -gltLabel 1 RISKvHC  -gltCode 1 'grp : 1*RISK -1*HC' \

Gang

Indigo · September 28, 2023, 5:53pm

Thank you, I have a ton of research to do, but this is really awesome! Thank you.

Indigo · September 28, 2023, 5:54pm

Thank you for this feedback, I'll look into this and try rerunning this way.