afni_proc.py takes a long time to run

Hi all,
I have created my specific age T1 template firstly. Then I run Animan_warper to register T1 image to my template. Last, I run afni_proc.py. Unfortunately, afni_proc.py took about 14 hours for each subject. Is it normal and do I need to adjust my code to process rs-fmri data?The code I used is as follows:

set subj           = $1
set ap_label       = rsfmri_regressall
set dir_inroot     = ${PWD:h}                       
set dir_log        = ${dir_inroot}/logs
set dir_ref        = ${dir_inroot}/data_age1_basic/template    

set dir_basic      = ${dir_inroot}/data_age1_basic          
set dir_aw         = ${dir_inroot}/data_13_aw/age1   
set dir_ap         = ${dir_inroot}/data_${ap_label}/age1  

set sdir_basic     = ${dir_basic}/${subj}
set sdir_anat      = ${sdir_basic}/anat
set sdir_epi       = ${sdir_basic}/func
set sdir_aw        = ${dir_aw}/${subj}
set sdir_ap        = ${dir_ap}/${subj}
set anat_orig    = ${sdir_anat}/${subj}*T1w_bet.nii.gz
set anat_orig_ab = ${subj}_anat
set ref_base     = ${dir_ref}/Template_Standard_Age1.nii 
set ref_base_ab  = template_age1 

set dsets_epi     = ( ${sdir_epi}/${subj}*task-rest_bold.nii.gz )

set anat_cp       = ${sdir_aw}/${anat_orig_ab}_nsu.nii.gz

set dsets_NL_warp = ( ${sdir_aw}/${anat_orig_ab}_warp2std_nsu.nii.gz           \
                    ${sdir_aw}/${anat_orig_ab}_composite_linear_to_template.1D \
                    ${sdir_aw}/${anat_orig_ab}_shft_WARP.nii.gz                )
set nthr_avail = `afni_system_check.py -disp_num_cpu`
set nthr_using = `afni_check_omp`

echo "++ INFO: Using ${nthr_using} of available ${nthr_avail} threads"

setenv AFNI_COMPRESSOR GZIP
set ap_cmd = ${sdir_ap}/ap.cmd.${subj}

\mkdir -p ${sdir_ap}


cat <<EOF >! ${ap_cmd}
# -----------------------------------------------------------------
setenv OMP_NUM_THREADS 10
afni_proc.py                                                                \
    -subj_id                  ${subj}                                       \
    -blocks                   tshift align tlrc volreg blur mask scale regress   \
    -dsets                    ${dsets_epi}                                  \
    -copy_anat                ${anat_cp}                                    \
    -anat_has_skull           no                                            \
    -anat_uniform_method      none                                          \
    -radial_correlate_blocks  tcat volreg regress                           \
    -radial_correlate_opts    -sphere_rad 14                                \
    -tcat_remove_first_trs    10                                     \
    -volreg_align_to          MIN_OUTLIER                                   \
    -volreg_align_e2a                                                       \
    -volreg_tlrc_warp                                                       \
    -volreg_warp_dxyz         0.5                                          \
    -volreg_compute_tsnr      yes                                           \
    -align_opts_aea           -cost nmi -check_flip -feature_size 0.5       \
    -align_unifize_epi        local                                         \
    -tlrc_base                ${ref_base}                                   \
    -tlrc_NL_warp                                                           \
    -tlrc_NL_warped_dsets     ${dsets_NL_warp}                              \
    -blur_size                2.5                                             \
    -mask_segment_anat        yes                                           \
    -mask_segment_erode       yes                                           \
    -mask_import              Tvent /data/home/bnu006/UW-Madison_Rhesus_MRI/preprocess/data_age1_basic/template/ventricles_in_template_age1.nii \
    -mask_intersect           Svent CSF Tvent                              \
    -regress_ROI              WMe Svent                                     \
    -regress_ROI_per_run      WMe Svent                                     \
    -regress_motion_per_run                                                 \
    -regress_apply_mot_types  demean deriv                                  \
    -regress_polort           2                                             \
    -regress_bandpass         0.01 0.1                                      \
    -regress_est_blur_errts                                                 \
    -regress_est_blur_epits                                                 \
    -regress_run_clustsim     no                                            \
    -html_review_style        pythonic 

EOF

cd ${sdir_ap}


tcsh -xef ${ap_cmd} |& tee output.ap.cmd.${subj}

time tcsh -xef proc.${subj} |& tee output.proc.${subj}

echo "++ FINISHED AP: ${ap_label}"

exit 0

Hi-

14 hours sounds long, but it depends on:

  • the number of input EPI datasets you have and their voxel size
  • the voxel size of the anatomical data
  • the voxel size of your final data set (here, you set it to be 0.5 mm)
  • the available (and asked for) computational resources, and how much you are running simultaneously
    ... and more.

A) Do you know what part is going so slowly? You could output a time-stamp ordered list of all dsets in the AP results directory and see where the gap(s) occur:

\ls -ltrd * 

This will contain a lot of stuff.
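If digging through that listing by eye gets tedious, a small loop can print the modification-time gap between consecutive outputs so the slow step stands out. This is just a rough sketch, not part of the standard AFNI workflow; it is written for POSIX sh (not the tcsh used above), assumes GNU stat, and assumes the AFNI output filenames contain no spaces (which they don't):

```shell
# Rough sketch (GNU stat assumed): print the gap, in seconds, between
# consecutive files in the afni_proc.py results directory, sorted by
# modification time, so the slow processing step stands out.
prev=""
for f in $(ls -tr); do
    t=$(stat -c %Y "$f")              # epoch seconds of last modification
    if [ -n "$prev" ]; then
        echo "$(( t - prev ))s gap before $f"
    fi
    prev=$t
done
```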

B) To check the EPI properties (matrix dimensions and voxel size), what is the output of:

3dinfo -n4 -ad3 -prefix ${dsets_epi}

? Since your blur size is 2.5, I'm guessing your EPI voxels aren't tiny (but maybe the FOV is large, which can happen in animal imaging).

C) To check the same anatomical dset properties, what is the output of:

3dinfo -n4 -ad3 -prefix ${anat_cp}

?

D) You ask to use 10 CPUs (OMP threads). Does your OS have that many available? What is the output of:

afni_system_check.py -disp_num_cpu

E) How many simultaneous jobs are you running? And what OS are you using, with how much RAM? Are there other computationally expensive things running at the same time?

F) A more subtle thing is: how good is the initial overlap between your anatomical-template and EPI-anatomical dataset pairs? The reason is that when applying the nonlinear warps, 3dNwarpApply/3dQwarp will create a dataset that encompasses both the source and master datasets, at the final grid resolution. So even if each individual alignment step works, if the datasets are not reasonably well centered/overlapping, very large amounts of memory may be used, which can slow things down, depending on the OS. To check this, you can look at:

  • the EPI over the anatomical in the AFNI GUI; the afni_proc.py QC HTML also shows images of EPI-anatomical overlap at the end of the "vorig" section.
  • the QC directory of the @animal_warper output should show an "init overlap" QC image of the anatomical and template volume.

G) If you are using Windows Subsystem for Linux and haven't installed+started VcXsrv or another X server, then generating the graphics/images along the way can be surprisingly slow (though 14 hours sounds far too slow even for that). The install instructions describe how to set it up; don't forget to start it before opening the Linux terminals.

--pt

Hi,
A) The time-stamp ordered list of all dsets in the results directory is as follows:

B) The matrix dimensions and voxel size of EPI are: 64 26 64 456 2.187500 3.100002 2.187500
C) The matrix dimensions and voxel size of anatomical dset are: 512 248 512 1 0.273400 0.500000 0.273400, and the matrix dimensions and voxel size of my age template are: 280 248 280 1 0.500000 0.500000 0.500000
D) My OS has 20 CPUs available.

E) The OS and RAM are as follows; I also processed DWI data using MRtrix3 at the same time.

F) The "init overlap" QC images of the anatomical and template volume, and the image of EPI-anatomical overlap, are as follows:

G) I used CentOS Linux to run afni_proc.py.

Best regards,
Ruilin

Hi, Ruilin-

Thanks for posting that. From the output in "A", I see that your initial zipped brick dataset size for pb01* is 66MB, which is fine, but your zipped brick datasets for pb02*volreg* are over 20GB in size! That is a huge increase!
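(As an aside, a one-liner like the following makes such size jumps easy to spot directly; it just uses standard du/sort, run in the results directory, where pb* is afni_proc.py's naming convention for per-block datasets:)

```shell
# List the size of each per-block BRIK file in the afni_proc.py results
# directory, sorted by name, so sudden size jumps between blocks are obvious.
du -sh pb*.BRIK* 2>/dev/null | sort -k2
```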

What is primarily driving this is your huge upsampling factor. Your EPI voxel dims are 2.2 x 3.1 x 2.2 mm. Your chosen final resolution, though, is muuuch finer: 0.5 x 0.5 x 0.5 mm. Comparing voxel volumes, that means you are upsampling by a volumetric factor of 15/0.125 = 120! So that is increasing your file size by a factor of about 120.

I would not recommend upsampling like that. You cannot create new high resolution information, and you only blow up the file size by a huge factor. We would normally round the smallest dimension up slightly. So, instead of using:

-volreg_warp_dxyz         0.5   

perhaps you could use:

-volreg_warp_dxyz         2.0

That will only approximately double the voxel count (a volumetric factor of about 15/8 ≈ 1.9).

--pt

ps: And I will just note from the other outputs, your QC images from part "F" all look fine:

  • from the @animal_warper QC images, the initial overlap of the anatomical and template is good, and the final alignment is good.
  • from the APQC HTML "Initial overlap" images, the EPI-anatomical overlap starts off great, too, and the final EPI-anatomical alignment looks good.

Hi, ptaylor
As we want to use the D99 atlas, which is in NMT v2 template space, we chose to upsample the EPI datasets to 0.5 x 0.5 x 0.5 mm^3. If we use ''-volreg_warp_dxyz 2.0'' when running afni_proc.py, could we resample the final ''errts*'' image to 0.5 x 0.5 x 0.5 mm^3 and then analyze FC between different regions?

Best,
Ruilin

Hi, Ruilin-

Indeed, the issue is: the EPI is low-res and the atlas of interest is high-res. Should the EPI go to the atlas resolution, or the atlas to the EPI resolution? For a couple of reasons, we would recommend the latter.

Firstly, there is the practical issue of file sizes here, which is apparent from the start of this thread.

But perhaps more importantly, there are mathematical/interpretational constraints. Ideally, for stable signal and statistics, you would like to have multiple EPI voxels per ROI. If your EPI voxels are much larger than the ROIs in the atlas and you choose to upsample the EPI, then you will essentially get multiple ROIs with the same signal. That high correlation is not physiological, but simply due to coarse sampling; therefore, it would be artifactual.

So, it would likely be better for interpretation to downsample the atlas ROIs to the EPI resolution (which might itself be slightly upsampled, say to 2 mm isotropic in this case). That seems reasonable. If you do this, then downsampling the atlas might mean that some ROIs "disappear" because they are too small. That might not be desired overall, but it is realistic considering the sampling. If you want to represent small ROIs, that places a requirement on acquiring high-res EPI data in the first place.
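Outside of afni_proc.py, one common way to do that downsampling is AFNI's 3dresample with nearest-neighbor interpolation, which keeps the integer ROI labels intact rather than averaging them. The dataset names below are placeholders for your own files:

```shell
# Resample the atlas onto the errts (final EPI) grid with nearest-neighbor
# (NN) interpolation, so integer ROI labels are not blurred/averaged.
# Filenames are placeholders for your own datasets.
3dresample                                     \
    -master  errts.${subj}.tproject+tlrc       \
    -rmode   NN                                \
    -input   D99_atlas_in_NMT_v2.nii.gz        \
    -prefix  D99_atlas_in_epi_grid.nii.gz
```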

Since the D99 atlas is already in the final NMT template space, you can add it to your afni_proc.py command and have it appropriately resampled to the final EPI space with the -ROI_import LABEL DSET_NAME syntax, like:

   -ROI_import D99 D99_atlas_in_NMT_v2.0_sym_05mm.nii.gz

--pt