Hi Rick and Daniel,
It's a pleasure to talk to you again!
@rickr: my first idea was indeed to try the "dry_run", but as Daniel points out, it is not exactly the same as the "wet(?)_run". In fact, the script produced by the dry_run fails with "FATAL ERROR: Input -weight is never positive!". I will paste the command and the scripts from both the dry_run and the normal run below for the sake of completeness.
@dglen: Unfortunately, in my case, I really need to use a specific mask as input. The alignment doesn't work if I use an automask, probably because my datasets are EPIs of cynomolgus monkeys with considerable distortion. After a lot of tinkering, I finally managed to go from (anat is absent because it is far, far away)
Interestingly, of all the costs, Hellinger (hel) was the one that worked best. I am now adapting my afni_proc.py command to use this very specific 3dAllineate call. The brain stem is still not great, but I still need to optimize my field map correction to see whether I gain anything there.
Coming back to the differences between the dry vs non-dry run:
align_epi_anat.py with dry_run:
align_epi_anat.py -anat2epi -anat sub-02_ses-01_T1w.cut.ns_der10.nii \
    -suffix _al2epi \
    -epi vr_base_min_outlier_noeyes+orig -epi_base 0 \
    -epi_strip None \
    -anat_has_skull no \
    -cost hel \
    -cmass cmass -prep_off -giant_move \
    -Allineate_opts "-weight_frac 1.0 -maxrot 6 -maxshf 10 -VERB -warp aff -source_mask Cyno162_bmask_in_sub-02_ses-01_T1w.cut.der10.nii.gz" \
    -save_script script_epi2anat.tcsh \
    -volreg off -tshift off -ex_mode dry_run
Script generated by the dry run:
3dAttribute DELTA ./vr_base_min_outlier_noeyes+orig
3dAttribute DELTA ./vr_base_min_outlier_noeyes+orig
3dAttribute DELTA ./sub-02_ses-01_T1w.cut.ns_der10.nii
\rm -f ./__tt_vr_base_min_outlier_noeyes*
\rm -f ./__tt_sub-02_ses-01_T1w.cut.ns_der10*
3dcopy ./sub-02_ses-01_T1w.cut.ns_der10.nii
3dbucket -prefix ./__tt_vr_base_min_outlier_noeyes_ts
3dBrickStat -automask -percentile 90.000000 1 90.000000
3dcalc -datum float -prefix ./__tt_vr_base_min_outlier_noeyes_ts_wt -a
./__tt_vr_base_min_outlier_noeyes_ts+orig -expr 'min(1,(a/-999.000000))'
3dAllineate -hel -wtprefix ./__tt_sub-02_ses-01_T1w.cut.ns_der10_al2epi_wtal
-weight ./__tt_vr_base_min_outlier_noeyes_ts_wt+orig -source
./__tt_sub-02_ses-01_T1w.cut.ns_der10+orig -prefix
./sub-02_ses-01_T1w.cut.ns_der10_al2epi -base
./__tt_vr_base_min_outlier_noeyes_ts+orig -cmass -1Dmatrix_save
./sub-02_ses-01_T1w.cut.ns_der10_al2epi_mat.aff12.1D -master BASE
-mast_dxyz 1.234567 -weight_frac 1.0 -maxrot 6 -maxshf 10 -VERB -warp aff
-source_mask Cyno162_bmask_in_sub-02_ses-01_T1w.cut.der10.nii.gz -twobest
11 -twopass -VERB -maxrot 45 -maxshf 40 -fineblur 1 -source_automask+2
3dNotes -h "align_epi_anat.py -anat2epi -anat
sub-02_ses-01_T1w.cut.ns_der10.nii -suffix _al2epi -epi
vr_base_min_outlier_noeyes+orig -epi_base 0 -epi_strip None
-anat_has_skull no -cost hel -cmass cmass -prep_off -giant_move
-Allineate_opts -weight_frac 1.0 -maxrot 6 -maxshf 10 -VERB -warp aff
-source_mask Cyno162_bmask_in_sub-02_ses-01_T1w.cut.der10.nii.gz
-save_script script_epi2anat.tcsh -volreg off -tshift off -ex_mode
\rm -f ./__tt_vr_base_min_outlier_noeyes*
\rm -f ./__tt_sub-02_ses-01_T1w.cut.ns_der10*
Script generated by the non-dry run:
3dAttribute DELTA ./vr_base_min_outlier_noeyes+orig
3dAttribute DELTA ./vr_base_min_outlier_noeyes+orig
3dAttribute DELTA ./sub-02_ses-01_T1w.cut.ns_der10.nii
\rm -f ./__tt_vr_base_min_outlier_noeyes*
\rm -f ./__tt_sub-02_ses-01_T1w.cut.ns_der10*
3dcopy ./sub-02_ses-01_T1w.cut.ns_der10.nii
3dnvals -all ./vr_base_min_outlier_noeyes+orig
3dbucket -prefix ./__tt_vr_base_min_outlier_noeyes_ts
3dBrickStat -automask -percentile 90.000000 1 90.000000
3dcalc -datum float -prefix ./__tt_vr_base_min_outlier_noeyes_ts_wt -a
./__tt_vr_base_min_outlier_noeyes_ts+orig -expr 'min(1,(a/473.529396))'
3dAllineate -hel -wtprefix ./__tt_sub-02_ses-01_T1w.cut.ns_der10_al2epi_wtal
-weight ./__tt_vr_base_min_outlier_noeyes_ts_wt+orig -source
./__tt_sub-02_ses-01_T1w.cut.ns_der10+orig -prefix
./sub-02_ses-01_T1w.cut.ns_der10_al2epi -base
./__tt_vr_base_min_outlier_noeyes_ts+orig -cmass -1Dmatrix_save
./sub-02_ses-01_T1w.cut.ns_der10_al2epi_mat.aff12.1D -master BASE
-mast_dxyz 0.499999 -weight_frac 1.0 -maxrot 6 -maxshf 10 -VERB -warp aff
-source_mask Cyno162_bmask_in_sub-02_ses-01_T1w.cut.der10.nii.gz -twobest
11 -twopass -VERB -maxrot 45 -maxshf 40 -fineblur 1 -source_automask+2
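
Comparing the two listings, the differences are the missing 3dnvals line in the dry run, the -mast_dxyz value (1.234567 vs 0.499999), and the divisor in the 3dcalc weight expression (-999.000000 vs 473.529396). A plausible reading, judging only from these listings and not from the align_epi_anat.py source, is that the dry run writes placeholder values because 3dBrickStat and friends are never actually executed. The small Python sketch below mirrors the 'min(1,(a/p90))' expression to show why a divisor of -999 would leave no positive weight, which matches the 3dAllineate error. The voxel intensities are made up for illustration, and the sketch assumes the EPI values are non-negative.

```python
# Divisors copied from the two 3dcalc lines above.
DRY_RUN_P90 = -999.0        # value written into the dry-run script (assumed placeholder)
REAL_RUN_P90 = 473.529396   # value written into the non-dry-run script


def weight(a, p90):
    """Mirror the 3dcalc expression 'min(1,(a/p90))' for one voxel value."""
    return min(1.0, a / p90)


# Hypothetical non-negative EPI intensities, for illustration only.
voxels = [0.0, 120.0, 473.529396, 900.0]

dry_weights = [weight(a, DRY_RUN_P90) for a in voxels]
real_weights = [weight(a, REAL_RUN_P90) for a in voxels]

print("dry run :", dry_weights)
print("real run:", real_weights)
print("any positive weight in dry run? ", any(w > 0 for w in dry_weights))
print("any positive weight in real run?", any(w > 0 for w in real_weights))
```

If this reading is right, the saved dry-run script would need the real 90th percentile and the real voxel size substituted in before it can be run by hand.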