Scale to shorts misfit error in 3ddelay - not fixed by AFNI_FLOATIZE=YES

AFNI version info (afni -ver):
Precompiled binary linux_ubuntu_24_64: Oct 1 2024 (Version AFNI_24.3.00 'Elagabalus')

Hi there,

I am running standard pre-processing on some task fMRI data (tcat despike align volreg surf blur scale), then averaging the runs, clipping out the portion of phasic stimulation and running 3ddelay on that data.

I have been getting a warning about scaling to shorts (see example below), so I decided to try to force it to output floats (I don't mind if the data files are larger). There was no option in the 3ddelay help for switching to floats, but after some reading I tried setting the environment variable AFNI_FLOATIZE=YES. I ran 3ddelay again, but got the same warning. I checked with echo $AFNI_FLOATIZE and it said YES, so it was set correctly.

I wondered if I was setting the environment variable at the wrong point? I had previously loaded AFNI (ml afni) before running afni_proc.py, then ran the other steps, then set the environment variable, then ran 3ddelay - in that order.

Then I read about AFNI_FLOATIZE here (link below), and wondered if this variable only works for certain AFNI programs, like 3dANOVA and some others. I did not see 3ddelay listed.
https://afni.nimh.nih.gov/pub/dist/doc/program_help/README.environment.html

I realise it is a warning, not an error, and the message says the affected nodes are likely on the periphery, but I thought I should check whether I need to do something differently, as the numbers are high-ish (e.g., 12% below).

The phase-encoded fingertip maps 3ddelay produces look fine to me...

Thank you for your assistance,

H

+ 3ddelay: AFNI version=AFNI_24.3.00 (Oct  1 2024) [64-bit]
++ Authored by: Ziad Saad (with help from B Douglas Ward)
*+ WARNING: +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
*+ WARNING: sub-001_ses-01_fingermap_lh_sinusoid[3] scale to shorts mean misfit error = 12.9% -- * Caution
 + a) Numerical precision has been lost when truncating results
       from 32-bit floating point to 16-bit integers (shorts).
 + b) Consider writing datasets out in float format.
       In most AFNI programs, use the '-float' option.
 + c) This warning is a new message, but is an old issue
       that arises when storing results in an integer format.
 + d) Don't panic! These messages likely originate in peripheral
       or unimportant voxels. They mean that you must examine your output.
       "Assess the situation and keep a calm head about you,
        because it doesn't do anybody any good to panic."
++ ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
set fs = 0.520833
set T = 48
set polort = -1
3ddelay -input 'pb0'${block_num}'.'${subj}'.'${hemi}'.rALL_phasic.'${block_type}${extension} \
		-ideal_file $ref_wave \
		-fs $fs \
		-T $T \
		-polort $polort \
		-nophzwrp \
		-correct_bias \
		-co 0.5 \
		-nodsamp \
		-prefix ${subj}_fingermap_${hemi}_${ref_wave_name}

Update: despite what ChatGPT says, you also cannot force 3ddelay to produce floats by using -float; i.e., the below does not work:

3ddelay -input 'pb0'${block_num}'.'${subj}'.'${hemi}'.rALL_phasic.'${block_type}${extension} \
		-ideal_file $ref_wave \
		-fs $fs \
		-T $T \
		-polort $polort \
		-nophzwrp \
		-correct_bias \
		-co 0.5 \
		-nodsamp \
		-float \
		-prefix ${subj}_fingermap_${hemi}_${ref_wave_name}

Hello,

Yes, the AFNI_FLOATIZE variable was initially applied to 3dDeconvolve and 3dcalc, and later 3dFDR it seems, but that is all. Indeed, ChatGPT does a good job of just making up options. Currently 3ddelay has no ability to write out floats.

Note that those misfit warnings generally apply to numbers very close to zero, which are usually the less-important ones (but not always).

-rick

3ddelay is a very old program in the AFNI package. It was the result of part of a PhD dissertation from the 1990s, and has not been seriously maintained for a long time. Sorry about that.

Hi Rick and Bob, thanks for getting back to me.

I just pulled all my data out of the niml file to look at the values, to see whether they appeared truncated (a low number of decimal places, or some such).

Col 4 = Delay, which we use to give the intensity value/colour of our phase-encoded finger maps; we set the min and max values to 0-48, our max time lag, but the range is somewhere from 0 to 100/200.

Col 6 is called 'Correlation Coefficient' but must actually be the p value of the correlation coefficient from the values below? We use it to threshold the maps (p/q value = .05).

Here is one example of the data, where we see actual correlations with the reference waves, i.e., where we would see finger maps

1557 0 0 27.18159 0.372919 0.028368 347.429
1558 0 0 27.57032 0.366015 0.028001 343.8013
1559 0 0 27.33636 0.368932 0.027782 354.9985
1560 0 0 30.00701 0.343261 0.025511 363.9392
1561 0 0 32.59488 0.310102 0.022948 367.0813

If I am reading this correctly, there are 5-6 decimal places in the values we use… That looks like sufficient precision to me for what we need. So I have no problem?

FYI: most of the file is like this (most voxels have no correlation to the reference waves at any time lag), all 0. Would these be where I am getting the misfit warnings?

1540 0 0 23.99985 0 0 0
1541 0 0 23.99985 0 0 0
1542 0 0 23.99985 0 0 0
1543 0 0 23.99985 0 0 0
1544 0 0 23.99985 0 0 0

Thanks again for your help, I think it all looks like sufficient precision for what I need, but please let me know if I have misunderstood something.

H

Hello,

Here is some background on storing data as scaled shorts. I will get to the actual misfit error next.

Assuming the values are non-integral reals, converting floats (32-bit) to (unscaled) shorts (16-bit) loses precision. 17.3 might be converted to 17, and 0.3 might be converted to 0, losses of accuracy of about 1.7% and 100%, respectively. Numbers less than 0.5 (or less than 1, if truncation rather than rounding is applied) would have a loss of 100%. If all values are around 100, for example, an average loss of 0.25% might be common: there are no values of 99.1 or 99.6, only values of 99 or 100. If a person cares about the difference between 100 (baseline) and 100.48 (which might be a BOLD response of 0.48), that difference will not even exist.
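To make the arithmetic concrete, here is a small Python sketch of that truncation loss (plain Python, not AFNI code; the example values are the ones from the paragraph above):

```python
# Fractional accuracy lost when truncating a float to a plain
# (unscaled) 16-bit integer -- illustrative only, not AFNI code.

def truncation_loss(value: float) -> float:
    """Fraction of `value` lost when it is stored as an unscaled short."""
    stored = int(value)                 # truncation, as in a bare float->short cast
    return abs(stored - value) / value

print(truncation_loss(17.3))    # ~0.017 -> about a 1.7% loss
print(truncation_loss(0.3))     # 1.0    -> 100% loss: 0.3 becomes 0
print(truncation_loss(100.48))  # the 0.48 "BOLD response" above baseline vanishes
```

Any value below 1.0 truncates to a stored 0, which is the 100% loss described above.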

Converting to a scaled-short volume (where values are stored as shorts, but with a single real scale factor to recover the "original" values) makes this less severe. If all of the values are around 100, one could scale them up by a factor of 300, making the largest values around 30000 (with the max signed short value of 32767 giving a volume limit of 109.2233 = 32767/300). The volume then carries a scale factor of 1/300 to bring the values back to where they should be.

The advantage now is that the volume can hold short values of 30000, 30001, 30002, 30003, etc., where the volume scale factor of 1/300 has them represent 100.0, 100.00333, 100.00667, 100.01, etc. Rather than representing only 1 unique value in the range from 100 up to (but below) 101, one can now represent 300 unique values there, at a resolution of 0.00333.
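The factor-of-300 round trip described above can be sketched in a few lines of Python (illustrative only; the names to_scaled_short/from_scaled_short are my own, not AFNI functions):

```python
# Round trip through scaled shorts, using the factor-of-300 example above.
SCALE = 300.0                    # floats are multiplied by this before storage
VOLUME_FACTOR = 1.0 / SCALE      # stored with the volume to undo the scaling

def to_scaled_short(value: float) -> int:
    """Store a float as a scaled short (hypothetical helper)."""
    return round(value * SCALE)

def from_scaled_short(short: int) -> float:
    """Recover the (approximate) original value."""
    return short * VOLUME_FACTOR

print(from_scaled_short(30000))     # 100.0
print(from_scaled_short(30001))     # ~100.00333: 300 steps between 100 and 101
print(32767 * VOLUME_FACTOR)        # ~109.2233, the largest value the volume can hold
```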

However, if the original values already go up to 32000, there might be no point in using scaled shorts at all. But if the values go above 32767, scaling becomes necessary again, since such numbers cannot be represented as shorts at all without it.

Note that afni_proc.py imposes an average EPI time series value of 100, with a max of 200. That max of 200 allows one to use scaled shorts with a scale factor of about 150, providing numerical resolution of about 0.00667.
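As a quick sanity check on those numbers (plain arithmetic, assuming the afni_proc.py max of 200 described above):

```python
# Sanity check on the afni_proc.py scaling numbers (plain arithmetic).
SHORT_MAX = 32767        # largest signed 16-bit value
epi_max = 200.0          # afni_proc.py's imposed EPI time series max

max_scale = SHORT_MAX / epi_max    # the largest usable scale factor
resolution = 1.0 / 150.0           # spacing of representable values at scale 150

print(max_scale)     # 163.835 -- "about 150" leaves some headroom
print(resolution)    # ~0.00667
```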

In general, using scaled shorts means ~all 15 bits of accuracy are used to represent the values, regardless of the magnitude of the values.

-rick

Getting to the actual misfit warning...

The reciprocal of the float-to-short scale factor determines the precision of data values in a scaled-short volume (and note that NIFTI allows only one scale factor for the entire 4D dataset, so the maximum value can have a big effect). If shorts are scaled up by a factor of 150, the reciprocal showing the precision is 1/150 = 0.006667. With rounding, the maximum difference is half of that, or 1/300 = 0.003333.

The warning message "scale to shorts mean misfit error" shows the average fractional loss when converting a (computed) float to a scaled short: (short*scalar - float)/float, where scalar is that reciprocal (e.g., 1/150). Values smaller than 0.00333 round to a stored zero, giving fractional losses of 1.0 (100%).
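Here is a toy Python reconstruction of that mean misfit computation (my reading of the warning message, not AFNI's actual source):

```python
# Toy version of the "scale to shorts mean misfit error" computation.
# A reconstruction from the description above, not AFNI source code.

def mean_misfit(values, scale):
    """Average fractional loss from a float -> scaled short -> float round trip."""
    losses = []
    for v in values:
        if v == 0.0:
            continue                        # exact zeros store exactly
        short = round(v * scale)            # store as a scaled short
        back = short / scale                # scale back to a float
        losses.append(abs(back - v) / abs(v))
    return sum(losses) / len(losses)

scale = 150.0
# Values near 100 round-trip almost perfectly...
print(mean_misfit([100.0, 100.25, 99.9], scale))    # close to zero
# ...but values below half the resolution (1/300) all round to 0: 100% misfit
print(mean_misfit([0.001, 0.002, 0.003], scale))    # 1.0
```

This is why a volume full of near-zero values (a blurred periphery, small t-stats) can report a large mean misfit even when the values that matter are stored accurately.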

Suppose a volume of t-stats has this scale factor. Then if it has a lot of values near zero, which is common, the average fractional mismatch might be large. However one often does not really care about t-values close to zero, so this warning might not be important.

On the flip side, what if one ran into this scalar when storing p-values, where the smallest ones are the most important? This would be a disaster. The most important p-values would have the biggest loss of accuracy. Never store p-values as scaled shorts. For that matter, we rarely store p-values at all.

So these warnings provide a reminder to think about the data values and how they are stored. Converting to floats removes such warnings, but it also doubles the size of the stored data.

-rick

FYI: I only get this warning when using blurred data (preproc pipeline: tcat despike align volreg surf blur scale). With an otherwise identical pipeline but no blur, I do not get the scale-to-shorts misfit error...

Depending on the actual data, values outside of the brain tend to be small, and notably, values outside of the bounding box of the EPI will be zero (the final bounding box is generally bigger, based on the anatomical dataset). So any Gaussian blur that grows into these regions tends to have very small values.
-rick
