Is the RMS difference computed based on the pixel intensity of the images or something else?
Are there recommendations around what would be considered "good" values in terms of data quality? E.g., < 50 for RMSnew, or < 20 difference between new and old (this probably differs between scanners, but some rough estimate so that if I see a value of 300, I know it's probably messy).
Yes, RMS is the root mean square of the voxelwise image intensity differences. There isn't a strong notion of what is good, since it depends on the data (and should scale with it, for example). Perhaps RMS/(global mean) might be a more comparable measure, but I am not sure. Of course, actual motion and such will lead to higher values.
When using 3dvolreg's -dfile option, those numbers can indeed be output. Each row of that file has the form:
n  roll  pitch  yaw  dS  dL  dP  rmsold  rmsnew
where: n = sub-brick index
roll = rotation about the I-S axis }
pitch = rotation about the R-L axis } degrees CCW
yaw = rotation about the A-P axis }
dS = displacement in the Superior direction }
dL = displacement in the Left direction } mm
dP = displacement in the Posterior direction }
rmsold = RMS difference between input brick and base brick
rmsnew = RMS difference between output brick and base brick
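If it helps, here is a minimal sketch of how one might peek at those columns after running 3dvolreg (the file name dfile.1D is just a hypothetical example here):

```python
# Sketch: read a 3dvolreg -dfile output and summarize its motion columns.
# Assumes the file is named "dfile.1D" (hypothetical) and has the 9 columns
# described above: n, roll, pitch, yaw, dS, dL, dP, rmsold, rmsnew.
import numpy as np

cols = ["n", "roll", "pitch", "yaw", "dS", "dL", "dP", "rmsold", "rmsnew"]
dat  = np.loadtxt("dfile.1D")               # shape: (n_timepoints, 9)

for name, col in zip(cols, dat.T):
    print(f"{name:>7s}: min={col.min():8.3f}  max={col.max():8.3f}")
```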
At each time point, the rmsold quantity is calculated by summing the squared differences in voxel intensity between the base and input volumes across all voxels in the 3D volume, then dividing by the number of voxels and taking the square root. (That is, it is the L2 norm of the voxelwise differences divided by sqrt(N).) rmsnew is the same quantity computed between the base volume and the motion-registered dataset.
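As an illustration of that formula (not 3dvolreg's internal code), here is a minimal sketch, assuming the base and input EPI volumes are already loaded as same-shape NumPy arrays (e.g., via nibabel):

```python
# Illustrative sketch of an rmsold/rmsnew-style calculation between two volumes.
import numpy as np

def rms_diff(vol, base):
    """Root mean square of voxelwise intensity differences."""
    diff = vol.astype(float) - base.astype(float)
    return np.sqrt(np.mean(diff**2))     # == np.linalg.norm(diff.ravel()) / sqrt(N)

# A scale-free variant along the lines of the RMS/(global mean) idea above:
def rms_diff_norm(vol, base):
    return rms_diff(vol, base) / np.mean(base)
```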
Note that the output values will depend on the scale and units of the input volumes. At the stage where this is typically calculated, the FMRI time series still has the arbitrary units from the scanner. Therefore, I don't think there is a specific absolute value of the difference one could use as a reference gauge. I think it would even be difficult to define a scaleless ratio that one could rely on across datasets, and having more or less non-brain material in a dataset would make things trickier still.
The values will also differ with the motion patterns themselves in unpredictable ways that probably don't map onto an interpretation of having done a "good" or "bad" job of motion estimation.
So, I don't see an easy, generalizable interpretation of these quantities. Personally, I have never used them.
If you are interested in data quality, we have done a lot of work on this topic recently, and there are many tools and procedures for it in AFNI. Some things to check out include:
Comment: it is also worth checking out this fun online demo of the APQC HTML and some of its interactive functionality, described in the above paper: sub-002