Conventional massive univariate analysis implicitly assumes that neighboring voxels are statistically independent. This unrealistic assumption incurs a substantial cost: statistical evidence must then be heavily penalized to account for multiple testing. As a result, most multiple-comparison adjustment procedures are excessive from the outset. Moreover, additional complications, such as estimating spatial relatedness, as you have noted, further entangle and destabilize the penalization process.
For these reasons, it is less productive to fixate on the precision or exactness of the estimated surviving cluster sizes. Instead, greater emphasis should be placed on the continuity of statistical evidence and on result-reporting practices that “highlight, but don’t hide” the data. This perspective is discussed in detail in this paper and further elaborated in this blog post.
When evidence continuity is foregrounded, the exact surviving cluster size becomes less critical. In practice, this means that you may reasonably adopt whichever approach to estimating spatial relatedness you consider most appropriate for your cluster determination, without overemphasizing the nominal precision of that step.
Gang Chen