Dear Dr. Chen Gang,

I am reading the source code of MBA (MBA.R), and a few details confuse me. I would appreciate your help.

First, in the definition of the rp_summary function, why is the original data transformed as sqrt(2) * (data - mean) + mean?

```
# obtain summary information of posterior samples for RPs
rp_summary <- function(ps, ns, nR) {
   mm <- apply(ps, c(2,3), mean)
   for(ii in 1:nR) for(jj in 1:nR) ps[,ii,jj] <- sqrt(2)*(ps[,ii,jj] - mm[ii,jj]) + mm[ii,jj]
   RP <- array(NA, dim=c(nR, nR, 8))
   RP[,,1] <- apply(ps, c(2,3), mean)
   RP[,,2] <- apply(ps, c(2,3), sd)
   RP[,,3] <- apply(ps, c(2,3), cnt, ns)
   RP[,,4:8] <- aperm(apply(ps, c(2,3), quantile, probs=c(0.025, 0.05, 0.5, 0.95, 0.975)), dim=c(2,3,1))
   dimnames(RP)[[1]] <- dimnames(ps)[[2]]
   dimnames(RP)[[2]] <- dimnames(ps)[[3]]
   dimnames(RP)[[3]] <- c('mean', 'SD', 'P+', '2.5%', '5%', '50%', '95%', '97.5%')
   return(RP)
}
```
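
To make sure I am reading the transform correctly, here is a minimal sketch on toy data (the vector `x` is a made-up stand-in for one region pair's posterior draws, not MBA output). As far as I can tell, the transform leaves the mean untouched and inflates the standard deviation by a factor of sqrt(2), so the variance doubles; what I do not understand is why this inflation is wanted.

```r
# Toy stand-in for one region pair's posterior draws (hypothetical, not MBA output)
set.seed(1)
x <- rnorm(1000, mean = 0.3, sd = 0.1)

# Same transform as in rp_summary: rescale deviations from the mean by sqrt(2)
m <- mean(x)
y <- sqrt(2) * (x - m) + m

mean_unchanged <- isTRUE(all.equal(mean(y), mean(x)))  # centre stays where it was
sd_ratio  <- sd(y) / sd(x)                              # spread grows by sqrt(2)
var_ratio <- var(y) / var(x)                            # equivalently, variance doubles
```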

Second, when extracting the region pair effects, why is the reference level computed as "intercept - sum(other EOI levels)" (in code: psa[nl,,,] <- ps[nl,,,] - psa[nl,,,] # reference level)? And why is each other EOI level computed as "EOI + intercept" (in code: psa[jj,,,] <- ps[nl,,,] + ps[jj,,,])?

```
########## region pair effects #############
# for factor
if(any(!is.na(lop$EOIc) == TRUE)) for(ii in 1:length(lop$EOIc)) {
   lvl <- levels(lop$dataTable[[lop$EOIc[ii]]]) # levels
   nl <- nlevels(lop$dataTable[[lop$EOIc[ii]]]) # number of levels: last level is the reference in deviation coding
   ps <- array(0, dim=c(nl, ns, nR, nR)) # posterior samples
   for(jj in 1:(nl-1)) ps[jj,,,] <- region_pair(pe, ge, paste0(lop$EOIc[ii],jj), nR)
   ps[nl,,,] <- region_pair(pe, ge, 'Intercept', nR)
   psa <- array(0, dim=c(nl, ns, nR, nR)) # posterior samples adjusted
   for(jj in 1:(nl-1)) {
      psa[jj,,,] <- ps[nl,,,] + ps[jj,,,]
      psa[nl,,,] <- psa[nl,,,] + ps[jj,,,]
   }
   psa[nl,,,] <- ps[nl,,,] - psa[nl,,,] # reference level
   dimnames(psa)[[3]] <- rL
   dimnames(psa)[[4]] <- rL
```
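
My current guess, based on the "deviation coding" comment in the code, is that this follows from sum-to-zero contrasts. The sketch below is only a toy check with an ordinary `lm` fit (the data frame `d`, its factor `g`, and the cell means are all made up, and I am assuming MBA uses the same coding as `contr.sum`). In this toy case, "intercept + effect" recovers the mean of each of the first nl-1 levels, and "intercept - sum of the effects" recovers the mean of the last level. Is that the reasoning behind the MBA code as well?

```r
# Toy data: one factor with three levels and known cell means (hypothetical)
set.seed(2)
true_means <- c(A = 1, B = 2, C = 6)
d <- data.frame(g = factor(rep(names(true_means), each = 50)))
d$y <- unname(true_means[as.character(d$g)]) + rnorm(nrow(d), sd = 0.1)

# Deviation (sum-to-zero) coding: the last level gets no coefficient of its own
contrasts(d$g) <- contr.sum(nlevels(d$g))
b <- coef(lm(y ~ g, data = d))   # (Intercept), g1, g2

# Mirror the arithmetic in the MBA excerpt
nl <- nlevels(d$g)
est <- numeric(nl)
for (jj in 1:(nl - 1)) est[jj] <- b[1] + b[jj + 1]  # intercept + effect of level jj
est[nl] <- b[1] - sum(b[-1])                         # intercept - sum of the other effects
names(est) <- levels(d$g)

# Per-level sample means to compare against
cell_means <- tapply(d$y, d$g, mean)
```

Comparing `est` with `cell_means` shows the two agree up to floating-point error.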

Third, why do we get posterior samples for the diagonal even though we do not provide diagonal values?

And if we provide a data table that contains diagonal values or duplicate entries (more than n*(n-1)/2 entries), will the final results be greatly affected?
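
For context, this is roughly how I would check a table for such rows before running MBA. It is only a sketch: the data frame `dt` is made up, and the column names `Subj`, `ROI1` and `ROI2` are my assumption about the input layout.

```r
# Hypothetical input table; the column names Subj, ROI1 and ROI2 are assumed
dt <- data.frame(
  Subj = "s1",
  ROI1 = c("A", "A", "B", "B", "A"),
  ROI2 = c("B", "C", "C", "A", "A"),
  stringsAsFactors = FALSE
)

n <- length(unique(c(dt$ROI1, dt$ROI2)))  # number of regions
expected_pairs <- n * (n - 1) / 2          # off-diagonal, unordered pairs

# Treat (A, B) and (B, A) as the same pair by ordering the two labels
first  <- ifelse(dt$ROI1 <= dt$ROI2, dt$ROI1, dt$ROI2)
second <- ifelse(dt$ROI1 <= dt$ROI2, dt$ROI2, dt$ROI1)
key <- paste(dt$Subj, first, second)

n_diagonal  <- sum(dt$ROI1 == dt$ROI2)  # rows such as (A, A)
n_duplicate <- sum(duplicated(key))     # repeated pairs within a subject
```

In this toy table there are five rows for three regions, so two rows exceed the n*(n-1)/2 = 3 expected pairs: one diagonal entry and one duplicate.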