Model Version History
v1
This is the model described in the original publication.
The model for two-group analysis is described by the following sampling statements:
Where \(\hat\mu\) and \(\hat\sigma\) are the sample mean and sample standard deviation of all the data from the two groups. The effect size is calculated as \((\mu_1 - \mu_2) \big/ \sqrt{(\sigma_1^2 + \sigma_2^2) \,/\, 2}\).
v2
Version 2 of the model fixes issues with the standard deviation and the normality parameter \(\nu\).
The standard deviation of a t distribution \(t_\nu(\mu, \sigma)\) is not \(\sigma\), but \(\sigma \sqrt{\nu / (\nu - 2)}\) if \(2 < \nu\), and infinite if \(1 < \nu \le 2\). Distributions with infinite standard deviation (SD) rarely occur in reality (and never when it comes to humans), so the lower bound of \(\nu\) is changed from 1 to 2.5. The plots now display SD instead of \(\sigma\), and the formula for effect size also uses \(\mathrm{sd}_i\) instead of \(\sigma_i\).
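The relationship between the scale \(\sigma\) and the standard deviation can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `t_sd` is hypothetical and is not part of the package.

```python
import math


def t_sd(sigma: float, nu: float) -> float:
    """Standard deviation of a t distribution with scale sigma and nu degrees of freedom.

    Hypothetical helper for illustration only.
    """
    if nu <= 1:
        raise ValueError("the standard deviation is undefined for nu <= 1")
    if nu <= 2:
        # Infinite standard deviation for 1 < nu <= 2.
        return math.inf
    return sigma * math.sqrt(nu / (nu - 2))


# At the new lower bound nu = 2.5 the SD is sqrt(5) times sigma.
sd_at_bound = t_sd(1.0, 2.5)
# For large nu the SD approaches sigma, as for a normal distribution.
sd_near_normal = t_sd(1.0, 1000.0)
```

With \(\sigma = 1\), the SD at the lower bound \(\nu = 2.5\) is \(\sqrt{5} \approx 2.24\), which is the largest ratio of SD to \(\sigma\) the v2 model allows.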
Why is the lower bound of \(\nu\) 2.5 and not 2?
The probability density function of \(t_2\) is quite close to that of \(t_{2.5}\) in the \(\mu \pm 5 \sigma\) region, but for \(\nu\) close to 2, the SD is arbitrarily large because of the strong outliers. Setting a bound of 2.5 prevents strong outliers and extremely large standard deviations.
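Both halves of this argument can be illustrated with a short calculation. The sketch below assumes a standardized t distribution (\(\mu = 0\), \(\sigma = 1\)) and compares the two densities on a grid over \(\pm 5\), then shows how the ratio of SD to \(\sigma\) grows as \(\nu\) approaches 2. The names `t_pdf` and `sd_factor` are hypothetical.

```python
import math


def t_pdf(x: float, nu: float) -> float:
    """Density of the standardized t distribution (mu = 0, sigma = 1)."""
    coef = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return coef * (1 + x * x / nu) ** (-(nu + 1) / 2)


def sd_factor(nu: float) -> float:
    """Ratio SD / sigma for nu > 2."""
    return math.sqrt(nu / (nu - 2))


# Largest absolute difference between the two densities on a grid over [-5, 5].
grid = [i / 100 for i in range(-500, 501)]
max_diff = max(abs(t_pdf(x, 2.0) - t_pdf(x, 2.5)) for x in grid)

# SD / sigma for values of nu approaching 2 from above.
factors = {nu: sd_factor(nu) for nu in (2.5, 2.1, 2.01, 2.001)}

print(f"max density difference on [-5, 5]: {max_diff:.4f}")
for nu, factor in factors.items():
    print(f"nu = {nu}: SD / sigma = {factor:.2f}")
```

The density difference stays small relative to the peak height of about 0.35, while the SD factor grows without bound as \(\nu\) approaches 2.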
Another change concerns the sampling of \(\sigma_i\). In the original model \(\sigma_i \,/\, \hat\sigma\) was uniformly distributed between \(1 \, / \,1000\) and \(1000\), meaning the prior probability of \(\sigma > \hat\sigma\) was 1000 times that of \(\sigma < \hat\sigma\), which caused an overestimation of \(\sigma\) with low sample sizes (around \(N = 5\)). To make these probabilities equal, now \(\log(\sigma_i \,/\, \hat\sigma)\) is distributed uniformly between \(\log(1\, / \,1000)\) and \(\log(1000)\). At \(N=25\) this change in the prior does not cause a perceptible change in the posterior.
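The prior imbalance described above follows directly from the interval lengths. The following sketch computes the prior odds of \(\sigma_i > \hat\sigma\) against \(\sigma_i < \hat\sigma\) under both priors, working with the ratio \(r = \sigma_i \,/\, \hat\sigma\); the variable names are chosen for illustration only.

```python
import math

# Bounds on r = sigma_i / sigma_hat, as stated for both model versions.
LOW, HIGH = 1 / 1000, 1000

# v1: r is uniform on [LOW, HIGH], so prior mass is proportional to interval length.
odds_v1 = (HIGH - 1) / (1 - LOW)

# v2: log(r) is uniform on [log(LOW), log(HIGH)], so mass is proportional
# to interval length on the log scale.
odds_v2 = (math.log(HIGH) - math.log(1)) / (math.log(1) - math.log(LOW))

print(f"v1 prior odds of sigma > sigma_hat: {odds_v1:.1f}")
print(f"v2 prior odds of sigma > sigma_hat: {odds_v2:.1f}")
```

Under the v1 prior the odds are 1000 to 1 in favour of \(\sigma_i > \hat\sigma\); under the v2 prior they are even.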
Summary of changes:
- Lower bound of \(\nu\) is 2.5.
- SD is calculated as \(\sigma \sqrt{ \nu / (\nu - 2)}\).
- Effect size is calculated as \((\mu_1 - \mu_2) \big/ \sqrt{(\mathrm{sd}_1^2 + \mathrm{sd}_2^2) \,/\, 2}\).
- \(\log(\sigma_i \,/\, \hat\sigma)\) is uniformly distributed.
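To see how the change in the effect size formula plays out, the sketch below evaluates both versions for a single set of parameter values, such as one posterior draw. It assumes that both groups share the same \(\nu\), which this page does not state explicitly, and the function names are hypothetical.

```python
import math


def effect_size_v1(mu1: float, mu2: float, sigma1: float, sigma2: float) -> float:
    """v1 effect size, based on the scale parameters sigma_i."""
    return (mu1 - mu2) / math.sqrt((sigma1**2 + sigma2**2) / 2)


def effect_size_v2(mu1: float, mu2: float, sigma1: float, sigma2: float, nu: float) -> float:
    """v2 effect size, based on sd_i = sigma_i * sqrt(nu / (nu - 2)).

    Assumes a single nu shared by both groups and nu > 2.
    """
    scale = math.sqrt(nu / (nu - 2))
    sd1, sd2 = sigma1 * scale, sigma2 * scale
    return (mu1 - mu2) / math.sqrt((sd1**2 + sd2**2) / 2)


# Illustrative values, not taken from any data set.
es_v1 = effect_size_v1(1.0, 0.0, 1.0, 1.0)
es_v2 = effect_size_v2(1.0, 0.0, 1.0, 1.0, nu=4.0)
```

Because \(\mathrm{sd}_i \ge \sigma_i\) whenever \(\nu > 2\), the v2 effect size is never larger in magnitude than the v1 value for the same parameters. In this example it drops from 1 to \(1 / \sqrt{2} \approx 0.71\).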
The model for two-group analysis is described by the following sampling statements: