.. _ch-model-history: Model Version History ===================== v1 -- This is the model described in the `original publication `_. The model for two-group analysis is described by the following sampling statements: .. math:: \mu_1 &\sim \text{Normal}(\hat\mu, 1000\, \hat\sigma) \\ \mu_2 &\sim \text{Normal}(\hat\mu, 1000\, \hat\sigma) \\ \sigma_1 &\sim \text{Uniform}(\hat\sigma \,/\, 1000, 1000\, \hat\sigma) \\ \sigma_2 &\sim \text{Uniform}(\hat\sigma \,/\, 1000, 1000\, \hat\sigma) \\ \nu &\sim \text{Exponential}(1\,/\,29) + 1 \\ y_1 &\sim t_\nu(\mu_1, \sigma_1) \\ y_2 &\sim t_\nu(\mu_2, \sigma_2) Where :math:`\hat\mu` and :math:`\hat\sigma` are the sample mean and sample standard deviation of all the data from the two groups. The effect size is calculated as :math:`(\mu_1 - \mu_2) \big/ \sqrt{(\sigma_1^2 + \sigma_2^2) \,/\, 2}`. .. _sec-model-v2: .. _sec-model-latest: v2 -- Version 2 of the model fixes issues about the standard deviation and normality. The standard deviation of a *t* distribution :math:`t_\nu(\mu, \sigma)` is not :math:`\sigma`, but :math:`\sigma \sqrt{\nu / (\nu - 2)}` if :math:`2 < \nu`, and infinite if :math:`1 < \nu \le 2`. Distributions with infinite standard deviation (SD) rarely occur in reality (and never when it comes to humans), so the lower bound of :math:`\nu` is changed from 1 to 2.5. The plots now display SD instead of :math:`\sigma`, and the formula for effect size also uses :math:`\mathrm{sd}_i` instead of :math:`\sigma_i`. *Why is the lower bound of* :math:`\nu` *2.5 and not 2?* The probability density function of :math:`t_2` is quite close to that of :math:`t_{2.5}` in the :math:`\mu \pm 5 \sigma` region, but for :math:`\nu` close to 2, the SD is arbitrarily large because of the strong outliers. Setting a bound of 2.5 prevents strong outliers and extremely large standard deviations. Another change concerns the sampling of :math:`\sigma_i`. In the original model :math:`\sigma_i \,/\, \hat\sigma` was uniformly distributed between :math:`1 \, / \,1000` and :math:`1000`, meaning the *prior* probability of :math:`\sigma > \hat\sigma` was 1000 times that of :math:`\sigma < \hat\sigma`, which caused an overestimation of :math:`\sigma` with low sample sizes (around :math:`N = 5`). To make these probabilities equal, now :math:`\log(\sigma_i \,/\, \hat\sigma)` is distributed uniformly between :math:`\log(1\, / \,1000)` and :math:`\log(1000)`. At :math:`N=25` this change in the prior does not cause a perceptible change in the posterior. *Summary of changes*: - Lower bound of :math:`\nu` is 2.5. - SD is calculated as :math:`\sigma \sqrt{ \nu / (\nu - 2)}`. - Effect size is calculated as :math:`(\mu_1 - \mu_2) \big/ \sqrt{(\mathrm{sd}_1^2 + \mathrm{sd}_2^2) \,/\, 2}`. - :math:`\log(\sigma_i \,/\, \hat\sigma)` is uniformly distributed. The model for two-group analysis is described by the following sampling statements: .. math:: \mu_1 &\sim \text{Normal}(\hat\mu, 1000 \, \hat\sigma) \\ \mu_2 &\sim \text{Normal}(\hat\mu, 1000 \, \hat\sigma) \\ \log(\sigma_1 \,/\, \hat\sigma) &\sim \text{Uniform}(\log(1 \, / \, 1000), \log(1000)) \\ \log(\sigma_2 \,/\, \hat\sigma) &\sim \text{Uniform}(\log(1 \, / \, 1000), \log(1000)) \\ \nu &\sim \text{Exponential}(1\, / \, 27.5) + 2.5 \\ y_1 &\sim t_\nu(\mu_1, \sigma_1) \\ y_2 &\sim t_\nu(\mu_2, \sigma_2) .. Note: if there is a new model, move the _sec-model-latest label to here.