Determining the sign of an effect size is quite similar from Frequentist and Bayesian perspectives


p-values and confidence intervals on an effect size correspond as follows: if p<0.05, the 95% confidence interval does not contain zero (or choose whatever cutoff α and matching 100(1-α)% confidence interval you prefer). The interval is then either entirely above zero or entirely below zero, which is to say we have determined the sign of the effect size (see previous blog entry).
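As a rough illustration of this correspondence, here is a minimal Python sketch. It assumes a z-test with a known standard error rather than a t-test, purely so that it runs on the standard library alone; the function names `z_test` and `sign_claim` are made up for this example.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def z_test(estimate, std_error, alpha=0.05):
    """Two-sided p-value and 100(1-alpha)% confidence interval.

    Assumption: the estimate is normally distributed around the true
    effect size with a known standard error (a z-test, not a t-test).
    """
    z = estimate / std_error
    p_value = 2 * STD_NORMAL.cdf(-abs(z))
    half_width = STD_NORMAL.inv_cdf(1 - alpha / 2) * std_error
    return p_value, (estimate - half_width, estimate + half_width)


def sign_claim(estimate, std_error, alpha=0.05):
    """Return '+', '-', or None when no claim about the sign is made."""
    _, (low, high) = z_test(estimate, std_error, alpha)
    if low > 0:
        return "+"
    if high < 0:
        return "-"
    return None  # interval contains zero: make no claim


if __name__ == "__main__":
    for estimate in (-2.5, 1.0, 2.5):
        p_value, interval = z_test(estimate, 1.0)
        print(estimate, round(p_value, 4), interval, sign_claim(estimate, 1.0))
```

Whenever the p-value falls below α, the interval excludes zero and `sign_claim` returns a sign; otherwise it returns `None`, which is the "no claim" case.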

Clarification: The precise guarantee here is "whatever the effect size may be, we will only make a false claim about its sign with probability at most 0.05." We may make no claim at all, and this is counted as not making a false claim.

Formally, the p-value is a means of rejecting the hypothesis that the effect size is zero, but in practice it is often more than this. Significant p-values, at least those that have an associated confidence interval, allow us to reject fully half of the number line of effect sizes.

Where Frequentists like to talk of p-values, Bayesians like to talk of posterior probabilities. It had always seemed to me that this failed at the first hurdle: trying to replicate the t-test. If we take as H0 that the effect size is zero, and as H1 that the effect size is non-zero and hence drawn from some prior distribution, P(H0|y) and P(H1|y) will depend on the prior distribution associated with H1, with an overly wide prior leading to a smaller P(H1|y). This seems hopelessly subjective. Furthermore, it requires the machinery of measure theory even to represent these peculiar prior beliefs, with a point mass of probability at zero within a continuous distribution.
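To make the dependence on the prior width concrete, here is a small sketch under assumptions that are mine rather than part of the argument above: the estimate is normal with known standard error, H0 and H1 get equal prior probability, and under H1 the effect size is drawn from a normal prior centred on zero with standard deviation `prior_sd`. The function name `posterior_prob_h1` is hypothetical.

```python
from statistics import NormalDist


def posterior_prob_h1(estimate, std_error, prior_sd):
    """P(H1 | y) for H0: effect = 0 versus H1: effect ~ Normal(0, prior_sd^2).

    Assumptions: y ~ Normal(effect, std_error^2), and prior P(H0) = P(H1) = 0.5.
    """
    # Marginal density of the data under each hypothesis.
    density_h0 = NormalDist(0.0, std_error).pdf(estimate)
    marginal_sd = (std_error ** 2 + prior_sd ** 2) ** 0.5
    density_h1 = NormalDist(0.0, marginal_sd).pdf(estimate)
    return density_h1 / (density_h0 + density_h1)


if __name__ == "__main__":
    # Same data (estimate 2.5, standard error 1), increasingly wide priors.
    for prior_sd in (1, 10, 100, 1000):
        print(prior_sd, posterior_prob_h1(2.5, 1.0, prior_sd))
```

With the data held fixed, widening the prior under H1 drives P(H1|y) towards zero, even though the estimate sits 2.5 standard errors from zero.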

But now consider an H1 of an effect size less than zero, and an H2 of an effect size greater than zero. A perfectly natural prior belief is that the distribution of the effect size is symmetric around zero. We no longer need a point mass. This still corresponds to the Frequentist test in that we are attempting to determine the sign of the effect size.

For the t-test, there is a choice of prior* such that the p-value is simply twice the Bayesian posterior probability of the less likely hypothesis.

* Improper, but choose a proper prior to get as close as you like.
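A minimal sketch of this relationship, again using a normal model with known standard error as a stand-in for the t-test (my simplification, so that only the standard library is needed). With a flat prior the posterior for the effect size is Normal(estimate, std_error^2); passing a `prior_sd` instead uses a proper Normal(0, prior_sd^2) prior. The function names are made up for this example.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def two_sided_p(estimate, std_error):
    """Frequentist two-sided p-value for H0: effect = 0 (z-test)."""
    return 2 * STD_NORMAL.cdf(-abs(estimate / std_error))


def posterior_sign_probs(estimate, std_error, prior_sd=None):
    """Return (P(effect < 0 | y), P(effect > 0 | y)).

    prior_sd=None means the improper flat prior; otherwise the prior on the
    effect size is Normal(0, prior_sd^2), which is symmetric around zero.
    """
    if prior_sd is None:
        post_mean, post_sd = estimate, std_error
    else:
        # Standard normal-normal conjugate update.
        shrink = prior_sd ** 2 / (prior_sd ** 2 + std_error ** 2)
        post_mean = shrink * estimate
        post_sd = std_error * shrink ** 0.5
    p_negative = NormalDist(post_mean, post_sd).cdf(0.0)
    return p_negative, 1 - p_negative


if __name__ == "__main__":
    estimate, std_error = 2.0, 1.0
    print("p-value:", two_sided_p(estimate, std_error))
    for prior_sd in (1, 10, 100, None):
        less_likely = min(posterior_sign_probs(estimate, std_error, prior_sd))
        print(prior_sd, 2 * less_likely)
```

Under the flat prior, twice the posterior probability of the less likely sign matches the p-value; with a proper prior the two numbers approach each other as `prior_sd` grows.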

Update: @higherfiveprime notes that Andrew Gelman (of course) and Francis Tuerlinckx have a paper somewhat related to this. Errors in the sign, conditional on having confidently claimed a sign, are referred to as "Type S" errors, and their point is that these are not controlled by the Frequentist procedure. Frequentist "Type I" errors, which are not conditional on a determination of the sign being made, are still controlled.

For Frequentist Type S error control, it appears you need to perform a False Discovery Rate (FDR) correction (e.g. Benjamini & Hochberg's method). So now we also have a nice Bayesian equivalent of FDR control!
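For reference, the Benjamini & Hochberg step-up procedure itself can be sketched in a few lines of Python. This is only the standard FDR procedure applied to a list of p-values, not a demonstration of the Type S claim above; the function name and the example p-values are made up for illustration.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Step-up rule: find the largest rank k (1-based, p-values sorted
    ascending) with p_(k) <= k * q / m, then reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_passing_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank * q / m:
            largest_passing_rank = rank
    rejected = [False] * m
    for index in order[:largest_passing_rank]:
        rejected[index] = True
    return rejected


if __name__ == "__main__":
    # Illustrative p-values only, not real data.
    example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(example, q=0.05))
```

In a sign-determination setting, each rejected hypothesis would then be given the sign of its estimate.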

See also: