Plotting the distributions of confidence intervals on algebraic operators on proportions

Introduction

Elsewhere on this blog we have discussed the distribution of values predicted by confidence intervals, referred to more formally as probability density functions (pdfs).

When we plot confidence intervals, we find the nearest point(s) to our observed value at which the true population value would be just significantly different from that observation (at a given error level).

But we can also plot the distribution of the interval function by varying the error level α from 1 (every difference is significant) to almost zero (asymptotically: nothing is significant). We can then see which expected values are more likely than others given one further piece of information:

  • The upper and lower bounds are computed independently, so the plot effectively assumes that an observed property p is as likely to be greater than the expected property P as less than it, except at the boundary, where one distribution becomes empty.

We can then assess the overall shape, observing how these values tail off, a process that is much more instructive than arguing about ‘p-values’.

We can also see something else.

Most traditional discussions of confidence intervals assume that intervals are approximately Normal, an assumption Wallis (2021: 297) calls the ‘Normal fallacy’. This conceptual error has dogged discussion of confidence intervals in the statistics literature, and deeply affects how people rationalise about intervals.

In my book, I point out that even the simplest interval about the single proportion p cannot be Normal. Instead we discover that it is profoundly shaped by the boundaries of the probabilistic range P = [0, 1]. Elsewhere on this blog, I have developed the implications of this argument and plotted more distributions.

I show the shape of the distribution for the Wilson score interval based on p (Wilson 1927) and other related distributions. Only when n is large and p central does this distribution approximate to the Normal.

In this blog post we will refer to the interval p ∈ (w⁻, w⁺) in terms of a functional notation. Wallis (2021: 111) proposes two Wilson functions with three parameters.

lower bound w⁻ = WilsonLower(p, n, α/2) = p′ – e′,
upper bound w⁺ = WilsonUpper(p, n, α/2) = p′ + e′, (1)

where

$$p' = \frac{p + z_{\alpha/2}^2/2n}{1 + z_{\alpha/2}^2/n}, \quad \text{and} \quad e' = \frac{z_{\alpha/2}\sqrt{p(1 - p)/n + z_{\alpha/2}^2/4n^2}}{1 + z_{\alpha/2}^2/n},$$

where n is the sample size, p = f/n is the observed proportion and α/2 is the error level for each tail. Each bound should be treated separately, although they converge at p where α = 1.
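As a rough illustration, the two Wilson functions of Equation (1) can be sketched in Python. The names `wilson_lower`, `wilson_upper` and the helper `_centre_and_width` are hypothetical, chosen only to mirror WilsonLower and WilsonUpper, and the critical value zα/2 is assumed to be the standard Normal deviate for a one-tailed error level α/2.

```python
from math import sqrt
from statistics import NormalDist


def _centre_and_width(p: float, n: float, alpha_2: float) -> tuple[float, float]:
    """Return (p', e') for proportion p, sample size n and tail error alpha_2."""
    # Assumption: z is the one-tailed critical value of the standard Normal.
    z = NormalDist().inv_cdf(1.0 - alpha_2)
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2.0 * n)) / denom
    width = z * sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return centre, width


def wilson_lower(p: float, n: float, alpha_2: float) -> float:
    """Lower bound w- = p' - e'."""
    centre, width = _centre_and_width(p, n, alpha_2)
    return centre - width


def wilson_upper(p: float, n: float, alpha_2: float) -> float:
    """Upper bound w+ = p' + e'."""
    centre, width = _centre_and_width(p, n, alpha_2)
    return centre + width
```

With α = 1 (so α/2 = 0.5) the critical value is zero and both functions return p, which is the convergence described above.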

Plotting the pairwise ϕ interval distribution

Introduction

Now that we are able to plot pdfs for the Newcombe-Wilson difference interval, we can employ the algebraic method of Wallis (2021: 225) to compute the equivalent pdfs for 2 × 2 ϕ. This is a well-known signed measure of association which may be defined as

$$\text{signed } 2 \times 2\ \phi \equiv \frac{ad - bc}{\sqrt{(a + b)(c + d)(a + c)(b + d)}}, \tag{1}$$

for a table [[a, b], [c, d]]. For larger contingency tables, an unsigned ϕ score due to Harald Cramér (1946) may be computed from

$$\text{Cramér's } \phi = \sqrt{\frac{\chi^2}{n(k - 1)}}, \tag{2}$$

where χ² is the statistic from the r × c test for homogeneity (independence), n is the total frequency in the table, and k the minimum number of values of variables X and Y, i.e. k = min(r, c).
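Equations (1) and (2) can be compared with a short sketch. The function names `signed_phi`, `chi_square` and `cramers_phi` are hypothetical, and the χ² computation is the ordinary Pearson statistic with no continuity correction, which is an assumption about what Equation (2) intends.

```python
from math import sqrt


def signed_phi(a: float, b: float, c: float, d: float) -> float:
    """Signed 2 x 2 phi for the table [[a, b], [c, d]]."""
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / sqrt(denom)


def chi_square(table: list[list[float]]) -> float:
    """Pearson chi-square for an r x c table (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            total += (observed - expected) ** 2 / expected
    return total


def cramers_phi(table: list[list[float]]) -> float:
    """Unsigned Cramér's phi, with k = min(r, c)."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0]))
    return sqrt(chi_square(table) / (n * (k - 1)))
```

For a 2 × 2 table the two measures should agree in magnitude, so `cramers_phi([[a, b], [c, d]])` can be checked against `abs(signed_phi(a, b, c, d))`.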

In Confidence intervals on pairwise ϕ statistics, I note a useful proof I found in (Wallis 2012). It turns out that if we let d(x1) = p(x1 | y2) – p(x1 | y1) represent the difference in proportions on the y axis of a 2 × 2 table, and d(y1) represent the difference in proportions on the x axis, the following equality holds.

$$\phi(X, Y)^2 = -d(x_1) \times -d(y_1) = d(x_1) \times d(y_1). \tag{3}$$

Another way of saying this is that 2 × 2 ϕ is the geometric mean of d(x1) and d(y1). The equation has three implications:

  1. There is a strict monotonic relationship between ϕ, d(x1) and d(y1). In fact, with d = p2 – p1 on both axes, the relationship between ϕ and either d score is negatively monotonic (ϕ increases as d decreases), whereas the relationship between d scores is positively monotonic.
  2. This equality must apply to the interval bounds for ϕ as well as to any observed ϕ score.
  3. Geometric means are not conventionally applied to negative numbers, and if any point is zero we obtain a discontinuity.
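The equality in Equation (3) can be checked numerically. The sketch below assumes a particular layout for the table [[a, b], [c, d]]: rows are y1 and y2, columns are x1 and x2. That layout, and the function names, are illustrative assumptions rather than anything fixed by the text.

```python
from math import sqrt


def signed_phi(a: float, b: float, c: float, d: float) -> float:
    """Signed 2 x 2 phi for the table [[a, b], [c, d]]."""
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))


def differences(a: float, b: float, c: float, d: float) -> tuple[float, float]:
    """Return (d(x1), d(y1)), assuming rows are y1, y2 and columns x1, x2."""
    # d(x1) = p(x1 | y2) - p(x1 | y1): compare the two rows.
    d_x1 = c / (c + d) - a / (a + b)
    # d(y1) = p(y1 | x2) - p(y1 | x1): compare the two columns.
    d_y1 = b / (b + d) - a / (a + c)
    return d_x1, d_y1


def phi_from_differences(d_x1: float, d_y1: float) -> float:
    """Recover signed phi as minus the sign of d(x1) times the geometric mean."""
    sign = 1.0 if d_x1 > 0 else -1.0
    return -sign * sqrt(d_x1 * d_y1)
```

Both differences carry the same sign for any one table, so their product is non-negative and the square root is defined for an observed table.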

In this earlier post and my book, I used these observations to derive an interval on 2 × 2 ϕ as

$$w_d^-(\phi) = -\mathrm{sign}(d^+(x_1))\sqrt{d^+(y_1) \times d^+(x_1)}, \text{ and}$$
$$w_d^+(\phi) = -\mathrm{sign}(d^-(x_1))\sqrt{d^-(y_1) \times d^-(x_1)}, \tag{4}$$

where sign(x) returns +1 for x > 0 and –1 otherwise, d⁻(x1) represents the lower interval bound for d(x1), d⁺(x1) the upper bound, etc. (For more discussion on the derivation see Confidence intervals on pairwise ϕ statistics.)

Crucially, these bounds are expressed relative to the difference d, rather than centred on zero. In other words, d⁻(x1) is the potential population difference less than an observed d(x1) that is just significantly different from it at a given error level α.

To convert zero-based Newcombe-Wilson interval bounds to relative bounds, we simply subtract them from d:

$$d \in (d^-, d^+) = d - (w_d^-, w_d^+) = (d - w_d^+, d - w_d^-), \tag{5}$$

where (wd⁻, wd⁺) is the Newcombe-Wilson interval for d.
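Putting Equations (4) and (5) together, a ϕ interval for one table might be sketched as follows. This is a minimal sketch under stated assumptions: rows of [[a, b], [c, d]] are taken as y1, y2 and columns as x1, x2; the Newcombe-Wilson bounds follow Newcombe (1998) without continuity correction; and returning NaN when the two bounds differ in sign is a placeholder choice, since the handling of that case is not described here. All function names are hypothetical.

```python
from math import nan, sqrt
from statistics import NormalDist


def wilson(p: float, n: float, alpha_2: float) -> tuple[float, float]:
    """Wilson score interval (w-, w+) for a single proportion."""
    z = NormalDist().inv_cdf(1.0 - alpha_2)
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2.0 * n)) / denom
    width = z * sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return centre - width, centre + width


def relative_bounds(f1, n1, f2, n2, alpha):
    """Bounds (d-, d+) about d = p2 - p1, following Equation (5)."""
    p1, p2 = f1 / n1, f2 / n2
    w1_lo, w1_hi = wilson(p1, n1, alpha / 2.0)
    w2_lo, w2_hi = wilson(p2, n2, alpha / 2.0)
    # Zero-based Newcombe-Wilson interval (wd-, wd+).
    wd_lo = -sqrt((p1 - w1_lo) ** 2 + (w2_hi - p2) ** 2)
    wd_hi = sqrt((w1_hi - p1) ** 2 + (p2 - w2_lo) ** 2)
    diff = p2 - p1
    return diff - wd_hi, diff - wd_lo


def sign(x: float) -> float:
    """+1 for x > 0, and -1 otherwise."""
    return 1.0 if x > 0 else -1.0


def phi_interval(a, b, c, d, alpha=0.05):
    """Interval on signed 2 x 2 phi, following Equation (4)."""
    # Assumed layout: rows are y1, y2; columns are x1, x2.
    dx_lo, dx_hi = relative_bounds(a, a + b, c, c + d, alpha)  # d(x1)
    dy_lo, dy_hi = relative_bounds(a, a + c, b, b + d, alpha)  # d(y1)

    def bound(dx: float, dy: float) -> float:
        product = dy * dx
        if product < 0.0:
            return nan  # placeholder: bounds of opposite sign, case not covered
        return -sign(dx) * sqrt(product)

    # The lower phi bound uses the upper d bounds, and vice versa.
    return bound(dx_hi, dy_hi), bound(dx_lo, dy_lo)
```

Calling `phi_interval(20, 10, 5, 25)` returns a pair of bounds that should bracket the observed ϕ for that table. Repeating the call over a range of α values gives the bound positions needed to trace the distribution.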

Now that we can compute the pdf for the NW interval, we can also compute the pdf for this new 2 × 2 ϕ interval. We calculate the difference interval d for both axes on a 2 × 2 table for varying ϕ scores, combine both difference intervals into a ϕ interval and then apply delta approximation to plot the curve.

Plotting the Newcombe-Wilson distribution

Introduction

In a previous post, Plotting the Wilson distribution, we saw how the probability density function (pdf) for Wilson score intervals (colloquially, ‘Wilson distributions’) could be estimated using delta approximation. See also Wallis (2021: 297). These curves were often far from Normal (the bell-curve, Gaussian) in shape, being squeezed into the probability range P = [0, 1]. Notably, we showed that the same function is, however, very nearly Gaussian on a logit scale.

We also noted that whereas this ‘Wilson distribution’ adopted a continuous (often asymmetric) curve based on p, it should properly be considered as two separate distributions, one for each bound, w⁻ and w⁺, each meeting at p. This insight became clearer and more important when continuity-corrected Wilson and Clopper-Pearson distributions were considered. Continuity-corrected distributions began at p ± 1/(2n), and CP distributions similarly include an ‘excluded middle’.

In this post we explore the corresponding distributions (pdfs) for the Newcombe-Wilson interval. Robert Newcombe (1998) created a difference interval centred on zero using a method of summing independent variances at opposite bounds:

$$(w_d^-, w_d^+) = \left(-\sqrt{(p_1 - w_1^-)^2 + (w_2^+ - p_2)^2},\ \sqrt{(w_1^+ - p_1)^2 + (p_2 - w_2^-)^2}\right), \tag{1}$$

where (wi⁻, wi⁺) are the Wilson interval bounds for pi, i ∈ {1, 2}, obtained by Equation (2) or (3), dropping subscript indexes for simplicity.

w⁻ = WilsonLower(p, n, α/2), and w⁺ = WilsonUpper(p, n, α/2). (2)

With a correction for continuity,

w⁻cc = WilsonLower(max(p – 1/(2n), 0), n, α/2), and w⁺cc = WilsonUpper(min(p + 1/(2n), 1), n, α/2). (3)

These Wilson functions may be defined in the following way.

WilsonLower(p, n, α/2) ≡ w⁻ = p′ – e′, and WilsonUpper(p, n, α/2) ≡ w⁺ = p′ + e′, (4)

where

$$p' = \frac{p + z_{\alpha/2}^2/2n}{1 + z_{\alpha/2}^2/n}, \quad \text{and} \quad e' = \frac{z_{\alpha/2}\sqrt{p(1 - p)/n + z_{\alpha/2}^2/4n^2}}{1 + z_{\alpha/2}^2/n}.$$

See also Wallis (2021: 109-111).
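Equations (1) to (4) of this post can be combined into a short sketch of the zero-based Newcombe-Wilson interval, with the continuity correction as an option. The function names and the `cc` flag are hypothetical, and the critical value is assumed to be the one-tailed standard Normal deviate for α/2.

```python
from math import sqrt
from statistics import NormalDist


def _wilson_bound(p: float, n: float, alpha_2: float, upper: bool) -> float:
    """Return p' + e' if upper is True, otherwise p' - e' (Equation 4)."""
    z = NormalDist().inv_cdf(1.0 - alpha_2)
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2.0 * n)) / denom
    width = z * sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return centre + width if upper else centre - width


def wilson_interval(p: float, n: float, alpha_2: float, cc: bool = False):
    """Wilson bounds (w-, w+): Equation (2), or Equation (3) when cc is True."""
    shift = 1.0 / (2.0 * n) if cc else 0.0
    lower = _wilson_bound(max(p - shift, 0.0), n, alpha_2, upper=False)
    upper = _wilson_bound(min(p + shift, 1.0), n, alpha_2, upper=True)
    return lower, upper


def newcombe_wilson(p1, n1, p2, n2, alpha=0.05, cc=False):
    """Zero-based difference interval (wd-, wd+) of Equation (1)."""
    w1_lo, w1_hi = wilson_interval(p1, n1, alpha / 2.0, cc)
    w2_lo, w2_hi = wilson_interval(p2, n2, alpha / 2.0, cc)
    # Sum the squared inner widths at opposite bounds, then take the root.
    wd_lo = -sqrt((p1 - w1_lo) ** 2 + (w2_hi - p2) ** 2)
    wd_hi = sqrt((w1_hi - p1) ** 2 + (p2 - w2_lo) ** 2)
    return wd_lo, wd_hi
```

An observed difference d = p2 – p1 falling outside the returned interval would be judged significantly different from zero at the chosen α. Setting `cc=True` should widen the interval, since each Wilson bound is computed from a proportion shifted outwards by 1/(2n).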