Logistic regression with Wilson intervals

Introduction

Back in 2010 I wrote a short article on the logistic (‘S’) curve in which I described its theoretical justification, mathematical properties and relationship to the Wilson score interval. That article made two key points.

  • We can map any set of independent probabilities p ∈ [0, 1] to a flat Cartesian space using the inverse logistic (‘logit’) function, defined as
    • logit(p) ≡ log(p / (1 – p)) = log(p) – log(1 – p),
    • where ‘log’ is the natural logarithm and logit(p) ∈ [-∞, ∞].
  • By performing this transformation
    • the logistic curve in probability space becomes a straight line in logit space, and
    • Wilson score intervals for p ∈ (0, 1) are symmetrical in logit space, i.e. logit(p) – logit(w⁻) = logit(w⁺) – logit(p).
Logistic curve (k = 1) with Wilson score intervals for n = 10, 100.
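
Both points can be checked numerically. The following is a minimal Python sketch, not code from the original article. It assumes the logistic curve is parameterised as p = 1 / (1 + e^(–k(x – m))) with slope k and midpoint m, and that the intervals use z ≈ 1.96 (a 95% level). The function names and the sample values of p are illustrative choices.

```python
import math

def logit(p):
    """Inverse logistic function, using the natural logarithm; needs 0 < p < 1."""
    return math.log(p) - math.log(1 - p)

def logistic(x, k=1.0, m=0.0):
    """Logistic curve. The slope k and midpoint m are an assumed parameterisation."""
    return 1 / (1 + math.exp(-k * (x - m)))

def wilson(p, n, z=1.959964):
    """Wilson score interval (w-, w+) for an observed proportion p out of n cases.

    z = 1.959964 is the two-tailed critical value for a 95% level (an assumption).
    """
    denom = 1 + z * z / n
    centre = p + z * z / (2 * n)
    spread = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - spread) / denom, (centre + spread) / denom

# Point 1: the logistic curve becomes the straight line k(x - m) in logit space.
for x in (-2.0, 0.0, 1.5):
    print(x, logit(logistic(x, k=1.0)))

# Point 2: Wilson intervals are symmetrical about logit(p) in logit space.
for n in (10, 100):
    for p in (0.1, 0.3, 0.5, 0.9):
        lower, upper = wilson(p, n)
        lower_gap = logit(p) - logit(lower)
        upper_gap = logit(upper) - logit(p)
        print(n, p, lower_gap, upper_gap)
```

The two gaps printed on each line should agree to within floating-point rounding. The check only applies to p strictly between 0 and 1, since logit(p) is infinite at the extremes.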


