Impossible logistic multinomials

Introduction

Recently, a number of linguists have begun to question the wisdom of assuming that linguistic change tends to follow an ‘S-curve’ or, more properly, logistic pattern. For example, Nevalainen (2015) offers a series of empirical observations that show that whereas data sometimes follows a continuous ‘S’, frequently it does not. In this short article I try to explain why this result should not be surprising.

The fundamental assumption of logistic regression is that a probability representing a true fraction, or share, of a quantity undergoing a continuous process of change by default follows a logistic pattern. This is a reasonable assumption in certain limited circumstances because an ‘S-curve’ is mathematically analogous to a straight line (cf. Newton’s first law of motion).

Regression is a set of computational methods that attempts to find the closest match between an observed set of data and a function, such as a straight line, a polynomial, a power curve or, in this case, an S-curve. We say that the logistic curve is the underlying model we expect data to be matched against (regressed to). In another post, I comment on the feasibility of employing Wilson score intervals in an efficient logistic regression algorithm.
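To make the idea of regressing to an S-curve concrete, here is a minimal sketch that fits a logistic curve by ordinary least squares in logit space. It is not the Wilson-based algorithm mentioned above: the function names, the parameters k (slope) and x0 (midpoint), and the noise-free data points are all assumptions made for illustration.

```python
import math

def logit(p):
    """Inverse logistic function; defined for 0 < p < 1."""
    return math.log(p / (1 - p))

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_logistic(xs, ps):
    """Fit p = 1 / (1 + exp(-k (x - x0))) by regressing logit(p) on x."""
    slope, intercept = fit_line(xs, [logit(p) for p in ps])
    return slope, -intercept / slope  # (k, x0)

# Hypothetical, noise-free observations lying on a curve with k = 0.5, x0 = 3.
xs = [0, 1, 2, 3, 4, 5, 6]
ps = [1 / (1 + math.exp(-0.5 * (x - 3))) for x in xs]
k, x0 = fit_logistic(xs, ps)
```

With real data the observed proportions carry sampling error, and any p of exactly 0 or 1 has no finite logit, so an unweighted fit like this one is only a starting point.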

We have already noted that change is assumed to be continuous, which implies that the input variable (x) is real and linear, such as time (and not e.g. probabilistic). In this post we discuss different outcome variable types. What are the ‘limited circumstances’ in which logistic regression is mathematically coherent?

  • We assume probabilities are free to vary from 0 to 1.
  • The envelope of variation must be constant, i.e. it must always be possible for an observed probability to reach 1.

Taken together this also means that probabilities are Binomial, not multinomial. Let us discuss what this implies.
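A small numerical sketch may help to show why the two conditions point to the Binomial case. It is an illustration under assumed parameters (unit slope, midpoints at 0 and 2), not the argument of the full post. With two outcomes, the complement of a logistic curve is itself logistic. With three outcomes, if two shares each follow a logistic curve that is free to reach 1, the remaining share is eventually forced below zero.

```python
import math

def logistic(x, k=1.0, x0=0.0):
    """Logistic curve with slope k and midpoint x0."""
    return 1.0 / (1.0 + math.exp(-k * (x - x0)))

def complement_is_logistic(x):
    """Binomial case: 1 - logistic(x) equals the logistic curve with slope -k."""
    return abs((1.0 - logistic(x)) - logistic(x, k=-1.0)) < 1e-12

def third_share(x):
    """Three outcomes: what is left over if two shares are both rising logistic curves."""
    p1 = logistic(x, x0=0.0)  # assumed midpoint
    p2 = logistic(x, x0=2.0)  # assumed midpoint
    return 1.0 - p1 - p2

binomial_ok = all(complement_is_logistic(x) for x in [-4, -2, 0, 2, 4])
early = third_share(-5)  # still a valid probability
late = third_share(5)    # negative, so not a probability at all
```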

Logistic regression with Wilson intervals

Introduction

Back in 2010 I wrote a short article on the logistic (‘S’) curve in which I described its theoretical justification, mathematical properties and relationship to the Wilson score interval. That article made two key observations.

  • We can map any set of independent probabilities p ∈ [0, 1] to a flat Cartesian space using the inverse logistic (‘logit’) function, defined as
    • logit(p) ≡ log(p / (1 – p)) = log(p) – log(1 – p),
    • where ‘log’ is the natural logarithm and logit(p) ∈ [-∞, ∞].
  • By performing this transformation
    • the logistic curve in probability space becomes a straight line in logit space, and
    • Wilson score intervals for p ∈ (0, 1) are symmetrical in logit space, i.e. logit(p) – logit(w⁻) = logit(w⁺) – logit(p).

Logistic curve (k = 1) with Wilson score intervals for n = 10, 100.
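Both properties can be checked numerically. The sketch below assumes the standard Wilson score interval formula without continuity correction and z ≈ 1.96 for a 95% interval. The function names and the example values of p and n are mine.

```python
import math

def logit(p):
    """Inverse logistic function; defined for 0 < p < 1."""
    return math.log(p / (1 - p))

def logistic(x, k=1.0):
    """Logistic curve with slope k, centred on x = 0."""
    return 1 / (1 + math.exp(-k * x))

def wilson(p, n, z=1.959964):
    """Wilson score interval (w-, w+) for observed proportion p and sample size n."""
    centre = p + z * z / (2 * n)
    spread = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    denom = 1 + z * z / n
    return (centre - spread) / denom, (centre + spread) / denom

# The logistic curve is a straight line in logit space: logit(logistic(x)) = k x.
line_error = max(abs(logit(logistic(x, k=1.0)) - x) for x in [-3, -1, 0, 1, 3])

# Wilson intervals are symmetrical about logit(p) in logit space.
p, n = 0.3, 10
w_minus, w_plus = wilson(p, n)
lower_width = logit(p) - logit(w_minus)
upper_width = logit(w_plus) - logit(p)
```

The two widths agree up to floating-point error, even though the interval is visibly asymmetric about p on the probability scale.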


Competition between choices over time

Introduction

Measuring choices over time implies examining competition between alternates.

This is a fairly obvious statement. However, some of the mathematical properties of this system are less well known. These inform the expected behaviour of observations, helping us correctly specify null hypotheses.

  • The proportion of {shall, will} utterances where shall is chosen, p(shall | {shall, will}), is in competition with the alternative probability of will (they are mutually exclusive) and bounded on a probabilistic scale.
  • The probability associated with each member of a set of alternates X = {xi}, which we might write as p(xi | X), is bounded, 0 ≤ p(xi | X) ≤ 1, and exhaustive, Σp(xi | X) = 1.
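As a minimal sketch of these two constraints, the snippet below turns a table of frequencies into the shares p(xi | X). The counts are invented purely for illustration and the function name is an assumption.

```python
def shares(counts):
    """Convert frequencies of mutually exclusive alternates into shares p(xi | X)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no observations: shares are undefined")
    return {alternate: count / total for alternate, count in counts.items()}

# Made-up counts of shall and will, not taken from any corpus.
counts = {"shall": 30, "will": 70}
p = shares(counts)

# Bounded: every share lies in [0, 1].
bounded = all(0.0 <= value <= 1.0 for value in p.values())
# Exhaustive: the shares sum to 1 (up to floating-point error).
exhaustive = abs(sum(p.values()) - 1.0) < 1e-12
```

Because the shares must sum to 1, any rise in p(shall | {shall, will}) is matched by an equal fall in the share of will.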

A bounded system behaves differently from an unbounded one. Every child knows that a ball bouncing in an alley behaves differently than in an open playground. ‘Walls’ direct motion toward the centre.

In this short paper we discuss two properties of competitive choice:

  1. the tendency for change to be S-shaped rather than linear, and
  2. how this has an impact on confidence intervals.