 # Impossible logistic multinomials

### Introduction

Recently, a number of linguists have begun to question the wisdom of assuming that linguistic change tends to follow an ‘S-curve’ or more properly, logistic, pattern. For example, Nevalianen (2015) offers a series of empirical observations that show that whereas data sometimes follows a continuous ‘S’, frequently this does not happen. In this short article I try to explain why this result should not be surprising.

The fundamental assumption of logistic regression is that a probability representing a true fraction, or share, of a quantity undergoing a continuous process of change by default follows a logistic pattern. This is a reasonable assumption in certain limited circumstances because an ‘S-curve’ is mathematically analogous to a straight line (cf. Newton’s first law of motion).

Regression is a set of computational methods that attempts to find the closest match between an observed set of data and a function, such as a straight line, a polynomial, a power curve or, in this case, an S-curve. We say that the logistic curve is the underlying model we expect data to be matched against (regressed to). In another post, I comment on the feasibility of employing Wilson score intervals in an efficient logistic regression algorithm.

We have already noted that change is assumed to be continuous, which implies that the input variable (x) is real and linear, such as time (and not e.g. probabilistic). In this post we discuss different outcome variable types. What are the ‘limited circumstances’ in which logistic regression is mathematically coherent?

• We assume probabilities are free to vary from 0 to 1.
• The envelope of variation must be constant, i.e. it must always be possible for an observed probability to reach 1.

Taken together this also means that probabilities are Binomial, not multinomial. Let us discuss what this implies. Continue reading

# Competition between choices over time

### Introduction Paper (PDF)

Measuring choices over time implies examining competition between alternates.

This is a fairly obvious statement. However, some of the mathematical properties of this system are less well known. These inform the expected behaviour of observations, helping us correctly specify null hypotheses.

• The proportion of {shall, will} utterances where shall is chosen, p(shall | {shall, will}), is in competition with the alternative probability of will (they are mutually exclusive) and bounded on a probabilistic scale.
• The probability associated with each member of a set of alternates X = {xi}, which we might write as p(xi | X), is bounded, 0 ≤ p(xi | X) ≤ 1, and exhaustive, Σp(xi | X) = 1.

A bounded system behaves differently from an unbounded one. Every child knows that a ball bouncing in an alley behaves differently than in an open playground. ‘Walls’ direct motion toward the centre.

In this short paper we discuss two properties of competitive choice:

1. the tendency for change to be S-shaped rather than linear, and
2. how this has an impact on confidence intervals. Continue reading