Not everything that looks like a probability is.

The fact that a variable or function ranges from 0 to 1 does not mean that it behaves like a unitary probability over that range.

### Natural probabilities

What we might term a **natural** probability is a proper fraction of two frequencies, which we might write as *p* = *f* / *n*.

- Provided that *f* can be any value from 0 to *n*, *p* can range from 0 to 1.
- In this formula, *f* and *n* must also be natural frequencies, that is, *n* stands for the size of the set of all cases, and *f* the size of a true subset of these cases.
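As a minimal sketch of the two conditions above, the following Python function computes *p* = *f* / *n* and rejects inputs that cannot be natural frequencies. The function name `natural_probability` is hypothetical, introduced only for illustration. Note that the check can only confirm that 0 ≤ *f* ≤ *n*; whether the *f* cases really are a subset of the *n* cases is a matter of how the data were counted, and no arithmetic test can establish it.

```python
def natural_probability(f: int, n: int) -> float:
    """Return p = f / n for frequencies f and n (hypothetical helper).

    Assumes f counts a subset of the n cases; only the numeric
    consequence of that assumption (0 <= f <= n) can be checked here.
    """
    if n <= 0:
        raise ValueError("n must be a positive count of cases")
    if not 0 <= f <= n:
        raise ValueError("f must lie between 0 and n inclusive")
    return f / n


print(natural_probability(0, 10))   # lowest possible value of p
print(natural_probability(10, 10))  # highest possible value of p
```

A ratio such as 11 / 10 raises an error rather than silently returning a value above 1, which is the simplest symptom of *f* not being drawn from the same set as *n*.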

This natural probability is expected to be a Binomial variable, and the formulae for *z* tests, χ² tests, Wilson intervals, etc., as well as logistic regression and similar methods, may be legitimately applied to such variables. The Binomial distribution is the expected distribution of such a variable if each observation is drawn independently at random from the population (an assumption that is not strictly true with corpus data).

Another way of putting this is that a Binomial variable expresses the number of individual events of Type A in a situation where an outcome of either A or B is possible. If we observe, say, that 8 out of 10 cases are of Type A, then we can say we have an observed probability of A being chosen, *p*(A | {A, B}), of 0.8. In this case, *f* is the frequency of A (8), and *n* the frequency of both A and B (10). See Wallis (2013a).
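To make the 8-out-of-10 example concrete, here is a hedged sketch that computes the observed probability and a 95% Wilson score interval for it. It uses the standard Wilson formula without continuity correction and assumes independent random sampling, which, as noted above, is not strictly true of corpus data. The function name `wilson_interval` and the default error level `alpha = 0.05` are assumptions made for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(f: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Wilson score interval (no continuity correction) for p = f / n.

    Hypothetical helper: assumes f and n are natural frequencies and
    that observations are sampled independently at random.
    """
    if n <= 0 or not 0 <= f <= n:
        raise ValueError("need n > 0 and 0 <= f <= n")
    p = f / n
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value of z
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# f = 8 cases of Type A out of n = 10 cases of A or B
lower, upper = wilson_interval(8, 10)
print(f"p = {8 / 10:.1f}, 95% Wilson interval = ({lower:.4f}, {upper:.4f})")
```

With so few cases the interval is wide, roughly 0.49 to 0.94, and it is asymmetric about 0.8 because *p* cannot exceed 1.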