Inferential statistics – and other animals

Introduction

Inferential statistics is a methodology of extrapolation from data. It rests on a mathematical model which allows us to predict values in the population based on observations in a sample drawn from that population.

Central to this methodology is the idea of reporting not just the observation itself but also the certainty of that observation. In some cases we can observe the population directly and make statements about it.

  • We can cite the 10 most frequent words in Shakespeare’s First Folio with complete certainty (allowing for spelling variations). Such statements would simply be facts.
  • Similarly, we could take a corpus like ICE-GB and report that in it, there are 14,275 adverbs ending in -ly out of 1,061,263 words.

Provided that we limit the scope of our remarks to the corpus itself, we do not need to worry about degrees of certainty because these statements are simply facts. Statements about the corpus are sometimes called descriptive statistics (the word statistic here being used in its most general sense, i.e. a number).
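To make the distinction concrete, here is a minimal sketch in Python using the ICE-GB counts quoted above. The first figure is a plain descriptive statistic about the corpus. The second step is an assumption added for illustration and is not taken from the text: it treats the corpus as if it were a random sample of words from a wider population and attaches a Wilson score interval to the observed proportion. The function name `wilson_interval` and the 95% level are illustrative choices, and the idea that each word is an independent draw is an idealisation.

```python
import math

def wilson_interval(hits, n, z=1.959964):
    """Wilson score interval for a proportion hits/n (z defaults to approx. 95%)."""
    p = hits / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width

# Counts reported for ICE-GB in the text.
ly_adverbs = 14275
total_words = 1061263

# Descriptive statistic: a fact about this corpus, with no uncertainty attached.
proportion = ly_adverbs / total_words

# Inferential step (assumption): treat the corpus as a random sample of words.
lower, upper = wilson_interval(ly_adverbs, total_words)

print(f"Proportion in the corpus: {proportion:.5f}")
print(f"Interval if treated as a sample: ({lower:.5f}, {upper:.5f})")
```

The proportion alone is all we need if our claim is about ICE-GB itself. The interval only becomes relevant once we extrapolate beyond the corpus, and it is only as trustworthy as the sampling assumption behind it.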

Robust and sound?

When we carry out experiments and perform statistical tests we have two distinct aims.

  1. To form statistically robust conclusions about empirical data.
  2. To make logically sound arguments about experimental conclusions.

Robustness is essentially an inductive mathematical or statistical issue.

Soundness is a deductive question of experimental design and reporting.

Robust conclusions are those that are likely to be replicated if another researcher performed the same experiment with different data sampled in much the same way. Sound arguments distinguish between what we can legitimately infer from our data, and the hypothesis we may wish to test.