cramer’s phi – corp.ling.stats

Confidence intervals for Cohen’s h

February 28, 2024 | April 22, 2024 | Sean

1. Introduction

Cohen’s h (Cohen, 2013) is an effect size for the difference of two independent proportions that is sometimes cited in the literature. h ranges between minus and plus pi, i.e. h ∈ [–π, π].

Jacob Cohen suggests that if |h| > 0.2, this is a ‘small effect size’, if |h| > 0.5, it is ‘medium’, and if |h| > 0.8 it is ‘large’. This conventional application of effect sizes – as a descriptive method for distinguishing sizes – is widespread.

The score is defined as the difference between the arcsine transforms of the square roots of two Binomial proportions p_i, for i ∈ {1, 2}. Each transformed value lies in [0, π], hence the expanded range of the difference, ±π.

That is,

h = ψ(p₁) – ψ(p₂), (1)

where the transform function ψ(p) is defined as

ψ(p) = 2 arcsin(√p). (2)
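Equations (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration rather than code from the post: the function names `psi`, `cohens_h` and `describe` are my own, and the label "below small" is a placeholder for scores under Cohen's lowest threshold.

```python
import math

def psi(p):
    """Arcsine transform of Equation (2): psi(p) = 2 * arcsin(sqrt(p))."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion in [0, 1]")
    return 2.0 * math.asin(math.sqrt(p))

def cohens_h(p1, p2):
    """Cohen's h of Equation (1): psi(p1) - psi(p2)."""
    return psi(p1) - psi(p2)

def describe(h):
    """Cohen's descriptive thresholds applied to |h|."""
    size = abs(h)
    if size > 0.8:
        return "large"
    if size > 0.5:
        return "medium"
    if size > 0.2:
        return "small"
    return "below small"  # placeholder label, not Cohen's term

if __name__ == "__main__":
    h = cohens_h(0.6, 0.4)
    print(h, describe(h))
```

The extremes behave as the stated range requires: `cohens_h(1, 0)` gives π and `cohens_h(0, 1)` gives –π.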

In this blog post I will explain how to derive an accurate confidence interval for this property h. The benefits of doing so are multiple.

We can plot h scores with intervals, so we can visualise the reliability of their estimate, pay attention to the smallest bound, etc.
We can compare two scores, h₁ and h₂, for significant difference. In other words, we can conclude that h₂ > h₁, or vice versa.
We can reinterpret ‘large’ and ‘small’ effects for statistical power.
We can consider whether an inner bound exceeds one of Cohen’s thresholds. Thus, if h is positive and its lower bound h⁻ > 0.5, we can report that the likely population score is at least a ‘medium’ effect.

An absolute (unsigned and non-directional) version, |h|, is sometimes cited. We can also compute intervals for unsigned |h|. We will return to this question later.


Directional evidence revisited

June 16, 2022 | March 12, 2024 | Sean

End weight bias and templating in conjoined phrase postmodification

Abstract

The tendency of speakers and writers to place larger constructions at the end of sentences, whether consciously or unconsciously, is well established. This question of ‘end weight’ is usually discussed in relation to grammatical transformations. In this short paper we demonstrate a simple method for investigating a similar phenomenon in coordination patterns where conjoins are either noun phrases, e.g. the X of Y or Z, or prepositional phrases, e.g. the X of Y or of Z. We then investigate whether the coordinated noun phrases (Y, Z) are themselves postmodified, either by another prepositional phrase or by a clause. As postmodifying phrases and clauses are potentially expansive, they are grammatically complex and we operationalise them as signifiers of ‘weight’. We find that both sets of coordination patterns are end-sequence biased by weight.

We also find that patterns where both the first and last conjoins in the sequence are postmodified occur more frequently than would be expected were they independently selected. Setting aside potential explanations of directional influence, which cannot be decided inductively, we focus instead on the content of these doubly-postmodified constructions and examine them for evidence of templating, i.e. lexical-syntactic repetition.

We also show that these results are not explicable by semantic ordering in coordination, and contrast evidence from prepositional and clausal postmodification with that from premodifying adjective phrases, where scope ambiguity may also be a factor.


Confidence intervals on goodness of fit ϕ scores

September 8, 2021 | March 10, 2024 | Sean

Introduction

In Wallis (2021), I offered two approaches to computing confidence intervals on the effect size Cramér’s ϕ. I also motivated and summarised approaches to a comparable goodness of fit metric (where a high ϕ score reflects a greater difference and thus a ‘poor fit’).

A goodness of fit evaluation is one where we compare an observed distribution of k cells, say, with an expected distribution of the same number of cells. The test, which is a type of χ² test, has a number of applications. A goodness of fit ϕ score would be expected to range from 0 to 1, with 0 representing identity and 1 representing the opposite, a maximally distinct distribution.

In an earlier paper published on this blog (Wallis 2012), I considered a range of possible measures that had this property. However, one of the questions I had left unresolved was how to compute a confidence interval on such a measure.

Why might we want to do this?

To cite or plot measures with confidence intervals, identifying the level of certainty we can ascribe to a particular observed measure.
To compare ϕ with an arbitrary level, e.g. to test if ϕ ≠ D where D ≠ 0. (As we shall see, where k > 2 and ϕ unsigned, comparing goodness of fit ϕ with 0 is more difficult due to loss of information, and you should employ a goodness of fit test instead.)
To compare two ϕ scores for their significant difference in a given direction, e.g. to establish that, say, ϕ₁ > ϕ₂.

Summing independent, dependent and constrained variances

The Bienaymé theorem gives the total variance of the sum of k independent Normally distributed variables as the simple sum of their variances.

Bienaymé variance s² = s₁² + s₂² + … + s_k² = ∑s_i². (1)

A total standard deviation s is obtained by taking the square root of Equation (1).
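As a quick illustration of Equation (1), the sketch below sums variances and takes the square root. The function name `bienayme_sd` is hypothetical, and the input values are arbitrary numbers chosen so the arithmetic is easy to check by hand.

```python
import math

def bienayme_sd(sds):
    """Total standard deviation of a sum of independent variables.

    Squares each standard deviation to get a variance, sums the
    variances (Equation 1), and returns the square root of the total.
    """
    return math.sqrt(sum(s * s for s in sds))

if __name__ == "__main__":
    # Two independent variables with standard deviations 3 and 4:
    # total variance = 9 + 16 = 25, so the total standard deviation is 5.
    print(bienayme_sd([3.0, 4.0]))  # 5.0
```

Note that the total (5) is smaller than the plain sum of the two standard deviations (7), which is the point of summing variances rather than standard deviations when the variables are independent.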

To estimate a confidence interval on a sum of k independent proportions, ∑p_i, we follow Zou and Donner (2008). A confidence interval on a sum of proportions may be obtained by substituting interval widths, u⁻ = (p – w⁻) and u⁺ = (w⁺ – p), for each s_i term in the equation. The confidence interval is then found with the square root of the result. The constant z_α/2 factors out. See An algebra of intervals.

independent sum ∈ (L, U) = (∑p_i – √∑(p_i – w_i⁻)², ∑p_i + √∑(w_i⁺ – p_i)²), (1′)
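Equation (1′) might be sketched as follows. I assume here that the bounds w⁻ and w⁺ are Wilson score interval bounds computed without continuity correction; the excerpt does not state which interval is used, so treat that choice, the function names and the default z value (approximately the two-tailed 95% critical value) as assumptions for illustration.

```python
import math

def wilson(p, n, z=1.959964):
    """Wilson score interval (w_minus, w_plus) for a proportion p of n cases."""
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return centre - half, centre + half

def independent_sum_interval(props, sizes, z=1.959964):
    """Interval (L, U) for a sum of independent proportions, Equation (1').

    Squared interval widths take the place of variances: lower widths
    (p - w_minus) build L, and upper widths (w_plus - p) build U.
    """
    total = sum(props)
    lower_sq = 0.0
    upper_sq = 0.0
    for p, n in zip(props, sizes):
        w_minus, w_plus = wilson(p, n, z)
        lower_sq += (p - w_minus) ** 2
        upper_sq += (w_plus - p) ** 2
    return total - math.sqrt(lower_sq), total + math.sqrt(upper_sq)

if __name__ == "__main__":
    # Hypothetical data: two independent samples of 40 and 60 cases.
    print(independent_sum_interval([0.2, 0.5], [40, 60]))
```

With a single proportion the function reduces to the underlying interval itself, which is a useful sanity check on the substitution.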

This assumes that all of these proportions are independent. But what of chi-square-type scenarios, where there are k – 1 degrees of freedom for k proportions summing to 1?

Obviously, we are not interested in the confidence interval for ∑p_i, as this must be 1 (or [1, 1] if you prefer). But we are interested in confidence intervals for the sum of functions of p_i, ∑fn(p_i). Zou and Donner argue that equations of this type should obtain a sound interval provided that the original intervals are sound.

Consider the simplest two-valued 2 × 1 goodness of fit χ². As we know, the two proportions are completely dependent. If p₁ increases, p₂ = 1 – p₁ must fall. The table has a single degree of freedom. Consequently, standard deviations and interval positions are simply summed.

total standard deviation s = s₁ + s₂. (2)

dependent sum ∈ (L, U) = (∑fn(w_i⁻), ∑fn(w_i⁺)), (2′)

for an increasing monotonic function, fn, over P = [0, 1]. We will discuss other function types below.
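A literal rendering of Equation (2′) is short enough to sketch. The function below takes the interval bounds (w_i⁻, w_i⁺) as given inputs and applies an increasing function to each bound before summing. The name `dependent_sum_interval`, the choice of fn(p) = p² and the numeric bounds are all invented for illustration and are not results from the post.

```python
def dependent_sum_interval(fn, intervals):
    """Interval (L, U) for a sum of fn(p_i) over dependent proportions.

    `intervals` is a sequence of (w_minus, w_plus) bounds, one pair per
    proportion. For an increasing monotonic fn on [0, 1], Equation (2')
    sums the transformed lower bounds and the transformed upper bounds.
    """
    lower = sum(fn(w_minus) for w_minus, _ in intervals)
    upper = sum(fn(w_plus) for _, w_plus in intervals)
    return lower, upper

if __name__ == "__main__":
    # Hypothetical interval bounds for two proportions.
    bounds = [(0.2, 0.4), (0.5, 0.7)]
    # p squared is increasing on [0, 1], as Equation (2') requires.
    print(dependent_sum_interval(lambda p: p * p, bounds))
```

Unlike the independent case, no squaring and square-rooting of widths is involved: the bounds are transformed and added directly.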

Another way of thinking about this is that independent variables are considered to vary at right angles (orthogonally) to each other, whereas strictly dependent variables vary along the same axis. In some circumstances this means variables subtract and even cancel each other out; in others (like χ²) they sum.

Figure 1. Left: standard deviation of the sum of independent variables x, y, z. Right: summing standard deviations of two dependent variables on the same axis.

How do we generalise this idea to closed k × 1 goodness of fit χ² tables, where there are k – 1 degrees of freedom? Now there are fewer dimensions than variables.
