Experimenting with deriving accurate 2 × 2 φ intervals, I also considered using Liebetrau’s population standard deviation estimate.
To recap: Cramér’s φ (Cramér 1946) is a probabilistic intercorrelation for contingency tables based on the χ² statistic. An unsigned φ score is defined by
Cramér’s φ = √(χ² / (N(k – 1))), (1)
where χ² is the statistic from the r × c test for homogeneity (independence), N is the total frequency in the table, and k is the smaller of the numbers of values of variables X and Y, i.e. k = min(r, c). For 2 × 2 tables, k – 1 = 1, so the simpler form φ = √(χ²/N) is often quoted.
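As an illustration of Equation (1), here is a minimal Python sketch. The function names `chi_square` and `cramers_phi` are my own labels, not taken from any library, and the code assumes that the table is given as a list of rows of frequencies and that every row and column total is non-zero.

```python
from math import sqrt


def chi_square(table):
    """Pearson chi-square for an r x c table of frequencies (list of rows)."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected frequency under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


def cramers_phi(table):
    """Unsigned Cramér's phi: sqrt(chi2 / (N (k - 1))), with k = min(r, c)."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0]))
    return sqrt(chi_square(table) / (n * (k - 1)))
```

For the 2 × 2 table [[20, 10], [10, 20]], every expected frequency is 15, so χ² = 100/15 and φ = √(χ²/60) = 1/3.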
An alternative formula for 2 × 2 tables yields a signed result, where a negative sign implies that the table tends towards the opposite diagonal.
signed 2 × 2 φ ≡ (ad – bc) / √((a + b)(c + d)(a + c)(b + d)), (2)
where a, b, c and d are cell frequencies, with a and d on the main diagonal and b and c on the opposite diagonal. However, Equation (2) cannot be applied to larger tables.
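A short Python sketch of the signed 2 × 2 formula may help. The name `signed_phi` is hypothetical, and the code assumes that all four marginal totals are non-zero.

```python
from math import sqrt


def signed_phi(a, b, c, d):
    """Signed phi for the 2 x 2 table with rows (a, b) and (c, d)."""
    # The product of the two row totals and the two column totals.
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    return (a * d - b * c) / sqrt(margins)
```

With a = d = 20 and b = c = 10 the score is 300/900 = 1/3. Swapping the diagonals (a = d = 10, b = c = 20) gives –1/3, the same magnitude with the sign reversed.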
The method I discuss here is potentially extensible to other effect sizes and other published estimates of standard deviations.
We employ Liebetrau’s best estimate of the population standard deviation of φ for r × c tables:
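$$
s(\phi) \approx \frac{1}{2\phi(k-1)\sqrt{N}} \left\{ 4\sum_{i}\sum_{j}\frac{p_{i,j}^{3}}{p_{i+}^{2}\,p_{+j}^{2}} - 3\sum_{i}\frac{1}{p_{i+}}\left(\sum_{j}\frac{p_{i,j}^{2}}{p_{i+}\,p_{+j}}\right)^{2} - 3\sum_{j}\frac{1}{p_{+j}}\left(\sum_{i}\frac{p_{i,j}^{2}}{p_{i+}\,p_{+j}}\right)^{2} + 2\sum_{i}\sum_{j}\left[\frac{p_{i,j}}{p_{i+}\,p_{+j}}\left(\sum_{m}\frac{p_{m,j}^{2}}{p_{m+}\,p_{+j}}\right)\left(\sum_{l}\frac{p_{i,l}^{2}}{p_{i+}\,p_{+l}}\right)\right]\right\}^{1/2}, \quad \text{for } \phi \neq 0, \tag{3}
$$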
where $p_{i,j} = f_{i,j} / N$ and $p_{i+}$, $p_{+j}$, etc. represent row and column (prior) probabilities (Bishop, Fienberg and Holland 1975: 386). If φ = 0 we adjust the table by a small delta.
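The following Python sketch shows one way this estimate might be computed. It rests on my own assumptions: it takes the four-term asymptotic variance of χ²/N given by Bishop, Fienberg and Holland, and converts it to a standard deviation for φ by the delta method, which gives the scale factor 1/(2φ(k – 1)√N). The name `liebetrau_sd` is hypothetical. The size of the small delta used when φ = 0 is not specified above, so the sketch raises an error in that case and leaves the adjustment to the caller.

```python
from math import sqrt


def liebetrau_sd(table):
    """Approximate standard deviation of Cramér's phi for an r x c table."""
    n = sum(sum(row) for row in table)
    r, c = len(table), len(table[0])
    k = min(r, c)
    p = [[f / n for f in row] for row in table]   # cell proportions p_ij
    pr = [sum(row) for row in p]                  # row probabilities p_i+
    pc = [sum(col) for col in zip(*p)]            # column probabilities p_+j

    # Sums of p_ij^2 / (p_i+ p_+j) along each row and down each column.
    row_sum = [sum(p[i][j] ** 2 / (pr[i] * pc[j]) for j in range(c))
               for i in range(r)]
    col_sum = [sum(p[i][j] ** 2 / (pr[i] * pc[j]) for i in range(r))
               for j in range(c)]

    # chi2 / N equals the sum over all cells of p_ij^2 / (p_i+ p_+j), minus 1.
    phi2 = sum(row_sum) - 1.0
    if phi2 <= 1e-12:
        raise ValueError("phi = 0: adjust the table by a small delta first")
    phi = sqrt(phi2 / (k - 1))

    t1 = 4 * sum(p[i][j] ** 3 / (pr[i] ** 2 * pc[j] ** 2)
                 for i in range(r) for j in range(c))
    t2 = 3 * sum(row_sum[i] ** 2 / pr[i] for i in range(r))
    t3 = 3 * sum(col_sum[j] ** 2 / pc[j] for j in range(c))
    t4 = 2 * sum(p[i][j] / (pr[i] * pc[j]) * col_sum[j] * row_sum[i]
                 for i in range(r) for j in range(c))

    # Guard against a tiny negative value caused by rounding error.
    braces = max(t1 - t2 - t3 + t4, 0.0)
    return sqrt(braces) / (2 * phi * (k - 1) * sqrt(n))
```

As a sanity check, for a 2 × 2 table in which all four marginal totals are equal, this expression reduces to √((1 – φ²)/N), the familiar large-sample result for a correlation between two balanced binary variables.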