Recently, I have been reviewing some work I conducted developing confidence intervals for Cramér’s φ, building on Bishop, Fienberg and Holland (1975). Finalising the edit for my forthcoming book (Wallis, 2021), I realised that Yvonne Bishop and colleagues had provided a formula for the variance of χ² without saying so explicitly!
The authors show how this formula is a building block for other methods, including estimating the standard deviation of ϕ (labelled ‘V’ in their notation). They also make an unfortunate but common error in deriving confidence intervals, but that is another story.
Anyway, they give the formula for the variance of Φ² = χ²/N, but it is trivial to present it as the variance of χ².
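$$
S^2(\phi) \approx \frac{1}{4\phi^2 N(k-1)} \Bigg\{
4\sum_{i,j} \frac{p_{i,j}^3}{p_{i+}^2\, p_{+j}^2}
- 3\sum_{i} \frac{1}{p_{i+}} \Big( \sum_{j} \frac{p_{i,j}^2}{p_{i+}\, p_{+j}} \Big)^2
- 3\sum_{j} \frac{1}{p_{+j}} \Big( \sum_{i} \frac{p_{i,j}^2}{p_{i+}\, p_{+j}} \Big)^2
+ 2\sum_{i,j} \Big[ \frac{p_{i,j}}{p_{i+}\, p_{+j}}
\Big( \sum_{l} \frac{p_{l,j}^2}{p_{l+}\, p_{+j}} \Big)
\Big( \sum_{m} \frac{p_{i,m}^2}{p_{i+}\, p_{+m}} \Big) \Big]
\Bigg\}, \quad \text{for } \phi \neq 0, \tag{1}
$$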
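where $p_{i,j} = f_{i,j}/N$ and $p_{i+}$, $p_{+j}$, etc. represent row and column (prior) probabilities in a χ² test for homogeneity (Bishop et al. 1975: 386). Here $f_{i,j}$ is the observed frequency in cell (i, j), and k is the smaller of the number of rows and the number of columns in the table.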
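To make the structure of Equation (1) easier to follow, here is a minimal Python sketch that evaluates it for a table of observed frequencies. The function names `variance_terms` and `phi_variance` are my own, hypothetical choices. The code assumes that every row and column total is non-zero, that ϕ² = χ²/(N(k – 1)) with k the smaller table dimension, and that the ϕ in the denominator is Cramér's ϕ as written above. Treat it as an illustration rather than a reference implementation.

```python
def variance_terms(freq):
    """Return (N, k, Phi^2, braces) for a table of observed frequencies.

    'braces' is the sum inside the curly brackets of Equation (1).
    Assumes all row and column totals are non-zero.
    """
    rows, cols = len(freq), len(freq[0])
    n = sum(sum(row) for row in freq)
    p = [[f / n for f in row] for row in freq]
    p_row = [sum(row) for row in p]                          # p_i+
    p_col = [sum(row[j] for row in p) for j in range(cols)]  # p_+j

    # q[i][j] = p_ij^2 / (p_i+ p_+j); its row and column sums are the
    # inner sums of Equation (1).
    q = [[p[i][j] ** 2 / (p_row[i] * p_col[j]) for j in range(cols)]
         for i in range(rows)]
    q_row = [sum(row) for row in q]
    q_col = [sum(row[j] for row in q) for j in range(cols)]

    cells = [(i, j) for i in range(rows) for j in range(cols)]
    t1 = 4 * sum(p[i][j] ** 3 / (p_row[i] ** 2 * p_col[j] ** 2)
                 for i, j in cells)
    t2 = 3 * sum(q_row[i] ** 2 / p_row[i] for i in range(rows))
    t3 = 3 * sum(q_col[j] ** 2 / p_col[j] for j in range(cols))
    t4 = 2 * sum(p[i][j] / (p_row[i] * p_col[j]) * q_col[j] * q_row[i]
                 for i, j in cells)

    big_phi_sq = sum(q_row) - 1  # Phi^2 = chi^2 / N
    return n, min(rows, cols), big_phi_sq, t1 - t2 - t3 + t4


def phi_variance(freq):
    """Approximate variance of Cramér's phi by Equation (1); needs phi != 0."""
    n, k, big_phi_sq, braces = variance_terms(freq)
    phi_sq = big_phi_sq / (k - 1)  # assumed definition of Cramér's phi squared
    if phi_sq <= 0:
        raise ValueError("Equation (1) is only defined for phi != 0")
    return braces / (4 * phi_sq * n * (k - 1))


print(phi_variance([[20, 10], [10, 20]]))  # approximately 0.0148 (= 2/135)
```

For this 2 × 2 table ϕ = 1/3 and N = 60, so the result agrees with the familiar large-sample variance (1 – ϕ²)/N for a table with balanced margins. For a table whose cells are exactly independent, the bracketed sum itself is zero.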
I used this formula to derive a confidence interval for Cramér’s φ. Bishop et al. give the standard deviation for ϕ as (in my notation):
$$
S(\phi) = \frac{1}{2\phi\sqrt{k-1}}\, S(\Phi^2), \tag{2}
$$
where Φ² = χ²/N. The rest is simple algebra once we recognise that the total number of cases in the table, N, is a scale factor for variance. The missing link is simply
$$
S(\Phi^2) = \frac{\sqrt{S^2(\chi^2)}}{N}. \tag{3}
$$
I generally cite the formula for the variance S²(ϕ) rather than the standard deviation to avoid a large square root symbol around Equation (1)! But as long as you remember that the standard deviation is the square root of the variance, you will be fine.
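For readers who want the algebra spelled out, the following is a short sketch of how Equations (1) to (3) combine. It uses only the relationships already stated, with Φ² = χ²/N and N treated as a constant:

$$
S^2(\chi^2) = N^2\, S^2(\Phi^2) = 4\phi^2 (k-1)\, N^2\, S^2(\phi).
$$

Substituting Equation (1) for S²(ϕ), the factor 4ϕ²(k – 1) cancels, and the variance of χ² is simply N times the sum inside the curly brackets.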
The method of inverting S(ϕ) may be of interest to anyone wishing to compute intervals for χ² or other effect size measures based on it.
In a separate post I explain a method for computing a confidence interval for χ² at a given error level α. It first computes the confidence interval for Cramér’s ϕ and then translates that interval to the χ² scale.
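The final translation step can be sketched in a few lines of Python. This is only an illustration: it assumes the interval bounds for ϕ have already been obtained by some other means, that they are non-negative, and that ϕ² = χ²/(N(k – 1)), so that χ² = ϕ²N(k – 1). The function names `phi_to_chisq` and `chisq_interval` are hypothetical.

```python
def phi_to_chisq(phi, n, k):
    """Translate Cramér's phi to the chi-square scale: chi^2 = phi^2 N (k - 1)."""
    return phi ** 2 * n * (k - 1)


def chisq_interval(phi_lower, phi_upper, n, k):
    """Map an interval for phi onto the chi-square scale.

    Assumes 0 <= phi_lower <= phi_upper, so the transformation is monotonic
    and the bounds keep their order.
    """
    if not 0 <= phi_lower <= phi_upper:
        raise ValueError("expected 0 <= phi_lower <= phi_upper")
    return phi_to_chisq(phi_lower, n, k), phi_to_chisq(phi_upper, n, k)
```

Because the mapping is monotonic for non-negative ϕ, the lower and upper bounds for ϕ become the lower and upper bounds for χ² directly.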
References
Bishop, Y.M.M., S.E. Fienberg & P.W. Holland (1975). Discrete Multivariate Analysis: Theory and Practice. Cambridge, MA: MIT Press.
Wallis, S.A. (2021). Statistics in Corpus Linguistics Research. New York: Routledge.