Chapter 20. Cochran-Armitage Test for Trend

The Cochran-Armitage test for trend (CATT) is used in analyzing germline data. For example, variants in a VCF (variant call format) file generated by DNA sequencing can be labeled as germline data. The CATT is a statistical method that directs chi-squared tests toward narrow alternatives. If R is a set of response variables and E is a set of experimental variables, then the CATT is sensitive to a linear relationship between R and E, which is how it detects trends. The CATT can be expressed another way: if B is a binary outcome of some events, with values {PASSED, FAILED}, and C is a set of ordered categories {C1, ..., Cn}, then the CATT can be used to test for a linear trend in the proportions of B across the levels of C. To apply the CATT, we build a contingency table with two rows, one per outcome value {PASSED, FAILED}, and n columns, one per category {C1, ..., Cn}. The contingency table for the CATT is explained in the next sections.
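As a rough illustration of the 2 x n contingency table described above, the following minimal Python sketch computes the standard Cochran-Armitage Z statistic and a two-sided p-value from a normal approximation. The function name cochran_armitage_trend, the default equally spaced weights 0, 1, ..., n-1, and the sample counts are assumptions made for this example; they do not come from the chapter.

```python
import math

def cochran_armitage_trend(passed, failed, weights=None):
    """Cochran-Armitage trend test on a 2 x n table (illustrative sketch).

    passed[j] and failed[j] are the counts of each outcome in ordered
    category C(j+1). weights are the category scores; equally spaced
    scores 0..n-1 are an assumption of this sketch.
    Returns (z, two_sided_p_value).
    """
    k = len(passed)
    if k != len(failed) or k < 2:
        raise ValueError("need two rows of equal length with at least 2 columns")
    if weights is None:
        weights = list(range(k))
    if len(weights) != k:
        raise ValueError("need one weight per category")

    col_totals = [p + f for p, f in zip(passed, failed)]
    r1, r2 = sum(passed), sum(failed)   # row totals
    n = r1 + r2                         # grand total
    if n == 0:
        raise ValueError("table is empty")

    # T = sum_j t_j * (N_1j * R_2 - N_2j * R_1)
    t_stat = sum(t * (p * r2 - f * r1)
                 for t, p, f in zip(weights, passed, failed))

    # Var(T) = (R_1 * R_2 / N) * (N * sum t_j^2 C_j - (sum t_j C_j)^2)
    sum_tc = sum(t * c for t, c in zip(weights, col_totals))
    sum_t2c = sum(t * t * c for t, c in zip(weights, col_totals))
    variance = (r1 * r2 / n) * (n * sum_t2c - sum_tc ** 2)
    if variance <= 0:
        raise ValueError("variance is zero; the test is undefined for this table")

    z = t_stat / math.sqrt(variance)
    # Two-sided p-value under the standard normal approximation.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts for three ordered categories C1, C2, C3.
z, p = cochran_armitage_trend(passed=[10, 15, 20], failed=[20, 15, 10])
print(f"z = {z:.4f}, p = {p:.4f}")
```

With these made-up counts, the proportion of PASSED rises steadily across the categories and the statistic comes out at roughly z = 2.58. A table whose PASSED proportion is the same in every column gives z = 0.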

According to Wikipedia:

The Cochran-Armitage test for trend, named for William Cochran and Peter Armitage, is used in categorical data analysis when the aim is to assess for the presence of an association between a variable with two categories and a variable with k categories. It modifies the Pearson chi-squared test to incorporate a suspected ordering in the effects of the k categories of the second variable. For example, doses of a treatment can be ordered as “low,” “medium,” and “high,” and we may suspect that the treatment benefit cannot become smaller as the dose ...
