Selection of Discriminative Genes from Microarray Data
Microarray technology makes it possible to record the expression levels of thousands of genes simultaneously across a number of different samples. A microarray gene expression data set can be represented by an m × n expression table with entries wij, where wij ∈ ℜ is the measured expression level of the ith gene in the jth sample, and m and n are the total numbers of genes and samples, respectively. Each row in the expression table corresponds to one particular gene and each column to a sample [1–3].
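As an illustration of this layout, the expression table can be sketched as a list of rows, one row per gene and one column per sample. The values, the variable name expression_table, and the helper functions w, gene_profile, and sample_profile are hypothetical and chosen only for this example; they are not taken from any data set discussed here.

```python
# Hypothetical toy expression table: m = 3 genes (rows), n = 4 samples (columns).
# The values are made up for illustration only.
expression_table = [
    [2.1, 0.4, 1.8, 0.3],   # gene 1
    [0.9, 1.1, 1.0, 1.2],   # gene 2
    [5.0, 4.7, 0.2, 0.1],   # gene 3
]

m = len(expression_table)       # total number of genes
n = len(expression_table[0])    # total number of samples


def w(i, j):
    """Return wij, the expression level of the ith gene in the jth sample (1-indexed)."""
    return expression_table[i - 1][j - 1]


def gene_profile(i):
    """Row i: the expression of the ith gene across all n samples."""
    return list(expression_table[i - 1])


def sample_profile(j):
    """Column j: the expression of all m genes in the jth sample."""
    return [row[j - 1] for row in expression_table]


print(m, n)
print(w(3, 2))
print(sample_profile(1))
```

With this orientation, selecting a subset of genes amounts to keeping a subset of rows, while a classifier treats each column as one sample to be labeled.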
The widespread use of high-throughput technology has led to rapid growth in the use of gene expression phenotypes for identification and classification in a variety of diagnostic areas. An important application of gene expression data in functional genomics is to classify samples according to their gene expression profiles, for example to distinguish cancer from normal samples or to separate different types or subtypes of cancer [1–3]. However, for most gene expression data sets, the number of training samples is still very small compared to the number of genes measured in the experiments. For example, the colon cancer data set consists of 62 samples and 2000 genes, and the leukemia data set contains 72 samples and 7129 genes. The number of samples is likely ...