19 Principal Component Analysis

Data generated by bioinformatics experiments may have many dimensions and be quite complicated. However, the dimensionality of the data may far exceed its actual complexity. Reducing the dimensionality often allows simpler algorithms to analyze the data effectively. The most common method of data reduction in bioinformatics is principal component analysis (PCA).

19.1 The Purpose of PCA

Principal component analysis is a widely used tool for reducing the dimensionality of a problem. Consider a set of vectors that lie in R^N. The data may not be scattered at random but may instead have an underlying organization. Viewed from one orientation, the data looks scattered, but if viewed from a different orientation the ...
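The idea that data in R^N can hide a simpler organization can be illustrated with a minimal sketch. It assumes NumPy is available and uses synthetic points that lie close to a single line in R^3. The function name `pca` and the variables `points` and `direction` are illustrative choices, not code from this chapter. The sketch uses the standard formulation of PCA: center the data, compute the covariance matrix, and take its eigenvectors as the new axes.

```python
import numpy as np

def pca(data):
    """Return eigenvalues, eigenvectors, and the data projected onto the
    eigenvectors, all ordered by decreasing variance."""
    # Center each coordinate so the covariance describes spread about the mean.
    centered = data - data.mean(axis=0)
    # Rows are observations, columns are dimensions.
    cov = np.cov(centered, rowvar=False)
    # eigh suits symmetric matrices; it returns eigenvalues in ascending order.
    evals, evecs = np.linalg.eigh(cov)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    # Express each centered point in the new coordinate system.
    return evals, evecs, centered @ evecs

# Synthetic data (an assumption for illustration): 200 points spread along
# one direction in R^3, plus a small amount of noise.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
direction = np.array([1.0, 2.0, -1.0]) / np.sqrt(6.0)
points = np.outer(t, direction) + 0.01 * rng.normal(size=(200, 3))

evals, evecs, projected = pca(points)
explained = evals / evals.sum()  # fraction of total variance per component
print(explained)
```

In the original coordinates every axis shows noticeable spread, so the points look scattered. After the rotation, almost all of the variance falls on the first component, which suggests that one coordinate is enough to describe this particular data set.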
