Modeling and Analysis of Compositional Data by Vera Pawlowsky-Glahn, Juan Jose Egozcue, Raimon Tolosana-Delgado

Chapter 5 Exploratory data analysis

5.1 General remarks

This chapter addresses the first steps in any analysis of a compositional data set. The set is represented as a matrix $\mathbf{X}$ with $n$ rows (observed compositions) and $D$ columns (parts). An exploratory analysis includes the following steps:

  1. computing descriptive statistics, that is, the center and variation matrix of a data set, as well as its total variability;
  2. looking at the biplot of the data set to discover patterns;
  3. plotting patterns in ternary diagrams of subcompositions, possibly centered to enhance visualization;
  4. defining an appropriate representation in orthonormal coordinates and computing the corresponding coordinates; and
  5. computing classical summary statistics of the coordinates and representing the results in a balance-dendrogram.
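As an illustration of step 1, the following is a minimal sketch in Python with NumPy. It assumes the usual definitions: the center is the closed geometric mean of the parts, entry (i, j) of the variation matrix is the variance of ln(x_i / x_j), and the total variability is the sum of all entries of the variation matrix divided by 2D. The function names and the choice of variance divisor (n rather than n - 1) are assumptions made for this example and are not taken from the text.

```python
import numpy as np

def closure(x):
    """Rescale each row so that its parts sum to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=1, keepdims=True)

def center(x):
    """Closed geometric mean of the columns (assumed definition of the center)."""
    g = np.exp(np.log(x).mean(axis=0))
    return g / g.sum()

def variation_matrix(x):
    """Entry (i, j) is the variance of ln(x_i / x_j) over the rows.

    Assumption: divisor n (NumPy's default); use ddof=1 in .var() for n - 1.
    """
    logx = np.log(x)
    diff = logx[:, :, None] - logx[:, None, :]  # shape (n, D, D)
    return diff.var(axis=0)

def total_variance(x):
    """Sum of all variation-matrix entries divided by 2D."""
    d = x.shape[1]
    return variation_matrix(x).sum() / (2 * d)
```

The functions expect strictly positive data with one composition per row, for example `X = closure(raw_counts)` followed by `center(X)` and `variation_matrix(X)`. Zeros must be handled before taking logarithms.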

In general, the last two steps will be based on a particular sequential binary partition, defined either a priori or as a result of the insights provided by the first three steps.
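To make the role of the sequential binary partition concrete, here is a hedged sketch of how orthonormal coordinates (balances) might be computed from one. It assumes the partition is encoded as a sign matrix with D - 1 rows and D columns, where +1 and -1 mark the two groups split at each step and 0 marks parts not involved. With r parts in the +1 group and s parts in the -1 group, the balance is taken as sqrt(rs / (r + s)) times the log-ratio of the two groups' geometric means. The encoding and the function names are assumptions for this example.

```python
import numpy as np

def sbp_contrast_matrix(signs):
    """Build the (D-1) x D contrast matrix from an SBP sign matrix.

    For a row with r entries equal to +1 and s entries equal to -1:
      +1 parts get  sqrt(s / (r * (r + s)))
      -1 parts get -sqrt(r / (s * (r + s)))
       0 parts get  0
    """
    signs = np.asarray(signs)
    psi = np.zeros(signs.shape, dtype=float)
    for k, row in enumerate(signs):
        r = np.sum(row == 1)
        s = np.sum(row == -1)
        psi[k, row == 1] = np.sqrt(s / (r * (r + s)))
        psi[k, row == -1] = -np.sqrt(r / (s * (r + s)))
    return psi

def balances(x, signs):
    """Orthonormal coordinates of each row of x for the given partition."""
    x = np.asarray(x, dtype=float)
    return np.log(x) @ sbp_contrast_matrix(signs).T
```

For three parts, the hypothetical partition `[[1, 1, -1], [1, -1, 0]]` first separates parts 1 and 2 from part 3, then part 1 from part 2. Each row of the contrast matrix sums to zero, so the result does not depend on whether the data were closed beforehand.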

Before starting, some general considerations need to be made. The first step in a statistical analysis is to check the data set for errors. It can be done using standard procedures, for example, using the minimum ...
