Resolving skewing data

In statistics, skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean. In Teradata, skew means imbalanced processing caused by uneven distribution of data across the AMPs. A highly skewed table is one where some AMPs hold many more rows than others, that is, the data is not evenly distributed. Skew can show up as data skew, CPU skew, or I/O skew.
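To make "highly skewed" concrete, here is a minimal Python sketch that turns per-AMP row counts into a single percentage. It assumes the commonly used convention skew factor = (1 - average / maximum) * 100; the function name `skew_factor` and the sample counts are made up for illustration and do not come from Teradata itself.

```python
from statistics import mean


def skew_factor(rows_per_amp):
    """Return skew as a percentage.

    0 means every AMP holds the same number of rows; values close to 100
    mean one AMP holds almost all of them.
    Assumed convention: (1 - average / maximum) * 100.
    """
    if not rows_per_amp:
        raise ValueError("need at least one AMP")
    busiest = max(rows_per_amp)
    if busiest == 0:
        return 0.0  # empty table: nothing to be skewed
    return (1 - mean(rows_per_amp) / busiest) * 100


# Hypothetical row counts for a 4-AMP system.
even = [1000, 1000, 1000, 1000]
skewed = [3700, 100, 100, 100]

print(f"even:   {skew_factor(even):.1f}%")
print(f"skewed: {skew_factor(skewed):.1f}%")
```

The same arithmetic applies to CPU or I/O figures per AMP: the busiest AMP sets the pace, so the further the average falls below the maximum, the more parallel capacity is wasted.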

Shared Nothing architecture – dividing the work

The shared nothing architecture ensures that each virtual processor is responsible for the storage and retrieval of its own unique data. Data is stored physically together on the node, but the virtual processors ensure parallelism. This is also the basis of Teradata scalability. Each AMP owns an ...
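The way work gets divided can be sketched with a toy model. Teradata assigns each row to an AMP by hashing its primary index value; the sketch below imitates that idea with MD5 as a stand-in hash, which is an assumption for illustration only and not Teradata's actual hashing algorithm. The helper names `amp_for` and `rows_per_amp` are hypothetical.

```python
import hashlib
from collections import Counter


def amp_for(value, num_amps):
    """Map a key value to an AMP number (stand-in hash, not Teradata's)."""
    digest = hashlib.md5(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_amps


def rows_per_amp(values, num_amps):
    """Count how many rows each AMP would own for the given key values."""
    counts = Counter(amp_for(v, num_amps) for v in values)
    return [counts.get(amp, 0) for amp in range(num_amps)]


# 10,000 rows keyed on a unique column versus a column with only two values.
unique_keys = range(10_000)
few_keys = [i % 2 for i in range(10_000)]

print("unique key:    ", rows_per_amp(unique_keys, 4))
print("two-valued key:", rows_per_amp(few_keys, 4))
```

With a unique key the rows spread across all four AMPs, while a key with only two distinct values can land on at most two AMPs, leaving the others idle. That is the data skew described at the start of this section.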
