O'Reilly logo
  • Rishabh Chaudha thinks this is interesting:

Normalization poses problems for Hadoop processing because it makes reading a record a nonlocal operation,


Cover of Hadoop: The Definitive Guide, 4th Edition


In Hadoop a local operation refers to executing code in the same physical location where the data it needs to work with is being stored.

When you normalize your data you're essentially splitting it up. If this "split up" data gets distributed in 2 physically different areas you suddenly have non-local operations.