Normalization poses problems for Hadoop processing because it makes reading a record a nonlocal operation.
In Hadoop, a local operation is one that executes on the same physical node where the data it operates on is stored.
When you normalize your data, you're essentially splitting it up. If those split-up pieces end up on two physically different nodes, reading a single logical record suddenly becomes a nonlocal operation.
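A minimal sketch of the idea above, using made-up record layouts (the dictionaries and function names are illustrative, not any real Hadoop API): with denormalized records each line is self-contained and can be read locally, while normalized records reference a separate table that may live on another node, turning every read into a join.

```python
# Denormalized: each record carries the full user name, so processing
# one line needs nothing outside the record itself (a local operation).
denormalized = [
    {"user": "alice", "page": "/home"},
    {"user": "alice", "page": "/about"},
]

# Normalized: the record stores only a user ID. Resolving the name
# requires a lookup in a separate table, which in a distributed system
# may be stored on a different node (a nonlocal operation).
users = {1: "alice"}  # hypothetical lookup table, stored elsewhere
normalized = [
    {"user_id": 1, "page": "/home"},
    {"user_id": 1, "page": "/about"},
]

def read_denormalized(record):
    # Everything needed is inside the record: no external fetch.
    return record["user"], record["page"]

def read_normalized(record):
    # The join against `users` is what can force a network round trip
    # when the table lives on another node.
    return users[record["user_id"]], record["page"]

print(read_denormalized(denormalized[0]))  # ('alice', '/home')
print(read_normalized(normalized[0]))      # ('alice', '/home')
```

Both reads produce the same logical record; the difference is only in where the bytes needed to assemble it physically live.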