Learn how to build and deploy a modern big data architecture to empower your business
Traditional relational databases are no longer effective at dealing with the challenges presented by Big Data. A Hadoop-based architecture offers a radical solution, as it is designed specifically to handle huge sets of unstructured data.
This book takes you through the journey of building a modern data lake architecture using HDInsight, a Hadoop-based service that allows you to successfully manage high-volume, high-velocity data in the Microsoft Azure Cloud. Featuring a wealth of practical examples, it offers tips and techniques to provision your own HDInsight cluster to ingest, organize, transform, and analyze data.
As you are guided through HDInsight, you'll explore the wider Hadoop ecosystem with plenty of working examples of Hadoop technologies, including Hive, Pig, MapReduce, HBase, and Storm, and of analytics solutions, including Excel Power Query, Power Map, and Power BI.
What You Will Learn
Explore core features of Hadoop, including HDFS2 and YARN, the new resource manager for Hadoop
Build your HDInsight cluster in minutes and learn how to administer it using Azure PowerShell
Discover what's new in Hadoop 2.X and the reference architecture for a modern data lake based on Hadoop
Find out more about a data lake vision and its core capabilities
Ingest and organize your data into HDInsight
Utilize open source software, including Hive, Pig, and MapReduce, to transform data and make it available to decision makers
Get to grips with architectural considerations for scalability, maintainability, and security
Downloading the example code for this book
You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the files e-mailed directly to you.