Data in all domains is getting bigger. How can you work with it efficiently? Recently updated for Spark 1.3, this book introduces Apache Spark, the open source cluster computing system that makes data analytics fast to write and fast to run.
Get ready to unlock the power of your data. With the fourth edition of this comprehensive guide, you’ll learn how to build and maintain reliable, scalable, distributed systems with Apache Hadoop. This book is ideal for programmers looking to analyze datasets of any size, and for administrators who want to set up and run Hadoop clusters.
Get expert guidance on architecting end-to-end data management solutions with Apache Hadoop. While many sources explain how to use various components in the Hadoop ecosystem, this practical book takes you through architectural considerations necessary to tie those components together into a complete tailored application, based on your particular use case.
Distributed systems have become more fine-grained in the past 10 years, shifting from code-heavy monolithic applications to smaller, self-contained microservices. But developing these systems brings its own set of headaches. With lots of examples and practical advice, this book takes a holistic view of the topics that system architects and administrators must consider when building, managing, and evolving microservice architectures.
Want to know how the best software engineers and architects structure their applications to make them scalable, reliable, and maintainable in the long term? This book examines the key principles, algorithms, and trade-offs of data systems, using the internals of various popular software packages and frameworks as examples. You’ll learn how to determine what kind of tool is appropriate for which purpose, and how certain tools can be combined to fo...
In this practical book, four Cloudera data scientists present a set of self-contained patterns for performing large-scale data analysis with Spark. The authors bring Spark, statistical methods, and real-world data sets together to teach you how to approach analytics problems by example. You’ll start with an introduction to Spark and its ecosystem, and then dive into patterns that apply common techniques—classification, collaborative filtering, an...
The release of Java SE 8 introduced significant enhancements that impact the Core Java technologies and APIs at the heart of the Java platform. Many old Java idioms are no longer required and new features like lambda expressions will increase programmer productivity, but navigating these changes can be challenging. Core Java® for the Impatient is a complete but concise guide to Java SE 8.
Whether you’re deploying applications on-premises or in the cloud, this cookbook is for developers, operators, and IT professionals who need practical solutions for using Docker. The recipes in this book will help developers go from zero knowledge to distributed applications packaged and deployed within a couple of chapters. IT professionals will be able to use this cookbook to solve everyday problems, as well as create, run, share, and deploy Doc...
It’s easy to start coding with Python, which is why the language is so popular. However, Python’s unique strengths, charms, and expressiveness can be hard to grasp, and there are hidden pitfalls that can easily trip you up. Effective Python will help you master a truly “Pythonic” approach to programming, harnessing Python’s full power to write exceptionally robust and well-performing code.