Scala High Performance Programming

Book Description

Leverage Scala and the functional paradigm to build performant software

About This Book

  • Get the first book to explore Scala performance techniques in depth!

  • Use cases inspired by real-world problems illustrate and support the techniques and language features studied

  • This book is written by Vincent Theron and Michael Diamant, software engineers with several years of experience in the high-frequency trading and programmatic advertising industries

Who This Book Is For

This book assumes a basic exposure to the Scala programming language and the Java Virtual Machine. You should be able to read and understand moderately advanced Scala code. No other knowledge is required.

What You Will Learn

  • Analyze the performance of JVM applications by developing JMH benchmarks and profiling with Flight Recorder

  • Discover use cases and performance tradeoffs of Scala language features, and eager and lazy collections

  • Explore event sourcing to improve performance while working with stream processing pipelines

  • Dive into asynchronous programming to extract performance on multicore systems using Scala Future and Scalaz Task

  • Design distributed systems with conflict-free replicated data types (CRDTs) to take advantage of eventual consistency without synchronization

  • Understand the impact of queues on system performance and apply the Free monad to build systems robust to high levels of throughput

In Detail

Scala is a statically and strongly typed language that blends functional and object-oriented paradigms. It has experienced growing popularity as an appealing and pragmatic choice to write production-ready software in the functional paradigm. Scala and the functional programming paradigm enable you to solve problems with less code and lower maintenance costs than the alternatives. However, these gains can come at the cost of performance if you are not careful.

Scala High Performance Programming arms you with the knowledge you need to create performant Scala applications. Starting with the basics of understanding how to define performance, we explore Scala's language features and functional programming techniques while keeping a close eye on performance throughout all the topics.

We introduce you as the newest software engineer at a fictitious financial trading company, named MV Trading. As you learn new techniques and approaches to reduce latency and improve throughput, you'll apply them to MV Trading’s business problems. By the end of the book, you will be well prepared to write production-ready, performant Scala software using the functional paradigm to solve real-world problems.

Style and Approach

This step-by-step guide will help you create high performance applications using Scala. Packed with lots of code samples, tips and tricks, every topic is explained in a detailed, easy-to-understand manner.

Downloading the example code for this book: you can download the example code files for all Packt books you have purchased from your Packt account. If you purchased this book elsewhere, you can register with Packt to obtain the code files.

Table of Contents

    1. Scala High Performance Programming
      1. Scala High Performance Programming
      2. Credits
      3. About the Authors
      4. About the Reviewer
        1. eBooks, discount offers, and more
          1. Why subscribe?
          2. Free access for Packt account holders
      6. Preface
        1. What this book covers
        2. What you need for this book
        3. Who this book is for
        4. Conventions
        5. Reader feedback
        6. Customer support
          1. Downloading the example code
          2. Downloading the color images of this book
          3. Errata
          4. Piracy
          5. Questions
      7. 1. The Road to Performance
        1. Defining performance
          1. Performant software
          2. Hardware resources
          3. Latency and throughput
          4. Bottlenecks
        2. Summarizing performance
          1. The problem with averages
          2. Percentiles to the rescue
        3. Collecting measurements
          1. Using benchmarks to measure performance
          2. Profiling to locate bottlenecks
          3. Pairing benchmarks and profiling
        4. A case study
        5. Tooling
        6. Summary
      8. 2. Measuring Performance on the JVM
        1. A peek into the financial domain
        2. Unexpected volatility crushes profits
        3. Reproducing the problem
          1. Throughput benchmark
          2. Latency benchmark
            1. The first latency benchmark
            2. The coordinated omission problem
            3. The second latency benchmark
            4. The final latency benchmark
          3. Locating bottlenecks
            1. Did I test with the expected set of resources?
            2. Was the system environment clean during the profiling?
            3. Are the JVM's internal resources performing to expectations?
            4. Where are the CPU bottlenecks?
            5. What are the memory allocation patterns?
            6. Trying to save the day
            7. A word of caution
            8. A profiling checklist
          4. Taking big steps with microbenchmarks
            1. Microbenchmarking the order book
        4. Summary
      9. 3. Unleashing Scala Performance
        1. Value classes
          1. Bytecode representation
          2. Performance considerations
          3. Tagged types - an alternative to value classes
        2. Specialization
          1. Bytecode representation
          2. Performance considerations
        3. Tuples
          1. Bytecode representation
          2. Performance considerations
        4. Pattern matching
          1. Bytecode representation
          2. Performance considerations
        5. Tail recursion
          1. Bytecode representation
          2. Performance considerations
        6. The Option data type
          1. Bytecode representation
          2. Performance considerations
        7. Case study – a more performant option
        8. Summary
      10. 4. Exploring the Collection API
        1. High-throughput systems – improving the order book
          1. Understanding historical trade-offs – list implementation
            1. List
            2. TreeMap
            3. Adding limit orders
            4. Canceling orders
          2. The current order book – queue implementation
            1. Queue
          3. Improved cancellation performance through lazy evaluation
            1. Set
            2. Benchmarking LazyCancelOrderBook
            3. Lessons learned
        2. Historical data analysis
          1. Lagged time series returns
            1. Vector
            2. Data clean up
          2. Handling multiple return series
            1. Array
            2. Looping with the Spire cfor macro
        3. Summary
      11. 5. Lazy Collections and Event Sourcing
        1. Improving the client report generation speed
          1. Diving into the reporting code
          2. Using views to speed up report generation time
            1. Constructing a custom view
            2. Applying views to improve report generation performance
          3. View caveats
            1. SeqView extends Seq
            2. Views are not memoizers
          4. Zipping up report generation
        2. Rethinking reporting architecture
          1. An overview of Stream
          2. Transforming events
          3. Building the event sourcing pipeline
          4. Streaming Markov chains
          5. Stream caveats
            1. Streams are memoizers
            2. Stream can be infinite
        3. Summary
      12. 6. Concurrency in Scala
        1. Parallelizing backtesting strategies
          1. Exploring Future
          2. Future and crazy ideas
          3. Future usage considerations
            1. Performing side-effects
            2. Blocking execution
            3. Handling failures
          4. Hampering performance through executor submissions
        2. Handling blocking calls and callbacks
          1. ExecutionContext and blocking calls
            1. Asynchronous versus nonblocking
            2. Using a dedicated ExecutionContext to block calls
            3. Using the blocking construct
          2. Translating callbacks with Promise
            1. From callbacks to a Future-based API
            2. Combining Future with Promise
        3. Tasked with more backtest performance improvements
          1. Introducing Scalaz Task
            1. Creating and executing Task
            2. Asynchronous behavior
            3. The execution model
          2. Modeling trading day simulations with Task
          3. Wrapping up the backtester
        4. Summary
      13. 7. Architecting for Performance
        1. Distributed automated traders
          1. A glimpse into distributed architectures
          2. The first attempt at a distributed automated trader
          3. Introducing CRDTs
            1. The state-based increase-only counter
            2. The operation-based increase-only counter
          4. CRDTs and automated traders
          5. When the balance is not enough
            1. A new CRDT - the grow-only set
        2. Free trading strategy performance improvements
          1. Benchmarking the trading strategy
          2. The danger of unbounded queues
          3. Applying back pressure
          4. Applying load-control policies
            1. Rejecting work
            2. Interrupting expensive processing
          5. Free monads
            1. Describing a program
            2. Building an interpreter
            3. Benchmarking the new trading strategy pipeline
            4. A Task interpreter
            5. Exploring free monads further
        3. Summary