Mastering Concurrency Programming with Java 8

Book Description

Master the principles and techniques of multithreaded programming with the Java 8 Concurrency API

About This Book

  • Implement concurrent applications using the Java 8 Concurrency API and its new components

  • Improve the performance of your applications, or process more data in the same time, by taking advantage of all of your resources

  • Construct real-world examples related to machine learning, data mining, image processing, and client/server environments

Who This Book Is For

If you are a competent Java developer with a good understanding of concurrency but have no knowledge of how to effectively implement concurrent programs or use streams to make processes more efficient, then this book is for you.

What You Will Learn

  • Design concurrent applications by converting a sequential algorithm into a concurrent one

  • Discover how to avoid the problems that commonly arise in concurrent algorithms

  • Use the Executor framework to manage concurrent tasks without creating threads

  • Extend and modify Executors to adapt their behavior to your needs

  • Solve problems using the divide and conquer technique and the Fork/Join framework

  • Process massive data sets with parallel streams and the Map/Reduce model

  • Control data-race conditions using concurrent data structures and synchronization mechanisms

  • Test and monitor concurrent applications

In Detail

Concurrent programming lets you divide a large task into smaller subtasks that run in parallel. Once the subtasks finish, their partial results are merged to produce the final output. The whole process is complex, from the design of concurrent algorithms through to the testing phase, where concurrent applications need extra attention. Java includes a comprehensive API with many ready-to-use components for implementing powerful concurrent applications, and these components are flexible enough to adapt to your needs.
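The split-and-merge idea described above can be sketched with the Fork/Join framework. This is only a minimal illustration, not an example taken from the book: the class name SumTask, the helper parallelSum(), and the THRESHOLD value are hypothetical choices made for this sketch.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical example: sums an array by splitting it into sub-tasks.
class SumTask extends RecursiveTask<Long> {
    // Assumed cutoff below which a range is summed sequentially.
    private static final int THRESHOLD = 1_000;

    private final long[] data;
    private final int from; // inclusive
    private final int to;   // exclusive

    SumTask(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            // Small enough: process this sub-task directly.
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        // Too large: divide the range into two sub-tasks.
        int mid = (from + to) >>> 1;
        SumTask left = new SumTask(data, from, mid);
        SumTask right = new SumTask(data, mid, to);
        left.fork();                        // run the left half asynchronously
        long rightResult = right.compute(); // run the right half in this thread
        return left.join() + rightResult;   // merge the partial results
    }

    // Convenience entry point that uses the common Fork/Join pool.
    static long parallelSum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
    }

    public static void main(String[] args) {
        long[] data = new long[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(parallelSum(data)); // prints 50005000
    }
}
```

Forking one half and computing the other in the current thread is a common way to avoid leaving the calling worker idle while it waits for results.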

The book starts with a full description of the design principles of concurrent applications and how to parallelize a sequential algorithm. We'll show you how to use the components of the Java Concurrency API, from the basics to the most advanced techniques, to build powerful concurrent applications in Java.

You will work through real-world examples of complex algorithms related to machine learning, data mining, natural language processing, image processing, and client/server environments. Next, you will learn how to use the most important components of the Java 8 Concurrency API: the Executor framework to execute multiple tasks in your applications, the Phaser class to implement concurrent tasks divided into phases, and the Fork/Join framework to implement concurrent tasks that can be split into smaller problems (using the divide and conquer technique). Toward the end, we will cover the new additions to the Java 8 API: the Map and Reduce model and the Map and Collect model. The book will also teach you about the data structures and synchronization utilities that help you avoid data-race conditions and other critical problems. Finally, the book ends with a detailed description of the tools and techniques that you can use to test a Java concurrent application.
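As a rough illustration of the first component named above, the Executor framework, the following sketch runs several small tasks on a thread pool without creating threads by hand. It is not taken from the book: the class name ExecutorSketch and the method totalLength() are hypothetical, and the pool size is an assumption made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example: one Callable per word, executed by a fixed thread pool.
class ExecutorSketch {

    static int totalLength(List<String> words) {
        // Assumed sizing: one worker thread per available processor.
        ExecutorService executor =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (String word : words) {
                tasks.add(() -> word.length()); // each task returns a result
            }
            int total = 0;
            // invokeAll() blocks until every task has completed.
            for (Future<Integer> future : executor.invokeAll(tasks)) {
                total += future.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException("Interrupted while waiting for tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A task failed", e.getCause());
        } finally {
            executor.shutdown(); // always release the pool's threads
        }
    }

    public static void main(String[] args) {
        System.out.println(totalLength(Arrays.asList("fork", "join", "phaser"))); // prints 14
    }
}
```

The tasks only describe the work to be done; the executor decides which thread runs each one, which is the separation the Executor framework is designed to provide.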

Style and approach

A complete guide that implements real-world examples with algorithms related to machine learning, data mining, and natural language processing in client/server environments. All the examples are explained using a step-by-step approach.

Downloading the example code for this book: you can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to receive the code files.

Table of Contents

    1. Mastering Concurrency Programming with Java 8
      1. Table of Contents
      2. Mastering Concurrency Programming with Java 8
      3. Credits
      4. About the Author
      5. About the Reviewers
      6. www.PacktPub.com
        1. eBooks, discount offers, and more
          1. Why subscribe?
      7. Preface
        1. What this book covers
        2. What you need for this book
        3. Who this book is for
        4. Conventions
        5. Reader feedback
        6. Customer support
          1. Downloading the example code
          2. Errata
          3. Piracy
          4. eBooks, discount offers, and more
          5. Questions
      8. 1. The First Step – Concurrency Design Principles
        1. Basic concurrency concepts
          1. Concurrency versus parallelism
          2. Synchronization
          3. Immutable object
          4. Atomic operations and variables
          5. Shared memory versus message passing
        2. Possible problems in concurrent applications
          1. Data race
          2. Deadlock
          3. Livelock
          4. Resource starvation
          5. Priority inversion
        3. A methodology to design concurrent algorithms
          1. The starting point – a sequential version of the algorithm
          2. Step 1 – analysis
          3. Step 2 – design
          4. Step 3 – implementation
          5. Step 4 – testing
          6. Step 5 – tuning
          7. Conclusion
        4. Java concurrency API
          1. Basic concurrency classes
          2. Synchronization mechanisms
          3. Executors
          4. The Fork/Join framework
          5. Parallel streams
          6. Concurrent data structures
        5. Concurrency design patterns
          1. Signaling
          2. Rendezvous
          3. Mutex
          4. Multiplex
          5. Barrier
          6. Double-checked locking
          7. Read-write lock
          8. Thread pool
          9. Thread local storage
        6. The Java memory model
        7. Tips and tricks to design concurrent algorithms
          1. Identify the correct independent tasks
          2. Implement concurrency at the highest possible level
          3. Take scalability into account
          4. Use thread-safe APIs
          5. Never assume an execution order
          6. Prefer local thread variables over static and shared when possible
          7. Find the more easily parallelizable version of the algorithm
          8. Using immutable objects when possible
          9. Avoiding deadlocks by ordering the locks
          10. Using atomic variables instead of synchronization
          11. Holding locks for as short a time as possible
          12. Taking precautions using lazy initialization
          13. Avoiding the use of blocking operations inside a critical section
        8. Summary
      9. 2. Managing Lots of Threads – Executors
        1. An introduction to executors
          1. Basic characteristics of executors
          2. Basic components of the executor framework
        2. First example – the k-nearest neighbors algorithm
          1. K-nearest neighbors – serial version
          2. K-nearest neighbors – a fine-grained concurrent version
          3. K-nearest neighbors – a coarse-grained concurrent version
          4. Comparing the solutions
        3. The second example – concurrency in a client/server environment
          1. Client/server – serial version
            1. The DAO part
            2. The command part
            3. The server part
          2. Client/server – parallel version
            1. The server part
            2. The command part
            3. Extra components of the concurrent server
              1. The status command
              2. The cache system
              3. The log system
        4. Comparing the two solutions
        5. Other methods of interest
        6. Summary
      10. 3. Getting the Maximum from Executors
        1. Advanced characteristics of executors
          1. Cancellation of tasks
          2. Scheduling the execution of tasks
          3. Overriding the executor methods
          4. Changing some initialization parameters
        2. The first example – an advanced server application
          1. The ServerExecutor class
            1. The statistics object
            2. The rejected task controller
            3. The executor tasks
            4. The executor
          2. The command classes
            1. The ConcurrentCommand class
            2. Concrete commands
          3. The server part
            1. The ConcurrentServer class
            2. The RequestTask class
          4. The client part
        3. The second example – executing periodic tasks
          1. The common parts
          2. The basic reader
          3. The advanced reader
        4. Additional information about executors
        5. Summary
      11. 4. Getting Data from the Tasks – The Callable and Future Interfaces
        1. Introducing the Callable and Future interfaces
          1. The Callable interface
          2. The Future interface
        2. First example – a best-matching algorithm for words
          1. The common classes
          2. A best-matching algorithm – the serial version
            1. The BestMatchingSerialCalculation class
            2. The BestMatchingSerialMain class
          3. A best-matching algorithm – the first concurrent version
            1. The BestMatchingBasicTask class
            2. The BestMatchingBasicConcurrentCalculation class
          4. A best-matching algorithm – the second concurrent version
          5. The word exists algorithm – a serial version
            1. The ExistSerialCalculation class
            2. The ExistSerialMain class
          6. The word exists algorithm – the concurrent version
            1. The ExistBasicTasks class
            2. The ExistBasicConcurrentCalculation class
            3. The ExistBasicConcurrentMain class
          7. Comparing the solutions
            1. Best-matching algorithms
            2. Exist algorithms
        3. The second example – creating an inverted index for a collection of documents
          1. Common classes
            1. The Document class
            2. The DocumentParser class
          2. The serial version
          3. The first concurrent version – a task per document
            1. The IndexingTask class
            2. The InvertedIndexTask class
            3. The ConcurrentIndexing class
          4. The second concurrent version – multiple documents per task
            1. The MultipleIndexingTask class
            2. The MultipleInvertedIndexTask class
            3. The MultipleConcurrentIndexing class
          5. Comparing the solutions
          6. Other methods of interest
        4. Summary
      12. 5. Running Tasks Divided into Phases – The Phaser Class
        1. An introduction to the Phaser class
          1. Registration and deregistration of participants
          2. Synchronizing phase changes
          3. Other functionalities
        2. First example – a keyword extraction algorithm
          1. Common classes
            1. The Word class
            2. The Keyword class
            3. The Document class
            4. The DocumentParser class
          2. The serial version
          3. The concurrent version
            1. The KeywordExtractionTask class
            2. The ConcurrentKeywordExtraction class
          4. Comparing the two solutions
        3. The second example – a genetic algorithm
          1. Common classes
            1. The Individual class
            2. The GeneticOperators class
          2. The serial version
            1. The SerialGeneticAlgorithm class
            2. The SerialMain class
          3. The concurrent version
            1. The SharedData class
            2. The GeneticPhaser class
            3. The ConcurrentGeneticTask class
            4. The ConcurrentGeneticAlgorithm class
            5. The ConcurrentMain class
          4. Comparing the two solutions
            1. The Lau15 dataset
            2. The Kn57 dataset
            3. Conclusions
        4. Summary
      13. 6. Optimizing Divide and Conquer Solutions – The Fork/Join Framework
        1. An introduction to the Fork/Join framework
          1. Basic characteristics of the Fork/Join framework
          2. Limitations of the Fork/Join framework
          3. Components of the Fork/Join framework
        2. The first example – the k-means clustering algorithm
          1. The common classes
            1. The VocabularyLoader class
            2. The Word, Document, and DocumentLoader classes
            3. The DistanceMeasurer class
            4. The DocumentCluster class
          2. The serial version
            1. The SerialKMeans class
          3. The concurrent version
            1. Two tasks for the Fork/Join framework – AssignmentTask and UpdateTask
            2. The ConcurrentKMeans class
            3. The ConcurrentMain class
          4. Comparing the solutions
        3. The second example – a data filtering algorithm
          1. Common parts
          2. The serial version
            1. The SerialSearch class
            2. The SerialMain class
          3. The concurrent version
            1. The TaskManager class
            2. The IndividualTask class
            3. The ListTask class
            4. The ConcurrentSearch class
            5. The ConcurrentMain class
          4. Comparing the two versions
        4. The third example – the merge sort algorithm
          1. Shared classes
          2. The serial version
            1. The SerialMergeSort class
            2. The SerialMetaData class
          3. The concurrent version
            1. The MergeSortTask class
            2. The ConcurrentMergeSort class
            3. The ConcurrentMetaData class
          4. Comparing the two versions
        5. Other methods of the Fork/Join framework
        6. Summary
      14. 7. Processing Massive Datasets with Parallel Streams – The Map and Reduce Model
        1. An introduction to streams
          1. Basic characteristics of streams
          2. Sections of a stream
            1. Sources of a stream
            2. Intermediate operations
            3. Terminal operations
          3. MapReduce versus MapCollect
        2. The first example – a numerical summarization application
          1. The concurrent version
            1. The ConcurrentDataLoader class
            2. The ConcurrentStatistics class
              1. Job information from subscribers
              2. Age data from subscribers
              3. Marital data from subscribers
              4. Campaign data from nonsubscribers
              5. Multiple data filter
              6. Duration data from nonsubscribers
              7. People aged between 25 and 50
            3. The ConcurrentMain class
          2. The serial version
          3. Comparing the two versions
        3. The second example – an information retrieval search tool
          1. An introduction to the reduction operation
          2. The first approach – full document query
            1. The basicMapper() method
            2. The Token class
            3. The QueryResult class
          3. The second approach – reduced document query
            1. The limitedMapper() method
          4. The third approach – generating an HTML file with the results
            1. The ContentMapper class
          5. The fourth approach – preloading the inverted index
            1. The ConcurrentFileLoader class
          6. The fifth approach – using our own executor
          7. Getting data from the inverted index – the ConcurrentData class
          8. Getting the number of words in a file
          9. Getting the average tfxidf value in a file
          10. Getting the maximum and minimum tfxidf values in the index
          11. The ConcurrentMain class
          12. The serial version
          13. Comparing the solutions
        4. Summary
      15. 8. Processing Massive Datasets with Parallel Streams – The Map and Collect Model
        1. Using streams to collect data
          1. The collect() method
        2. The first example – searching data without an index
          1. Basic classes
            1. The Product class
            2. The Review class
            3. The ProductLoader class
          2. The first approach – basic search
            1. The ConcurrentStringAccumulator class
          3. The second approach – advanced search
            1. The ConcurrentObjectAccumulator class
          4. A serial implementation of the example
          5. Comparing the implementations
        3. The second example – a recommendation system
          1. Common classes
            1. The ProductReview class
            2. The ProductRecommendation class
          2. The recommendation system – the main class
          3. The ConcurrentLoaderAccumulator class
          4. The serial version
          5. Comparing the two versions
        4. The third example – common contacts in a social network
          1. Base classes
            1. The Person class
            2. The PersonPair class
            3. The DataLoader class
          2. The concurrent version
            1. The CommonPersonMapper class
            2. The ConcurrentSocialNetwork class
            3. The ConcurrentMain class
          3. The serial version
            1. Comparing the two versions
        5. Summary
      16. 9. Diving into Concurrent Data Structures and Synchronization Utilities
        1. Concurrent data structures
          1. Blocking and non-blocking data structures
            1. Interfaces
              1. BlockingQueue
              2. BlockingDeque
              3. ConcurrentMap
              4. TransferQueue
            2. Classes
              1. LinkedBlockingQueue
              2. ConcurrentLinkedQueue
              3. LinkedBlockingDeque
              4. ConcurrentLinkedDeque
              5. ArrayBlockingQueue
              6. DelayQueue
              7. LinkedTransferQueue
              8. PriorityBlockingQueue
              9. ConcurrentHashMap
          2. Using the new features
            1. First example with ConcurrentHashMap
              1. The forEach() method
              2. The search() method
              3. The reduce() method
              4. The compute() method
            2. Another example with ConcurrentHashMap
            3. An example with the ConcurrentLinkedDeque class
              1. The removeIf() method
              2. The spliterator() method
          3. Atomic variables
        2. Synchronization mechanisms
          1. The CommonTask class
          2. The Lock interface
          3. The Semaphore class
          4. The CountDownLatch class
          5. The CyclicBarrier class
          6. The CompletableFuture class
            1. Using the CompletableFuture class
              1. Auxiliary tasks
              2. The main() method
        3. Summary
      17. 10. Integration of Fragments and Implementation of Alternatives
        1. Big-block synchronization mechanisms
        2. An example of a document clustering application
          1. The four systems of k-means clustering
            1. The Reader system
            2. The Indexer system
            3. The Mapper system
            4. The Clustering system
          2. The main class of the document clustering application
          3. Testing our document clustering application
        3. Implementation of alternatives with concurrent programming
          1. The k-nearest neighbors algorithm
          2. Building an inverted index of a collection of documents
          3. A best-matching algorithm for words
          4. A genetic algorithm
          5. A keyword extraction algorithm
          6. A k-means clustering algorithm
          7. A filtering data algorithm
          8. Searching an inverted index
          9. A numeric summarization algorithm
          10. A search algorithm without indexing
          11. A recommendation system using the Map and Collect model
        4. Summary
      18. 11. Testing and Monitoring Concurrent Applications
        1. Monitoring concurrency objects
          1. Monitoring a thread
          2. Monitoring a lock
          3. Monitoring an executor
          4. Monitoring the Fork/Join framework
          5. Monitoring a Phaser
          6. Monitoring a stream
        2. Monitoring concurrency applications
          1. The Overview tab
          2. The Monitor tab
          3. The Threads tab
          4. The Sampler tab
          5. The Profiler tab
        3. Testing concurrency applications
          1. Testing concurrent applications with MultithreadedTC
          2. Testing concurrent applications with Java Pathfinder
            1. Installing Java Pathfinder
            2. Running Java Pathfinder
        4. Summary
      19. Index