Mastering Python High Performance

Book Description

Measure, optimize, and improve the performance of your Python code with this easy-to-follow guide

About This Book

  • Master the do's and don'ts of Python performance programming

  • Learn how to use exciting new tools that will help you improve your scripts

  • A step-by-step, conceptual guide to teach you how to optimize and fine-tune your critical pieces of code

Who This Book Is For

    If you're a Python developer looking to improve the speed of your scripts or simply wanting to take your skills to the next level, then this book is perfect for you.

What You Will Learn

  • Master code optimization step-by-step and learn how to use different tools

  • Understand what a profiler is and how to read its output

  • Interpret visual output from profiling tools and improve the performance of your script

  • Use Cython to create fast applications using Python and C

  • Take advantage of PyPy to improve performance of Python code

  • Optimize number-crunching code with NumPy, Numba, Parakeet, and Pandas

In Detail

    Simply knowing how to code is not enough; in mission-critical code, every bit of memory and every CPU cycle counts, and knowing how to squeeze every bit of processing power out of your code is a crucial and sought-after skill. Nowadays, Python is used for many scientific projects, and the calculations in those projects sometimes require serious fine-tuning. Profilers are tools designed to measure the performance of your code and guide you through the optimization process, so knowing how to use them and read their output is very handy.

    This book starts from the basics and progressively moves on to more advanced topics. You’ll learn everything from profiling all the way up to writing a real-life application and applying a full set of tools designed to improve it in different ways. Along the way, you’ll stop to learn about the major profilers used in Python and about some graphical tools that help you make sense of their output. You’ll then move from generic optimization techniques on to Python-specific ones, going over the main constructs of the language that will help you improve your speed without much of a change. Finally, the book covers some number-crunching libraries and how to use them properly to get the best speed out of them.

    After reading this book, you will know how to take any Python code, profile it, find out where the bottlenecks are, and apply different techniques to remove them.

Style and approach

    This easy-to-follow, practical guide will help you enhance your optimization skills by improving real-world code.

    Downloading the example code for this book: you can download the example code files for all Packt books you have purchased from your account. If you purchased this book elsewhere, you can register to have the files e-mailed directly to you.

Table of Contents

    1. Mastering Python High Performance
      1. Table of Contents
      2. Mastering Python High Performance
      3. Credits
      4. About the Author
      5. About the Reviewers
        1. Support files, eBooks, discount offers, and more
          1. Why subscribe?
          2. Free access for Packt account holders
      7. Preface
        1. What this book covers
        2. What you need for this book
        3. Who this book is for
        4. Conventions
        5. Reader feedback
        6. Customer support
          1. Downloading the example code
          2. Downloading the color images of this book
          3. Errata
          4. Piracy
          5. Questions
      8. 1. Profiling 101
        1. What is profiling?
          1. Event-based profiling
          2. Statistical profiling
        2. The importance of profiling
        3. What can we profile?
          1. Execution time
          2. Where are the bottlenecks?
        4. Memory consumption and memory leaks
        5. The risk of premature optimization
        6. Running time complexity
          1. Constant time – O(1)
          2. Linear time – O(n)
          3. Logarithmic time – O(log n)
          4. Linearithmic time – O(n log n)
          5. Factorial time – O(n!)
          6. Quadratic time – O(n^2)
        7. Profiling best practices
          1. Build a regression-test suite
          2. Mind your code
          3. Be patient
          4. Gather as much data as you can
          5. Preprocess your data
          6. Visualize your data
        8. Summary
      9. 2. The Profilers
        1. Getting to know our new best friends: the profilers
          1. cProfile
          2. A note about limitations
          3. The API provided
          4. The Stats class
          5. Profiling examples
            1. Fibonacci again
            2. Tweet stats
        2. line_profiler
          1. kernprof
          2. Some things to consider about kernprof
          3. Profiling examples
            1. Back to Fibonacci
            2. Inverted index
              1. getOffsetUpToWord
              2. getWords
              3. list2dict
              4. readFileContent
              5. saveIndex
              6. __start__
              7. getOffsetUpToWord
              8. getWords
              9. list2dict
              10. saveIndex
        3. Summary
      10. 3. Going Visual – GUIs to Help Understand Profiler Output
        1. KCacheGrind – pyprof2calltree
          1. Installation
          2. Usage
          3. A profiling example – TweetStats
          4. A profiling example – Inverted Index
        2. RunSnakeRun
          1. Installation
          2. Usage
          3. Profiling examples – the lowest common multiplier
          4. A profiling example – search using the inverted index
        3. Summary
      11. 4. Optimize Everything
        1. Memoization / lookup tables
          1. Performing a lookup on a list or linked list
          2. Simple lookup on a dictionary
          3. Binary search
          4. Use cases for lookup tables
        2. Usage of default arguments
        3. List comprehension and generators
        4. ctypes
          1. Loading your own custom C library
          2. Loading a system library
        5. String concatenation
        6. Other tips and tricks
        7. Summary
      12. 5. Multithreading versus Multiprocessing
        1. Parallelism versus concurrency
          1. Multithreading
            1. Threads
              1. Creating a thread with the thread module
              2. Working with the threading module
              3. Interthread communication with events
          2. Multiprocessing
            1. Multiprocessing with Python
              1. Exit status
              2. Process pooling
              3. Interprocess communication
                1. Pipes
                2. Events
        2. Summary
      13. 6. Generic Optimization Options
        1. PyPy
          1. Installing PyPy
          2. A Just-in-time compiler
          3. Sandboxing
          4. Optimizing for the JIT
            1. Think of functions
            2. Consider using cStringIO to concatenate strings
            3. Actions that disable the JIT
          5. Code sample
        2. Cython
          1. Installing Cython
          2. Building a Cython module
          3. Calling C functions
            1. Solving naming conflicts
          4. Defining types
          5. Defining types during function definitions
          6. A Cython example
          7. When to define a type
          8. Limitations
            1. Generator expressions
            2. Comparison of char* literals
            3. Tuples as function arguments
            4. Stack frames
        3. How to choose the right option
          1. When to go with Cython
          2. When to go with PyPy
        4. Summary
      14. 7. Lightning Fast Number Crunching with Numba, Parakeet, and pandas
        1. Numba
          1. Installation
          2. Using Numba
            1. Numba's code generation
              1. Eager compilation
              2. Other configuration settings
                1. No GIL
                2. NoPython mode
            2. Running your code on the GPU
        2. The pandas tool
          1. Installing pandas
          2. Using pandas for data analysis
        3. Parakeet
          1. Installing Parakeet
          2. How does Parakeet work?
        4. Summary
      15. 8. Putting It All into Practice
        1. The problem to solve
          1. Getting data from the Web
          2. Postprocessing the data
        2. The initial code base
          1. Analyzing the code
            1. Scraper
          2. Analyzer
        3. Summary
      16. Index