OpenCV: Computer Vision Projects with Python

Book Description

Get savvy with OpenCV and build cool computer vision applications

About This Book

  • Use OpenCV's Python bindings to capture video, manipulate images, and track objects
  • Learn about the different functions of OpenCV and how they are implemented
  • Develop a series of intermediate to advanced projects using OpenCV and Python

Who This Book Is For

This Learning Path is for anyone who has a working knowledge of Python and wants to try out OpenCV. It will take you from beginner to expert in building computer vision applications with OpenCV. OpenCV's applications are vast, and this Learning Path is a thorough, hands-on way to get acquainted with the library.

What You Will Learn

  • Install OpenCV and related software such as Python, NumPy, SciPy, OpenNI, and SensorKinect - all on Windows, Mac or Ubuntu
  • Apply "curves" and other color transformations to simulate the look of old photos, movies, or video games
  • Apply geometric transformations to images, perform image filtering, and convert an image into a cartoon-like image
  • Recognize hand gestures in real time and perform hand-shape analysis based on the output of a Microsoft Kinect sensor
  • Reconstruct a 3D real-world scene from 2D camera motion and common camera reprojection techniques
  • Detect and recognize street signs using a cascade classifier and support vector machines (SVMs)
  • Identify emotional expressions in human faces using convolutional neural networks (CNNs) and SVMs
  • Strengthen your OpenCV2 skills and learn how to use new OpenCV3 features
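
As a taste of the "curves" technique listed above, here is a minimal sketch of applying a tone curve to an 8-bit image via a lookup table. It uses NumPy only, and the function and control-point names are illustrative, not taken from the book:

```python
import numpy as np

def apply_curve(image, curve_points):
    """Apply a tone curve to an 8-bit image via a 256-entry lookup table."""
    xs, ys = zip(*curve_points)
    # Interpolate the control points into a full 0-255 lookup table.
    lut = np.interp(np.arange(256), xs, ys).astype(np.uint8)
    # Index the table with the pixel values to remap every pixel at once.
    return lut[image]

# A gentle curve that lifts the midtones, as used for "old photo" looks.
curve = [(0, 0), (128, 160), (255, 255)]
img = np.full((2, 2), 128, dtype=np.uint8)  # flat mid-gray test image
print(apply_curve(img, curve))  # midtones lifted from 128 to 160
```

The same lookup-table idea underlies OpenCV's own `cv2.LUT` function, which the books apply to simulate film stocks and vintage color palettes.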

In Detail

OpenCV is a state-of-the-art computer vision library that supports a great variety of image and video processing operations, and its Python bindings let us run computer vision algorithms in real time. First, we will learn how to get started with OpenCV and OpenCV 3's Python API, and develop a computer vision application that tracks body parts. Then, we will build intermediate-level computer vision applications, such as making an object disappear from an image, identifying different shapes, reconstructing a 3D map from images, and building an augmented reality application. Finally, we'll move on to more advanced projects such as hand gesture recognition, tracking visually salient objects, and recognizing traffic signs and emotions on faces using support vector machines and multi-layer perceptrons, respectively.

This Learning Path combines some of the best that Packt has to offer in one complete, curated package. It includes content from the following Packt products:

  • OpenCV Computer Vision with Python by Joseph Howse
  • OpenCV with Python By Example by Prateek Joshi
  • OpenCV with Python Blueprints by Michael Beyeler

Style and approach

This course aims to create a smooth learning path that will teach you how to get started with OpenCV and OpenCV 3's Python API, and develop superb computer vision applications. Through this comprehensive course, you'll learn to build computer vision applications from start to finish, and more.

Downloading the example code for this book: You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the files e-mailed directly to you.

Table of Contents

  1. OpenCV: Computer Vision Projects with Python
    1. Table of Contents
    2. OpenCV: Computer Vision Projects with Python
    3. OpenCV: Computer Vision Projects with Python
    4. Credits
    5. Preface
      1. What this learning path covers
      2. What you need for this learning path
      3. Who this learning path is for
      4. Reader feedback
      5. Customer support
        1. Downloading the example code
        2. Errata
        3. Piracy
        4. Questions
    6. 1. Module 1
      1. 1. Setting up OpenCV
        1. Choosing and using the right setup tools
          1. Making the choice on Windows XP, Windows Vista, Windows 7, or Windows 8
            1. Using binary installers (no support for depth cameras)
            2. Using CMake and compilers
          2. Making the choice on Mac OS X Snow Leopard, Mac OS X Lion, or Mac OS X Mountain Lion
            1. Using MacPorts with ready-made packages
            2. Using MacPorts with your own custom packages
            3. Using Homebrew with ready-made packages (no support for depth cameras)
            4. Using Homebrew with your own custom packages
          3. Making the choice on Ubuntu 12.04 LTS or Ubuntu 12.10
            1. Using the Ubuntu repository (no support for depth cameras)
            2. Using CMake via a ready-made script that you may customize
          4. Making the choice on other Unix-like systems
        2. Running samples
        3. Finding documentation, help, and updates
        4. Summary
      2. 2. Handling Files, Cameras, and GUIs
        1. Basic I/O scripts
          1. Reading/Writing an image file
          2. Converting between an image and raw bytes
          3. Reading/Writing a video file
          4. Capturing camera frames
          5. Displaying camera frames in a window
        2. Project concept
        3. An object-oriented design
          1. Abstracting a video stream – managers.CaptureManager
          2. Abstracting a window and keyboard – managers.WindowManager
          3. Applying everything – cameo.Cameo
        4. Summary
      3. 3. Filtering Images
        1. Creating modules
        2. Channel mixing – seeing in Technicolor
          1. Simulating RC color space
          2. Simulating RGV color space
          3. Simulating CMV color space
        3. Curves – bending color space
          1. Formulating a curve
          2. Caching and applying a curve
          3. Designing object-oriented curve filters
          4. Emulating photo films
            1. Emulating Kodak Portra
            2. Emulating Fuji Provia
            3. Emulating Fuji Velvia
            4. Emulating cross-processing
        4. Highlighting edges
        5. Custom kernels – getting convoluted
        6. Modifying the application
        7. Summary
      4. 4. Tracking Faces with Haar Cascades
        1. Conceptualizing Haar cascades
        2. Getting Haar cascade data
        3. Creating modules
        4. Defining a face as a hierarchy of rectangles
        5. Tracing, cutting, and pasting rectangles
        6. Adding more utility functions
        7. Tracking faces
        8. Modifying the application
          1. Swapping faces in one camera feed
          2. Copying faces between camera feeds
        9. Summary
      5. 5. Detecting Foreground/Background Regions and Depth
        1. Creating modules
        2. Capturing frames from a depth camera
        3. Creating a mask from a disparity map
        4. Masking a copy operation
        5. Modifying the application
        6. Summary
      6. A. Integrating with Pygame
        1. Installing Pygame
        2. Documentation and tutorials
        3. Subclassing managers.WindowManager
        4. Modifying the application
        5. Further uses of Pygame
        6. Summary
      7. B. Generating Haar Cascades for Custom Targets
        1. Gathering positive and negative training images
        2. Finding the training executables
          1. On Windows
          2. On Mac, Ubuntu, and other Unix-like systems
        3. Creating the training sets and cascade
          1. Creating <negative_description>
          2. Creating <positive_description>
          3. Creating <binary_description> by running <opencv_createsamples>
          4. Creating <cascade> by running <opencv_traincascade>
        4. Testing and improving <cascade>
        5. Summary
    7. 2. Module 2
      1. 1. Detecting Edges and Applying Image Filters
        1. 2D convolution
        2. Blurring
          1. The size of the kernel versus the blurriness
        3. Edge detection
        4. Motion blur
          1. Under the hood
        5. Sharpening
          1. Understanding the pattern
        6. Embossing
        7. Erosion and dilation
          1. Afterthought
        8. Creating a vignette filter
          1. What's happening underneath?
          2. How do we move the focus around?
        9. Enhancing the contrast in an image
          1. How do we handle color images?
        10. Summary
      2. 2. Cartoonizing an Image
        1. Accessing the webcam
          1. Under the hood
        2. Keyboard inputs
          1. Interacting with the application
        3. Mouse inputs
          1. What's happening underneath?
        4. Interacting with a live video stream
          1. How did we do it?
        5. Cartoonizing an image
          1. Deconstructing the code
        6. Summary
      3. 3. Detecting and Tracking Different Body Parts
        1. Using Haar cascades to detect things
        2. What are integral images?
        3. Detecting and tracking faces
          1. Understanding it better
        4. Fun with faces
          1. Under the hood
        5. Detecting eyes
          1. Afterthought
        6. Fun with eyes
          1. Positioning the sunglasses
        7. Detecting ears
        8. Detecting a mouth
        9. It's time for a moustache
        10. Detecting a nose
        11. Detecting pupils
          1. Deconstructing the code
        12. Summary
      4. 4. Extracting Features from an Image
        1. Why do we care about keypoints?
        2. What are keypoints?
        3. Detecting the corners
        4. Good Features To Track
        5. Scale Invariant Feature Transform (SIFT)
        6. Speeded Up Robust Features (SURF)
        7. Features from Accelerated Segment Test (FAST)
        8. Binary Robust Independent Elementary Features (BRIEF)
        9. Oriented FAST and Rotated BRIEF (ORB)
        10. Summary
      5. 5. Creating a Panoramic Image
        1. Matching keypoint descriptors
          1. How did we match the keypoints?
          2. Understanding the matcher object
          3. Drawing the matching keypoints
        2. Creating the panoramic image
          1. Finding the overlapping regions
          2. Stitching the images
        3. What if the images are at an angle to each other?
          1. Why does it look stretched?
        4. Summary
      6. 6. Seam Carving
        1. Why do we care about seam carving?
        2. How does it work?
        3. How do we define "interesting"?
        4. How do we compute the seams?
        5. Can we expand an image?
        6. Can we remove an object completely?
          1. How did we do it?
        7. Summary
      7. 7. Detecting Shapes and Segmenting an Image
        1. Contour analysis and shape matching
        2. Approximating a contour
        3. Identifying the pizza with the slice taken out
        4. How to censor a shape?
        5. What is image segmentation?
          1. How does it work?
        6. Watershed algorithm
        7. Summary
      8. 8. Object Tracking
        1. Frame differencing
        2. Colorspace based tracking
        3. Building an interactive object tracker
        4. Feature based tracking
        5. Background subtraction
        6. Summary
      9. 9. Object Recognition
        1. Object detection versus object recognition
        2. What is a dense feature detector?
        3. What is a visual dictionary?
        4. What is supervised and unsupervised learning?
        5. What are Support Vector Machines?
          1. What if we cannot separate the data with simple straight lines?
        6. How do we actually implement this?
          1. What happened inside the code?
          2. How did we build the trainer?
        7. Summary
      10. 10. Stereo Vision and 3D Reconstruction
        1. What is stereo correspondence?
        2. What is epipolar geometry?
          1. Why are the lines different as compared to SIFT?
        3. Building the 3D map
        4. Summary
      11. 11. Augmented Reality
        1. What is the premise of augmented reality?
        2. What does an augmented reality system look like?
        3. Geometric transformations for augmented reality
        4. What is pose estimation?
        5. How to track planar objects?
          1. What happened inside the code?
        6. How to augment our reality?
          1. Mapping coordinates from 3D to 2D
          2. How to overlay 3D objects on a video?
          3. Let's look at the code
        7. Let's add some movements
        8. Summary
    8. 3. Module 3
      1. 1. Fun with Filters
        1. Planning the app
        2. Creating a black-and-white pencil sketch
          1. Implementing dodging and burning in OpenCV
          2. Pencil sketch transformation
        3. Generating a warming/cooling filter
          1. Color manipulation via curve shifting
          2. Implementing a curve filter by using lookup tables
          3. Designing the warming/cooling effect
        4. Cartoonizing an image
          1. Using a bilateral filter for edge-aware smoothing
          2. Detecting and emphasizing prominent edges
          3. Combining colors and outlines to produce a cartoon
        5. Putting it all together
          1. Running the app
          2. The GUI base class
            1. The GUI constructor
            2. Handling video streams
            3. A basic GUI layout
          3. A custom filter layout
        6. Summary
      2. 2. Hand Gesture Recognition Using a Kinect Depth Sensor
        1. Planning the app
        2. Setting up the app
          1. Accessing the Kinect 3D sensor
          2. Running the app
          3. The Kinect GUI
        3. Tracking hand gestures in real time
        4. Hand region segmentation
          1. Finding the most prominent depth of the image center region
          2. Applying morphological closing to smoothen the segmentation mask
          3. Finding connected components in a segmentation mask
        5. Hand shape analysis
          1. Determining the contour of the segmented hand region
          2. Finding the convex hull of a contour area
          3. Finding the convexity defects of a convex hull
        6. Hand gesture recognition
          1. Distinguishing between different causes of convexity defects
          2. Classifying hand gestures based on the number of extended fingers
        7. Summary
      3. 3. Finding Objects via Feature Matching and Perspective Transforms
        1. Tasks performed by the app
        2. Planning the app
        3. Setting up the app
          1. Running the app
          2. The FeatureMatching GUI
        4. The process flow
        5. Feature extraction
          1. Feature detection
          2. Detecting features in an image with SURF
        6. Feature matching
          1. Matching features across images with FLANN
          2. The ratio test for outlier removal
          3. Visualizing feature matches
          4. Homography estimation
          5. Warping the image
        7. Feature tracking
          1. Early outlier detection and rejection
        8. Seeing the algorithm in action
        9. Summary
      4. 4. 3D Scene Reconstruction Using Structure from Motion
        1. Planning the app
        2. Camera calibration
          1. The pinhole camera model
          2. Estimating the intrinsic camera parameters
            1. The camera calibration GUI
            2. Initializing the algorithm
            3. Collecting image and object points
            4. Finding the camera matrix
        3. Setting up the app
          1. The main function routine
          2. The SceneReconstruction3D class
        4. Estimating the camera motion from a pair of images
          1. Point matching using rich feature descriptors
          2. Point matching using optic flow
          3. Finding the camera matrices
          4. Image rectification
        5. Reconstructing the scene
        6. 3D point cloud visualization
        7. Summary
      5. 5. Tracking Visually Salient Objects
        1. Planning the app
        2. Setting up the app
          1. The main function routine
          2. The Saliency class
          3. The MultiObjectTracker class
        3. Visual saliency
          1. Fourier analysis
          2. Natural scene statistics
          3. Generating a Saliency map with the spectral residual approach
          4. Detecting proto-objects in a scene
        4. Mean-shift tracking
          1. Automatically tracking all players on a soccer field
          2. Extracting bounding boxes for proto-objects
          3. Setting up the necessary bookkeeping for mean-shift tracking
          4. Tracking objects with the mean-shift algorithm
        5. Putting it all together
        6. Summary
      6. 6. Learning to Recognize Traffic Signs
        1. Planning the app
        2. Supervised learning
          1. The training procedure
          2. The testing procedure
          3. A classifier base class
        3. The GTSRB dataset
          1. Parsing the dataset
        4. Feature extraction
          1. Common preprocessing
          2. Grayscale features
          3. Color spaces
          4. Speeded Up Robust Features
          5. Histogram of Oriented Gradients
        5. Support Vector Machine
          1. Using SVMs for Multi-class classification
          2. Training the SVM
          3. Testing the SVM
            1. Confusion matrix
            2. Accuracy
            3. Precision
            4. Recall
        6. Putting it all together
        7. Summary
      7. 7. Learning to Recognize Emotions on Faces
        1. Planning the app
        2. Face detection
          1. Haar-based cascade classifiers
          2. Pre-trained cascade classifiers
          3. Using a pre-trained cascade classifier
          4. The FaceDetector class
            1. Detecting faces in grayscale images
            2. Preprocessing detected faces
        3. Facial expression recognition
          1. Assembling a training set
            1. Running the screen capture
            2. The GUI constructor
            3. The GUI layout
            4. Processing the current frame
            5. Adding a training sample to the training set
            6. Dumping the complete training set to a file
          2. Feature extraction
            1. Preprocessing the dataset
            2. Principal component analysis
          3. Multi-layer perceptrons
            1. The perceptron
            2. Deep architectures
          4. An MLP for facial expression recognition
            1. Training the MLP
            2. Testing the MLP
            3. Running the script
        4. Putting it all together
        5. Summary
    9. A. Bibliography
    10. Index