Working with Video

When working with video we must consider several functions, including (of course) how to read and write video files. We must also think about how to actually play back such files on the screen.

The first thing we need is the CvCapture device. This structure contains the information needed for reading frames from a camera or video file. Depending on the source, we use one of two different calls to create and initialize a CvCapture structure.

CvCapture* cvCreateFileCapture( const char* filename );
CvCapture* cvCreateCameraCapture( int index );

In the case of cvCreateFileCapture(), we can simply give a filename for an MPG or AVI file and OpenCV will open the file and prepare to read it. If the open is successful and we are able to start reading frames, a pointer to an initialized CvCapture structure will be returned.

It is tempting to skip checking the return value on the assumption that nothing will go wrong. Don't do that here. The returned pointer will be NULL if for some reason the file could not be opened (e.g., if the file does not exist), but cvCreateFileCapture() will also return a NULL pointer if the codec with which the video is compressed is not known. The subtleties of compression codecs are beyond the scope of this book, but in general you will need to have the appropriate library already installed on your computer in order to read the video file. For example, if you want to read a file encoded with DIVX or MPG4 compression on a Windows machine, there are specific DLLs that provide the necessary resources to decode the video. This is why it is always important to check the return value of cvCreateFileCapture(): even if it works on one machine (where the needed DLL is available), it might not work on another machine (where that codec DLL is missing).

Once we have the CvCapture structure, we can begin reading frames and do a number of other things. But before we get into that, let's take a look at how to capture images from a camera.

The routine cvCreateCameraCapture() works very much like cvCreateFileCapture() except without the headache from the codecs.[45] In this case we give an identifier that indicates which camera we would like to access and how we expect the operating system to talk to that camera. The first part is just an identification number that is zero (0) when we have only one camera and that increments upward when there are multiple cameras on the same system. The other part of the identifier is called the domain of the camera and indicates (in essence) what type of camera we have. The domain can be any of the predefined constants shown in Table 4-3.

Table 4-3. Camera "domain" indicates where HighGUI should look for your camera

Camera capture constant    Numerical value
CV_CAP_ANY                 0
CV_CAP_MIL                 100
CV_CAP_VFW                 200
CV_CAP_V4L                 200
CV_CAP_V4L2                200
CV_CAP_FIREWIRE            300
CV_CAP_IEEE1394            300
CV_CAP_DC1394              300
CV_CAP_CMU1394             300

When we call cvCreateCameraCapture(), we pass in an identifier that is just the sum of the domain index and the camera index. For example:

CvCapture* capture = cvCreateCameraCapture( CV_CAP_FIREWIRE );

In this example, cvCreateCameraCapture() will attempt to open the first (i.e., number-zero) Firewire camera. In most cases, the domain is unnecessary when we have only one camera; it is sufficient to use CV_CAP_ANY (which is conveniently equal to 0, so we don't even have to type that in). One last useful hint before we move on: you can pass -1 to cvCreateCameraCapture(), which will cause OpenCV to open a window that allows you to select the desired camera.
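The "domain plus camera index" arithmetic can be sketched in plain C. This is only an illustration: the enum values are local copies of entries from Table 4-3 (redefined so the sketch compiles without the HighGUI headers), and the helper names are hypothetical. The two decoding helpers assume that domains are multiples of 100, that camera indices stay below 100, and that the identifier is non-negative.

```c
/* Local copies of a few domain values from Table 4-3; these are NOT the
   HighGUI definitions, just stand-ins so the sketch is self-contained. */
enum {
    DOMAIN_ANY      = 0,
    DOMAIN_VFW      = 200,
    DOMAIN_FIREWIRE = 300
};

/* Hypothetical helper: the identifier is the sum of domain and camera index. */
int make_camera_id(int domain, int camera_index)
{
    return domain + camera_index;
}

/* Hypothetical helpers: split an identifier back into its two parts,
   assuming domains are multiples of 100 and indices are below 100. */
int camera_domain(int id)
{
    return (id / 100) * 100;
}

int camera_index(int id)
{
    return id % 100;
}
```

Under these assumptions, make_camera_id(DOMAIN_FIREWIRE, 1) yields 301, which would name the second FireWire camera; an identifier of 0 is both "any domain" and "camera zero", which is why the domain can usually be omitted on a single-camera system.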

Reading Video

int        cvGrabFrame( CvCapture* capture );
IplImage*  cvRetrieveFrame( CvCapture* capture );
IplImage*  cvQueryFrame( CvCapture* capture );

Once you have a valid CvCapture object, you can start grabbing frames. There are two ways to do this. One way is to call cvGrabFrame(), which takes the CvCapture* pointer and returns an integer. This integer will be 1 if the grab was successful and 0 if the grab failed. The cvGrabFrame() function copies the captured image to an internal buffer that is invisible to the user. Why would you want OpenCV to put the frame somewhere you can't access it? The answer is that this grabbed frame is unprocessed, and cvGrabFrame() is designed simply to get it onto the computer as quickly as possible.

Once you have called cvGrabFrame(), you can then call cvRetrieveFrame(). This function will do any necessary processing on the frame (such as the decompression stage in the codec) and then return an IplImage* pointer that points to another internal buffer (so do not rely on this image, because it will be overwritten the next time you call cvGrabFrame()). If you want to do anything in particular with this image, copy it elsewhere first. Because this pointer points to a structure maintained by OpenCV itself, you are not required to release the image and can expect trouble if you do so.

Having said all that, there is a somewhat simpler method called cvQueryFrame(). This is, in effect, a combination of cvGrabFrame() and cvRetrieveFrame(); it also returns the same IplImage* pointer as cvRetrieveFrame() did.

It should be noted that, with a video file, the frame is automatically advanced whenever a cvGrabFrame() call is made. Hence a subsequent call will retrieve the next frame automatically.
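To make the buffer-ownership point concrete, here is a toy model in plain C. It is not OpenCV's implementation: ToyCapture, toy_grab(), toy_retrieve(), and toy_query() are hypothetical names that only mimic the behavior described above (one internal buffer, overwritten on every grab, never freed by the caller), and the "pixel data" is faked with a fill value.

```c
#include <string.h>

#define FRAME_BYTES 4

/* Toy stand-in for a capture source; NOT OpenCV's CvCapture. */
typedef struct {
    unsigned char internal[FRAME_BYTES]; /* buffer owned by the "library" */
    int next_frame;                      /* frame the next grab will read */
    int frame_count;                     /* total frames available */
} ToyCapture;

/* Mimics cvGrabFrame(): advance one frame; return 1 on success, 0 on failure. */
int toy_grab(ToyCapture* cap)
{
    if (cap->next_frame >= cap->frame_count) return 0;
    /* Fake pixel data: frame n is filled with the value 10 + n. */
    memset(cap->internal, 10 + cap->next_frame, FRAME_BYTES);
    cap->next_frame++;
    return 1;
}

/* Mimics cvRetrieveFrame(): return a pointer INTO the internal buffer. */
const unsigned char* toy_retrieve(const ToyCapture* cap)
{
    return cap->internal;
}

/* Mimics cvQueryFrame(): grab, then retrieve; NULL when no frame is left. */
const unsigned char* toy_query(ToyCapture* cap)
{
    return toy_grab(cap) ? toy_retrieve(cap) : NULL;
}
```

In this model, a pointer returned by the first toy_query() call sees the second frame's data after the second call, because both calls return the same address. A caller that needs the first frame later has to memcpy() it into its own storage before grabbing again, and must never free the returned pointer.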

Once you are done with the CvCapture device, you can release it with a call to cvReleaseCapture(). As with most other de-allocators in OpenCV, this routine takes a pointer to the CvCapture* pointer:

void cvReleaseCapture( CvCapture** capture );
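One practical reason for a release routine to take a pointer to the caller's pointer is that it can reset that pointer after freeing the resource, so a stale pointer is not used by accident. The sketch below illustrates the convention with a toy resource in plain C; ToyResource, toy_create(), and toy_release() are hypothetical names, and the code makes no claim about how OpenCV implements its own release functions.

```c
#include <stdlib.h>

/* Toy resource standing in for a capture; NOT OpenCV's CvCapture. */
typedef struct {
    int is_open;
} ToyResource;

/* Hypothetical constructor: returns NULL if allocation fails. */
ToyResource* toy_create(void)
{
    ToyResource* r = malloc(sizeof *r);
    if (r != NULL) r->is_open = 1;
    return r;
}

/* Release through a pointer to the caller's pointer, so the caller's
   copy can be reset to NULL once the memory has been freed. */
void toy_release(ToyResource** r)
{
    if (r == NULL || *r == NULL) return;  /* nothing to release */
    free(*r);
    *r = NULL;
}
```

With this pattern the call site reads toy_release(&resource), mirroring cvReleaseCapture(&capture), and a second release of the same variable is harmless because the pointer is already NULL.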

There are many other things we can do with the CvCapture structure. In particular, we can check and set various properties of the video source:

double cvGetCaptureProperty(
  CvCapture* capture,
  int property_id
);

int cvSetCaptureProperty(
  CvCapture* capture,
  int        property_id,
  double     value
);

The routine cvGetCaptureProperty() accepts any of the property IDs shown in Table 4-4.

Table 4-4. Video capture properties used by cvGetCaptureProperty() and cvSetCaptureProperty()

Video capture property       Numerical value
CV_CAP_PROP_POS_MSEC         0
CV_CAP_PROP_POS_FRAMES       1
CV_CAP_PROP_POS_AVI_RATIO    2
CV_CAP_PROP_FRAME_WIDTH      3
CV_CAP_PROP_FRAME_HEIGHT     4
CV_CAP_PROP_FPS              5
CV_CAP_PROP_FOURCC           6
CV_CAP_PROP_FRAME_COUNT      7

Most of these properties are self-explanatory. POS_MSEC is the current position in a video file, measured in milliseconds. POS_FRAMES is the current position as a frame number. POS_AVI_RATIO is the position given as a number between 0 and 1 (this is actually quite useful when you want to position a trackbar to allow folks to navigate around your video). FRAME_WIDTH and FRAME_HEIGHT are the dimensions of the individual frames of the video to be read (or to be captured at the camera's current settings). FPS is specific to video files and indicates the number of frames per second at which the video was captured; you will need to know this if you want to play back your video and have it come out at the right speed. FOURCC is the four-character code for the compression codec used by the video you are currently reading. FRAME_COUNT should be the total number of frames in the video, but this figure is not entirely reliable.
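Two of these properties usually need a small conversion before they are useful. The sketch below shows one way to turn a frame rate into a per-frame delay in milliseconds (the kind of value one might hand to a wait call between frames) and to map a 0-to-1 position onto an integer slider range. Both helper names are hypothetical, and the fallback delay for an unusable frame rate is an assumption of this sketch rather than anything OpenCV prescribes.

```c
/* Hypothetical helper: milliseconds to wait between frames so that playback
   runs at roughly the captured speed. If the reported rate is zero, negative,
   or NaN, the caller-supplied fallback is returned instead. */
int frame_delay_ms(double fps, int fallback_ms)
{
    if (!(fps > 0.0)) return fallback_ms;      /* also rejects NaN */
    int delay = (int)(1000.0 / fps + 0.5);     /* round to nearest ms */
    return delay > 0 ? delay : 1;              /* never return a zero delay */
}

/* Hypothetical helper: map a position ratio in [0, 1] onto a slider whose
   positions run from 0 to slider_max, clamping out-of-range input. */
int ratio_to_slider(double ratio, int slider_max)
{
    if (ratio < 0.0) ratio = 0.0;
    if (ratio > 1.0) ratio = 1.0;
    return (int)(ratio * slider_max + 0.5);    /* round to nearest position */
}
```

For a 25 fps file, frame_delay_ms(25.0, 33) gives 40 ms per frame. Real playback would also have to subtract the time spent decoding and displaying each frame, which this sketch ignores.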

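All of these values are returned as type double, which is perfectly reasonable except for the case of FOURCC (FourCC) [FourCC85]. Here you will have to convert the result to an integer and then reinterpret its bytes as characters, as shown in Example 4-3.

Example 4-3. Unpacking a four-character code to identify a video codec

// The property comes back as a double; convert it to an integer first.
// Casting the address of the double itself would expose the bytes of the
// floating-point representation, not the four characters.
int f = (int) cvGetCaptureProperty(
  capture,
  CV_CAP_PROP_FOURCC
);

// Valid on little-endian machines; the four bytes have no terminating '\0'.
char* fourcc = (char*) (&f);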
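As an alternative to reinterpreting memory through a pointer cast, the four characters can be extracted with shifts and masks, which does not depend on the machine's byte order and produces a properly terminated string. The helper below is a hypothetical sketch, not part of OpenCV; it assumes the first character of the code sits in the lowest byte and that the property value is a non-negative number that fits in an unsigned int.

```c
/* Hypothetical helper: unpack a FOURCC property value into a printable,
   NUL-terminated string. Assumes the first character is stored in the
   lowest byte of the packed code. */
void fourcc_to_string(double property_value, char out[5])
{
    /* The property arrives as a double; convert it to an integer code. */
    unsigned int code = (unsigned int) property_value;

    out[0] = (char)( code        & 0xFF);
    out[1] = (char)((code >> 8)  & 0xFF);
    out[2] = (char)((code >> 16) & 0xFF);
    out[3] = (char)((code >> 24) & 0xFF);
    out[4] = '\0';                         /* make it safe to print */
}
```

A caller would declare char name[5], pass it the value obtained from the FOURCC property, and then print name with an ordinary "%s" format.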

Any of these video capture properties can also be passed to cvSetCaptureProperty(), which will attempt to set the property. Not all of these settings are meaningful; for example, you should not be setting the FOURCC of a video you are currently reading. Attempting to move around the video by setting one of the position properties will work, but only for some video codecs (we'll have more to say about video codecs in the next section).

Writing Video

The other thing we might want to do with video is write it out to disk. OpenCV makes this easy; it is essentially the same as reading video but with a few extra details.

First we must create a CvVideoWriter device, which is the video writing analogue of CvCapture. This device is used with the following functions.

CvVideoWriter* cvCreateVideoWriter(
  const char* filename,
  int         fourcc,
  double      fps,
  CvSize      frame_size,
  int         is_color = 1
);
int cvWriteFrame(
  CvVideoWriter*  writer,
  const IplImage* image
);
void cvReleaseVideoWriter(
  CvVideoWriter** writer
);

You will notice that the video writer requires a few extra arguments. In addition to the filename, we have to tell the writer what codec to use, what the frame rate is, and how big the frames will be. Optionally, we can tell OpenCV whether the frames are black and white or color (the default is color).

Here, the codec is indicated by its four-character code. (For those of you who are not experts in compression codecs, each codec has a unique four-character identifier associated with it.) In this case the int that is named fourcc in the argument list for cvCreateVideoWriter() is actually the four characters of the fourcc packed together. Since this comes up relatively often, OpenCV provides a convenient macro CV_FOURCC(c0,c1,c2,c3) that will do the bit packing for you.
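The packing itself is simple enough to sketch as an ordinary function. The version below is a hypothetical stand-in written for illustration, not the macro's actual definition; it assumes the first character goes into the lowest byte and each following character into the next higher byte.

```c
/* Hypothetical function equivalent of the bit packing: c0 lands in the
   lowest byte, c3 in the highest. Each character is masked to 8 bits so
   that a signed char cannot smear sign bits into the other fields. */
int pack_fourcc(char c0, char c1, char c2, char c3)
{
    unsigned int code =  (unsigned int)(c0 & 0xFF)
                      | ((unsigned int)(c1 & 0xFF) << 8)
                      | ((unsigned int)(c2 & 0xFF) << 16)
                      | ((unsigned int)(c3 & 0xFF) << 24);
    return (int) code;
}
```

Under that assumption, pack_fourcc('D','I','V','X') produces 0x58564944: reading the hexadecimal bytes from right to left gives 0x44 ('D'), 0x49 ('I'), 0x56 ('V'), and 0x58 ('X').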

Once you have a video writer, all you have to do is call cvWriteFrame() and pass in the CvVideoWriter* pointer and the IplImage* pointer for the image you want to write out.

Once you are finished, you must call cvReleaseVideoWriter() in order to close the writer and the file you were writing to. Even if you are normally a bit sloppy about de-allocating things at the end of a program, do not be sloppy about this. Unless you explicitly release the video writer, the video file to which you are writing may be corrupted.



[45] Of course, to be completely fair, we should probably confess that the headache caused by different codecs has been replaced by the analogous headache of determining which cameras are (or are not) supported on our system.
