In this paper, https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/42455.pdf, the authors explore how CNNs could be used for large-scale video classification. In this use case, the neural networks have access to not only the appearance information in single, static images, but also the complex temporal evolution of the image. There are several challenges in extending and applying CNNs in this setting.
There are very few (or none) video classification benchmarks that match the scale and variety of existing image datasets as videos are significantly more challenging to collect, annotate, and store. To obtain sufficient amount of data needed to train our CNN architectures, ...