MPEG recently completed work on Compact Descriptors for Visual Search (CDVS), a standard that enables efficient search in large-scale image collections. While this standard is an important step forward, open challenges remain because of the large and quickly growing amount of video, for example in the media and entertainment industry, the automotive industry, and surveillance applications. Video is more than a collection of images: both the temporal redundancy of video and the spatiotemporal behaviour of objects in it need to be taken into account.
Compact Descriptors for Video Analysis (CDVA) aims at developing tools to analyse and manage video content, including search for object instances in video, categorisation of scenes and content grouping. These tools are based on compact descriptors for video, which can be efficiently matched and indexed for large-scale video collections. The ongoing work on CDVA targets search and retrieval applications, aiming to find a specific object instance (e.g., a specific building or product) in a very large video database. Applications include, for example, content management in media production, linking to objects in interactive media services, and surveillance.
The data captured in CDVA descriptors will make it possible to include media streams and content in Big Data analyses – MPEG refers to this as “Big Media”.