Movie Summarisation and Skimming Demonstrator

Leader: Petros Maragos, ICCS-NTUA

  • ICCS-NTUA, Greece (P. Maragos, K. Rapantzikos, G. Evangelopoulos, I. Avrithis)
  • AUTH, Greece (C. Kotropoulos, P. Antonopoulos, V. Moschou, N. Nikolaidis, I. Pitas)
  • INRIA-Texmex, France (P. Gros, X. Naturel)
  • TSI-TUC, Greece (A. Potamianos, E. Petrakis, M. Perakakis, M. Toutoudakis)

Objective: As the amount of video data (movies, TV programs, clips) stored on a personal recorder or computer grows increasingly large (100 hours on VCRs, or hundreds of hours on a PC), intelligent algorithms for efficiently representing video data and presenting it to the user are becoming important. Video summarization, movie summarization and movie skimming are increasingly popular research areas with immediate applications. We propose to: (i) use combined audio and video saliency detectors to estimate the importance of movie content to the user and (ii) design an interface that presents the audio and video information to the user in compressed form, thus saving time with little or no loss of information. The proposed demonstrator will be able to reduce a movie from its typical 2-hour duration to 30 minutes by skimming over (fast-forwarding or omitting) non-salient scenes while playing back at regular speed the parts of the movie with salient audio and video information. The interface will also be able to break the synchrony of the audio and video streams and selectively present audio or video information.
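The skimming idea can be illustrated with a minimal sketch. It assumes each scene already carries a single saliency score, and it uses a simple greedy selection that keeps the most salient scenes fitting a time budget and omits the rest. The `Scene` type, the `plan_skim` function and the greedy strategy are hypothetical illustrations, not the demonstrator's actual method.

```python
from dataclasses import dataclass


@dataclass
class Scene:
    start: float     # scene start time in seconds
    end: float       # scene end time in seconds
    saliency: float  # hypothetical combined audio-visual score in [0, 1]

    @property
    def duration(self) -> float:
        return self.end - self.start


def plan_skim(scenes, target_seconds):
    """Greedily keep the most salient scenes that fit into target_seconds.

    Kept scenes are played at regular speed; all other scenes are omitted.
    The result is returned in chronological order.
    """
    kept = []
    remaining = target_seconds
    # Visit scenes from most to least salient.
    for scene in sorted(scenes, key=lambda s: s.saliency, reverse=True):
        if scene.duration <= remaining:
            kept.append(scene)
            remaining -= scene.duration
    # Restore the original movie order for playback.
    return sorted(kept, key=lambda s: s.start)


# Toy example: a 40-minute clip skimmed down to at most 25 minutes.
scenes = [
    Scene(0, 600, 0.2),
    Scene(600, 1200, 0.9),
    Scene(1200, 1500, 0.5),
    Scene(1500, 2400, 0.8),
]
skim = plan_skim(scenes, target_seconds=1500)
total = sum(s.duration for s in skim)
```

A real skimmer could also fast-forward the non-salient scenes instead of dropping them, which would change how the time budget is spent.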

Description of work: The proposed system consists of a database server, video middleware and a user client. The database server stores the video content, the video middleware analyzes and parses the video, and the user client acts as the user interface, presenting the semantic audio/video information to the user according to his or her preferences. The annotated movie database at AUTH will be used as the database backend and to train saliency models for the system. Video segmentation software that separates movies into scenes will be designed by ICCS, AUTH and INRIA; INRIA will use statistical models for this purpose. The audio-visual saliency detector will use both audio and video information at various time resolutions and will be designed jointly by ICCS and AUTH. Finally, the user interface will be designed and implemented by TSI-TUC. This interface will give the user the ability to skim through the movie at various speeds and/or view movie summaries. User preferences will also be taken into account.
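As a rough illustration of combining audio and video saliency at several time resolutions, the sketch below fuses two per-frame score sequences with a weighted sum and then averages moving-average smoothings of the result over several window sizes. The function names, the weighting scheme and the window sizes are assumptions made for this example; the project text does not specify how the detector works.

```python
def moving_average(values, window):
    """Centered moving average; the window shrinks at the sequence edges.

    `window` is expected to be odd so the window is symmetric around each frame.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def fuse_saliency(audio, video, windows=(1, 5, 25), audio_weight=0.5):
    """Fuse per-frame audio and video saliency over several time resolutions.

    Hypothetical scheme: weighted sum of the two streams, smoothed at each
    window size, then averaged across window sizes.
    """
    if len(audio) != len(video):
        raise ValueError("audio and video must have the same number of frames")
    fused = [audio_weight * a + (1.0 - audio_weight) * v
             for a, v in zip(audio, video)]
    per_resolution = [moving_average(fused, w) for w in windows]
    # Average the smoothed curves frame by frame.
    return [sum(frame) / len(windows) for frame in zip(*per_resolution)]


# Toy example: both streams agree on a single salient frame in the middle.
audio = [0.0, 0.0, 1.0, 0.0, 0.0]
video = [0.0, 0.0, 1.0, 0.0, 0.0]
curve = fuse_saliency(audio, video, windows=(1, 3))
```

Per-scene scores could then be obtained by averaging such a curve over each scene's frames, once the segmentation software has provided the scene boundaries.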

The project deliverable will be a live demonstration of the movie summarizer/skimmer as well as a video demonstrating the various components of the system and example usage.