Muscle - Real-Time Audio-Visual ...

Real-Time Audio-Visual Automatic Speech Recognition Demonstrator

Leader: Alexandros Potamianos, TSI-TUC
Partners:

TSI-TUC (Alexandros Potamianos, Manolis Perakakis, Eduardo Sanchez-Soto, Fanis Kanetis)
ICCS-NTUA (Petros Maragos, George Papandreou, Nassos Katsamanis)
INRIA-IRISA (Patrick Gros, Guillaume Gravier)

Objective of the showcase project: Integrating visual information into the recognition process is one of the most promising ways to improve the performance and extend the applicability of Automatic Speech Recognition (ASR) systems. As a step towards practically deployable audio-visual ASR (AV-ASR), we are building a proof-of-concept laptop-based AV-ASR prototype which: (i) uses a consumer-grade microphone and camera to capture the speaker; (ii) performs audio and visual feature extraction, as well as speech recognition, on the laptop in real time; (iii) is robust to the failure of a single modality, such as visual occlusion of the speaker's face; and (iv) automatically adapts to changing acoustic noise levels.
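
Goals (iii) and (iv) can be illustrated with a small sketch of stream-weighted fusion, a common approach in the AV-ASR literature. This page does not describe the demonstrator's actual fusion module, so everything below is an assumption made for illustration: the function names, the logistic mapping from signal-to-noise ratio to audio reliability, its parameters, and the use of a face-tracker confidence score are all hypothetical.

```python
import math


def audio_weight(snr_db, face_confidence, snr_mid_db=10.0, slope=0.3):
    """Return a hypothetical audio stream weight in [0, 1].

    The visual stream weight is 1 minus the returned value.
    snr_db          -- estimated acoustic signal-to-noise ratio in dB (assumed input)
    face_confidence -- face-tracker confidence in [0, 1]; 0 means the face is lost
    snr_mid_db, slope -- illustrative logistic parameters, not taken from the project
    """
    if not 0.0 <= face_confidence <= 1.0:
        raise ValueError("face_confidence must lie in [0, 1]")
    # Clamp the SNR so the exponential below cannot overflow.
    snr_db = max(-40.0, min(60.0, snr_db))
    # Assumed mapping: audio reliability rises smoothly with SNR.
    audio_rel = 1.0 / (1.0 + math.exp(-slope * (snr_db - snr_mid_db)))
    # Assumed mapping: visual reliability equals the tracker confidence.
    visual_rel = face_confidence
    # Normalize so that the two stream weights sum to one.
    return audio_rel / (audio_rel + visual_rel)


def fuse_log_likelihoods(audio_ll, visual_ll, w_audio):
    """Combine per-class log-likelihoods of the two streams with linear weights."""
    return {
        label: w_audio * audio_ll[label] + (1.0 - w_audio) * visual_ll[label]
        for label in audio_ll
    }


if __name__ == "__main__":
    # Toy scores for two candidate words; the numbers are made up.
    audio_ll = {"yes": -1.0, "no": -3.0}
    visual_ll = {"yes": -4.0, "no": -2.0}
    for snr_db, conf in [(30.0, 1.0), (-5.0, 1.0), (30.0, 0.0)]:
        w = audio_weight(snr_db, conf)
        fused = fuse_log_likelihoods(audio_ll, visual_ll, w)
        print(snr_db, conf, round(w, 3), max(fused, key=fused.get))
```

In this sketch, an occluded face (confidence 0) pushes the full weight onto the audio stream, while a drop in SNR shifts weight towards the visual stream. A real system would estimate both reliabilities from the signals and tune the mapping on data.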

Completed Work: Audio/Video capture, Face Tracking, Audio/Visual Front-End

Ongoing Work: Back-End, Fusion Module, Graphical User Interface, Integration

Video of the Real-Time Audio-Visual Automatic Speech Recognition Demonstrator Showcase