Objectives |
WP4: Content Description for Audio, Speech and Text
The aim of this work package is to prepare numerical multimedia data for knowledge extraction and multimodal semantic association. Tasks will focus on the individual modalities of text and audio (music, sound, speech), as well as on combinations of them, to extract semantic concepts. As most approaches in this domain depend heavily on machine learning techniques, there will be tight cooperation with WP5 on Learning and Computation. Furthermore, while the focus of this WP is on acoustic and textual modalities, cooperation with WP3 on Content Description for Image and Video and WP6 on Cross-Modal Integration for Multimedia Content will ensure a more robust extraction of semantic descriptors. Wherever possible, methods shall be evaluated on benchmark data sets available via WP2 on Evaluation, Integration and Standards.
Concerning audio processing, we aim to:
Concerning speech processing, we aim to:
Concerning text and natural language processing, we aim to:
|