E-teams of WP4

          Semantic from Audio (E-team 7)

    This eTeam will collaborate on different methods for audio feature extraction and their application in both supervised classification and unsupervised organization as a means to access and explore audio holdings such as sound archives or, in particular, music. The eTeam partners have strong expertise in extracting descriptors from audio data, specializing in music, instruments and other sounds. There is also expertise in text mining, so textual genre analysis will be combined with the audio-based approaches. In addition, the partners have core competencies in applying machine learning techniques to the analysis and structuring of content, and to its subsequent visualization in 2D as well as 3D environments.
    The feature sets evaluated in the classification activities will also be employed in unsupervised machine learning tasks to provide an automatic clustering of audio archives, which in turn serves as an interface for browsing and exploration. eTeam partners will contribute expertise in visualization, providing intuitive interfaces, both 2D visualizations and interactive 3D environments, for future access models to audio archives.
    Detailed Information: Semantic from Audio and Genre Classification for Music

    We will also investigate and develop special methods for extracting feature sets from audio and musical signals, and we will attempt a kind of "quasi"-object detection in audio signals. The extracted features will be used for several applications, such as music/instrument classification and retrieval (TU Wien, AUTH, ISTI-CNR) and dynamic optimization of a digital signal processing algorithm for the enhancement of recorded music (ISTI-CNR, CUED Cambridge). We will also work on extending a C++ library for audio processing and synthesis, which could be used to optimize the feature extraction processes and, more generally, for audio/musical applications (ISTI-CNR, UNIS).
    Detailed Information: Semantic from Audio: features, perception and synthesis

    Integration of structural and semantic models for multimedia metadata management (E-team 1)

    Overall, the purpose of the E-Team is to facilitate and promote communication and exchange. Since the partners in this E-Team have different areas of interest and expertise, we plan to discuss the difficulties in extracting and integrating multimedia data and metadata from different modes (text, images, video).
    The broad questions that we aim to consider are:
    1. What are the different requirements for recording and storing media?
    2. What are the outcomes/outputs from analysing different media?
    3. What is the analysis process/workflow for your media?
    4. What standards are used (e.g. Dublin Core (DC), MPEG-7)? What are their limitations or strengths?
    5. How are annotations defined and used? Specifically, what types of annotations are used, and how are they captured or extracted?
    Detailed Information: Integration of structural and semantic models for multimedia metadata management