Semantic from Audio and Genre Classification for Music

Contact person: Andreas Rauber

This eTeam will collaborate on different methods for audio feature extraction and their application in both supervised classification and unsupervised organization as a means to access and explore audio holdings such as sound archives or, in particular, music collections. The eTeam partners have strong expertise in extracting descriptors from audio data, specializing in music, instruments and other sounds. There is also expertise in text mining, so textual genre analysis will be combined with the audio-based approaches. Furthermore, the partners have core competencies in the application of machine learning techniques for the analysis and structuring of content, and in its subsequent visualization in 2D as well as 3D environments.

The eTeam will be a concentrated effort on:

  • feature extraction from audio
  • music classification (based on audio, symbolic notations, text mining and combined approaches)
  • instrument or sound classification
  • organization of music/sound by perceptual similarity
  • visualizations and access interfaces for the exploration of sound collections in 2D and 3D

Furthermore, the feature sets evaluated in the classification activities will be employed in unsupervised machine learning tasks in order to provide an automatic clustering of audio archives, which in turn serves as an interface for browsing and exploration. eTeam partners will bring in expertise on visualization, providing intuitive interfaces in the form of both 2D visualizations and interactive 3D environments for future access models to audio archives.
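As a rough illustration of what "automatic clustering of audio archives" can mean in practice, the following minimal sketch groups feature vectors with plain k-means. This is an assumption made for illustration only: the eTeam description does not name a specific clustering algorithm, and the function name `kmeans`, the parameters, and the toy two-dimensional feature values are all hypothetical.

```python
import math
import random

def kmeans(vectors, k, iterations=20, seed=0):
    """Group feature vectors into k clusters with plain k-means (illustrative only)."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each vector to its nearest centroid (Euclidean distance).
        assignment = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Move each centroid to the mean of the vectors assigned to it.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids

# Hypothetical feature vectors: two made-up descriptors per track.
features = [
    (0.10, 0.20), (0.15, 0.25), (0.12, 0.18),  # tracks that sound alike
    (0.90, 0.80), (0.85, 0.75), (0.95, 0.82),  # a second, different group
]
labels, centers = kmeans(features, k=2)
```

Tracks that end up with the same label could then be placed near each other in a browsing interface; real audio descriptors would have many more dimensions than this toy example.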

Results of the eTeam will be:

  • Evaluation of different feature sets for the classification of music, instruments and other sounds
  • Evaluation of multi-modal approaches (combining textual, symbolic and audio features)
  • Evaluation of the performance of those approaches with different classification methods
  • Evaluation of the feature sets in unsupervised machine learning
  • Visualization of audio archives by application of the feature sets
  • Development of prototypes of interactive applications (2D + 3D) for novel browsing methods for audio archives
  • Jointly written publications
  • Exchanges between the eTeam partners


Keywords: audio feature extraction, music genre classification, instrument classification, clustering, visualization of music archives

Additional info: eTeam's webpage