Abstract

Music is fleeting by nature, existing briefly before fading away. Yet people have long wanted to preserve it over time, and many music notation systems have been developed with this intention. However, most of these formats were designed to be read by humans. Optical Music Recognition (OMR) is a research field dedicated to investigating how to computationally read music notation and create computer-readable scores from traditional scores. The typical OMR pipeline is divided into four stages: image preprocessing, music object detection, semantic reconstruction, and encoding. The third stage aims at reconstructing the semantics of the music notation by reestablishing the connections between the previously detected objects. This work focuses on scores written in the most common notation system, Common Western Music Notation (CWMN). In this context, the semantics are defined by the configuration of the musical primitives (such as accidentals, noteheads, or flags): how they are grouped and arranged defines their interactions and how the music should sound. Previous research has highlighted the graph-like nature of music notation and introduced the concept of the Music Notation Graph (MuNG): a graph constructed with the music primitives as nodes and their relations as edges. This graph structure makes the task a candidate for leveraging the power of Graph Neural Networks (GNNs). This master's thesis investigates how GNNs can be used to perform music semantic reconstruction. We propose a novel pipeline for using GNNs in OMR and discuss a few unsolved problems of the field, such as how to measure and compare the output of an OMR system and how to define MuNGs.
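To make the MuNG idea concrete, the following minimal Python sketch represents a handful of primitives as nodes and their relations as directed edges. It is an illustration only and not the thesis's implementation: the names Primitive, NotationGraph, and build_example, the class labels, the bounding-box layout, and the choice to point edges from a notehead to its attached primitives are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Primitive:
    """One detected music primitive, used as a graph node (hypothetical type)."""
    node_id: int
    class_name: str  # illustrative label such as "notehead", "stem", "flag"
    bbox: Tuple[int, int, int, int]  # assumed (top, left, bottom, right) in pixels


@dataclass
class NotationGraph:
    """Toy Music Notation Graph: primitives as nodes, relations as directed edges."""
    nodes: Dict[int, Primitive] = field(default_factory=dict)
    edges: Set[Tuple[int, int]] = field(default_factory=set)

    def add_node(self, primitive: Primitive) -> None:
        self.nodes[primitive.node_id] = primitive

    def add_edge(self, src: int, dst: int) -> None:
        # Refuse edges that reference primitives missing from the graph.
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.add((src, dst))

    def neighbors(self, node_id: int) -> List[int]:
        """Ids of the primitives that node_id points to, in ascending order."""
        return sorted(dst for src, dst in self.edges if src == node_id)


def build_example() -> NotationGraph:
    """A single flagged note with an accidental (made-up coordinates)."""
    graph = NotationGraph()
    graph.add_node(Primitive(0, "notehead", (100, 40, 110, 52)))
    graph.add_node(Primitive(1, "stem", (70, 50, 105, 52)))
    graph.add_node(Primitive(2, "flag", (70, 52, 90, 60)))
    graph.add_node(Primitive(3, "accidental", (95, 25, 115, 35)))
    # Assumed convention: the notehead points to the primitives attached to it.
    for attached in (1, 2, 3):
        graph.add_edge(0, attached)
    return graph


if __name__ == "__main__":
    g = build_example()
    for dst in g.neighbors(0):
        print(g.nodes[0].class_name, "->", g.nodes[dst].class_name)
```

In this framing, semantic reconstruction amounts to predicting the edge set from the detected nodes, which is the kind of task a GNN can be trained on.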

Reference

de Lambertye, G. (2024). Music Semantic Reconstruction with Deep Learning : Learning How to Construct Music Notation Graphs With Graphs Neural Networks [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2024.120481