Graph-Based Rhythm Interpretation in Optical Music Recognition

Bibliographic Details
Main Author: Jin, Rong
Language: English
Published: Indiana University 2017
Online Access: http://pqdtopen.proquest.com/#viewpdf?dispub=10642136
Description
Summary: Optical Music Recognition (OMR) is the process of automatically converting an image of a music score into symbolic data. OMR can be divided into two main steps: recognition, whose goal is to identify "valid" music symbols, and interpretation, whose goal is to recover the musical meaning of those symbols, such as pitch and rhythm. We focus on the interpretation problem and, more specifically, on rhythm interpretation in piano scores.

In this thesis, we propose a graph-based algorithm that interprets rhythm by building a rhythm graph over all symbols in a system measure. The notes and rests in the system measure are the vertices of the graph, and we build the graph by adding voice edges and coincidence edges between pairs of vertices. Construction is constrained so that every candidate graph leads to a meaningful rhythm interpretation. We score each candidate graph according to music notation rules and choose the one with the best score, which turns the problem into a constrained optimization problem: finding the graph with the highest score. The rhythmic interpretation then follows directly from the connected rhythm graph.

To evaluate the algorithm, we perform an experiment on a dataset built specifically to cover the different types of rhythmic challenges encountered in polyphonic piano scores. We conclude that the algorithm is capable of applying measure-level notation rules and finding the globally optimal interpretation, even in examples with splitting and merging voices as well as missing tuplets.
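The last step of the approach, reading a rhythm off a connected rhythm graph, can be illustrated with a minimal sketch. This is not the thesis implementation. It assumes, for illustration only, that a voice edge (u, v) means v starts when u ends, and that a coincidence edge (u, v) means u and v start at the same time. The function name `interpret_rhythm`, the vertex labels, and the input format are all hypothetical. The scoring and search over candidate graphs are not shown.

```python
from collections import deque
from fractions import Fraction


def interpret_rhythm(durations, voice_edges, coincidence_edges, start):
    """Propagate onset times through a rhythm graph (illustrative sketch).

    durations: dict mapping each vertex to its duration in whole notes.
    voice_edges: (u, v) pairs, assumed to mean v follows u in one voice.
    coincidence_edges: (u, v) pairs, assumed to mean u and v start together.
    start: vertex assigned onset 0.
    Returns a dict of onsets, or None if the graph is contradictory
    or not connected.
    """
    # adj[u] holds (v, offset) such that onset[v] = onset[u] + offset.
    adj = {v: [] for v in durations}
    for u, v in voice_edges:
        adj[u].append((v, durations[u]))
        adj[v].append((u, -durations[u]))
    for u, v in coincidence_edges:
        adj[u].append((v, Fraction(0)))
        adj[v].append((u, Fraction(0)))

    onsets = {start: Fraction(0)}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v, offset in adj[u]:
            t = onsets[u] + offset
            if v not in onsets:
                onsets[v] = t
                queue.append(v)
            elif onsets[v] != t:
                return None  # two paths disagree on the onset of v
    if len(onsets) != len(durations):
        return None  # some vertex is unreachable: graph is not connected
    return onsets


# Toy measure: two quarter notes in one voice against a half note in another.
durations = {"q1": Fraction(1, 4), "q2": Fraction(1, 4), "h1": Fraction(1, 2)}
consistent = interpret_rhythm(
    durations,
    voice_edges=[("q1", "q2")],
    coincidence_edges=[("q1", "h1")],
    start="q1",
)
# Forcing h1 to coincide with both q1 and q2 is contradictory.
contradictory = interpret_rhythm(
    durations,
    voice_edges=[("q1", "q2")],
    coincidence_edges=[("q1", "h1"), ("q2", "h1")],
    start="q1",
)
```

Under these assumptions, a connected and consistent graph fixes every onset relative to the start vertex, while a contradictory edge set is rejected. A full system would additionally score the consistent candidates against notation rules and keep the best one.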