A Deeper Look at Sheet Music Composer Classification Using Self-Supervised Pretraining

This article studies a composer style classification task based on raw sheet music images. While previous works on composer recognition have relied exclusively on supervised learning, we explore the use of self-supervised pretraining methods that have been recently developed for natural language processing. We first convert sheet music images to sequences of musical words, train a language model on a large set of unlabeled musical “sentences”, initialize a classifier with the pretrained language model weights, and then finetune the classifier on a small set of labeled data. We conduct extensive experiments on International Music Score Library Project (IMSLP) piano data using a range of modern language model architectures. We show that pretraining substantially improves classification performance and that Transformer-based architectures perform best. We also introduce two data augmentation strategies and present evidence that the model learns generalizable and semantically meaningful information.
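
The abstract describes a pretrain-then-finetune pipeline: a language model is first trained on unlabeled sequences of musical words, and its weights are then used to initialize a classifier that is finetuned on labeled data. The PyTorch sketch below illustrates only that weight-transfer step, under stated assumptions: the vocabulary size, model dimensions, and composer count are hypothetical, and the paper's upstream conversion of sheet music images into musical words is not shown.

    # Minimal sketch of the pretrain-then-finetune recipe from the abstract.
    # VOCAB_SIZE, D_MODEL, and NUM_COMPOSERS are assumed values, not the paper's.
    import torch
    import torch.nn as nn

    VOCAB_SIZE = 30000     # assumed size of the musical-word vocabulary
    D_MODEL = 256          # assumed model width
    NUM_COMPOSERS = 9      # assumed number of composer classes

    class MusicLM(nn.Module):
        """Causal Transformer language model over musical-word tokens."""
        def __init__(self):
            super().__init__()
            self.embed = nn.Embedding(VOCAB_SIZE, D_MODEL)
            layer = nn.TransformerEncoderLayer(D_MODEL, nhead=4, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=4)
            self.lm_head = nn.Linear(D_MODEL, VOCAB_SIZE)

        def forward(self, tokens):                      # tokens: (batch, seq)
            seq_len = tokens.size(1)
            # Upper-triangular -inf mask blocks attention to future tokens.
            causal = torch.triu(
                torch.full((seq_len, seq_len), float("-inf")), diagonal=1)
            h = self.encoder(self.embed(tokens), mask=causal)
            return self.lm_head(h)                      # next-token logits

    class ComposerClassifier(nn.Module):
        """Classifier whose encoder is initialized from the pretrained LM."""
        def __init__(self, pretrained):
            super().__init__()
            self.embed = pretrained.embed       # reuse pretrained embeddings
            self.encoder = pretrained.encoder   # reuse pretrained encoder weights
            self.head = nn.Linear(D_MODEL, NUM_COMPOSERS)

        def forward(self, tokens):
            h = self.encoder(self.embed(tokens))  # full attention, no causal mask
            return self.head(h.mean(dim=1))       # mean-pool over sequence, classify

    lm = MusicLM()
    # ... pretrain lm with next-token cross-entropy on unlabeled sequences ...
    clf = ComposerClassifier(lm)  # classifier starts from the pretrained weights
    logits = clf(torch.randint(0, VOCAB_SIZE, (8, 128)))  # 8 sequences, 128 tokens

Pretraining would minimize next-token cross-entropy over the unlabeled IMSLP sequences; finetuning then trains ComposerClassifier on the small labeled set, typically with a reduced learning rate on the reused layers.
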

Bibliographic Details
Main Authors: Daniel Yang, Kevin Ji, TJ Tsai
Format: Article
Language: English
Published: MDPI AG 2021-02-01
Series: Applied Sciences
Subjects: sheet music; style recognition; composer identification; language model; pretraining; self-supervised
Online Access: https://www.mdpi.com/2076-3417/11/4/1387
id doaj-0c61fe7ed759493eb5581fb91732ada7
record_format Article
spelling doaj-0c61fe7ed759493eb5581fb91732ada7
  indexed 2021-02-05T00:00:18Z
  eng
  MDPI AG
  Applied Sciences, ISSN 2076-3417
  2021-02-01, volume 11, article 1387, DOI 10.3390/app11041387
  A Deeper Look at Sheet Music Composer Classification Using Self-Supervised Pretraining
  Daniel Yang, Kevin Ji, TJ Tsai (Harvey Mudd College, 301 Platt Blvd, Claremont, CA 91711, USA)
  This article studies a composer style classification task based on raw sheet music images. While previous works on composer recognition have relied exclusively on supervised learning, we explore the use of self-supervised pretraining methods that have been recently developed for natural language processing. We first convert sheet music images to sequences of musical words, train a language model on a large set of unlabeled musical “sentences”, initialize a classifier with the pretrained language model weights, and then finetune the classifier on a small set of labeled data. We conduct extensive experiments on International Music Score Library Project (IMSLP) piano data using a range of modern language model architectures. We show that pretraining substantially improves classification performance and that Transformer-based architectures perform best. We also introduce two data augmentation strategies and present evidence that the model learns generalizable and semantically meaningful information.
  https://www.mdpi.com/2076-3417/11/4/1387
  sheet music; style recognition; composer identification; language model; pretraining; self-supervised
collection DOAJ
language English
format Article
sources DOAJ
author Daniel Yang
Kevin Ji
TJ Tsai
spellingShingle Daniel Yang
Kevin Ji
TJ Tsai
A Deeper Look at Sheet Music Composer Classification Using Self-Supervised Pretraining
Applied Sciences
sheet music
style recognition
composer identification
language model
pretraining
self-supervised
author_facet Daniel Yang
Kevin Ji
TJ Tsai
author_sort Daniel Yang
title A Deeper Look at Sheet Music Composer Classification Using Self-Supervised Pretraining
title_short A Deeper Look at Sheet Music Composer Classification Using Self-Supervised Pretraining
title_full A Deeper Look at Sheet Music Composer Classification Using Self-Supervised Pretraining
title_fullStr A Deeper Look at Sheet Music Composer Classification Using Self-Supervised Pretraining
title_full_unstemmed A Deeper Look at Sheet Music Composer Classification Using Self-Supervised Pretraining
title_sort deeper look at sheet music composer classification using self-supervised pretraining
publisher MDPI AG
series Applied Sciences
issn 2076-3417
publishDate 2021-02-01
description This article studies a composer style classification task based on raw sheet music images. While previous works on composer recognition have relied exclusively on supervised learning, we explore the use of self-supervised pretraining methods that have been recently developed for natural language processing. We first convert sheet music images to sequences of musical words, train a language model on a large set of unlabeled musical “sentences”, initialize a classifier with the pretrained language model weights, and then finetune the classifier on a small set of labeled data. We conduct extensive experiments on International Music Score Library Project (IMSLP) piano data using a range of modern language model architectures. We show that pretraining substantially improves classification performance and that Transformer-based architectures perform best. We also introduce two data augmentation strategies and present evidence that the model learns generalizable and semantically meaningful information.
topic sheet music
style recognition
composer identification
language model
pretraining
self-supervised
url https://www.mdpi.com/2076-3417/11/4/1387
work_keys_str_mv AT danielyang adeeperlookatsheetmusiccomposerclassificationusingselfsupervisedpretraining
AT kevinji adeeperlookatsheetmusiccomposerclassificationusingselfsupervisedpretraining
AT tjtsai adeeperlookatsheetmusiccomposerclassificationusingselfsupervisedpretraining
AT danielyang deeperlookatsheetmusiccomposerclassificationusingselfsupervisedpretraining
AT kevinji deeperlookatsheetmusiccomposerclassificationusingselfsupervisedpretraining
AT tjtsai deeperlookatsheetmusiccomposerclassificationusingselfsupervisedpretraining
_version_ 1724284623182102528