Multi-objective optimization for model selection in music classification
With the breakthrough of machine learning techniques, research on music emotion classification has made notable progress by combining various audio features with state-of-the-art machine learning models. Still, it is known that the way to preprocess music samples and to choose which...
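Main Author: | Ujihara, Rintaro |
---|---|
Format: | Others |
Language: | English |
Published: | KTH, Optimeringslära och systemteori 2021 |
Subjects: | Music emotion recognition; Mel spectrogram; MFCC; CENS; Onset; Tonnetz; HPSS; 1D convolutional neural network; Attention LSTM; 1DCNN BiLSTM; Pareto optimality; Mathematics; Matematik |
Online Access: | http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-298370 |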
id |
ndltd-UPSALLA1-oai-DiVA.org-kth-298370 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-UPSALLA1-oai-DiVA.org-kth-298370 2021-07-07T05:24:34Z Multi-objective optimization for model selection in music classification eng Flermålsoptimering för modellval i musikklassificering Ujihara, Rintaro KTH, Optimeringslära och systemteori 2021 Music emotion recognition Mel spectrogram MFCC CENS Onset Tonnetz HPSS 1D convolutional neural network Attention LSTM 1DCNN BiLSTM Pareto optimality Mathematics Matematik With the breakthrough of machine learning techniques, research on music emotion classification has made notable progress by combining various audio features with state-of-the-art machine learning models. Still, it is known that how to preprocess music samples and which classification algorithm to use depend on the data set and the objective of each project. The collaborating company of this thesis, Ichigoichie AB, is currently developing a system to categorize music data into positive/negative classes. To enhance the accuracy of the existing system, this project aims to identify the best model through experiments with six audio features (Mel spectrogram, MFCC, HPSS, Onset, CENS, Tonnetz) and several machine learning models, including deep neural network models, for the classification task. For each model, hyperparameter tuning is performed, and model evaluation is carried out according to Pareto optimality with regard to accuracy and execution time. The results show that the most promising model achieved 95% correct classification with an execution time of less than 15 seconds. I och med genombrottet av maskininlärningstekniker har forskning kring känsloklassificering i musik sett betydande framsteg genom att kombinera olika musikanalysverktyg med nya maskininlärningsmodeller. Trots detta är hur man förbehandlar ljuddatat och valet av vilken maskinklassificeringsalgoritm som ska tillämpas beroende på vilken typ av data man arbetar med samt målet med projektet. Denna uppsats samarbetspartner, Ichigoichie AB, utvecklar för närvarande ett system för att kategorisera musikdata enligt positiva och negativa känslor. För att höja systemets noggrannhet är målet med denna uppsats att experimentellt hitta bästa modellen baserat på sex musik-egenskaper (Mel-spektrogram, MFCC, HPSS, Onset, CENS samt Tonnetz) och ett antal olika maskininlärningsmodeller, inklusive Deep Learning-modeller. Varje modell hyperparameteroptimeras och utvärderas enligt paretooptimalitet med hänsyn till noggrannhet och beräkningstid. Resultaten visar att den mest lovande modellen uppnådde 95% korrekt klassificering med en beräkningstid på mindre än 15 sekunder. Student thesis info:eu-repo/semantics/bachelorThesis text http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-298370 TRITA-SCI-GRU ; 2021:221 application/pdf info:eu-repo/semantics/openAccess |
collection |
NDLTD |
language |
English |
format |
Others |
sources |
NDLTD |
topic |
Music emotion recognition Mel spectrogram MFCC CENS Onset Tonnetz HPSS 1D convolutional neural network Attention LSTM 1DCNN BiLSTM Pareto optimality Mathematics Matematik |
spellingShingle |
Music emotion recognition Mel spectrogram MFCC CENS Onset Tonnetz HPSS 1D convolutional neural network Attention LSTM 1DCNN BiLSTM Pareto optimality Mathematics Matematik Ujihara, Rintaro Multi-objective optimization for model selection in music classification |
description |
With the breakthrough of machine learning techniques, research on music emotion classification has made notable progress by combining various audio features with state-of-the-art machine learning models. Still, it is known that how to preprocess music samples and which classification algorithm to use depend on the data set and the objective of each project. The collaborating company of this thesis, Ichigoichie AB, is currently developing a system to categorize music data into positive/negative classes. To enhance the accuracy of the existing system, this project aims to identify the best model through experiments with six audio features (Mel spectrogram, MFCC, HPSS, Onset, CENS, Tonnetz) and several machine learning models, including deep neural network models, for the classification task. For each model, hyperparameter tuning is performed, and model evaluation is carried out according to Pareto optimality with regard to accuracy and execution time. The results show that the most promising model achieved 95% correct classification with an execution time of less than 15 seconds. === I och med genombrottet av maskininlärningstekniker har forskning kring känsloklassificering i musik sett betydande framsteg genom att kombinera olika musikanalysverktyg med nya maskininlärningsmodeller. Trots detta är hur man förbehandlar ljuddatat och valet av vilken maskinklassificeringsalgoritm som ska tillämpas beroende på vilken typ av data man arbetar med samt målet med projektet. Denna uppsats samarbetspartner, Ichigoichie AB, utvecklar för närvarande ett system för att kategorisera musikdata enligt positiva och negativa känslor. För att höja systemets noggrannhet är målet med denna uppsats att experimentellt hitta bästa modellen baserat på sex musik-egenskaper (Mel-spektrogram, MFCC, HPSS, Onset, CENS samt Tonnetz) och ett antal olika maskininlärningsmodeller, inklusive Deep Learning-modeller. Varje modell hyperparameteroptimeras och utvärderas enligt paretooptimalitet med hänsyn till noggrannhet och beräkningstid. Resultaten visar att den mest lovande modellen uppnådde 95% korrekt klassificering med en beräkningstid på mindre än 15 sekunder. |
author |
Ujihara, Rintaro |
title |
Multi-objective optimization for model selection in music classification |
publisher |
KTH, Optimeringslära och systemteori |
publishDate |
2021 |
url |
http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-298370 |
work_keys_str_mv |
AT ujihararintaro multiobjectiveoptimizationformodelselectioninmusicclassification AT ujihararintaro flermalsoptimeringformodellvalimusikklassificering |
_version_ |
1719415791325020160 |
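
The description in this record states that models were evaluated according to Pareto optimality with regard to accuracy and execution time. As a rough illustration of that idea, and not the thesis's actual procedure, the sketch below keeps every candidate that no other candidate beats on both objectives. The `Candidate` type, the model labels, and all numbers are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str            # hypothetical model label
    accuracy: float      # higher is better
    exec_time_s: float   # lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.accuracy >= b.accuracy and a.exec_time_s <= b.exec_time_s
    strictly_better = a.accuracy > b.accuracy or a.exec_time_s < b.exec_time_s
    return no_worse and strictly_better


def pareto_front(candidates: list[Candidate]) -> list[Candidate]:
    """Keep every candidate that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Made-up numbers for illustration only; they are not results from the thesis.
models = [
    Candidate("model_a", 0.90, 5.0),
    Candidate("model_b", 0.95, 14.0),
    Candidate("model_c", 0.88, 9.0),   # dominated by model_a
    Candidate("model_d", 0.95, 20.0),  # dominated by model_b
]

front = pareto_front(models)
print([c.name for c in front])
```

With two objectives, the front usually contains several models; picking a single one still requires a preference between accuracy and execution time, which this sketch deliberately leaves open.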