Multi-objective optimization for model selection in music classification

With the breakthrough of machine learning techniques, research on music emotion classification has made notable progress by combining various audio features with state-of-the-art machine learning models. Still, it is known that the way to preprocess music samples and to choose which...


Bibliographic Details
Main Author: Ujihara, Rintaro
Format: Others
Language: English
Published: KTH, Optimeringslära och systemteori 2021
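Subjects: Music emotion recognition; Mel spectrogram; MFCC; CENS; Onset; Tonnetz; HPSS; 1D convolutional neural network; Attention LSTM; 1DCNN BiLSTM; Pareto optimality; Mathematics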
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-298370
id ndltd-UPSALLA1-oai-DiVA.org-kth-298370
record_format oai_dc
spelling ndltd-UPSALLA1-oai-DiVA.org-kth-298370
2021-07-07T05:24:34Z
Multi-objective optimization for model selection in music classification
eng
Flermålsoptimering för modellval i musikklassificering
Ujihara, Rintaro
KTH, Optimeringslära och systemteori
2021
Music emotion recognition
Mel spectrogram
MFCC
CENS
Onset
Tonnetz
HPSS
1D convolutional neural network
Attention LSTM
1DCNN BiLSTM
Pareto optimality
Mathematics
Matematik
With the breakthrough of machine learning techniques, research on music emotion classification has made notable progress by combining various audio features with state-of-the-art machine learning models. Still, how music samples should be preprocessed and which classification algorithm should be used depend on the data set and on the objective of each project. The collaborating company of this thesis, Ichigoichie AB, is currently developing a system to categorize music data into positive/negative classes. To enhance the accuracy of the existing system, this project aims to identify the best model through experiments with six audio features (Mel spectrogram, MFCC, HPSS, Onset, CENS, Tonnetz) and several machine learning models, including deep neural network models, for the classification task. For each model, hyperparameter tuning is performed, and the models are evaluated according to Pareto optimality with regard to accuracy and execution time. The results show that the most promising model achieved 95% correct classification with an execution time of less than 15 seconds.
(Swedish abstract, in English translation) With the breakthrough of machine learning techniques, research on emotion classification in music has seen significant progress by combining various music analysis tools with new machine learning models. Nevertheless, how the audio data is preprocessed and which machine classification algorithm is applied depend on the type of data being worked with and on the goal of the project. The collaboration partner of this thesis, Ichigoichie AB, is currently developing a system to categorize music data according to positive and negative emotions. To increase the accuracy of the system, the goal of this thesis is to experimentally find the best model based on six music features (Mel spectrogram, MFCC, HPSS, Onset, CENS and Tonnetz) and a number of different machine learning models, including deep learning models. Each model is hyperparameter-optimized and evaluated according to Pareto optimality with respect to accuracy and computation time. The results show that the most promising model achieved 95% correct classification with a computation time of less than 15 seconds.
Student thesis
info:eu-repo/semantics/bachelorThesis
text
http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-298370
TRITA-SCI-GRU ; 2021:221
application/pdf
info:eu-repo/semantics/openAccess
collection NDLTD
language English
format Others
sources NDLTD
topic Music emotion recognition
Mel spectrogram
MFCC
CENS
Onset
Tonnetz
HPSS
1D convolutional neural network
Attention LSTM
1DCNN BiLSTM
Pareto optimality
Mathematics
Matematik
spellingShingle Music emotion recognition
Mel spectrogram
MFCC
CENS
Onset
Tonnetz
HPSS
1D convolutional neural network
Attention LSTM
1DCNN BiLSTM
Pareto optimality
Mathematics
Matematik
Ujihara, Rintaro
Multi-objective optimization for model selection in music classification
description With the breakthrough of machine learning techniques, research on music emotion classification has made notable progress by combining various audio features with state-of-the-art machine learning models. Still, how music samples should be preprocessed and which classification algorithm should be used depend on the data set and on the objective of each project. The collaborating company of this thesis, Ichigoichie AB, is currently developing a system to categorize music data into positive/negative classes. To enhance the accuracy of the existing system, this project aims to identify the best model through experiments with six audio features (Mel spectrogram, MFCC, HPSS, Onset, CENS, Tonnetz) and several machine learning models, including deep neural network models, for the classification task. For each model, hyperparameter tuning is performed, and the models are evaluated according to Pareto optimality with regard to accuracy and execution time. The results show that the most promising model achieved 95% correct classification with an execution time of less than 15 seconds. === (Swedish abstract, in English translation) With the breakthrough of machine learning techniques, research on emotion classification in music has seen significant progress by combining various music analysis tools with new machine learning models. Nevertheless, how the audio data is preprocessed and which machine classification algorithm is applied depend on the type of data being worked with and on the goal of the project. The collaboration partner of this thesis, Ichigoichie AB, is currently developing a system to categorize music data according to positive and negative emotions. To increase the accuracy of the system, the goal of this thesis is to experimentally find the best model based on six music features (Mel spectrogram, MFCC, HPSS, Onset, CENS and Tonnetz) and a number of different machine learning models, including deep learning models. Each model is hyperparameter-optimized and evaluated according to Pareto optimality with respect to accuracy and computation time. The results show that the most promising model achieved 95% correct classification with a computation time of less than 15 seconds.
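The description above evaluates models by Pareto optimality over accuracy and execution time. As a rough illustration of that criterion only, the following Python sketch keeps every candidate that no other candidate beats on both objectives at once. The names Candidate, dominates and pareto_front, the model labels, and all numbers are hypothetical and chosen for illustration; they are not taken from the thesis and do not represent its method or results.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """One trained model with its two objectives (hypothetical container)."""
    name: str
    accuracy: float  # fraction of correct classifications; higher is better
    seconds: float   # execution time in seconds; lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on both objectives
    and strictly better on at least one of them."""
    no_worse = a.accuracy >= b.accuracy and a.seconds <= b.seconds
    strictly_better = a.accuracy > b.accuracy or a.seconds < b.seconds
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that are not dominated by any other candidate."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Made-up numbers for illustration only; NOT results from the thesis.
candidates = [
    Candidate("model_a", 0.90, 5.0),
    Candidate("model_b", 0.95, 14.0),
    Candidate("model_c", 0.93, 20.0),  # slower and less accurate than model_b
    Candidate("model_d", 0.85, 6.0),   # slower and less accurate than model_a
]

front = pareto_front(candidates)
print([c.name for c in front])
```

With these made-up inputs, model_a and model_b remain on the front: neither is both faster and more accurate than the other, so choosing between them is a trade-off rather than a strict ranking. The pairwise comparison is quadratic in the number of candidates, which is assumed to be acceptable for a small set of tuned models.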
author Ujihara, Rintaro
author_facet Ujihara, Rintaro
author_sort Ujihara, Rintaro
title Multi-objective optimization for model selection in music classification
title_short Multi-objective optimization for model selection in music classification
title_full Multi-objective optimization for model selection in music classification
title_fullStr Multi-objective optimization for model selection in music classification
title_full_unstemmed Multi-objective optimization for model selection in music classification
title_sort multi-objective optimization for model selection in music classification
publisher KTH, Optimeringslära och systemteori
publishDate 2021
url http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-298370
work_keys_str_mv AT ujihararintaro multiobjectiveoptimizationformodelselectioninmusicclassification
AT ujihararintaro flermalsoptimeringformodellvalimusikklassificering
_version_ 1719415791325020160