Fast Cover Song Retrieval in AAC Domain based on Deep Learning

Bibliographic Details
Main Authors: Yu-ruey Chang, 張育瑞
Other Authors: 張寶基
Format: Others
Language: zh-TW
Published: 2015
Online Access: http://ndltd.ncl.edu.tw/handle/16552950178279839832
id ndltd-TW-104NCU05650009
record_format oai_dc
spelling ndltd-TW-104NCU05650009 2017-07-09T04:30:21Z http://ndltd.ncl.edu.tw/handle/16552950178279839832 Fast Cover Song Retrieval in AAC Domain based on Deep Learning 基於深度學習之AAC壓縮域翻唱歌快速檢索 Yu-ruey Chang 張育瑞 Master's, National Central University (國立中央大學), Department of Communication Engineering (通訊工程學系), 104 張寶基 2015 學位論文 ; thesis 64 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Central University (國立中央大學) === Department of Communication Engineering (通訊工程學系) === 104 === As the volume of multimedia data grows, quickly finding items of interest in large databases becomes increasingly important. Keyword annotation is the traditional approach, but it requires a large amount of manual labeling effort and becomes infeasible as the data size increases. Content-based retrieval is more natural: it extracts features from the music content itself to build a representation that avoids human labeling errors. This thesis focuses on AAC files, which are widely used by Internet streaming sources. The proposed system maps the modified discrete cosine transform (MDCT) coefficients directly into a 12-dimensional chroma feature. Frames are combined into segments that serve as the input to a deep learning model, which can automatically find more meaningful features of the music data. A sparse autoencoder is also applied to reduce the dimensionality of the song representation. Together, these steps save significant matching time. The experimental results show that the proposed method reaches a mean reciprocal rank (MRR) of 0.505 and saves over 70% of the matching time compared with conventional approaches.
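The abstract names two pieces that are easy to illustrate: folding MDCT coefficients into a 12-dimensional chroma vector, and scoring retrieval with mean reciprocal rank (MRR). The following is a minimal sketch, not the thesis's implementation. The record does not give the actual mapping, so the bin-to-pitch-class rule, the 44.1 kHz sample rate, the frequency limits, and the function names `mdct_to_chroma` and `mean_reciprocal_rank` are all assumptions made for illustration.

```python
import math


def mdct_to_chroma(mdct_frame, sample_rate=44100.0, min_hz=55.0, max_hz=5000.0):
    """Fold the energy of one MDCT frame into 12 pitch classes (C=0 ... B=11).

    Assumption: each bin is assigned to the pitch class of the nearest
    equal-tempered note; the thesis may use a different mapping.
    """
    n = len(mdct_frame)
    chroma = [0.0] * 12
    for k, coeff in enumerate(mdct_frame):
        # Centre frequency of MDCT bin k in a frame of n coefficients.
        freq = (k + 0.5) * sample_rate / (2.0 * n)
        if freq < min_hz or freq > max_hz:
            continue  # skip bins outside the assumed musical range
        # MIDI note number of this frequency (A4 = 440 Hz = note 69).
        midi = 69.0 + 12.0 * math.log2(freq / 440.0)
        chroma[int(round(midi)) % 12] += coeff * coeff
    total = sum(chroma)
    # Normalise to unit sum so loud and quiet frames are comparable.
    return [c / total for c in chroma] if total > 0 else chroma


def mean_reciprocal_rank(ranked_lists, relevant):
    """Average of 1/rank of the first relevant item per query.

    ranked_lists: dict mapping query -> ordered list of retrieved items
    relevant:     dict mapping query -> set of correct items (e.g. covers)
    A query with no relevant item in its list contributes 0.
    """
    total = 0.0
    for query, ranking in ranked_lists.items():
        for rank, item in enumerate(ranking, start=1):
            if item in relevant[query]:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)
```

With a 1024-coefficient frame (the size of an AAC long block) at the assumed 44.1 kHz, bin 20 is centred near 441 Hz, so its energy lands in pitch class 9 (A). An MRR of 0.505, as reported in the abstract, means the first correct cover appears on average at roughly the second position in the ranked list.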
author2 張寶基
author_facet 張寶基
Yu-ruey Chang
張育瑞
author Yu-ruey Chang
張育瑞
title Fast Cover Song Retrieval in AAC Domain based on Deep Learning