Content-based Automatic Annotation and Preference Learning for Music Information Retrieval
Doctoral === National Tsing Hua University === Department of Computer Science === 99 === Music information retrieval has received increasing attention in recent decades. The goal is to find songs, artists, or albums that match users’ interests. In this thesis, we focus on two major retrieval approaches, automatic annotation and preference learning recommenda...
Main Authors: | Chen, Edwardson 陳致生 |
---|---|
Other Authors: | Jang, Jyh-Shing Roger |
Format: | Others |
Language: | en_US |
Published: | 2011 |
Online Access: | http://ndltd.ncl.edu.tw/handle/58uttc |
id |
ndltd-TW-099NTHU5392043 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-099NTHU5392043 2019-05-15T20:41:44Z http://ndltd.ncl.edu.tw/handle/58uttc Content-based Automatic Annotation and Preference Learning for Music Information Retrieval 基於內容的自動標記與喜好學習應用於音樂資訊檢索 Chen, Edwardson 陳致生 Doctoral National Tsing Hua University Department of Computer Science 99 Jang, Jyh-Shing Roger 張智星 2011 thesis 83 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Doctoral === National Tsing Hua University === Department of Computer Science === 99 === Music information retrieval has received increasing attention in recent decades. The goal is to find songs, artists, or albums that match users’ interests. In this thesis, we focus on two major retrieval approaches: automatic annotation and preference-learning recommendation systems. Compared with query-by-example (QBE) techniques, searching audio files with a set of semantic concept words is a much more natural way to describe music. This approach, called query-by-semantic-description (QBSD), requires an accurate, automatic way to tag large numbers of audio files. To meet this need, we propose an automatic annotation system, based on supervised multi-class labeling (SML), that uses anti-words for each annotation word. More specifically, words that are highly associated with the opposite semantic meaning of a word constitute its anti-word set. By modeling both a word and its anti-word set, our annotation system achieves higher mean per-word precision and recall than the original SML model. Moreover, explicitly constructing the anti-word models also significantly improves the performance of the retrieval system.
Another major way for people to discover music is through recommendation, which is common in daily life. Recommenders such as Amazon, TiVo, and Netflix adopt collaborative filtering (CF), which often suffers from the so-called cold-start problem. A content-based approach can alleviate this problem because it relies on audio content instead of users’ past transactions. In the second part of this thesis, we propose a content-based artist recommendation system that predicts a user’s tastes well. In particular, each artist is characterized by an acoustical model adapted from a universal background model (UBM) through maximum a posteriori (MAP) adaptation. These acoustical features, together with their preference rankings, are then fed to an ordinal regression algorithm that seeks a ranking rule able to predict the rank of a new instance. Moreover, we propose an order preserving projection (OPP) algorithm, which is shown to achieve results comparable to those of the ordinal regression algorithm PRank. The proposed linear OPP can also be kernelized to learn potentially nonlinear relationships between music content and users’ artist rank orders. The kernel method also lets us efficiently fuse acoustical and symbolic features (i.e., annotation words) under the proposed framework. Experimental results show that the system can successfully predict users’ tastes and that performance improves both with the nonlinear (kernelized) OPP and with the fusion of acoustical and symbolic features.
|
author2 |
Jang, Jyh-Shing Roger |
author_facet |
Jang, Jyh-Shing Roger Chen, Edwardson 陳致生 |
author |
Chen, Edwardson 陳致生 |
spellingShingle |
Chen, Edwardson 陳致生 Content-based Automatic Annotation and Preference Learning for Music Information Retrieval |
author_sort |
Chen, Edwardson |
title |
Content-based Automatic Annotation and Preference Learning for Music Information Retrieval |
publishDate |
2011 |
url |
http://ndltd.ncl.edu.tw/handle/58uttc |
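
The description above names PRank as the ordinal regression baseline against which the proposed OPP is compared. As a rough illustration only, the following is a minimal sketch of the standard PRank perceptron-style update (a weight vector plus ordered thresholds), not the thesis's implementation. The function names `predict_rank` and `train_prank` and the one-dimensional toy data are hypothetical; in the thesis the inputs would be artist-level acoustical features, which are not reproduced here.

```python
from typing import List, Sequence, Tuple


def predict_rank(w: Sequence[float], b: Sequence[float], x: Sequence[float]) -> int:
    """Return the smallest rank r (1-based) with w.x - b[r-1] < 0, else the top rank."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    for r, threshold in enumerate(b, start=1):
        if score - threshold < 0:
            return r
    return len(b) + 1


def train_prank(
    data: Sequence[Tuple[Sequence[float], int]],
    num_ranks: int,
    epochs: int = 20,
) -> Tuple[List[float], List[float]]:
    """Standard PRank training: update w and the thresholds only on mistakes."""
    dim = len(data[0][0])
    w = [0.0] * dim
    b = [0.0] * (num_ranks - 1)  # thresholds b_1..b_{k-1}; b_k is implicitly +inf
    for _ in range(epochs):
        for x, y in data:
            if predict_rank(w, b, x) == y:
                continue
            score = sum(wi * xi for wi, xi in zip(w, x))
            taus = []
            for r in range(1, num_ranks):
                y_r = -1 if y <= r else 1  # side of threshold r the score should fall on
                wrong_side = (score - b[r - 1]) * y_r <= 0
                taus.append(y_r if wrong_side else 0)
            step = sum(taus)
            w = [wi + step * xi for wi, xi in zip(w, x)]
            b = [br - t for br, t in zip(b, taus)]
    return w, b


# Hypothetical toy data: one feature, three preference ranks.
toy = [([0.5], 1), ([1.5], 2), ([2.5], 3)]
w, b = train_prank(toy, num_ranks=3, epochs=20)
predictions = [predict_rank(w, b, x) for x, _ in toy]
```

On this separable toy set the learned rule reproduces the training ranks. Kernelized variants and the OPP algorithm itself are outside the scope of this sketch.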