A study on multi-modal music graph



Bibliographic Details
Main Authors: HO, CHIU-YUAN, 何邱元
Other Authors: HSU, JIA-LIEN
Format: Others
Language: zh-TW
Published: 2014
Online Access: http://ndltd.ncl.edu.tw/handle/77738526741833237643
Description
Summary: Master's === Fu Jen Catholic University (輔仁大學) === Master's Program, Department of Computer Science and Information Engineering === 102 === In this thesis, we propose a graph-based framework for music applications that also addresses the semantic gap between features. We first divide musical features into three categories: acoustic, element, and structural features. Acoustic features are physical quantities of the audio signal, such as the zero-crossing rate and MFCCs. Element features are the basic musical elements of musical form, such as melody and rhythm. Structural features capture the musical meaning perceived through human cognition, such as genre and emotion. Together, these features describe music from multiple facets and help us better understand its content.
Across these categories there is a semantic gap between low-level features, such as acoustic features, and high-level features, such as structural features: it is hard to describe structural content using acoustic features alone. We therefore construct the relationships between music features. The multi-modal features are split into many feature domains. Music objects and feature domains are represented as nodes, an edge relates two nodes, and each relation is represented as the pair of vertices joined by that edge. We propose a multi-modal music graph to organize music objects and feature domains.
Because feature domains connect to many kinds of nodes, the graph grows as the number of music objects increases, which reduces the uniqueness of nodes and makes the graph complex. To increase node uniqueness and reduce the complexity of the graph, we introduce a graph projection operator that refactors the structure of the graph. Edges are refactored according to feature domains. For example, we project the multi-modal music graph (music object, volume, music genre) onto a new graph structure (music object, music genre). The projected graph is simpler and better expresses the correspondence between nodes.
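The graph and projection described above can be illustrated with a small sketch. It is only one plausible reading of the abstract, not the thesis's implementation: the track names, the feature values, and the `build_graph` and `project` helpers are hypothetical, and projection is assumed here to mean keeping only the edges whose endpoints belong to the selected feature domains.

```python
# Hypothetical toy data: each music object is linked to one value per feature domain.
# A node is a (domain, value) pair; all names and values are illustrative only.
tracks = {
    "song_a": {"volume": "loud", "genre": "rock"},
    "song_b": {"volume": "loud", "genre": "jazz"},
    "song_c": {"volume": "soft", "genre": "jazz"},
}


def build_graph(tracks):
    """Return a set of edges, each linking a music-object node to a feature node."""
    edges = set()
    for name, features in tracks.items():
        for domain, value in features.items():
            edges.add((("music", name), (domain, value)))
    return edges


def project(edges, keep_domains):
    """Assumed projection: keep edges whose two endpoints lie in keep_domains."""
    return {
        (u, v)
        for u, v in edges
        if u[0] in keep_domains and v[0] in keep_domains
    }


# Full graph over (music object, volume, music genre).
graph = build_graph(tracks)

# Projected graph over (music object, music genre): the volume nodes are dropped.
projected = project(graph, {"music", "genre"})

for edge in sorted(projected):
    print(edge)
```

In this toy case the projection halves the number of edges, which mirrors the stated goal of a simpler graph over fewer feature domains.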
Finally, we implement an extensible graph-based music application framework built on the multi-modal music graph and the graph operators. In this framework, we extract musical features from music objects and construct the multi-modal music graph. We then use the graph operators to implement music information retrieval applications, such as music clustering, music auto-tagging, and music similarity. We also add a graph visualization operator.
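As a rough illustration of how a music similarity application could sit on top of such a graph, the sketch below scores two music objects by the overlap of their neighbouring feature nodes. The abstract does not say which similarity measure the thesis uses; the Jaccard index is swapped in here as an assumption, and the edge list and the `neighbors` and `jaccard_similarity` helpers are hypothetical.

```python
# Hypothetical edges of a small multi-modal music graph:
# each edge links a music-object node to a feature node.
edges = {
    (("music", "song_a"), ("volume", "loud")),
    (("music", "song_a"), ("genre", "rock")),
    (("music", "song_b"), ("volume", "loud")),
    (("music", "song_b"), ("genre", "jazz")),
    (("music", "song_c"), ("volume", "soft")),
    (("music", "song_c"), ("genre", "jazz")),
}


def neighbors(edges, node):
    """Return all nodes that share an edge with the given node."""
    linked = {v for u, v in edges if u == node}
    linked |= {u for u, v in edges if v == node}
    return linked


def jaccard_similarity(edges, a, b):
    """Assumed measure: shared feature nodes divided by all feature nodes of a and b."""
    na, nb = neighbors(edges, a), neighbors(edges, b)
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0


sim_ab = jaccard_similarity(edges, ("music", "song_a"), ("music", "song_b"))
sim_ac = jaccard_similarity(edges, ("music", "song_a"), ("music", "song_c"))
print(sim_ab, sim_ac)
```

Under these assumptions, song_a and song_b share one of three distinct feature nodes, while song_a and song_c share none.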