Summary: | Master's === National Central University === Department of Computer Science and Information Engineering === 103 === Organizing things hierarchically is a natural human intuition; for example, items on a shopping website or in a bookstore are arranged in hierarchies. In this work, we bring this idea to the audio file classification problem by developing a Bayesian nonparametric tree-structured mixture model. The model constructs a tree-structured representation of audio files. The root node of the tree represents the parts shared among different audio files, while the leaf nodes represent the parts unique to each audio file. We use the nested Chinese restaurant process (nCRP) as the prior distribution for the tree-structured model. The model automatically adjusts the width and depth of the tree and can, in theory, be extended to an infinite tree. This unsupervised learning method addresses the problems of model selection and over-estimation.
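As a rough illustration of the nCRP prior mentioned above, the following sketch draws one root-to-leaf path per audio file. It is not the thesis implementation: the function name `sample_ncrp_paths`, the concentration parameter `gamma`, and the fixed truncation `depth` are assumptions made for the example (the model itself places no fixed limit on the tree).

```python
import random
from collections import defaultdict


def sample_ncrp_paths(num_items, depth, gamma, seed=0):
    """Draw one root-to-leaf path per item from a depth-truncated nCRP prior.

    Hypothetical helper for illustration only. A node is identified by the
    tuple of child indices leading to it; the root is the empty tuple.
    """
    rng = random.Random(seed)
    # child_counts[node][child] = number of earlier items that moved from node to child
    child_counts = defaultdict(dict)
    paths = []
    for _ in range(num_items):
        node = ()
        path = [node]
        for _ in range(depth - 1):
            counts = child_counts[node]
            total = sum(counts.values())
            # Chinese restaurant process at this node:
            #   existing child k with probability count_k / (total + gamma)
            #   new child        with probability gamma   / (total + gamma)
            r = rng.random() * (total + gamma)
            chosen = None
            for child, count in counts.items():
                r -= count
                if r < 0:
                    chosen = child
                    break
            if chosen is None:
                chosen = len(counts)  # open a new branch under this node
            counts[chosen] = counts.get(chosen, 0) + 1
            node = node + (chosen,)
            path.append(node)
        paths.append(path)
    return paths


if __name__ == "__main__":
    for p in sample_ncrp_paths(num_items=5, depth=3, gamma=1.0):
        print(p)
```

Every path starts at the shared root, so items always have the root in common, while popular branches attract more items and `gamma` controls how readily new branches open. This is what lets the width of the tree grow with the data.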
We use Gibbs sampling to perform model inference. By sampling from the posterior probabilities, every audio file is assigned a path on the tree and a distribution of its frames over the levels of that path. We use this result as a clustering feature and feed it into a classifier to obtain the recognition result. In our experiments, we collect several audio databases of different types, including environmental sounds, guitar-tech clips, music genres, and music sub-genres. The results show that the proposed model improves the recognition rate.
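To make the feature step above concrete, here is a minimal sketch of how a sampled path and the level-wise frame distribution could be combined into one vector per audio file. The encoding (a one-hot indicator of the leaf followed by per-level frame proportions) and the names `level_proportions`, `file_feature`, and `leaf_index` are assumptions for illustration; the abstract does not specify the exact feature layout.

```python
from collections import Counter


def level_proportions(frame_levels, depth):
    """Fraction of a file's frames assigned to each level of its path."""
    if not frame_levels:
        raise ValueError("frame_levels must contain at least one frame")
    counts = Counter(frame_levels)
    n = len(frame_levels)
    return [counts.get(level, 0) / n for level in range(depth)]


def file_feature(path, frame_levels, leaf_index):
    """Build a feature vector from a sampled path and frame-level assignments.

    path         : list of node ids from root to leaf (hypothetical format)
    frame_levels : level index (0 = root) assigned to each frame
    leaf_index   : mapping from leaf node id to a column position
    """
    one_hot = [0.0] * len(leaf_index)
    one_hot[leaf_index[path[-1]]] = 1.0
    return one_hot + level_proportions(frame_levels, len(path))


if __name__ == "__main__":
    leaves = {(0, 0): 0, (0, 1): 1}
    path = [(), (0,), (0, 1)]
    print(file_feature(path, [0, 0, 1, 2], leaves))
```

A vector of this kind can then be passed to any standard classifier; which classifier the authors used is not stated in the abstract.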
|