Audio segmentation, classification and visualization

This thesis presents a new approach to the visualization of audio files that simultaneously illustrates general audio properties and the component sounds that comprise a given input file. New audio segmentation and classification methods are reported that outperform existing methods. In order to vis...

Bibliographic Details
Main Author:	Zhang, Xin (Author)
Other Authors:	Whalley, Jacqueline (Contributor), Brooks, Stephen (Contributor), Macdonell, Stephen (Contributor)
Format:	Others
Published:	Auckland University of Technology, 2009-12-09T02:38:22Z.
Subjects:	Audio; Segmentation; Classification; Visualization; Time mosaics; Video textures; Thesis
Online Access:	Get fulltext (http://hdl.handle.net/10292/802)


LEADER	02054 am a22002533u 4500
001	802
042			\|a dc
100	1	0	\|a Zhang, Xin \|e author
100	1	0	\|a Whalley, Jacqueline \|e contributor
100	1	0	\|a Brooks, Stephen \|e contributor
100	1	0	\|a Macdonell, Stephen \|e contributor
245	0	0	\|a Audio segmentation, classification and visualization
260			\|b Auckland University of Technology, \|c 2009-12-09T02:38:22Z.
520			\|a This thesis presents a new approach to the visualization of audio files that simultaneously illustrates general audio properties and the component sounds that comprise a given input file. New audio segmentation and classification methods are reported that outperform existing methods. In order to visualize audio files, the audio is segmented (separated into component sounds) and then classified in order to select matching archetypal images or video that represent each audio segment and are used as templates for the visualization. Each segment's template image or video is then subjected to image processing filters that are driven by audio features. One visualization method reported represents heterogeneous audio files as a seamless image mosaic along a time axis where each component image in the mosaic maps directly to a discovered component sound. The second visualization method, video texture mosaics, builds on the ideas developed in time mosaics. A novel adaptive video texture generation method was created by using acoustic similarity detection to produce a resultant video texture that more accurately represents an audio file. Compared with existing visualization methods such as oscilloscopes and spectrograms, both approaches yield more accessible illustrations of audio files and are more suitable for casual and non-expert users.
540			\|a OpenAccess
546			\|a en
650	0	4	\|a Audio
650	0	4	\|a Segmentation
650	0	4	\|a Classification
650	0	4	\|a Visualization
650	0	4	\|a Time mosaics
650	0	4	\|a Video textures
655	7		\|a Thesis
856			\|z Get fulltext \|u http://hdl.handle.net/10292/802

Similar Items