Using Pitch, Amplitude Modulation, and Spatial Cues for Separation of Harmonic Instruments from Stereo Music Recordings

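Recent work in blind source separation applied to anechoic mixtures of speech allows for improved reconstruction of sources that rarely overlap in a time-frequency representation. While the assumption that speech mixtures do not overlap significantly in time-frequency is reasonable, music mixtures rarely meet this constraint, requiring new approaches. We introduce a method that uses spatial cues from anechoic, stereo music recordings and assumptions regarding the structure of musical source signals to effectively separate mixtures of tonal music. We discuss existing techniques to create partial source signal estimates from regions of the mixture where source signals do not overlap significantly. We use these partial signals within a new demixing framework, in which we estimate harmonic masks for each source, allowing the determination of the number of active sources in important time-frequency frames of the mixture. We then propose a method for distributing energy from time-frequency frames of the mixture to multiple source signals. This allows dealing with mixtures that contain time-frequency frames in which multiple harmonic sources are active without requiring knowledge of source characteristics.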

Bibliographic Details
Main Authors: Bryan Pardo, John Woodruff
Format: Article
Language: English
Published: SpringerOpen 2007-01-01
Series: EURASIP Journal on Advances in Signal Processing
Online Access: http://dx.doi.org/10.1155/2007/86369
id doaj-6a808a6ce8574839b957d5d428b01df6
record_format Article
collection DOAJ
issn 1687-6172
1687-6180
description Recent work in blind source separation applied to anechoic mixtures of speech allows for improved reconstruction of sources that rarely overlap in a time-frequency representation. While the assumption that speech mixtures do not overlap significantly in time-frequency is reasonable, music mixtures rarely meet this constraint, requiring new approaches. We introduce a method that uses spatial cues from anechoic, stereo music recordings and assumptions regarding the structure of musical source signals to effectively separate mixtures of tonal music. We discuss existing techniques to create partial source signal estimates from regions of the mixture where source signals do not overlap significantly. We use these partial signals within a new demixing framework, in which we estimate harmonic masks for each source, allowing the determination of the number of active sources in important time-frequency frames of the mixture. We then propose a method for distributing energy from time-frequency frames of the mixture to multiple source signals. This allows dealing with mixtures that contain time-frequency frames in which multiple harmonic sources are active without requiring knowledge of source characteristics.