BassNet: A Variational Gated Autoencoder for Conditional Generation of Bass Guitar Tracks with Learned Interactive Control
Deep learning has given AI-based methods for music creation a boost over the past years. An important challenge in this field is to balance user control and autonomy in music generation systems. In this work, we present BassNet, a deep learning model for generating bass guitar tracks based on musical source material…
Main Authors: | Maarten Grachten, Stefan Lattner, Emmanuel Deruty
---|---
Format: | Article
Language: | English
Published: | MDPI AG, 2020-09-01
Series: | Applied Sciences
Subjects: | music generation; deep learning; latent space models; user control
Online Access: | https://www.mdpi.com/2076-3417/10/18/6627
id
doaj-763a95c03ae24d29b457c5c14cb7ce21
record_format
Article
spelling
doaj-763a95c03ae24d29b457c5c14cb7ce21 (2020-11-25T03:58:35Z, eng). MDPI AG, Applied Sciences, ISSN 2076-3417, 2020-09-01, 10(18):6627, DOI 10.3390/app10186627. BassNet: A Variational Gated Autoencoder for Conditional Generation of Bass Guitar Tracks with Learned Interactive Control. Maarten Grachten (Contractor for Sony Computer Science Laboratories, 75005 Paris, France), Stefan Lattner (Sony Computer Science Laboratories, 75005 Paris, France; me@stefanlattner.at), Emmanuel Deruty (Sony Computer Science Laboratories, 75005 Paris, France). https://www.mdpi.com/2076-3417/10/18/6627
collection
DOAJ
language
English
format
Article
sources
DOAJ
author
Maarten Grachten; Stefan Lattner; Emmanuel Deruty
title
BassNet: A Variational Gated Autoencoder for Conditional Generation of Bass Guitar Tracks with Learned Interactive Control
publisher
MDPI AG
series
Applied Sciences
issn
2076-3417
publishDate
2020-09-01
description
Deep learning has given AI-based methods for music creation a boost over the past years. An important challenge in this field is to balance user control and autonomy in music generation systems. In this work, we present BassNet, a deep learning model for generating bass guitar tracks based on musical source material. An innovative aspect of our work is that the model is trained to learn a temporally stable two-dimensional latent space variable that offers interactive user control. We empirically show that the model can disentangle bass patterns that require sensitivity to harmony, instrument timbre, and rhythm. An ablation study reveals that this capability is due to the temporal stability constraint on latent space trajectories during training. We also demonstrate that models trained on pop/rock music learn a latent space that offers control over, among other things, the diatonic characteristics of the output. Lastly, we present and discuss generated bass tracks for three different music fragments. The work presented here is a step toward the integration of AI-based technology in the workflow of musical content creators.
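The abstract refers to a temporal stability constraint on latent-space trajectories but gives no formula, and this record contains no code. The sketch below is a hypothetical PyTorch illustration of what such a constraint could look like, assuming a frame-wise VAE-style encoder/decoder over audio features and a simple mean-squared penalty on frame-to-frame latent differences; the class and function names, network sizes, and loss weights are illustrative assumptions, not the authors' BassNet implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyEncoder(nn.Module):
    """Maps one frame of audio features to a 2-D Gaussian latent (mu, logvar)."""
    def __init__(self, n_features: int, latent_dim: int = 2):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU())
        self.mu = nn.Linear(64, latent_dim)
        self.logvar = nn.Linear(64, latent_dim)

    def forward(self, x):
        h = self.body(x)
        return self.mu(h), self.logvar(h)

class TinyDecoder(nn.Module):
    """Maps a latent code back to one frame of audio features."""
    def __init__(self, n_features: int, latent_dim: int = 2):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                                  nn.Linear(64, n_features))

    def forward(self, z):
        return self.body(z)

def reparameterize(mu, logvar):
    # Standard VAE reparameterization: z = mu + sigma * eps.
    return mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)

def temporal_stability_loss(z):
    # z: (batch, time, latent_dim). Penalize frame-to-frame jumps so the
    # latent trajectory drifts slowly over a music fragment.
    return F.mse_loss(z[:, 1:], z[:, :-1])

def training_step(encoder, decoder, x, beta=1.0, lambda_stab=1.0):
    # x: (batch, time, n_features). Frames are encoded independently; the
    # stability term couples the latents along the time axis.
    b, t, f = x.shape
    mu, logvar = encoder(x.reshape(b * t, f))
    z = reparameterize(mu, logvar)
    x_hat = decoder(z).reshape(b, t, f)

    recon = F.mse_loss(x_hat, x)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    stab = temporal_stability_loss(z.reshape(b, t, -1))
    return recon + beta * kl + lambda_stab * stab

# Example usage with random data:
if __name__ == "__main__":
    enc, dec = TinyEncoder(n_features=80), TinyDecoder(n_features=80)
    x = torch.randn(4, 32, 80)  # 4 fragments, 32 frames, 80 features each
    loss = training_step(enc, dec, x)
    loss.backward()
```

In a setup like this, raising `lambda_stab` trades reconstruction detail for a smoother two-dimensional trajectory, which is what would make the latent variable usable as an interactive control signal.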
topic
music generation; deep learning; latent space models; user control
url
https://www.mdpi.com/2076-3417/10/18/6627