Music generation and human voice conversion based on LSTM

Music is closely related to human life and is an important way for people to express their feelings in life. Deep neural networks have played a significant role in the field of music processing. There are many different neural network models to implement deep learning for audio processing. For gener...

Bibliographic Details
Main Authors:	Li Guangwei, Ding Shuxue, Li Yujie, Zhang Kangkang
Format:	Article
Language:	English
Published:	EDP Sciences 2021-01-01
Series:	MATEC Web of Conferences
Online Access:	https://www.matec-conferences.org/articles/matecconf/pdf/2021/05/matecconf_cscns20_06015.pdf

id	doaj-8b6ee3579c2e4b93a1a89a0ba4fd9067
record_format	Article
spelling	doaj-8b6ee3579c2e4b93a1a89a0ba4fd9067 | 2021-02-18T10:45:18Z | eng | EDP Sciences | MATEC Web of Conferences | 2261-236X | 2021-01-01 | 336 | 06015 | 10.1051/matecconf/202133606015 | matecconf_cscns20_06015 | Music generation and human voice conversion based on LSTM | Li Guangwei (0), Ding Shuxue (1), Li Yujie (2), Zhang Kangkang (3) | School of Artificial Intelligence, Guilin University of Electronic Technology (all four authors) | https://www.matec-conferences.org/articles/matecconf/pdf/2021/05/matecconf_cscns20_06015.pdf
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Li Guangwei Ding Shuxue Li Yujie Zhang Kangkang
spellingShingle	Li Guangwei Ding Shuxue Li Yujie Zhang Kangkang Music generation and human voice conversion based on LSTM MATEC Web of Conferences
author_facet	Li Guangwei Ding Shuxue Li Yujie Zhang Kangkang
author_sort	Li Guangwei
title	Music generation and human voice conversion based on LSTM
title_short	Music generation and human voice conversion based on LSTM
title_full	Music generation and human voice conversion based on LSTM
title_fullStr	Music generation and human voice conversion based on LSTM
title_full_unstemmed	Music generation and human voice conversion based on LSTM
title_sort	music generation and human voice conversion based on lstm
publisher	EDP Sciences
series	MATEC Web of Conferences
issn	2261-236X
publishDate	2021-01-01
description	Music is closely tied to human life and is an important way for people to express their feelings. Deep neural networks have played a significant role in music processing, and many different neural network models have been used to apply deep learning to audio. General neural networks, however, suffer from problems such as complex operation and slow computation. In this paper, we introduce Long Short-Term Memory (LSTM), a recurrent neural network, to realize end-to-end training. The network structure is simple and can generate better audio sequences once the model is trained. After music generation, human voice conversion is important for music understanding and for adding lyrics to pure music. We propose an audio segmentation technique that cuts the human voice into fixed-length segments. Different notes in the piano music are classified without considering the scale and are associated with the different human voice segments obtained. Finally, through this transformation, the generated piano music can be expressed as human voice output. Experimental results demonstrate that the proposed scheme can successfully obtain a human voice from pure piano music generated by LSTM.
url	https://www.matec-conferences.org/articles/matecconf/pdf/2021/05/matecconf_cscns20_06015.pdf
work_keys_str_mv	AT liguangwei musicgenerationandhumanvoiceconversionbasedonlstm AT dingshuxue musicgenerationandhumanvoiceconversionbasedonlstm AT liyujie musicgenerationandhumanvoiceconversionbasedonlstm AT zhangkangkang musicgenerationandhumanvoiceconversionbasedonlstm
_version_	1724263111427358720

Similar Items