Compressed graph representation for scalable molecular graph generation
Abstract Recently, deep learning has been successfully applied to molecular graph generation. Nevertheless, mitigating the computational complexity, which increases with the number of nodes in a graph, has been a major challenge. This has hindered the application of deep learning-based molecular gra...
Main Authors: | Youngchun Kwon, Dongseon Lee, Youn-Suk Choi, Kyoham Shin, Seokho Kang |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2020-09-01 |
Series: | Journal of Cheminformatics |
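Subjects: | Molecular graph generation; Compressed graph representation; Graph variational autoencoder; Deep learning |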
Online Access: | http://link.springer.com/article/10.1186/s13321-020-00463-2 |
id |
doaj-08e4171195b04feabe251a39ec08e2b1 |
---|---|
record_format |
Article |
spelling |
doaj-08e4171195b04feabe251a39ec08e2b1; 2020-11-25T03:21:42Z; eng; BMC; Journal of Cheminformatics; ISSN 1758-2946; 2020-09-01; vol. 12, no. 1, pp. 1-8; DOI 10.1186/s13321-020-00463-2; Compressed graph representation for scalable molecular graph generation; Youngchun Kwon, Dongseon Lee, Youn-Suk Choi (Samsung Advanced Institute of Technology, Samsung Electronics Co. Ltd.); Kyoham Shin, Seokho Kang (Department of Industrial Engineering, Sungkyunkwan University); http://link.springer.com/article/10.1186/s13321-020-00463-2; Molecular graph generation; Compressed graph representation; Graph variational autoencoder; Deep learning |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Youngchun Kwon Dongseon Lee Youn-Suk Choi Kyoham Shin Seokho Kang |
author_sort |
Youngchun Kwon |
title |
Compressed graph representation for scalable molecular graph generation |
publisher |
BMC |
series |
Journal of Cheminformatics |
issn |
1758-2946 |
publishDate |
2020-09-01 |
description |
Abstract Recently, deep learning has been successfully applied to molecular graph generation. Nevertheless, mitigating the computational complexity, which increases with the number of nodes in a graph, has been a major challenge. This has hindered the application of deep learning-based molecular graph generation to large molecules with many heavy atoms. In this study, we present a molecular graph compression method to alleviate the complexity while maintaining the capability of generating chemically valid and diverse molecular graphs. We designate six small substructural patterns that are prevalent between two atoms in real-world molecules. These relevant substructures in a molecular graph are then converted to edges by regarding them as additional edge features along with the bond types. This reduces the number of nodes significantly without any information loss. Consequently, a generative model can be constructed in a more efficient and scalable manner with large molecules on a compressed graph representation. We demonstrate the effectiveness of the proposed method for molecules with up to 88 heavy atoms using the GuacaMol benchmark. |
topic |
Molecular graph generation Compressed graph representation Graph variational autoencoder Deep learning |
url |
http://link.springer.com/article/10.1186/s13321-020-00463-2 |
_version_ |
1724613062848151552 |
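
The description above says that small substructures lying between two atoms are converted to edges, which reduces the node count without losing information. The following is a minimal sketch of that idea, not the authors' implementation. The record does not list the six patterns, so the sketch assumes a single hypothetical one: a carbon joined to exactly two neighbors by single bonds, folded into an edge labelled `via-C`. The names `BRIDGE`, `compress`, `decompress`, and `signature` are illustrative only.

```python
from collections import Counter

# Hypothetical edge label standing for "a -C- b" (one carbon, two single bonds).
# The six patterns used in the article are not given in this record.
BRIDGE = "via-C"


def neighbors(edges, v):
    """Return {neighbor: label} for every edge touching atom v."""
    return {next(iter(e - {v})): lab for e, lab in edges.items() if v in e}


def compress(atoms, edges):
    """Fold each qualifying bridging carbon into one labelled edge."""
    atoms, edges = dict(atoms), dict(edges)  # work on copies
    for v in sorted(atoms):
        if atoms[v] != "C":
            continue
        nb = neighbors(edges, v)
        # Require exactly two neighbors, both reached by plain single bonds.
        if len(nb) != 2 or set(nb.values()) != {"single"}:
            continue
        a, b = nb
        if frozenset((a, b)) in edges:  # neighbors already bonded: skip
            continue
        del atoms[v], edges[frozenset((a, v))], edges[frozenset((b, v))]
        edges[frozenset((a, b))] = BRIDGE
    return atoms, edges


def decompress(atoms, edges):
    """Expand every BRIDGE edge back into a carbon with two single bonds."""
    atoms, edges = dict(atoms), dict(edges)
    next_id = max(atoms) + 1  # fresh ids; original numbering is not kept
    for e in [e for e, lab in edges.items() if lab == BRIDGE]:
        a, b = e
        del edges[e]
        atoms[next_id] = "C"
        edges[frozenset((a, next_id))] = "single"
        edges[frozenset((b, next_id))] = "single"
        next_id += 1
    return atoms, edges


def signature(atoms, edges):
    """Numbering-independent summary: element counts and labelled bond counts.

    This is a cheap invariant check, not a full graph isomorphism test.
    """
    bonds = Counter(
        (tuple(sorted(atoms[i] for i in e)), lab) for e, lab in edges.items()
    )
    return Counter(atoms.values()), bonds


# Toy chain O-C-N-C-O (heavy atoms only), all single bonds.
atoms = {0: "O", 1: "C", 2: "N", 3: "C", 4: "O"}
edges = {frozenset((i, i + 1)): "single" for i in range(4)}

small_atoms, small_edges = compress(atoms, edges)
print(len(atoms), "->", len(small_atoms))  # 5 -> 3
```

Expanding the compressed graph with `decompress` gives back the same element counts and bond counts as the input, which is the sense in which this toy compression loses no information. A generative model trained on the compressed form would treat `via-C` as one more edge class alongside the bond types.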