A data mining approach for classifying DNA repair genes into ageing-related or non-ageing-related

Abstract. Background: The ageing of the worldwide population means there is a growing need for research on the biology of ageing. DNA damage is likely a key contributor to the ageing process and elucidating the role of different DNA repair systems in agei...


Bibliographic Details
Main Authors: Vasieva Olga, Freitas Alex A, de Magalhães João
Format: Article
Language: English
Published: BMC 2011-01-01
Series: BMC Genomics
Online Access: http://www.biomedcentral.com/1471-2164/12/27
id doaj-0ffc63b466894164a83ed35661aee2ef
record_format Article
spelling doaj-0ffc63b466894164a83ed35661aee2ef 2020-11-24T22:16:24Z eng BMC BMC Genomics 1471-2164 2011-01-01 12 1 27 10.1186/1471-2164-12-27 A data mining approach for classifying DNA repair genes into ageing-related or non-ageing-related Vasieva Olga Freitas Alex A de Magalhães João http://www.biomedcentral.com/1471-2164/12/27
collection DOAJ
language English
format Article
sources DOAJ
author Vasieva Olga
Freitas Alex A
de Magalhães João
title A data mining approach for classifying DNA repair genes into ageing-related or non-ageing-related
publisher BMC
series BMC Genomics
issn 1471-2164
publishDate 2011-01-01
description Abstract
Background: The ageing of the worldwide population means there is a growing need for research on the biology of ageing. DNA damage is likely a key contributor to the ageing process and elucidating the role of different DNA repair systems in ageing is of great interest. In this paper we propose a data mining approach, based on classification methods (decision trees and Naive Bayes), for analysing data about human DNA repair genes. The goal is to build classification models that allow us to discriminate between ageing-related and non-ageing-related DNA repair genes, in order to better understand their different properties.
Results: The main patterns discovered by the classification methods are as follows: (a) the number of protein-protein interactions was a predictor of DNA repair proteins being ageing-related; (b) the use of predictor attributes based on protein-protein interactions considerably increased predictive accuracy of attributes based on Gene Ontology (GO) annotations; (c) GO terms related to "response to stimulus" seem reasonably good predictors of ageing-relatedness for DNA repair genes; (d) interaction with the XRCC5 (Ku80) protein is a strong predictor of ageing-relatedness for DNA repair genes; and (e) DNA repair genes with a high expression in T lymphocytes are more likely to be ageing-related.
Conclusions: The above patterns are broadly integrated in an analysis discussing relations between Ku, the non-homologous end joining DNA repair pathway, ageing and lymphocyte development. These patterns and their analysis support non-homologous end joining double strand break repair as central to the ageing-relatedness of DNA repair genes. Our work also showcases the use of protein interaction partners to improve accuracy in data mining methods and our approach could be applied to other ageing-related pathways.
url http://www.biomedcentral.com/1471-2164/12/27