Integration of Neuroimaging and Microarray Datasets through Mapping and Model-Theoretic Semantic Decomposition of Unstructured Phenotypes
An approach towards heterogeneous neuroscience dataset integration is proposed that uses Natural Language Processing (NLP) and a knowledge-based phenotype organizer system (PhenOS) to link ontology-anchored terms to underlying data from each database, and then maps these terms based on a computable...
Main Authors: | Spiro P. Pantazatos, Jianrong Li, Paul Pavlidis, Yves A. Lussier |
---|---|
Format: | Article |
Language: | English |
Published: | SAGE Publishing 2009-01-01 |
Series: | Cancer Informatics |
Subjects: | computational ontologies, phenotypes, database interoperability, Mediated Schema, SNOMED |
Online Access: | http://www.la-press.com/integration-of-neuroimaging-and-microarray-datasetsnbsp-through-mappin-a1484 |
id |
doaj-6cea8f70947b45d586e470cdb80116d8 |
---|---|
record_format |
Article |
spelling |
doaj-6cea8f70947b45d586e470cdb80116d8 2020-11-25T03:32:42Z eng SAGE Publishing Cancer Informatics 1176-9351 2009-01-01 87594 Integration of Neuroimaging and Microarray Datasets through Mapping and Model-Theoretic Semantic Decomposition of Unstructured Phenotypes Spiro P. Pantazatos Jianrong Li Paul Pavlidis Yves A. Lussier An approach towards heterogeneous neuroscience dataset integration is proposed that uses Natural Language Processing (NLP) and a knowledge-based phenotype organizer system (PhenOS) to link ontology-anchored terms to underlying data from each database, and then maps these terms based on a computable model of disease (SNOMED CT®). The approach was implemented using sample datasets from fMRIDC, GEO, The Whole Brain Atlas and Neuronames, and allowed for complex queries such as “List all disorders with a finding site of brain region X, and then find the semantically related references in all participating databases based on the ontological model of the disease or its anatomical and morphological attributes”. Precision of the NLP-derived coding of the unstructured phenotypes in each dataset was 88% (n = 50), and precision of the semantic mapping between these terms across datasets was 98% (n = 100). To our knowledge, this is the first example of the use of both semantic decomposition of disease relationships and hierarchical information found in ontologies to integrate heterogeneous phenotypes across clinical and molecular datasets. http://www.la-press.com/integration-of-neuroimaging-and-microarray-datasetsnbsp-through-mappin-a1484 computational ontologies phenotypes database interoperability Mediated Schema SNOMED |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Spiro P. Pantazatos Jianrong Li Paul Pavlidis Yves A. Lussier |
spellingShingle |
Spiro P. Pantazatos Jianrong Li Paul Pavlidis Yves A. Lussier Integration of Neuroimaging and Microarray Datasets through Mapping and Model-Theoretic Semantic Decomposition of Unstructured Phenotypes Cancer Informatics computational ontologies phenotypes database interoperability Mediated Schema SNOMED |
author_facet |
Spiro P. Pantazatos Jianrong Li Paul Pavlidis Yves A. Lussier |
author_sort |
Spiro P. Pantazatos |
title |
Integration of Neuroimaging and Microarray Datasets through Mapping and Model-Theoretic Semantic Decomposition of Unstructured Phenotypes |
title_short |
Integration of Neuroimaging and Microarray Datasets through Mapping and Model-Theoretic Semantic Decomposition of Unstructured Phenotypes |
title_full |
Integration of Neuroimaging and Microarray Datasets through Mapping and Model-Theoretic Semantic Decomposition of Unstructured Phenotypes |
title_fullStr |
Integration of Neuroimaging and Microarray Datasets through Mapping and Model-Theoretic Semantic Decomposition of Unstructured Phenotypes |
title_full_unstemmed |
Integration of Neuroimaging and Microarray Datasets through Mapping and Model-Theoretic Semantic Decomposition of Unstructured Phenotypes |
title_sort |
integration of neuroimaging and microarray datasets through mapping and model-theoretic semantic decomposition of unstructured phenotypes |
publisher |
SAGE Publishing |
series |
Cancer Informatics |
issn |
1176-9351 |
publishDate |
2009-01-01 |
description |
An approach towards heterogeneous neuroscience dataset integration is proposed that uses Natural Language Processing (NLP) and a knowledge-based phenotype organizer system (PhenOS) to link ontology-anchored terms to underlying data from each database, and then maps these terms based on a computable model of disease (SNOMED CT®). The approach was implemented using sample datasets from fMRIDC, GEO, The Whole Brain Atlas and Neuronames, and allowed for complex queries such as “List all disorders with a finding site of brain region X, and then find the semantically related references in all participating databases based on the ontological model of the disease or its anatomical and morphological attributes”. Precision of the NLP-derived coding of the unstructured phenotypes in each dataset was 88% (n = 50), and precision of the semantic mapping between these terms across datasets was 98% (n = 100). To our knowledge, this is the first example of the use of both semantic decomposition of disease relationships and hierarchical information found in ontologies to integrate heterogeneous phenotypes across clinical and molecular datasets. |
topic |
computational ontologies; phenotypes; database interoperability; Mediated Schema; SNOMED |
url |
http://www.la-press.com/integration-of-neuroimaging-and-microarray-datasetsnbsp-through-mappin-a1484 |
work_keys_str_mv |
AT spiroppantazatos integrationofneuroimagingandmicroarraydatasetsthroughmappingandmodeltheoreticsemanticdecompositionofunstructuredphenotypes AT jianrongli integrationofneuroimagingandmicroarraydatasetsthroughmappingandmodeltheoreticsemanticdecompositionofunstructuredphenotypes AT paulpavlidis integrationofneuroimagingandmicroarraydatasetsthroughmappingandmodeltheoreticsemanticdecompositionofunstructuredphenotypes AT yvesalussier integrationofneuroimagingandmicroarraydatasetsthroughmappingandmodeltheoreticsemanticdecompositionofunstructuredphenotypes |
_version_ |
1724566512309633024 |
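
The kind of query quoted in the description ("List all disorders with a finding site of brain region X, and then find the semantically related references in all participating databases") can be illustrated with a small sketch. This is not the authors' PhenOS implementation and it does not use SNOMED CT content: the concept names, the `IS_A` and `FINDING_SITE` tables, the record identifiers, and every function name below are hypothetical, chosen only to show how a hierarchy plus a finding-site relation could link records from different databases.

```python
from collections import defaultdict

# Hypothetical toy hierarchy (child -> parent); not taken from SNOMED CT.
IS_A = {
    "hippocampus": "temporal lobe",
    "amygdala": "temporal lobe",
    "temporal lobe": "brain",
    "frontal lobe": "brain",
}

# Hypothetical disorder -> finding-site relation.
FINDING_SITE = {
    "temporal lobe epilepsy": "temporal lobe",
    "hippocampal sclerosis": "hippocampus",
    "frontal lobe glioma": "frontal lobe",
}

# Hypothetical records already coded to one ontology term each:
# (database, record id, coded concept).
RECORDS = [
    ("fMRIDC", "study-1", "hippocampus"),
    ("GEO", "series-1", "hippocampal sclerosis"),
    ("GEO", "series-2", "frontal lobe glioma"),
    ("Whole Brain Atlas", "case-1", "temporal lobe epilepsy"),
    ("Neuronames", "entry-1", "amygdala"),
]


def ancestors(concept):
    """Return the concept followed by all of its ancestors in IS_A."""
    chain = [concept]
    while concept in IS_A:
        concept = IS_A[concept]
        chain.append(concept)
    return chain


def is_within(concept, region):
    """True if the concept is the region itself or one of its descendants."""
    return region in ancestors(concept)


def disorders_with_finding_site(region):
    """List disorders whose finding site lies in the region or a subregion."""
    return sorted(d for d, site in FINDING_SITE.items() if is_within(site, region))


def related_records(region):
    """Group records coded to a matching disorder or to anatomy inside the region."""
    disorders = set(disorders_with_finding_site(region))
    hits = defaultdict(list)
    for database, record_id, concept in RECORDS:
        if concept in disorders or is_within(concept, region):
            hits[database].append(record_id)
    return dict(hits)


if __name__ == "__main__":
    print(disorders_with_finding_site("temporal lobe"))
    print(related_records("temporal lobe"))
```

In this sketch, a query for "temporal lobe" picks up the disorder sited in the hippocampus because the hierarchy places the hippocampus under the temporal lobe, and it leaves out the frontal lobe record. A real system would replace the toy tables with the ontology's own relations and with NLP-derived codes for each record.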