Integration of Neuroimaging and Microarray Datasets through Mapping and Model-Theoretic Semantic Decomposition of Unstructured Phenotypes

An approach towards heterogeneous neuroscience dataset integration is proposed that uses Natural Language Processing (NLP) and a knowledge-based phenotype organizer system (PhenOS) to link ontology-anchored terms to underlying data from each database, and then maps these terms based on a computable...

Bibliographic Details
Main Authors: Spiro P. Pantazatos, Jianrong Li, Paul Pavlidis, Yves A. Lussier
Format: Article
Language: English
Published: SAGE Publishing 2009-06-01
Series: Cancer Informatics
Online Access: http://la-press.com/integration-of-neuroimaging-and-microarray-datasetsnbsp-through-mappin-a1484
id doaj-0e76e1fa60034b5197a5f06c80fb2dd6
record_format Article
spelling doaj-0e76e1fa60034b5197a5f06c80fb2dd6 2020-11-25T02:53:44Z eng SAGE Publishing Cancer Informatics 1176-9351 2009-06-01 2009Semantic Technologie7594
collection DOAJ
language English
format Article
sources DOAJ
author Spiro P. Pantazatos
Jianrong Li
Paul Pavlidis
Yves A. Lussier
author_sort Spiro P. Pantazatos
title Integration of Neuroimaging and Microarray Datasets through Mapping and Model-Theoretic Semantic Decomposition of Unstructured Phenotypes
title_sort integration of neuroimaging and microarray datasets through mapping and model-theoretic semantic decomposition of unstructured phenotypes
publisher SAGE Publishing
series Cancer Informatics
issn 1176-9351
publishDate 2009-06-01
description An approach towards heterogeneous neuroscience dataset integration is proposed that uses Natural Language Processing (NLP) and a knowledge-based phenotype organizer system (PhenOS) to link ontology-anchored terms to underlying data from each database, and then maps these terms based on a computable model of disease (SNOMED CT®). The approach was implemented using sample datasets from fMRIDC, GEO, The Whole Brain Atlas and Neuronames, and allowed for complex queries such as “List all disorders with a finding site of brain region X, and then find the semantically related references in all participating databases based on the ontological model of the disease or its anatomical and morphological attributes”. Precision of the NLP-derived coding of the unstructured phenotypes in each dataset was 88% (n = 50), and precision of the semantic mapping between these terms across datasets was 98% (n = 100). To our knowledge, this is the first example of the use of both semantic decomposition of disease relationships and hierarchical information found in ontologies to integrate heterogeneous phenotypes across clinical and molecular datasets.
url http://la-press.com/integration-of-neuroimaging-and-microarray-datasetsnbsp-through-mappin-a1484
_version_ 1724724808661336064
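
The example query quoted in the description ("List all disorders with a finding site of brain region X, and then find the semantically related references in all participating databases") can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the concept names, the is-a hierarchy, the finding-site table, the dataset names and the record identifiers are invented toy data, not SNOMED CT content and not data from fMRIDC, GEO, The Whole Brain Atlas or Neuronames. The sketch is also not the PhenOS implementation; it only shows how a finding-site attribute and hierarchical subsumption could be combined to answer such a query.

```python
# Hypothetical toy ontology: NOT real SNOMED CT concepts or codes.
IS_A = {  # anatomy concept -> direct parent concepts
    "hippocampus": ["temporal lobe"],
    "temporal lobe": ["cerebrum"],
    "cerebrum": ["brain"],
    "cerebellum": ["brain"],
}

# Hypothetical disease model: disorder -> finding sites (anatomy concepts).
FINDING_SITE = {
    "disorder A": ["hippocampus"],
    "disorder B": ["cerebellum"],
    "disorder C": ["temporal lobe"],
}

# Hypothetical coded annotations: dataset -> (record id, coded concept).
ANNOTATIONS = {
    "imaging_db": [("img-1", "disorder A"), ("img-2", "cerebellum")],
    "expression_db": [("expr-1", "hippocampus"), ("expr-2", "disorder B")],
}


def ancestors(concept):
    """Return every concept reachable from `concept` by following is-a links."""
    seen = set()
    stack = list(IS_A.get(concept, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(IS_A.get(node, []))
    return seen


def is_subsumed_by(concept, ancestor):
    """True if `concept` equals `ancestor` or is one of its descendants."""
    return concept == ancestor or ancestor in ancestors(concept)


def disorders_with_site(region):
    """Disorders whose finding site is `region` or any part of it."""
    return sorted(
        disorder
        for disorder, sites in FINDING_SITE.items()
        if any(is_subsumed_by(site, region) for site in sites)
    )


def related_records(region):
    """Records in each dataset coded with a matching disorder or sub-region."""
    anatomy = set(IS_A) | {p for parents in IS_A.values() for p in parents}
    targets = set(disorders_with_site(region))
    targets |= {c for c in anatomy if is_subsumed_by(c, region)}
    return {
        dataset: [rec for rec, concept in records if concept in targets]
        for dataset, records in ANNOTATIONS.items()
    }


if __name__ == "__main__":
    print(disorders_with_site("temporal lobe"))
    print(related_records("temporal lobe"))
```

In this sketch a record matches either because it is coded with a disorder whose finding site falls under the queried region, or because it is coded directly with the region or one of its parts. That mirrors, in a very reduced form, the combination of disease-model attributes and ontology hierarchy that the description refers to.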