Integration of Neuroimaging and Microarray Datasets through Mapping and Model-Theoretic Semantic Decomposition of Unstructured Phenotypes

An approach towards heterogeneous neuroscience dataset integration is proposed that uses Natural Language Processing (NLP) and a knowledge-based phenotype organizer system (PhenOS) to link ontology-anchored terms to underlying data from each database, and then maps these terms based on a computable...

Bibliographic Details
Main Authors: Spiro P. Pantazatos, Jianrong Li, Paul Pavlidis, Yves A. Lussier
Format: Article
Language: English
Published: SAGE Publishing 2009-01-01
Series: Cancer Informatics
Subjects: computational ontologies; phenotypes; database interoperability; Mediated Schema; SNOMED
Online Access: http://www.la-press.com/integration-of-neuroimaging-and-microarray-datasetsnbsp-through-mappin-a1484
id doaj-6cea8f70947b45d586e470cdb80116d8
record_format Article
collection DOAJ
language English
format Article
sources DOAJ
author Spiro P. Pantazatos
Jianrong Li
Paul Pavlidis
Yves A. Lussier
title Integration of Neuroimaging and Microarray Datasets through Mapping and Model-Theoretic Semantic Decomposition of Unstructured Phenotypes
publisher SAGE Publishing
series Cancer Informatics
issn 1176-9351
publishDate 2009-01-01
description An approach towards heterogeneous neuroscience dataset integration is proposed that uses Natural Language Processing (NLP) and a knowledge-based phenotype organizer system (PhenOS) to link ontology-anchored terms to underlying data from each database, and then maps these terms based on a computable model of disease (SNOMED CT®). The approach was implemented using sample datasets from fMRIDC, GEO, The Whole Brain Atlas and Neuronames, and allowed for complex queries such as “List all disorders with a finding site of brain region X, and then find the semantically related references in all participating databases based on the ontological model of the disease or its anatomical and morphological attributes”. Precision of the NLP-derived coding of the unstructured phenotypes in each dataset was 88% (n = 50), and precision of the semantic mapping between these terms across datasets was 98% (n = 100). To our knowledge, this is the first example of the use of both semantic decomposition of disease relationships and hierarchical information found in ontologies to integrate heterogeneous phenotypes across clinical and molecular datasets.
topic computational ontologies
phenotypes
database interoperability
Mediated Schema
SNOMED
url http://www.la-press.com/integration-of-neuroimaging-and-microarray-datasetsnbsp-through-mappin-a1484