KCML: a machine‐learning framework for inference of multi‐scale gene functions from genetic perturbation screens

Abstract Characterising context‐dependent gene functions is crucial for understanding the genetic bases of health and disease. To date, inference of gene functions from large‐scale genetic perturbation screens is based on ad hoc analysis pipelines involving unsupervised clustering and functional enr...

Bibliographic Details
Main Authors: Heba Z Sailem, Jens Rittscher, Lucas Pelkmans
Format: Article
Language: English
Published: Wiley 2020-03-01
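Series: Molecular Systems Biology
Subjects: cell morphology and microenvironment; CRISPR and siRNA screening; functional genomics; high content screening; olfactory receptors
Online Access: https://doi.org/10.15252/msb.20199083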
id doaj-237a9479d0044ece8ae00efad08f0afd
record_format Article
spelling doaj-237a9479d0044ece8ae00efad08f0afd; 2021-08-02T23:33:43Z; eng; Wiley; Molecular Systems Biology; ISSN 1744-4292; 2020-03-01; vol. 16, no. 3; pages n/a; doi:10.15252/msb.20199083
Heba Z Sailem (Department of Engineering Science, Institute of Biomedical Engineering, University of Oxford, Oxford, UK)
Jens Rittscher (Department of Engineering Science, Institute of Biomedical Engineering, University of Oxford, Oxford, UK)
Lucas Pelkmans (Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland)
collection DOAJ
language English
format Article
sources DOAJ
author Heba Z Sailem
Jens Rittscher
Lucas Pelkmans
title KCML: a machine‐learning framework for inference of multi‐scale gene functions from genetic perturbation screens
publisher Wiley
series Molecular Systems Biology
issn 1744-4292
publishDate 2020-03-01
description Abstract: Characterising context‐dependent gene functions is crucial for understanding the genetic bases of health and disease. To date, inference of gene functions from large‐scale genetic perturbation screens is based on ad hoc analysis pipelines involving unsupervised clustering and functional enrichment. We present Knowledge‐ and Context‐driven Machine Learning (KCML), a framework that systematically predicts multiple context‐specific functions for a given gene based on the similarity of its perturbation phenotype to those of genes with known function. As a proof of concept, we test KCML on three datasets describing phenotypes at the molecular, cellular and population levels and show that it outperforms traditional analysis pipelines. In particular, KCML identified an abnormal multicellular organisation phenotype associated with the depletion of olfactory receptors, as well as TGFβ and WNT signalling genes, in colorectal cancer cells. We validate these predictions in colorectal cancer patients and show that olfactory receptor expression is predictive of worse patient outcomes. These results highlight KCML as a systematic framework for discovering novel scale‐crossing and context‐dependent gene functions. KCML is highly generalisable and applicable to various large‐scale genetic perturbation screens.
topic cell morphology and microenvironment
CRISPR and siRNA screening
functional genomics
high content screening
olfactory receptors
url https://doi.org/10.15252/msb.20199083