Summary: | Background: The skeletal muscle transcriptome has been analyzed to identify muscle-specific biomarkers, but muscle heterogeneity has largely been overlooked in the pursuit of a balanced design. Given the heterogeneity of muscle tissue in both function and fiber-type composition, several unaccounted sources of variation could affect the gene expression profile of skeletal muscle. Categorizing the muscle transcriptome according to its sources of variation will not only improve the power of transcriptome comparison tests but also help to identify unaccounted biological sources of variation. Materials and Methods: Gene expression profiles of skeletal muscle from normal subjects (GSE18732) were analyzed with R statistical software and Bioconductor packages. Gene-sets were prepared by grouping Affymetrix probes according to the biological processes to which they were annotated. A coherence score and an associated P value were calculated for each gene-set. All gene-sets with P < 0.05 were selected as coherent gene-sets. Results: We analyzed gene-sets and used coherence scores to measure the degree of coregulation among the genes of a gene-set. We show that coherent gene-sets are more likely than noncoherent gene-sets to classify samples into biologically relevant subgroups. Further, we applied the developed method to the muscle gene expression profiles and found that the muscle fiber-type proportion in the collected biopsies is one of the most prominent unaccounted “sources of variation” affecting gene expression measurements. Conclusion: Sample classification based on the expression profiles of genes belonging to coherent gene-sets is more likely to yield biologically meaningful clusters.
|
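The summary does not define how the coherence score or its P value is computed, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It assumes the score is the mean pairwise Pearson correlation among the genes of a gene-set, and that the P value comes from comparing this score with scores of randomly drawn gene-sets of the same size. The function names (`pearson`, `coherence_score`, `coherence_p_value`) and the toy data are hypothetical.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coherence_score(rows):
    """ASSUMPTION: coherence = mean pairwise correlation of the genes."""
    pairs = list(combinations(rows, 2))
    return sum(pearson(x, y) for x, y in pairs) / len(pairs)


def coherence_p_value(expr, set_idx, n_perm=200, seed=0):
    """ASSUMPTION: empirical P value against random same-size gene-sets."""
    rng = random.Random(seed)
    observed = coherence_score([expr[i] for i in set_idx])
    hits = 0
    for _ in range(n_perm):
        idx = rng.sample(range(len(expr)), len(set_idx))
        if coherence_score([expr[i] for i in idx]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (not GSE18732): 100 genes x 20 samples of pure noise,
# with the first 8 genes driven by one shared signal (coregulated).
rng = random.Random(1)
n_genes, n_samples = 100, 20
shared = [rng.gauss(0, 1) for _ in range(n_samples)]
expr = [[rng.gauss(0, 1) for _ in range(n_samples)] for _ in range(n_genes)]
for g in range(8):
    expr[g] = [v + 3 * s for v, s in zip(expr[g], shared)]

score, p_value = coherence_p_value(expr, list(range(8)))
is_coherent = p_value < 0.05  # selection threshold stated in the summary
print(score, p_value, is_coherent)
```

Under these assumptions, a gene-set whose members rise and fall together across samples receives a high score and a small P value, whereas a gene-set of unrelated genes scores near zero and would not pass the P < 0.05 selection.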