Summary: | Despite numerous efforts to elucidate the etiology of lung cancer and improve its treatment, the overall five-year survival rate remains only 15%. Identifying prognostic biomarkers for lung cancer with gene expression microarrays is challenging because very few genes overlap among the signatures reported by different studies. To address this issue, we performed concurrent genome-wide analyses of copy number variation and gene expression to identify genes reproducibly associated with tumorigenesis and survival in lung adenocarcinoma in non-smoking women. We mapped the genomic landscape of frequent copy number variable regions (CNVRs), present in at least 30% of samples, and their aberration patterns were highly similar to those reported in several previous studies. Further statistical analysis of genes located in the CNVRs identified 475 genes differentially expressed between tumor and normal tissues (p < 10^-5). We demonstrated the reproducibility of these genes in another lung cancer study (p = 0.0034, Fisher's exact test) and showed that copy number variations and gene expression changes were concordant, as indicated by elevated Pearson correlation coefficients. Pathway analysis revealed two major dysregulated functions in lung tumorigenesis: survival regulation via AKT signaling and cytoskeleton reorganization. Further validation of these enriched pathways in three independent cohorts demonstrated effective prediction of survival. In conclusion, by integrating gene expression profiles and copy number variations, we identified genes and pathways that may serve as prognostic biomarkers for lung tumorigenesis.
|
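The two statistics named in the summary, Fisher's exact test for gene-list overlap and the Pearson correlation between copy number and expression, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `overlap_p_value` and `pearson` are hypothetical, the test is assumed to be a one-sided enrichment test (the summary does not say which alternative was used), and every number in the usage lines is a made-up toy value rather than data from the study.

```python
from math import comb, sqrt


def overlap_p_value(universe, set_a, set_b, overlap):
    """One-sided Fisher's exact test for the overlap of two gene lists.

    Returns P(X >= overlap), where X is the number of shared genes when
    set_b genes are drawn at random from a universe that contains set_a
    genes of interest (the hypergeometric upper tail).
    """
    max_k = min(set_a, set_b)
    total = comb(universe, set_b)
    tail = sum(
        comb(set_a, k) * comb(universe - set_a, set_b - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Toy values only (hypothetical, not taken from the study).
p = overlap_p_value(universe=1000, set_a=50, set_b=40, overlap=8)
copy_number = [1.8, 2.0, 2.6, 3.1, 3.9]    # per-sample copy number of one gene
expression = [5.1, 5.4, 6.0, 6.8, 7.5]     # per-sample expression of the same gene
r = pearson(copy_number, expression)
print(f"overlap p-value: {p:.3g}, Pearson r: {r:.3f}")
```

A small p-value indicates that two gene lists share more genes than expected by chance, and a Pearson coefficient near 1 indicates that a gene's expression rises with its copy number across samples.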