Automated classification of homologous human chromosomes using digital fluorescence microscopy images

Bibliographic Details
Main Author: Mousavi, Parvin
Language: English
Published: 2009
Online Access: http://hdl.handle.net/2429/13770
Description
Summary: This thesis is concerned with the classification of chromosome pairs into their paternal and maternal homologs. Many disorders, including cancer, have inheritable components present in one of the parental chromosomes. To trace the involvement of such chromosome abnormalities in the development of cancer, it is important to analyze the parental homologs separately. In the last decade, the role of image processing techniques in chromosome analysis has expanded significantly, owing to the introduction of novel peptide nucleic acid probes and high-quality microscopes, which together enable a chromosome imaging technique known as "quantitative fluorescence in-situ hybridization". We use this technique to acquire images of chromosomes with high quantitative information for homolog classification.

There is no biological method to verify homolog differentiation. However, for certain chromosomes, after staining with fluorescent probes, centromere intensities, telomere lengths and chromosome shapes show apparent differences between homologs. These differences reflect true heteromorphism, and a cytotechnician can visually classify the homologs of such chromosomes using their heteromorphic properties. We choose to analyze automatic homolog classification on heteromorphic chromosomes, since for these we are provided with ground truth for the classification results. In addition, since no robust means exists for verifying the results, we use multi-feature analysis of digital chromosome images: high correlation between the classification results obtained from multiple features is employed as a verification tool.

The features used for homolog classification in this study are centromere and telomere intensities, and the length of the P-arm. To calculate centromere intensities, we study and develop methods for centromere segmentation. Two novel methods are introduced to segment centromeres from DAPI images. These methods use mode histograms, thresholding, and the 2D gradient of the chromosomes. Intensities of the segmented centromeres are calculated and homologous chromosomes are classified into parental classes. The results are verified on heteromorphic chromosome 16, where a technician can visually tell the homologs apart. A fuzzy segmentation algorithm is employed to segment centromeres from FITC images, and the intensities of the segmented centromeres are used to classify homologs. These results are then verified using heteromorphic chromosome 22, where the appearance of the P-arms differs between homologs and can be differentiated by a technician. Telomere lengths are also calculated for each chromosome using previously developed software.

For the first time, an automatic quantitative classification method is proposed for homolog differentiation using multiple features. This method is based on mutual information maximization applied to an unsupervised neural network architecture. We designed and tested our algorithm on chromosome 16, since for this chromosome a cytotechnician can visually classify the homologs, providing us with ground truth. The neural network consists of separate modules trained to classify homologs using independent features. The first module classifies homologs of chromosome 16 based on differences in their centromere intensities. The second module uses the P-arm and Q-arm telomere lengths to classify the homologs of chromosome 16. Mutual information is then maximized between the outputs of the modules, training them to produce the same classification results for a given chromosome. A unique input structure is suggested to enable the neural network to compare features within a pair of homologous chromosomes.

The results of this work constitute a major step towards multi-feature analysis of chromosome images. Moreover, such analysis will allow a comprehensive evaluation of centromere and telomere repeat content in various chromosome instability syndromes, such as cancer.
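
The mutual information criterion mentioned in the summary can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `mutual_information`, the two-class soft outputs, and the batch-averaged joint distribution are all assumptions made for illustration. It estimates the mutual information between the outputs of two hypothetical modules (one fed centromere intensities, the other telomere lengths) over a batch of homolog pairs.

```python
import numpy as np

def mutual_information(p_a, p_b, eps=1e-12):
    """Estimate mutual information (in nats) between two binary module outputs.

    p_a[i], p_b[i]: probability that module A / module B assigns sample i
    to class 1 (hypothetical soft outputs, not taken from the thesis).
    """
    p_a = np.asarray(p_a, dtype=float)
    p_b = np.asarray(p_b, dtype=float)
    # Joint distribution over (class from A, class from B), averaged over the
    # batch; assumes the two outputs are independent given the input sample.
    joint = np.array([
        [np.mean((1 - p_a) * (1 - p_b)), np.mean((1 - p_a) * p_b)],
        [np.mean(p_a * (1 - p_b)), np.mean(p_a * p_b)],
    ])
    marg_a = joint.sum(axis=1, keepdims=True)  # marginal of module A
    marg_b = joint.sum(axis=0, keepdims=True)  # marginal of module B
    # I(A; B) = sum_ab p(a, b) * log(p(a, b) / (p(a) * p(b)))
    return float(np.sum(joint * (np.log(joint + eps) - np.log(marg_a * marg_b + eps))))

# Modules that always agree share one bit (log 2 nats) of information;
# modules whose outputs are unrelated share none.
agree = mutual_information([1, 0, 1, 0], [1, 0, 1, 0])
unrelated = mutual_information([1, 0, 1, 0], [1, 1, 0, 0])
```

In a training loop, a quantity of this kind would serve as the objective to maximize with respect to both modules' weights. Note that it is equally high when the modules disagree consistently, which reflects the fact that class labels are arbitrary in an unsupervised setting.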