Summary: | Copy number changes in protein-coding genes are detrimental if the resulting changes in protein concentration disrupt essential cellular functions. The dosage sensitivity of transcription factor (TF) genes is of particular interest because their products are essential for regulating the expression of genetic information. From four recently curated data sets of dosage-sensitive genes (genes with conserved copy numbers across mammals, ohnologs, and two data sets of haploinsufficient genes), we compiled a data set of the most reliable dosage-sensitive (MRDS) genes and a data set of the most reliable dosage-insensitive (MRDIS) genes. The MRDS genes were those present in all four data sets, whereas the MRDIS genes were those absent from all four data sets and with a probability of being loss-of-function intolerant (pLI) < 0.5 in both haploinsufficient gene data sets. Enrichment analysis of TF genes among the MRDS and MRDIS genes showed that TF genes are more likely to be dosage-sensitive than other genes in the human genome. The nuclear receptor family was the most enriched TF family among the dosage-sensitive genes. TF families with very few members were also more likely to be dosage-sensitive than TF families with more members. In addition, we identified a set of dosage-insensitive TFs, the most typical of which were the Krüppel-associated box domain-containing zinc-finger proteins (KZFPs). Gene ontology (GO) enrichment analysis showed that the MRDS TFs were enriched for many more terms than the MRDIS TFs; however, the proteins interacting with these two groups of TFs did not differ as sharply. Furthermore, the MRDIS KZFPs were not significantly enriched for any GO terms, whereas their interacting proteins were significantly enriched for thousands of GO terms. Further characterization revealed significant differences between MRDS TFs and MRDIS TFs in the lengths and nucleotide compositions of their DNA-binding sites as well as in expression level, protein size, and selective pressure.
|
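The construction of the two gene sets and the TF enrichment test described in the summary can be sketched as follows. This is a minimal illustration, not the authors' pipeline. It assumes that MRDIS genes are those found in none of the four source data sets, that genes lacking a pLI score in either haploinsufficiency data set are excluded, and that enrichment is assessed with a one-sided hypergeometric test. The function names `build_reliable_sets` and `enrichment_p` and all argument names are hypothetical.

```python
from math import comb


def build_reliable_sets(all_genes, sources, pli_a, pli_b, pli_cutoff=0.5):
    """Return (mrds, mrdis) gene sets.

    all_genes: set of all gene IDs under consideration.
    sources:   list of the four dosage-sensitive gene sets.
    pli_a, pli_b: dicts mapping gene ID -> pLI score, one per
                  haploinsufficiency data set (hypothetical inputs).
    """
    # MRDS: genes present in every source data set.
    mrds = set.intersection(*sources)
    # MRDIS: genes present in none of the source data sets ...
    in_any = set.union(*sources)
    mrdis = {
        g for g in all_genes - in_any
        # Assumption: genes without a pLI score in both data sets are dropped.
        if g in pli_a and g in pli_b
        # ... and with pLI below the cutoff in both data sets.
        and pli_a[g] < pli_cutoff and pli_b[g] < pli_cutoff
    }
    return mrds, mrdis


def enrichment_p(n_hits, n_group, n_category, n_total):
    """One-sided hypergeometric tail P(X >= n_hits).

    n_hits:     genes in the group that belong to the category (e.g. TFs).
    n_group:    size of the group (e.g. MRDS genes).
    n_category: genes in the background that belong to the category.
    n_total:    size of the background.
    """
    upper = min(n_group, n_category)
    tail = sum(
        comb(n_category, k) * comb(n_total - n_category, n_group - k)
        for k in range(n_hits, upper + 1)
    )
    return tail / comb(n_total, n_group)
```

With these helpers, TF enrichment among MRDS genes would be `enrichment_p(len(mrds & tf_genes), len(mrds), len(tf_genes), len(all_genes))`, where `tf_genes` is a hypothetical set of TF gene IDs drawn from the same background.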