Summary: | The reference genome sequence represents a key resource for genetic studies of the target species. In 2009, two
reference assemblies of the cattle (Bos taurus) genome were published (Btau 4.0 and UMD 2.0). Both assemblies have been
upgraded several times since then. The highly polymorphic major histocompatibility complex (MHC) encodes proteins crucial
for immune recognition and regulation of the immune response in vertebrates. It is characterised by extensive nucleotide
diversity, copy number variation of paralogous genes, and long repetitive sequences. In cattle, the MHC is designated
BoLA (bovine leucocyte antigen) and is located on chromosome 23. Its organisation differs from that of typical mammalian MHCs.
This structural complexity makes it difficult to assemble a reliable reference sequence of the region, which in turn
makes BoLA a good model region for comparing the accuracy of different assembly strategies. Recent
advances in long-read sequencing, combined with new scaffolding technologies, enabled the release of the new
bovine reference genome assembly ARS-UCD 1.2, which is a significant improvement over previous bovine genome
assembly releases. In the current study, Mauve, a software tool for multiple alignment of conserved genomic sequences
with rearrangements, was used to identify differences in the genomic organisation of the BoLA region as assembled in three
bovine reference genomes: Btau 5.0.1, UMD 3.1.1, and ARS-UCD 1.2. Multiple alignment of the bovine chromosome
23 sequences extracted from the three genome assemblies revealed differences in the structure of the BoLA region.
Segments encoding the genes BOLA-DMA and BOLA-DQB are rearranged and inverted in the new assembly relative to the
previous builds.
|