Structural and functional annotation of genomes for the analysis of centromeric structures in Vitis species / Mario Liva, 2026 Mar 26. 38th cycle, Academic Year 2024/2025.
Structural and functional annotation of genomes for the analysis of centromeric structures in Vitis species
LIVA, MARIO
2026-03-26
Abstract
Third-generation sequencing enables fast, accurate reconstruction of reference genomes, producing high-quality assemblies even for non-model species. In this thesis, we present a case study of structural and functional genome annotation for six grapevine accessions (a quasi-homozygous line, three cultivars, one wild accession, and one interspecific hybrid) assembled using PacBio HiFi and ONT long reads. The resulting haplotype-resolved assemblies are highly complete, often spanning entire chromosomes from telomere to telomere. Their quality allowed detailed analysis of centromeric regions, revealing multiple tandem repeat families, sometimes forming megabase-scale arrays, interspersed with chromodomain-containing transposable elements (TEs), particularly of the Athila family. Additionally, comparative analyses uncovered two major centromere architectures that share a core set of repeat families. Comparison among the genotypes highlighted the forces shaping these regions, such as repeat expansion, structural rearrangements, and TE activity. Moreover, the cytosine methylation information contained in the ONT datasets allowed characterization of the epigenetic state of these regions, revealing both broad patterns and fine-scale regulatory variation. Overall, this work demonstrates the power of long-read sequencing for structural and functional genome annotation in understudied species.


