On 28 August at 14:15, Mikk Puustusmaa will defend his doctoral thesis "On the origin of papillomavirus proteins".
Supervisors:
Senior Research Fellow in Virology Aare Abroi, Institute of Technology
Professor of Bioinformatics Maido Remm, Institute of Molecular and Cell Biology
Opponent:
Senior Research Fellow Andrew E. Firth, PhD, Department of Pathology, University of Cambridge, Great Britain.
Viruses are obligate intracellular parasites that harbour enormous genetic and biological diversity, and they are the most abundant biological entities on Earth. Unlike cellular organisms, viruses have multiple evolutionary origins. Although there are many hypotheses about how viruses emerged, their exact origin is still unknown. The occurrence of viral protein domains in cellular organisms may provide information about the origin of viruses.
In this thesis, papillomaviruses (PVs) were used as an example to study the potential origin of a viral family. PVs infect many mammalian species, as well as birds, turtles, snakes, and fish. PVs have attracted interest because of their association with various cancers. Oncogenic human papillomaviruses (HPVs) are responsible for almost all cases of cervical and anal cancers. In this thesis, various sequence collections were analysed to detect distant homologs of PV protein domains in other organisms. We found that PVs have very weak connections to cellular organisms, as only domains from the E1 replication protein had distant homologs in cellular organisms. However, our study revealed that PVs are evolutionarily related to the families Polyomaviridae and Parvoviridae. Both families share structural homologs of the capsid protein L1 and of two domains of the replication protein E1.
Viral genomes consist mainly of protein-coding genes. Occasionally, some of these genes are fully embedded inside one another. In this thesis, over 300 PV genomes were analysed in silico to detect an embedded gene called E8, located within the E1 gene. E8 was detected in almost all PVs, except those infecting Sauropsida and fish. As these hosts are evolutionarily older than mammalian species, this indicates that E8 emerged after the divergence of mammals.
Detecting the dual-coding region E8 and other embedded elements requires dedicated tools. In this thesis, a web tool called cRegions (http://bioinfo.ut.ee/cRegions/) was developed to detect overlapping genes and other embedded elements in the protein-coding genes of viruses.
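
To give a rough idea of what searching for an embedded gene involves, the following is a minimal Python sketch that scans the two alternative reading frames of a coding sequence for open reading frames running from ATG to a stop codon. It is a naive illustration only and is not the method used in the thesis or by cRegions; the function name `embedded_orfs`, the `min_codons` threshold, and the toy sequence are all hypothetical. Such a simplification would miss embedded elements that do not follow a plain start-to-stop layout.

```python
# Naive illustration only: not the thesis method and not how cRegions works.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def embedded_orfs(cds, min_codons=10):
    """Find ATG..stop ORFs in the +1 and +2 frames of a coding sequence.

    cds is assumed to be a DNA string whose own reading frame starts at
    index 0. Returns a list of (frame, start, end) tuples with 0-based
    start and exclusive end, where end includes the stop codon.
    """
    cds = cds.upper()
    hits = []
    for frame in (1, 2):
        start = None  # index of the first ATG since the last stop codon
        for i in range(frame, len(cds) - 2, 3):
            codon = cds[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Count codons from ATG up to, but excluding, the stop.
                if (i - start) // 3 >= min_codons:
                    hits.append((frame, start, i + 3))
                start = None
    return hits


# Toy sequence (hypothetical): a short ORF sits in the +1 frame.
toy_cds = "A" + "ATG" + "GCT" * 3 + "TAA" + "CC"
toy_hits = embedded_orfs(toy_cds, min_codons=3)
print(toy_hits)
```

On real genomes, a candidate found this way would only be a starting point; distinguishing a functional embedded gene from a chance open reading frame needs further evidence, such as conservation across related genomes.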