On 12 June 2019 at 10.15 in room 111, J. Liivi 2, Kristi Läll will defend her thesis "Risk scores and their predictive ability for common complex diseases" for the degree of Doctor of Philosophy (Mathematical Statistics).
Supervisor:
Prof. Krista Fischer (Institute of Mathematics and Statistics, UT)
Opponents:
Assoc. Prof. Juan R. González Ruiz, Autonomous University of Barcelona, Spain
Assoc. Prof. Tanel Kaart, Estonian University of Life Sciences, Estonia
Summary:
The prices of genotyping and whole genome sequencing have decreased rapidly over the past few years. As a result, genotypic data has become available in large quantities, allowing extensive investigation of the genetic background of many common complex diseases. The most studied genetic variants are single nucleotide polymorphisms (SNPs). Each SNP on its own tends to have a small effect on a common complex disease. However, by combining the effects of many SNPs into one variable, called a genetic risk score (GRS), one can construct a useful predictor of the genetic predisposition to a disease. In this thesis, a new method called doubly-weighting is introduced. It allows many uncorrelated markers to be included, instead of only the few genome-wide significant ones from a genome-wide association study (GWAS), and at the same time aims to correct for the winner’s curse bias. We illustrate its predictive ability under several scenarios, using both simulations and Estonian Biobank data, to show that it systematically performs better than simpler methods. The second article investigates how the choice of GWAS affects the predictive ability of GRSs for breast cancer. We also combined several GRSs into one metaGRS to obtain the most predictive genetic score, and addressed the problem that different GRSs for the same disease with similar predictive ability are not necessarily highly correlated. Another important aspect influencing the predictive ability of a GRS is the similarity between the discovery dataset and the target dataset for which the GRS is intended. This is investigated in the third article, where it is shown that the distributions of GRSs depend heavily on the ancestral background of the population. In the fourth article, three known non-genetic risk scores for atherosclerotic cardiovascular disease (ASCVD) are validated in the Estonian Biobank data. Two of them were well calibrated, but for the newest and most complicated algorithm, developed in the UK, the observed number of cases was almost twice the number it predicted. We also compared the statin treatment recommendations based on guideline-specific criteria and found that statins for primary prevention were recommended for almost half of the men and a quarter of the women under investigation, illustrating the high risk levels of ASCVD in Estonia.