PGx in Estonia

Revision as of 12:13, 24 August 2018

The Estonian Genome Centre at the University of Tartu has done considerable work translating genotype data of 44,000 biobank participants into clinical pharmacogenetic recommendations.

Bioinformatic pipelines

{|
! Technology !! Methods !! Comments
|-
| High density microarrays || HumanOmniExpress beadchip (OMNI, 8132 patients) and Global Screening Array (GSA, Illumina, 33157 patients), GenomeStudio (Illumina, genotyping, filtering for GSA), PLINK (filtering for all), zCall (genotyping rare variants for GSA), Eagle2 (phasing), Beagle (imputation, population-specific imputation panel from WGS) || 1308 of these patients were also whole-genome sequenced
|-
| Whole genome sequencing || TruSeq PCR-free prep, Illumina HiSeq X (150bp paired-end, 30x mean coverage), BWA-MEM (GRCh37 reference genome), Picard (mark PCR duplicates), GATK 3.4, bcftools (normalization and decomposition), Genome STRiP (CNV calls for CYP2D6, 2269 patients), Astrolabe (allele matching for CYP2D6, for comparison) || Quality filtering parameters are given in the article. The WGS samples (with some modifications) were also merged into a reference panel used for imputation (total 2279 Estonians and 1856 Finns). Cf. Mitt et al.
|-
| Whole exome sequencing || Agilent SureSelect Human All Exon V5+UTRs target capture kit, HiSeq2500 (67x mean coverage), BWA-MEM (GRCh37 reference genome), Picard (mark PCR duplicates), GATK 3.4, bcftools (normalization and decomposition) ||
|}
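The "normalization and decomposition" step matters for allele matching because a star allele definition lists single variants, while a VCF record may carry several alternate alleles at once. The following is a minimal sketch of the decomposition idea only, not the bcftools implementation: the names `Variant`, `trim` and `decompose` and the example coordinates are hypothetical, and left-alignment of indels in repeats (which needs the reference sequence) is not attempted.

```python
from typing import List, NamedTuple, Tuple


class Variant(NamedTuple):
    chrom: str
    pos: int  # 1-based position of the first REF base
    ref: str
    alt: str


def trim(pos: int, ref: str, alt: str) -> Tuple[int, str, str]:
    """Remove bases shared by REF and ALT, keeping at least one base in each."""
    # Trim identical trailing bases first.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Then trim identical leading bases, shifting the position accordingly.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt, pos = ref[1:], alt[1:], pos + 1
    return pos, ref, alt


def decompose(chrom: str, pos: int, ref: str, alts: List[str]) -> List[Variant]:
    """Split one multiallelic record into trimmed biallelic records."""
    return [Variant(chrom, *trim(pos, ref, alt)) for alt in alts]


# Hypothetical multiallelic site: REF=CAG, ALT=CG,TAG
records = decompose("22", 100, "CAG", ["CG", "TAG"])
```

After this step each record describes exactly one alternate allele, so it can be compared directly with one entry of an allele definition table.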

Challenges and solutions

{|
! Challenge !! Solution !! Comments
|-
| Allele definition || Pruning of allele definitions (removing variants from allele definitions, i.e. only keeping variants that destroy the protein, and removing alleles with unknown function) || The allele pruning also makes it more likely that patients are indeed normal, removing most sources of alleles with unknown function
|-
| HLA-typing || SNP2HLA tool (WGS only) || SNP2HLA is a fast and reasonably accurate tool, but from another article it seems that in a clinical setting, other tools may be considered
|-
| [[Allele definition|Multiple allele matches]] || Made hierarchy of alleles based on the biochemical function (No function > Decreased Function > Other functional statuses) || Probably this can be seen as a variant of the best solution to the [[Unknown function|unknown function problem]]: look for the most serious consequence, and if no allele with a serious consequence was found, assume Normal function.
|-
| Haplotype calling || Haplotype estimation for WGS was performed, but it is unclear which method was used. Probably the methodology is similar to that used in [https://www.nature.com/articles/ejhg201751 ''Mitt et al.''], in which case they used SHAPEIT2. Eagle2 is 6 times faster (cf. speed advantage of [https://www.ncbi.nlm.nih.gov/pubmed/27802932 SNP2HLA vs OptiType]). || In general, the difference between haplotyping and PGx allele matching is not clear (maybe it is right to say that PGx allele matching is a subset of general haplotyping?). Where there was more than one star allele match per haplotype, they matched all possible star allele diplotypes and picked the diplotype with the most serious clinical consequence
|-
| CYP2D6 calling || Combination of Genome STRiP and normal allele matching (favorable comparison to Astrolabe used by PharmCAT) || Did not understand exactly how they did it (maybe check out the reference by [https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5292679/ ''Gaedigk et al.''])
|}
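The function hierarchy and the "most serious diplotype" rule from the table can be sketched as follows. This is an illustration under stated assumptions, not the method used in the article: the allele names and their functions are placeholders, `*1` is assumed to be the reference allele, and the seriousness of a diplotype is approximated by the sorted function ranks of its two alleles, since the clinical scoring itself is not described here.

```python
from itertools import product

# Lower rank = more serious. Any status not listed is treated as "other".
SEVERITY = {"No function": 0, "Decreased function": 1}
OTHER_RANK = 2


def rank(function: str) -> int:
    """Rank a functional status according to the hierarchy in the table."""
    return SEVERITY.get(function, OTHER_RANK)


def pick_diplotype(hap1_matches, hap2_matches, allele_function, default="*1"):
    """Enumerate all candidate diplotypes and return the most serious one.

    A haplotype without any match falls back to the default allele, and an
    allele without a recorded status is assumed to have Normal function.
    """
    candidates = product(hap1_matches or [default], hap2_matches or [default])

    def severity(diplotype):
        # Assumption: compare diplotypes by their sorted allele ranks.
        return sorted(
            rank(allele_function.get(allele, "Normal function"))
            for allele in diplotype
        )

    return min(candidates, key=severity)


# Placeholder function table, for illustration only.
allele_function = {
    "*2": "No function",
    "*9": "Decreased function",
    "*17": "Increased function",
}

best = pick_diplotype(["*17", "*2"], ["*9"], allele_function)
fallback = pick_diplotype(["*17"], [], allele_function)
```

Here the first haplotype matches both `*17` and `*2`, so two diplotypes are possible; the one containing the no-function allele is chosen. Ties are resolved by enumeration order, which a real pipeline would want to make explicit.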

Take home messages

  • Haplotype calling is essential
  • Prefiltering (pruning) of the allele definition tables provided by PharmGKB
  • Rare variants (< 1% minor allele frequency) account for 89% of all distinct deleterious mutations; according to Lauschke et al., they affect 30-40% of patients with non-normal allele function
  • Rare variants should only be used for research
  • For some genes, multiple star alleles are expected on the same haplotype. Suggestion: look at the functional effect of the variants within star alleles rather than at the star alleles themselves, and build decision trees that prioritize variants
  • WES is not good enough for PGx unless customized probes are added (which is generally more expensive than a pure microarray approach)
  • Microarrays with imputation of unknown variants are a cost-effective approach to PGx
  • WGS gives similar quality to microarrays. In addition, WGS allows HLA calling and finds additional variants that are not yet actionable
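The prefiltering (pruning) of allele definitions mentioned above can be sketched in a few lines. This is a simplified illustration, not the PharmGKB table format or the article's code: the allele names, variant identifiers and the set of damaging variants are all hypothetical, and dropping an allele that has no damaging variant left is an assumption made here so that such haplotypes fall through to normal function.

```python
def prune_definitions(definitions, allele_function, damaging_variants):
    """Keep only damaging variants and drop alleles with unknown function."""
    pruned = {}
    for allele, variants in definitions.items():
        function = allele_function.get(allele, "Unknown function")
        if function == "Unknown function":
            continue  # alleles with unknown function are removed entirely
        kept = frozenset(variants) & damaging_variants
        if kept:  # assumption: an allele with nothing left is not called
            pruned[allele] = kept
    return pruned


def match_alleles(haplotype_variants, pruned):
    """Return every allele whose remaining variants are all on the haplotype."""
    observed = set(haplotype_variants)
    return sorted(a for a, needed in pruned.items() if needed <= observed)


# Hypothetical definition table: allele -> defining variants
definitions = {"*A": ["v1", "v2"], "*B": ["v3"], "*C": ["v2", "v4"]}
allele_function = {
    "*A": "No function",
    "*B": "Unknown function",
    "*C": "Decreased function",
}
damaging = {"v1", "v4"}

pruned = prune_definitions(definitions, allele_function, damaging)
calls = match_alleles(["v1", "v9"], pruned)
```

A haplotype that carries none of the remaining damaging variants matches no allele, which corresponds to assuming normal function for it.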