Difference between revisions of "PGx in Estonia"


Revision as of 13:34, 23 August 2018

The Estonian Genome Centre at the University of Tartu has done considerable work translating genotype data of 44,000 biobank participants into clinical pharmacogenetic recommendations.

Here we list the bioinformatic pipelines used for the biobank:

{|
! Technology !! Methods !! Comments
|-
| High density microarrays || HumanOmniExpress beadchip (OMNI, 8132 patients) and Global Screening Array (GSA, Illumina, 33157 patients), GenomeStudio (Illumina, genotyping, filtering for GSA), PLINK (filtering for all), zCall (genotyping rare variants for GSA), Eagle2 (phasing), Beagle (imputation, population-specific imputation panel from WGS) || 1308 of these patients were also whole-genome sequenced
|-
| Whole genome sequencing || TruSeq PCR-free prep, Illumina HiSeq X (150 bp paired-end, 30x mean coverage), BWA-MEM (GRCh37 reference genome), Picard (mark PCR duplicates), GATK 3.4, bcftools (normalization and decomposition), Genome STRiP (CNV calls for CYP2D6, 2269 patients), Astrolabe (allele matching for CYP2D6, for comparison) || Quality filtering parameters are given in the article. The WGS samples (with some modifications) were also merged into a reference panel used for imputation (total 2279 Estonians and 1856 Finns)
|-
| Whole exome sequencing || Agilent SureSelect Human All Exon V5+UTRs target capture kit, HiSeq2500 (67x mean coverage), BWA-MEM (GRCh37 reference genome), Picard (mark PCR duplicates), GATK 3.4, bcftools (normalization and decomposition) ||
|}

Here we list some of the challenges and solutions they identified:

{|
! [[NGS|Challenge]] !! Solution !! Comments
|-
| [[Allele definition]] || Pruning of allele definitions: removing variants from allele definitions (i.e. only keeping variants that destroy the protein) and removing [[Unknown function|alleles with unknown function]] || Allele pruning also makes it more likely that patients are indeed normal, and it removes most sources of [[Unknown function|alleles with unknown function]]
|-
| [[NGS|HLA-typing]] || SNP2HLA tool (WGS only) || SNP2HLA is a fast and reasonably accurate tool, but it seems that in a clinical setting [https://www.ncbi.nlm.nih.gov/pubmed/27802932 other tools may be considered]
|-
| [[Allele definition|Multiple allele matches]] || Ranked alleles in a hierarchy based on biochemical function (No function > Decreased function > Other functional statuses) || This can probably be seen as a variant of the best solution to the [[Unknown function|unknown function problem]]: look for the most serious consequence and, if none is found, assume normal function.
|-
| Haplotype calling || Eagle2. Where there was more than one star allele match per haplotype, they matched all possible star allele diplotypes || Haplotype estimation for WGS was performed, but it is unclear which method was used (probably Eagle2, as for microarrays)
|-
| CYP2D6 calling || Combination of Genome STRiP and normal allele matching (favorable comparison to Astrolabe used by PharmCAT) || It is not clear to us exactly how this was done (the reference by [https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5292679/ ''Gaedigk et al.''] may help)
|}
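
To make the "multiple allele matches" and "haplotype calling" rows more concrete, here is a minimal Python sketch of the two ideas: picking the most serious functional consequence among several matches, and enumerating all possible diplotypes when a haplotype matches more than one star allele. This is not the Estonian pipeline itself. The function names, the numeric ranks, the example alleles and the use of `*1` as the fallback reference allele are all assumptions made for illustration.

```python
from itertools import product

# Assumed ranking: lower number = more serious consequence.
# The labels follow the hierarchy in the table; the numeric ranks are illustrative.
SEVERITY = {"No function": 0, "Decreased function": 1}
OTHER_RANK = 2  # every other functional status shares the lowest priority


def pick_allele(matches):
    """Return the (star_allele, function) match with the most serious consequence.

    `matches` is a list of (star_allele, function) tuples for one haplotype.
    With no match at all, fall back to a hypothetical reference allele "*1"
    and assume normal function.
    """
    if not matches:
        return ("*1", "Normal function")
    return min(matches, key=lambda m: SEVERITY.get(m[1], OTHER_RANK))


def candidate_diplotypes(hap1_matches, hap2_matches):
    """Enumerate every star allele diplotype consistent with the matches."""
    alleles1 = [allele for allele, _ in hap1_matches] or ["*1"]
    alleles2 = [allele for allele, _ in hap2_matches] or ["*1"]
    return [f"{a}/{b}" for a, b in product(alleles1, alleles2)]


# Hypothetical example: haplotype 1 matches two star alleles, haplotype 2 one.
hap1 = [("*2", "Normal function"), ("*4", "No function")]
hap2 = [("*10", "Decreased function")]

print(pick_allele(hap1))
print(candidate_diplotypes(hap1, hap2))
```

In this sketch the hierarchy is applied per haplotype, while the diplotype enumeration keeps every combination, so a downstream step could still choose between reporting all candidates or only the most serious one.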