Structural biology analyses begin by placing amino acid variants on the individual 3D protein structures selected by the pipeline. From these placements, the pipeline estimates the energetic impact of each amino acid substitution via a ΔΔG of folding calculation (Park et al. 2016, Frenz et al. 2020). Our PathProx algorithm (Sivley et al. 2018, Sivley et al. 2018) predicts a variant to be pathogenic when its position fits the spatial pattern of “known pathogenic” sites better than that of random or benign variant sites. The potential of a variant to disrupt protein-protein interaction surfaces (Tubiana et al. 2022) or perturb post-translational modification sites (Wang et al. 2017) is also predicted in the context of the protein 3D structure.
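The idea of comparing a variant's position against known pathogenic and benign sites can be illustrated with a toy calculation. The sketch below is not the PathProx algorithm: the function names, the exponential distance weighting, the 10 Å scale, and the coordinates are all assumptions chosen only to show the general shape of a spatial proximity contrast.

```python
import math

def proximity(query, sites, scale=10.0):
    """Mean exponentially decaying proximity of `query` to a set of sites.

    Coordinates are (x, y, z) tuples in angstroms. The exponential weighting
    and the `scale` value are illustrative assumptions, not PathProx's own.
    """
    if not sites:
        return 0.0
    return sum(math.exp(-math.dist(query, s) / scale) for s in sites) / len(sites)

def proximity_contrast(query, pathogenic, benign, scale=10.0):
    """Positive when `query` sits closer to pathogenic sites than to benign ones."""
    return proximity(query, pathogenic, scale) - proximity(query, benign, scale)

# Made-up residue coordinates, for illustration only.
pathogenic_sites = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
benign_sites = [(30.0, 0.0, 0.0), (0.0, 40.0, 0.0)]

near_pathogenic = proximity_contrast((1.0, 1.0, 0.0), pathogenic_sites, benign_sites)
near_benign = proximity_contrast((29.0, 1.0, 0.0), pathogenic_sites, benign_sites)

print(f"variant near pathogenic cluster: {near_pathogenic:+.3f}")
print(f"variant near benign site:        {near_benign:+.3f}")
```

In this toy setup, a variant inside the pathogenic cluster scores positive and one next to a benign site scores negative. A real analysis would take residue coordinates from the selected structure and would need a calibrated scoring scheme.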
Genomics is integrated throughout the pipeline. Many of our calculations interrogate the human genome (Cunningham et al. 2022) and genomic databases. Mechanically, to map genomic changes to 3D structures, our code navigates technical challenges in variant effect prediction and transcript curation (McLaren et al. 2016, Bateman et al. 2023). Several of our predictive calculations integrate sequence constraint, gleaned from both multi-species sequence alignments (Pupko et al. 2002, Ashkenazy et al. 2016) and human population sequences (Li et al. 2022). The ClinVar (Landrum et al. 2018), COSMIC (Tate et al. 2019), and gnomAD (Karczewski et al. 2020) databases are mined for the variants needed by PathProx’s mathematical spatial analysis, as well as for web-based visualizations.
Prediction of digenic disease pairs has been an ongoing area of research in both the Capra and Meiler labs. In addition to our DiGePred algorithm (17), we run a newer pair prediction tool (18) against each case’s gene list.
Please explore the Calculation Details further to learn about the various analyses and their interpretation.