2.
bioRxiv ; 2023 Oct 20.
Article in English | MEDLINE | ID: mdl-37905042

ABSTRACT

Background: A variant can be pathogenic or benign with relation to a human disease. Current classification categories from benign to pathogenic reflect a probabilistic summary of current understanding. A primary metric of clinical utility for multiplexed assays of variant effect (MAVE) is the number of variants that can be reclassified from uncertain significance (VUS). However, we hypothesized that this measure of utility underrepresents the information gained from MAVEs and that an information theory approach which includes data that does not reclassify variants will better reflect true information gain. We used this information theory approach to evaluate the information gain, in bits, for MAVEs of BRCA1, PTEN, and TP53. Here, one bit represents the amount of information required to completely classify a single variant starting from no information. Results: BRCA1 MAVEs produced a total of 831.2 bits of information, 6.58% of the total missense information in BRCA1 and a 22-fold increase over the information that only contributed to VUS reclassification. PTEN MAVEs produced 2059.6 bits of information which represents 32.8% of the total missense information in PTEN and an 85-fold increase over the information that contributed to VUS reclassification. TP53 MAVEs produced 277.8 bits of information which represents 6.22% of the total missense information in TP53 and a 3.5-fold increase over the information that contributed to VUS reclassification. Conclusions: An information content approach will more accurately portray information gained through MAVE mapping efforts than counting the number of variants reclassified. This information content approach may also help define the impact of modifying information definitions used to classify many variants, such as guideline rule changes.
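
One way to read the "bits" unit used above is as a reduction in binary Shannon entropy. The following is a minimal sketch under that assumption; the abstract does not give the authors' exact formula, and the function names and the 0.5 "no information" starting point are illustrative choices, not taken from the paper.

```python
import math

def binary_entropy(p: float) -> float:
    """Shannon entropy, in bits, of a belief that a variant is pathogenic with probability p."""
    if p in (0.0, 1.0):
        return 0.0  # a fully classified variant carries no remaining uncertainty
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def information_gain(prior: float, posterior: float) -> float:
    """Bits gained when evidence moves the belief from prior to posterior (assumed definition)."""
    return binary_entropy(prior) - binary_entropy(posterior)

# Starting from no information (p = 0.5), complete classification corresponds to one bit.
print(information_gain(0.5, 1.0))
# Evidence that shifts the belief without reaching a classification threshold
# still contributes a positive fraction of a bit.
print(information_gain(0.5, 0.8))
```

Under this reading, summing the gain over all variants scored by an assay gives a total in bits, which also counts variants whose classification category does not change.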

3.
HGG Adv ; 4(4): 100240, 2023 Oct 12.
Article in English | MEDLINE | ID: mdl-37718511

ABSTRACT

Carriers of BRCA1 germline pathogenic variants are at substantially higher risk of developing breast and ovarian cancer than the general population. Accurate identification of at-risk individuals is crucial for risk stratification and the implementation of targeted preventive and therapeutic interventions. Despite significant progress in variant classification efforts, a sizable portion of reported BRCA1 variants remain as variants of uncertain clinical significance (VUSs). Variants leading to premature protein termination and loss of essential functional domains are typically classified as pathogenic. However, the impact of frameshift variants that result in an extended incorrect terminus is not clear. Using validated functional assays, we conducted a systematic functional assessment of 17 previously reported BRCA1 extended incorrect terminus variants (EITs) and concluded that 16 constitute loss-of-function variants. This suggests that most EITs are likely to be pathogenic. However, one variant, c.5578dup, displayed a protein expression level, affinity to known binding partners, and activity in transcription and homologous recombination assays comparable to the wild-type BRCA1 protein. Twenty-three additional carriers of c.5578dup were identified at a US clinical diagnostic lab and assessed using a family history likelihood model providing, in combination with the functional data, a likely benign interpretation. These results, consistent with family history data in the current study and available data from ClinVar, indicate that most, but not all, BRCA1 variants leading to an extended incorrect terminus constitute loss-of-function variants and underscore the need for comprehensive assessment of individual variants.


Subject(s)
Genetic Predisposition to Disease , Ovarian Neoplasms , Female , Humans , Protein C , BRCA1 Protein/genetics , Ovarian Neoplasms/epidemiology , Germ-Line Mutation/genetics
4.
Hum Mutat ; 43(5): 547-556, 2022 05.
Article in English | MEDLINE | ID: mdl-35225377

ABSTRACT

Clinical genetic sequencing tests often identify variants of uncertain significance. One source of data that can help classify the pathogenicity of variants is familial cosegregation analysis. Identifying and genotyping relatives for cosegregation analysis can be time consuming and costly. We propose an algorithm that describes a single measure of expected variant information gain from genotyping a single additional relative in a family. Then we explore the performance of this algorithm by comparing actual recruitment strategies used in 35 families who had pursued cosegregation analysis with synthetic pedigrees of possible testing outcomes if the families had pursued an optimized testing strategy instead. For each actual and synthetic pedigree, we calculated the likelihood ratio of pathogenicity as each successive test was added to the pedigree. We analyzed the differences in cosegregation likelihood ratio over time resulting from actual versus optimized testing approaches. Employing the testing strategy indicated by the algorithm would have led to maximal information more rapidly in 30 of the 35 pedigrees (86%). Many clinical and research laboratories are involved in targeted cosegregation analysis. The algorithm we present can facilitate a data driven approach to optimal relative recruitment and genotyping for cosegregation analysis and more efficient variant classification.


Subject(s)
Genetic Testing , Genetic Variation , Algorithms , Genetic Testing/methods , Humans , Pedigree
5.
Eur J Hum Genet ; 27(12): 1800-1807, 2019 12.
Article in English | MEDLINE | ID: mdl-31296927

ABSTRACT

Recent studies have reported novel cancer risk associations with incidentally tested genes on cancer risk panels using clinically ascertained cohorts. Clinically ascertained pedigrees may have unknown ascertainment biases for both patients and relatives. We used a method to assess gene and variant risk and ascertainment bias based on comparing the number of observed disease instances in a pedigree given the sex and ages of individuals with those expected given established population incidence. We assessed the performance characteristics of the method by simulating families with varying genetic risk and proportion of individuals genotyped. We implemented this method using SEER cancer incidence data to assess clinical ascertainment bias in a set of 42 pedigrees with clinical testing ordered for either breast/ovarian cancer or colorectal/endometrial cancer at the University of Washington and negative sequencing results. In addition to expected biases consistent with the stated testing purpose, there were trends suggesting increased colorectal and endometrial cancer in pedigrees tested for breast cancer risk and trends suggesting increased breast cancer in families tested for colon cancer risk. There was no observed selection bias for prostate cancer in this set of families. This analysis illustrates that clinically ascertained data sets may have subtle biases. In the future, researchers seeking to explore risk associations with clinical data sets could assess potential ascertainment bias by comparing incidence of disease in families that test negative under given ordering criteria to expected population disease frequencies. Failure to assess for ascertainment bias increases the risk of false genetic associations.
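
The observed-versus-expected comparison described above can be sketched as a ratio of observed cases to the number expected from population incidence. This is an illustration only: the age bands, person-years, and rates below are hypothetical placeholders, not SEER values, and the function is not the authors' implementation.

```python
def expected_cases(person_years, incidence_rates):
    """Expected number of cases: person-years at risk in each age band times the band's incidence rate."""
    return sum(py * rate for py, rate in zip(person_years, incidence_rates))

def observed_to_expected_ratio(observed, person_years, incidence_rates):
    """Ratio of observed to expected cases; values well above 1 suggest ascertainment bias or excess risk."""
    return observed / expected_cases(person_years, incidence_rates)

# Hypothetical pedigree set: two age bands with made-up person-years and rates.
person_years = [1000, 2000]
incidence_rates = [0.001, 0.002]   # cases per person-year (illustrative)
print(observed_to_expected_ratio(10, person_years, incidence_rates))
```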


Subject(s)
Genetic Testing/standards , Neoplasms/epidemiology , Pedigree , Risk Assessment/standards , Adult , Breast Neoplasms/epidemiology , Breast Neoplasms/genetics , Colorectal Neoplasms/epidemiology , Colorectal Neoplasms/genetics , Female , Humans , Male , Middle Aged , Neoplasms/genetics , Neoplasms/pathology , Ovarian Neoplasms/epidemiology , Ovarian Neoplasms/genetics , Risk Factors , SEER Program
6.
JAMA Oncol ; 5(9): 1325-1331, 2019 Sep 01.
Article in English | MEDLINE | ID: mdl-31246251

ABSTRACT

IMPORTANCE: CDH1 pathogenic variants have been estimated to confer a 40% to 70% and 56% to 83% lifetime risk for gastric cancer in men and women, respectively. These are likely to be overestimates owing to ascertainment of families with multiple cases of gastric cancer. To our knowledge, there are no penetrance estimates for CDH1 without this ascertainment bias. OBJECTIVE: To estimate CDH1 penetrance in a patient cohort not exclusively ascertained based on strict hereditary diffuse gastric cancer (HDGC) criteria. DESIGN, SETTING, AND PARTICIPANTS: Retrospective review of 75 families found to have pathogenic variants in CDH1 through clinical ascertainment and multigene panel testing at a large commercial diagnostic laboratory from August 5, 2013, to June 30, 2018. CDH1 pathogenic variants were identified in 238 individuals from 75 families. Pedigrees from those families included cancer status for 1679 relatives. Penetrance estimates are based on 41 families for which completed pedigrees were available. MAIN OUTCOMES AND MEASURES: Gastric cancer standardized incidence ratio estimates relative to Surveillance, Epidemiology, and End Results (SEER) Program incidence for pathogenic CDH1 variants from families ascertained without regard to HDGC criteria. RESULTS: Among the 238 individuals with a CDH1 pathogenic variant, mean (SD) age was 49.3 (18.1) years and 63.4% were female. Ethnicity was reported for 67 of 75 (89%) families; of these 67 families, 51 (76%) reported European ancestry, whereas Asian, African, Latino, and 2 or more ancestries were reported for 4 families (6%) each. Standardized incidence ratios for gastric and breast cancer were significantly elevated above SEER incidence. Extrapolated cumulative incidence of gastric cancer at age 80 years was 42% (95% CI, 30%-56%) for men and 33% (95% CI, 21%-43%) for women with pathogenic variants in CDH1, whereas cumulative incidence of female breast cancer was estimated at 55% (95% CI, 39%-68%). 
International Gastric Cancer Linkage Consortium criteria were met in 25 of the 75 (33%) families; however, dispensing with the requirement of confirmation of HDGC histologic subtype, 43 (57%) would meet criteria. CONCLUSIONS AND RELEVANCE: The cumulative incidence of gastric cancer for individuals with pathogenic variants in CDH1 is significantly lower than previously described. Because prophylactic gastrectomy can have bearing upon both physical and psychological health, further discussion is warranted to assess whether this surgical recommendation is appropriate for all individuals with pathogenic variants in CDH1.

7.
Fam Cancer ; 18(1): 67-73, 2019 01.
Article in English | MEDLINE | ID: mdl-30019097

ABSTRACT

Past methods for estimating the population frequency of familial cancer syndromes have used cases and controls ignoring the familial nature of genetic disease. In this study we modified the capture-recapture method from ecology to estimate the number of families in central Ohio with Lynch syndrome (LS). We screened 1566 colorectal cancer cases and 545 endometrial cancer cases in central Ohio from 1999 to 2005 and identified 58 with LS. We screened an additional 3346 colorectal and 342 endometrial cancer cases from 2013 to 2016 and identified 149 with LS. We found 12 LS mutations shared between families observed in the first and second studies. We identified three individuals between studies who were closely related and eight who were more distantly related. We used identified family relationships and genetic test results to estimate family size and structure. Applying a modified capture-recapture method we estimate 1693 3-generation families in the area who have 288 unique LS causing mutations. Comprehensive colorectal and endometrial cancer screening will take about 20 years to identify 50% of families with LS. This is the first time that the capture-recapture method has been applied to estimate the burden of families with a specific heritable disease. Family structure reveals the potential extent of prevention and the time necessary to identify a proportion of families with LS.
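
For orientation, the classic two-sample capture-recapture estimate from ecology (the Lincoln-Petersen estimator) can be sketched as below. The abstract describes a modified method that accounts for family structure, which is not reproduced here; the counts in the example are hypothetical and are deliberately not the study's figures.

```python
def lincoln_petersen(first_sample: int, second_sample: int, recaptured: int) -> float:
    """Classic Lincoln-Petersen population estimate: N = n1 * n2 / m.

    first_sample  (n1): units identified in the first survey
    second_sample (n2): units identified in the second survey
    recaptured    (m):  units seen in both surveys
    """
    if recaptured == 0:
        raise ValueError("no overlap between samples; the estimate is undefined")
    return first_sample * second_sample / recaptured

# Hypothetical counts: 50 units in the first survey, 40 in the second, 10 in both.
print(lincoln_petersen(50, 40, 10))
```

The intuition is that a small overlap between two independent surveys implies a large unseen population.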


Subject(s)
Colorectal Neoplasms, Hereditary Nonpolyposis/epidemiology , Early Detection of Cancer/methods , Genetic Testing/statistics & numerical data , Medical History Taking/statistics & numerical data , Colorectal Neoplasms, Hereditary Nonpolyposis/genetics , Colorectal Neoplasms, Hereditary Nonpolyposis/prevention & control , Cost of Illness , Early Detection of Cancer/statistics & numerical data , Endometrial Neoplasms/genetics , Female , Humans , Mutation , Ohio/epidemiology
8.
Genet Med ; 21(6): 1435-1442, 2019 06.
Article in English | MEDLINE | ID: mdl-30374176

ABSTRACT

PURPOSE: Family studies are an important but underreported source of information for reclassification of variants of uncertain significance (VUS). We evaluated outcomes of a patient-driven framework that offered familial VUS reclassification analysis to any adult with any clinically ascertained VUS from any laboratory in the United States. METHODS: With guidance from FindMyVariant.org, participants recruited their own relatives for study participation. We genotyped relatives, calculated quantitative cosegregation likelihood ratios, and evaluated variant classifications using Tavtigian's unified framework for Bayesian analysis with American College of Medical Genetics and Genomics/Association for Molecular Pathology (ACMG/AMP) criteria. We report participation and VUS reclassification rates from the 50 families enrolled for at least one year and reclassification results for 112 variants from the larger 92-family cohort. RESULTS: For the 50-family cohort, 6.7 relatives per family were invited to participate and 67% of relatives returned samples for genotyping. Sixty-one percent of VUS were reclassified, 84% of which were classified as benign or likely benign. Genotyping relatives identified a de novo variant, phase variants, and relatives with phenotypes highly specific for or incompatible with specific classifications. CONCLUSIONS: Motivated families can contribute to successful VUS reclassification at substantially higher rates than those previously published. Clinical laboratories could consider offering family studies to all patients with VUS.


Subject(s)
Genetic Predisposition to Disease/classification , Genetic Variation/genetics , Sequence Analysis, DNA/methods , Adult , Aged , Aged, 80 and over , Bayes Theorem , Family , Female , Genetic Testing/methods , Genomics/methods , Genotype , Genotyping Techniques/methods , High-Throughput Nucleotide Sequencing/methods , Humans , Male , Middle Aged , Mutation/genetics , Software
9.
Am J Hum Genet ; 103(1): 19-29, 2018 07 05.
Article in English | MEDLINE | ID: mdl-29887214

ABSTRACT

Present guidelines for classification of constitutional variants do not incorporate inferences from mutations seen in tumors, even when these are associated with a specific molecular phenotype. When somatic mutations and constitutional mutations lead to the same molecular phenotype, as for the mismatch repair genes, information from somatic mutations may enable interpretation of previously unclassified variants. To test this idea, we first estimated likelihoods that somatic variants in MLH1, MSH2, MSH6, and PMS2 drive microsatellite instability and characteristic IHC staining patterns by calculating likelihoods of high versus low normalized variant read fractions of 153 mutations known to be pathogenic versus those of 760 intronic passenger mutations from 174 paired tumor-normal samples. Mutations that explained the tumor mismatch repair phenotype had likelihood ratio for high variant read fraction of 1.56 (95% CI 1.42-1.71) at sites with no loss of heterozygosity and of 26.5 (95% CI 13.2-53.0) at sites with loss of heterozygosity. Next, we applied these ratios to 165 missense, synonymous, and splice variants observed in tumors, combining in a Bayesian analysis the likelihood ratio corresponding with the adjusted variant read fraction with pretest probabilities derived from published analyses and public databases. We suggest classifications for 86 of 165 variants: 7 benign, 31 likely benign, 22 likely pathogenic, and 26 pathogenic. These results illustrate that for mismatch repair genes, characterization of tumor mutations permits tumor mutation data to inform constitutional variant classification. We suggest modifications to incorporate molecular phenotype in future variant classification guidelines.
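
The Bayesian step described above, combining a likelihood ratio with a pretest probability, follows the odds form of Bayes' rule. A minimal sketch is given below; the likelihood ratios 1.56 and 26.5 come from the abstract, while the pretest probability of 0.1 is a hypothetical value chosen only for illustration.

```python
def posterior_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Posterior probability of pathogenicity: posterior odds = prior odds * likelihood ratio."""
    prior_odds = pretest_probability / (1 - pretest_probability)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

# Hypothetical pretest probability of 0.1, combined with the two reported ratios.
print(posterior_probability(0.1, 1.56))   # site without loss of heterozygosity
print(posterior_probability(0.1, 26.5))   # site with loss of heterozygosity
```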


Subject(s)
DNA Mismatch Repair/genetics , Mutation/genetics , Neoplasms/genetics , Genetic Predisposition to Disease/genetics , Heterozygote , Humans , Microsatellite Instability , Phenotype
10.
Fam Cancer ; 17(2): 295-302, 2018 04.
Article in English | MEDLINE | ID: mdl-28695303

ABSTRACT

Quantitative cosegregation analysis can help evaluate the pathogenicity of genetic variants. However, genetics professionals without statistical training often use simple methods, reporting only qualitative findings. We evaluate the potential utility of quantitative cosegregation in the clinical setting by comparing three methods. One thousand pedigrees each were simulated for benign and pathogenic variants in BRCA1 and MLH1 using United States historical demographic data to produce pedigrees similar to those seen in the clinic. These pedigrees were analyzed using two robust methods, full likelihood Bayes factors (FLB) and cosegregation likelihood ratios (CSLR), and a simpler method, counting meioses. Both FLB and CSLR outperform counting meioses when dealing with pathogenic variants, though counting meioses is not far behind. For benign variants, FLB and CSLR greatly outperform as counting meioses is unable to generate evidence for benign variants. Comparing FLB and CSLR, we find that the two methods perform similarly, indicating that quantitative results from either of these methods could be combined in multifactorial calculations. Combining quantitative information will be important as isolated use of cosegregation in single families will yield classification for less than 1% of variants. To encourage wider use of robust cosegregation analysis, we present a website ( http://www.analyze.myvariant.org ) which implements the CSLR, FLB, and Counting Meioses methods for ATM, BRCA1, BRCA2, CHEK2, MEN1, MLH1, MSH2, MSH6, and PMS2. We also present an R package, CoSeg, which performs the CSLR analysis on any gene with user supplied parameters. Future variant classification guidelines should allow nuanced inclusion of cosegregation evidence against pathogenicity.


Subject(s)
Genetic Predisposition to Disease , Genetic Testing/methods , Models, Genetic , Neoplastic Syndromes, Hereditary/diagnosis , Pedigree , BRCA1 Protein/genetics , Bayes Theorem , Computer Simulation , Female , Genetic Variation , Humans , Likelihood Functions , Male , Meiosis/genetics , MutL Protein Homolog 1/genetics , Neoplastic Syndromes, Hereditary/genetics , Software , United States
11.
Fam Cancer ; 16(4): 611-620, 2017 Oct.
Article in English | MEDLINE | ID: mdl-28534081

ABSTRACT

Rare and private variants of uncertain significance (VUS) are routinely identified in clinical panel, exome, and genome sequencing. We investigated the power of single family co-segregation analysis to aid classification of VUS. We simulated thousands of pedigrees using demographics in China and the United States, segregating benign and pathogenic variants. Genotypes and phenotypes were simulated using penetrance models for Lynch syndrome and breast/ovarian cancer. We calculated LOD scores adjusted for proband ascertainment (LODadj), to determine power to yield quantitative evidence for, or against, pathogenicity of the VUS. Power to classify VUS was higher for Chinese than United States pedigrees. The number of affected individuals explained the most variation in LODadj (21-38%). The distance to the furthest affected relative (FAR) from the proband explained 1-7% of the variation for the benign VUS and Lynch associated cancers. Minimum age of onset (MAO) explained 5-13% of the variation in families with pathogenic breast/ovarian cancer variants. Random removal of 50% of the phenotype/genotype data reduced power and the variation in LODadj was best explained by FAR followed by the number of affected individuals and MAO when the founder was only two generations from the proband. Power to classify benign variants was ~2x power to classify pathogenic variants. Affecteds-only analysis resulted in virtually no power to correctly classify benign variants and reduced power to classify pathogenic variants. These results can be used to guide recruitment efforts to classify rare and private VUS.


Subject(s)
Breast Neoplasms/genetics , Colorectal Neoplasms, Hereditary Nonpolyposis/genetics , Models, Genetic , Ovarian Neoplasms/genetics , Age of Onset , China , Female , Genetic Predisposition to Disease , Genetic Variation , Humans , Likelihood Functions , Pedigree , Penetrance , United States
12.
PLoS Comput Biol ; 11(5): e1004228, 2015 May.
Article in English | MEDLINE | ID: mdl-25965340

ABSTRACT

The primary goal in cluster analysis is to discover natural groupings of objects. The field of cluster analysis is crowded with diverse methods that make special assumptions about data and address different scientific aims. Despite its shortcomings in accuracy, hierarchical clustering is the dominant clustering method in bioinformatics. Biologists find the trees constructed by hierarchical clustering visually appealing and in tune with their evolutionary perspective. Hierarchical clustering operates on multiple scales simultaneously. This is essential, for instance, in transcriptome data, where one may be interested in making qualitative inferences about how lower-order relationships like gene modules lead to higher-order relationships like pathways or biological processes. The recently developed method of convex clustering preserves the visual appeal of hierarchical clustering while ameliorating its propensity to make false inferences in the presence of outliers and noise. The solution paths generated by convex clustering reveal relationships between clusters that are hidden by static methods such as k-means clustering. The current paper derives and tests a novel proximal distance algorithm for minimizing the objective function of convex clustering. The algorithm separates parameters, accommodates missing data, and supports prior information on relationships. Our program CONVEXCLUSTER incorporating the algorithm is implemented on ATI and nVidia graphics processing units (GPUs) for maximal speed. Several biological examples illustrate the strengths of convex clustering and the ability of the proximal distance algorithm to handle high-dimensional problems. CONVEXCLUSTER can be freely downloaded from the UCLA Human Genetics web site at http://www.genetics.ucla.edu/software/.
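
The objective function mentioned above is usually written as a fit term plus a weighted fusion penalty on pairs of cluster centers. The sketch below evaluates that standard formulation; it is assumed rather than taken from the paper, it does not implement the proximal distance algorithm, and the parameter names are illustrative.

```python
import math

def convex_clustering_objective(points, centers, weights, gamma):
    """Evaluate 0.5 * sum_i ||x_i - u_i||^2 + gamma * sum_{i<j} w_ij * ||u_i - u_j||.

    points  : list of data points x_i (tuples of floats)
    centers : list of per-point cluster centers u_i, same shape as points
    weights : square matrix of nonnegative pair weights w_ij
    gamma   : penalty strength; larger values fuse more centers together
    """
    n = len(points)
    fit = 0.5 * sum(math.dist(x, u) ** 2 for x, u in zip(points, centers))
    penalty = sum(
        weights[i][j] * math.dist(centers[i], centers[j])
        for i in range(n)
        for j in range(i + 1, n)
    )
    return fit + gamma * penalty

points = [(0.0, 0.0), (2.0, 0.0)]
weights = [[0.0, 1.0], [1.0, 0.0]]
# Each point keeps its own center: perfect fit, but the penalty is paid in full.
print(convex_clustering_objective(points, points, weights, gamma=1.0))
# Both centers fused at the midpoint: no penalty, but a nonzero fit cost.
print(convex_clustering_objective(points, [(1.0, 0.0), (1.0, 0.0)], weights, gamma=1.0))
```

Sweeping gamma from zero upward traces the solution path in which centers progressively merge, which is what gives the method its tree-like output.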


Subject(s)
Cluster Analysis , Computational Biology/methods , Pattern Recognition, Automated/methods , Algorithms , Databases, Genetic , Gene Expression Profiling/methods , Humans , Software
13.
Bioinformatics ; 30(20): 2915-22, 2014 Oct 15.
Article in English | MEDLINE | ID: mdl-25012181

ABSTRACT

MOTIVATION: Unique modeling and computational challenges arise in locating the geographic origin of individuals based on their genetic backgrounds. Single-nucleotide polymorphisms (SNPs) vary widely in informativeness, allele frequencies change non-linearly with geography and reliable localization requires evidence to be integrated across a multitude of SNPs. These problems become even more acute for individuals of mixed ancestry. It is hardly surprising that matching genetic models to computational constraints has limited the development of methods for estimating geographic origins. We attack these related problems by borrowing ideas from image processing and optimization theory. Our proposed model divides the region of interest into pixels and operates SNP by SNP. We estimate allele frequencies across the landscape by maximizing a product of binomial likelihoods penalized by nearest neighbor interactions. Penalization smooths allele frequency estimates and promotes estimation at pixels with no data. Maximization is accomplished by a minorize-maximize (MM) algorithm. Once allele frequency surfaces are available, one can apply Bayes' rule to compute the posterior probability that each pixel is the pixel of origin of a given person. Placement of admixed individuals on the landscape is more complicated and requires estimation of the fractional contribution of each pixel to a person's genome. This estimation problem also succumbs to a penalized MM algorithm. RESULTS: We applied the model to the Population Reference Sample (POPRES) data. The model gives better localization for both unmixed and admixed individuals than existing methods despite using just a small fraction of the available SNPs. Computing times are comparable with the best competing software. AVAILABILITY AND IMPLEMENTATION: Software will be freely available as the OriGen package in R. 
CONTACT: ranolaj@uw.edu or klange@ucla.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
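
The Bayes' rule step in the abstract above, assigning a posterior probability to each pixel once allele frequency surfaces are available, can be sketched as follows. This assumes a uniform prior over pixels, independent SNPs, and binomial genotype likelihoods with two allele copies per SNP; it is an illustration and not the OriGen implementation, and all names are hypothetical.

```python
import math

def pixel_posterior(genotype, freqs_by_pixel):
    """Posterior probability that each pixel is the pixel of origin.

    genotype       : allele counts (0, 1, or 2) for one person, one entry per SNP
    freqs_by_pixel : for each pixel, the estimated allele frequency at each SNP
                     (values strictly between 0 and 1)
    """
    log_likelihoods = []
    for freqs in freqs_by_pixel:
        total = 0.0
        for count, p in zip(genotype, freqs):
            # Binomial(2, p) log-likelihood of observing `count` copies of the allele.
            total += (math.log(math.comb(2, count))
                      + count * math.log(p)
                      + (2 - count) * math.log(1 - p))
        log_likelihoods.append(total)
    # Normalize in log space to avoid underflow when many SNPs are used.
    peak = max(log_likelihoods)
    unnormalized = [math.exp(value - peak) for value in log_likelihoods]
    norm = sum(unnormalized)
    return [value / norm for value in unnormalized]

# Two SNPs, two pixels with opposite allele frequency profiles.
print(pixel_posterior([2, 0], [[0.9, 0.1], [0.1, 0.9]]))
```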


Subject(s)
Gene Frequency , Phylogeography/methods , Algorithms , Bayes Theorem , Genome, Human/genetics , Humans , Polymorphism, Single Nucleotide/genetics , Software , Time Factors
14.
BMC Syst Biol ; 7: 21, 2013 Mar 14.
Article in English | MEDLINE | ID: mdl-23497424

ABSTRACT

BACKGROUND: The models in this article generalize current models for both correlation networks and multigraph networks. Correlation networks are widely applied in genomics research. In contrast to general networks, it is straightforward to test the statistical significance of an edge in a correlation network. It is also easy to decompose the underlying correlation matrix and generate informative network statistics such as the module eigenvector. However, correlation networks only capture the connections between numeric variables. An open question is whether one can find suitable decompositions of the similarity measures employed in constructing general networks. Multigraph networks are attractive because they support likelihood based inference. Unfortunately, it is unclear how to adjust current statistical methods to detect the clusters inherent in many data sets. RESULTS: Here we present an intuitive and parsimonious parametrization of a general similarity measure such as a network adjacency matrix. The cluster and propensity based approximation (CPBA) of a network not only generalizes correlation network methods but also multigraph methods. In particular, it gives rise to a novel and more realistic multigraph model that accounts for clustering and provides likelihood based tests for assessing the significance of an edge after controlling for clustering. We present a novel Majorization-Minimization (MM) algorithm for estimating the parameters of the CPBA. To illustrate the practical utility of the CPBA of a network, we apply it to gene expression data and to a bi-partite network model for diseases and disease genes from the Online Mendelian Inheritance in Man (OMIM). 
CONCLUSIONS: The CPBA of a network is theoretically appealing since a) it generalizes correlation and multigraph network methods, b) it improves likelihood based significance tests for edge counts, c) it directly models higher-order relationships between clusters, and d) it suggests novel clustering algorithms. The CPBA of a network is implemented in Fortran 95 and bundled in the freely available R package PropClust.


Subject(s)
Computational Biology/methods , Gene Regulatory Networks , Algorithms , Animals , Brain/metabolism , Brain Neoplasms/genetics , Cluster Analysis , Databases, Genetic , Humans , Models, Biological , Pan troglodytes , Software