Results 1 - 20 of 469
1.
Pharm Stat ; 2024 Jul 11.
Article in English | MEDLINE | ID: mdl-38992926

ABSTRACT

Clinical trials with continuous primary endpoints typically measure outcomes at baseline, at a fixed timepoint (denoted Tmin), and at intermediate timepoints. The analysis is commonly performed using the mixed model repeated measures method. It is sometimes expected that the effect size will be larger with follow-up longer than Tmin. But extending the follow-up for all patients delays trial completion. We propose an alternative trial design and analysis method that potentially increases statistical power without extending the trial duration or increasing the sample size. We propose following the last enrolled patient until Tmin, with earlier enrollees having variable follow-up durations up to a maximum of Tmax. The sample size at Tmax will be smaller than at Tmin, and due to staggered enrollment, data missing at Tmax will be missing completely at random. For analysis, we propose an alpha-adjusted procedure based on the smaller of the p-values at Tmin and Tmax, termed minP. This approach can provide the highest power when the powers at Tmin and Tmax are similar. If the power at Tmin and Tmax differ significantly, the power of minP is modestly reduced compared with the larger of the two powers. Rare disease trials, due to the limited size of the patient population, may benefit the most with this design.
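
The abstract does not say how the alpha adjustment for minP is computed. As a minimal sketch only, the Python below assumes the two test statistics (at Tmin and Tmax) are standard normal under the null with a known correlation `rho`, and estimates by Monte Carlo the per-test cutoff that keeps the overall type I error near `alpha`. The function name `minp_threshold` and all parameters are hypothetical and are not taken from the paper.

```python
import random
from statistics import NormalDist


def minp_threshold(rho, alpha=0.05, n_sim=20000, seed=1):
    """Monte Carlo estimate of the cutoff c with P(min(p1, p2) <= c) ~ alpha
    under the null, assuming bivariate normal statistics with correlation rho."""
    rng = random.Random(seed)
    nd = NormalDist()
    min_ps = []
    for _ in range(n_sim):
        z1 = rng.gauss(0, 1)
        # z2 is constructed to have correlation rho with z1
        z2 = rho * z1 + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
        p1 = 2 * (1 - nd.cdf(abs(z1)))  # two-sided p-value at Tmin
        p2 = 2 * (1 - nd.cdf(abs(z2)))  # two-sided p-value at Tmax
        min_ps.append(min(p1, p2))
    min_ps.sort()
    # empirical alpha-quantile of the null distribution of min(p1, p2)
    return min_ps[int(alpha * n_sim) - 1]


if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.9):
        print(rho, round(minp_threshold(rho), 4))
```

Under these assumptions the cutoff lies between the Bonferroni value (alpha / 2) and alpha itself, and it moves toward alpha as the two statistics become more strongly correlated.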

2.
Cureus ; 16(5): e61457, 2024 May.
Article in English | MEDLINE | ID: mdl-38953092

ABSTRACT

This study investigates the effectiveness of multiple COVID-19 vaccinations on daily confirmed cases in Seoul City. Utilizing comprehensive data on vaccinated individuals and confirmed cases sourced from the official website of the Korean Ministry of the Interior and Safety, we conducted detailed statistical analyses to assess the impact of each vaccination dose. The study covers data from April 21, 2021, to September 29, 2022. Statistical multiple linear regression was employed to analyze the relationship between daily confirmed cases (positive outcomes from PCR tests) and multiple vaccine doses, using p-values as the criteria for determining the effectiveness of each dose. The analysis included data from four vaccination doses. The analysis reveals that the first, second, and third doses of the COVID-19 vaccines have a statistically significant positive effect associated with the daily confirmed cases. However, the study finds that the fourth dose does not show a statistically significant impact on the reduction of daily confirmed cases. This suggests that while the initial three doses are crucial for establishing and maintaining high levels of immunity, the incremental benefit of subsequent doses may diminish.

3.
J Pers Med ; 14(6)2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38929876

ABSTRACT

BACKGROUND/OBJECTIVES: Temporomandibular disorder (TMD) is the term used to describe a pathology (dysfunction and pain) in the masticatory muscles and temporomandibular joint (TMJ). There is an apparent upward trend in the publication of dental research and a need to continually improve the quality of research. Therefore, this study was conducted to analyse the use of sample size and effect size calculations in a TMD randomised controlled trial. METHODS: The period was restricted to the full 5 years, i.e., papers published in 2019, 2020, 2021, 2022, and 2023. The filter article type-"Randomized Controlled Trial" was used. The studies were graded on a two-level scale: 0-1. In the case of 1, sample size (SS) and effect size (ES) were calculated. RESULTS: In the entire study sample, SS was used in 58% of studies, while ES was used in 15% of studies. CONCLUSIONS: Quality should improve as research increases. One factor that influences quality is the level of statistics. SS and ES calculations provide a basis for understanding the results obtained by the authors. Access to formulas, online calculators and software facilitates these analyses. High-quality trials provide a solid foundation for medical progress, fostering the development of personalized therapies that provide more precise and effective treatment and increase patients' chances of recovery. Improving the quality of TMD research, and medical research in general, helps to increase public confidence in medical advances and raises the standard of patient care.

4.
Int J Mol Sci ; 25(12)2024 Jun 15.
Article in English | MEDLINE | ID: mdl-38928295

ABSTRACT

The genomic analyses of pediatric acute lymphoblastic leukemia (ALL) subtypes, particularly T-cell and B-cell lineages, have been pivotal in identifying potential therapeutic targets. Typical genomic analyses have directed attention toward the most commonly mutated genes. However, assessing the contribution of mutations to cancer phenotypes is crucial. Therefore, we estimated the cancer effects (scaled selection coefficients) for somatic substitutions in T-cell and B-cell cohorts, revealing key insights into mutation contributions. Cancer effects for well-known, frequently mutated genes like NRAS and KRAS in B-ALL were high, which underscores their importance as therapeutic targets. However, less frequently mutated genes IL7R, XBP1, and TOX also demonstrated high cancer effects, suggesting pivotal roles in the development of leukemia when present. In T-ALL, KRAS and NRAS are less frequently mutated than in B-ALL. However, their cancer effects when present are high in both subtypes. Mutations in PIK3R1 and RPL10 were not at high prevalence, yet exhibited some of the highest cancer effects in individual T-cell ALL patients. Even CDKN2A, with a low prevalence and relatively modest cancer effect, is potentially highly relevant for the epistatic effects that its mutated form exerts on other mutations. Prioritizing investigation into these moderately frequent but potentially high-impact targets not only presents novel personalized therapeutic opportunities but also enhances the understanding of disease mechanisms and advances precision therapeutics for pediatric ALL.


Subjects
Mutation , Humans , Child , Precursor B-Cell Lymphoblastic Leukemia-Lymphoma/genetics , Precursor B-Cell Lymphoblastic Leukemia-Lymphoma/epidemiology , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma/genetics , T-Lymphocytes/immunology , T-Lymphocytes/metabolism , B-Lymphocytes/immunology , B-Lymphocytes/metabolism
5.
J Biopharm Stat ; : 1-20, 2024 Jun 10.
Article in English | MEDLINE | ID: mdl-38853696

ABSTRACT

The main idea of this paper is to approximate the exact p-value of a class of non-parametric, two-sample location-scale tests. In this paper, the most famous non-parametric two-sample location-scale tests are formulated in a class of linear rank tests. The permutation distribution of this class is derived from a random allocation design. This allows us to approximate the exact p-value of the non-parametric two-sample location-scale tests of the considered class using the saddlepoint approximation method. The proposed method shows high accuracy in approximating the exact p-value compared to the normal approximation method. Moreover, the proposed method only requires a few calculations and time, as in the case of the simulated method. The procedures of the proposed method are clarified through four sets of real data that represent applications for a number of different fields. In addition, a simulation study compares the proposed method with the traditional methods to approximate the exact p-value of the specified class of the non-parametric two-sample location-scale tests.
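
For orientation, the "simulated method" that the abstract uses as a baseline can be sketched as a Monte Carlo permutation test. The Python below is an illustration only: it uses the Wilcoxon rank-sum statistic, which is a location-only member of the linear rank class rather than one of the location-scale tests studied in the paper, and it assumes no ties. It does not implement the saddlepoint approximation. The names `rank_sum` and `permutation_pvalue` and the data are hypothetical.

```python
import random


def rank_sum(x, y):
    """Sum of the ranks of sample x in the pooled sample (no ties assumed)."""
    pooled = sorted((v, g) for g, grp in enumerate((x, y)) for v in grp)
    return sum(r for r, (_, g) in enumerate(pooled, start=1) if g == 0)


def permutation_pvalue(x, y, n_perm=10000, seed=0):
    """Two-sided Monte Carlo permutation p-value for the rank-sum statistic."""
    rng = random.Random(seed)
    n, total = len(x), len(x) + len(y)
    expected = n * (total + 1) / 2  # null mean of the rank sum
    obs_dev = abs(rank_sum(x, y) - expected)
    ranks = list(range(1, total + 1))
    hits = 0
    for _ in range(n_perm):
        # random allocation: draw which ranks fall in the first group
        w = sum(rng.sample(ranks, n))
        if abs(w - expected) >= obs_dev:
            hits += 1
    # add-one correction keeps the estimate strictly positive
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    x = [1.1, 2.3, 0.7, 1.9, 1.4]  # made-up data
    y = [3.2, 4.1, 2.9, 3.8, 5.0]
    print(rank_sum(x, y), permutation_pvalue(x, y))
```

The cost of this approach grows with the number of permutations drawn, which is the kind of computational burden an analytic approximation is meant to avoid.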

6.
Korean J Anesthesiol ; 77(3): 316-325, 2024 06.
Article in English | MEDLINE | ID: mdl-38835136

ABSTRACT

The statistical significance of a clinical trial analysis result is determined by a mathematical calculation and probability based on null hypothesis significance testing. However, statistical significance does not always align with meaningful clinical effects; thus, assigning clinical relevance to statistical significance is unreasonable. A statistical result incorporating a clinically meaningful difference is a better approach to present statistical significance. Thus, the minimal clinically important difference (MCID), which requires integrating minimum clinically relevant changes from the early stages of research design, has been introduced. As a follow-up to the previous statistical round article on P values, confidence intervals, and effect sizes, in this article, we present hands-on examples of MCID and various effect sizes and discuss the terms statistical significance and clinical relevance, including cautions regarding their use.
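
The distinction between an effect size and a clinically meaningful difference can be shown with a short Python sketch. It computes Cohen's d with a pooled standard deviation and, separately, checks whether the raw mean difference reaches an MCID. The data and the MCID values are invented for illustration, and the helper names `cohens_d` and `exceeds_mcid` are hypothetical rather than taken from the article.

```python
from statistics import mean, stdev


def cohens_d(a, b):
    """Standardized mean difference (a minus b) using the pooled SD."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / pooled_var ** 0.5


def exceeds_mcid(a, b, mcid):
    """True if the absolute raw mean difference is at least the MCID."""
    return abs(mean(a) - mean(b)) >= mcid


if __name__ == "__main__":
    treatment = [3.1, 2.8, 3.5, 2.9, 3.0]  # made-up pain scores
    control = [3.6, 3.4, 3.9, 3.3, 3.7]
    print("Cohen's d:", round(cohens_d(treatment, control), 2))
    print("Reaches an MCID of 1.0 points:", exceeds_mcid(treatment, control, 1.0))
```

A standardized effect can look large when the within-group spread is small, while the raw difference still falls short of the chosen MCID, so the two checks answer different questions.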


Subjects
Minimal Clinically Important Difference , Humans , Probability , Research Design , Clinical Trials as Topic/methods , Data Interpretation, Statistical , Confidence Intervals
7.
J Evol Biol ; 37(8): 986-993, 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-38843076

ABSTRACT

Statistical analysis and data visualization are integral parts of science communication. One of the major issues in current data analysis practice is an overdependency on, and misuse of, p-values. Researchers have been advocating for the estimation and reporting of effect sizes for quantitative research to enhance the clarity and effectiveness of data analysis. Reporting effect sizes in scientific publications has until now been mainly limited to numeric tables, even though effect size plotting is a more effective means of communicating results. We have developed the Durga R package for estimating and plotting effect sizes for paired and unpaired group comparisons. Durga allows users to estimate unstandardized and standardized effect sizes and bootstrapped confidence intervals of the effect sizes. The central functionality of Durga is to combine effect size visualizations with traditional plotting methods. Durga is a powerful statistical and data visualization package that is easy to use, providing the flexibility to estimate effect sizes of paired and unpaired data using different statistical methods. Durga provides a plethora of options for plotting effect size, which allows users to plot data in the most informative and aesthetic way. Here, we introduce the package and its various functions. We further describe a workflow for estimating and plotting effect sizes using example data sets.
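
The idea of an unstandardized effect size with a bootstrapped confidence interval can be sketched in a few lines of Python. This is not the Durga API (Durga is an R package and its functions are not reproduced here). It is a plain percentile bootstrap for an unpaired mean difference, with a hypothetical function name and invented data.

```python
import random
from statistics import mean


def bootstrap_mean_diff_ci(a, b, n_boot=5000, level=0.95, seed=0):
    """Mean difference (b minus a) with a percentile bootstrap interval."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # resample each group independently, with replacement
        ra = rng.choices(a, k=len(a))
        rb = rng.choices(b, k=len(b))
        diffs.append(mean(rb) - mean(ra))
    diffs.sort()
    lo = diffs[int((1 - level) / 2 * n_boot)]
    hi = diffs[int((1 + level) / 2 * n_boot) - 1]
    return mean(b) - mean(a), lo, hi


if __name__ == "__main__":
    group_a = [5.1, 4.9, 5.6, 5.0, 5.3, 4.8]  # made-up measurements
    group_b = [6.0, 6.4, 5.9, 6.2, 6.8, 6.1]
    est, lo, hi = bootstrap_mean_diff_ci(group_a, group_b)
    print(f"difference = {est:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```

Reporting the estimate together with its interval, rather than only a p-value, is the practice the abstract argues for.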


Subjects
Software , Data Interpretation, Statistical , Data Visualization
8.
Rev. neurol. (Ed. impr.) ; 78(7): 209-211, Jan-Jun, 2024.
Article in Spanish | IBECS | ID: ibc-232183

ABSTRACT

Leading scientific journals in fields such as medicine, biology and sociology repeatedly publish articles and editorials claiming that a large percentage of doctors do not understand the basics of statistical analysis, which increases the risk of errors in interpreting data, makes them more vulnerable to misinformation and reduces the effectiveness of research. This problem extends throughout their careers and is largely due to the poor training they receive in statistics, a problem that is common in developed countries. As stated by H. Halle and S. Krauss, ‘90% of German university lecturers who regularly use the p-value in tests do not understand what that value actually measures’. It is important to note that the basic reasoning of statistical analysis is similar to what we do in our daily lives and that understanding the basic concepts of statistical analysis does not require any knowledge of mathematics. Contrary to what many researchers believe, the p-value of the test is not a ‘mathematical index’ that allows us to clearly conclude whether, for example, a drug is more effective than a placebo. The p-value of the test is simply a percentage.


Subjects
Humans , Male , Female , Biomedical Research , Periodical , Scientific and Technical Publications , Hypothesis-Testing , Predictive Value of Tests
9.
Genome Med ; 16(1): 56, 2024 04 16.
Article in English | MEDLINE | ID: mdl-38627848

ABSTRACT

Despite the abundance of genotype-phenotype association studies, the resulting association outcomes often lack robustness and interpretations. To address these challenges, we introduce PheSeq, a Bayesian deep learning model that enhances and interprets association studies through the integration and perception of phenotype descriptions. By implementing the PheSeq model in three case studies on Alzheimer's disease, breast cancer, and lung cancer, we identify 1024 priority genes for Alzheimer's disease and 818 and 566 genes for breast cancer and lung cancer, respectively. Benefiting from data fusion, these findings represent moderate positive rates, high recall rates, and interpretation in gene-disease association studies.


Subjects
Alzheimer Disease , Breast Neoplasms , Deep Learning , Lung Neoplasms , Humans , Female , Alzheimer Disease/genetics , Bayes Theorem , Genetic Association Studies , Breast Neoplasms/genetics
10.
Cureus ; 16(3): e56418, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38638715

ABSTRACT

Background Organ and body development greatly varies in pediatric patients from year to year. Therefore, the incidence of each adverse event following phenobarbital (PB) administration would vary with age. However, in clinical trials, increasing the sample size of pediatric patients in each age group has been challenging. Therefore, previous studies were conducted by dividing pediatric patients into three or four age groups based on the development stage. Although these results were useful in clinical settings, information on adverse events that occurred at one-year age increments in pediatric patients could further enhance treatment and care. Objectives This study investigated in one-year age increments the occurrence tendency of each adverse event following PB administration in pediatric patients. Methods This study used data obtained from the U.S. Food and Drug Administration Adverse Event Reporting System (FAERS). Two inclusion criteria were set: (1) treatment with PB between January 2004 and June 2023 and (2) age 0-15 years. Using the cutoff value obtained using the Wilcoxon-Mann-Whitney test by the minimum p-value approach, this study explored changes in the occurrence tendency of each adverse event in one-year age increments. At the minimum p-value of <0.05, the age corresponding to this p-value was determined as the cutoff value. Conversely, at the minimum p-value of ≥0.05, the cutoff value was considered nonexistent. Results This study investigated all types of adverse events and explored the cutoff value for each adverse event. We identified 34, 16, 15, nine, five, five, eight, three, and eight types of adverse events for the cutoff values of ≤3/>3, ≤4/>4, ≤5/>5, ≤6/>6, ≤7/>7, ≤8/>8, ≤9/>9, ≤10/>10, and ≤11/>11 years, respectively. Conclusions This study demonstrated that adverse events requiring attention in pediatric patients varied with age. The findings help in the improvement of treatment and care in the pediatric clinical settings.

11.
Proc Natl Acad Sci U S A ; 121(15): e2304671121, 2024 Apr 09.
Article in English | MEDLINE | ID: mdl-38564640

ABSTRACT

Contingency tables, data represented as counts matrices, are ubiquitous across quantitative research and data-science applications. Existing statistical tests are insufficient however, as none are simultaneously computationally efficient and statistically valid for a finite number of observations. In this work, motivated by a recent application in reference-free genomic inference [K. Chaung et al., Cell 186, 5440-5456 (2023)], we develop Optimized Adaptive Statistic for Inferring Structure (OASIS), a family of statistical tests for contingency tables. OASIS constructs a test statistic which is linear in the normalized data matrix, providing closed-form P-value bounds through classical concentration inequalities. In the process, OASIS provides a decomposition of the table, lending interpretability to its rejection of the null. We derive the asymptotic distribution of the OASIS test statistic, showing that these finite-sample bounds correctly characterize the test statistic's P-value up to a variance term. Experiments on genomic sequencing data highlight the power and interpretability of OASIS. Using OASIS, we develop a method that can detect SARS-CoV-2 and Mycobacterium tuberculosis strains de novo, which existing approaches cannot achieve. We demonstrate in simulations that OASIS is robust to overdispersion, a common feature in genomic data like single-cell RNA sequencing, where under accepted noise models OASIS provides good control of the false discovery rate, while Pearson's χ² test consistently rejects the null. Additionally, we show in simulations that OASIS is more powerful than Pearson's χ² test in certain regimes, including for some important two group alternatives, which we corroborate with approximate power calculations.


Subjects
Genome , Genomics , Chromosome Mapping
12.
J Transl Med ; 22(1): 258, 2024 03 09.
Article in English | MEDLINE | ID: mdl-38461317

ABSTRACT

BACKGROUND: The term eGene has been applied to define a gene whose expression level is affected by at least one independent expression quantitative trait locus (eQTL). It is both theoretically and empirically important to identify eQTLs and eGenes in genomic studies. However, standard eGene detection methods generally focus on individual cis-variants and cannot efficiently leverage useful knowledge acquired from auxiliary samples into target studies. METHODS: We propose a multilocus-based eGene identification method called TLegene by integrating shared genetic similarity information available from auxiliary studies under the statistical framework of transfer learning. We apply TLegene to eGene identification in ten TCGA cancers which have an explicit relevant tissue in the GTEx project, and learn genetic effect of variant in TCGA from GTEx. We also adopt TLegene to the Geuvadis project to evaluate its usefulness in non-cancer studies. RESULTS: We observed substantial genetic effect correlation of cis-variants between TCGA and GTEx for a larger number of genes. Furthermore, consistent with the results of our simulations, we found that TLegene was more powerful than existing methods and thus identified 169 distinct candidate eGenes, which was much larger than the approach that did not consider knowledge transfer across target and auxiliary studies. Previous studies and functional enrichment analyses provided empirical evidence supporting the associations of discovered eGenes, and it also showed evidence of allelic heterogeneity of gene expression. Furthermore, TLegene identified more eGenes in Geuvadis and revealed that these eGenes were mainly enriched in cells EBV transformed lymphocytes tissue. CONCLUSION: Overall, TLegene represents a flexible and powerful statistical method for eGene identification through transfer learning of genetic similarity shared across auxiliary and target studies.


Subjects
Neoplasms , Polymorphism, Single Nucleotide , Humans , Quantitative Trait Loci/genetics , Genomics , Neoplasms/genetics , Machine Learning , Genome-Wide Association Study/methods
13.
Biometrics ; 80(2)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38536747

ABSTRACT

We develop a method for hybrid analyses that uses external controls to augment internal control arms in randomized controlled trials (RCTs) where the degree of borrowing is determined based on similarity between RCT and external control patients to account for systematic differences (e.g., unmeasured confounders). The method represents a novel extension of the power prior where discounting weights are computed separately for each external control based on compatibility with the randomized control data. The discounting weights are determined using the predictive distribution for the external controls derived via the posterior distribution for time-to-event parameters estimated from the RCT. This method is applied using a proportional hazards regression model with piecewise constant baseline hazard. A simulation study and a real-data example are presented based on a completed trial in non-small cell lung cancer. It is shown that the case weighted power prior provides robust inference under various forms of incompatibility between the external controls and RCT population.


Subjects
Research Design , Humans , Computer Simulation , Proportional Hazards Models , Bayes Theorem
14.
Postgrad Med J ; 100(1185): 451-460, 2024 Jun 28.
Article in English | MEDLINE | ID: mdl-38330498

ABSTRACT

First popularized almost a century ago in epidemiologic research by Ronald Fisher and Jerzy Neyman, the P-value has become perhaps the most misunderstood and even misused statistical value or descriptor. Indeed, modern clinical research has now come to be centered around and guided by an arbitrary P-value of <0.05 as a magical threshold for significance, so much so that experimental design, reporting of experimental findings, and interpretation and adoption of such findings have become largely dependent on this "significant" P-value. This has given rise to multiple biases in the overall body of biomedical literature that threatens the very validity of clinical research. Ultimately, a drive toward reporting a "significant" P-value (by various statistical manipulations) risks creating a falsely positive body of science, leading to (i) wasted resources in pursuing fruitless research and (ii) futile or even harmful policies/therapeutic recommendations. This article reviews the history of the P-value, the conceptual basis of P-value in the context of hypothesis testing and challenges in critically appraising clinical evidence vis-à-vis the P-value. This review is aimed at raising awareness of the pitfalls of this rigid observation of the threshold of statistical significance when evaluating clinical trials and to generate discussion regarding whether the scientific body needs a rethink about how we decide clinical significance.


Subjects
Evidence-Based Medicine , Humans , Biomedical Research , Research Design , Data Interpretation, Statistical
15.
Biom J ; 66(2): e2200204, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38356198

ABSTRACT

Storey's estimator for the proportion of true null hypotheses, originally proposed under the continuous framework, has been modified in this work under the discrete framework. The modification results in improved estimation of the parameter of interest. The proposed estimator is used to formulate an adaptive version of the Benjamini-Hochberg procedure. Control over the false discovery rate by the proposed adaptive procedure has been proved analytically. The proposed estimate is also used to formulate an adaptive version of the Benjamini-Hochberg-Heyse procedure. Simulation experiments establish the conservative nature of this new adaptive procedure. Substantial amount of gain in power is observed for the new adaptive procedures over the standard procedures. For demonstration of the proposed method, two important real life gene expression data sets, one related to the study of HIV and the other related to methylation study, are used.
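
As background for the adaptive procedure, the Python sketch below shows the standard continuous-case building blocks only: a Storey-type estimate of the proportion of true nulls (the variant with +1 in the numerator) and the Benjamini-Hochberg step-up rule, which becomes adaptive when the estimate is plugged in. The discrete modification proposed in the paper and the Benjamini-Hochberg-Heyse procedure are not implemented. The tuning value `lam = 0.5` and the p-values are illustrative assumptions.

```python
def storey_pi0(pvals, lam=0.5):
    """Storey-type estimate of the proportion of true null hypotheses."""
    m = len(pvals)
    # the +1 in the numerator keeps the estimate strictly positive
    return min(1.0, (sum(p > lam for p in pvals) + 1) / ((1 - lam) * m))


def benjamini_hochberg(pvals, q=0.05, pi0=1.0):
    """Indices rejected by the BH step-up rule at level q.
    Passing an estimated pi0 < 1 gives the adaptive version."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        # step-up: remember the largest rank whose p-value meets its threshold
        if pvals[i] <= rank * q / (pi0 * m):
            k = rank
    return sorted(order[:k])


if __name__ == "__main__":
    pvals = [0.001, 0.004, 0.012, 0.019, 0.03, 0.2, 0.5, 0.7, 0.8, 0.9]
    pi0 = storey_pi0(pvals)
    print("standard BH rejects:", benjamini_hochberg(pvals))
    print("estimated pi0:", pi0)
    print("adaptive BH rejects:", benjamini_hochberg(pvals, pi0=pi0))
```

With these made-up p-values the adaptive rule rejects one more hypothesis than standard BH, which illustrates where the gain in power comes from.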


Subjects
Computer Simulation
16.
Clin Interv Aging ; 19: 277-287, 2024.
Article in English | MEDLINE | ID: mdl-38380229

ABSTRACT

Null hypothesis significance testing (NHST) is the dominant statistical approach in the geriatric and rehabilitation fields. However, NHST is routinely misunderstood or misused. In this case, the findings from clinical trials would be taken as evidence of no effect, when in fact, a clinically relevant question may have a "non-significant" p-value. Conversely, findings are considered clinically relevant when significant differences are observed between groups. To assume that p-value is not an exclusive indicator of an association or the existence of an effect, researchers should be encouraged to report other statistical analysis approaches such as Bayesian analysis and complementary statistical tools alongside the p-value (eg, effect size, confidence intervals, minimal clinically important difference, and magnitude-based inference) to improve interpretation of the findings of clinical trials by presenting a more efficient and comprehensive analysis. However, the focus on Bayesian analysis and secondary statistical analyses does not mean that NHST is less important. Only that, to observe a real intervention effect, researchers should use a combination of secondary statistical analyses in conjunction with NHST or Bayesian statistical analysis to reveal what p-values cannot show in the geriatric and rehabilitation studies (eg, the clinical importance of 1 kg increase in handgrip strength in the intervention group of long-lived older adults compared to a control group). This paper provides potential insights for improving the interpretation of scientific data in rehabilitation and geriatric fields by utilizing Bayesian and secondary statistical analyses to better scrutinize the results of clinical trials where a p-value alone may not be appropriate to determine the efficacy of an intervention.


Subjects
Hand Strength , Research Design , Humans , Aged , Bayes Theorem , Data Interpretation, Statistical
17.
J Clin Transl Sci ; 8(1): e9, 2024.
Article in English | MEDLINE | ID: mdl-38384917

ABSTRACT

The proposal of improving reproducibility by lowering the significance threshold to 0.005 has been discussed, but the impact on conducting clinical trials has yet to be examined from a study design perspective. The impact on sample size and study duration was investigated using design setups from 125 phase II studies published between 2015 and 2022. The impact was assessed using percent increase in sample size and additional years of accrual with the medians being 110.97% higher and 2.65 years longer respectively. The results indicated that this proposal causes additional financial burdens that reduce the efficiency of conducting clinical trials.
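
To see why a stricter threshold inflates sample size, the Python sketch below uses the textbook normal-approximation formula, in which the required sample size is proportional to (z_alpha + z_beta)^2. This is a simplification chosen for illustration. The 125 phase II designs in the study use their own design setups, so the sketch is not expected to reproduce the reported median of 110.97%. The function name and defaults are hypothetical.

```python
from statistics import NormalDist


def sample_size_ratio(alpha_old=0.05, alpha_new=0.005, power=0.80, two_sided=True):
    """Ratio of required sample sizes (new / old) for a z-test with the same
    effect size and power, using n proportional to (z_alpha + z_beta)^2."""
    nd = NormalDist()

    def z_alpha(a):
        return nd.inv_cdf(1 - a / 2) if two_sided else nd.inv_cdf(1 - a)

    z_beta = nd.inv_cdf(power)
    return ((z_alpha(alpha_new) + z_beta) / (z_alpha(alpha_old) + z_beta)) ** 2


if __name__ == "__main__":
    ratio = sample_size_ratio()
    print(f"sample size multiplier: {ratio:.2f}")
```

Under these assumptions (two-sided test, 80% power), moving from 0.05 to 0.005 multiplies the required sample size by roughly 1.7.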

18.
Rev. neurol. (Ed. impr.) ; 78(1)1 - 15 January 2024. tab
Article in Spanish | IBECS | ID: ibc-229062

ABSTRACT

A very common practice in medical research, during the process of data analysis, is to dichotomise numerical variables in two groups. This leads to the loss of very useful information that can undermine the effectiveness of the research. Several examples are used to show how the dichotomisation of numerical variables can lead to a loss of statistical power in studies. This can be a critical aspect in assessing, for example, whether a therapeutic procedure is more effective or whether a certain factor is a risk factor. Dichotomising continuous variables is therefore not recommended unless there is a very specific reason to do so.
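
The loss of power from dichotomisation can be illustrated with a small simulation. The Python sketch below is not one of the article's examples. It assumes a two-arm study with a normally distributed outcome, then compares a large-sample z-test on the continuous outcome with a two-proportion z-test after splitting the outcome at a cutoff. The sample size, effect size, cutoff and function name are all invented for illustration.

```python
import random
from statistics import NormalDist, mean, stdev


def simulate_power(n=50, effect=0.5, cutoff=0.25, n_sim=2000, alpha=0.05, seed=0):
    """Estimated power of (continuous analysis, dichotomised analysis)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    hits_cont = hits_dich = 0
    for _ in range(n_sim):
        ctrl = [rng.gauss(0, 1) for _ in range(n)]
        trt = [rng.gauss(effect, 1) for _ in range(n)]
        # continuous outcome: z-test on the difference in means
        se = (stdev(ctrl) ** 2 / n + stdev(trt) ** 2 / n) ** 0.5
        if abs(mean(trt) - mean(ctrl)) / se > z_crit:
            hits_cont += 1
        # dichotomised outcome: two-proportion z-test on "outcome > cutoff"
        p1 = sum(x > cutoff for x in ctrl) / n
        p2 = sum(x > cutoff for x in trt) / n
        pooled = (p1 + p2) / 2
        se_p = (2 * pooled * (1 - pooled) / n) ** 0.5
        if se_p > 0 and abs(p2 - p1) / se_p > z_crit:
            hits_dich += 1
    return hits_cont / n_sim, hits_dich / n_sim


if __name__ == "__main__":
    power_cont, power_dich = simulate_power()
    print("continuous:", power_cont, "dichotomised:", power_dich)
```

With the same data, the dichotomised analysis detects the effect less often, which is the loss of statistical power the abstract describes.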


Subjects
Biomedical Research/statistics & numerical data , Models, Statistical
19.
J Biopharm Stat ; 34(1): 127-135, 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-36710407

ABSTRACT

The paper provides computations comparing the accuracy of the saddlepoint approximation approach and the normal approximation method in approximating the mid-p-value of Wilcoxon and log-rank tests for the left-truncated data using a truncated binomial design. The paper uses real data examples to apply the comparison, along with some simulated studies. Confidence intervals are provided by the inversion of the tests under consideration.


Subjects
Confidence Intervals , Humans , Sample Size
20.
Proteomics ; 24(5): e2300145, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37726251

ABSTRACT

Exact p-value (XPV)-based methods for dot product-like score functions (such as the XCorr score implemented in Tide, SEQUEST, Comet or shared peak count-based scoring in MSGF+ and ASPV) provide a fairly good calibration for peptide-spectrum-match (PSM) scoring in database searching-based MS/MS spectrum data identification. Unfortunately, standard XPV methods, in practice, cannot handle high-resolution fragmentation data produced by state-of-the-art mass spectrometers because having smaller bins increases the number of fragment matches that are assigned to incorrect bins and scored improperly. In this article, we present an extension of the XPV method, called the high-resolution exact p-value (HR-XPV) method, which can be used to calibrate PSM scores of high-resolution MS/MS spectra obtained with dot product-like scoring such as the XCorr. The HR-XPV carries remainder masses throughout the fragmentation, allowing them to greatly increase the number of fragments that are properly assigned to the correct bin and, thus, taking advantage of high-resolution data. Using four mass spectrometry data sets, our experimental results demonstrate that HR-XPV produces well-calibrated scores, which in turn results in more trusted spectrum annotations at any false discovery rate level.


Subjects
Algorithms , Tandem Mass Spectrometry , Tandem Mass Spectrometry/methods , Software , Peptides/chemistry , Calibration , Databases, Protein