Results 1 - 15 of 15

1.
Article in English | MEDLINE | ID: mdl-32165781

ABSTRACT

ROC analysis involving two large datasets is an important method for analyzing statistics of interest for classifier decision making in many disciplines. Data dependency due to multiple uses of the same subjects is ubiquitous, because limited resources make it necessary to generate more samples from the same subjects. Hence, a two-layer data structure is constructed, and the nonparametric two-sample two-layer bootstrap is employed to estimate standard errors of statistics of interest derived from two sets of data, such as a weighted sum of two probabilities. In this article, to reduce the bootstrap variance and ensure the accuracy of computation, Monte Carlo studies of bootstrap variability were carried out to determine the appropriate number of bootstrap replications in ROC analysis with data dependency. The results suggest that, with a tolerance of 0.02 on the coefficient of variation, 2,000 bootstrap replications are appropriate under such circumstances.
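A minimal sketch of the resampling scheme described above, assuming the two-layer structure is given as lists of score arrays; the function names, the equal-probability selection, and the variability check are illustrative, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def two_layer_bootstrap_se(sets_a, sets_b, statistic, n_boot=2000):
    """SE of statistic(scores_a, scores_b) via the two-sample two-layer
    bootstrap: resample sets with replacement, then scores within sets."""
    reps = np.empty(n_boot)
    for b in range(n_boot):
        pick_a = [sets_a[i] for i in rng.integers(len(sets_a), size=len(sets_a))]
        pick_b = [sets_b[i] for i in rng.integers(len(sets_b), size=len(sets_b))]
        scores_a = np.concatenate([rng.choice(s, size=len(s)) for s in pick_a])
        scores_b = np.concatenate([rng.choice(s, size=len(s)) for s in pick_b])
        reps[b] = statistic(scores_a, scores_b)
    return reps.std(ddof=1)

def cv_of_bootstrap_se(sets_a, sets_b, statistic, n_boot, n_repeat=20):
    """Bootstrap-variability check: repeat the SE estimate and report its
    coefficient of variation; choose n_boot so this falls below 0.02."""
    ses = np.array([two_layer_bootstrap_se(sets_a, sets_b, statistic, n_boot)
                    for _ in range(n_repeat)])
    return ses.std(ddof=1) / ses.mean()
```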

2.
Article in English | MEDLINE | ID: mdl-28660231

ABSTRACT

The data dependency due to multiple uses of the same subjects affects the standard error (SE) of the detection cost function (DCF) in speaker recognition evaluation. The DCF is defined as a weighted sum of the probabilities of type I and type II errors at a given threshold. A two-layer data structure is constructed: target scores are grouped into target sets based on the dependency, and likewise for non-target scores. To ensure that all scores have equal probability of being selected when resampling, target sets must contain the same number of target scores, and so must non-target sets. In addition to the bootstrap method under the i.i.d. assumption, the nonparametric two-sample one-layer and two-layer bootstrap methods are carried out, depending on whether the resampling takes place only on sets or subsequently on the scores within the sets. Because of the stochastic nature of the bootstrap, the distributions of the SEs of the DCF estimated using the three different bootstrap methods are constructed and compared. Hypothesis testing shows that data dependency increases not only the SE but also the variation of the SE, and that the two-layer bootstrap is more conservative than the one-layer bootstrap. The rationale for the different impacts of the three bootstrap methods on the estimated SEs is investigated.
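For concreteness, a sketch of the DCF at a fixed threshold; the cost weights and target prior are illustrative placeholders, not values from the evaluation:

```python
import numpy as np

def dcf(target_scores, nontarget_scores, threshold,
        c_miss=1.0, c_fa=1.0, p_target=0.5):
    """Weighted sum of the miss (type II) and false-alarm (type I)
    probabilities at the given decision threshold."""
    t = np.asarray(target_scores, float)
    n = np.asarray(nontarget_scores, float)
    p_miss = np.mean(t < threshold)    # targets rejected
    p_fa = np.mean(n >= threshold)     # non-targets accepted
    return c_miss * p_target * p_miss + c_fa * (1.0 - p_target) * p_fa
```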

3.
BMC Bioinformatics; 18(1): 168, 2017 Mar 14.
Article in English | MEDLINE | ID: mdl-28292256

ABSTRACT

BACKGROUND: Cell image segmentation (CIS) is an essential part of quantitative imaging of biological cells. Designing a performance measure and conducting significance testing are critical for evaluating and comparing CIS algorithms for image-based cell assays in cytometry. Many measures and methods have been proposed and implemented to evaluate segmentation methods. However, how to compute the standard errors (SE) of these measures and their correlation coefficient has not been described, and thus the statistical significance of performance differences between CIS algorithms cannot be assessed. RESULTS: We propose the total error rate (TER), a novel performance measure for segmenting all cells in supervised evaluation. The TER statistically aggregates all misclassification error rates (MER), one for each single cell in the population, taking cell sizes as weights. The TER is fully supported by pairwise comparisons of MERs using 106 manually segmented ground-truth cells of different sizes and seven CIS algorithms taken from ImageJ. Further, the SE and 95% confidence interval (CI) of the TER are computed from the SE of the MER, which is calculated using the bootstrap method. An algorithm for computing the correlation coefficient of TERs between two CIS algorithms is also provided. Hence, 95% CI error bars can be used to rank CIS algorithms, and when the CIs overlap, the SEs of the TERs and their correlation coefficient can be employed in hypothesis testing to determine whether the performance differences between CIS algorithms are statistically significant. CONCLUSIONS: A novel measure, the TER of CIS, is proposed, and its SEs and correlation coefficient are computed. CIS algorithms can thereby be evaluated and compared statistically by conducting significance testing.
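A sketch of the size-weighted aggregation and a bootstrap SE, assuming the resampling unit is the ground-truth cell; both are inferences from the abstract rather than the paper's code:

```python
import numpy as np

def total_error_rate(mers, cell_sizes):
    """TER: per-cell misclassification error rates aggregated with the
    cell sizes as weights."""
    m = np.asarray(mers, float)
    w = np.asarray(cell_sizes, float)
    return np.sum(w * m) / np.sum(w)

def ter_bootstrap_se(mers, cell_sizes, n_boot=2000, seed=0):
    """SE of the TER from resampling cells with replacement."""
    rng = np.random.default_rng(seed)
    m = np.asarray(mers, float)
    w = np.asarray(cell_sizes, float)
    idx = rng.integers(len(m), size=(n_boot, len(m)))
    reps = np.sum(w[idx] * m[idx], axis=1) / np.sum(w[idx], axis=1)
    return reps.std(ddof=1)
```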


Subjects
Algorithms; Image Interpretation, Computer-Assisted; Animals; Mice; Microscopy, Fluorescence; Myocytes, Smooth Muscle/cytology
4.
Commun Stat Simul Comput; 45(5): 1689-1703, 2016.
Article in English | MEDLINE | ID: mdl-27499571

ABSTRACT

The nonparametric two-sample bootstrap is applied to computing uncertainties of measures in ROC analysis on large datasets, in areas such as biometrics and speaker recognition, when analytical methods cannot be used. Its validity was studied by computing the SE of the area under the ROC curve using the well-established analytical Mann-Whitney-statistic method and comparing it with the bootstrap estimate. The analytical result is unique, whereas the bootstrap results form a probability distribution owing to the stochastic nature of the resampling. The comparisons, carried out using relative errors and hypothesis testing, show that the two approaches match very well. This validation provides a sound foundation for such computations.
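A sketch of the two sides of the comparison: the Mann-Whitney estimate of the area under the ROC curve with an analytical SE in the placement-value (DeLong) form of the Mann-Whitney variance, and the two-sample bootstrap counterpart. Memory use is quadratic in the score counts, so this is for modest datasets, not the large ones in the paper:

```python
import numpy as np

def _kernel(x, y):
    # Mann-Whitney kernel: 1 if genuine > impostor, 0.5 on ties, else 0.
    d = x[:, None] - y[None, :]
    return (d > 0) + 0.5 * (d == 0)

def auc_analytic(genuine, impostor):
    """AUC and its SE from the placement-value form of the
    Mann-Whitney variance."""
    x = np.asarray(genuine, float)
    y = np.asarray(impostor, float)
    k = _kernel(x, y)
    auc = k.mean()
    v10, v01 = k.mean(axis=1), k.mean(axis=0)
    se = np.sqrt(v10.var(ddof=1) / len(x) + v01.var(ddof=1) / len(y))
    return auc, se

def auc_bootstrap_se(genuine, impostor, n_boot=2000, seed=0):
    """Two-sample bootstrap SE: resample each score set independently."""
    rng = np.random.default_rng(seed)
    x = np.asarray(genuine, float)
    y = np.asarray(impostor, float)
    reps = np.array([_kernel(rng.choice(x, len(x)),
                             rng.choice(y, len(y))).mean()
                     for _ in range(n_boot)])
    return reps.std(ddof=1)
```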

5.
Innov Syst Softw Eng; 12(4): 249-261, 2016 Dec.
Article in English | MEDLINE | ID: mdl-28133442

ABSTRACT

A key issue in testing is how many tests are needed for a required level of coverage or fault detection. Estimates are often based on error rates in initial testing, or on code coverage. For example, tests may be run until a desired level of statement or branch coverage is achieved. Combinatorial methods present an opportunity for a different approach to estimating required test set size, using characteristics of the test set. This paper describes methods for estimating the coverage of, and ability to detect, t-way interaction faults of a test set based on a covering array. We also develop a connection between (static) combinatorial coverage and (dynamic) code coverage, such that if a specific condition is satisfied, 100% branch coverage is assured. Using these results, we propose practical recommendations for using combinatorial coverage in specifying test requirements.
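A sketch of (static) combinatorial coverage measurement: the fraction of t-way parameter-value combinations covered by at least one test. The parameter encoding is illustrative:

```python
from itertools import combinations

def t_way_coverage(tests, levels, t):
    """Fraction of all t-way value combinations covered by the test set.
    `tests` are tuples of value indices; `levels[i]` is the number of
    values of parameter i."""
    covered = total = 0
    for cols in combinations(range(len(levels)), t):
        seen = {tuple(row[c] for c in cols) for row in tests}
        n_combos = 1
        for c in cols:
            n_combos *= levels[c]
        covered += len(seen)
        total += n_combos
    return covered / total

# A 2-way covering array on three binary parameters: full pair coverage.
tests = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
print(t_way_coverage(tests, levels=[2, 2, 2], t=2))  # 1.0
```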

6.
Article in English | MEDLINE | ID: mdl-35528610

ABSTRACT

The mission of the Joint Committee for Guides in Metrology (JCGM) is to maintain and promote the use of the Guide to the Expression of Uncertainty in Measurement (GUM) and the International Vocabulary of Metrology (VIM, second edition). The JCGM has produced the third edition of the VIM (referred to as VIM3) and a number of documents, some of which are referred to as supplements to the GUM. We are concerned with Supplement 1 (GUM-S1) and the document JCGM 104. The signal contribution of the GUM is its operational view of the uncertainty in measurement, as a parameter that characterizes the dispersion of the values that could be attributed to an unknown quantity. This operational view disconnected the uncertainty in measurement from the unknowable quantities true value and error. The GUM-S1 has diverged from the operational view of the uncertainty in measurement. Either the disparities should be removed or the GUM-S1 should not be referred to as a supplement to the GUM. Also, the GUM-S1 has misinterpreted the Bayesian concept of a statistical parameter, and the VIM3 definitions of coverage interval and coverage probability are mathematically defective. We offer practical suggestions for revising the GUM-S1 and the VIM3 to remove their divergence from the GUM and to repair their defects.

7.
J Chem Theory Comput; 9(2): 951-4, 2013 Feb 12.
Article in English | MEDLINE | ID: mdl-26588738

ABSTRACT

Anharmonic calculations using vibrational perturbation theory are known to provide near-spectroscopic accuracy when combined with high-level ab initio potential energy functions. However, performance with economical, popular electronic structure methods is less well characterized. We compare the accuracy of harmonic and anharmonic predictions from Hartree-Fock, second-order perturbation, and density functional theories combined with 6-31G(d) and 6-31+G(d,p) basis sets. As expected, anharmonic frequencies are closer than harmonic frequencies to experimental fundamentals. However, common practice is to correct harmonic predictions using multiplicative scaling. The surprising conclusion is that scaled anharmonic calculations are no more accurate than scaled harmonic calculations for the basis sets we used. The data used are from the Computational Chemistry Comparison and Benchmark Database (CCCBDB), maintained by the National Institute of Standards and Technology, which includes more than 3939 independent vibrations for 358 molecules.
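A sketch of one plausible figure of merit behind such a comparison, assuming the scaling factor is chosen by least squares; the paper's exact protocol may differ:

```python
import numpy as np

def rms_after_scaling(predicted, observed):
    """RMS residual against experiment after applying the least-squares
    multiplicative scaling factor to the predicted frequencies."""
    p = np.asarray(predicted, float)
    v = np.asarray(observed, float)
    c = np.sum(p * v) / np.sum(p * p)  # optimal scale for sum((v - c*p)**2)
    return np.sqrt(np.mean((v - c * p) ** 2))

# Compare rms_after_scaling(harmonic, expt) with
# rms_after_scaling(anharmonic, expt) over the same set of vibrations.
```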

8.
J Res Natl Inst Stand Technol; 116(1): 517-37, 2011.
Article in English | MEDLINE | ID: mdl-26989582

ABSTRACT

In receiver operating characteristic (ROC) analysis, sampling variability can result in uncertainties in performance measures. Thus, when evaluating and comparing the performance of algorithms, the measurement uncertainties must be taken into account. The key issue is how to calculate the uncertainties of performance measures in ROC analysis; our ultimate goal is to use the computed standard errors for significance testing in evaluation and comparison. From the operational perspective, based on fingerprint-image matching algorithms run on large datasets, the measures and their uncertainties are investigated in three scenarios: 1) the true accept rate (TAR) of genuine scores at a specified false accept rate (FAR) of impostor scores, 2) the TAR and FAR at a given threshold, and 3) the equal error rate. The uncertainties of the measures are calculated using the nonparametric two-sample bootstrap, based on our extensive studies of bootstrap variability on large datasets. The significance test determines whether the difference between the performance of one algorithm and a hypothesized value, or the difference between the performances of two algorithms with their correlation taken into account, is statistically significant. Examples are provided.
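A sketch of the three operating-point measures for a score-based matcher; the threshold convention (accept when score >= threshold) and the grid-based EER approximation are assumptions:

```python
import numpy as np

def tar_at_far(genuine, impostor, far):
    """Scenario 1: TAR at the threshold giving the specified FAR."""
    g = np.asarray(genuine, float)
    i = np.asarray(impostor, float)
    thr = np.quantile(i, 1.0 - far)  # accept when score >= thr
    return np.mean(g >= thr)

def tar_far_at_threshold(genuine, impostor, thr):
    """Scenario 2: TAR and FAR at a given threshold."""
    g = np.asarray(genuine, float)
    i = np.asarray(impostor, float)
    return np.mean(g >= thr), np.mean(i >= thr)

def equal_error_rate(genuine, impostor):
    """Scenario 3: threshold sweep; the EER is where the false-reject
    rate crosses the false-accept rate."""
    g = np.asarray(genuine, float)
    i = np.asarray(impostor, float)
    thrs = np.unique(np.concatenate([g, i]))
    frr = np.array([np.mean(g < t) for t in thrs])
    far = np.array([np.mean(i >= t) for t in thrs])
    j = np.argmin(np.abs(frr - far))
    return 0.5 * (frr[j] + far[j])
```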

9.
J Res Natl Inst Stand Technol; 116(6): 809-20, 2011.
Article in English | MEDLINE | ID: mdl-26989601

ABSTRACT

According to the Guide to the Expression of Uncertainty in Measurement (GUM), a result of measurement consists of a measured value together with its associated standard uncertainty. The measured value and the standard uncertainty are interpreted as the expected value and the standard deviation of a state-of-knowledge probability distribution attributed to the measurand. We discuss the term metrological compatibility, introduced by the International Vocabulary of Metrology, third edition (VIM3), for the absence of significant differences between two or more results of measurement for the same measurand. Sometimes a combined result of measurement from multiple evaluations of the same measurand is needed. We propose an approach for determining a combined result that is metrologically compatible with the contributing results.
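A sketch of a compatibility check in the VIM3 spirit: the difference between two measured values is compared with k times the standard uncertainty of that difference. The factor k = 2 and the correlation handling are conventional choices, not prescriptions from the paper, and the combined-result construction is not reproduced here:

```python
import math

def metrologically_compatible(y1, u1, y2, u2, r=0.0, k=2.0):
    """True if |y1 - y2| does not exceed k times the standard
    uncertainty of the difference, allowing for correlation r."""
    u_diff = math.sqrt(u1 ** 2 + u2 ** 2 - 2.0 * r * u1 * u2)
    return abs(y1 - y2) <= k * u_diff
```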

10.
J Chem Theory Comput; 6(9): 2822-8, 2010 Sep 14.
Article in English | MEDLINE | ID: mdl-26616083

ABSTRACT

To predict the vibrational spectra of molecules, ab initio calculations are often used to compute harmonic frequencies, which are usually scaled by empirical factors as an approximate correction for errors in the force constants and for anharmonic effects. Anharmonic computations of fundamental frequencies are becoming increasingly popular. We report scaling factors, along with their associated uncertainties, for anharmonic (second-order perturbation theory) predictions from HF, MP2, and B3LYP calculations using the 6-31G(d) and 6-31+G(d,p) basis sets. Different scaling factors are appropriate for low- and high-frequency vibrations. The method of analysis is based upon the Guide to the Expression of Uncertainty in Measurement, published by the International Organization for Standardization (ISO). The data used are from the Computational Chemistry Comparison and Benchmark Database (CCCBDB), maintained by the National Institute of Standards and Technology, which includes more than 3939 independent vibrations for 358 molecules.
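A sketch under stated assumptions: the scaling factor is taken as the least-squares value c minimizing the sum of squared residuals, and its standard uncertainty as the RMS scatter of the per-vibration ratios about c. The paper's GUM-based analysis is more detailed; this only illustrates the shape of the calculation:

```python
import numpy as np

def scaling_factor_with_uncertainty(harmonic, observed):
    """Least-squares scaling factor and an RMS-scatter uncertainty
    (an approximation to the paper's GUM-based treatment)."""
    w = np.asarray(harmonic, float)    # computed harmonic frequencies
    v = np.asarray(observed, float)    # observed fundamentals
    c = np.sum(w * v) / np.sum(w * w)  # minimizes sum((v - c*w)**2)
    u = np.sqrt(np.mean((v / w - c) ** 2))
    return c, u
```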

11.
J Res Natl Inst Stand Technol; 115(6): 453-9, 2010.
Article in English | MEDLINE | ID: mdl-27134797

ABSTRACT

In some metrology applications, multiple results of measurement for a common measurand are obtained, and it is necessary to determine whether the results agree with each other. A result of measurement based on the Guide to the Expression of Uncertainty in Measurement (GUM) consists of a measured value together with its associated standard uncertainty. In the GUM, the measured value is regarded as the expected value and the standard uncertainty as the standard deviation, both known values, of a state-of-knowledge probability distribution; such a distribution need not be completely known. How, then, can one assess the differences between results based on the GUM? Metrologists have for many years used the Birge chi-square test as a rule of thumb to assess the differences between two or more measured values for the same measurand, by pretending that the standard uncertainties were the standard deviations of presumed sampling probability distributions arising from random variation of the measured values. We point out that this is a misuse of the standard uncertainties; the Birge test, and the concept of statistical consistency it motivates, do not apply to results of measurement based on the GUM. In 2008, the International Vocabulary of Metrology, third edition (VIM3) introduced the concept of metrological compatibility. We propose that metrological compatibility be used to assess the differences between GUM-based results for the same measurand. A test of the metrological compatibility of two results of measurement does not conflict with a pairwise Birge test of the statistical consistency of the corresponding measured values.
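For reference, a sketch of the Birge chi-square test as metrologists have applied it, i.e., treating the stated uncertainties as sampling standard deviations (precisely the use the paper argues is invalid for GUM-based results); requires scipy:

```python
import numpy as np
from scipy.stats import chi2

def birge_test(values, uncertainties, alpha=0.05):
    """Chi-square consistency test about the weighted mean; returns the
    statistic and whether the values pass at level alpha."""
    x = np.asarray(values, float)
    u = np.asarray(uncertainties, float)
    w = 1.0 / u ** 2
    xbar = np.sum(w * x) / np.sum(w)
    stat = np.sum(w * (x - xbar) ** 2)  # ~ chi-square, n-1 dof, if consistent
    return stat, stat <= chi2.ppf(1.0 - alpha, len(x) - 1)
```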

12.
J Chem Phys; 130(11): 114102, 2009 Mar 21.
Article in English | MEDLINE | ID: mdl-19317526

ABSTRACT

Vibrational zero-point energies (ZPEs) determined from ab initio calculations are often scaled by empirical factors. An empirical scaling factor partially compensates for the effects arising from vibrational anharmonicity and incomplete treatment of electron correlation. These effects are not random but are systematic. We report scaling factors for 32 combinations of theory and basis set, intended for predicting ZPEs from computed harmonic frequencies. An empirical scaling factor carries uncertainty. We quantify and report, for the first time, the uncertainties associated with scaling factors for ZPE. The uncertainties are larger than generally acknowledged; the scaling factors have only two significant digits. For example, the scaling factor for B3LYP/6-31G(d) is 0.9757 ± 0.0224 (standard uncertainty). The uncertainties in the scaling factors lead to corresponding uncertainties in predicted ZPEs. The proposed method for quantifying the uncertainties associated with scaling factors is based upon the Guide to the Expression of Uncertainty in Measurement, published by the International Organization for Standardization. We also present a new reference set of 60 diatomic and 15 polyatomic "experimental" ZPEs that includes estimated uncertainties.


Subjects
Models, Molecular; Uncertainty; Computer Simulation; Molecular Structure; Quantum Theory; Thermodynamics
13.
J Res Natl Inst Stand Technol; 113(5): 287-97, 2008.
Article in English | MEDLINE | ID: mdl-27096128

ABSTRACT

Covering arrays are structures for compactly representing extremely large input spaces and are used to implement black-box testing of software and hardware efficiently. This paper proposes refinements of the In-Parameter-Order strategy (for arbitrary t). When constructing homogeneous-alphabet covering arrays, these refinements reduce runtime in nearly all cases by a factor of more than 5, and in some cases by factors as large as 280, with the improvement growing with the number of columns in the covering array. Moreover, the resulting covering arrays are about 5 % smaller. Consequently, this new algorithm has constructed many of the smallest covering arrays in the literature. A heuristic variant of the algorithm sometimes produces comparably sized covering arrays while running significantly faster.

14.
J Phys Chem A; 109(37): 8430-7, 2005 Sep 22.
Article in English | MEDLINE | ID: mdl-16834237

ABSTRACT

Vibrational frequencies determined from ab initio calculations are often scaled by empirical factors. An empirical scaling factor partially compensates for the errors arising from vibrational anharmonicity and incomplete treatment of electron correlation. These errors are not random but are systematic biases. We report scaling factors for 40 combinations of theory and basis set, intended for predicting the fundamental frequencies from computed harmonic frequencies. An empirical scaling factor carries uncertainty. We quantify and report, for the first time, the uncertainties associated with the scaling factors. The uncertainties are larger than generally acknowledged; the scaling factors have only two significant digits. For example, the scaling factor for HF/6-31G(d) is 0.8982 ± 0.0230 (standard uncertainty). The uncertainties in the scaling factors lead to corresponding uncertainties in predicted vibrational frequencies. The proposed method for quantifying the uncertainties associated with scaling factors is based on the Guide to the Expression of Uncertainty in Measurement, published by the International Organization for Standardization (ISO). The data used are from the Computational Chemistry Comparison and Benchmark Database (CCCBDB), maintained by the National Institute of Standards and Technology, which includes more than 3939 independent vibrations for 358 molecules.

15.
J Res Natl Inst Stand Technol; 96(5): 577-591, 1991.
Article in English | MEDLINE | ID: mdl-28184132

ABSTRACT

Taguchi's catalog of orthogonal arrays is based on the mathematical theory of factorial designs and difference sets developed by R. C. Bose and his associates. These arrays evolved as extensions of factorial designs and latin squares. This paper (1) describes the structure and constructions of Taguchi's orthogonal arrays, (2) illustrates their fractional factorial nature, and (3) points out that Taguchi's catalog can be expanded to include orthogonal arrays developed since 1960.
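A sketch of the defining property: in an orthogonal array of strength t, every projection onto t columns contains each value combination equally often. The checker and the L4-style example are illustrative:

```python
from itertools import combinations
from collections import Counter

def is_orthogonal_array(rows, strength=2):
    """Check the equal-frequency property for every projection onto
    `strength` columns."""
    k = len(rows[0])
    for cols in combinations(range(k), strength):
        counts = Counter(tuple(row[c] for c in cols) for row in rows)
        n_combos = 1
        for c in cols:
            n_combos *= len({row[c] for row in rows})
        if len(counts) != n_combos or len(set(counts.values())) != 1:
            return False
    return True

# Taguchi's L4 array: 4 runs, 3 two-level factors, strength 2.
L4 = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
print(is_orthogonal_array(L4))  # True
```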
