1.
J Cheminform ; 6(1): 10, 2014 Mar 29.
Article in English | MEDLINE | ID: mdl-24678909

ABSTRACT

BACKGROUND: We address the problem of selecting and assessing classification and regression models using cross-validation. Current state-of-the-art methods can yield models with high variance, rendering them unsuitable for a number of practical applications including QSAR. In this paper we describe and evaluate best practices which improve reliability and increase confidence in selected models. A key operational component of the proposed methods is cloud computing which enables routine use of previously infeasible approaches. METHODS: We describe in detail an algorithm for repeated grid-search V-fold cross-validation for parameter tuning in classification and regression, and we define a repeated nested cross-validation algorithm for model assessment. As regards variable selection and parameter tuning we define two algorithms (repeated grid-search cross-validation and double cross-validation), and provide arguments for using the repeated grid-search in the general case. RESULTS: We show results of our algorithms on seven QSAR datasets. The variation of the prediction performance, which is the result of choosing different splits of the dataset in V-fold cross-validation, needs to be taken into account when selecting and assessing classification and regression models. CONCLUSIONS: We demonstrate the importance of repeating cross-validation when selecting an optimal model, as well as the importance of repeating nested cross-validation when assessing a prediction error.
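
The repeated grid-search V-fold cross-validation described above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the synthetic data, the k-nearest-neighbour regressor, the grid values and every function name are assumptions made for the example. It covers only the tuning step; repeated nested cross-validation would wrap this whole search inside an outer V-fold loop.

```python
import random
import statistics

def make_data(n=60, seed=0):
    """Hypothetical 1-D regression data: y = x^2 plus Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-2.0, 2.0) for _ in range(n)]
    ys = [x * x + rng.gauss(0.0, 0.3) for x in xs]
    return xs, ys

def knn_predict(train_x, train_y, x, k):
    """Predict y at x as the mean response of the k nearest training points."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k

def vfold_splits(n, v, rng):
    """Shuffle indices 0..n-1 and deal them into v roughly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::v] for i in range(v)]

def cv_mse(xs, ys, k, folds):
    """Mean squared error of k-NN over one fixed V-fold split."""
    sq_err = []
    for fold in folds:
        held_out = set(fold)
        tr_x = [xs[i] for i in range(len(xs)) if i not in held_out]
        tr_y = [ys[i] for i in range(len(xs)) if i not in held_out]
        sq_err += [(knn_predict(tr_x, tr_y, xs[i], k) - ys[i]) ** 2 for i in fold]
    return sum(sq_err) / len(sq_err)

def repeated_grid_search(xs, ys, grid, v=5, repeats=20, seed=1):
    """Score every grid value on the same `repeats` random V-fold splits.

    Returns the grid value with the lowest mean CV error, plus the
    per-repeat errors so their spread across splits can be inspected.
    """
    rng = random.Random(seed)
    errors = {k: [] for k in grid}
    for _ in range(repeats):
        folds = vfold_splits(len(xs), v, rng)  # a fresh split for each repeat
        for k in grid:
            errors[k].append(cv_mse(xs, ys, k, folds))
    best = min(grid, key=lambda k: statistics.mean(errors[k]))
    return best, errors

xs, ys = make_data()
grid = [1, 3, 5, 9, 15]
best_k, errors = repeated_grid_search(xs, ys, grid)
for k in grid:
    print(k, round(statistics.mean(errors[k]), 3), round(statistics.stdev(errors[k]), 3))
print("selected k:", best_k)
```

The standard deviation printed for each grid value is the split-to-split variation that the abstract says must be taken into account; a single V-fold run would report only one draw from that distribution.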

2.
Diagn Pathol ; 8: 44, 2013 Mar 11.
Article in English | MEDLINE | ID: mdl-23497426

ABSTRACT

BACKGROUND: The differential diagnosis between metastatic head & neck squamous cell carcinomas (HNSCC) and lung squamous cell carcinomas (lung SCC) is often unresolved because the histologic appearance of these two tumor types is similar. We have developed and validated a gene expression profile test (GEP-HN-LS) that distinguishes HNSCC and lung SCC in formalin-fixed, paraffin-embedded (FFPE) specimens using a 2160-gene classification model. METHODS: The test was validated in a blinded study using a pre-specified algorithm and microarray data files for 76 metastatic or poorly-differentiated primary tumors with a known HNSCC or lung SCC diagnosis. RESULTS: The study met the primary Bayesian statistical endpoint for acceptance. Measures of test performance include overall agreement with the known diagnosis of 82.9% (95% CI, 72.5% to 90.6%), an area under the ROC curve (AUC) of 0.91 and a diagnostic odds ratio (DOR) of 23.6. HNSCC (N = 38) gave an agreement with the known diagnosis of 81.6% and lung SCC (N = 38) gave an agreement of 84.2%. Reproducibility in test results between three laboratories had a concordance of 91.7%. CONCLUSION: GEP-HN-LS can aid in resolving the important differential diagnosis between HNSCC and lung SCC tumors. VIRTUAL SLIDES: The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/1753227817890930.
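
As a rough consistency check, the reported overall agreement and diagnostic odds ratio can be recomputed from a 2x2 table. The counts below are not stated in the abstract; they are inferred from the reported percentages (81.6% of 38 is about 31 correct HNSCC calls, 84.2% of 38 is about 32 correct lung SCC calls), and treating HNSCC as the "positive" class is an arbitrary choice made for this sketch.

```python
def agreement_and_dor(tp, fn, tn, fp):
    """Overall agreement and diagnostic odds ratio from a 2x2 table."""
    overall = (tp + tn) / (tp + fn + tn + fp)
    dor = (tp * tn) / (fp * fn)
    return overall, dor

# Inferred counts (assumption): HNSCC 31 correct / 7 wrong,
# lung SCC 32 correct / 6 wrong.
overall, dor = agreement_and_dor(tp=31, fn=7, tn=32, fp=6)
print(f"overall agreement = {overall:.1%}, DOR = {dor:.1f}")
# prints: overall agreement = 82.9%, DOR = 23.6
```

With these inferred counts both figures match the values given in the abstract.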


Subjects
Biomarkers, Tumor/genetics; Carcinoma, Squamous Cell/genetics; Gene Expression Profiling; Gene Expression Regulation, Neoplastic; Genetic Testing; Head and Neck Neoplasms/genetics; Lung Neoplasms/genetics; Adult; Aged; Algorithms; Area Under Curve; Bayes Theorem; Carcinoma, Squamous Cell/secondary; Diagnosis, Differential; Fixatives; Formaldehyde; Gene Expression Profiling/methods; Genetic Testing/methods; Head and Neck Neoplasms/pathology; Humans; Lung Neoplasms/pathology; Middle Aged; Odds Ratio; Oligonucleotide Array Sequence Analysis; Paraffin Embedding; Predictive Value of Tests; ROC Curve; Reproducibility of Results; Tissue Fixation
3.
Oncotarget ; 3(2): 212-23, 2012 Feb.
Article in English | MEDLINE | ID: mdl-22371431

ABSTRACT

We have developed a gene expression profile test (Pathwork Tissue of Origin Endometrial Test) that distinguishes primary epithelial ovarian and endometrial cancers in formalin-fixed, paraffin-embedded (FFPE) specimens using a 316-gene classification model. The test was validated in a blinded study using a pre-specified algorithm and microarray files for 75 metastatic, poorly differentiated or undifferentiated specimens with a known ovarian or endometrial cancer diagnosis. Measures of test performance include a 94.7% overall agreement with the known diagnosis, an area under the ROC curve (AUC) of 0.997 and a diagnostic odds ratio (DOR) of 406. Ovarian cancers (n=30) gave an agreement of 96.7% with the known diagnosis while endometrial cancers (n=45) gave an agreement of 93.3%. In a precision study, concordance in test results was 100%. Reproducibility in test results between three laboratories was 94.3%. The Tissue of Origin Endometrial Test can aid in resolving important differential diagnostic questions in gynecologic oncology.
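
The headline figures in this abstract are mutually consistent, which can be seen by working them out from per-class counts. The counts are an inference from the reported percentages, not values given in the abstract: 96.7% of 30 ovarian specimens corresponds to 29 correct calls, and 93.3% of 45 endometrial specimens to 42.

```latex
\[
\text{overall agreement} = \frac{29 + 42}{30 + 45} = \frac{71}{75} \approx 94.7\%,
\qquad
\mathrm{DOR} = \frac{29 \times 42}{1 \times 3} = 406 .
\]
```

Here the diagnostic odds ratio is the product of the correct calls in each class divided by the product of the incorrect calls (1 ovarian, 3 endometrial).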


Subjects
Endometrial Neoplasms/diagnosis; Gene Expression Profiling/methods; Molecular Diagnostic Techniques/methods; Oligonucleotide Array Sequence Analysis/methods; Ovarian Neoplasms/diagnosis; Adult; Aged; Aged, 80 and over; Endometrial Neoplasms/genetics; Female; Genetic Testing; Humans; Middle Aged; Neoplasm Grading; Ovarian Neoplasms/genetics
4.
J Clin Oncol ; 27(15): 2503-8, 2009 May 20.
Article in English | MEDLINE | ID: mdl-19332734

ABSTRACT

PURPOSE: Malignancies found in unexpected locations or with poorly differentiated morphologies can pose a significant challenge for tissue of origin determination. Current histologic and imaging techniques fail to yield definitive identification of the tissue of origin in a significant number of cases. The aim of this study was to validate a predefined 1,550-gene expression profile for this purpose. METHODS: Four institutions processed 547 frozen specimens representing 15 tissues of origin using oligonucleotide microarrays. Half of the specimens were metastatic tumors, with the remainder being poorly differentiated and undifferentiated primary cancers chosen to resemble those that present as a clinical challenge. RESULTS: In this blinded multicenter validation study the 1,550-gene expression profile was highly informative in tissue determination. The study found overall sensitivity (positive percent agreement with reference diagnosis) of 87.8% (95% CI, 84.7% to 90.4%) and overall specificity (negative percent agreement with reference diagnosis) of 99.4% (95% CI, 98.3% to 99.9%). Performance within the subgroup of metastatic tumors (n = 258) was found to be slightly lower than that of the poorly differentiated and undifferentiated primary tumor subgroup, 84.5% and 90.7%, respectively (P = .04). Differences between individual laboratories were not statistically significant. CONCLUSION: This study represents the first adequately sized, multicenter validation of a gene-expression profile for tissue of origin determination restricted to poorly differentiated and undifferentiated primary cancers and metastatic tumors. These results indicate that this profile should be a valuable addition or alternative to currently available diagnostic methods for the evaluation of uncertain primary cancers.
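
The confidence interval quoted for overall sensitivity can be approximated with an exact binomial (Clopper-Pearson) interval. The abstract does not say which interval method the authors used, so that choice is an assumption, as is the count of 480 correct calls out of 547 (inferred from 87.8%). The sketch below uses only the standard library and finds the bounds by bisection.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed in log space to avoid underflow."""
    if k < 0:
        return 0.0
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 1.0 if k >= n else 0.0
    total = 0.0
    for i in range(k + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                    + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_term)
    return min(total, 1.0)

def bisect_root(f, lo, hi, iters=60):
    """Locate the sign change of a monotone function f on [lo, hi]."""
    f_lo_positive = f(lo) > 0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if (f(mid) > 0) == f_lo_positive:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for a binomial proportion k/n."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else bisect_root(
        lambda p: 1.0 - binom_cdf(k - 1, n, p) - alpha / 2, 0.0, 1.0)
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else bisect_root(
        lambda p: binom_cdf(k, n, p) - alpha / 2, 0.0, 1.0)
    return lower, upper

# Inferred count (assumption): 480 of 547 specimens called correctly.
lower, upper = clopper_pearson(480, 547)
print(f"sensitivity = {480 / 547:.1%}, 95% CI = {lower:.1%} to {upper:.1%}")
```

The same function can be applied to the subgroup figures, for example roughly 218 of 258 metastatic specimens, again using counts inferred from the reported percentages.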


Subjects
Gene Expression Profiling/methods; Gene Expression Profiling/standards; Neoplasms/diagnosis; Neoplasms/genetics; Oligonucleotide Array Sequence Analysis/methods; Oligonucleotide Array Sequence Analysis/standards; Aged; Evidence-Based Medicine; Female; Humans; Male; Middle Aged; Sensitivity and Specificity; Specimen Handling
5.
J Mol Diagn ; 10(1): 67-77, 2008 Jan.
Article in English | MEDLINE | ID: mdl-18083688

ABSTRACT

Clinical workup of metastatic malignancies of unknown origin is often arduous and expensive and is reported to be unsuccessful in 30% to 60% of cases. Accurate classification of uncertain primary cancers may improve with microarray-based gene expression testing. We evaluated the analytical performance characteristics of the Pathwork tissue of origin test, which uses expression signals from 1668 probe sets in a gene expression microarray, to quantify the similarity of tumor specimens to 15 known tissues of origin. Sixty archived tissue specimens from poorly differentiated and undifferentiated tumors (metastatic and primary) were analyzed at four laboratories representing a wide range of preanalytical conditions (e.g., personnel, reagents, instrumentation, and protocols). Cross-laboratory comparisons showed highly reproducible results between laboratories, with correlation coefficients from 0.95 to 0.97 for measurements of similarity scores, and an average 93.8% overall concordance between laboratories in terms of final tissue calls. Bland-Altman plots (mean coefficients of reproducibility of 32.48+/-3.97) and kappa statistics (kappa >0.86) also indicated a high level of agreement between laboratories. We conclude that the Pathwork tissue of origin test is a robust assay that produces consistent results in diverse laboratory conditions reflecting the preanalytical variations found in the everyday clinical practice of molecular diagnostics laboratories.
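
The two agreement measures mentioned above can be illustrated with a short sketch. The tissue calls and similarity scores are invented for the example, and the Bland-Altman coefficient is computed as 1.96 times the standard deviation of the paired differences, which is one common convention and may differ from the definition and scale used in the paper.

```python
import statistics
from collections import Counter

def cohen_kappa(calls_a, calls_b):
    """Cohen's kappa: chance-corrected agreement between two sets of calls."""
    n = len(calls_a)
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    freq_a, freq_b = Counter(calls_a), Counter(calls_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1.0 - expected)

def bland_altman(scores_a, scores_b):
    """Mean difference (bias) and 1.96 * SD of the paired differences."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    return statistics.mean(diffs), 1.96 * statistics.stdev(diffs)

# Hypothetical final tissue calls from two laboratories on the same specimens.
lab_a = ["colon", "breast", "lung", "colon", "ovary", "breast", "lung", "colon"]
lab_b = ["colon", "breast", "lung", "colon", "ovary", "breast", "colon", "colon"]
kappa = cohen_kappa(lab_a, lab_b)

# Hypothetical similarity scores for the same specimens.
scores_a = [92.1, 75.4, 60.3, 88.0, 45.2, 97.5, 70.8, 81.9]
scores_b = [90.4, 78.0, 58.9, 85.5, 49.0, 96.2, 66.1, 83.3]
bias, cr = bland_altman(scores_a, scores_b)

print(f"kappa = {kappa:.3f}, bias = {bias:.2f}, coefficient = {cr:.2f}")
```

Kappa discounts the agreement expected by chance from each laboratory's call frequencies, which is why it is lower than the raw concordance (7 of 8 here).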


Subjects
Neoplasms/diagnosis; Neoplasms/genetics; Oligonucleotide Array Sequence Analysis/methods; Oligonucleotide Array Sequence Analysis/standards; Humans; RNA, Neoplasm/genetics; Reproducibility of Results
6.
Bioinformatics ; 22(2): 245-7, 2006 Jan 15.
Article in English | MEDLINE | ID: mdl-16278240

ABSTRACT

PCP (Pattern Classification Program) is an open-source machine learning program for supervised classification of patterns (vectors of measurements). The principal use of PCP in bioinformatics is the design and evaluation of classifiers for use in clinical diagnostic tests based on measurements of gene expression. PCP implements leading pattern classification and gene selection algorithms and incorporates cross-validation estimation of classifier performance. Importantly, the implementation integrates gene selection and class prediction stages, which is vital for computing reliable performance estimates in small-sample scenarios. Additionally, the program includes automated and efficient model selection (optimization of parameters) for the support vector machine (SVM) classifier. The distribution includes Linux and Windows/Cygwin binaries. The program can easily be ported to other platforms. AVAILABILITY: Free download at http://pcp.sourceforge.net
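
The point about integrating gene selection with class prediction can be illustrated with a small sketch. It is not PCP's code or interface: the nearest-centroid classifier, the mean-difference ranking, the pure-noise data and all names are assumptions chosen for brevity. The `select_inside` flag switches between selecting genes within each training fold (integrated) and selecting them once on all samples before cross-validation (which lets held-out labels leak into the estimate).

```python
import random

def select_genes(X, y, m):
    """Rank genes by absolute difference of class means; return top-m indices."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    def gap(j):
        return abs(sum(r[j] for r in pos) / len(pos) - sum(r[j] for r in neg) / len(neg))
    return sorted(range(len(X[0])), key=gap, reverse=True)[:m]

def centroid(rows, genes):
    """Mean expression of the selected genes over the given samples."""
    return [sum(r[j] for r in rows) / len(rows) for j in genes]

def predict(X_train, y_train, genes, x):
    """Nearest-centroid prediction using only the selected genes."""
    c1 = centroid([r for r, l in zip(X_train, y_train) if l == 1], genes)
    c0 = centroid([r for r, l in zip(X_train, y_train) if l == 0], genes)
    d1 = sum((x[j] - c) ** 2 for j, c in zip(genes, c1))
    d0 = sum((x[j] - c) ** 2 for j, c in zip(genes, c0))
    return 1 if d1 < d0 else 0

def cv_accuracy(X, y, m, v, rng, select_inside):
    """V-fold CV accuracy with gene selection inside or outside the folds."""
    n = len(X)
    idx = list(range(n))
    rng.shuffle(idx)
    folds = [idx[i::v] for i in range(v)]
    genes_all = select_genes(X, y, m)  # sees every label, including held-out ones
    correct = 0
    for fold in folds:
        held = set(fold)
        train = [i for i in range(n) if i not in held]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        genes = select_genes(X_tr, y_tr, m) if select_inside else genes_all
        correct += sum(predict(X_tr, y_tr, genes, X[i]) == y[i] for i in fold)
    return correct / n

# Hypothetical small-sample data: 40 samples, 500 genes of pure noise,
# so no classifier should truly beat chance.
rng = random.Random(0)
y = [i % 2 for i in range(40)]
X = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in y]

acc_integrated = cv_accuracy(X, y, m=10, v=5, rng=random.Random(1), select_inside=True)
acc_leaky = cv_accuracy(X, y, m=10, v=5, rng=random.Random(1), select_inside=False)
print("selection inside folds:", acc_integrated)
print("selection before CV:  ", acc_leaky)
```

On noise data like this, selecting genes before cross-validation tends to give an optimistically high accuracy, while the integrated version tends to stay near chance; the exact numbers depend on the random seed.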


Subjects
Algorithms; Artificial Intelligence; Gene Expression Profiling/methods; Oligonucleotide Array Sequence Analysis/methods; Pattern Recognition, Automated/methods; Software; Programming Languages