Pesquisa | Portal Regional da BVS

1.

Hybrid Stem Cell States: Insights Into the Relationship Between Mammary Development and Breast Cancer Using Single-Cell Transcriptomics.

Thong, Tasha; Wang, Yutong; Brooks, Michael D; Lee, Christopher T; Scott, Clayton; Balzano, Laura; Wicha, Max S; Colacino, Justin A.

Front Cell Dev Biol ; 8: 288, 2020.

Artigo em Inglês | MEDLINE | ID: mdl-32457901

RESUMO

Similarities between stem cells and cancer cells have implicated mammary stem cells in breast carcinogenesis. Recent evidence suggests that normal breast stem cells exist in multiple phenotypic states: epithelial, mesenchymal, and hybrid epithelial/mesenchymal (E/M). Hybrid E/M cells in particular have been implicated in breast cancer metastasis and poor prognosis. Mounting evidence also suggests that stem cell phenotypes change throughout the life course, for example, through embryonic development and pregnancy. The goal of this study was to use single cell RNA-sequencing to quantify cell state distributions of the normal mammary (NM) gland throughout developmental stages and when perturbed into a stem-like state in vitro using conditional reprogramming (CR). Using machine learning based dataset alignment, we integrate multiple mammary gland single cell RNA-seq datasets from human and mouse, along with bulk RNA-seq data from breast tumors in the Cancer Genome Atlas (TCGA), to interrogate hybrid stem cell states in the normal mammary gland and cancer. CR of human mammary cells induces an expanded stem cell state, characterized by increased expression of embryonic stem cell associated genes. Alignment to a mouse single-cell transcriptome atlas spanning mammary gland development from in utero to adulthood revealed that NM cells align to adult mouse cells and CR cells align across the pseudotime trajectory with a stem-like population aligning to the embryonic mouse cells. Three hybrid populations emerge after CR that are rare in NM: KRT18+/KRT14+ (hybrid luminal/basal), EPCAM+/VIM+ (hybrid E/M), and a quadruple positive population, expressing all four markers. Pseudotime analysis and alignment to the mouse developmental trajectory revealed that E/M hybrids are the most developmentally immature. Analyses of single cell mouse mammary RNA-seq throughout pregnancy show that during gestation, there is an enrichment of hybrid E/M cells, suggesting that these cells play an important role in mammary morphogenesis during lactation. Finally, pseudotime analysis and alignment of TCGA breast cancer expression data revealed that breast cancer subtypes express distinct developmental signatures, with basal tumors representing the most "developmentally immature" phenotype. These results highlight phenotypic plasticity of normal mammary stem cells and provide insight into the relationship between hybrid cell populations, stemness, and cancer.

2.

The Consequences of Performance Standards in Need-Based Aid: Evidence from Community Colleges.

Scott-Clayton, Judith; Schudde, Lauren.

J Hum Resour ; 55(4): 1105-1136, 2020 Oct 02.

Artigo em Inglês | MEDLINE | ID: mdl-38464679

RESUMO

Even need-based financial aid programs typically require recipients to meet Satisfactory Academic Progress (SAP) requirements. Using regression discontinuity and difference-in-difference designs, we examine the consequences of failing SAP for community college entrants in one state. We find heterogeneous academic effects in the short term, but, after six years, negative effects on academic and labor market outcomes dominate. Declines in credits attempted are two to three times as large as declines in credits earned, suggesting that SAP may increase aid efficiency. But students themselves are worse off, and the policy exacerbates inequality by pushing out low-income students faster than their higher-income peers.

3.

Estimating Returns to College Attainment: Comparing Survey and State Administrative Data-Based Estimates.

Scott-Clayton, Judith; Wen, Qiao.

Eval Rev ; 43(5): 266-306, 2019 10.

Artigo em Inglês | MEDLINE | ID: mdl-30453755

RESUMO

BACKGROUND: The increasing availability of massive administrative data sets linking postsecondary enrollees with postcollege earnings records has stimulated a wealth of new research on the returns to college and has accelerated state and federal efforts to hold institutions accountable for students' labor market outcomes. Many of these new research and policy efforts rely on state databases limited to postsecondary enrollees who work in the same state postcollege, with limited information regarding family background and precollege ability. OBJECTIVES: In this article, we use recent waves of data from the National Longitudinal Survey of Youth 1997 to provide new, nationally representative, nonexperimental estimates of the returns to degrees, as well as to assess the possible limitations of single-state, administrative data-based estimates. RESEARCH DESIGN: To do this, we explore the sensitivity of estimated returns to college, by testing different sample restrictions, inclusion of different sets of covariates, and alternative ways of treating out-of-state earnings to approximate the real-world limitations of state administrative databases. RESULTS: We find that failure to control for measures of student ability leads to upward bias, while limiting the sample to college enrollees only leads to an understatement of degree returns. On net, these two biases roughly balance out, suggesting that administrative data-based estimates may reasonably approximate true returns. CONCLUSIONS: We conclude with a discussion of the relative advantages and disadvantages of survey versus administrative data for estimating returns to college as well as implications for research and policy efforts based upon single-state administrative databases.

Assuntos

Universidades/estatística & dados numéricos , Adolescente , Escolaridade , Emprego/estatística & dados numéricos , Estudos de Avaliação como Assunto , Humanos , Renda/estatística & dados numéricos , Estudos Longitudinais , Política Pública , Inquéritos e Questionários , Estados Unidos , Universidades/organização & administração

4.

Dictionary-Free MRI PERK: Parameter Estimation via Regression with Kernels.

Nataraj, Gopal; Nielsen, Jon-Fredrik; Scott, Clayton; Fessler, Jeffrey A.

IEEE Trans Med Imaging ; 37(9): 2103-2114, 2018 09.

Artigo em Inglês | MEDLINE | ID: mdl-29994085

RESUMO

This paper introduces a fast, general method for dictionary-free parameter estimation in quantitative magnetic resonance imaging (QMRI) parameter estimation via regression with kernels (PERK). PERK first uses prior distributions and the nonlinear MR signal model to simulate many parameter-measurement pairs. Inspired by machine learning, PERK then takes these parameter-measurement pairs as labeled training points and learns from them a nonlinear regression function using kernel functions and convex optimization. PERK admits a simple implementation as per-voxel nonlinear lifting of MRI measurements followed by linear minimum mean-squared error regression. We demonstrate PERK for $ {\textit {T}_{1}}, {\textit {T}_{2}}$ estimation, a well-studied application where it is simple to compare PERK estimates against dictionary-based grid search estimates and iterative optimization estimates. Numerical simulations as well as single-slice phantom and in vivo experiments demonstrate that PERK and other tested methods produce comparable $ {\textit {T}_{1}}, {\textit {T}_{2}}$ estimates in white and gray matter, but PERK is consistently at least $140\times $ faster. This acceleration factor may increase by several orders of magnitude for full-volume QMRI estimation problems involving more latent parameters per voxel.

Assuntos

Algoritmos , Processamento de Imagem Assistida por Computador/métodos , Aprendizado de Máquina , Imageamento por Ressonância Magnética/métodos , Encéfalo/diagnóstico por imagem , Humanos , Dinâmica não Linear

5.

Disease prediction based on functional connectomes using a scalable and spatially-informed support vector machine.

Watanabe, Takanori; Kessler, Daniel; Scott, Clayton; Angstadt, Michael; Sripada, Chandra.

Neuroimage ; 96: 183-202, 2014 Aug 01.

Artigo em Inglês | MEDLINE | ID: mdl-24704268

RESUMO

Substantial evidence indicates that major psychiatric disorders are associated with distributed neural dysconnectivity, leading to a strong interest in using neuroimaging methods to accurately predict disorder status. In this work, we are specifically interested in a multivariate approach that uses features derived from whole-brain resting state functional connectomes. However, functional connectomes reside in a high dimensional space, which complicates model interpretation and introduces numerous statistical and computational challenges. Traditional feature selection techniques are used to reduce data dimensionality, but are blind to the spatial structure of the connectomes. We propose a regularization framework where the 6-D structure of the functional connectome (defined by pairs of points in 3-D space) is explicitly taken into account via the fused Lasso or the GraphNet regularizer. Our method only restricts the loss function to be convex and margin-based, allowing non-differentiable loss functions such as the hinge-loss to be used. Using the fused Lasso or GraphNet regularizer with the hinge-loss leads to a structured sparse support vector machine (SVM) with embedded feature selection. We introduce a novel efficient optimization algorithm based on the augmented Lagrangian and the classical alternating direction method, which can solve both fused Lasso and GraphNet regularized SVM with very little modification. We also demonstrate that the inner subproblems of the algorithm can be solved efficiently in analytic form by coupling the variable splitting strategy with a data augmentation scheme. Experiments on simulated data and resting state scans from a large schizophrenia dataset show that our proposed approach can identify predictive regions that are spatially contiguous in the 6-D "connectome space," offering an additional layer of interpretability that could provide new insights about various disease processes.

Assuntos

Conectoma/métodos , Imageamento por Ressonância Magnética/métodos , Rede Nervosa/fisiopatologia , Esquizofrenia/diagnóstico , Esquizofrenia/fisiopatologia , Máquina de Vetores de Suporte , Adulto , Feminino , Humanos , Interpretação de Imagem Assistida por Computador/métodos , Masculino , Pessoa de Meia-Idade , Rede Nervosa/patologia , Reprodutibilidade dos Testes , Esquizofrenia/patologia , Sensibilidade e Especificidade , Análise Espaço-Temporal , Adulto Jovem

6.

SCALABLE FUSED LASSO SVM FOR CONNECTOME-BASED DISEASE PREDICTION.

Watanabe, Takanori; Scott, Clayton D; Kessler, Daniel; Angstadt, Michael; Sripada, Chandra S.

Proc IEEE Int Conf Acoust Speech Signal Process ; 2014: 5989-5993, 2014 May.

Artigo em Inglês | MEDLINE | ID: mdl-25892971

RESUMO

There is substantial interest in developing machine-based methods that reliably distinguish patients from healthy controls using high dimensional correlation maps known as functional connectomes (FC's) generated from resting state fMRI. To address the dimensionality of FC's, the current body of work relies on feature selection techniques that are blind to the spatial structure of the data. In this paper, we propose to use the fused Lasso regularized support vector machine to explicitly account for the 6-D structure of the FC (defined by pairs of points in 3-D brain space). In order to solve the resulting nonsmooth and large-scale optimization problem, we introduce a novel and scalable algorithm based on the alternating direction method. Experiments on real resting state scans show that our approach can recover results that are more neuroscientifically informative than previous methods.

7.

A rank-based approach to active diagnosis.

Bellala, Gowtham; Stanley, Jason; Bhavnani, Suresh K; Scott, Clayton.

IEEE Trans Pattern Anal Mach Intell ; 35(9): 2078-90, 2013 Sep.

Artigo em Inglês | MEDLINE | ID: mdl-23868771

RESUMO

The problem of active diagnosis arises in several applications such as disease diagnosis and fault diagnosis in computer networks, where the goal is to rapidly identify the binary states of a set of objects (e.g., faulty or working) by sequentially selecting, and observing, potentially noisy responses to binary valued queries. Previous work in this area chooses queries sequentially based on Information gain, and the object states are inferred by maximum a posteriori (MAP) estimation. In this work, rather than MAP estimation, we aim to rank objects according to their posterior fault probability. We propose a greedy algorithm to choose queries sequentially by maximizing the area under the ROC curve associated with the ranked list. The proposed algorithm overcomes limitations of existing work. When multiple faults may be present, the proposed algorithm does not rely on belief propagation, making it feasible for large scale networks with little loss in performance. When a single fault is present, the proposed algorithm can be implemented without knowledge of the underlying query noise distribution, making it robust to any misspecification of these noise parameters. We demonstrate the performance of the proposed algorithm through experiments on computer networks, a toxic chemical database, and synthetic datasets.

Assuntos

Algoritmos , Diagnóstico por Computador/métodos , Área Sob a Curva , Inteligência Artificial , Teorema de Bayes , Curva ROC

8.

Distributed effects of methylphenidate on the network structure of the resting brain: a connectomic pattern classification analysis.

Sripada, Chandra Sekhar; Kessler, Daniel; Welsh, Robert; Angstadt, Michael; Liberzon, Israel; Phan, K Luan; Scott, Clayton.

Neuroimage ; 81: 213-221, 2013 Nov 01.

Artigo em Inglês | MEDLINE | ID: mdl-23684862

RESUMO

Methylphenidate is a psychostimulant medication that produces improvements in functions associated with multiple neurocognitive systems. To investigate the potentially distributed effects of methylphenidate on the brain's intrinsic network architecture, we coupled resting state imaging with multivariate pattern classification. In a within-subject, double-blind, placebo-controlled, randomized, counterbalanced, cross-over design, 32 healthy human volunteers received either methylphenidate or placebo prior to two fMRI resting state scans separated by approximately one week. Resting state connectomes were generated by placing regions of interest at regular intervals throughout the brain, and these connectomes were submitted for support vector machine analysis. We found that methylphenidate produces a distributed, reliably detected, multivariate neural signature. Methylphenidate effects were evident across multiple resting state networks, especially visual, somatomotor, and default networks. Methylphenidate reduced coupling within visual and somatomotor networks. In addition, default network exhibited decoupling with several task positive networks, consistent with methylphenidate modulation of the competitive relationship between these networks. These results suggest that connectivity changes within and between large-scale networks are potentially involved in the mechanisms by which methylphenidate improves attention functioning.

Assuntos

Encéfalo/efeitos dos fármacos , Estimulantes do Sistema Nervoso Central/farmacologia , Conectoma , Metilfenidato/farmacologia , Adolescente , Adulto , Estudos Cross-Over , Feminino , Humanos , Processamento de Imagem Assistida por Computador , Imageamento por Ressonância Magnética , Masculino , Vias Neurais/efeitos dos fármacos , Descanso , Adulto Jovem

9.

Financial aid policy: lessons from research.

Dynarski, Susan; Scott-Clayton, Judith.

Future Child ; 23(1): 67-91, 2013.

Artigo em Inglês | MEDLINE | ID: mdl-25522646

RESUMO

In the nearly fifty years since the adoption of the Higher Education Act of 1965, financial aid programs have grown in scale, expanded in scope, and multiplied in form. As a result, financial aid has become the norm among college enrollees. Aid now flows not only to traditional college students but also to part-time students, older students, and students who never graduated from high school. Today aid is available not only to low-income students but also to middle- and even high-income families, in the form of grants, subsidized loans, and tax credits. The increasing size and complexity of the nation's student aid system has generated questions about effectiveness, heightened confusion among students and parents, and raised concerns about how program rules may interact. In this article, Susan Dynarski and Judith Scott-Clayton review what is known, and just as important, what is not known, about how well various student aid programs work. The evidence, the authors write, clearly shows that lowering costs can improve college access and completion. But this general rule is not without exception. First, they note, the complexity of program eligibility and delivery appears to moderate the impact of aid on college enrollment and persistence after enrollment. Second, for students who have already decided to enroll, grants that tie financial aid to academic achievement appear to boost college outcomes such as persistence more than do grants with no strings attached. Third, compared with grant aid, relatively little rigorous research has been conducted on the effectiveness of student loans. The paucity of evidence on student loans is particularly problematic both because they represent a large share of student aid overall and because their low cost (relative to grant aid) makes them an attractive option for policy makers. Future research is likely to focus on several issues: the importance of program design and delivery, whether there are unanticipated interactions between programs, and to what extent program effects vary across different types of students. The results of this evidence will be critical, the authors say, as politicians look for ways to control spending.

Assuntos

Educação/economia , Educação/tendências , Política Pública/economia , Política Pública/tendências , Apoio ao Desenvolvimento de Recursos Humanos , Adolescente , Adulto , Análise Custo-Benefício/economia , Análise Custo-Benefício/tendências , Previsões , Humanos , Renda/tendências , Apoio ao Desenvolvimento de Recursos Humanos/economia , Apoio ao Desenvolvimento de Recursos Humanos/tendências , Estados Unidos , Adulto Jovem

10.

Automated analysis of the 12-lead electrocardiogram to identify the exit site of postinfarction ventricular tachycardia.

Yokokawa, Miki; Liu, Tzu-Yu; Yoshida, Kentaro; Scott, Clayton; Hero, Alfred; Good, Eric; Morady, Fred; Bogun, Frank.

Heart Rhythm ; 9(3): 330-4, 2012 Mar.

Artigo em Inglês | MEDLINE | ID: mdl-22001707

RESUMO

BACKGROUND: The value of the 12-lead electrocardiogram (ECG) to identify the exit site of postinfarction ventricular tachycardia (VT) has been questioned. The purpose of this study was to assess the accuracy of a computerized algorithm for identifying a VT exit site on the basis of the 12-lead ECG. METHODS AND RESULTS: In 34 postinfarction patients, pace mapping was performed from within scar tissue. A computerized algorithm that used a supervised learning method (support vector machine) received the digitized pace-map morphologies combined with the pacing sites as training data. No other information (ie, infarct localization, bundle branch block morphology, axis, or R-wave pattern) was used in the algorithm. The training data were validated in 58 VTs in 33 patients. The sizes of 10 different anatomic sections within the heart were determined by using the pace maps as the determining factor. Accuracy was found to be 69% for pace maps, and when 2 adjacent regions were combined, accuracy improved to 88%. Validation of the data in 33 patients showed an accuracy of 71% for localizing a VT exit site to 1 of the 10 regions within the left ventricle. If combined with the best adjacent region, accuracy improved to 88%. The median anatomic size of each section was 21 cm(2). The median spatial resolution of the 12-lead ECG pattern of the pace maps for a particular region was 15 cm(2). CONCLUSION: The 12-lead ECG of postinfarction VT contains localizing information that enables determination of a region of interest in the 10-20 cm(2) range for more than 70% of VT exit sites in a given sector.

Assuntos

Cicatriz/etiologia , Diagnóstico por Computador , Eletrocardiografia/métodos , Infarto do Miocárdio/complicações , Máquina de Vetores de Suporte , Taquicardia Ventricular , Idoso , Cicatriz/patologia , Cicatriz/fisiopatologia , Feminino , Sistema de Condução Cardíaco/patologia , Sistema de Condução Cardíaco/fisiopatologia , Humanos , Masculino , Pessoa de Meia-Idade , Infarto do Miocárdio/patologia , Infarto do Miocárdio/fisiopatologia , Reprodutibilidade dos Testes , Taquicardia Ventricular/diagnóstico , Taquicardia Ventricular/etiologia , Taquicardia Ventricular/fisiopatologia

11.

Spatial Confidence Regions for Quantifying and Visualizing Registration Uncertainty.

Watanabe, Takanori; Scott, Clayton.

Biomed Image Regist Proc ; 7359: 120-130, 2012.

Artigo em Inglês | MEDLINE | ID: mdl-26005720

RESUMO

For image registration to be applicable in a clinical setting, it is important to know the degree of uncertainty in the returned point-correspondences. In this paper, we propose a data-driven method that allows one to visualize and quantify the registration uncertainty through spatially adaptive confidence regions. The method applies to various parametric deformation models and to any choice of the similarity criterion. We adopt the B-spline model and the negative sum of squared differences for concreteness. At the heart of the proposed method is a novel shrinkage-based estimate of the distribution on deformation parameters. We present some empirical evaluations of the method in 2-D using images of the lung and liver, and the method generalizes to 3-D.

12.

Statistical file matching of flow cytometry data.

Lee, Gyemin; Finn, William; Scott, Clayton.

J Biomed Inform ; 44(4): 663-76, 2011 Aug.

Artigo em Inglês | MEDLINE | ID: mdl-21406248

RESUMO

Flow cytometry is a technology that rapidly measures antigen-based markers associated to cells in a cell population. Although analysis of flow cytometry data has traditionally considered one or two markers at a time, there has been increasing interest in multidimensional analysis. However, flow cytometers are limited in the number of markers they can jointly observe, which is typically a fraction of the number of markers of interest. For this reason, practitioners often perform multiple assays based on different, overlapping combinations of markers. In this paper, we address the challenge of imputing the high-dimensional jointly distributed values of marker attributes based on overlapping marginal observations. We show that simple nearest neighbor based imputation can lead to spurious subpopulations in the imputed data and introduce an alternative approach based on nearest neighbor imputation restricted to a cell's subpopulation. This requires us to perform clustering with missing data, which we address with a mixture model approach and novel EM algorithm. Since mixture model fitting may be ill-posed in this context, we also develop techniques to initialize the EM algorithm using domain knowledge. We demonstrate our approach on real flow cytometry data.

Assuntos

Algoritmos , Biologia Computacional/métodos , Citometria de Fluxo/métodos , Análise por Conglomerados , Bases de Dados Factuais , Humanos , Imunofenotipagem/métodos , Leucócitos/química , Leucócitos/classificação , Análise de Componente Principal

13.

The value of defibrillator electrograms for recognition of clinical ventricular tachycardias and for pace mapping of post-infarction ventricular tachycardia.

Yoshida, Kentaro; Liu, Tzu-Yu; Scott, Clayton; Hero, Alfred; Yokokawa, Miki; Gupta, Sanjaya; Good, Eric; Morady, Fred; Bogun, Frank.

J Am Coll Cardiol ; 56(12): 969-79, 2010 Sep 14.

Artigo em Inglês | MEDLINE | ID: mdl-20828650

RESUMO

OBJECTIVES: The purpose of this study was to assess the value of implantable cardioverter-defibrillator (ICD) electrograms (EGMs) in identifying clinically documented ventricular tachycardias (VTs). BACKGROUND: Twelve-lead electrocardiograms (ECG) of spontaneous VT often are not available in patients referred for catheter ablation of post-infarction VT. Many of these patients have ICDs, and the ability of ICD EGMs to identify a specific configuration of VT has not been described. METHODS: In 21 consecutive patients referred for catheter ablation of post-infarction VT, 124 VTs (mean cycle length: 393 ± 103 ms) were induced, and ICD EGMs were recorded during VT. Clinical VT had been documented with 12-lead ECGs in 15 of 21 patients. The 12-lead ECGs of the clinical VTs were compared with 64 different inducible VTs (mean cycle length: 390 ± 91 ms) to assess how well the ICD EGMs differentiated the clinical VTs from the other induced VTs. The exit site of 62 VTs (mean cycle length: 408 ± 112 ms) was identified by pace mapping (10 to 12 of 12 matching leads). The spatial resolution of pace mapping to identify a VT exit site was determined for both the 12-lead ECGs and the ICD EGMs using a customized MATLAB program (version 7.5, The MathWorks, Inc., Natick, Massachusetts). RESULTS: Analysis of stored EGMs by comparison of receiver-operating characteristic curve cutoff values accurately distinguished the clinical VTs from 98% of the other inducible VTs. The mean spatial resolution of a 12-lead ECG pace map for the VT exit site was 2.9 ± 4.0 cm(2) (range 0 to 17.5 cm(2)) compared with 8.9 ± 9.0 cm(2) (range 0 to 35 cm(2)) for ICD EGM pace maps. The spatial resolution of pace mapping varied greatly between patients and between VTs. The spatial resolution of ICD EGMs was < 1.0 cm(2) for ≥ 1 of the target VTs in 12 of 21 patients and 19 of 62 VTs. By visual inspection of the ICD EGMs, 96% of the clinical VTs were accurately differentiated from previously undocumented VTs. CONCLUSIONS: Stored ICD EGMs usually are an accurate surrogate for 12-lead ECGs for differentiating clinical VTs from other VTs. Pace mapping based on ICD EGMs has variable resolution but may be useful for identifying a VT exit site.

Assuntos

Desfibriladores Implantáveis , Eletrocardiografia , Infarto do Miocárdio/complicações , Taquicardia Ventricular/diagnóstico , Idoso , Mapeamento Potencial de Superfície Corporal , Ablação por Cateter , Feminino , Humanos , Masculino , Curva ROC , Taquicardia Ventricular/cirurgia

14.

L2 kernel classification.

Kim, JooSeuk; Scott, Clayton D.

IEEE Trans Pattern Anal Mach Intell ; 32(10): 1822-31, 2010 Oct.

Artigo em Inglês | MEDLINE | ID: mdl-20724759

RESUMO

Nonparametric kernel methods are widely used and proven to be successful in many statistical learning problems. Well-known examples include the kernel density estimate (KDE) for density estimation and the support vector machine (SVM) for classification. We propose a kernel classifier that optimizes the L2 or integrated squared error (ISE) of a "difference of densities." We focus on the Gaussian kernel, although the method applies to other kernels suitable for density estimation. Like a support vector machine (SVM), the classifier is sparse and results from solving a quadratic program. We provide statistical performance guarantees for the proposed L2 kernel classifier in the form of a finite sample oracle inequality and strong consistency in the sense of both ISE and probability of error. A special case of our analysis applies to a previously introduced ISE-based method for kernel density estimation. For dimensionality greater than 15, the basic L2 kernel classifier performs poorly in practice. Thus, we extend the method through the introduction of a natural regularization parameter, which allows it to remain competitive with the SVM in high dimensions. Simulation results for both synthetic and real-world data are presented.

15.

Tuning support vector machines for minimax and Neyman-Pearson classification.

Davenport, Mark A; Baraniuk, Richard G; Scott, Clayton D.

IEEE Trans Pattern Anal Mach Intell ; 32(10): 1888-98, 2010 Oct.

Artigo em Inglês | MEDLINE | ID: mdl-20724764

RESUMO

This paper studies the training of support vector machine (SVM) classifiers with respect to the minimax and Neyman-Pearson criteria. In principle, these criteria can be optimized in a straightforward way using a cost-sensitive SVM. In practice, however, because these criteria require especially accurate error estimation, standard techniques for tuning SVM parameters, such as cross-validation, can lead to poor classifier performance. To address this issue, we first prove that the usual cost-sensitive SVM, here called the 2C-SVM, is equivalent to another formulation called the 2nu-SVM. We then exploit a characterization of the 2nu-SVM parameter space to develop a simple yet powerful approach to error estimation based on smoothing. In an extensive experimental study, we demonstrate that smoothing significantly improves the accuracy of cross-validation error estimates, leading to dramatic performance gains. Furthermore, we propose coordinate descent strategies that offer significant gains in computational efficiency, with little to no loss in performance.

16.

A novel system for rapidly identifying toxic chemicals.

Bhavnani, Suresh K; Ganesan, Arun; Scott, Clayton; Weber, Chris; Saxman, Paul.

AMIA Annu Symp Proc ; : 878, 2008 Nov 06.

Artigo em Inglês | MEDLINE | ID: mdl-18999084

RESUMO

First-responders have a critical need to rapidly identify toxic chemicals during emergencies. However, current systems such as WISER require a large number of inputs before a chemical can be identified. Here we present a novel system which significantly reduces the number of inputs required to identify a toxic chemical.

Assuntos

Dicionários Farmacêuticos como Assunto , Armazenamento e Recuperação da Informação/métodos , Reconhecimento Automatizado de Padrão/métodos , Intoxicação/classificação , Intoxicação/prevenção & controle , Venenos/classificação , Software , Serviços Médicos de Emergência/métodos , Estados Unidos

17.

Semi-parametric differential expression analysis via partial mixture estimation.

Rossell, David; Guerra, Rudy; Scott, Clayton.

Stat Appl Genet Mol Biol ; 7(1): Article15, 2008.

Artigo em Inglês | MEDLINE | ID: mdl-18454730

RESUMO

We develop an approach for microarray differential expression analysis, i.e. identifying genes whose expression levels differ between two or more groups. Current approaches to inference rely either on full parametric assumptions or on permutation-based techniques for sampling under the null distribution. In some situations, however, a full parametric model cannot be justified, or the sample size per group is too small for permutation methods to be valid. We propose a semi-parametric framework based on partial mixture estimation which only requires a parametric assumption for the null (equally expressed) distribution and can handle small sample sizes where permutation methods break down. We develop two novel improvements of Scott's minimum integrated square error criterion for partial mixture estimation [Scott, 2004a,b]. As a side benefit, we obtain interpretable and closed-form estimates for the proportion of EE genes. Pseudo-Bayesian and frequentist procedures for controlling the false discovery rate are given. Results from simulations and real datasets indicate that our approach can provide substantial advantages for small sample sizes over the SAM method of Tusher et al. [2001], the empirical Bayes procedure of Efron and Tibshirani [2002], the mixture of normals of Pan et al. [2003] and a t-test with p-value adjustment [Dudoit et al., 2003] to control the FDR [Benjamini and Hochberg, 1995].

Assuntos

Algoritmos , Modelos Estatísticos , Análise de Sequência com Séries de Oligonucleotídeos , Teorema de Bayes , Simulação por Computador , Expressão Gênica , Tamanho da Amostra

18.

Robust contour matching via the order-preserving assignment problem.

Scott, Clayton; Nowak, Robert.

IEEE Trans Image Process ; 15(7): 1831-8, 2006 Jul.

Artigo em Inglês | MEDLINE | ID: mdl-16830905

RESUMO

A common approach to determining corresponding points on two shapes is to compute the cost of each possible pairing of points and solve the assignment problem (weighted bipartite matching) for the resulting cost matrix. We consider the problem of solving for point correspondences when the shapes of interest are each defined by a single, closed contour. A modification of the standard assignment problem is proposed whereby the correspondences are required to preserve the ordering of the points induced from the shapes' contours. Enforcement of this constraint leads to significantly improved correspondences. Robustness with respect to outliers and shape irregularity is obtained by required only a fraction of feature points to be matched. Furthermore, the minimum matching size may be specified in advance. We present efficient dynamic programming algorithms to solve the proposed optimization problem. Experiments on the Brown and MPEG-7 shape databases demonstrate the effectiveness of the proposed method relative to the standard assignment problem.

Assuntos

Algoritmos , Aumento da Imagem/métodos , Interpretação de Imagem Assistida por Computador/métodos , Armazenamento e Recuperação da Informação/métodos , Reconhecimento Automatizado de Padrão/métodos , Técnica de Subtração , Gravação em Vídeo/métodos , Inteligência Artificial , Gráficos por Computador , Análise Numérica Assistida por Computador , Processamento de Sinais Assistido por Computador

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA