Search | BVS Regional Portal (test)

A longitudinal study of topic classification on Twitter.

Bouadjenek, Mohamed Reda; Sanner, Scott; Iman, Zahra; Xie, Lexing; Shi, Daniel Xiaoliang.

PeerJ Comput Sci ; 8: e991, 2022.

Article in English | MEDLINE | ID: mdl-35721404

ABSTRACT

Twitter represents a massively distributed information source over topics ranging from social and political events to entertainment and sports news. While recent work has suggested this content can be narrowed down to the personalized interests of individual users by training topic filters using standard classifiers, there remain many open questions about the efficacy of such classification-based filtering approaches. For example, over a year or more after training, how well do such classifiers generalize to future novel topical content, and are such results stable across a range of topics? In addition, how robust is a topic classifier over the time horizon, e.g., can a model trained in 1 year be used for making predictions in the subsequent year? Furthermore, what features, feature classes, and feature attributes are most critical for long-term classifier performance? To answer these questions, we collected a corpus of over 800 million English Tweets via the Twitter streaming API during 2013 and 2014 and learned topic classifiers for 10 diverse themes ranging from social issues to celebrity deaths to the "Iran nuclear deal". The results of this long-term study of topic classifier performance provide a number of important insights, among them that: (i) such classifiers can indeed generalize to novel topical content with high precision over a year or more after training though performance degrades with time, (ii) the classes of hashtags and simple terms contain the most informative feature instances, (iii) removing tweets containing training hashtags from the validation set allows better generalization, and (iv) the simple volume of tweets by a user correlates more with their informativeness than their follower or friend count. 
In summary, this work provides a long-term study of topic classifiers on Twitter that further justifies classification-based topical filtering approaches while providing detailed insight into the feature properties most critical for topic classifier performance.
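Finding (iii) above, removing tweets that contain training hashtags from the validation set, can be illustrated with a minimal sketch. The abstract does not describe how the authors implemented this step, so the regular expression, the function names, and the sample tweets below are all assumptions made for illustration only.

```python
import re

# Hypothetical hashtag pattern; real tweet tokenization may differ.
HASHTAG = re.compile(r"#(\w+)")


def hashtags(text):
    """Return the lowercase hashtags (without '#') found in a tweet."""
    return {tag.lower() for tag in HASHTAG.findall(text)}


def drop_training_hashtag_tweets(validation_tweets, training_hashtags):
    """Keep only validation tweets that share no hashtag with the training set."""
    banned = {tag.lower().lstrip("#") for tag in training_hashtags}
    return [t for t in validation_tweets if not (hashtags(t) & banned)]


# Made-up example data, not taken from the study's corpus.
validation = [
    "Talks resume #IranDeal",
    "Sanctions relief discussed in Vienna",
    "#irandeal vote today #Congress",
    "New #nuclear agreement",
]
filtered = drop_training_hashtag_tweets(validation, {"#IranDeal"})
print(filtered)
```

Matching is case-insensitive here, so `#IranDeal` and `#irandeal` are treated as the same hashtag; whether the study made the same choice is not stated in the abstract.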

Evaluation of Machine Learning Algorithms for Predicting Readmission After Acute Myocardial Infarction Using Routinely Collected Clinical Data.

Gupta, Shagun; Ko, Dennis T; Azizi, Paymon; Bouadjenek, Mohamed Reda; Koh, Maria; Chong, Alice; Austin, Peter C; Sanner, Scott.

Can J Cardiol ; 36(6): 878-885, 2020 06.

Article in English | MEDLINE | ID: mdl-32204950

ABSTRACT

BACKGROUND: The ability to predict readmission accurately after hospitalization for acute myocardial infarction (AMI) is limited in current statistical models. Machine-learning (ML) methods have shown improved predictive ability in various clinical contexts, but their utility in predicting readmission after hospitalization for AMI is unknown. METHODS: Using detailed clinical information collected from patients hospitalized with AMI, we evaluated 6 ML algorithms (logistic regression, naïve Bayes, support vector machines, random forest, gradient boosting, and deep neural networks) to predict readmission within 30 days and 1 year of discharge. A nested cross-validation approach was used to develop and test models. We used C-statistics to compare discriminatory capacity, whereas the Brier score was used to indicate overall model performance. Model calibration was assessed using calibration plots. RESULTS: The 30-day readmission rate was 16.3%, whereas the 1-year readmission rate was 45.1%. For 30-day readmission, the discriminative ability for the ML models was modest (C-statistic 0.641; 95% confidence interval (CI), 0.621-0.662 for gradient boosting) and did not outperform previously reported methods. For 1-year readmission, different ML models showed moderate performance, with C-statistics around 0.72. Despite modest discriminatory capabilities, the observed readmission rates were markedly higher in the tenth decile of predicted risk compared with the first decile of predicted risk for both 30-day and 1-year readmission. CONCLUSIONS: Despite including detailed clinical information and evaluating various ML methods, these models did not have better discriminatory ability to predict readmission outcomes compared with previously reported methods.
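The two metrics named in the methods can be sketched in a few lines. This is a plain-Python illustration, not the authors' code: the C-statistic is computed as the fraction of (readmitted, not readmitted) patient pairs in which the readmitted patient received the higher predicted risk (ties count as half), and the Brier score as the mean squared difference between predicted probability and observed outcome. The function names and the toy data are assumptions.

```python
def c_statistic(y_true, y_score):
    """Concordance (equivalent to ROC AUC) for binary outcomes coded 0/1."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0   # positive case ranked above negative case
            elif p == n:
                concordant += 0.5   # ties count as half
    return concordant / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy data, not from the study: 1 = readmitted, 0 = not readmitted.
y_true = [0, 0, 1, 0, 1, 1]
y_prob = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70]

auc = c_statistic(y_true, y_prob)
brier = brier_score(y_true, y_prob)
print(f"C-statistic: {auc:.3f}, Brier score: {brier:.3f}")
```

A C-statistic of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, while a lower Brier score indicates better overall performance. The pairwise loop is quadratic in the number of patients, which is fine for an illustration but not for large cohorts.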

Subjects

Algorithms; Hospitalization/statistics & numerical data; Machine Learning; Myocardial Infarction; Patient Readmission/statistics & numerical data; Canada/epidemiology; Female; Humans; Male; Middle Aged; Myocardial Infarction/epidemiology; Myocardial Infarction/therapy; Predictive Value of Tests; Prognosis; Risk Assessment/methods; Risk Factors; Time Factors

Deep learning-based selection of human sperm with high DNA integrity.

McCallum, Christopher; Riordon, Jason; Wang, Yihe; Kong, Tian; You, Jae Bem; Sanner, Scott; Lagunov, Alexander; Hannam, Thomas G; Jarvi, Keith; Sinton, David.

Commun Biol ; 2: 250, 2019.

Article in English | MEDLINE | ID: mdl-31286067

ABSTRACT

Despite the importance of sperm DNA to human reproduction, currently no method exists to assess individual sperm DNA quality prior to clinical selection. Traditionally, skilled clinicians select sperm based on a variety of morphological and motility criteria, but without direct knowledge of their DNA cargo. Here, we show how a deep convolutional neural network can be trained on a collection of ~1000 sperm cells of known DNA quality, to predict DNA quality from brightfield images alone. Our results demonstrate moderate correlation (bivariate correlation ~0.43) between a sperm cell image and DNA quality and the ability to identify higher DNA integrity cells relative to the median. This deep learning selection process is directly compatible with current, manual microscopy-based sperm selection and could assist clinicians, by providing rapid DNA quality predictions (under 10 ms per cell) and sperm selection within the 86th percentile from a given sample.

Subjects

DNA/analysis; Deep Learning; Spermatozoa/metabolism; Bayes Theorem; Chromatin; DNA Fragmentation; Healthy Volunteers; Humans; Learning Curve; Male; Neural Networks, Computer; Normal Distribution; Semen Analysis/methods; Spermatozoa/pathology

Deep Learning with Microfluidics for Biotechnology.

Riordon, Jason; Sovilj, Dusan; Sanner, Scott; Sinton, David; Young, Edmond W K.

Trends Biotechnol ; 37(3): 310-324, 2019 03.

Article in English | MEDLINE | ID: mdl-30301571

ABSTRACT

Advances in high-throughput and multiplexed microfluidics have rewarded biotechnology researchers with vast amounts of data but not necessarily the ability to analyze complex data effectively. Over the past few years, deep artificial neural networks (ANNs) leveraging modern graphics processing units (GPUs) have enabled the rapid analysis of structured input data - sequences, images, videos - to predict complex outputs with unprecedented accuracy. While there have been early successes in flow cytometry, for example, the extensive potential of pairing microfluidics (to acquire data) and deep learning (to analyze data) to tackle biotechnology challenges remains largely untapped. Here we provide a roadmap to integrating deep learning and microfluidics in biotechnology laboratories that matches computational architectures to problem types, and provide an outlook on emerging opportunities.

Subjects

Biotechnology/methods; Deep Learning; Microfluidics/methods; Biotechnology/trends; Microfluidics/trends

Measuring and Mitigating the Costs of Attentional Switches in Active Network Monitoring for Cybersecurity.

Kortschot, Sean W; Sovilj, Dusan; Jamieson, Greg A; Sanner, Scott; Carrasco, Chelsea; Soh, Harold.

Hum Factors ; 60(7): 962-977, 2018 11.

Article in English | MEDLINE | ID: mdl-29995449

ABSTRACT

OBJECTIVE: The authors seek to characterize the behavioral costs of attentional switches between points in a network map and assess the efficacy of interventions intended to reduce those costs. BACKGROUND: Cybersecurity network operators are tasked with determining an appropriate attentional allocation scheme given the state of the network, which requires repeated attentional switches. These attentional switches may result in temporal performance decrements, during which operators disengage from one attentional fixation point and engage with another. METHOD: We ran two experiments where participants identified a chain of malicious emails within a network. All interactions with the system were logged and analyzed to determine if users experienced disengagement and engagement delays. RESULTS: Both experiments revealed significant costs from attentional switches before (i.e., disengagement) and after (i.e., engagement) participants navigated to a new area in the network. In our second experiment, we found that interventions aimed at contextualizing navigation actions lessened both disengagement and engagement delays. CONCLUSION: Attentional switches are detrimental to operator performance. Their costs can be reduced by design features that contextualize navigations through an interface. APPLICATION: This research can be applied to the identification and mitigation of attentional switching costs in a variety of visual search tasks. Furthermore, it demonstrates the efficacy of noninvasive behavioral monitoring for inferring cognitive events.

Subjects

Attention/physiology; Computer Security; Computer Systems; Psychomotor Performance/physiology; Adult; Female; Humans; Male; Young Adult
