Search | VHL Regional Portal

1.

Genome scans for selection signatures identify candidate virulence genes for adaptation of the soybean cyst nematode to host resistance.

Kwon, Khee Man; Viana, João P G; Walden, Kimberly K O; Usovsky, Mariola; Scaboo, Andrew M; Hudson, Matthew E; Mitchum, Melissa G.

Mol Ecol ; 33(17): e17490, 2024 Sep.

Article in English | MEDLINE | ID: mdl-39135406

ABSTRACT

Plant pathogens are constantly under selection pressure for host resistance adaptation. Soybean cyst nematode (SCN, Heterodera glycines) is a major pest of soybean primarily managed through resistant cultivars; however, SCN populations have evolved virulence in response to selection pressures driven by repeated monoculture of the same genetic resistance. Resistance to SCN is mediated by multiple epistatic interactions between Rhg (for resistance to H. glycines) genes. However, the identity of SCN virulence genes that confer the ability to overcome resistance remains unknown. To identify candidate genomic regions showing signatures of selection for increased virulence, we conducted whole genome resequencing of pooled individuals (Pool-Seq) from two pairs of SCN populations adapted on soybeans with Peking-type (rhg1-a, rhg2, and Rhg4) resistance. Population differentiation and principal component analysis-based approaches identified approximately 0.72-0.79 million SNPs, the frequency of which showed potential selection signatures across multiple genomic regions. Chromosomes 3 and 6 between population pairs showed the greatest density of outlier SNPs with high population differentiation. Conducting multiple outlier detection tests to identify overlapping SNPs resulted in a total of 966 significantly differentiated SNPs, of which 285 exon SNPs were mapped to 97 genes. Of these, six genes encoded members of known stylet-secreted effector protein families potentially involved in host defence modulation including venom-allergen-like, annexin, glutathione synthetase, SPRYSEC, chitinase, and CLE effector proteins. Further functional analysis of identified candidate genes will provide new insights into the genetic mechanisms by which SCN overcomes soybean resistance and inform the development of molecular markers for rapidly screening the virulence profile of an SCN-infested field.

Subject(s)

Disease Resistance , Glycine max , Plant Diseases , Polymorphism, Single Nucleotide , Tylenchoidea , Animals , Glycine max/genetics , Glycine max/parasitology , Polymorphism, Single Nucleotide/genetics , Virulence/genetics , Plant Diseases/parasitology , Plant Diseases/genetics , Disease Resistance/genetics , Tylenchoidea/genetics , Tylenchoidea/pathogenicity , Selection, Genetic , Genetics, Population , Whole Genome Sequencing

2.

Prototyping a Secure and Usable User Authentication Mechanism for Mobile Passenger ID Devices for Land/Sea Border Control.

Papaioannou, Maria; Zachos, Georgios; Mantas, Georgios; Panaousis, Emmanouil; Rodriguez, Jonathan.

Sensors (Basel) ; 24(16), 2024 Aug 11.

Article in English | MEDLINE | ID: mdl-39204888

ABSTRACT

As the number of European Union (EU) visitors grows, implementing novel border control solutions, such as mobile devices for passenger identification for land and sea border control, becomes paramount to ensure the convenience and safety of passengers and officers. However, these devices, handling sensitive personal data, become attractive targets for malicious actors seeking to misuse or steal such data. Therefore, to increase the level of security of such devices without interrupting border control activities, robust user authentication mechanisms are essential. Toward this direction, we propose a risk-based adaptive user authentication mechanism for mobile passenger identification devices for land and sea border control, aiming to enhance device security without hindering usability. In this work, we present a comprehensive assessment of novelty and outlier detection algorithms and discern OneClassSVM, Local Outlier Factor (LOF), and Bayesian_GaussianMixtureModel (B_GMM) novelty detection algorithms as the most effective ones for risk estimation in the proposed mechanism. Furthermore, in this work, we develop the proposed risk-based adaptive user authentication mechanism as an application on a Raspberry Pi 4 Model B device (i.e., playing the role of the mobile device for passenger identification), where we evaluate the detection performance of the three best performing novelty detection algorithms (i.e., OneClassSVM, LOF, and B_GMM), with B_GMM surpassing the others in performance when deployed on the Raspberry Pi 4 device. Finally, we evaluate the risk estimation overhead of the proposed mechanism when the best performing B_GMM novelty detection algorithm is used for risk estimation, indicating efficient operation with minimal additional latency.

3.

Testing Outlier Detection Algorithms for Identifying Early Stage Solute Clusters in Atom Probe Tomography.

Stroud, Ryan S; Al-Saffar, Ayham; Carter, Megan; Moody, Michael P; Pedrazzini, Stella; Wenman, Mark R.

Microsc Microanal ; 2024 Aug 27.

Article in English | MEDLINE | ID: mdl-39189873

ABSTRACT

Atom probe tomography (APT) is commonly used to study solute clustering and precipitation in materials. However, standard techniques used to identify and characterize clusters within atom probe data, such as density-based spatial clustering of applications with noise (DBSCAN), often underperform with respect to small clusters. This is a limitation of density-based cluster identification algorithms, due to their dependence on the parameter Nmin, an arbitrary lower limit placed on detectable cluster sizes. Therefore, this article treats the characterization of clustering in atom probe data as an outlier detection problem: k-nearest neighbors, local outlier factor, and learnable unified neighborhood-based anomaly ranking algorithms were tested against a simulated dataset and compared to the standard method. The decision score output of the algorithms was then automatically thresholded by the Karcher mean to remove human bias. Each of the major models tested outperforms DBSCAN for cluster sizes of <25 atoms but underperforms for sizes >30 atoms using simulated data. However, the new combined k-nearest neighbors (k-NN) and DBSCAN method presented was able to perform well at all cluster sizes. The combined k-NN and DBSCAN method is presented as a new approach to identifying clusters in APT.
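
As a rough illustration of treating cluster identification as an outlier detection problem, the sketch below scores each point by the distance to its k-th nearest neighbour, so that points in locally dense regions receive low scores. It is not the authors' implementation: the function names, the toy coordinates, the choice k = 3, and the fixed threshold are all assumptions made for illustration. In particular, the abstract thresholds decision scores with the Karcher mean, which is not reproduced here.

```python
import math


def kth_neighbor_distance(points, k):
    """Return, for each point, the distance to its k-th nearest neighbour."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


def dense_points(points, k, threshold):
    """Indices of points whose k-th neighbour distance is below the threshold."""
    scores = kth_neighbor_distance(points, k)
    return [i for i, s in enumerate(scores) if s < threshold]


# Hypothetical toy data: four tightly packed "solute" atoms plus sparse matrix atoms.
cluster = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (0.0, 0.0, 0.1)]
matrix = [(5.0, 5.0, 5.0), (-4.0, 3.0, 1.0), (2.0, -6.0, 4.0), (7.0, -2.0, -3.0)]
points = cluster + matrix

# Assumed hand-picked threshold; a real analysis would derive it from the data.
clustered = dense_points(points, k=3, threshold=0.5)
```

With these toy coordinates only the four packed atoms fall below the threshold. The brute-force neighbour search is quadratic in the number of points, so a spatial index would be needed for realistic atom probe datasets.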

4.

Outlier detection for keystroke biometric user authentication.

G Ismail, Mahmoud; Salem, Mohammed A-M; Abd El Ghany, Mohamed A; Aldakheel, Eman Abdullah; Abbas, Safia.

PeerJ Comput Sci ; 10: e2086, 2024.

Article in English | MEDLINE | ID: mdl-38983219

ABSTRACT

User authentication is a fundamental aspect of information security, requiring robust measures against identity fraud and data breaches. In the domain of keystroke dynamics research, a significant challenge lies in the reliance on imposter datasets, particularly evident in real-world scenarios where obtaining authentic imposter data is exceedingly difficult. This article presents a novel approach to keystroke dynamics-based authentication, utilizing unsupervised outlier detection techniques, notably exemplified by the histogram-based outlier score (HBOS), eliminating the necessity for imposter samples. A comprehensive evaluation, comparing HBOS with 15 alternative outlier detection methods, highlights its superior performance. This departure from traditional dependence on imposter datasets signifies a substantial advancement in keystroke dynamics research. Key innovations include the introduction of an alternative outlier detection paradigm with HBOS, increased practical applicability by reducing reliance on extensive imposter data, resolution of real-world challenges in simulating fraudulent keystrokes, and addressing critical gaps in existing authentication methodologies. Rigorous testing on Carnegie Mellon University's (CMU) keystroke biometrics dataset validates the effectiveness of the proposed approach, yielding an impressive equal error rate (EER) of 5.97%, a notable area under the ROC curve of 97.79%, and a robust accuracy (ACC) of 89.23%. This article represents a significant advancement in keystroke dynamics-based authentication, offering a reliable and efficient solution characterized by substantial improvements in accuracy and practical applicability.
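
To make the idea of scoring without imposter samples concrete, here is a minimal sketch of a histogram-based outlier score trained only on genuine-user samples. It assumes equal-width bins, independent features, and a small floor `eps` for empty or out-of-range bins. The feature values (key hold time and inter-key latency in milliseconds) are hypothetical, and none of this is taken from the article's code or from the CMU dataset.

```python
import math


def fit_histograms(samples, n_bins=5):
    """Build one equal-width histogram per feature from genuine-user samples."""
    models = []
    for feature in zip(*samples):
        lo, hi = min(feature), max(feature)
        width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for constant features
        counts = [0] * n_bins
        for v in feature:
            counts[min(int((v - lo) / width), n_bins - 1)] += 1
        peak = max(counts)
        heights = [c / peak for c in counts]  # normalise so the tallest bin is 1
        models.append((lo, width, heights))
    return models


def hbos_score(models, sample, eps=1e-3):
    """Sum of log(1 / bin height) over features; higher means more anomalous."""
    score = 0.0
    for (lo, width, heights), v in zip(models, sample):
        idx = math.floor((v - lo) / width)
        height = heights[idx] if 0 <= idx < len(heights) else 0.0
        score += math.log(1.0 / max(height, eps))
    return score


# Hypothetical genuine-user timings: (hold time ms, inter-key latency ms).
genuine = [(100, 200), (105, 210), (98, 195), (102, 205),
           (101, 199), (99, 202), (103, 207), (100, 201)]
models = fit_histograms(genuine)

genuine_score = hbos_score(models, (101, 203))   # typical of the enrolled user
imposter_score = hbos_score(models, (160, 320))  # far outside the enrolled range
```

An authentication decision would then compare the score against a threshold chosen on held-out genuine data. How that threshold is set is left open here.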

5.

Multiparametric identification of putative senescent cells in skeletal muscle via mass cytometry.

Li, Yijia; Baig, Nameera; Roncancio, Daniel; Elbein, Kris; Lowe, Dawn; Kyba, Michael; Arriaga, Edgar A.

Cytometry A ; 105(8): 580-594, 2024 Aug.

Article in English | MEDLINE | ID: mdl-38995093

ABSTRACT

Senescence is an irreversible arrest of the cell cycle that can be characterized by markers of senescence such as p16, p21, and KI-67. The characterization of different senescence-associated phenotypes requires selection of the most relevant senescence markers to define reliable cytometric methodologies. Mass cytometry (a.k.a. Cytometry by time of flight, CyTOF) can monitor up to 40 different cell markers at the single-cell level and has the potential to integrate multiple senescence and other phenotypic markers to identify senescent cells within a complex tissue such as skeletal muscle, with greater accuracy and scalability than traditional bulk measurements and flow cytometry-based measurements. This article introduces an analysis framework for detecting putative senescent cells based on clustering, outlier detection, and Boolean logic for outliers. Results show that the pipeline can identify putative senescent cells in skeletal muscle with well-established markers such as p21 and potential markers such as GAPDH. It was also found that heterogeneity of putative senescent cells in skeletal muscle can partly be explained by their cell type. Additionally, autophagy-related proteins ATG4A, LRRK2, and GLB1 were identified as important proteins in predicting the putative senescent population, providing insights into the association between autophagy and senescence. It was observed that sex did not affect the proportion of putative senescent cells among total cells. However, age did have an effect, with a higher proportion observed in fibro/adipogenic progenitors (FAPs), satellite cells, M1 and M2 macrophages from old mice. Moreover, putative senescent cells from muscle of old and young mice show different expression levels of senescence-related proteins, with putative senescent cells of old mice having higher levels of p21 and GAPDH, whereas putative senescent cells of young mice had higher levels of IL-6. 
Overall, the analysis framework prioritizes multiple senescence-associated proteins to characterize putative senescent cells sourced from tissue made of different cell types.

Subject(s)

Biomarkers , Cellular Senescence , Flow Cytometry , Muscle, Skeletal , Animals , Cellular Senescence/physiology , Mice , Muscle, Skeletal/cytology , Muscle, Skeletal/metabolism , Flow Cytometry/methods , Biomarkers/metabolism , Female , Male , Mice, Inbred C57BL , Cyclin-Dependent Kinase Inhibitor p21/metabolism , Single-Cell Analysis/methods

6.

Identifying dysregulated regions in amyotrophic lateral sclerosis through chromatin accessibility outliers.

Çelik, Muhammed Hasan; Gagneur, Julien; Lim, Ryan G; Wu, Jie; Thompson, Leslie M; Xie, Xiaohui.

HGG Adv ; 5(3): 100318, 2024 Jul 18.

Article in English | MEDLINE | ID: mdl-38872308

ABSTRACT

The high heritability of amyotrophic lateral sclerosis (ALS) contrasts with its low molecular diagnosis rate post-genetic testing, pointing to potential undiscovered genetic factors. To aid the exploration of these factors, we introduced EpiOut, an algorithm to identify chromatin accessibility outliers, that is, regions exhibiting divergent accessibility from the population baseline in a single or few samples. Annotation of accessible regions with histone chromatin immunoprecipitation sequencing and Hi-C indicates that outliers are concentrated in functional loci, especially among promoters interacting with active enhancers. Across different omics levels, outliers are robustly replicated, and chromatin accessibility outliers are reliable predictors of gene expression outliers and aberrant protein levels. When promoter accessibility does not align with gene expression, our results indicate that molecular aberrations are more likely to be linked to post-transcriptional regulation than to transcriptional regulation. Our findings demonstrate that the outlier detection paradigm can uncover dysregulated regions in rare diseases. EpiOut is available at github.com/uci-cbcl/EpiOut.

Subject(s)

Amyotrophic Lateral Sclerosis , Chromatin , Amyotrophic Lateral Sclerosis/genetics , Amyotrophic Lateral Sclerosis/metabolism , Humans , Chromatin/metabolism , Chromatin/genetics , Promoter Regions, Genetic/genetics , Algorithms , Gene Expression Regulation , Chromatin Immunoprecipitation Sequencing , Histones/metabolism , Histones/genetics

7.

Chemoinformatic regression methods and their applicability domain.

Dutschmann, Thomas-Martin; Schlenker, Valerie; Baumann, Knut.

Mol Inform ; 43(7): e202400018, 2024 Jul.

Article in English | MEDLINE | ID: mdl-38803302

ABSTRACT

The growing interest in chemoinformatic model uncertainty calls for a summary of the most widely used regression techniques and how to estimate their reliability. Regression models learn a mapping from the space of explanatory variables to the space of continuous output values. Among other limitations, the predictive performance of the model is restricted by the training data used for model fitting. Identification of unusual objects by outlier detection methods can improve model performance. Additionally, proper model evaluation necessitates defining the limitations of the model, often called the applicability domain. Comparable to certain classifiers, some regression techniques come with built-in methods or augmentations to quantify their (un)certainty, while others rely on generic procedures. The theoretical background of their working principles and how to deduce specific and general definitions for their domain of applicability shall be explained.

Subject(s)

Cheminformatics , Cheminformatics/methods , Regression Analysis

8.

SEAOP: a statistical ensemble approach for outlier detection in quantitative proteomics data.

Huang, Jinze; Zhao, Yang; Meng, Bo; Lu, Ao; Wei, Yaoguang; Dong, Lianhua; Fang, Xiang; An, Dong; Dai, Xinhua.

Brief Bioinform ; 25(3), 2024 Mar 27.

Article in English | MEDLINE | ID: mdl-38557674

ABSTRACT

Quality control in quantitative proteomics is a persistent challenge, particularly in identifying and managing outliers. Unsupervised learning models, which rely on data structure rather than predefined labels, offer potential solutions. However, without clear labels, their effectiveness might be compromised. Single models are susceptible to the randomness of parameters and initialization, which can result in a high rate of false positives. Ensemble models, on the other hand, have shown capabilities in effectively mitigating the impacts of such randomness and assisting in accurately detecting true outliers. Therefore, we introduced SEAOP, a Python toolbox that utilizes an ensemble mechanism by integrating multi-round data management and a statistics-based decision pipeline with multiple models. Specifically, SEAOP uses multi-round resampling to create diverse sub-data spaces and employs outlier detection methods to identify candidate outliers in each space. Candidates are then aggregated as confirmed outliers via a chi-square test, adhering to a 95% confidence level, to ensure the precision of the unsupervised approaches. Additionally, SEAOP introduces a visualization strategy, specifically designed to intuitively and effectively display the distribution of both outlier and non-outlier samples. Optimal hyperparameter models of SEAOP for outlier detection were identified by using a gradient-simulated standard dataset and Mann-Kendall trend test. The performance of the SEAOP toolbox was evaluated using three experimental datasets, confirming its reliability and accuracy in handling quantitative proteomics.

Subject(s)

Data Management , Proteomics , Reproducibility of Results , Quality Control , Data Interpretation, Statistical

9.

Kalman filter with impulse noised outliers: a robust sequential algorithm to filter data with a large number of outliers.

Cloez, Bertrand; Fontez, Bénédicte; González-García, Eliel; Sanchez, Isabelle.

Int J Biostat ; 2024 Apr 17.

Article in English | MEDLINE | ID: mdl-38625678

ABSTRACT

Impulse noised outliers are data points that differ significantly from other observations. They are generally removed from the data set through local regression or the Kalman filter algorithm. However, these methods, or their generalizations, are not well suited when the number of outliers is of the same order as the number of low-noise data (often called nominal measurements). In this article, we propose a new model for impulse noised outliers. It is based on a hierarchical model and a simple linear Gaussian process, as with the Kalman filter. We present a fast forward-backward algorithm that filters and smooths sequential data and also detects these outliers. We compare the robustness and efficiency of this algorithm with classical methods. Finally, we apply this method to a real data set from a Walk Over Weighing system containing around 60% outliers. For this application, we further develop an (explicit) EM algorithm to calibrate some algorithm parameters.
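
For orientation, the following sketch shows the kind of classical baseline the abstract contrasts itself with: a scalar Kalman filter for a random-walk state that skips the update whenever the innovation is implausibly large. It is not the hierarchical model or the forward-backward algorithm proposed in the article. The noise variances `q` and `r`, the gate of three standard deviations, and the weight readings are assumptions chosen for illustration.

```python
def kalman_filter_with_gate(measurements, q=0.01, r=0.25, gate=3.0, x0=0.0, p0=1.0):
    """Scalar Kalman filter that flags and skips measurements failing an innovation gate."""
    x, p = x0, p0
    estimates, outlier_flags = [], []
    for z in measurements:
        p += q                       # predict step for a random-walk state
        innovation = z - x
        s = p + r                    # innovation variance
        is_outlier = innovation * innovation > gate * gate * s
        if not is_outlier:
            k = p / s                # Kalman gain
            x += k * innovation
            p *= (1.0 - k)
        estimates.append(x)
        outlier_flags.append(is_outlier)
    return estimates, outlier_flags


# Hypothetical weighing readings (kg) with two impulse outliers.
readings = [50.2, 49.8, 50.1, 80.0, 50.0, 49.9, 20.0, 50.3]
estimates, flags = kalman_filter_with_gate(readings, x0=50.0)
```

A gate like this depends on the filter already tracking the nominal signal, which is why it tends to break down when outliers are as frequent as nominal measurements, the regime the article targets.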

10.

Outlier detection in spatial error models using modified thresholding-based iterative procedure for outlier detection approach.

Cai, Jiaxin; Hu, Weiwei; Yang, Yuhui; Yan, Hong; Chen, Fangyao.

BMC Med Res Methodol ; 24(1): 89, 2024 Apr 15.

Article in English | MEDLINE | ID: mdl-38622516

ABSTRACT

BACKGROUND: Outliers, data points that significantly deviate from the norm, can have a substantial impact on statistical inference and provide valuable insights in data analysis. Multiple methods have been developed for outlier detection; however, almost all available approaches fail to consider the spatial dependence and heterogeneity in spatial data. Spatial data has diverse formats and semantics, requiring specialized outlier detection methodology to handle these unique properties. Currently, limited research exists on robust spatial outlier detection methods designed specifically under the spatial error model (SEM) structure. METHOD: We propose the Spatial-Θ-Iterative Procedure for Outlier Detection (Spatial-Θ-IPOD), which utilizes a mean-shift vector to identify outliers within the SEM. Our method enables an effective detection of spatial outliers while also providing robust coefficient estimates. To assess the performance of our approach, we conducted extensive simulations and applied it to a real-world empirical study using life expectancy data from multiple countries. RESULTS: Simulation results showed that the masking and JD (Joint Detection) indicators of our Spatial-Θ-IPOD method outperformed several commonly used methods, even in high-dimensional scenarios, demonstrating stable performance. Conversely, the Θ-IPOD method proved to be ineffective in detecting outliers when spatial correlation was present. Moreover, our model successfully provided reliable coefficient estimation alongside outlier detection. The proposed method consistently outperformed other models (both robust and non-robust) in most cases. In the empirical study, our proposed model successfully detected outliers and provided valuable insights in the modeling process. CONCLUSIONS: Our proposed Spatial-Θ-IPOD offers an effective solution for detecting spatial outliers for SEM while providing robust coefficient estimates. 
Notably, our approach showcases its relative superiority even in the presence of high leverage points. By successfully identifying outliers, our method enhances the overall understanding of the data and provides valuable insights for further analysis.

11.

Identification of near-infrared characteristic bands of small bowel necrosis based on cellwise detection algorithm.

Peng, Chenxi; Huang, Guangzao; Chen, Xiaojing; Xie, Zhonghao; Ali, Shujat; Chen, Xi; Nie, Huagui; Yang, Zhi; Zhu, Libin; Chen, Xiaoqing; Yan, Shubin.

J Biophotonics ; 17(6): e202300438, 2024 Jun.

Article in English | MEDLINE | ID: mdl-38468556

ABSTRACT

The near-infrared spectroscopy is often used to distinguish small bowel necrosis due to necrotizing enterocolitis (NEC). The characteristic bands of small bowel necrosis, as an important basis for evaluating the confidence of the differentiation results, are challenging to identify quickly. In this study, we proposed to identify characteristic bands of lesion samples based on hyperspectral imaging (HSI) and cellwise outlier detection. Rabbits were used as an animal model to simulate the clinical symptoms of NEC. The rabbits were detected at intervals of 10, 30, 60, and 90 min. The characteristic bands were identified within the same rabbit, between different rabbits and at different times. The result showed the bands near 763 nm, corresponding to the absorption peak of deoxyhemoglobin, were the characteristic bands separating samples with NEC. The identification result was plausible because hypoxia was the main cause of NEC. The method was easy to perform.

Subject(s)

Algorithms , Enterocolitis, Necrotizing , Intestine, Small , Necrosis , Spectroscopy, Near-Infrared , Animals , Rabbits , Intestine, Small/pathology , Intestine, Small/diagnostic imaging , Enterocolitis, Necrotizing/pathology , Enterocolitis, Necrotizing/diagnostic imaging , Hyperspectral Imaging

12.

General value functions for fault detection in multivariate time series data.

Wong, Andy; Taghian Jazi, Mehran; Takeuchi, Tomoharu; Günther, Johannes; Zaïane, Osmar.

Front Robot AI ; 11: 1214043, 2024.

Article in English | MEDLINE | ID: mdl-38544745

ABSTRACT

One of the greatest challenges to the automated production of goods is equipment malfunction. Ideally, machines should be able to automatically predict and detect operational faults in order to minimize downtime and plan for timely maintenance. While traditional condition-based maintenance (CBM) involves costly sensor additions and engineering, machine learning approaches offer the potential to learn from already existing sensors. Implementations of data-driven CBM typically use supervised and semi-supervised learning to classify faults. In addition to a large collection of operation data, records of faulty operation are also necessary, which are often costly to obtain. Instead of classifying faults, we use an approach to detect abnormal behaviour within the machine's operation. This approach is analogous to semi-supervised anomaly detection in machine learning (ML), with important distinctions in experimental design and evaluation specific to the problem of industrial fault detection. We present a novel method of machine fault detection using temporal-difference learning and General Value Functions (GVFs). Using GVFs, we form a predictive model of sensor data to detect faulty behaviour. As sensor data from machines is not i.i.d. but closer to Markovian sampling, temporal-difference learning methods should be well suited for this data. We compare our GVF outlier detection (GVFOD) algorithm to a broad selection of multivariate and temporal outlier detection methods, using datasets collected from a tabletop robot emulating the movement of an industrial actuator. We find that not only does GVFOD achieve the same recall score as other multivariate OD algorithms, it attains significantly higher precision. Furthermore, GVFOD has intuitive hyperparameters which can be selected based upon expert knowledge of the application. 
Together, these findings allow for a more reliable detection of abnormal machine behaviour to allow ideal timing of maintenance; saving resources, time and cost.

13.

Knowledge-based quality assurance of a comprehensive set of organ at risk contours for head and neck radiotherapy.

Brooks, Jamison; Tryggestad, Erik; Anand, Aman; Beltran, Chris; Foote, Robert; Lucido, J John; Laack, Nadia N; Routman, David; Patel, Samir H; Seetamsetty, Srinivas; Moseley, Douglas.

Front Oncol ; 14: 1295251, 2024.

Article in English | MEDLINE | ID: mdl-38487718

ABSTRACT

Introduction: Manual review of organ at risk (OAR) contours is crucial for creating safe radiotherapy plans but can be time-consuming and error prone. Statistical and deep learning models show the potential to automatically detect improper contours by identifying outliers using large sets of acceptable data (knowledge-based outlier detection) and may be able to assist human reviewers during review of OAR contours. Methods: This study developed an automated knowledge-based outlier detection method and assessed its ability to detect erroneous contours for all common head and neck (HN) OAR types used clinically at our institution. We utilized 490 accurate CT-based HN structure sets from unique patients, each with forty-two HN OAR contours when anatomically present. The structure sets were distributed as 80% for training, 10% for validation, and 10% for testing. In addition, 190 and 37 simulated contours containing errors were added to the validation and test sets, respectively. Single-contour features, including location, shape, orientation, volume, and CT number, were used to train three single-contour feature models (z-score, Mahalanobis distance [MD], and autoencoder [AE]). Additionally, a novel contour-to-contour relationship (CCR) model was trained using the minimum distance and volumetric overlap between pairs of OAR contours to quantify overlap and separation. Inferences from single-contour feature models were combined with the CCR model inferences and inferences evaluating the number of disconnected parts in a single contour and then compared. Results: In the test dataset, before combination with the CCR model, the area under the curve values were 0.922/0.939/0.939 for the z-score, MD, and AE models respectively for all contours. After combination with CCR model inferences, the z-score, MD, and AE had sensitivities of 0.838/0.892/0.865, specificities of 0.922/0.907/0.887, and balanced accuracies (BA) of 0.880/0.900/0.876 respectively. 
In the validation dataset, with similar overall performance and no signs of overfitting, model performance for individual OAR types was assessed. The combined AE model demonstrated minimum, median, and maximum BAs of 0.729, 0.908, and 0.980 across OAR types. Discussion: Our novel knowledge-based method combines models utilizing single-contour and CCR features to effectively detect erroneous OAR contours across a comprehensive set of 42 clinically used OAR types for HN radiotherapy.
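
The balanced accuracies quoted above can be checked against the reported sensitivities and specificities, since balanced accuracy for a binary decision is the mean of the two. The short sketch below does that arithmetic. The dictionary layout and function name are illustrative, and the input figures are the test-set values from the abstract.

```python
def balanced_accuracy(sensitivity, specificity):
    """Balanced accuracy is the mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2.0


# (sensitivity, specificity) after combination with the CCR model, as reported.
reported = {
    "z-score": (0.838, 0.922),
    "MD": (0.892, 0.907),
    "AE": (0.865, 0.887),
}

recomputed = {name: balanced_accuracy(se, sp) for name, (se, sp) in reported.items()}
```

The recomputed values agree with the reported 0.880, 0.900, and 0.876 to within rounding of the published three-decimal inputs.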

14.

Bearing fault detection by using graph autoencoder and ensemble learning.

Wang, Meng; Yu, Jiong; Leng, Hongyong; Du, Xusheng; Liu, Yiran.

Sci Rep ; 14(1): 5206, 2024 Mar 03.

Article in English | MEDLINE | ID: mdl-38433237

ABSTRACT

The research and application of bearing fault diagnosis techniques are crucial for enhancing equipment reliability, extending bearing lifespan, and reducing maintenance expenses. Nevertheless, most existing methods encounter challenges in discriminating between signals from machines operating under normal and faulty conditions, leading to unstable detection results. To tackle this issue, the present study proposes a novel approach for bearing fault detection based on graph neural networks and ensemble learning. Our key contribution is a novel stochasticity-based compositional method that transforms Euclidean-structured data into a graph format for processing by graph neural networks, with feature fusion and a newly proposed ensemble learning strategy for outlier detection specifically designed for bearing fault diagnosis. This approach marks a significant advancement in accurately identifying bearing faults, highlighting our study's pivotal role in enhancing diagnostic methodologies.

15.

Sources of Variance in Human Tear Proteomic Samples: Statistical Evaluation, Quality Control, Normalization, and Biological Insight.

Bruszel, Bella; Tóth-Molnár, Edit; Janáky, Tamás; Szabó, Zoltán.

Int J Mol Sci ; 25(3), 2024 Jan 26.

Article in English | MEDLINE | ID: mdl-38338841

ABSTRACT

Human tear fluid contains numerous compounds, which are present in highly variable amounts owing to the dynamic and multipurpose functions of tears. A better understanding of the level and sources of variance is essential for determining the functions of the different tear components and the limitations of tear samples as a potential biomarker source. In this study, a quantitative proteomic method was used to analyze variations in the tear protein profiles of healthy volunteers. High day-to-day and inter-eye personal variances were observed in the tear volumes, protein content, and composition of the tear samples. Several normalization and outlier exclusion approaches were evaluated to decrease variances. Despite the intrapersonal variances, statistically significant differences and cluster analysis revealed that proteome profile and immunoglobulin composition of tear fluid present personal characteristics. Using correlation analysis, we could identify several correlating protein clusters, mainly related to the source of the proteins. Our study is the first attempt to achieve more insight into the biochemical background of human tears by statistical evaluation of the experimentally observed dynamic behavior of the tear proteome. As a pilot study for determination of personal protein profiles of the tear fluids of individual patients, it contributes to the application of this noninvasively collectible body fluid in personal medicine.

Subject(s)

Proteome , Proteomics , Humans , Proteome/metabolism , Proteomics/methods , Pilot Projects , Tears/metabolism , Eye Proteins/metabolism , Quality Control

16.

Dimension reduction and outlier detection of 3-D shapes derived from multi-organ CT images.

Selle, Michael; Kircher, Magdalena; Schwennen, Cornelia; Visscher, Christian; Jung, Klaus.

BMC Med Inform Decis Mak ; 24(1): 49, 2024 Feb 14.

Article in English | MEDLINE | ID: mdl-38355504

ABSTRACT

BACKGROUND: Unsupervised clustering and outlier detection are important in medical research to understand the distributional composition of a collective of patients. A number of clustering methods exist, including methods for high-dimensional data after dimension reduction. Clustering and outlier detection may, however, become less robust or contradictory if multiple high-dimensional data sets per patient exist. Such a scenario arises when the focus is on 3-D data of multiple organs per patient and a high-dimensional feature matrix is extracted per organ. METHODS: We use principal component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE) and multiple co-inertia analysis (MCIA) combined with bagplots to study the distribution of multi-organ 3-D data taken by computed tomography scans. After point-set registration of multiple organs from two public data sets, several hundred shape features are extracted per organ. While PCA and t-SNE can only be applied to each organ individually, MCIA can project the data of all organs into the same low-dimensional space. RESULTS: MCIA is the only approach considered here with which data of all organs can be projected into the same low-dimensional space. We studied how frequently (i.e., by how many organs) a patient was classified as belonging to the inner or outer 50% of the population, or as an outlier. Outliers could only be detected with MCIA and PCA. MCIA and t-SNE were more robust than PCA in judging the distributional location of a patient. CONCLUSIONS: MCIA is more appropriate and robust in judging the distributional location of a patient in the case of multiple high-dimensional data sets per patient. It is still advisable to apply PCA or t-SNE in parallel to MCIA to study the location of individual organs.

Subject(s)

Algorithms , Tomography, X-Ray Computed , Humans , Cluster Analysis , Principal Component Analysis

17.

Communication Delay Outlier Detection and Compensation for Teleoperation Using Stochastic State Estimation.

Kim, Eugene; Hwang, Myeonghwan; Lim, Taeyoon; Jeong, Chanyeong; Yoon, Seungha; Cha, Hyunrok.

Sensors (Basel) ; 24(4), 2024 Feb 15.

Article in English | MEDLINE | ID: mdl-38400399

ABSTRACT

There have been numerous studies attempting to overcome the limitations of current autonomous driving technologies. However, there is no doubt that it is challenging to promise integrity of safety regarding urban driving scenarios and dynamic driving environments. Among the reported countermeasures to supplement the uncertain behavior of autonomous vehicles, teleoperation of the vehicle has been introduced to deal with the disengagement of autonomous driving. However, teleoperation can lead the vehicle to unforeseen and hazardous situations from the viewpoint of wireless communication stability. In particular, communication delay outliers that severely deviate from the passive communication delay should be highlighted because they could hamper the cognition of the circumstances monitored by the teleoperator, or the control signal could be contaminated regardless of the teleoperator's intention. In this study, communication delay outliers were detected and classified based on the stochastic approach (passive delays and outliers were estimated as 98.67% and 1.33%, respectively). Results indicate that communication delay outliers can be automatically detected, independently of the real-time quality of wireless communication stability. Moreover, the proposed framework demonstrates resilience against outliers, thereby mitigating potential performance degradation.

18.

Modern Subtype Classification and Outlier Detection Using the Attention Embedder to Transform Ovarian Cancer Diagnosis.

Nobel, S M Nuruzzaman; Swapno, S M Masfequier Rahman; Hossain, Md Ashraful; Safran, Mejdl; Alfarhood, Sultan; Kabir, Md Mohsin; Mridha, M F.

Tomography ; 10(1): 105-132, 2024 Jan 15.

Article in English | MEDLINE | ID: mdl-38250956

ABSTRACT

Ovarian cancer, a deadly female reproductive system disease, is a significant challenge in medical research due to its notorious lethality. Addressing ovarian cancer in the current medical landscape has become more complex than ever. This research explores the complex field of Ovarian Cancer Subtype Classification and the crucial task of Outlier Detection, driven by a progressive automated system, as the need to fight this unforgiving illness becomes critical. This study primarily uses a unique dataset painstakingly selected from 20 esteemed medical institutes. The dataset includes a wide range of images, such as tissue microarray (TMA) images at 40× magnification and whole-slide images (WSI) at 20× magnification. The research is fully committed to identifying abnormalities within this complex environment, going beyond the classification of subtypes of ovarian cancer. We proposed a new Attention Embedder, a state-of-the-art model with effective results in ovarian cancer subtype classification and outlier detection. Using images magnified WSI, the model demonstrated an astonishing 96.42% training accuracy and 95.10% validation accuracy. Similarly, with images magnified via a TMA, the model performed well, obtaining a validation accuracy of 94.90% and a training accuracy of 93.45%. Our fine-tuned hyperparameter testing resulted in exceptional performance on independent images. At 20× magnification, we achieved an accuracy of 93.56%. Even at 40× magnification, our testing accuracy remained high, at 91.37%. This study highlights how machine learning can revolutionize the medical field's ability to classify ovarian cancer subtypes and identify outliers, giving doctors a valuable tool to lessen the severe effects of the disease. Adopting this novel method is likely to improve the practice of medicine and give people living with ovarian cancer worldwide hope.

Subject(s)

Ovarian Neoplasms , Physicians , Female , Humans , Ovarian Neoplasms/diagnostic imaging , Machine Learning

19.

Automatic recording of rare behaviors of wild animals using video bio-loggers with on-board light-weight outlier detector.

Tanigaki, Kei; Otsuka, Ryoma; Li, Aiyi; Hatano, Yota; Wei, Yuanzhou; Koyama, Shiho; Yoda, Ken; Maekawa, Takuya.

PNAS Nexus ; 3(1): pgad447, 2024 Jan.

Article in English | MEDLINE | ID: mdl-38229952

ABSTRACT

Rare behaviors displayed by wild animals can generate new hypotheses; however, observing such behaviors may be challenging. While recent technological advancements, such as bio-loggers, may assist in documenting rare behaviors, the limited running time of battery-powered bio-loggers is insufficient to record rare behaviors when employing high-cost sensors (e.g. video cameras). In this study, we propose an artificial intelligence (AI)-enabled bio-logger that automatically detects outlier readings from always-on low-cost sensors, e.g. accelerometers, indicative of rare behaviors in target animals, without supervision by researchers, subsequently activating high-cost sensors to record only these behaviors. We implemented an on-board outlier detector via knowledge distillation by building a lightweight outlier classifier supervised by a high-cost outlier behavior detector trained in an unsupervised manner. The efficacy of AI bio-loggers has been demonstrated on seabirds, where videos and sensor data captured by the bio-loggers have enabled the identification of some rare behaviors, facilitating analyses of their frequency, and potential factors underlying these behaviors. This approach offers a means of documenting previously overlooked rare behaviors, augmenting our understanding of animal behavior.

20.

Detecting outliers beyond tolerance limits derived from statistical process control in patient-specific quality assurance.

Tan, Hong Qi; Lew, Kah Seng; Wong, Yun Ming; Chong, Wen Chuan; Koh, Calvin Wei Yang; Chua, Clifford Ghee Ann; Yeap, Ping Lin; Ang, Khong Wei; Lee, James Cheow Lei; Park, Sung Yong.

J Appl Clin Med Phys ; 25(2): e14154, 2024 Feb.

Article in English | MEDLINE | ID: mdl-37683120

ABSTRACT

BACKGROUND: Tolerance limit is defined on pre-treatment patient specific quality assurance results to identify "out of the norm" dose discrepancy in plan. An out-of-tolerance plan during measurement can often cause treatment delays especially if replanning is required. In this study, we aim to develop an outlier detection model to identify out-of-tolerance plan early during treatment planning phase to mitigate the above-mentioned risks. METHODS: Patient-specific quality assurance results with portal dosimetry for stereotactic body radiotherapy measured between January 2020 and December 2021 were used in this study. Data were divided into thorax and pelvis sites and gamma passing rates were recorded using 2%/2 mm, 2%/1 mm, and 1%/1 mm gamma criteria. Statistical process control method was used to determine six different site and criterion-specific tolerance and action limits. Using only the inliers identified with our determined tolerance limits, we trained three different outlier detection models using the plan complexity metrics extracted from each treatment field: robust covariance, isolation forest, and one class support vector machine. The hyperparameters were optimized using the F1-score calculated from both the inliers and validation outliers' data. RESULTS: 308 pelvis and 200 thorax fields were used in this study. The tolerance (action) limits for 2%/2 mm, 2%/1 mm, and 1%/1 mm gamma criteria in the pelvis site are 99.1% (98.1%), 95.8% (91.1%), and 91.7% (86.1%), respectively. The tolerance (action) limits in the thorax site are 99.0% (98.7%), 97.0% (96.2%), and 91.5% (87.2%). One class support vector machine performs the best among all the algorithms. The best performing model in the thorax (pelvis) site achieves a precision of 0.56 (0.54), recall of 1.0 (1.0), and F1-score of 0.72 (0.70) when using the 2%/2 mm (2%/1 mm) criterion. CONCLUSION: The model will help the planner to identify an out-of-tolerance plan early so that they can refine the plan further during the planning stage without risking late discovery during measurement.

Subject(s)

Radiosurgery , Radiotherapy, Intensity-Modulated , Humans , Radiotherapy Planning, Computer-Assisted/methods , Radiotherapy Dosage , Algorithms , Pelvis , Radiometry/methods , Radiotherapy, Intensity-Modulated/methods , Quality Assurance, Health Care
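
Note on the reported metrics in the abstract above: the F1-score is the harmonic mean of precision and recall, so the quoted figures can be cross-checked with a few lines of arithmetic. The sketch below is only an illustrative consistency check of the numbers given in the abstract. The function name `f1_score` and the dictionary layout are assumptions made for this example and are not taken from the study.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as quoted in the abstract for the
# best-performing one-class SVM model at each site and gamma criterion.
reported = {
    "thorax, 2%/2 mm": (0.56, 1.0, 0.72),
    "pelvis, 2%/1 mm": (0.54, 1.0, 0.70),
}

for site, (precision, recall, f1_reported) in reported.items():
    f1 = f1_score(precision, recall)
    print(f"{site}: computed F1 = {f1:.2f}, reported F1 = {f1_reported:.2f}")
```

With a recall of 1.0, the F1-score is determined by precision alone, which is why the modest precision values still yield F1-scores of about 0.7.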
