Results 1 - 20 of 32
1.
Artif Intell Med ; 149: 102786, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38462286

ABSTRACT

In machine learning, data often come from different sources, and combining them can introduce extraneous variation that affects both generalization and interpretability. As an example, we investigate the classification of neurodegenerative diseases using FDG-PET data collected from multiple neuroimaging centers. Data collected at different centers carry unwanted variation due to differences in scanners, scanning protocols, and processing methods. To address this issue, we propose a two-step approach to limit the influence of center-dependent variation on the classification of healthy controls and early vs. late-stage Parkinson's disease patients. First, we train a Generalized Matrix Learning Vector Quantization (GMLVQ) model on healthy control data to identify a "relevance space" that distinguishes between centers. Second, we use this space to construct a correction matrix that restricts the training of a second GMLVQ system on the diagnostic problem. We evaluate the effectiveness of this approach on real-world multi-center datasets and a simulated artificial dataset. Our results demonstrate that the approach produces machine learning systems with reduced bias, being more specific because information related to center differences is eliminated during training, and more informative relevance profiles that can be interpreted by medical experts. The method can be adapted to similar problems outside the neuroimaging domain, as long as an appropriate "relevance space" can be identified to construct the correction matrix.
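The following minimal sketch illustrates one way such a correction could be realized, assuming it suppresses the leading center-discriminative directions derived from a GMLVQ relevance matrix trained on healthy-control data. The function names, the number of removed directions, and the projector construction are illustrative assumptions, not the authors' exact procedure.

```python
# Minimal sketch (not the authors' exact procedure): remove the dominant
# center-discriminative directions, estimated from a GMLVQ relevance matrix
# trained on healthy-control data, before training the diagnostic classifier.
import numpy as np

def correction_matrix(omega_center, n_remove=2):
    """Build a projector that suppresses the top center-relevant directions.

    omega_center : (m, d) relevance-matrix factor of a GMLVQ model trained to
                   discriminate imaging centers (assumed to be given).
    n_remove     : number of leading eigendirections of Lambda = Omega^T Omega
                   to project out (hypothetical choice).
    """
    lam = omega_center.T @ omega_center          # relevance matrix Lambda (d, d)
    eigval, eigvec = np.linalg.eigh(lam)         # eigenvalues in ascending order
    v = eigvec[:, -n_remove:]                    # top "relevance space" directions
    return np.eye(lam.shape[0]) - v @ v.T        # projector onto their complement

# usage sketch: X_corrected = X @ correction_matrix(omega_center).T, then train
# the diagnostic GMLVQ model on X_corrected (or constrain its metric accordingly).
```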


Subject(s)
Neuroimaging , Parkinson Disease , Humans , Positron-Emission Tomography , Machine Learning , Parkinson Disease/diagnostic imaging
2.
Entropy (Basel) ; 25(3)2023 Mar 21.
Article in English | MEDLINE | ID: mdl-36981428

ABSTRACT

In the field of machine learning, vector quantization is a category of low-complexity approaches that are nonetheless powerful for data representation and clustering or classification tasks. Vector quantization is based on the idea of representing a data distribution or a class distribution by a small set of prototypes, and it therefore belongs to the interpretable models in machine learning. Further, the low complexity of vector quantizers makes them interesting candidates for implementation with quantum concepts. This is especially true for current and upcoming generations of quantum devices, which only allow the execution of simple and restricted algorithms. Motivated by different adaptation and optimization paradigms for vector quantizers, we provide an overview of existing quantum algorithms and routines that realize vector quantization concepts, perhaps only partially, on quantum devices. Thus, the reader can infer the current state of the art when considering quantum computing approaches for vector quantization.
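As a classical point of reference for the quantum routines surveyed, the sketch below shows the basic prototype-based vector quantization idea: nearest-prototype assignment followed by a mean (LBG-style) update. It is an illustrative baseline only; the initialization, prototype count, and iteration count are arbitrary assumptions.

```python
# Classical baseline of the vector quantization idea the article reviews:
# represent data by a small set of prototypes via nearest-prototype assignment.
# Quantum routines would replace the distance/assignment steps.
import numpy as np

def vq_fit(X, n_prototypes=4, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    W = X[rng.choice(len(X), n_prototypes, replace=False)]   # initialize prototypes
    for _ in range(n_iter):
        d = ((X[:, None, :] - W[None, :, :]) ** 2).sum(-1)   # squared distances (N, K)
        winners = d.argmin(axis=1)                            # nearest prototype per sample
        for k in range(n_prototypes):                         # mean update (LBG-style)
            if np.any(winners == k):
                W[k] = X[winners == k].mean(axis=0)
    return W, winners
```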

3.
Z Psychosom Med Psychother ; 69(1): 56-75, 2023 Feb.
Article in German | MEDLINE | ID: mdl-36927321

ABSTRACT

Objectives: As part of the quality assurance of inpatient treatment, the severity of the disease and the course of therapy must be recorded. However, there is a high degree of heterogeneity in how basic diagnostics are implemented in psychosomatic facilities, and there is a lack of scientifically based standardisation in determining the quality of outcomes. Methods: With the help of scientifically established test instruments, a resource-saving basic documentation instrument was developed. A large number of existing psychometric instruments were checked for test quality, costs and suitability for computer-supported application. Results: The Psychosomatic Health Inventory (gi-ps) consists of three basic modules with a total of 63 items: sociodemography, screening and psychosomatic health status. The latter is represented by means of construct-based recording on eight scales. Its collection at admission and discharge allows the quality of outcomes to be presented. The development of a proprietary software solution with LimeSurvey enables the computer-based collection, evaluation, and storage of data. A list of test inventories for confirming diagnoses and predictors, recommended for use in clinical routine, has been compiled. Discussion: With the gi-ps, a modular basic documentation instrument including the software solution is available to all interested institutions free of charge.


Subject(s)
Inpatients , Quality Assurance, Health Care , Humans , Hospitalization , Psychophysiologic Disorders/diagnosis , Psychophysiologic Disorders/therapy , Psychophysiologic Disorders/psychology , Documentation
4.
Article in English | MEDLINE | ID: mdl-34990369

ABSTRACT

The large amounts of biological sequence data generated during the last decades, together with algorithmic and hardware improvements, have made it possible to apply machine learning techniques in bioinformatics. While the machine learning community is aware of the necessity to rigorously distinguish data transformation from data comparison and to adopt reasonable combinations thereof, this awareness is often lacking in the field of comparative sequence analysis. As the disadvantages of alignments for sequence comparison have become apparent, typical applications increasingly use so-called alignment-free approaches. In light of this development, we present a conceptual framework for alignment-free sequence comparison, which highlights the delineation of 1) the sequence data transformation, comprising adequate mathematical sequence coding and feature generation, from 2) the subsequent (dis-)similarity evaluation of the transformed data by means of problem-specific but mathematically consistent proximity measures. We consider coding to be a loss-free data transformation intended to obtain an appropriate representation, whereas feature generation is inevitably lossy, with the intention to extract just the task-relevant information. This distinction sheds light on the plethora of available methods and assists in identifying suitable methods in machine learning and data analysis to compare sequences under these premises.
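A hypothetical illustration of the two-step view, not a method from the paper: step (1) transforms sequences into k-mer frequency features (a lossy feature generation), and step (2) compares the transformed data with a chosen proximity measure. The alphabet, k, and the cosine dissimilarity are assumptions for the example.

```python
# Step 1: feature generation (k-mer frequencies); Step 2: proximity evaluation.
from itertools import product
import numpy as np

def kmer_features(seq, k=3, alphabet="ACGT"):
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    index = {km: i for i, km in enumerate(kmers)}
    counts = np.zeros(len(kmers))
    for i in range(len(seq) - k + 1):
        counts[index[seq[i:i + k]]] += 1
    return counts / max(counts.sum(), 1)          # relative k-mer frequencies

def cosine_dissimilarity(u, v):
    return 1.0 - (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12)

d = cosine_dissimilarity(kmer_features("ACGTACGTGG"), kmer_features("ACGTTTGTGG"))
```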


Subject(s)
Algorithms , Machine Learning , Sequence Alignment , Sequence Analysis , Mathematics
5.
Neural Comput Appl ; 34(1): 67-78, 2022.
Article in English | MEDLINE | ID: mdl-33935376

ABSTRACT

We present an approach to discriminate SARS-CoV-2 virus types based on their RNA sequence descriptions, avoiding sequence alignment. For that purpose, sequences are preprocessed by feature extraction and the resulting feature vectors are analyzed by prototype-based classification to remain interpretable. In particular, we propose to use variants of learning vector quantization (LVQ) based on dissimilarity measures for RNA sequence data. The respective matrix LVQ provides additional knowledge about the classification decisions, such as discriminant feature correlations, and can additionally be equipped with easy-to-realize reject options for uncertain data. Those options provide self-controlled evidence, i.e., the model refuses to make a classification decision if the model evidence for the presented data is not sufficient. The model is first trained on a GISAID dataset with given virus types, detected according to the molecular differences in coronavirus populations by phylogenetic tree clustering. In a second step, we apply the trained model to another, unlabeled SARS-CoV-2 virus dataset. For these data, we can either assign a virus type to the sequences or reject atypical samples. The rejected sequences allow speculation about new virus types with respect to nucleotide base mutations in the viral sequences. Moreover, this rejection analysis improves model robustness. Last but not least, the presented approach has lower computational complexity than methods based on (multiple) sequence alignment. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s00521-021-06018-2.
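A hedged sketch of a prototype classifier with a reject option: classify by the nearest labeled prototype, but refuse the decision when the evidence (here, a relative distance margin) is weak. The prototypes, labels, margin definition, and threshold theta are assumptions for illustration, not the trained model from the paper.

```python
import numpy as np

def classify_with_reject(x, prototypes, labels, theta=0.1):
    """prototypes: (K, d) array, labels: length-K array, theta: margin threshold."""
    d = ((prototypes - x) ** 2).sum(axis=1)
    order = np.argsort(d)
    best, second = order[0], order[1]
    # relative similarity margin in [0, 1]; small values mean an uncertain decision
    margin = (d[second] - d[best]) / (d[second] + d[best] + 1e-12)
    if margin < theta:
        return None                      # reject: insufficient model evidence
    return labels[best]
```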

6.
Entropy (Basel) ; 23(10)2021 Oct 17.
Article in English | MEDLINE | ID: mdl-34682081

ABSTRACT

In the present article we propose the application of variants of the mutual information function as characteristic fingerprints of biomolecular sequences for classification analysis. In particular, we consider the resolved mutual information functions based on Shannon, Rényi, and Tsallis entropy. In combination with interpretable machine learning classifier models based on generalized learning vector quantization, a powerful methodology for sequence classification is achieved, which allows substantial knowledge extraction in addition to high classification performance due to the model-inherent robustness. Any potentially (slightly) inferior performance of the classifier is compensated for by the additional knowledge provided by the interpretable models. This knowledge may assist the user in analyzing and understanding the data and the task at hand. After a theoretical justification of the concepts, we demonstrate the approach on various example datasets covering different areas of biomolecular sequence analysis.
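As a concrete illustration, the sketch below computes the plain Shannon mutual information function of a symbolic sequence, i.e., the statistical dependence between symbols separated by distance k, collected over several k as a fingerprint vector. The resolved, Rényi, and Tsallis variants discussed in the article would modify the summands; this plain Shannon case and the example sequence are assumptions for illustration only.

```python
import numpy as np
from collections import Counter

def mutual_information_function(seq, k_max=10):
    n = len(seq)
    p = {a: c / n for a, c in Counter(seq).items()}      # single-symbol frequencies
    mif = []
    for k in range(1, k_max + 1):
        pairs = Counter(zip(seq[:-k], seq[k:]))           # symbol pairs at distance k
        total = n - k
        i_k = 0.0
        for (a, b), c in pairs.items():
            p_ab = c / total
            i_k += p_ab * np.log2(p_ab / (p[a] * p[b]))
        mif.append(i_k)
    return np.array(mif)                                  # fingerprint of length k_max

fingerprint = mutual_information_function("ACGTACGTACGGTTACGT")
```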

7.
Sensors (Basel) ; 21(13)2021 Jun 27.
Article in English | MEDLINE | ID: mdl-34199090

ABSTRACT

Sensor fusion has gained a great deal of attention in recent years. It is used as an application tool in many different fields, especially the semiconductor, automotive, and medical industries. However, this field of research, regardless of the application area, still presents various challenges concerning the choice of the sensors to be combined and the fusion architecture to be developed. To decrease application costs and engineering effort, it is very important to analyze the sensors' data beforehand, once the application target is defined. This pre-analysis is a basic step toward establishing a working environment with fewer misclassification cases and high safety. One promising approach is to analyze the system using deep neural networks. The main disadvantages of this approach are the large storage requirements, the substantial training effort, and the difficulty of interpreting such networks. In this paper, we focus on developing a smart and interpretable bi-functional artificial intelligence (AI) system, which has to discriminate the combined data with respect to predefined classes. Furthermore, the system can evaluate the individual source signals used in the classification task. This evaluation covers each sensor's contribution and robustness. More precisely, we train a smart and interpretable prototype-based neural network, which automatically learns to weight the influence of the sensors on the classification decision. Moreover, the prototype-based classifier is equipped with a reject option to measure classification certainty. To validate our approach's efficiency, we refer to different industrial sensor fusion applications.
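A small illustration of how learned sensor weights can be read as sensor contributions: a relevance-weighted distance with one nonnegative weight per sensor channel. The setup, the example weights, and the interpretation are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def weighted_distance(x, w, relevances):
    """x, w: feature vectors grouped per sensor; relevances: nonnegative, sum to 1."""
    return float(np.sum(relevances * (x - w) ** 2))

# After training, a normalized relevance profile such as
# relevances = np.array([0.55, 0.30, 0.15]) would suggest that sensor 1
# dominates the classification decision while sensor 3 contributes little.
```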


Subject(s)
Artificial Intelligence , Neural Networks, Computer , Decision Making
8.
Bioinformatics ; 36(22-23): 5507-5513, 2021 Apr 01.
Article in English | MEDLINE | ID: mdl-33367605

ABSTRACT

MOTIVATION: Viruses are the most abundant biological entities and constitute a large reservoir of genetic diversity. In recent years, knowledge about them has increased significantly as a result of dynamic development in the life sciences and rapid technological progress. This knowledge is scattered across various data repositories, making a comprehensive analysis of viral data difficult. RESULTS: In response to the need to gather comprehensive knowledge of viruses and viral sequences, we developed Virxicon, a lexicon of all experimentally acquired sequences for RNA and DNA viruses. The ability to quickly obtain data for entire viral groups, to search sequences by levels of the taxonomic hierarchy (according to the Baltimore classification and ICTV taxonomy), and to track the distribution of viral data and its growth over time are unique features of our database compared to other tools. AVAILABILITY AND IMPLEMENTATION: Virxicon is a publicly available resource, updated weekly. It has an intuitive web interface and can be freely accessed at http://virxicon.cs.put.poznan.pl/.

9.
Thromb Res ; 180: 98-104, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31276978

ABSTRACT

INTRODUCTION: Little is known about risk constellations in primary hemostasis that contribute to an acute myocardial infarction (MI) in patients with already manifest atherosclerosis. The study aimed to establish a predictive model based on six biomarkers of primary hemostasis: platelet count, mean platelet volume, hematocrit, soluble glycoprotein VI, fibrinogen and von Willebrand factor ratio. MATERIALS AND METHODS: The biomarkers were measured in 1,491 patients with manifest atherosclerosis from the Leipzig (LIFE) heart study. Patients were divided into three groups: patients with coronary artery disease (900 patients) and patients with atherosclerosis and either ST-elevation MI (404 patients) or non-ST-elevation MI (187 patients). Correlations were analyzed by non-linear analysis with Self-Organizing Maps. Classification and discriminant analysis were performed using Learning Vector Quantization. RESULTS AND CONCLUSIONS: The combination of hemostatic biomarkers is regarded as a valuable tool for identifying patients with atherosclerosis at risk for MI. Nevertheless, our study contradicts this belief. The biomarkers did not allow us to establish a predictive model usable in daily patient care. Good specificity and sensitivity for the detection of MI were only reached in models including acute-phase parameters (specificity 0.9036, sensitivity 0.7937 in men; 0.8977 and 0.8133 in women). In detail, hematocrit and soluble glycoprotein VI were significantly different between the groups. Significant dissimilarities were also found for fibrinogen (in men) and the von Willebrand factor ratio. In contrast, the most promising parameters, mean platelet volume and platelet count, showed no difference, which is an important contribution to the controversy concerning them as new risk and therapy targets for MI.


Subject(s)
Atherosclerosis/blood , Blood Platelets/cytology , Non-ST Elevated Myocardial Infarction/blood , Platelet Membrane Glycoproteins/analysis , ST Elevation Myocardial Infarction/blood , von Willebrand Factor/analysis , Aged , Atherosclerosis/complications , Biomarkers/blood , Coronary Artery Disease/blood , Coronary Artery Disease/complications , Female , Hemostasis , Humans , Male , Mean Platelet Volume , Middle Aged , Non-ST Elevated Myocardial Infarction/etiology , Platelet Count , Risk Factors , ST Elevation Myocardial Infarction/etiology
10.
BioData Min ; 12: 1, 2019.
Article in English | MEDLINE | ID: mdl-30627219

ABSTRACT

BACKGROUND: Machine learning strategies are prominent tools for data analysis. Especially in the life sciences, they have become increasingly important for handling the growing datasets collected by the scientific community. Meanwhile, algorithms improve in performance but also gain complexity, and tend to neglect interpretability and comprehensibility of the resulting models. RESULTS: Generalized Matrix Learning Vector Quantization (GMLVQ) is a supervised, prototype-based machine learning method that provides comprehensive visualization capabilities not present in other classifiers, allowing for a fine-grained interpretation of the data. In contrast to commonly used machine learning strategies, GMLVQ is well-suited for the imbalanced classification problems that are frequent in the life sciences. We present a Weka plug-in implementing GMLVQ. The feasibility of GMLVQ is demonstrated on a dataset of Early Folding Residues (EFR), which have been shown to initiate and guide the protein folding process. Using 27 features, an area under the receiver operating characteristic curve of 76.6% was achieved, which is comparable to other state-of-the-art classifiers. The obtained model is accessible at https://biosciences.hs-mittweida.de/efpred/. CONCLUSIONS: The application to EFR prediction demonstrates how an easy interpretation of classification models can promote the comprehension of biological mechanisms. The results shed light on the special features of EFR that were reported as most influential for the classification: EFR are embedded in ordered secondary structure elements and they participate in networks of hydrophobic residues. Visualization capabilities of GMLVQ are presented as we demonstrate how to interpret the results.

11.
Wiley Interdiscip Rev Cogn Sci ; 7(2): 92-111, 2016.
Article in English | MEDLINE | ID: mdl-26800334

ABSTRACT

An overview is given of prototype-based models in machine learning. In this framework, observations, i.e., data, are stored in terms of typical representatives. Together with a suitable measure of similarity, the systems can be employed in the context of unsupervised and supervised analysis of potentially high-dimensional, complex datasets. We discuss basic schemes of competitive vector quantization as well as the so-called neural gas approach and Kohonen's topology-preserving self-organizing map. Supervised learning in prototype systems is exemplified in terms of learning vector quantization. Most frequently, the familiar Euclidean distance serves as a dissimilarity measure. We present extensions of the framework to nonstandard measures and give an introduction to the use of adaptive distances in relevance learning.
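To make the supervised side of the overview concrete, the sketch below shows the classical LVQ1 learning step: the winning prototype is attracted toward a correctly classified sample and repelled otherwise. Learning rate and data shapes are illustrative assumptions.

```python
import numpy as np

def lvq1_step(x, y, prototypes, proto_labels, lr=0.05):
    d = ((prototypes - x) ** 2).sum(axis=1)          # squared Euclidean distances
    j = int(np.argmin(d))                            # winner takes all
    sign = 1.0 if proto_labels[j] == y else -1.0     # attract if correct, else repel
    prototypes[j] += sign * lr * (x - prototypes[j])
    return prototypes
```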


Subject(s)
Computer Simulation , Data Mining/methods , Machine Learning , Pattern Recognition, Automated/methods , Neural Networks, Computer , Neurons/physiology , Statistics as Topic
12.
Comput Intell Neurosci ; 2013: 165248, 2013.
Article in English | MEDLINE | ID: mdl-24396342

ABSTRACT

We consider some modifications of the neural gas algorithm. First, fuzzy assignments as known from fuzzy c-means and neighborhood cooperativeness as known from self-organizing maps and neural gas are combined to obtain a basic Fuzzy Neural Gas. Further, a kernel variant and a simulated annealing approach are derived. Finally, we introduce a fuzzy extension of the ConnIndex to obtain an evaluation measure for clusterings based on fuzzy vector quantization.
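The sketch below illustrates the two ingredients that are combined: fuzzy assignments as in fuzzy c-means and the rank-based neighborhood weighting of neural gas. The fuzzifier m, the neighborhood range lambda, and the final combination shown are assumptions; the exact formulation in the paper may differ.

```python
import numpy as np

def fuzzy_assignments(d, m=2.0):
    """Fuzzy c-means memberships from squared distances d (K,) to all prototypes."""
    inv = (d + 1e-12) ** (-1.0 / (m - 1.0))
    return inv / inv.sum()

def neural_gas_weights(d, lam=1.0):
    """Neural-gas neighborhood weights exp(-rank/lambda) from distances d (K,)."""
    ranks = np.argsort(np.argsort(d))        # 0 for the winner, 1 for the runner-up, ...
    return np.exp(-ranks / lam)

d = np.array([0.2, 0.5, 1.3])
combined = fuzzy_assignments(d) * neural_gas_weights(d)   # one possible combination
```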


Subject(s)
Algorithms , Fuzzy Logic , Neural Networks, Computer , Cluster Analysis
13.
Neural Netw ; 26: 159-73, 2012 Feb.
Article in English | MEDLINE | ID: mdl-22041220

ABSTRACT

We present an extension of the recently introduced Generalized Matrix Learning Vector Quantization algorithm. In the original scheme, adaptive square matrices of relevance factors parameterize a discriminative distance measure. We extend the scheme to matrices of limited rank, corresponding to low-dimensional representations of the data. This makes it possible to incorporate prior knowledge of the intrinsic dimension and to reduce the number of adaptive parameters efficiently. In particular, for very high-dimensional data, limiting the rank can reduce computation time and memory requirements significantly. Furthermore, two- or three-dimensional representations constitute an efficient visualization method for labeled data sets. The identification of a suitable projection is not treated as a pre-processing step but as an integral part of the supervised training. Several real-world data sets serve as an illustration and demonstrate the usefulness of the suggested method.
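A minimal sketch of the limited-rank idea: a rectangular matrix Omega of shape (m, d) with m much smaller than d parameterizes the distance d(x, w) = ||Omega (x - w)||^2, and for m = 2 the mapping x -> Omega x doubles as a supervised two-dimensional visualization. The random matrices and dimensions are placeholders for illustration.

```python
import numpy as np

def limited_rank_distance(x, w, omega):
    """omega: (m, d) with m <= d; distance computed in the m-dimensional image."""
    z = omega @ (x - w)
    return float(z @ z)

rng = np.random.default_rng(0)
omega = rng.normal(size=(2, 100))        # rank-2 metric for 100-dimensional data
x, w = rng.normal(size=100), rng.normal(size=100)
dist = limited_rank_distance(x, w, omega)
projection = omega @ x                   # 2-D coordinates usable for plotting
```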


Subject(s)
Artificial Intelligence , Learning , Algorithms , Discriminant Analysis , Humans , Pattern Recognition, Automated
14.
Int J Neural Syst ; 21(6): 443-57, 2011 Dec.
Article in English | MEDLINE | ID: mdl-22131298

ABSTRACT

Prototype-based classifiers are effective algorithms for modeling classification problems and have been applied in multiple domains. While many supervised learning algorithms have been successfully extended by means of the kernel concept to improve their discrimination power, prototype-based classifiers are typically still used with Euclidean distance measures. Kernelized variants of prototype-based classifiers are currently too complex to be applied to larger data sets. Here we propose an extension of Kernelized Generalized Learning Vector Quantization (KGLVQ) employing a sparsity and approximation technique to reduce the learning complexity. We provide generalization error bounds and experimental results on real-world data, showing that the extended approach is comparable to SVMs on different public data sets.


Subject(s)
Algorithms , Artificial Intelligence , Computer Simulation , Databases, Factual , Pattern Recognition, Automated/methods , Software
15.
Neural Comput ; 23(5): 1343-92, 2011 May.
Article in English | MEDLINE | ID: mdl-21299418

ABSTRACT

Supervised and unsupervised vector quantization methods for classification and clustering traditionally use dissimilarities, frequently taken as Euclidean distances. In this article, we investigate the applicability of divergences instead, focusing on online learning. We derive the mathematical fundamentals for their use in gradient-based online vector quantization algorithms. This relies on the generalized derivatives of the divergences, known as Fréchet derivatives in functional analysis, which reduce in finite-dimensional problems to partial derivatives in a natural way. We demonstrate the application of this methodology for widely used supervised and unsupervised online vector quantization schemes, including self-organizing maps, neural gas, and learning vector quantization. Additionally, principles for hyperparameter optimization and relevance learning for parameterized divergences in the case of supervised vector quantization are given to achieve improved classification accuracy.
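As one concrete instance of replacing the Euclidean distance by a divergence in an online prototype update, the sketch below uses the generalized Kullback-Leibler divergence and its elementwise (Fréchet/partial) derivative with respect to the prototype. It assumes strictly positive data and prototypes; the learning rate is a hypothetical choice.

```python
import numpy as np

def gen_kl(x, w):
    """Generalized KL divergence D(x || w) for strictly positive vectors."""
    return float(np.sum(x * np.log(x / w) - x + w))

def gen_kl_grad_w(x, w):
    return 1.0 - x / w                       # elementwise derivative dD/dw

def online_step(x, w, lr=0.01):
    return w - lr * gen_kl_grad_w(x, w)      # gradient descent on the divergence
```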


Subject(s)
Artificial Intelligence , Computer Simulation/standards , Neural Networks, Computer , Algorithms , Cognition/physiology , Humans , Mathematical Concepts , Models, Theoretical , Pattern Recognition, Automated/methods
16.
IEEE Trans Neural Netw ; 21(5): 831-40, 2010 May.
Article in English | MEDLINE | ID: mdl-20236882

ABSTRACT

In this paper, we present a regularization technique to extend recently proposed matrix learning schemes in learning vector quantization (LVQ). These learning algorithms extend the concept of adaptive distance measures in LVQ to the use of relevance matrices. In general, metric learning can display a tendency towards oversimplification in the course of training. An overly pronounced elimination of dimensions in feature space can have negative effects on performance and may lead to instabilities in the training. We focus on matrix learning in generalized LVQ (GLVQ). Extending the cost function by an appropriate regularization term prevents this unfavorable behavior and can help to improve the generalization ability. The approach is first tested and illustrated on artificial model data. Furthermore, we apply the scheme to benchmark classification data sets from the UCI Machine Learning Repository. We demonstrate the usefulness of regularization also in the case of rank-limited relevance matrices, i.e., matrix learning with an implicit, low-dimensional representation of the data.
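A hedged sketch of the kind of regularization term described: penalizing a (near-)singular relevance matrix Lambda = Omega^T Omega so that the metric cannot collapse onto very few feature-space directions. The exact cost function of the paper may differ; the weight mu and the numerical stabilizer eps are hypothetical.

```python
import numpy as np

def regularization_penalty(omega, mu=0.1, eps=1e-12):
    """Penalty that grows as Omega Omega^T (and hence Lambda) becomes singular."""
    gram = omega @ omega.T                                    # (m, m) for a rank-m metric
    _, logdet = np.linalg.slogdet(gram + eps * np.eye(gram.shape[0]))
    return -mu * logdet                                       # added to the GLVQ cost function
```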


Subject(s)
Artificial Intelligence , Feedback , Learning/physiology , Neural Networks, Computer , Algorithms , Humans
17.
Psychother Res ; 20(4): 398-412, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20234974

ABSTRACT

The authors developed a concept that applies self-organization theory to psychodynamic principles. According to this concept, episodes of temporary destabilization represent a precondition for abrupt changes within the therapeutic process. The authors examined six courses of therapy (patients diagnosed with depression and personality disorder). After each therapy session, patients rated their experience of the therapeutic interaction. A measure of instability was used to identify episodes of destabilization with respect to patients' interaction experience throughout the process. Episodes of pronounced destabilization occurred in the four courses of therapy that showed better therapy outcomes. These episodes were characterized by temporary strong deteriorations in interaction experience (negative peaks). Three of the four courses showed subsequent discontinuous improvements to a higher level of interaction. Results indicate that the systematic inclusion of a measure of instability is worthwhile in investigations of discontinuous changes. This method allows the theoretical assumptions of the psychodynamic approach to be tested.


Subject(s)
Professional-Patient Relations , Psychotherapeutic Processes , Adult , Countertransference , Depressive Disorder/psychology , Depressive Disorder/therapy , Female , Humans , Models, Psychological , Personality Disorders/psychology , Personality Disorders/therapy , Psychoanalytic Therapy , Psychotherapy , Surveys and Questionnaires , Transference, Psychology , Treatment Outcome
18.
Psychosoc Med ; 6: Doc04, 2009 Oct 13.
Article in English | MEDLINE | ID: mdl-19911073

ABSTRACT

OBJECTIVE: Living organ donation involves interference with a healthy organism. Therefore, most transplantation centres ascertain the voluntariness of the donation as well as its motivation by means of a psychosomatic evaluation. The circumstance that the evaluation is compulsory and not a primary concern of the donor-recipient pair may lead respondents to present only what they consider innocuous and socially adequate. Thus, the informative value of the results can be considerably affected. METHODS: In the context of a psychosomatic evaluation prior to living kidney transplantation, 71 donor-recipient pairs were screened at the transplantation centre of Friedrich Schiller University, Jena. Using the validity scales of the Minnesota Multiphasic Personality Inventory (MMPI) ("infrequency" (F), "lie" (L) and "correction" (K) scales) and the Dissimulation Index according to Gough ("F-K"), we attempted to identify signs of dissimulation and denial. RESULTS: About 50% of the participants showed an infrequency raw score of zero. This means that at least half of the sample was apprehensive, which may cause a cautious and controlled attitude towards the examination. The K-value (T >= 59) and the Dissimulation Index (F-K

19.
Artif Intell Med ; 45(2-3): 215-28, 2009.
Article in English | MEDLINE | ID: mdl-18778925

ABSTRACT

OBJECTIVE: Mass spectrometry has become a standard technique for analyzing clinical samples in cancer research. The obtained spectrometric measurements reveal a great deal of information about the clinical sample at the peptide and protein level. The spectra are high-dimensional and, due to the small number of samples, sparse coverage of the population is very common. In clinical research, the calculation and evaluation of classification models is important. In classical statistics this is achieved by hypothesis testing with respect to a chosen level of confidence. In clinical proteomics, the application of statistical tests is limited by the small number of samples and the high dimensionality of the data. Typically, soft methods from the field of machine learning are used to generate such models. However, for these methods no or only little additional information about the reliability of the model decision is available. In this contribution, the spectral data are processed as functional data and conformal classifier models are generated. The obtained models allow the detection of potential biomarker candidates and provide confidence measures for the classification decision. METHODS: First, wavelet-based techniques for the efficient processing and encoding of mass spectrometric measurements from clinical samples are presented. A prototype-based classifier is extended by a functional metric and combined with the concept of conformal prediction to classify the clinical proteomic spectra and to evaluate the results. RESULTS: Clinical proteomic data from a colorectal cancer study and a lung cancer study are used to test the performance of the proposed algorithm. The prototype classifiers are evaluated with respect to prediction accuracy and the confidence of the classification decisions. The adapted metric parameters are analyzed and interpreted to find potential biomarker candidates. CONCLUSIONS: The proposed algorithm can be used to analyze functional data as obtained from clinical mass spectrometry, to find discriminating mass positions, and to judge the confidence of the obtained classifications, providing robust and interpretable classification models.
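A minimal sketch of the conformal prediction step used to attach confidence to a classification: the p-value of a test sample for a candidate label is the fraction of calibration nonconformity scores at least as large as the test score. How the nonconformity score is computed (e.g., distance to the nearest prototype of the candidate class) is an illustrative assumption, not the paper's exact measure.

```python
import numpy as np

def conformal_p_value(test_score, calibration_scores):
    """calibration_scores: nonconformity scores of held-out samples with the candidate label."""
    calibration_scores = np.asarray(calibration_scores)
    return (np.sum(calibration_scores >= test_score) + 1) / (len(calibration_scores) + 1)

# A label is included in the prediction set if its p-value exceeds the chosen
# significance level; the largest p-value gives the forced prediction, and one
# minus the second-largest p-value is commonly reported as its confidence.
```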


Subject(s)
Computational Biology , Mass Spectrometry/methods , Models, Theoretical
20.
Ann Indian Acad Neurol ; 12(1): 28-34, 2009 Jan.
Article in English | MEDLINE | ID: mdl-20151006

ABSTRACT

OBJECTIVES: Fine motor skill disorders belong to the neurological manifestations of Wilson's disease. The aim of this study was to investigate whether fine motor performance changes during the course of the disease and with therapy. METHODS: In 15 neurological patients with Wilson's disease, the severity of neurological symptoms was assessed with a neurology score. A test battery consisting of handwriting a test sentence, writing lines of "double-I" and retracing a circle was carried out for analysis. By means of a computer-aided analysis of the patients' handwriting, 10 kinematic parameters of the writing trace were calculated. These parameters were determined once at the very beginning of the study and then again after 7 years. RESULTS: Improvement of clinical symptoms was observed only within the first 2 years after onset of therapy. In contrast to the standard population, a reduced degree of automation could be detected both at the beginning and at the end of the 7-year interval. There was no significant change in 8 of the 10 kinematic parameters during the observation period; 2 deteriorated. DISCUSSION: The absence of a significant increase in fine motor disturbances proves, on the one hand, the efficacy of the therapy regimen applied. On the other hand, the end point of a possible reversibility had been reached. A computer-aided analysis of the patient's handwriting allows for a sensitive detection of the "functional scar" in extrapyramidal control and can subsequently prompt a timely correction of therapy in case of progression.
