1.
Diagnostics (Basel) ; 14(3)2024 Jan 26.
Article in English | MEDLINE | ID: mdl-38337787

ABSTRACT

In this paper, a novel strategy to perform high-dimensional feature selection using an evolutionary algorithm for the automatic classification of coronary stenosis is introduced. The method involves a feature extraction stage to form a bank of 473 features of different types, such as intensity, texture and shape. The feature selection task is carried out on this high-dimensional feature bank, where the search space is of order O(2^n) with n = 473. The proposed evolutionary search strategy was compared with different state-of-the-art methods in terms of the Jaccard coefficient and classification accuracy. The highest feature selection rate, along with the best classification performance, was obtained with a subset of four features, representing a 99% discrimination rate. In the last stage, the feature subset was used as input to train a support vector machine, which was evaluated using an independent testing set. The classification of coronary stenosis cases is a binary classification problem with positive and negative classes. The highest classification performance was obtained with the four-feature subset in terms of accuracy (0.86) and Jaccard coefficient (0.75). In addition, a second dataset containing 2788 instances was formed from a public image database, obtaining an accuracy of 0.89 and a Jaccard coefficient of 0.80. Finally, based on the performance achieved, the four-feature subset may be suitable for use in a clinical decision support system.
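
To make the size of the search space and the two reported metrics concrete, the following is a minimal sketch, not the authors' implementation. The function names and the toy labels are assumptions made for illustration; the Jaccard coefficient is computed for the positive class as TP / (TP + FP + FN).

```python
def search_space_size(n_features: int) -> int:
    """Number of candidate feature subsets for n binary include/exclude choices."""
    return 2 ** n_features


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 1 (positive) and 0 (negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred) -> float:
    """Fraction of predictions that match the true labels."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def jaccard(y_true, y_pred) -> float:
    """Jaccard coefficient of the positive class: TP / (TP + FP + FN)."""
    tp, _, fp, fn = confusion_counts(y_true, y_pred)
    denom = tp + fp + fn
    return tp / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical labels, not data from the study.
    y_true = [1, 1, 1, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 1]
    print(len(str(search_space_size(473))), "decimal digits in 2**473")
    print(accuracy(y_true, y_pred), jaccard(y_true, y_pred))
```

With n = 473 the number of subsets is far too large to enumerate, which is the motivation the abstract gives for a heuristic (evolutionary) search.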

2.
PeerJ ; 11: e16216, 2023.
Article in English | MEDLINE | ID: mdl-37842061

ABSTRACT

Background: Identifying species, particularly small metazoans, remains a daunting challenge, and the phylum Nematoda is no exception. Typically, nematode species are differentiated based on morphometry and the presence or absence of certain characters. However, recent advances in artificial intelligence, particularly machine learning (ML) algorithms, offer promising solutions for automating species identification, especially in taxonomically complex groups. By training ML models with extensive datasets of accurately identified specimens, the models can learn to recognize patterns in nematodes' morphological and morphometric features. This enables them to make precise identifications of newly encountered individuals. Implementing ML algorithms can improve the speed and accuracy of species identification and allow researchers to efficiently process vast amounts of data. Furthermore, it empowers non-taxonomists to make reliable identifications. The objective of this study is to evaluate the performance of ML algorithms in identifying species of free-living marine nematodes, focusing on two well-known genera: Acantholaimus Allgén, 1933 and Sabatieria Rouville, 1903. Methods: A total of 40 species of Acantholaimus and 60 species of Sabatieria were considered. The measurements and identifications were obtained from the original publications of species for both genera; this compilation included information regarding the presence or absence of specific characters, as well as morphometric data. To assess species identification performance, four ML algorithms were employed: Random Forest (RF), Stochastic Gradient Boosting (SGBoost), Support Vector Machine (SVM) with both linear and radial kernels, and K-nearest neighbor (KNN).
Results: For both genera, the random forest (RF) algorithm demonstrated the highest accuracy in correctly classifying specimens into their respective species, achieving an accuracy rate of 93% for Acantholaimus and 100% for Sabatieria; only a single Acantholaimus individual in the test data was misclassified. Conclusion: These results highlight the overall effectiveness of ML algorithms in species identification. Moreover, they demonstrate that the identification of marine nematodes can be automated, optimizing biodiversity and ecological studies and making species identification more accessible, efficient, and scalable. Ultimately, this will contribute to our understanding and conservation of biodiversity.
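
Of the four algorithms listed, K-nearest neighbor is the simplest to illustrate. The sketch below shows majority-vote KNN classification on two made-up morphometric features; the feature values, species labels, and function name are hypothetical and are not taken from the study. It also assumes the features are on comparable scales, which in practice usually requires normalization first.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    by_distance = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical (body length, ratio) measurements for two made-up species.
    train_X = [
        (1200.0, 30.0), (1250.0, 32.0), (1180.0, 29.0),
        (2100.0, 55.0), (2050.0, 52.0), (2200.0, 58.0),
    ]
    train_y = ["species_A"] * 3 + ["species_B"] * 3
    print(knn_predict(train_X, train_y, (1220.0, 31.0)))
    print(knn_predict(train_X, train_y, (2150.0, 56.0)))
```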


Subject(s)
Artificial Intelligence , Nematoda , Humans , Animals , Algorithms , Machine Learning , Chromadorea
3.
Rev. mex. ing. bioméd ; 44(2): 1334, May.-Aug. 2023. tab, graf
Article in English | LILACS-Express | LILACS | ID: biblio-1536653

ABSTRACT

With an estimated 2 million deaths per year, diabetes is one of the top 5 deadliest noncommunicable diseases globally. Although this disease is not in itself fatal, the deterioration of the patient's health caused by a poor plan for controlling glucose levels can have a fatal outcome. In order to lay the foundations for the development of a device that allows estimating glucose levels in some body fluid, we present the results obtained for the estimation of glucose in deionized water and also describe the development and configuration of the created device. After analyzing 50 signals obtained from 5 different glucose concentrations, the feasibility of using the developed device for the analysis is evident, since, using the K-Nearest Neighbors (KNN) algorithm, all the signals were correctly associated with the glucose group to which they belong.



4.
Environ Monit Assess ; 194(3): 203, 2022 Feb 19.
Article in English | MEDLINE | ID: mdl-35182211

ABSTRACT

The security of water distribution systems has become the subject of an increasing volume of research over the last decade. Data analysis and machine learning are linked to hydraulic and quality modeling to improve the capacity of water utilities to save lives when faced with the contamination of water networks. This research applies k-nearest neighbor (KNN) and random forest algorithms to estimate the location of contamination sources in near real time. Epanet and Epanet-MSX software are used to simulate intrusions of pesticide into a water distribution system and the interaction with compounds already present in the bulk water. Different pesticide concentrations are considered in the simulations, and chlorine is monitored through installed quality sensors. The results show that random forest can localize [Formula: see text] of contamination scenarios, while the KNN algorithm found [Formula: see text]. Finally, an assessment of contamination spread is made for a better understanding of the impacts of non-localized contamination.


Subject(s)
Water Supply , Water , Data Mining , Environmental Monitoring/methods , Water Quality
5.
Healthcare (Basel) ; 9(4)2021 Apr 06.
Article in English | MEDLINE | ID: mdl-33917300

ABSTRACT

Diabetes incidence is a growing problem: according to the World Health Organization and the International Diabetes Federation, the number of people with this disease is increasing rapidly worldwide. Diabetes treatment is important to prevent the development of several complications, and lipid profile monitoring is also important. For that reason, the aim of this work is to implement machine learning algorithms able to classify, based on lipid profile levels, cases (patients diagnosed with diabetes who receive diabetes treatment) and controls (subjects who do not receive diabetes treatment, although some of them have diabetes). Logistic regression, K-nearest neighbor, decision trees and random forest were implemented, and all of them were evaluated with accuracy, sensitivity, specificity and AUC-ROC metrics. The artificial neural network obtained an accuracy of 0.685 and an AUC value of 0.750; logistic regression achieved an accuracy of 0.729 and an AUC value of 0.795; K-nearest neighbor obtained an accuracy of 0.669 and an AUC value of 0.709; the decision tree reached an accuracy of 0.691 and an AUC value of 0.683; and random forest achieved an accuracy of 0.704 and an AUC value of 0.776. The performance of all models was statistically significant, but the best-performing model for this problem was logistic regression.
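
The evaluation metrics named above can be sketched as follows. This is an illustrative example with made-up labels and scores, not the study's data or code; the function names are assumptions. The AUC is computed through its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 1 (case) and 0 (control)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc_roc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical true labels and model scores.
    y_true = [1, 1, 0, 0]
    scores = [0.9, 0.4, 0.6, 0.2]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption
    print(sensitivity_specificity(y_true, y_pred))
    print(auc_roc(y_true, scores))
```

Accuracy, sensitivity and specificity depend on the chosen decision threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why a model can rank differently on the two kinds of metric.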

6.
Sensors (Basel) ; 21(4)2021 Feb 10.
Article in English | MEDLINE | ID: mdl-33578915

ABSTRACT

One of the major problems facing humanity in the coming decades is the production of food on a large scale. The production of large quantities of food must be conducted in a sustainable and responsible manner for nature and humans. In this sense, the appropriate application of agricultural pesticides plays a fundamental role since pesticide application in a qualified manner reduces human and environmental risks as well as the costs of food production. Evaluation of the quality of application using sprayers is an important issue, and several quality descriptors related to the average diameter and distribution of droplets are used. This paper describes the construction of a data-driven soft sensor using the parametric principal component regression (PCR) method based on principal component analysis (PCA), which works in two configurations: with the input being the operating conditions of the agricultural boom sprayers and its outputs being the prediction of the quality descriptors of spraying, and vice versa. The soft sensor provides, in one configuration, estimates of the quality of pesticide application at a certain time and, in the other, estimates of the appropriate sprayer-operating conditions, which can be used for control and optimization of the processes in pesticide application. Full cone nozzles are used to illustrate a practical application as well as to validate the usefulness of the soft sensor designed with the PCR method. The selection of historical data, exploration, and filtering of data, and the structure and validation of the soft sensor are presented. For comparison purposes, the results with the well-known nonparametric k-Nearest Neighbor (k-NN) regression method are presented. The results of this research reveal the usefulness of soft sensors in the application of agricultural pesticides and as a knowledge base to assist in agricultural decision-making.
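
The abstract's main method is principal component regression; the sketch below illustrates only the nonparametric comparison method, k-NN regression, in which the prediction is the mean target of the k nearest training points. The input variable, the output variable, and all numeric values are hypothetical placeholders, not measurements from the paper.

```python
import math


def knn_regress(train_X, train_y, query, k=3):
    """Predict a numeric target as the mean of the k nearest neighbors' targets."""
    nearest = sorted(
        zip(train_X, train_y), key=lambda pair: math.dist(pair[0], query)
    )[:k]
    return sum(y for _, y in nearest) / len(nearest)


if __name__ == "__main__":
    # Hypothetical operating condition (one input) and droplet-size descriptor (output).
    train_X = [(1.0,), (2.0,), (3.0,), (4.0,)]
    train_y = [400.0, 350.0, 300.0, 250.0]
    print(knn_regress(train_X, train_y, (2.1,), k=2))
```

Swapping the roles of `train_X` and `train_y` gives the inverse configuration the abstract describes, where quality descriptors are the input and operating conditions are the estimate.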

7.
Cancer Control ; 26(1): 1073274819876598, 2019.
Article in English | MEDLINE | ID: mdl-31538497

ABSTRACT

Several statistical-based approaches have been developed to support medical personnel in early breast cancer detection. This article presents a method for feature selection aimed at classifying cases into categories based on patients' breast tissue measures and protein microarray data. The effectiveness of this feature selection strategy was evaluated on the commonly used Wisconsin Breast Cancer Database (WBCD), which has several patients and fewer features, and on a new protein microarray data set, which has several features and fewer patients. Features were ranked according to a feature importance index that combines parameters emerging from the unsupervised method of principal component analysis and the supervised method of Bhattacharyya distance. Observations of a training set were iteratively categorized into malignant and benign cases through 3 classification techniques: k-Nearest Neighbor, linear discriminant analysis, and probabilistic neural network. After each classification, the feature with the smallest importance index was removed, and a new categorization was carried out until there was only one feature left. The subset yielding maximum accuracy was used to classify observations in the testing set. Our method yielded, on average, 99.17% accurate classifications in the testing set while retaining an average of 4.61 out of 9 features in the WBCD, which is comparable to the best results reported in the literature on that data set, with the advantage of relying on simple and widely available multivariate techniques. When applied to the microarray data, the method yielded an average accuracy of 98.30% while retaining an average of 2.17% of the original features. Our results can aid health-care professionals during early diagnosis of breast cancer.
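
The elimination loop described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' procedure: the importance index is passed in as precomputed numbers rather than derived from principal component analysis and Bhattacharyya distance, the ranking is assumed to stay fixed between iterations, and `toy_accuracy` is a hypothetical stand-in for training and scoring a real classifier.

```python
def backward_elimination(importance, evaluate):
    """Drop the least important feature repeatedly; return the best-scoring subset.

    importance: dict mapping feature name -> importance index (higher is better).
    evaluate:   callable taking a list of feature names and returning an accuracy.
    """
    remaining = sorted(importance, key=importance.get, reverse=True)
    best_subset, best_acc = list(remaining), evaluate(remaining)
    while len(remaining) > 1:
        remaining = remaining[:-1]  # remove the feature with the smallest index
        acc = evaluate(remaining)
        if acc >= best_acc:  # on ties, prefer the smaller subset
            best_subset, best_acc = list(remaining), acc
    return best_subset, best_acc


def toy_accuracy(features):
    """Hypothetical accuracy that depends only on subset size (illustration only)."""
    return {4: 0.90, 3: 0.95, 2: 0.95, 1: 0.80}[len(features)]


if __name__ == "__main__":
    # Hypothetical importance indices for four made-up features.
    importance = {"a": 0.9, "b": 0.5, "c": 0.1, "d": 0.3}
    print(backward_elimination(importance, toy_accuracy))
```

Preferring the smaller subset on ties is a design choice made here for parsimony; the abstract states only that the subset with maximum accuracy was used.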


Subject(s)
Breast Neoplasms/classification , Decision Support Techniques , Early Detection of Cancer/methods , Female , Humans
8.
Rev. bras. eng. biomed ; 30(4): 301-311, Oct.-Dec. 2014. ilus, graf, tab
Article in English | LILACS | ID: lil-732829

ABSTRACT

INTRODUCTION: Face recognition, one of the most explored themes in biometry, is used in a wide range of applications: access control, forensic detection, surveillance and monitoring systems, and robotic and human-machine interactions. In this paper, a new classifier is proposed for face recognition: the novelty classifier. METHODS: The performance of the novelty classifier is compared with the performance of the nearest neighbor classifier. The ORL face image database was used. Three methods were employed for characteristic extraction: principal component analysis, bi-dimensional principal component analysis with dimension reduction in one dimension and bi-dimensional principal component analysis with dimension reduction in two directions. RESULTS: In identification mode, the best recognition rate with the leave-one-out strategy is equal to 100%. In the verification mode, the best recognition rate was also 100%. For the half-half strategy, the best recognition rate in the identification mode is equal to 98.5%, and in the verification mode, 88%. CONCLUSION: For face recognition, the novelty classifier performs comparably to the best results already published in the literature, which further confirms the novelty classifier as an important pattern recognition method in biometry.
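
The leave-one-out strategy with the baseline nearest neighbor classifier can be sketched as below. The novelty classifier itself is not reproduced here, and the short feature vectors are hypothetical stand-ins for the extracted face characteristics; the function names are assumptions.

```python
import math


def nearest_neighbor_label(train, query):
    """Return the label of the training sample closest to `query` (1-NN)."""
    return min(train, key=lambda item: math.dist(item[0], query))[1]


def leave_one_out_rate(samples):
    """Recognition rate when each sample is classified using all the others."""
    correct = 0
    for i, (vector, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]  # hold out sample i
        if nearest_neighbor_label(rest, vector) == label:
            correct += 1
    return correct / len(samples)


if __name__ == "__main__":
    # Hypothetical two-dimensional feature vectors for two subjects.
    samples = [
        ((0.0, 0.0), "subject_1"), ((0.1, 0.0), "subject_1"),
        ((1.0, 1.0), "subject_2"), ((1.1, 1.0), "subject_2"),
    ]
    print(leave_one_out_rate(samples))
```

In the half-half strategy mentioned in the abstract, the data would instead be split once into a training half and a testing half, which typically leaves fewer training samples per subject than leave-one-out.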

9.
Ciênc. Saúde Colet. (Impr.) ; 19(4): 1295-1304, Apr. 2014. graf
Article in Portuguese | LILACS | ID: lil-710506

ABSTRACT



In the majority of countries, breast cancer among women is highly prevalent. If diagnosed in the early stages, there is a high probability of a cure. Several statistical-based approaches have been developed to assist in early breast cancer detection. This paper presents a method for selecting variables for the classification of cases into two classes, benign or malignant, based on cytopathological analysis of patients' breast cell samples. The variables are ranked according to a new variable importance index that combines the importance weights from Principal Component Analysis and the variance explained by each retained component. Observations from the training sample are categorized into two classes using the k-Nearest Neighbor algorithm and Discriminant Analysis, followed by elimination of the variable with the lowest importance index. The subset with the highest accuracy is used to classify observations in the test sample. When applied to the Wisconsin Breast Cancer Database, the proposed method achieved an average classification accuracy of 97.77% while retaining an average of 5.8 variables.


Subject(s)
Female , Humans , Breast Neoplasms/diagnosis , Data Mining/methods , Data Mining/statistics & numerical data , Early Detection of Cancer/methods , Early Detection of Cancer/statistics & numerical data