Results 1 - 17 of 17
2.
Comput Biol Med ; 171: 108216, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38442555

ABSTRACT

Despite being one of the most prevalent forms of cancer, prostate cancer (PCa) shows a significantly high survival rate, provided there is timely detection and treatment. Computational methods can help make this detection process considerably faster and more robust. However, some modern machine-learning approaches require accurate segmentation of the prostate gland and the index lesion. Since performing manual segmentations is a very time-consuming task, and highly prone to inter-observer variability, there is a need to develop robust semi-automatic segmentation models. In this work, we leverage the large and highly diverse ProstateNet dataset, which includes 638 whole gland and 461 lesion segmentation masks, from 3 different scanner manufacturers provided by 14 institutions, in addition to 3 other independent public datasets, to train accurate and robust segmentation models for the whole prostate gland, zones and lesions. We show that models trained on large amounts of diverse data are better at generalizing to data from other institutions and obtained with other manufacturers, outperforming models trained on single-institution single-manufacturer datasets in all segmentation tasks. Furthermore, we show that lesion segmentation models trained on ProstateNet can be reliably used as lesion detection models.


Subjects
Prostate , Prostatic Neoplasms , Male , Humans , Prostate/diagnostic imaging , Imaging, Three-Dimensional/methods , Retrospective Studies , Algorithms , Prostatic Neoplasms/diagnostic imaging , Magnetic Resonance Imaging/methods
3.
Article in English | MEDLINE | ID: mdl-37027541

ABSTRACT

Full-reference image quality measures are a fundamental tool to approximate the human visual system in various applications for digital data management: from retrieval to compression to detection of unauthorized uses. Inspired by both the effectiveness and the simplicity of hand-crafted Structural Similarity Index Measure (SSIM), in this work, we present a framework for the formulation of SSIM-like image quality measures through genetic programming. We explore different terminal sets, defined from the building blocks of structural similarity at different levels of abstraction, and we propose a two-stage genetic optimization that exploits hoist mutation to constrain the complexity of the solutions. Our optimized measures are selected through a cross-dataset validation procedure, which results in superior performance against different versions of structural similarity, measured as correlation with human mean opinion scores. We also demonstrate how, by tuning on specific datasets, it is possible to obtain solutions that are competitive with (or even outperform) more complex image quality measures.
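
For orientation, the building blocks that SSIM-like measures combine (local means, variances, and covariance) can be sketched as below. This is a minimal single-window version of the standard SSIM formula, not one of the evolved measures described in the abstract. The function name `global_ssim`, the flat pixel lists, and the default constants `k1` and `k2` are illustrative assumptions; population (divide-by-N) statistics are used for simplicity.

```python
from statistics import fmean


def global_ssim(x, y, dynamic_range=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two equal-length pixel sequences.

    Assumption: the whole image is treated as one window; practical
    implementations average this quantity over many local windows.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Small constants that stabilise the ratios when denominators are near zero.
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2

    mu_x, mu_y = fmean(x), fmean(y)
    var_x = fmean([(a - mu_x) ** 2 for a in x])
    var_y = fmean([(b - mu_y) ** 2 for b in y])
    cov_xy = fmean([(a - mu_x) * (b - mu_y) for a, b in zip(x, y)])

    # Luminance term and combined contrast/structure term.
    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast_structure = (2 * cov_xy + c2) / (var_x + var_y + c2)
    return luminance * contrast_structure


reference = [10, 20, 30, 40]
distorted = [12, 18, 33, 37]
score = global_ssim(reference, distorted)
```

A genetic-programming search of the kind the abstract describes would treat quantities such as `mu_x`, `var_x`, and `cov_xy` as terminals and evolve the expression that combines them, rather than fixing the product above.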

4.
Cancers (Basel) ; 15(5)2023 Feb 25.
Article in English | MEDLINE | ID: mdl-36900261

ABSTRACT

Prostate cancer is one of the most common forms of cancer globally, affecting roughly one in every eight men according to the American Cancer Society. Although the survival rate for prostate cancer is significantly high given the very high incidence rate, there is an urgent need to improve and develop new clinical aid systems to help detect and treat prostate cancer in a timely manner. In this retrospective study, our contributions are twofold: First, we perform a comparative unified study of different commonly used segmentation models for prostate gland and zone (peripheral and transition) segmentation. Second, we present and evaluate an additional research question regarding the effectiveness of using an object detector as a pre-processing step to aid in the segmentation process. We perform a thorough evaluation of the deep learning models on two public datasets, where one is used for cross-validation and the other as an external test set. Overall, the results reveal that the choice of model is relatively inconsequential, as the majority produce non-significantly different scores, apart from nnU-Net which consistently outperforms others, and that the models trained on data cropped by the object detector often generalize better, despite performing worse during cross-validation.

5.
PLoS One ; 16(11): e0260609, 2021.
Article in English | MEDLINE | ID: mdl-34843603

ABSTRACT

Cell counting is a frequent task in medical research studies. However, it is often performed manually; thus, it is time-consuming and prone to human error. Even so, cell counting automation can be challenging to achieve, especially when dealing with crowded scenes and overlapping cells, assuming different shapes and sizes. In this paper, we introduce a deep learning-based cell detection and quantification methodology to automate the cell counting process in the zebrafish xenograft cancer model, an innovative technique for studying tumor biology and for personalizing medicine. First, we implemented a fine-tuned architecture based on the Faster R-CNN using the Inception ResNet V2 feature extractor. Second, we performed several adjustments to optimize the process, paying attention to constraints such as the presence of overlapped cells, the high number of objects to detect, the heterogeneity of the cells' size and shape, and the small size of the data set. This method resulted in a median error of approximately 1% of the total number of cell units. These results demonstrate the potential of our novel approach for quantifying cells in poorly labeled images. Compared to traditional Faster R-CNN, our method improved the average precision from 71% to 85% on the studied data set.


Subjects
Cell Count/methods , Deep Learning , Image Processing, Computer-Assisted/methods , Neoplasms, Experimental/diagnosis , Animals , Heterografts , Humans , Neoplasm Transplantation , Neoplasms/diagnosis , Neoplasms/pathology , Neoplasms, Experimental/pathology , Zebrafish
6.
J Bus Res ; 131: 411-425, 2021 Jul.
Article in English | MEDLINE | ID: mdl-33100428

ABSTRACT

This research examines how artificial intelligence may contribute to better understanding and overcoming over-indebtedness in contexts of high poverty risk. This research uses Automated Machine Learning (AutoML) in a field database of 1654 over-indebted households to identify distinguishable clusters and to predict their risk factors. First, unsupervised machine learning using Self-Organizing Maps generated three over-indebtedness clusters: low-income (31.27%), low credit control (37.40%), and crisis-affected households (31.33%). Second, supervised machine learning with exhaustive grid search hyperparameters (32,730 predictive models) suggests that Nu-Support Vector Machine had the best accuracy in predicting families' over-indebtedness risk factors (89.5%). By proposing an AutoML approach on over-indebtedness, our research adds both theoretically and methodologically to current models of scarcity with important practical implications for business research and society. Our findings also contribute to novel ways to identify and characterize poverty risk in earlier stages, allowing customized interventions for different profiles of over-indebtedness.

7.
Plants (Basel) ; 9(2)2020 Feb 01.
Article in English | MEDLINE | ID: mdl-32024121

ABSTRACT

When a dark-adapted leaf is illuminated with saturating light, a fast polyphasic rise of fluorescence emission (Kautsky effect) is observed. The shape of the curve is dependent on the molecular organization of the photochemical apparatus, which in turn is a function of the interaction between genotype and environment. In this paper, we evaluate the potential of rapid fluorescence transients, aided by machine learning techniques, to classify plant genotypes. We present results of the application of several machine learning algorithms (k-nearest neighbors, decision trees, artificial neural networks, genetic programming) to rapid induction curves recorded in different species and cultivars of vine grown in the same environmental conditions. The phylogenetic relations between the selected Vitis species and Vitis vinifera cultivars were established with molecular markers. Both neural networks (71.8%) and genetic programming (75.3%) presented much higher global classification success rates than k-nearest neighbors (58.5%) or decision trees (51.6%), genetic programming performing slightly better than neural networks. However, compared with a random classifier (success rate = 14%), even the less successful algorithms were good at the task of classifying. The use of rapid fluorescence transients, handled by genetic programming, for rapid preliminary classification of Vitis genotypes is foreseen as feasible.

8.
Comput Methods Programs Biomed ; 185: 105160, 2020 Mar.
Article in English | MEDLINE | ID: mdl-31710983

ABSTRACT

BACKGROUND: The literature shows the effectiveness of music listening, but which factors and what types of music produce therapeutic effects, as well as how music therapists can select music, remain unclear. Here, we present a study to establish the main predictive factors of music listening's relaxation effects using machine learning methods. METHODS: Three hundred and twenty healthy participants were evenly distributed by age, education level, presence of musical training, and sex. Each of them listened to music for nine minutes (either to their preferred music or to algorithmically generated music). Relaxation levels were recorded using a visual analogue scale (VAS) before and after the listening experience. The participants were then divided into three classes: increase, decrease, or no change in relaxation. A decision tree was generated to predict the effect of music listening on relaxation. RESULTS: A decision tree with an overall accuracy of 0.79 was produced. An analysis of the structure of the decision tree yielded some inferences as to the most important factors in predicting the effect of music listening, particularly the initial relaxation level, the combination of education and musical training, age, and music listening frequency. CONCLUSIONS: The resulting decision tree and the analysis of this interpretable model make it possible to find predictive factors that influence therapeutic music listening outcomes. The strong subjectivity of therapeutic music listening suggests the use of machine learning techniques as an important and innovative approach to supporting music therapy practice.


Subjects
Machine Learning , Music Therapy , Adolescent , Adult , Aged , Child , Child, Preschool , Female , Humans , Infant , Male , Middle Aged
9.
J Comput Biol ; 25(9): 1009-1022, 2018 09.
Article in English | MEDLINE | ID: mdl-29671616

ABSTRACT

The simultaneous alignment of three or more nucleotide/amino acid sequences is known as multiple sequence alignment (MSA), a nondeterministic polynomial time (NP)-hard optimization problem. The time complexity of finding an optimal alignment rises exponentially as the number of sequences to align increases. In this work, we deal with a multiobjective version of the MSA problem wherein the goal is to simultaneously optimize the accuracy and conservation of the alignment. A parallel version of the hybrid multiobjective memetic metaheuristics for MSA is proposed. To evaluate the parallel performance of our proposal, we have selected a pool of data sets with different numbers of sequences (up to 1000 sequences) and studied its parallel performance against other well-known parallel metaheuristics published in the literature, such as MSAProbs, tree-based consistency objective function for alignment evaluation (T-Coffee), Clustal Ω, and multiple alignment using fast Fourier transform (MAFFT). The comparative study reveals that our parallel aligner obtains better results than MSAProbs, T-Coffee, Clustal Ω, and MAFFT. In addition, the parallel version is around 25 times faster than the sequential version with 32 cores, obtaining an efficiency around 80%.
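
The efficiency figure quoted above follows from the usual definition of parallel efficiency as speedup divided by the number of cores. The short sketch below reproduces that arithmetic; the function name `parallel_efficiency` is an illustrative assumption, and the inputs are the rounded values stated in the abstract rather than measured timings.

```python
def parallel_efficiency(speedup, cores):
    """Parallel efficiency = speedup / number of cores (1.0 is ideal)."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores


# Rounded values from the abstract: about 25x faster on 32 cores.
efficiency = parallel_efficiency(25, 32)
```

With these inputs the ratio is 25/32, roughly 0.78, which is consistent with the "around 80%" reported.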


Subjects
Algorithms , Computational Biology/methods , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Amino Acid Sequence , Humans
10.
IEEE Trans Cybern ; 48(1): 41-51, 2018 Jan.
Article in English | MEDLINE | ID: mdl-27831898

ABSTRACT

Multiple sequence alignment is a well-known bioinformatics problem that consists of aligning three or more biological sequences (protein or nucleic acid). In the literature, a number of tools have been proposed for dealing with this biological sequence alignment problem, such as progressive methods, consistency-based methods, or iterative methods, among others. These aligners often use a default parameter configuration for all the input sequences to align. However, the default configuration is not always the best choice: the alignment accuracy of the tool may be greatly improved if specific parameter configurations are used, depending on the biological characteristics of the input sequences. In this paper, we propose a characteristic-based framework for multiple sequence aligners. The idea of the framework is, given an input set of unaligned sequences, to extract its characteristics and run the aligner with the best parameter configuration found for another set of unaligned sequences with similar characteristics. In order to test the framework, we have used the well-known multiple sequence comparison by log-expectation (MUSCLE) v3.8 aligner with different benchmarks, such as benchmark alignments database v3.0, protein reference alignment benchmark v4.0, and sequence alignment benchmark v1.65. The results show that the alignment accuracy and conservation of MUSCLE might be greatly improved with the proposed framework, especially in those scenarios with a low percentage of identity. The characteristic-based framework for multiple sequence aligners is freely available for downloading at http://arco.unex.es/arl/fwk-msa/cbf-msa.zip.
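
The lookup step of such a framework can be sketched as a nearest-neighbour search over previously characterised sequence sets. Everything concrete here is an assumption for illustration: the two-element feature vectors (say, fraction identity and a normalised mean length), the `KNOWLEDGE_BASE` contents, and the configuration keys `gap_open` and `gap_extend` are hypothetical and are not MUSCLE options or values from the paper.

```python
import math


def nearest_configuration(features, knowledge_base):
    """Return the stored configuration whose feature vector is closest.

    knowledge_base is a list of (feature_vector, configuration) pairs;
    closeness is plain Euclidean distance (an assumption of this sketch).
    """
    if not knowledge_base:
        raise ValueError("knowledge base is empty")
    _, config = min(knowledge_base, key=lambda entry: math.dist(features, entry[0]))
    return config


# Hypothetical entries: (fraction identity, normalised mean length) -> best known settings.
KNOWLEDGE_BASE = [
    ((0.15, 0.20), {"gap_open": -3.5, "gap_extend": -0.5}),
    ((0.80, 0.60), {"gap_open": -2.0, "gap_extend": -0.1}),
]

# A new, unaligned set with low identity maps to the first stored configuration.
chosen = nearest_configuration((0.20, 0.25), KNOWLEDGE_BASE)
```

In practice the features would need to be scaled to comparable ranges before measuring distance, otherwise one characteristic dominates the choice.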

11.
J Comput Biol ; 24(11): 1144-1154, 2017 Nov.
Article in English | MEDLINE | ID: mdl-28686466

ABSTRACT

The alignment of three or more protein or nucleotide sequences is known as the Multiple Sequence Alignment problem. The complexity of this problem increases exponentially with the number of sequences; therefore, many of the current approaches published in the literature suffer a computational overhead when thousands of sequences are required to be aligned. We introduce a new approach for dealing with ultra-large sets of sequences. A two-level clustering method is considered. The first level clusters the input sequences by using their biological composition, that is, the number of positive, negative, polar, special, and hydrophobic amino acids. In the second level, each cluster is divided into different clusters according to their similarity. Then, each cluster is aligned by using any method/aligner. After aligning the centroid sequences of each second-level cluster, we extrapolate the new gaps to each cluster of sequences to obtain the final alignment. We present a study on biological data with up to ∼100,000 sequences, showing that the proposed approach is able to obtain accurate alignments in a reduced amount of time; for example, in datasets of >10,000 sequences, it is able to reduce the required runtime of the well-known Kalign by up to ∼45 times.
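
The first-level feature described above, a count of residues per biochemical class, can be sketched as follows. The class membership in `AMINO_ACID_CLASSES` follows a common textbook grouping and is an assumption; the abstract does not state which residues the authors assign to each class, and the function name `composition` is illustrative.

```python
from collections import Counter

# Assumed textbook grouping of the 20 standard amino acids (one-letter codes).
AMINO_ACID_CLASSES = {
    "positive": "RHK",
    "negative": "DE",
    "polar": "STNQ",
    "special": "CGP",
    "hydrophobic": "AVILMFYW",
}


def composition(sequence):
    """Count residues per class, in the order of AMINO_ACID_CLASSES.

    Characters outside the 20 standard residues (gaps, X, ...) are ignored.
    """
    counts = Counter(sequence.upper())
    return tuple(
        sum(counts[residue] for residue in residues)
        for residues in AMINO_ACID_CLASSES.values()
    )


# (positive, negative, polar, special, hydrophobic)
vector = composition("MKDESTCGA")
```

Vectors like this one could then be fed to any standard clustering routine to form the first-level groups before similarity-based splitting.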


Subjects
Algorithms , Computational Biology/methods , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Sequence Analysis, Protein/methods , Humans
12.
Comput Intell Neurosci ; 2016: 8326760, 2016.
Article in English | MEDLINE | ID: mdl-27057158

ABSTRACT

In 2012, Moraglio and coauthors introduced new genetic operators for Genetic Programming, called geometric semantic genetic operators. They have the very interesting advantage of inducing a unimodal error surface for any supervised learning problem. At the same time, they have the important drawback of generating very large data models that are usually very hard to understand and interpret. The objective of this work is to alleviate this drawback while still maintaining the advantage. More specifically, we propose an elitist version of geometric semantic operators, in which offspring are accepted in the new population only if they have better fitness than their parents. We present experimental evidence, on five complex real-life test problems, that this simple idea allows us to obtain results of a comparable quality (in terms of fitness), but with much smaller data models, compared to the standard geometric semantic operators. In the final part of the paper, we also explain the reason why we consider this a significant improvement, showing that the proposed elitist operators generate manageable models, while the models generated by the standard operators are so large in size that they can be considered unmanageable.
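
The elitist acceptance rule can be illustrated with a deliberately simplified sketch that works on semantics vectors (the outputs of a program on the training cases) instead of on program trees. The names `rmse` and `elitist_semantic_mutation`, the mutation step, and the use of uniform random vectors in place of the outputs of two random programs are all assumptions of this sketch, not details taken from the paper.

```python
import random


def rmse(semantics, targets):
    """Root mean squared error between program outputs and targets."""
    return (sum((s - t) ** 2 for s, t in zip(semantics, targets)) / len(targets)) ** 0.5


def elitist_semantic_mutation(parent, targets, step=0.1, rng=random):
    """Geometric semantic mutation with elitist acceptance.

    The child is parent + step * (r1 - r2), where r1 and r2 stand in for
    the outputs of two random programs bounded in [0, 1]. The child
    replaces the parent only if its error is strictly lower.
    """
    r1 = [rng.random() for _ in parent]
    r2 = [rng.random() for _ in parent]
    child = [p + step * (a - b) for p, a, b in zip(parent, r1, r2)]
    return child if rmse(child, targets) < rmse(parent, targets) else parent


rng = random.Random(0)
targets = [1.0, 2.0, 3.0]
individual = [0.0, 0.0, 0.0]
for _ in range(200):
    individual = elitist_semantic_mutation(individual, targets, rng=rng)
```

In a tree-based implementation every accepted mutation enlarges the individual, so rejecting non-improving offspring is what keeps model size down; the vector version above only shows the acceptance logic.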


Subjects
Algorithms , Computer Simulation , Machine Learning
13.
Comput Intell Neurosci ; 2015: 971908, 2015.
Article in English | MEDLINE | ID: mdl-26106410

ABSTRACT

Energy consumption forecasting (ECF) is an important policy issue in today's economies. An accurate ECF has great benefits for electric utilities, and both negative and positive errors lead to increased operating costs. The paper proposes a semantic-based genetic programming framework to address the ECF problem. In particular, we propose a system that finds (quasi-)perfect solutions with high probability and that generates models able to produce near-optimal predictions on unseen data as well. The framework blends a recently developed version of genetic programming that integrates semantic genetic operators with a local search method. The main idea in combining semantic genetic programming and a local searcher is to couple the exploration ability of the former with the exploitation ability of the latter. Experimental results confirm the suitability of the proposed method in predicting energy consumption. In particular, the system produces a lower error than the existing state-of-the-art techniques used on the same dataset. More importantly, this case study has shown that including a local searcher in the geometric semantic genetic programming system can speed up the search process and can result in fitter models that are able to produce accurate forecasts on unseen data as well.


Subjects
Computer Simulation , Forecasting , Models, Theoretical , Resource Allocation/trends , Software , Humans , Italy , Resource Allocation/statistics & numerical data , Retrospective Studies
14.
IEEE Trans Cybern ; 44(1): 103-13, 2014 Jan.
Article in English | MEDLINE | ID: mdl-23757595

ABSTRACT

The concept of semantics (in the sense of input-output behavior of solutions on training data) has been the subject of a noteworthy interest in the genetic programming (GP) research community over the past few years. In this paper, we present a new GP system that uses the concept of semantics to improve search effectiveness. It maintains a distribution of different semantic behaviors and biases the search toward solutions that have similar semantics to the best solutions that have been found so far. We present experimental evidence of the fact that the new semantics-based GP system outperforms the standard GP and the well-known bacterial GP on a set of test functions, showing particularly interesting results for noncontinuous (i.e., generally harder to optimize) test functions. We also observe that the solutions generated by the proposed GP system often have a larger size than the ones returned by standard GP and bacterial GP and contain an elevated number of introns, i.e., parts of code that do not have any effect on the semantics. Nevertheless, we show that the deletion of introns during the evolution does not affect the performance of the proposed method.

15.
Int J Data Min Bioinform ; 6(6): 585-601, 2012.
Article in English | MEDLINE | ID: mdl-23356009

ABSTRACT

Being able to predict the human oral bioavailability for a potential new drug is extremely important for the drug discovery process. This problem has been addressed by several prediction tools, with Genetic Programming providing some of the best results ever achieved. In this paper we use the newest developments of Genetic Programming, in particular the latest bloat control method, Operator Equalisation, to find out how much improvement we can achieve on this problem. We show examples of some actual solutions and discuss their quality, comparing them with previously published results. We identify some unexpected behaviours related to overfitting, and discuss the way for further improving the practical usage of the Genetic Programming approach.


Subjects
Algorithms , Biological Availability , Genetics, Population , Pharmaceutical Preparations , Humans , Linear Models , Models, Genetic
16.
BioData Min ; 4: 12, 2011 May 11.
Article in English | MEDLINE | ID: mdl-21569330

ABSTRACT

BACKGROUND: The ability to accurately classify cancer patients into risk classes, i.e. to predict the outcome of the pathology on an individual basis, is a key ingredient in making therapeutic decisions. In recent years gene expression data have been successfully used to complement the clinical and histological criteria traditionally used in such prediction. Many "gene expression signatures" have been developed, i.e. sets of genes whose expression values in a tumor can be used to predict the outcome of the pathology. Here we investigate the use of several machine learning techniques to classify breast cancer patients using one such signature, the well-established 70-gene signature. RESULTS: We show that Genetic Programming performs significantly better than Support Vector Machines, Multilayered Perceptrons and Random Forests in classifying patients from the NKI breast cancer dataset, and comparably to the scoring-based method originally proposed by the authors of the 70-gene signature. Furthermore, Genetic Programming is able to perform an automatic feature selection. CONCLUSIONS: Since the performance of Genetic Programming is likely to be improvable compared to the out-of-the-box approach used here, and given the biological insight potentially provided by the Genetic Programming solutions, we conclude that Genetic Programming methods are worth further investigation as a tool for cancer patient classification based on gene expression data.

17.
Evol Comput ; 13(2): 213-39, 2005.
Article in English | MEDLINE | ID: mdl-15969901

ABSTRACT

We present an approach to genetic programming difficulty based on a statistical study of program fitness landscapes. The fitness distance correlation is used as an indicator of problem hardness and we empirically show that such a statistic is adequate in nearly all cases studied here. However, fitness distance correlation has some known problems and these are investigated by constructing an artificial landscape for which the correlation gives contradictory indications. Although our results confirm the usefulness of fitness distance correlation, we point out its shortcomings and give some hints for improvement in assessing problem hardness in genetic programming.
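
Fitness distance correlation is the Pearson correlation between the fitness of sampled individuals and their distance to the nearest global optimum. A minimal sketch of that computation follows; the function name and the toy sample are illustrative assumptions, and the distance measure between programs (which the abstract does not specify) is taken as given.

```python
from statistics import fmean


def fitness_distance_correlation(fitnesses, distances):
    """Pearson correlation between fitness values and distances to the optimum."""
    n = len(fitnesses)
    if n != len(distances) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")

    mean_f, mean_d = fmean(fitnesses), fmean(distances)
    cov = sum((f - mean_f) * (d - mean_d) for f, d in zip(fitnesses, distances)) / n
    std_f = (sum((f - mean_f) ** 2 for f in fitnesses) / n) ** 0.5
    std_d = (sum((d - mean_d) ** 2 for d in distances) / n) ** 0.5
    if std_f == 0 or std_d == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (std_f * std_d)


# Toy sample: error (to be minimised) grows linearly with distance to the optimum.
errors = [1.0, 2.0, 3.0, 4.0]
distances = [2.0, 4.0, 6.0, 8.0]
fdc = fitness_distance_correlation(errors, distances)
```

How the coefficient is read depends on whether fitness is minimised or maximised: when lower values are better, a coefficient near +1 means error shrinks as individuals approach the optimum, which is the easy case.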


Subjects
Computational Biology/methods , Algorithms , Animals , Biological Evolution , Computer Simulation , Genetics, Population , Humans , Mathematical Computing , Mathematics , Models, Biological , Models, Genetic , Models, Statistical , Mutation , Selection, Genetic , Software