Results 1 - 20 of 44
1.
Phys Med Biol ; 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38981590

ABSTRACT

Objective: Vital rules learned from FDG-PET radiomics of tumor subregional response can provide clinical decision support for precise treatment adaptation. We combined a rule-based machine learning (ML) model (RuleFit) with a heuristic algorithm (Gray Wolf Optimizer, GWO) for mid-chemoradiation FDG-PET response prediction in patients with locally advanced non-small cell lung cancer. Approach: Tumor subregions were identified using K-means clustering. GWO+RuleFit consists of three main parts: (i) a random forest is constructed based on conventional features or radiomic features extracted from tumor regions or subregions in FDG-PET images, from which the initial rules are generated; (ii) GWO is used for iterative rule selection; (iii) the selected rules are fit to a linear model to make predictions about the target variable. Two target variables were considered: a binary response measure (∆SUVmean⩾20% decline) for classification and a continuous response measure (∆SUVmean) for regression. GWO+RuleFit was benchmarked against common ML algorithms and RuleFit, with leave-one-out cross-validated performance evaluated by the area under the receiver operating characteristic curve (AUC) in classification and root-mean-square error (RMSE) in regression. Main results: GWO+RuleFit selected 15 rules from the radiomic feature dataset of 23 patients. For treatment response classification, GWO+RuleFit attained numerically better cross-validated performance than RuleFit across tumor regions and sets of features (AUC: 0.58-0.86 vs. 0.52-0.78, p=0.170-0.925). GWO+RuleFit also had the best or second-best performance numerically compared to all other algorithms for all conditions. For treatment response regression prediction, GWO+RuleFit (RMSE: 0.162-0.192) performed better numerically for low-dimensional models (p=0.097-0.614) and significantly better for high-dimensional models across all tumor regions except one (RMSE: 0.189-0.219, p<0.004). Significance: The rules selected by GWO+RuleFit were interpretable, highlighting distinct radiomic phenotypes that modulated treatment response. GWO+RuleFit achieved parsimonious models while maintaining utility for treatment response prediction, which can aid clinical decisions for patient risk stratification, treatment selection, and biologically driven adaptation. Clinical trial: NCT02773238.
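The two evaluation metrics named in this abstract (AUC for classification, RMSE for regression) can be sketched in a few lines. This is an illustrative Python sketch, not the authors' code: the function names `auc` and `rmse` and the toy inputs are assumptions, and the AUC uses the pairwise (Mann-Whitney) definition rather than any particular library routine.

```python
import math


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Counts how often a positive case (label 1) receives a higher score
    than a negative case (label 0); ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rmse(predicted, observed):
    """Root-mean-square error between predicted and observed values."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Toy inputs (hypothetical, not data from the study).
print(auc([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]))
print(rmse([0.1, 0.3], [0.2, 0.1]))
```

In a leave-one-out setting, each held-out patient contributes one score or one predicted value, and the metric is computed once over the pooled predictions.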

2.
Accid Anal Prev ; 205: 107681, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38897142

ABSTRACT

Lane change behavior disrupts traffic flow and increases the potential for traffic conflicts, especially on expressway weaving segments. Focusing on the diversion process, this study incorporates individual driving patterns into conflict prediction and causation analysis, which can help develop individualized intervention measures to avoid risky diversion behaviors. First, to minimize measurement errors, this study introduces a lane line reconstruction method. Second, several unsupervised clustering methods, including k-means, agglomerative clustering, Gaussian mixture, and spectral clustering, are applied to explore diversion patterns. Moreover, machine learning methods, including Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), Attention-based LSTM, eXtreme Gradient Boosting (XGB), Support Vector Machine (SVM), and Multilayer Perceptron (MLP), are employed for real-time traffic conflict prediction. Finally, mixed logit models are developed using pre-conflict condition data to investigate the causal mechanisms of traffic conflicts. The results indicate that the K-means algorithm with four clusters exhibits the highest Calinski-Harabasz and Silhouette scores and the lowest Davies-Bouldin score. With superior classification accuracy and generalization ability, the LSTM is used to develop the personalized traffic conflict prediction model. Sensitivity analysis indicates that incorporating the diversion patterns into the LSTM model results in an improvement of 3.64% in Accuracy, 7.15% in Precision, and 1.34% in Recall. Results from the four mixed logit models indicate significant differences in factors contributing to traffic conflicts within each diversion pattern. For instance, increasing the speed difference between the target vehicle and the right preceding vehicle increases the likelihood of traffic conflicts during acceleration diversions but decreases it during deceleration diversions. These results can help traffic engineers propose individualized solutions to reduce unsafe diversion behavior.
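Of the three cluster-validity indices mentioned above, the Silhouette score is the easiest to compute by hand. The sketch below is a plain-Python illustration under stated assumptions (Euclidean distance, points given as tuples, a hypothetical function name `silhouette_score`); it is not the pipeline used in the study.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # convention: a singleton cluster contributes s = 0
        # a: mean distance to the other members of the point's own cluster
        # (the distance from p to itself is 0, so it does not affect the sum)
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Two well-separated pairs: the natural grouping scores far higher
# than a grouping that mixes the pairs.
pts = [(0, 0), (0, 1), (10, 0), (10, 1)]
print(silhouette_score(pts, [0, 0, 1, 1]))
print(silhouette_score(pts, [0, 1, 0, 1]))
```

Values near 1 indicate compact, well-separated clusters; values below 0 indicate points that sit closer to another cluster than to their own. Comparing the score across candidate cluster counts is one way to pick the number of clusters.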


Subjects
Automobile Driving; Humans; Neural Networks, Computer; Machine Learning; Cluster Analysis; Algorithms; Environment Design; Support Vector Machine; Accidents, Traffic/prevention & control; Logistic Models
3.
Sci Rep ; 14(1): 12003, 2024 May 25.
Article in English | MEDLINE | ID: mdl-38796483

ABSTRACT

Online channels have affected many facets of individual identity, commerce, social policy, and culture, among others. It is therefore critical to discover the topics on which short online texts focus and to examine the qualities of these texts. Another key issue is the evaluation of newly discovered topics in terms of topic quality, which includes topic separation and coherence. Topic modeling has been shown to be an effective aid in the linguistic interpretation of very short texts. Based on the underlying strategy, topic models are divided into two categories: probabilistic methods and non-probabilistic methods. In this research, short texts are analyzed using topic models, including latent Dirichlet allocation (LDA) for probabilistic topic modeling and non-negative matrix factorization (NMF) for non-probabilistic topic modeling. A novel approach to topic evaluation, based on clustering methods and silhouette analysis, is applied to both models to investigate their performance in terms of topic quality. The experimental results indicate that the proposed evaluation method performs well on both LDA and NMF.

4.
Pak J Biol Sci ; 27(1): 35-45, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38413396

ABSTRACT

Background and Objective: Considering that the potential of okra as an anti-diabetic is very high, while okra productivity in Indonesia is still low, a plant breeding program through variety development is needed. One of the initial activities that needs to be carried out is the characterization of various genotypes for both quantitative and qualitative characters. This research aimed to obtain information on the diversity of morpho-agronomic characters in okra genotypes. Materials and Methods: The experiment was conducted as a randomized block design with genotype as the single factor and three replications. The materials used in this research were 20 okra genotypes, giving 60 experimental units. Each experimental unit consisted of 10 sample plants. Variation in quantitative characters was analyzed with PKBT-STAT 3.1. Cluster analysis was carried out with PBSTAT-CL 2.1.2 using Gower dissimilarity and average linkage clustering. Further analysis was carried out using SAS OnDemand for Academics to identify the characteristics that distinguish the clusters. Results: The okra genotypes differed in both qualitative and quantitative characteristics. The most diverse quantitative character was the yield component, namely the fruit character. Genetic variance and heritability met the broad and high criteria, respectively. Based on the cluster analysis, the okra genotypes were grouped into 3 clusters at a cophenetic distance of 0.40. Cluster 1 consists of 9 genotypes. Cluster 2 consists of 10 genotypes. Cluster 3 consists of 1 genotype, the Red Hill Country genotype. The grouping in the cluster analysis was based on leaf width, number of fruits, fruit weight, fruit diameter, and carpel thickness. Conclusion: This diversity of okra germplasm can facilitate future plant breeding activities by allowing genotypes to be selected as parents according to the breeding objectives.
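The combination of Gower dissimilarity and average linkage named in the methods can be illustrated with a small pure-Python sketch. It is not the PBSTAT-CL implementation: the function names `gower_matrix` and `average_linkage`, the equal weighting of columns, and the toy records (a fruit weight and a fruit colour) are all assumptions made for illustration.

```python
def gower_matrix(rows, numeric):
    """Pairwise Gower dissimilarity for mixed records.

    rows    : list of equal-length records (tuples or lists)
    numeric : list of booleans, True where the column is quantitative
    """
    n_cols = len(numeric)
    # Range of each quantitative column, used to scale differences to [0, 1].
    ranges = []
    for j in range(n_cols):
        if numeric[j]:
            col = [r[j] for r in rows]
            ranges.append(max(col) - min(col))
        else:
            ranges.append(None)

    def dissim(a, b):
        total = 0.0
        for j in range(n_cols):
            if numeric[j]:
                total += abs(a[j] - b[j]) / ranges[j] if ranges[j] else 0.0
            else:
                total += 0.0 if a[j] == b[j] else 1.0  # simple mismatch
        return total / n_cols

    return [[dissim(a, b) for b in rows] for a in rows]


def average_linkage(dist, n_clusters):
    """Merge the closest clusters until n_clusters remain.

    Returns a list of clusters, each a list of row indices.
    """
    clusters = [[i] for i in range(len(dist))]

    def between(c1, c2):
        # Average linkage: mean of all pairwise dissimilarities across clusters.
        return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = [
            (between(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        _, i, j = min(pairs)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical records: (fruit weight, fruit colour).
rows = [(5.0, "green"), (5.5, "green"), (9.0, "red"), (9.5, "red"), (20.0, "red")]
d = gower_matrix(rows, [True, False])
print(average_linkage(d, 3))
```

With these toy records the last row stays in a cluster of its own, which shows how a single distinctive genotype can form a one-member cluster under this kind of analysis.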


Subjects
Abelmoschus; Abelmoschus/genetics; Plant Breeding; Fruit; Genotype; Indonesia
5.
Stud Health Technol Inform ; 310: 1261-1265, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38270017

ABSTRACT

With the growing popularity of content-sharing platforms, patients are increasingly using the Internet as a critical source of health information. As one of the most popular video-sharing sites, YouTube provides easy access to health information seekers, but it is difficult and time-consuming to identify and retrieve high-quality videos that may serve as engaging patient education materials. This paper reports on an exploratory analysis of 317 YouTube videos on Obstructive Sleep Apnea (OSA) to better understand some key features of the videos and the relationships between them to facilitate subsequent video classification and recommendation. Features intrinsic to a video, such as video duration, and extrinsic, such as the number of views, are analyzed using unsupervised clustering methods and the Sankey diagram to discover the relationship between the clusters and their significance across different clusters, providing promising insights for the assessment of video quality.


Subjects
Sleep Apnea, Obstructive; Social Media; Humans; Patient Education as Topic; Cluster Analysis; Internet; Sleep Apnea, Obstructive/diagnosis
6.
Cancers (Basel) ; 15(4)2023 Feb 10.
Article in English | MEDLINE | ID: mdl-36831474

ABSTRACT

In recent years, breast cancer detection has become an important focus of medical image processing and analysis. Detecting the disease at an early stage is an important factor in moving on to the next level of treatment, and accuracy plays an important role in detection. In this paper, COA-T2FCM (Chimp Optimization Algorithm Based Type-2 Intuitionistic Fuzzy C-Means Clustering) is constructed to detect such malignancy with the highest accuracy. The proposed detection process combines type-2 intuitionistic fuzzy c-means clustering with an oppositional function. Within the type-2 intuitionistic fuzzy c-means clustering, an efficient cluster center is selected using the chimp optimization algorithm. Initially, the objective function of the type-2 intuitionistic fuzzy c-means clustering is considered. The chimp optimization algorithm is then used to optimize the cluster center and the fuzzifier in the clustering method. The proposed technique is implemented, and performance metrics such as specificity, sensitivity, accuracy, Jaccard Similarity Index (JSI), and Dice Similarity Coefficient (DSC) are assessed. The proposed technique is compared with conventional techniques such as fuzzy c-means clustering and k-means clustering. It was also compared with existing methods to confirm its accuracy. The proposed algorithm is tested for its effectiveness on mammogram images from three different datasets: the Mini-Mammographic Image Analysis Society (Mini-MIAS), the Digital Database for Screening Mammography (DDSM), and INbreast. The accuracy and Jaccard index scores are used to measure the similarity between the proposed output and the actual cancer-affected regions of the image considered. On average, the proposed method achieved an accuracy of 97.29% and JSI of 95.
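The two overlap metrics named in this abstract, JSI and DSC, compare a predicted segmentation mask with a reference mask. The following Python sketch assumes both masks are flat sequences of 0/1 values of equal length; the function name `jaccard_and_dice` and the toy masks are illustrative and not taken from the paper.

```python
def jaccard_and_dice(pred, truth):
    """Return (JSI, DSC) for two equally sized binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    inter = sum(1 for p, t in zip(pred, truth) if p and t)  # |A ∩ B|
    n_pred = sum(1 for p in pred if p)                      # |A|
    n_truth = sum(1 for t in truth if t)                    # |B|
    union = n_pred + n_truth - inter                        # |A ∪ B|
    if union == 0:
        return 1.0, 1.0  # both masks empty: treated here as perfect agreement
    jsi = inter / union
    dsc = 2 * inter / (n_pred + n_truth)
    return jsi, dsc


# Toy masks (hypothetical): 2 overlapping pixels, 4 pixels in the union.
jsi, dsc = jaccard_and_dice([1, 1, 1, 0, 0, 0], [0, 1, 1, 1, 0, 0])
print(jsi, dsc)
```

The two metrics are monotonically related (DSC = 2·JSI / (1 + JSI)), so they rank segmentations identically; DSC is always at least as large as JSI.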

7.
Front Bioinform ; 3: 1073918, 2023.
Article in English | MEDLINE | ID: mdl-36819479

ABSTRACT

Feather growth patterns are important anatomical phenotypes for investigating the underlying genomic regulation of skin and epidermal appendage development. However, characterization of feather growth patterns previously relied on manual examination and visual inspection, which are both subjective and practically prohibitive for large sample sizes. Here, we report a new high-throughput technique to quantify the location and spatial extent of reversed feathers that comprise head crests in domestic pigeons. Phenotypic variation in pigeon feather growth patterns was rendered by computed tomography (CT) scans as point clouds. We then developed machine learning-based feature extraction techniques to isolate the feathers and map the growth patterns on the skin in a quantitative, automated, and non-invasive way. Results from five test animals were in excellent agreement with "ground truth" results obtained via visual inspection, which demonstrates the viability of this method for quantification of feather growth patterns. Our findings underscore the potential and increasingly indispensable role of modern computer vision and machine learning techniques at the interface of organismal biology and genetics.

8.
Aphasiology ; 36(12): 1492-1519, 2022.
Article in English | MEDLINE | ID: mdl-36457942

ABSTRACT

Background: Large shared databases and automated language analyses allow for the application of new data analysis techniques that can shed new light on the connected speech of people with aphasia (PWA). Aims: To identify coherent clusters of PWA based on language output using unsupervised statistical algorithms and to identify features that are most strongly associated with those clusters. Methods & Procedures: Clustering and classification methods were applied to language production data from 168 PWA. Language samples were from a standard discourse protocol tapping four genres: free speech personal narratives, picture descriptions, Cinderella storytelling, procedural discourse. Outcomes & Results: Seven distinct clusters of PWA were identified by the K-means algorithm. Using the random forests algorithm, a classification tree was proposed and validated, showing 91% agreement with the cluster assignments. This representative tree used only two variables to divide the data into distinct groups: total words from free speech tasks and total closed class words from the Cinderella storytelling task. Conclusion: Connected speech data can be used to distinguish PWA into coherent groups, providing insight into traditional aphasia classifications, factors that may guide discourse research and clinical work.

9.
Clin Epidemiol ; 14: 1229-1240, 2022.
Article in English | MEDLINE | ID: mdl-36325201

ABSTRACT

Purpose: Preeclampsia is a leading cause of maternal morbidity and mortality. Calcium-based antacids and proton pump inhibitors (PPIs) are commonly used during pregnancy to treat symptoms of gastroesophageal reflux disease. Both have been hypothesized to reduce the risk of preeclampsia. We determined associations of calcium-based antacid and PPI use during pregnancy with late-onset preeclampsia (≥34 weeks of gestation), taking into account dosage and timing of use. Patients and Methods: We included 9058 pregnant women participating in the PRIDE Study (2012-2019) or The Dutch Pregnancy Drug Register (2014-2019), two prospective cohorts in The Netherlands. Data were collected through web-based questionnaires and obstetric records. We estimated risk ratios (RRs) for late-onset preeclampsia for any use and trajectories of calcium-based antacid and PPI use before gestational day 238, and hazard ratios (HRs) for time-varying exposures after gestational day 237. Results: Late-onset preeclampsia was diagnosed in 2.6% of pregnancies. Any use of calcium-based antacids (RR 1.2 [95% CI 0.9-1.6]) or PPIs (RR 1.4 [95% CI 0.8-2.4]) before gestational day 238 was not associated with late-onset preeclampsia. Use of low-dose calcium-based antacids in gestational weeks 0-16 (<1 g/day; RR 1.8 [95% CI 1.1-2.9]) and any use of PPIs in gestational weeks 17-33 (RR 1.6 [95% CI 1.0-2.8]) seemed to increase risks of late-onset preeclampsia. We did not observe associations between late-onset preeclampsia and use of calcium-based antacids (HR 1.0 [95% CI 0.6-1.5]) and PPIs (HR 1.4 [95% CI 0.7-2.9]) after gestational day 237. Conclusion: In this prospective cohort study, use of calcium-based antacids and PPIs during pregnancy was not found to reduce the risk of late-onset preeclampsia.

10.
J Clin Epidemiol ; 152: 164-175, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36228971

ABSTRACT

BACKGROUND AND OBJECTIVES: To investigate the reproducibility and validity of latent class analysis (LCA), hierarchical cluster analysis (HCA), multiple correspondence analysis followed by k-means (MCA-kmeans), and k-means (kmeans) for multimorbidity clustering. METHODS: We first investigated clustering algorithms in simulated datasets with 26 diseases of varying prevalence in predetermined clusters, comparing the derived clusters to known clusters using the adjusted Rand Index (aRI). We then investigated the medical records of male patients, aged 65 to 84 years from 50 UK general practices, with 49 long-term health conditions. We compared within-cluster morbidity profiles using the Pearson correlation coefficient and assessed cluster stability using 400 bootstrap samples. RESULTS: In the simulated datasets, the closest agreement (largest aRI) to known clusters was with LCA and then MCA-kmeans algorithms. In the medical records dataset, all four algorithms identified one cluster of 20-25% of the dataset with about 82% of the same patients across all four algorithms. LCA and MCA-kmeans both found a second cluster of 7% of the dataset. Other clusters were found by only one algorithm. LCA and MCA-kmeans clustering gave the most similar partitioning (aRI 0.54). CONCLUSION: LCA achieved higher aRI than other clustering algorithms.
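The adjusted Rand Index used here to compare derived and known clusters can be computed directly from the contingency table of two labelings. The Python sketch below follows the standard pair-counting formula; the function name `adjusted_rand_index` and the toy labelings are illustrative assumptions, not material from the study.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two labelings of the same items.

    1.0 means identical partitions (label names do not matter);
    values near 0 mean agreement no better than chance.
    """
    n = len(labels_a)
    if n != len(labels_b) or n < 2:
        raise ValueError("need two labelings of the same length (at least 2 items)")
    cells = Counter(zip(labels_a, labels_b))  # contingency table counts
    rows = Counter(labels_a)                  # cluster sizes in labeling A
    cols = Counter(labels_b)                  # cluster sizes in labeling B

    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / comb(n, 2)  # agreement expected by chance
    maximum = (sum_rows + sum_cols) / 2
    if maximum == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (sum_cells - expected) / (maximum - expected)


# Same partition under different label names, then a crossed partition.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
print(adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1]))
```

Because the index is corrected for chance, it can be negative when two partitions agree less than random labelings would, as in the second call.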


Subjects
Algorithms; Multimorbidity; Humans; Male; Latent Class Analysis; Reproducibility of Results; Cluster Analysis
11.
Animals (Basel) ; 12(16)2022 Aug 10.
Article in English | MEDLINE | ID: mdl-36009611

ABSTRACT

Unsupervised clustering algorithms are widely used in ecology and conservation to classify animal sounds, but also offer several advantages in basic bioacoustics research. Consequently, it is important to overcome the existing challenges. A common practice is extracting the acoustic features of vocalizations one-dimensionally, only extracting an average value for a given feature for the entire vocalization. With frequency-modulated vocalizations, whose acoustic features can change over time, this can lead to insufficient characterization. Whether the necessary parameters have been set correctly and the obtained clustering result reliably classifies the vocalizations subsequently often remains unclear. The presented software, CASE, is intended to overcome these challenges. Established and new unsupervised clustering methods (community detection, affinity propagation, HDBSCAN, and fuzzy clustering) are tested in combination with various classifiers (k-nearest neighbor, dynamic time-warping, and cross-correlation) using differently transformed animal vocalizations. These methods are compared with predefined clusters to determine their strengths and weaknesses. In addition, a multidimensional data transformation procedure is presented that better represents the course of multiple acoustic features. The results suggest that, especially with frequency-modulated vocalizations, clustering is more applicable with multidimensional feature extraction compared with one-dimensional feature extraction. The characterization and clustering of vocalizations in multidimensional space offer great potential for future bioacoustic studies. The software CASE includes the developed method of multidimensional feature extraction, as well as all used clustering methods. It allows quickly applying several clustering algorithms to one data set to compare their results and to verify their reliability based on their consistency. 
Moreover, the software CASE determines the optimal values of most of the necessary parameters automatically. To take advantage of these benefits, the software CASE is provided for free download.

12.
Comput Struct Biotechnol J ; 20: 3718-3728, 2022.
Article in English | MEDLINE | ID: mdl-35891790

ABSTRACT

Human cancer arises from a population of cells that have acquired a wide range of genetic alterations, most of which are targets of therapeutic treatments or are used as prognostic factors for patient risk stratification. Among these, copy number alterations (CNAs) are quite frequent. Currently, several molecular biology technologies, such as microarrays, NGS, and single-cell approaches, are used to define the genomic profile of tumor samples. Output data need to be analyzed with bioinformatic approaches, particularly computational algorithms. Molecular biology tools estimate the baseline region by comparing either the mean probe signals or the number of reads to the reference genome. However, when tumors display complex karyotypes, this type of approach can misestimate the baseline region and consequently cause errors in the CNA calls. To overcome this issue, we designed an R package, BoBafit, able to check and, if needed, adjust the baseline region according to both the tumor-specific alterations' context and the sample-specific clustered genomic lesions. Several databases were chosen to set up and validate the package, demonstrating the potential of BoBafit to adjust copy number (CN) data from different tumors and analysis techniques. Notably, the analysis highlighted that up to 25% of samples need a baseline region adjustment and a redefinition of CNA calls, which changes the prognostic risk classification of the patients. We support the implementation of BoBafit within CN analysis bioinformatics pipelines to ensure correct stratification of patients into risk categories, regardless of the tumor type.

13.
Multimed Tools Appl ; 81(24): 35001-35026, 2022.
Article in English | MEDLINE | ID: mdl-33584121

ABSTRACT

Image segmentation is an essential phase of computer vision in which useful information is extracted from an image, with applications that range from finding objects while moving across a room to detecting abnormalities in a medical image. As image pixels are generally unlabelled, the commonly used approach to segmentation is clustering. This paper reviews various existing clustering-based image segmentation methods. Two main families of clustering methods are surveyed, namely hierarchical and partitional clustering. As partitional clustering is computationally more efficient, further study is devoted to methods belonging to this class. The literature divides partitional clustering methods into three categories, namely K-means based methods, histogram-based methods, and meta-heuristic based methods. A survey of performance parameters for the quantitative evaluation of segmentation results is also included. Finally, the publicly available benchmark datasets for image segmentation are briefly described.
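As a concrete instance of the K-means based category, the sketch below segments a tiny grayscale image by clustering its pixel intensities with Lloyd's algorithm. It is a minimal illustration under stated assumptions: the image is a nested list of intensities, the function name `kmeans_1d` is hypothetical, and the evenly spaced starting centres are a deterministic choice made here for reproducibility rather than a method taken from the review.

```python
def kmeans_1d(values, k, iters=100):
    """Lloyd's algorithm on scalar values; returns (labels, centres)."""
    lo, hi = min(values), max(values)
    # Deterministic start: k centres spread evenly over the intensity range.
    centres = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value goes to its nearest centre.
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centres.append(sum(members) / len(members) if members else centres[c])
        if new_centres == centres:
            break  # converged: assignments can no longer change
        centres = new_centres
    return labels, centres


# Toy 3x3 grayscale image (hypothetical): dark background, bright object.
image = [
    [12, 10, 200],
    [11, 198, 205],
    [9, 202, 199],
]
width = len(image[0])
pixels = [v for row in image for v in row]
labels, centres = kmeans_1d(pixels, k=2)
# Reshape the flat label list back into the image grid.
segmented = [labels[r * width:(r + 1) * width] for r in range(len(image))]
for row in segmented:
    print(row)
```

Real pipelines usually cluster richer per-pixel features (colour channels, texture, position) and use randomized or k-means++ initialisation with several restarts; the one-dimensional case is used here only to keep the assignment and update steps visible.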

14.
Mol Ecol Resour ; 22(3): 1135-1148, 2022 Apr.
Article in English | MEDLINE | ID: mdl-34597471

ABSTRACT

The software program STRUCTURE is one of the most cited tools for determining population structure. To infer the optimal number of clusters from STRUCTURE output, the ΔK method is often applied. However, a recent study relying on simulated microsatellite data suggested that this method has a downward bias in its estimation of K and is sensitive to uneven sampling. If this finding holds for empirical data sets, conclusions about the scale of gene flow may have to be revised for a large number of studies. To determine the impact of method choice, we applied recently described estimators of K to re-estimate genetic structure in 41 empirical microsatellite data sets; 15 from a broad range of taxa and 26 from one phylogenetic group, coral. We compared alternative estimates of K (Puechmaille statistics) with traditional (ΔK and posterior probability) estimates and found widespread disagreement of estimators across data sets. Thus, one estimator alone is insufficient for determining the optimal number of clusters; this was regardless of study organism or evenness of sampling scheme. Subsequent analysis of molecular variance (AMOVA) did not necessarily clarify which clustering solution was best. To better infer population structure, we suggest a combination of visual inspection of STRUCTURE plots and calculation of the alternative estimators at various thresholds in addition to ΔK. Disagreement between traditional and recent estimators may have important biological implications, such as previously unrecognized population structure, as was the case for many studies reanalysed here.


Subjects
Genetics, Population; Microsatellite Repeats; Bayes Theorem; Cluster Analysis; Phylogeny
15.
Epidemiol Rev ; 43(1): 130-146, 2022 Jan 14.
Article in English | MEDLINE | ID: mdl-34100086

ABSTRACT

In many perinatal pharmacoepidemiologic studies, exposure to a medication is classified as "ever exposed" versus "never exposed" within each trimester or even over the entire pregnancy. This approach is often far from real-world exposure patterns, may lead to exposure misclassification, and does not incorporate important aspects such as dosage, timing of exposure, and treatment duration. Alternative exposure modeling methods can better summarize complex, individual-level medication use trajectories or time-varying exposures from information on medication dosage, gestational timing of use, and frequency of use. We provide an overview of commonly used methods for more refined definitions of real-world exposure to medication use during pregnancy, focusing on the major strengths and limitations of the techniques, including the potential for method-specific biases. Unsupervised clustering methods, including k-means clustering, group-based trajectory models, and hierarchical cluster analysis, are of interest because they enable visual examination of medication use trajectories over time in pregnancy and complex individual-level exposures, as well as providing insight into comedication and drug-switching patterns. Analytical techniques for time-varying exposure methods, such as extended Cox models and Robins' generalized methods, are useful tools when medication exposure is not static during pregnancy. We propose that, where appropriate, combining unsupervised clustering techniques with causal modeling approaches may be a powerful approach to understanding medication safety in pregnancy, and this framework can also be applied in other areas of epidemiology.


Subjects
Pharmacoepidemiology; Cluster Analysis; Female; Humans; Pregnancy; Pregnancy Trimesters
16.
Soc Psychiatry Psychiatr Epidemiol ; 57(2): 221-237, 2022 Feb.
Article in English | MEDLINE | ID: mdl-34773462

ABSTRACT

PURPOSE: An intersectionality framework has been increasingly incorporated into quantitative study of health inequity, to incorporate social power in meaningful ways. Researchers have identified "person-centered" methods that cluster within-individual characteristics as appropriate to intersectionality. We aimed to review their use and match with theory. METHODS: We conducted a multidisciplinary systematic review of English-language quantitative studies wherein authors explicitly stated an intersectional approach and used clustering methods. We extracted study characteristics and applications of intersectionality. RESULTS: 782 studies with quantitative applications of intersectionality were identified, of which 16 were eligible: eight using latent class analysis, two latent profile analysis, and six clustering methods. Papers used cross-sectional data (100.0%), primarily had U.S. lead authors (68.8%), and were published within psychology, social sciences, and health journals. While 87.5% of papers defined intersectionality and 93.8% cited foundational authors, engagement with the intersectionality methods literature was more limited. Clustering variables were based on social identities/positions (e.g., gender), dimensions of identity (e.g., race centrality), or processes (e.g., stigma). Results most commonly included four classes/clusters (60.0%), which were frequently used in additional analyses. These described sociodemographic differences across classes/clusters, or used classes/clusters as an exposure variable to predict outcomes in regression analysis, structural equation modeling, mediation, or survival analysis. Author rationales for method choice included both theoretical/intersectional and statistical arguments. CONCLUSION: Latent variable and clustering methods were used in varied ways in intersectional approaches, and reflected differing matches between theory and methods. We highlight situations in which these methods may be advantageous, and missed opportunities for additional uses.


Subjects
Health Inequities; Intersectional Framework; Cluster Analysis; Cross-Sectional Studies; Humans; Social Stigma
17.
Front Genet ; 12: 794354, 2021.
Article in English | MEDLINE | ID: mdl-34970305

ABSTRACT

Identifying the protein complexes in protein-protein interaction (PPI) networks is essential for understanding cellular organization and biological processes. To address the high false positive/negative rates of PPI networks and detect protein complexes with multiple topological structures, we developed a novel improved memetic algorithm (IMA). IMA first combines the topological and biological properties to obtain a weighted PPI network with reduced noise. Next, it integrates various clustering results to construct the initial populations. Furthermore, a fitness function is designed based on the five topological properties of the protein complexes. Finally, we describe the rest of our IMA method, which primarily consists of four steps: selection operator, recombination operator, local optimization strategy, and updating the population operator. In particular, IMA is a combination of genetic algorithm and a local optimization strategy, which has a strong global search ability, and searches for local optimal solutions effectively. The experimental results demonstrate that IMA performs much better than the base methods and existing state-of-the-art techniques. The source code and datasets of the IMA can be found at https://github.com/RongquanWang/IMA.

18.
Chaos Solitons Fractals ; 151: 111240, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34253943

ABSTRACT

The coronavirus has a high basic reproduction number (R0) and has caused the global COVID-19 pandemic. Governments are implementing lockdowns that are leading to economic fallout in many countries. Policy makers can make better decisions if provided with indicators connected with the spread of the disease. This study aims to cluster countries using social, economic, health, and environmental metrics that affect the spread of the disease, so that policies to control the spread can be implemented. Countries with similar factors can thus take proactive steps to fight the pandemic. Data were acquired for 79 countries, and 18 different feature variables (factors associated with the spread of COVID-19) were selected. Pearson product-moment correlation analysis is performed between each feature variable and, separately, cumulative deaths and cumulative confirmed cases, to gain insight into the relation of these factors to the spread of COVID-19. An unsupervised k-means algorithm is used, and the feature set includes economic and environmental indicators and disease prevalence along with COVID-19 variables. The learning model is able to group the countries into 4 clusters on the basis of their relation with all 18 feature variables. We also present an analysis of the correlation between the selected feature variables and COVID-19 confirmed cases and deaths. Prevalence of underlying diseases shows a strong correlation with COVID-19, whereas environmental health indicators are weakly correlated with COVID-19.
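The Pearson product-moment correlation used in this study to relate each indicator to case and death counts can be written out directly from its definition. The Python sketch below is illustrative only: the function name `pearson_r` and the toy vectors are assumptions, not data from the paper.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of the same length (at least 2 values)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the sums of squared deviations.
    sx = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sy = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


# Toy vectors (hypothetical): an indicator and an outcome across three units.
print(pearson_r([1, 2, 3], [1, 3, 2]))
```

The coefficient ranges from -1 to 1 and captures only linear association, so a weak value for an indicator does not rule out a nonlinear relationship with the outcome.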

19.
Asian Pac J Cancer Prev ; 22(6): 1781-1787, 2021 Jun 01.
Article in English | MEDLINE | ID: mdl-34181334

ABSTRACT

BACKGROUND: Comparing algorithms applied to gene expression data may be beneficial for identifying disease patterns or grouping patients based on their gene expression profiles. The current study aimed to investigate whether the information in these data can group ovarian cancer patients with similar disease patterns. METHODS: Four different clustering methods were applied to expression data for 20 genes from 37 women with ovarian cancer. All genes selected in this study had prominent roles in the control of immune system activity, as well as in chemotaxis, angiogenesis, apoptosis, etc. A further aim of the present study was to compare different clustering methods: K-means, hierarchical clustering, Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and the Expectation-Maximization (EM) algorithm. The percentage of correct prediction, Robustness-Performance Trade-off (RPT), and Silhouette criteria were used to evaluate the performance of the clustering methods. RESULTS: Six of the 20 genes (IFN-γ, Foxp3, IL-4, BCL-2, Oct4 and survivin), selected by the Laplacian score, showed key roles in the development of ovarian cancer, and their prognostic values were clinically and statistically confirmed. The results indicated that the expression pattern of these genes can group patients with similar prognosis, i.e. patients alive after 5 years or dead (62.12%). CONCLUSION: The results showed better performance for the k-means and hierarchical clustering methods and confirmed that, using the expression profile of these genes, patients with similar behavior can be grouped in the same cluster with an acceptable level of accuracy. The information in these data, along with other patient features, may contribute to predicting prognosis in ovarian cancer patients.
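One of the evaluation criteria mentioned in the abstract above, the Silhouette criterion, can be computed directly from pairwise distances. The sketch below is a plain standard-library version run on made-up one-dimensional points; it is not the study's pipeline, and the two labelings are hypothetical stand-ins for the outputs of two clustering methods.

```python
import math


def silhouette(points, labels):
    """Mean silhouette coefficient; assumes at least two clusters and
    at least two distinct points."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the distance from p to itself contributes zero to the sum).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical data: two well-separated pairs of points.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
good = silhouette(points, [0, 0, 1, 1])  # labels that follow the separation
bad = silhouette(points, [0, 1, 0, 1])   # labels that cut across it
```

Values close to 1 indicate compact, well-separated clusters and negative values indicate points that sit closer to another cluster than to their own, so the labeling with the higher score would be preferred under this criterion.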


Subjects
Gene Expression Profiling/methods , Ovarian Neoplasms/genetics , Adolescent , Adult , Algorithms , Cluster Analysis , Female , Humans , Prognosis , Prospective Studies
20.
Genes (Basel) ; 12(4)2021 04 13.
Article in English | MEDLINE | ID: mdl-33924545

ABSTRACT

The interplay between the shrimp immune system, its environment, and its microbiota contributes to the organism's homeostasis and optimal production. The metagenomic composition is typically studied using 16S rDNA profiling by clustering amplicon sequences into operational taxonomic units (OTUs) and, more recently, amplicon sequence variants (ASVs). Establishing the compatibility of the taxonomy and the α and β diversity described by both methods is necessary to compare past and future shrimp microbiota studies. Here, we used identical sequences to survey the V3 16S hypervariable region using 97% and 99% OTUs and ASVs to assess the hepatopancreas and intestine microbiota of L. vannamei from two ponds under standardized rearing conditions. We found that applying filters to retain clusters >0.1% of the total abundance per sample enabled a consistent taxonomy comparison while preserving >94% of the total reads. The three sets were comparable at the family level, whereas the 97% identity OTU set produced divergent genus and species profiles. Interestingly, the detection of organ and pond variations was robust to the choice of clustering method, producing comparable α- and β-diversity profiles. For comparisons of shrimp microbiota between past and future studies, we strongly recommend that ASVs be compared at the family level to 97% identity OTUs, or that 99% identity OTUs be used, in both cases with tailored frequency filters.
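The per-sample frequency filter described in the abstract (retain clusters above 0.1% of a sample's total abundance, then check how many reads survive) amounts to a short computation. The sketch below assumes a hypothetical mapping from cluster identifiers to read counts for one sample; the identifiers and counts are invented for illustration.

```python
def filter_low_abundance(counts, min_fraction=0.001):
    """Keep clusters whose share of the sample's reads exceeds min_fraction.

    Returns the filtered counts and the fraction of reads retained.
    """
    total = sum(counts.values())
    if total == 0:
        return {}, 0.0
    kept = {name: n for name, n in counts.items() if n / total > min_fraction}
    return kept, sum(kept.values()) / total


# Hypothetical read counts for one sample (10,000 reads in total).
sample = {"ASV_1": 9000, "ASV_2": 990, "ASV_3": 9, "ASV_4": 1}
kept, retained = filter_low_abundance(sample)
```

Here the two clusters below the 0.1% threshold are dropped and the retained fraction of reads can be compared against a target such as the >94% reported in the abstract.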


Subjects
Bacteria/classification , Computational Biology/methods , Genetic Variation , Penaeidae/microbiology , Sequence Analysis, DNA/methods , Animals , Bacteria/genetics , DNA, Bacterial/genetics , DNA, Ribosomal/genetics , Gastrointestinal Microbiome , Hepatopancreas/microbiology , High-Throughput Nucleotide Sequencing , Microbiota , Penaeidae/genetics , Phylogeny , RNA, Ribosomal, 16S/genetics
...