Results 1 - 7 of 7
1.
BMC Bioinformatics ; 24(1): 379, 2023 Oct 06.
Article in English | MEDLINE | ID: mdl-37803253

ABSTRACT

PURPOSE: Autism spectrum disorder (ASD) is a disease associated with the neurodevelopment of the brain. It can be observed in early childhood, with symptoms usually appearing within the first year of life. Currently, ASD can only be diagnosed from apparent symptoms because of the lack of information on genes related to the disease. This paper therefore aims to predict as many disease-causing genes as possible to support better diagnosis. METHODS: A hybrid stacking ensemble model with the Synthetic Minority Oversampling Technique (Stacking-SMOTE) is proposed to predict the genes associated with ASD. The model uses the Gene Ontology database to measure similarities between genes with a hybrid gene similarity (HGS) function. HGS is effective in measuring similarity because it combines features of information gain-based and graph-based methods. The model addresses the imbalance of the ASD dataset with SMOTE, which generates synthetic data rather than duplicating existing data and thereby reduces overfitting. Next, a gradient boosting-based random forest (GBBRF) classifier is introduced as a new combination technique to enhance the prediction of ASD genes. The GBBRF classifier is then combined with random forest (RF), k-nearest neighbor, support vector machine (SVM), and logistic regression (LR) to form the proposed Stacking-SMOTE model. RESULTS: The Stacking-SMOTE model is evaluated using the Simons Foundation Autism Research Initiative (SFARI) gene database and a set of candidate ASD genes. The SMOTE-based model outperforms other reported undersampling and oversampling techniques. GBBRF achieves higher accuracy than the basic classifiers. Moreover, the experimental results show that the Stacking-SMOTE model outperforms existing ASD prediction models, with approximately 95.5% accuracy. CONCLUSION: The Stacking-SMOTE model demonstrates that SMOTE is effective in handling the imbalanced autism data. In addition, integrating gradient boosting with the random forest classifier (GBBRF) supports building a robust stacking ensemble model (Stacking-SMOTE).
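
The oversampling step described in this abstract can be illustrated with a minimal SMOTE-style sketch. It is not the authors' implementation: the function name `smote`, the toy 2-D points, and the choice of `k=3` are assumptions made for illustration only. Each synthetic point is placed at a random position on the segment between a minority sample and one of its nearest minority neighbours, which is why the new points are not duplicates.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


# Hypothetical minority-class feature vectors (not data from the paper)
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=6)
```

Because every synthetic point is a convex combination of two real minority points, it always falls inside the region already spanned by the minority class.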


Subjects
Autism Spectrum Disorder; Autistic Disorder; Child, Preschool; Child; Humans; Autistic Disorder/genetics; Autism Spectrum Disorder/genetics; Random Forest; Support Vector Machine; Phenotype
2.
Comput Biol Med ; 162: 107109, 2023 08.
Article in English | MEDLINE | ID: mdl-37276752

ABSTRACT

BACKGROUND AND OBJECTIVE: Early diagnosis of Coronavirus Disease 2019 (COVID-19) can help save patients' lives before the disease turns severe. This can be achieved through an effective and correct treatment protocol. In this paper, a prediction model is proposed to detect infected cases and determine the severity level of the disease. METHODS: The proposed model uses proteins and metabolites as features for each patient. These are analyzed with feature selection methods such as Principal Component Analysis (PCA), Information Gain (IG), and Analysis of Variance (ANOVA) to select the most significant features. The model employs three classifiers, namely K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Random Forest (RF), to predict and classify the severity level of the COVID-19 infection. The model is evaluated using four performance measures: accuracy, sensitivity, specificity, and precision. RESULTS: The experimental results show that the model's accuracy reaches 80% using the RF classifier with PCA. PCA selects 22 proteins and 10 metabolites, while ANOVA selects 9 proteins and 5 metabolites. The accuracy reaches 92% using the RF classifier with ANOVA. Finally, the accuracy reaches 93% using the RF classifier with only ten features: 7 proteins and 3 metabolites. Moreover, the selected features are related to the immune and respiratory systems. CONCLUSION: The proposed model uses three classifiers and shows promising results by selecting the important features and maximizing the prediction accuracy.
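
The four performance measures named in this abstract follow directly from a confusion matrix. The sketch below shows the binary case only; the function name `binary_metrics` and the counts are hypothetical, and the paper's severity classification may involve more than two classes.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity and precision from confusion counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate (recall)
        "specificity": tn / (tn + fp),  # true negative rate
        "precision": tp / (tp + fp),    # positive predictive value
    }


# Hypothetical counts, not results from the paper
m = binary_metrics(tp=40, fp=5, tn=45, fn=10)
```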


Subjects
COVID-19; Humans; COVID-19/diagnosis; Proteomics; Random Forest; Support Vector Machine; Principal Component Analysis; COVID-19 Testing
3.
BMC Bioinformatics ; 23(1): 554, 2022 Dec 21.
Article in English | MEDLINE | ID: mdl-36544099

ABSTRACT

PURPOSE: Autism spectrum disorder (ASD) is the most prevalent disease today. Its causes may be attributed 80% to genetic factors and 20% to environmental factors. In spite of this, the majority of current research is concerned with environmental causes, and the smallest proportion with the genetic causes of the disease. Autism is a complex disease, which makes it difficult to identify the genes that cause it. METHODS: A hybrid ensemble-based classification (HEC-ASD) model for predicting ASD genes using gradient boosting machines is proposed. The model uses Gene Ontology (GO) to construct a gene functional similarity matrix with the hybrid gene similarity (HGS) method. HGS measures the semantic similarity between genes effectively. It combines a graph-based method, such as the Wang method, with the number of direct child nodes of the gene's term in GO. Moreover, an ensemble gradient boosting classifier is adapted to enhance gene prediction, forming a robust classification model. RESULTS: The proposed model is evaluated using the Simons Foundation Autism Research Initiative (SFARI) gene database. The experimental results are promising, as they improve the classification performance for predicting ASD genes. The results are compared with other approaches that used a gene regulatory network (GRN), a protein-protein interaction (PPI) network, or GO. The HEC-ASD model reaches the highest prediction accuracy of 0.88% using ensemble learning classifiers. CONCLUSION: The proposed model demonstrates that an ensemble learning technique using gradient boosting is effective in predicting autism spectrum disorder genes. Moreover, the HEC-ASD model uses GO rather than a PPI network or GRN.
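
The graph-based part of the similarity measure can be sketched with a simplified version of the Wang method on a toy ontology. This is only an illustration under stated assumptions: the term names and the `parents` map are hypothetical, a single edge weight of 0.8 is used for every relation (the original method distinguishes relation types), and the child-count component of HGS is not modelled. Each term passes a decaying contribution score up to its ancestors, and two terms are similar to the extent that they share highly scored ancestors.

```python
def s_values(term, parents, weight=0.8):
    """Contribution scores (S-values) of `term` and all of its ancestors."""
    s = {term: 1.0}
    stack = [term]
    while stack:
        t = stack.pop()
        for p in parents.get(t, ()):
            candidate = weight * s[t]
            # keep the best score over all paths that reach the ancestor
            if candidate > s.get(p, 0.0):
                s[p] = candidate
                stack.append(p)
    return s


def wang_similarity(a, b, parents, weight=0.8):
    """Term-to-term similarity from the S-values of shared ancestors."""
    sa = s_values(a, parents, weight)
    sb = s_values(b, parents, weight)
    common = sa.keys() & sb.keys()
    shared = sum(sa[t] + sb[t] for t in common)
    return shared / (sum(sa.values()) + sum(sb.values()))


# Hypothetical toy ontology: child term -> list of parent terms
parents = {"root": [], "A": ["root"], "B": ["root"], "A1": ["A"], "A2": ["A"]}
sim_siblings = wang_similarity("A1", "A2", parents)
sim_distant = wang_similarity("A1", "B", parents)
sim_self = wang_similarity("A1", "A1", parents)
```

In this toy graph the sibling terms `A1` and `A2` score higher than `A1` and `B`, because they share the more specific ancestor `A` in addition to `root`. Gene-level similarity would further aggregate such scores over the terms annotated to each gene, which is not shown here.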


Subjects
Autism Spectrum Disorder; Autistic Disorder; Child; Humans; Autism Spectrum Disorder/genetics; Autistic Disorder/genetics; Gene Regulatory Networks; Protein Interaction Maps
4.
Pak J Biol Sci ; 25(2): 144-153, 2022 Jan.
Article in English | MEDLINE | ID: mdl-35234003

ABSTRACT

BACKGROUND AND OBJECTIVE: Toxoplasmosis is an infective zoonotic disease caused by the protozoan Toxoplasma gondii (T. gondii). This study performed molecular identification of T. gondii and then examined the genetic diversity of T. gondii isolates in Egypt. MATERIALS AND METHODS: Blood samples were acquired from 138 live ewes and 212 she-goats from 5 governorates of Egypt. In addition, blood and related tissue samples (uterus, diaphragm, heart and thigh muscles from each animal) were collected from 180 slaughtered ewes and 206 slaughtered she-goats from Cairo and Giza abattoirs. RESULTS: Using ELISA, the total seropositivity of live ewes and she-goats was 26.8 and 21.2%, respectively, while it was 16.6 and 33% in slaughtered ewes and she-goats, respectively. T. gondii tissue cysts with the associated characteristic histopathological changes were detected in different organs. Twenty-eight T. gondii isolates were confirmed using PCR, while among 24 milk samples from seropositive live ewes and she-goats, only 12.5 and 6.25% were positive using PCR, respectively. Genotyping using multiple nested PCR (n-PCR) combined with restriction enzyme analysis (RFLP) of the surface antigen 2 (SAG2) gene confirmed 26 isolates (92.8%) as type II and 2 (7.1%) as type III. CONCLUSION: Types II and III are the most common T. gondii genotypes in Egyptian small ruminants, with additional importance for public health in Egypt. Further studies are needed on the role of milk in the transmission of toxoplasmosis.


Subjects
Toxoplasma; Toxoplasmosis, Animal; Animals; Antibodies, Protozoan; Egypt; Female; Genotype; Ruminants; Seroepidemiologic Studies; Sheep; Toxoplasma/genetics; Toxoplasmosis, Animal/epidemiology
5.
Chemometr Intell Lab Syst ; 224: 104535, 2022 May 15.
Article in English | MEDLINE | ID: mdl-35308181

ABSTRACT

COVID-19 causes serious respiratory illness. Accurate identification of the viral infection cycle therefore plays a key role in designing appropriate vaccines. The risk of this disease depends on proteins that interact with human receptors. In this paper, we formulate a novel model for COVID-19 named "amino acid encoding based prediction" (AAPred). This model is accurate, classifies the various coronavirus types, and distinguishes SARS-CoV-2 from other coronaviruses. With the AAPred model, we reduce the number of features to enhance performance by selecting the most important ones using statistical criteria. The protein sequence of SARS-CoV-2 is analyzed to understand the viral infection cycle. Six machine learning classifiers (decision tree, k-nearest neighbors, random forest, support vector machine, bagging ensemble, and gradient boosting) are used to evaluate the model in terms of accuracy, precision, sensitivity, and specificity. We implement the obtained results computationally and apply them to real data from the National Genomics Data Center. The experimental results show that the AAPred model reduces the features to seven. The average accuracy of the 10-fold cross-validation is 98.69%, precision is 98.72%, sensitivity is 96.81%, and specificity is 97.72%. The features are selected using information gain and classified with random forest. The proposed model predicts the type of coronavirus and reduces the number of extracted features. We find that SARS-CoV-2 has physicochemical characteristics similar to SARS-CoV in some regions. We also report that SARS-CoV-2 has infection cycles and sequences similar to SARS-CoV in some regions, suggesting that vaccines may be effective against SARS-CoV-2. A comparison with deep learning shows results similar to those of our method.
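
Feature selection by information gain, as mentioned in this abstract, ranks each feature by how much knowing its value reduces the entropy of the class label. The sketch below assumes discrete feature values; the feature names, values, and labels are hypothetical and are not taken from the AAPred data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Entropy of the labels minus their entropy after splitting on the feature."""
    n = len(labels)
    remainder = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder


# Hypothetical toy data: two discretized features and a virus-type label
labels = ["cov2", "cov2", "other", "other"]
features = {
    "informative": ["hi", "hi", "lo", "lo"],  # separates the classes perfectly
    "useless": ["hi", "lo", "hi", "lo"],      # independent of the class
}
gains = {name: information_gain(vals, labels) for name, vals in features.items()}
ranked = sorted(gains, key=gains.get, reverse=True)
```

Keeping only the top-ranked features is one way to arrive at a small feature set of the kind the abstract reports.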

6.
PeerJ Comput Sci ; 7: e558, 2021.
Article in English | MEDLINE | ID: mdl-34239969

ABSTRACT

Many users now prefer to purchase items through online shopping. Shopping websites allow customers to submit comments and provide feedback on purchased products. Opinion mining and sentiment analysis are used to analyze product comments to help sellers and purchasers decide whether or not to buy products. However, the nature of online comments affects the performance of the opinion mining process, because they may contain negation words or aspects unrelated to the product. To address these problems, a semantic-based aspect level opinion mining (SALOM) model is proposed. SALOM extracts the product aspects based on semantic similarity and classifies the comments. The proposed model considers negation words and other types of product aspects, such as aspects' synonyms, hyponyms, and hypernyms, to improve classification accuracy. Three different datasets are used to evaluate the proposed SALOM. The experimental results are promising in terms of precision, recall, and F-measure. The performance reaches 94.8% precision, 93% recall, and 92.6% F-measure.
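
The abstract does not give SALOM's rules, so the following is only a toy illustration of why negation words matter for comment classification. The lexicon, the negation list, and the function name `comment_polarity` are all hypothetical. A negation word flips the polarity of the next opinion word it precedes.

```python
NEGATIONS = {"not", "no", "never"}                          # hypothetical list
POLARITY = {"good": 1, "great": 1, "bad": -1, "poor": -1}   # hypothetical lexicon


def comment_polarity(comment):
    """Sum word polarities, flipping the sign of a word that follows a negation."""
    score = 0
    negate = False
    for token in comment.lower().split():
        word = token.strip(".,!?")
        if word in NEGATIONS:
            negate = True  # stays active until the next opinion word
            continue
        if word in POLARITY:
            score += -POLARITY[word] if negate else POLARITY[word]
            negate = False
    return score


plain = comment_polarity("The screen is great")
negated = comment_polarity("The battery is not good.")
double = comment_polarity("Sound quality is not bad")
```

Without the negation flag, "not good" would be scored as positive, which is the kind of misclassification the abstract attributes to negation words.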

7.
J Egypt Natl Canc Inst ; 17(1): 29-34, 2005 Mar.
Article in English | MEDLINE | ID: mdl-16353080

ABSTRACT

BACKGROUND: CD46 is a membrane cofactor protein that acts as a cofactor for factor I proteolytic cleavage of C3, so it protects cells expressing it on their surface from autologous complement attack. It has recently been described as a receptor for HHV-6. It has also been shown to be highly expressed on malignant cells compared with normal cells, thus playing a major role in how these cells, whether from haematological malignancies or other cancers, protect themselves against complement attack so that they can survive and metastasize. PATIENTS AND METHODS: This study was done to detect the seroprevalence of HHV-6 among 47 Egyptian adult cases of acute leukemia using the anti-HHV-6 IgG ELISA serological technique. CD46 receptor expression and immunophenotyping were assessed using flow cytometry (FCM). Twenty-nine of the cases were acute non-lymphoblastic leukemia (ANLL), while 18 were acute lymphoblastic leukemia (ALL). Sixteen age- and sex-matched control cases were also studied for both anti-HHV-6 IgG and CD46 receptor expression. RESULTS: HHV-6 IgG antibodies were encountered in 29 (100%), 14 (77.8%) and 12 (75%) of the ANLL, ALL and control cases, respectively. CD46 expression was encountered in 21 (72.4%) of the ANLL cases and in 10 (55.6%) of the ALL cases. Concordance between HHV-6 seropositivity and CD46 expression was encountered in 31 cases (29 positive and 2 negative). Discordance was encountered in 16 cases, with 14 showing HHV-6 IgG seropositivity with no CD46 expression and 2 showing the reverse. CONCLUSION: The lack of significant correlation between CD46 expression and seropositivity would exclude CD46 expression as a cause of contracting HHV-6 infection in leukemic patients.
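
The concordance counts in this abstract form a 2x2 table whose margins can be checked against the per-group figures. The short sketch below only rearranges numbers stated in the abstract; the variable names are illustrative and no new statistic from the paper is implied.

```python
# Counts reported in the abstract for the 47 acute leukemia cases
both_pos, both_neg = 29, 2      # concordant: HHV-6 IgG and CD46 agree
hhv6_only, cd46_only = 14, 2    # discordant: only one marker positive

total = both_pos + both_neg + hhv6_only + cd46_only
hhv6_pos = both_pos + hhv6_only   # should equal 29 ANLL + 14 ALL seropositive
cd46_pos = both_pos + cd46_only   # should equal 21 ANLL + 10 ALL expressing CD46
concordance = (both_pos + both_neg) / total
```

The margins match the reported group totals (43 seropositive and 31 CD46-positive out of 47), and the concordant fraction is 31 of 47 cases.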


Subjects
Antibodies, Viral/blood; Herpesvirus 6, Human/immunology; Immunoglobulin G/blood; Leukemia/virology; Membrane Cofactor Protein/analysis; Acute Disease; Adolescent; Adult; Aged; Egypt; Female; Humans; Male; Middle Aged; Seroepidemiologic Studies