Results 1 - 16 of 16
1.
Comput Methods Programs Biomed ; 245: 108019, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38237450

ABSTRACT

BACKGROUND AND OBJECTIVE: Pancreatic ductal adenocarcinoma (PDAC) is a form of pancreatic cancer and one of the primary causes of cancer-related deaths globally, with a five-year survival rate of less than 10%. The prognosis of pancreatic cancer has remained poor over the last four decades, mainly due to the lack of early diagnostic mechanisms. This study proposes a novel method for detecting PDAC using explainable and supervised machine learning from Raman spectroscopic signals. METHODS: An insightful feature set consisting of statistical, peak, and extended empirical mode decomposition features is selected using the support vector machine recursive feature elimination method integrated with a correlation bias reduction. Explicable features successfully identified mutations in Kirsten rat sarcoma viral oncogene homolog (KRAS) and tumor suppressor protein 53 (TP53) in the fingerprint region for the first time in the literature. PDAC and normal pancreas are classified using K-nearest neighbor, linear discriminant analysis, and support vector machine classifiers. RESULTS: This study achieved a classification accuracy of 98.5% using a nonlinear support vector machine. Our proposed method reduced test time by 28.5% and memory utilization by 85.6%, which reduces complexity significantly, and is more accurate than the state-of-the-art method. The generalization of the proposed method is assessed by fifteen-fold cross-validation, and its performance is evaluated using accuracy, specificity, sensitivity, and receiver operating characteristic curves. CONCLUSIONS: In this study, we proposed a method to detect and define the fingerprint region for PDAC using explainable machine learning. This simple, accurate, and efficient method for PDAC detection in mice could be generalized to examine human pancreatic cancer and provide a basis for precise chemotherapy for early cancer treatment.


Subject(s)
Adenocarcinoma , Carcinoma, Pancreatic Ductal , Pancreatic Neoplasms , Humans , Animals , Mice , Pancreatic Neoplasms/diagnosis , Pancreatic Neoplasms/genetics , Carcinoma, Pancreatic Ductal/diagnosis , Carcinoma, Pancreatic Ductal/genetics , Carcinoma, Pancreatic Ductal/pathology , ROC Curve , Machine Learning
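
The fifteen-fold cross-validation mentioned in entry 1 can be illustrated with a minimal index-splitting sketch. The function name `kfold_indices`, the sample count of 100, and the contiguous (unshuffled) fold assignment are assumptions made for illustration; the abstract does not say how the folds were drawn.

```python
def kfold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds.

    Fold sizes differ by at most one sample. No shuffling or class
    stratification is done here; both are assumptions left out for brevity.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical dataset of 100 spectra split into fifteen folds.
folds = list(kfold_indices(n_samples=100, k=15))
```

Each sample is held out exactly once, so every reported metric would be an average over fifteen train/test rounds.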
2.
Heliyon ; 9(10): e21151, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37928383

ABSTRACT

Background: As an inevitable event after kidney transplantation, ischemia‒reperfusion injury (IRI) can lead to a decrease in kidney transplant success. The search for signature genes of renal ischemia‒reperfusion injury (RIRI) is helpful in improving the diagnosis and guiding clinical treatment. Methods: We first downloaded 3 datasets from the GEO database. Then, differentially expressed genes (DEGs) were identified and subjected to functional enrichment analysis. After that, we applied three machine learning methods, including random forest (RF), Lasso regression analysis, and support vector machine recursive feature elimination (SVM-RFE), to further predict candidate genes. WGCNA was also executed to screen candidate genes from DEGs. Then, we took the intersection of candidate genes to obtain the signature genes of RIRI. Receiver operating characteristic (ROC) analysis was conducted to measure the predictive ability of the signature genes. Kaplan‒Meier analysis was used for association analysis between signature genes and graft survival. Finally, the expression of the signature genes was verified in an ischemia cell model. Results: A total of 117 DEGs were screened out. Subsequently, RF, Lasso regression analysis, SVM-RFE and WGCNA identified 17, 25, 18 and 74 candidate genes, respectively. Finally, 3 signature genes (DUSP1, FOS, JUN) were screened out through the intersection of candidate genes. ROC analysis suggested that the 3 signature genes could well diagnose and predict RIRI. Kaplan‒Meier analysis indicated that patients with low FOS or JUN expression had a longer OS than those with high FOS or JUN expression. Validation in the ischemia cell model showed that, compared with the control group, the expression level of JUN increased under hypoxic conditions. Conclusions: Three signature genes (DUSP1, FOS, JUN) offer a good prediction for RIRI outcome and may serve as potential therapeutic targets for RIRI intervention, especially JUN. The prediction of graft survival by FOS and JUN may improve graft survival in patients with RIRI.
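
The intersection step described in this abstract can be sketched with Python sets. Only DUSP1, FOS and JUN come from the abstract; the `GENE_*` names are hypothetical placeholders, and the sets are far smaller than the 17, 25, 18 and 74 candidates reported.

```python
# Toy candidate sets, one per selection method.
# GENE_A .. GENE_D are hypothetical placeholders, not genes from the study.
candidates = {
    "RF": {"DUSP1", "FOS", "JUN", "GENE_A"},
    "LASSO": {"DUSP1", "FOS", "JUN", "GENE_B", "GENE_C"},
    "SVM-RFE": {"DUSP1", "FOS", "JUN", "GENE_A", "GENE_C"},
    "WGCNA": {"DUSP1", "FOS", "JUN", "GENE_D"},
}

# A gene is kept only if every method selected it.
signature_genes = set.intersection(*candidates.values())
print(sorted(signature_genes))
```

Requiring agreement across all methods is conservative: a gene dropped by any single method (such as `GENE_A` above) is excluded from the signature.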

3.
J Int Med Res ; 51(11): 3000605231213781, 2023 Nov.
Article in English | MEDLINE | ID: mdl-38006610

ABSTRACT

OBJECTIVES: Hypertrophic cardiomyopathy (HCM), a leading cause of heart failure and sudden death, requires early diagnosis and treatment. This study investigated the underlying pathogenesis and explored potential diagnostic gene biomarkers for HCM. METHODS: Transcriptional profiles of myocardial tissues from patients with HCM (dataset GSE36961) were downloaded from the Gene Expression Omnibus database and subjected to bioinformatics analyses, including differentially expressed gene (DEG) identification, enrichment analyses, and protein-protein interaction (PPI) network analysis. Least absolute shrinkage and selection operator (LASSO) regression and support vector machine recursive feature elimination were performed to identify candidate diagnostic gene biomarkers. mRNA expression levels of candidate biomarkers were tested in an external dataset (GSE141910); area under the receiver operating characteristic curve (AUC) values were obtained to validate diagnostic efficacy. RESULTS: Overall, 156 DEGs (109 downregulated, 47 upregulated) were identified. Enrichment and PPI network analyses indicated that the DEGs were involved in biological functions and molecular pathways including inflammatory response, platelet activity, complement and coagulation cascades, extracellular matrix organization, phagosome, apoptosis, and VEGFA-VEGFR2 signaling. RASD1, CDC42EP4, MYH6, and FCN3 were identified as diagnostic biomarkers for HCM. CONCLUSIONS: RASD1, CDC42EP4, MYH6, and FCN3 might be diagnostic gene biomarkers for HCM and can provide insights concerning HCM pathogenesis.


Subject(s)
Cardiomyopathy, Hypertrophic , Humans , Cardiomyopathy, Hypertrophic/diagnosis , Cardiomyopathy, Hypertrophic/genetics , Myocardium , Apoptosis , Blood Coagulation , Machine Learning , Biomarkers , ras Proteins
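
The AUC values used in entry 3 to validate diagnostic efficacy can be understood through a small pairwise sketch. The function name `roc_auc` and the toy scores are assumptions; the study itself would have used a statistics package rather than this loop.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical expression-based scores: 1 = HCM, 0 = control.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

In this toy case eight of the nine positive/negative pairs are ordered correctly, so the AUC is 8/9.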
4.
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi ; 40(4): 725-735, 2023 Aug 25.
Article in Chinese | MEDLINE | ID: mdl-37666763

ABSTRACT

Keloids are benign skin tumors resulting from the excessive proliferation of connective tissue in wound skin. Precise prediction of keloid risk in trauma patients and timely early diagnosis are of paramount importance for in-depth keloid management and control of its progression. This study analyzed four keloid datasets in the high-throughput Gene Expression Omnibus (GEO) database, identified diagnostic markers for keloids, and established a nomogram prediction model. Initially, 37 core protein-encoding genes were selected through weighted gene co-expression network analysis (WGCNA), differential expression analysis, and the centrality algorithm of the protein-protein interaction network. Subsequently, two machine learning algorithms, the least absolute shrinkage and selection operator (LASSO) and support vector machine-recursive feature elimination (SVM-RFE), were used to further screen out the four diagnostic markers with the highest predictive power for keloids: hepatocyte growth factor (HGF), syndecan-4 (SDC4), ectonucleotide pyrophosphatase/phosphodiesterase 2 (ENPP2), and Rho family guanosine triphosphatase 3 (RND3). Potential biological pathways involved were explored through single-gene gene set enrichment analysis (GSEA). Finally, univariate and multivariate logistic regression analyses of the diagnostic markers were performed, and a nomogram prediction model was constructed. Internal and external validations revealed that the calibration curve of this model closely approximates the ideal curve, the decision curve is superior to other strategies, and the area under the receiver operating characteristic curve is higher than that of the control model (with an optimal cutoff value of 0.588). This indicates that the model possesses high calibration, clinical benefit rate, and predictive power, and shows promise as an effective means of early clinical diagnosis.


Subject(s)
Keloid , Humans , Keloid/diagnosis , Keloid/genetics , Nomograms , Algorithms , Calibration , Machine Learning
5.
Math Biosci Eng ; 20(6): 10741-10756, 2023 Apr 17.
Article in English | MEDLINE | ID: mdl-37322958

ABSTRACT

BACKGROUND: Ulcerative colitis (UC) is an idiopathic inflammatory disease with an increasing incidence. This study aimed to identify potential UC biomarkers and associated immune infiltration characteristics. METHODS: Two datasets (GSE87473 and GSE92415) were merged to obtain 193 UC samples and 42 normal samples. Using R, differentially expressed genes (DEGs) between UC and normal samples were identified, and their biological functions were investigated using Gene Ontology and Kyoto Encyclopedia of Genes and Genomes analyses. Promising biomarkers were identified using least absolute shrinkage and selection operator regression and support vector machine recursive feature elimination, and their diagnostic efficacy was evaluated through receiver operating characteristic (ROC) curves. Finally, CIBERSORT was used to investigate the immune infiltration characteristics in UC, and the relationship between the identified biomarkers and various immune cells was examined. RESULTS: We found 102 DEGs, of which 64 were significantly upregulated and 38 were significantly downregulated. The DEGs were enriched in pathways associated with interleukin-17, cytokine-cytokine receptor interaction and viral protein interactions with cytokines and cytokine receptors, among others. Using machine learning methods and ROC tests, we confirmed DUOX2, DMBT1, CYP2B7P, PITX2 and DEFB1 to be essential diagnostic genes for UC. Immune cell infiltration analysis revealed that all five diagnostic genes were correlated with regulatory T cells, CD8 T cells, activated and resting memory CD4 T cells, activated natural killer cells, neutrophils, activated and resting mast cells, activated and resting dendritic cells and M0, M1 and M2 macrophages. CONCLUSIONS: DUOX2, DMBT1, CYP2B7P, PITX2 and DEFB1 were identified as prospective biomarkers for UC. These biomarkers and their relationship with immune cell infiltration may provide a new perspective on the progression of UC.


Subject(s)
Colitis, Ulcerative , beta-Defensins , Humans , Colitis, Ulcerative/diagnosis , Colitis, Ulcerative/genetics , Dual Oxidases , Computational Biology , Biomarkers , Cytokines , Machine Learning , Calcium-Binding Proteins , DNA-Binding Proteins , Tumor Suppressor Proteins
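
Entry 5 sorts its DEGs into upregulated and downregulated groups. One simple way to picture that split is a log2 fold-change filter, sketched below. The threshold of 1.0, the function name `classify_deg`, and the expression values are all assumptions; the abstract gives no cutoffs, and a real analysis in R would also apply a statistical test with multiple-testing correction, which this sketch omits.

```python
from math import log2
from statistics import mean


def classify_deg(disease, normal, min_abs_log2fc=1.0):
    """Label a gene by the log2 ratio of mean expression (disease / normal).

    Assumes strictly positive, non-log-transformed expression values.
    """
    log2fc = log2(mean(disease) / mean(normal))
    if log2fc >= min_abs_log2fc:
        label = "up"
    elif log2fc <= -min_abs_log2fc:
        label = "down"
    else:
        label = "unchanged"
    return log2fc, label


# Hypothetical expression values for three unnamed genes.
gene_up = classify_deg(disease=[8.0, 8.0], normal=[2.0, 2.0])
gene_down = classify_deg(disease=[1.0, 1.0], normal=[4.0, 4.0])
gene_flat = classify_deg(disease=[3.0, 3.0], normal=[2.0, 2.0])
```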
6.
Article in Chinese | WPRIM (Western Pacific) | ID: wpr-1008893

ABSTRACT

Keloids are benign skin tumors resulting from the excessive proliferation of connective tissue in wound skin. Precise prediction of keloid risk in trauma patients and timely early diagnosis are of paramount importance for in-depth keloid management and control of its progression. This study analyzed four keloid datasets in the high-throughput Gene Expression Omnibus (GEO) database, identified diagnostic markers for keloids, and established a nomogram prediction model. Initially, 37 core protein-encoding genes were selected through weighted gene co-expression network analysis (WGCNA), differential expression analysis, and the centrality algorithm of the protein-protein interaction network. Subsequently, two machine learning algorithms, the least absolute shrinkage and selection operator (LASSO) and support vector machine-recursive feature elimination (SVM-RFE), were used to further screen out the four diagnostic markers with the highest predictive power for keloids: hepatocyte growth factor (HGF), syndecan-4 (SDC4), ectonucleotide pyrophosphatase/phosphodiesterase 2 (ENPP2), and Rho family guanosine triphosphatase 3 (RND3). Potential biological pathways involved were explored through single-gene gene set enrichment analysis (GSEA). Finally, univariate and multivariate logistic regression analyses of the diagnostic markers were performed, and a nomogram prediction model was constructed. Internal and external validations revealed that the calibration curve of this model closely approximates the ideal curve, the decision curve is superior to other strategies, and the area under the receiver operating characteristic curve is higher than that of the control model (with an optimal cutoff value of 0.588). This indicates that the model possesses high calibration, clinical benefit rate, and predictive power, and shows promise as an effective means of early clinical diagnosis.


Subject(s)
Humans , Keloid/genetics , Nomograms , Algorithms , Calibration , Machine Learning
7.
BMC Cancer ; 21(1): 1303, 2021 Dec 06.
Article in English | MEDLINE | ID: mdl-34872521

ABSTRACT

BACKGROUND: There is no unified treatment standard for patients with extranodal NK/T-cell lymphoma (ENKTL). Cancer neoantigens are the result of somatic mutations and are cancer-specific. An increased number of somatic mutations is associated with anti-cancer effects. Screening out ENKTL-specific neoantigens on the surface of cancer cells relies on the understanding of ENKTL mutation patterns. Hence, it is imperative to identify ENKTL-specific genes for ENKTL diagnosis, the discovery of tumor-specific neoantigens and the development of novel therapeutic strategies. We investigated the gene signatures of ENKTL patients. METHODS: We collected the peripheral blood of a pair of twins, one of whom was diagnosed with ENKTL, for sequencing to identify unique variant genes. Seventy samples were analyzed by Robust Multi-array Analysis (RMA). Two methods (elastic net and Support Vector Machine-Recursive Feature Elimination) were used to select unique genes. Next, we performed functional enrichment analysis and pathway enrichment analysis. Then, we conducted single-sample gene set enrichment analysis of immune infiltration and validated the expression of the screened markers with the limma package. RESULTS: We screened out 126 unique variant genes. Among them, 11 unique genes were selected by the combination of elastic net and Support Vector Machine-Recursive Feature Elimination. Subsequently, GO and KEGG analysis indicated the biological function of the identified unique genes. GSEA indicated five immunity-related pathways with high signature scores. In patients with ENKTL and in the group with high signature scores, a proportion of functional immune cells showed high infiltration. We finally found that CDC27, ZNF141, FCGR2C and NES were four significantly differential genes in ENKTL patients. ZNF141, FCGR2C and NES were upregulated in patients with ENKTL, while CDC27 was significantly downregulated. CONCLUSION: We identified four ENKTL markers (ZNF141, FCGR2C, NES and CDC27) in patients with extranodal NK/T-cell lymphoma.


Subject(s)
Lymphoma, Extranodal NK-T-Cell/genetics , Machine Learning/standards , Female , Humans , Male , Twins
8.
Hum Genomics ; 15(1): 66, 2021 Nov 09.
Article in English | MEDLINE | ID: mdl-34753514

ABSTRACT

BACKGROUND: We are currently observing an explosion of gene expression data with phenotypes. These data enable us to accurately identify genes responsible for certain medical conditions and to classify them as drug targets. Like other phenotype data in the medical domain, gene expression data with phenotypes suffer from being a very underdetermined system. In domains with a very large set of features but a very small sample size (e.g. DNA microarray, RNA-seq data, GWAS data, etc.), it is often reported that several contrasting feature subsets may yield nearly equally optimal results. This phenomenon is known as instability. Considering these facts, we have developed a robust and stable supervised gene selection algorithm to select a set of robust and stable genes with better prediction ability from gene expression datasets with phenotypes. Stability and robustness are ensured by class and instance level perturbations, respectively. RESULTS: We have performed rigorous experimental evaluations using 10 real gene expression microarray datasets with phenotypes. They reveal that our algorithm outperforms the state-of-the-art algorithms with respect to stability and classification accuracy. We have also performed biological enrichment analysis based on gene ontology-biological processes (GO-BP) terms, disease ontology (DO) terms, and biological pathways. CONCLUSIONS: It is indisputable from the results of the performance evaluations that our proposed method is indeed an effective and efficient supervised gene selection algorithm.


Subject(s)
Algorithms , Machine Learning , Oligonucleotide Array Sequence Analysis/methods , Phenotype
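
The instability described in entry 8, where different runs select contrasting feature subsets, can be quantified in several ways. A minimal sketch using the average pairwise Jaccard index follows. The choice of Jaccard, the function names, and the gene labels `g1` to `g4` are assumptions; the abstract does not name the stability measure the authors used.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two feature subsets (1.0 when both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def selection_stability(subsets):
    """Mean pairwise Jaccard index over subsets chosen on perturbed data."""
    pairs = list(combinations(subsets, 2))
    if not pairs:
        raise ValueError("need at least two subsets")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical gene subsets selected on three perturbed copies of a dataset.
runs = [{"g1", "g2", "g3"}, {"g1", "g2", "g4"}, {"g1", "g2", "g3"}]
stability = selection_stability(runs)
```

A value of 1.0 means every run picked the same genes; values near 0 indicate the instability the abstract describes.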
9.
Exp Ther Med ; 20(4): 3096-3103, 2020 Oct.
Article in English | MEDLINE | ID: mdl-32855677

ABSTRACT

The etiology and pathophysiological mechanisms of idiopathic pulmonary fibrosis (IPF) are yet to be fully elucidated; however, mining of disease-related microRNAs (miRNAs/miRs) has improved the understanding of the progression of IPF. The aim of the current study was to screen miRNAs associated with IPF using three mathematical algorithms: one-way ANOVA, least absolute shrinkage and selection operator (LASSO) and support vector machine-recursive feature elimination (SVM-RFE). Using ANOVA, three miRNAs and two miRNAs were selected with opposite expression patterns in moderate and severe IPF, respectively. In total, two algorithms, LASSO and SVM-RFE, were used to perform feature selection of miRNAs. miRNAs from patients were also extracted from formalin-fixed paraffin-embedded tissues and detected using reverse transcription-quantitative PCR (RT-qPCR). The intersection of the three algorithms (ANOVA, LASSO and SVM-RFE) was taken as the final result of the miRNA candidates. Three miRNA candidates, including miR-124, hsa-miR-524-5p and hsa-miR-194, were therefore used as biomarkers. The receiver operating characteristic model demonstrated favorable discrimination between IPF and control groups, with an area under the curve of 78.5%. Moreover, RT-qPCR results indicated that miR-124, hsa-miR-524-5p, hsa-miR-194 and hsa-miR-133a were differentially expressed between patients with IPF and age-matched men without fibrotic lung disease. The target genes of these miRNAs were further predicted and Kyoto Encyclopedia of Genes and Genomes enrichment analysis was performed. Collectively, the present results suggested that the identified miRNAs associated with IPF may be useful biomarkers for the diagnosis of this disease.

10.
Front Pharmacol ; 11: 534, 2020.
Article in English | MEDLINE | ID: mdl-32425783

ABSTRACT

Pancreatic ductal adenocarcinoma (PDAC) is one of the leading causes of cancer-related death and has an extremely poor prognosis. Thus, identifying new disease-associated genes and targets for PDAC diagnosis and therapy is urgently needed. This requires investigations into the underlying molecular mechanisms of PDAC at both the systems and molecular levels. Herein, we developed a computational method of predicting cancer genes and anticancer drug targets that combined three independent expression microarray datasets of PDAC patients and protein-protein interaction data. First, Support Vector Machine-Recursive Feature Elimination was applied to the gene expression data to rank the differentially expressed genes (DEGs) between PDAC patients and controls. Then, protein-protein interaction networks were constructed based on the DEGs, and a new score comprising gene expression and network topological information was proposed to identify cancer genes. Finally, these genes were validated by "druggability" prediction, survival and common network analysis, and functional enrichment analysis. Furthermore, two integrins were screened to investigate their structures and dynamics as potential drug targets for PDAC. Collectively, 17 disease genes and some stroma-related pathways including extracellular matrix-receptor interactions were predicted to be potential drug targets and important pathways for treating PDAC. The protein-drug interactions and hinge-site prediction of ITGAV and ITGA2 suggest potential drug binding residues in the Thigh domain. These findings provide new possibilities for targeted therapeutic interventions in PDAC, which may have further applications in other cancer types.
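
The ranking step in this abstract relies on SVM-RFE, which repeatedly trains a linear SVM and discards the feature with the smallest squared weight. The sketch below shows that loop in plain Python. The subgradient-descent trainer, its hyperparameters, and the toy data (feature 0 informative, features 1 and 2 low-amplitude noise) are assumptions for illustration, not the setup used in the study, which would normally rely on an established SVM library.

```python
def train_linear_svm(X, y, features, epochs=200, lr=0.1, lam=0.01):
    """Fit a linear SVM on the given feature indices.

    Uses full-batch subgradient descent on the L2-regularised hinge loss.
    Returns a dict mapping feature index to learned weight.
    """
    weights = {j: 0.0 for j in features}
    bias = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = {j: lam * weights[j] for j in features}
        grad_b = 0.0
        for row, label in zip(X, y):
            score = sum(weights[j] * row[j] for j in features) + bias
            if label * score < 1.0:  # sample violates the margin
                for j in features:
                    grad_w[j] -= label * row[j] / n
                grad_b -= label / n
        for j in features:
            weights[j] -= lr * grad_w[j]
        bias -= lr * grad_b
    return weights


def svm_rfe(X, y):
    """Return feature indices ranked from most to least important."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while remaining:
        weights = train_linear_svm(X, y, remaining)
        # Drop the feature whose squared weight is smallest.
        worst = min(remaining, key=lambda j: weights[j] ** 2)
        remaining.remove(worst)
        eliminated.append(worst)
    return eliminated[::-1]


# Toy data: labels are +1 / -1; only feature 0 separates the classes.
X = [
    [2.0, 0.1, -0.1],
    [1.5, -0.1, 0.1],
    [1.8, 0.1, 0.1],
    [-2.0, -0.1, -0.1],
    [-1.5, 0.1, 0.1],
    [-1.8, 0.1, -0.1],
]
y = [1, 1, 1, -1, -1, -1]
ranking = svm_rfe(X, y)
```

Because the weight magnitude depends on feature scale, inputs are normally standardised before SVM-RFE; the toy data skips that step.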

11.
Int J Biol Macromol ; 145: 429-436, 2020 Feb 15.
Article in English | MEDLINE | ID: mdl-31883894

ABSTRACT

The study aimed to explore the molecular mechanism underlying triple-negative breast cancer (TNBC) and to identify potential diagnostic/prognostic biomarkers. The differentially expressed lncRNAs (DElncRNAs) were identified by meta-analysis and machine learning feature selection methods. The dysregulated lncRNA-miRNA-mRNA network was constructed based on the competing endogenous RNA (ceRNA) hypothesis. A total of 26 DElncRNAs were identified with a meta-analysis approach, of which 18 attained high accuracy in the training and test datasets by Support Vector Machine-Recursive Feature Elimination (SVM-RFE) and could act as diagnostic biomarkers. Among the identified DElncRNAs, LINC01315 and CTA-384D8.35 could act as prognostic biomarkers. Finally, two important sub-modules of the lncRNA-miRNA-mRNA network were identified, which consist of DElncRNAs (LINC01087, LINC01315, and SOX9-AS1) interacting with co-expressed DEmRNAs and DEmiRNAs. Thus, the study indicated the importance of DElncRNAs and highlighted their efficacy as potential biomarkers in TNBC.


Subject(s)
Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , MicroRNAs/genetics , RNA Interference , RNA, Long Noncoding/genetics , RNA, Messenger/genetics , Triple Negative Breast Neoplasms/genetics , Biomarkers, Tumor , Cluster Analysis , Computational Biology/methods , Female , Gene Expression Profiling , Gene Ontology , Humans , Meta-Analysis as Topic , Models, Theoretical , Molecular Sequence Annotation , Prognosis , Reproducibility of Results , Support Vector Machine , Transcriptome , Triple Negative Breast Neoplasms/mortality
12.
Sensors (Basel) ; 18(7), 2018 Jul 23.
Article in English | MEDLINE | ID: mdl-30041417

ABSTRACT

Routine stress monitoring in daily life can predict potentially serious health impacts. Effective stress monitoring in medical and healthcare fields is dependent upon accurate determination of stress-related features. In this study, we determined the optimal stress-related features for effective monitoring of cumulative stress. We first investigated the effects of short- and long-term stress on various heart rate variability (HRV) features using a rodent model. Subsequently, we determined an optimal HRV feature set using support vector machine-recursive feature elimination (SVM-RFE). Experimental results indicate that the HRV time domain features generally decrease under long-term stress, and the HRV frequency domain features have substantially significant differences under short-term stress. Further, an SVM classifier with a radial basis function kernel proved most accurate (93.11%) when using an optimal HRV feature set comprising the mean of R-R intervals (mRR), the standard deviation of R-R intervals (SDRR), and the coefficient of variance of R-R intervals (CVRR) as time domain features, and the normalized low frequency (nLF) and the normalized high frequency (nHF) as frequency domain features. Our findings indicate that the optimal HRV features identified in this study can effectively and efficiently detect stress. This knowledge facilitates development of in-facility and mobile healthcare system designs to support stress monitoring in daily life.


Subject(s)
Electrocardiography , Heart Rate/physiology , Stress, Psychological/diagnosis , Stress, Psychological/physiopathology , Support Vector Machine , Animals , Male , Models, Animal , Rats , Rats, Sprague-Dawley
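
The three time-domain HRV features named in entry 12 (mRR, SDRR, CVRR) can be computed directly from a list of R-R intervals. The sketch below assumes intervals in milliseconds, the sample standard deviation, and CVRR expressed as a percentage of the mean; the abstract does not state which conventions the authors followed, and the interval values are made up.

```python
from statistics import mean, stdev


def hrv_time_domain(rr_ms):
    """Return mRR, SDRR and CVRR for a sequence of R-R intervals (ms)."""
    if len(rr_ms) < 2:
        raise ValueError("need at least two R-R intervals")
    m_rr = mean(rr_ms)
    sd_rr = stdev(rr_ms)  # sample standard deviation (n - 1 denominator)
    return {
        "mRR": m_rr,
        "SDRR": sd_rr,
        "CVRR": 100.0 * sd_rr / m_rr,  # SDRR as a percentage of mRR
    }


# Hypothetical R-R intervals in milliseconds.
features = hrv_time_domain([800.0, 810.0, 790.0, 800.0])
```

The frequency-domain features (nLF, nHF) need a spectral estimate of the R-R series and are left out of this sketch.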
13.
Int J Med Inform ; 109: 30-38, 2018 Jan.
Article in English | MEDLINE | ID: mdl-29195703

ABSTRACT

OBJECTIVE: The main goal of this study was to develop an automatic method, based on supervised learning, able to distinguish healthy from pathologic arterial pulse wave (APW), and those two from noisy waveforms (non-relevant segments of the signal), in data acquired during a clinical examination with a novel optical system. MATERIALS AND METHODS: The APW dataset analysed was composed of signals acquired in a clinical environment from a total of 213 subjects, including healthy volunteers and non-healthy patients. The signals were parameterised by means of 39 pulse features: morphologic, time domain statistics, cross-correlation features, and wavelet features. A multiclass Support Vector Machine Recursive Feature Elimination (SVM RFE) method was used to select the most relevant features. A comparative study was performed in order to evaluate the performance of two classifiers: Support Vector Machine (SVM) and Artificial Neural Network (ANN). RESULTS AND DISCUSSION: SVM achieved a statistically significantly better performance for this problem, with an average accuracy of 0.9917±0.0024 and an F-Measure of 0.9925±0.0019, in comparison with ANN, which reached values of 0.9847±0.0032 and 0.9852±0.0031 for Accuracy and F-Measure, respectively. A significant difference was observed between the performances obtained with the SVM classifier using different numbers of features from the original set available. CONCLUSION: The comparison between SVM and ANN reaffirmed the higher performance of SVM. The results obtained in this study showed the potential of the proposed method to differentiate those three important signal outcomes (healthy, pathologic and noise) and to reduce bias associated with clinical diagnosis of cardiovascular disease using APW.


Subject(s)
Arteries/pathology , Neural Networks, Computer , Pulsatile Flow/physiology , Pulse Wave Analysis/methods , Support Vector Machine , Algorithms , Case-Control Studies , Equipment Design , Humans , Signal Processing, Computer-Assisted
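
Entry 13 reports an F-Measure for a three-class problem (healthy, pathologic, noise). A minimal sketch of a per-class F-measure and its unweighted (macro) average is given below. Macro averaging, the function names, and the toy predictions are assumptions; the abstract does not say how the per-class values were combined.

```python
def f_measure(y_true, y_pred, label):
    """Harmonic mean of precision and recall for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f_measure(y_true, y_pred):
    """Unweighted mean of the per-class F-measures."""
    labels = sorted(set(y_true))
    return sum(f_measure(y_true, y_pred, lab) for lab in labels) / len(labels)


# Hypothetical waveform labels and classifier outputs.
y_true = ["healthy", "healthy", "pathologic", "pathologic", "noise", "noise"]
y_pred = ["healthy", "pathologic", "pathologic", "pathologic", "noise", "noise"]
macro_f = macro_f_measure(y_true, y_pred)
```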
14.
Clin Neurophysiol ; 128(12): 2400-2410, 2017 Dec.
Article in English | MEDLINE | ID: mdl-29096213

ABSTRACT

OBJECTIVE: Attention-deficit/hyperactivity disorder (ADHD) is the most frequent diagnosis among children who are referred to psychiatry departments. Although ADHD was discovered at the beginning of the 20th century, its diagnosis is still confronted with many problems. METHOD: A novel classification approach that discriminates ADHD and nonADHD groups over the time-frequency domain features of event-related potential (ERP) recordings that are taken during Stroop task is presented. Time-Frequency Hermite-Atomizer (TFHA) technique is used for the extraction of high resolution time-frequency domain features that are highly localized in time-frequency domain. Based on an extensive investigation, Support Vector Machine-Recursive Feature Elimination (SVM-RFE) was used to obtain the best discriminating features. RESULTS: When the best three features were used, the classification accuracy for the training dataset reached 98%, and the use of five features further improved the accuracy to 99.5%. The accuracy was 100% for the testing dataset. Based on extensive experiments, the delta band emerged as the most contributing frequency band and statistical parameters emerged as the most contributing feature group. CONCLUSION: The classification performance of this study suggests that TFHA can be employed as an auxiliary component of the diagnostic and prognostic procedures for ADHD. SIGNIFICANCE: The features obtained in this study can potentially contribute to the neuroelectrical understanding and clinical diagnosis of ADHD.


Subject(s)
Algorithms , Attention Deficit Disorder with Hyperactivity/classification , Electroencephalography/classification , Evoked Potentials/physiology , Machine Learning/classification , Stroop Test , Attention Deficit Disorder with Hyperactivity/diagnosis , Attention Deficit Disorder with Hyperactivity/physiopathology , Child , Electroencephalography/methods , Humans , Male , Time Factors
15.
J Bioinform Comput Biol ; 15(1): 1650025, 2017 Feb.
Article in English | MEDLINE | ID: mdl-27411307

ABSTRACT

Palmitoylation is the covalent attachment of lipids to amino acid residues in proteins. As an important form of protein posttranslational modification, it increases the hydrophobicity of proteins, which contributes to protein transport, organelle localization, and function, and therefore plays an important role in a variety of cell biological processes. Identification of palmitoylation sites is necessary for understanding protein-protein interaction, protein stability, and activity. Since conventional experimental techniques to determine palmitoylation sites in proteins are both labor intensive and costly, a fast and accurate computational approach to predict palmitoylation sites from protein sequences is urgently needed. In this study, a support vector machine (SVM)-based method was proposed through integrating PSI-BLAST profile, physicochemical properties, [Formula: see text]-mer amino acid compositions (AACs), and [Formula: see text]-mer pseudo AACs into the principal feature vector. A recursive feature selection scheme was subsequently implemented to single out the most discriminative features. Finally, an SVM method was implemented to predict palmitoylation sites in proteins based on the optimal features. The proposed method achieved an accuracy of 99.41% and a Matthews Correlation Coefficient of 0.9773 for a benchmark dataset. The result indicates the efficiency and accuracy of our method in prediction of palmitoylation sites based on protein sequences.


Subject(s)
Computational Biology/methods , Lipoylation , Machine Learning , Proteins/metabolism , Algorithms , Amino Acids/chemistry , Colorectal Neoplasms/metabolism , Databases, Protein , Humans , Protein Processing, Post-Translational , Support Vector Machine
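
Entry 15 reports a Matthews Correlation Coefficient alongside accuracy. The coefficient is computed from the four cells of a binary confusion matrix, as in the sketch below. The function name and the counts are illustrative assumptions and are not taken from the study's benchmark.

```python
from math import sqrt


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero (denominator undefined).
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts: 3 true positives, 3 true negatives, 1 of each error.
mcc = matthews_corrcoef(tp=3, tn=3, fp=1, fn=1)
```

Unlike accuracy, the coefficient stays informative when palmitoylated and non-palmitoylated sites are heavily imbalanced, which is a common reason for reporting both.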
16.
Talanta ; 139: 198-207, 2015 Jul 01.
Article in English | MEDLINE | ID: mdl-25882427

ABSTRACT

A novel method using a hyperspectral imaging technique with the weighted combination of spectral data and image features by fuzzy neural network (FNN) was proposed for real-time prediction of polyphenol oxidase (PPO) activity in lychee pericarp. Lychee images were obtained by a hyperspectral reflectance imaging system operating in the range of 400-1000 nm. A support vector machine-recursive feature elimination (SVM-RFE) algorithm was applied to eliminate, from all bands, variables carrying little or no information for the prediction, resulting in a reduced set of optimal wavelengths. Spectral information at the optimal wavelengths and image color features were then used separately to develop calibration models for the prediction of PPO in pericarp during storage, and the results of the two models were compared. In order to improve the prediction accuracy, a decision strategy was developed based on a weighted combination of spectral data and image features, in which the weights were determined by FNN for a better estimation of PPO activity. The results showed that the combined decision model was the best among all of the calibration models, with high R² values of 0.9117 and 0.9072 and low RMSEs of 0.45% and 0.459% for calibration and prediction, respectively. These results demonstrate that the proposed weighted combined decision method has great potential for improving model performance. The proposed technique could be used for a better prediction of other internal and external quality attributes of fruits.


Subject(s)
Catechol Oxidase/analysis , Catechol Oxidase/metabolism , Image Processing, Computer-Assisted/methods , Litchi/chemistry , Models, Theoretical , Neural Networks, Computer , Algorithms , Calibration , Support Vector Machine
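
The R² and RMSE figures quoted in entry 16 are standard regression metrics, and a small sketch shows how they are derived from measured and predicted values. The function names and the numbers are assumptions for illustration and are unrelated to the lychee measurements.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean squared error between measured and predicted values."""
    n = len(y_true)
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("y_true has zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted activity values.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 3.0, 3.5]
error = rmse(measured, predicted)
fit = r_squared(measured, predicted)
```

Reporting both metrics on calibration and prediction sets, as the abstract does, helps reveal overfitting: a large gap between the two sets would suggest the model does not generalise.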