Results 1 - 7 of 7
1.
PLoS One ; 19(4): e0301528, 2024.
Article in English | MEDLINE | ID: mdl-38635694

ABSTRACT

An inexpensive, high-performing solid Coumarone resin was added to styrene-butadiene-styrene (SBS) copolymer-modified asphalt to enhance its storage stability and road performance. To assess the effect of Coumarone resin dosage on the SBS-modified asphalt, a series of laboratory tests was conducted. A segregation test was used to evaluate the storage stability of the composite modified asphalt; Dynamic Shear Rheometer (DSR) and Multiple Stress Creep Recovery (MSCR) tests were used to investigate its high-temperature performance and resistance to permanent deformation; and the Bending Beam Rheometer (BBR) test was used to measure its low-temperature performance. Fluorescence microscopy was used to observe the microstructure of the composite modified asphalt, and Fourier Transform Infrared Spectroscopy (FTIR) was used to study the changes in chemical structure during the modification process. The results showed that Coumarone resin improves the compatibility of SBS and asphalt and improves the high-temperature performance and deformation resistance of SBS-modified asphalt, and that an appropriate amount of Coumarone resin helps enhance the low-temperature cracking resistance of the modified asphalt. Comparing the test results of samples with various dosages, the optimal Coumarone resin dosage recommended for enhancing SBS-modified asphalt performance is 2% under the test conditions.


Subject(s)
Benzofurans , Hydrocarbons , Styrene , Cold Temperature , Resins, Plant
2.
PLoS One ; 19(3): e0290332, 2024.
Article in English | MEDLINE | ID: mdl-38466662

ABSTRACT

BACKGROUND: Cancer diagnosis based on machine learning has become a popular application area. The support vector machine (SVM), a classical machine learning algorithm, has been widely used in cancer diagnosis because of its advantages on high-dimensional, small-sample data. However, because gene expression data has a high-dimensional feature space and high feature redundancy, SVM classifies such data poorly. METHODS: This paper therefore proposes a hybrid feature selection algorithm combining information gain and grouping particle swarm optimization (IG-GPSO). The algorithm first calculates the information gain of each feature and ranks the features in descending order of that value. Then, the ranked features are grouped according to the information index, so that the features within a group are close and the features outside the group are sparse. Finally, the grouped features are searched using grouping PSO and evaluated according to in-group and out-group. RESULTS: Experimental results show that the average accuracy (ACC) of the SVM on the feature subset selected by IG-GPSO is 98.50%, which is significantly better than that of traditional feature selection algorithms. Compared with KNN, the classification effect of the feature subset selected by IG-GPSO is still optimal. In addition, the results of multiple comparison tests show that the feature selection effect of IG-GPSO is significantly better than that of traditional feature selection algorithms. CONCLUSION: The feature subset selected by IG-GPSO not only has the best classification effect but also has the smallest feature scale (FS). More importantly, IG-GPSO significantly improves the ACC of SVM in cancer diagnosis.


Subject(s)
Algorithms , Neoplasms , Humans , Machine Learning , Neoplasms/diagnosis , Support Vector Machine
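
The first step described in the abstract above (compute the information gain of each feature, then rank in descending order) can be illustrated with a minimal sketch. It assumes features have already been discretized, which the abstract does not specify, and the function names `entropy`, `information_gain`, and `rank_by_information_gain` are hypothetical. It does not cover the grouping or PSO stages of IG-GPSO.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Label entropy minus the conditional entropy given a discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def rank_by_information_gain(columns, labels):
    """Return feature names sorted by descending information gain.

    `columns` maps a feature name to its list of (assumed discrete) values.
    """
    gains = {name: information_gain(vals, labels) for name, vals in columns.items()}
    return sorted(gains, key=gains.get, reverse=True)
```

For continuous gene expression values, a binning step would be needed before calling `information_gain`; the choice of binning is left open here.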
4.
J King Saud Univ Comput Inf Sci ; 35(9): 101731, 2023 Oct.
Article in English | MEDLINE | ID: mdl-38567001

ABSTRACT

Aim: Gene expression data is typically high dimensional with a limited number of samples and contains many features that are unrelated to the disease of interest. Existing unsupervised feature selection algorithms primarily focus on the significance of features in maintaining the data structure while not taking into account the redundancy among features. Determining the appropriate number of significant features is another challenge. Method: In this paper, we propose a clustering-guided unsupervised feature selection (CGUFS) algorithm for gene expression data that addresses these problems. Our proposed algorithm introduces three improvements over existing algorithms. Because existing clustering algorithms require the number of clusters to be specified manually, we propose an adaptive k-value strategy that assigns appropriate pseudo-labels to each sample by iteratively updating a change function. Because existing algorithms fail to consider the redundancy among features, we propose a feature grouping strategy that groups highly redundant features. Because existing algorithms cannot filter out redundant features, we propose an adaptive filtering strategy that determines the feature combinations to be retained by calculating the potentially effective features and potentially redundant features of each feature group. Result: Experimental results show that the average accuracy (ACC) and Matthews correlation coefficient (MCC) of the C4.5 classifier on the optimal features selected by the CGUFS algorithm reach 74.37% and 63.84%, respectively, significantly superior to the existing algorithms. Conclusion: Similarly, the average ACC and MCC of the AdaBoost classifier on the optimal features selected by the CGUFS algorithm are significantly superior to the existing algorithms. In addition, statistical experiment results show significant differences between the CGUFS algorithm and the existing algorithms.
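
The abstract above does not say how redundancy between features is measured. As a rough illustration of what "grouping highly redundant features" can look like, the sketch below assumes absolute Pearson correlation with a fixed threshold and a greedy assignment. Both are assumptions made for this example, not the CGUFS procedure, and the names `pearson` and `group_redundant` are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)

def group_redundant(columns, threshold=0.9):
    """Greedily place each feature in the first group whose representative it correlates with.

    `columns` maps a feature name to its values; the first name in each
    group acts as that group's representative. The threshold is an assumption.
    """
    groups = []
    for name, values in columns.items():
        for group in groups:
            if abs(pearson(values, columns[group[0]])) >= threshold:
                group.append(name)
                break
        else:
            # No existing group is similar enough: start a new one.
            groups.append([name])
    return groups
```

A greedy pass like this depends on feature order; a clustering-based grouping would avoid that at higher cost.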

5.
Article in English | MEDLINE | ID: mdl-35984792

ABSTRACT

Data imbalance is a common phenomenon in machine learning. In imbalanced data classification, minority samples are far fewer than majority samples, which makes it difficult for classifiers to learn the minority class effectively. The synthetic minority oversampling technique (SMOTE) improves the sensitivity of classifiers to the minority class by synthesizing minority samples without repetition. However, the process of synthesizing new samples in the SMOTE algorithm may lead to problems such as "noisy samples" and "boundary samples." To address these problems, we propose a synthetic minority oversampling technique based on Gaussian mixture model filtering (GMF-SMOTE). GMF-SMOTE uses the expectation-maximization algorithm based on the Gaussian mixture model to group the imbalanced data. Then, the expectation-maximization filtering algorithm is used to filter out the "noisy samples" and "boundary samples" in the subclasses after grouping. Finally, to synthesize majority and minority samples, we design two dynamic oversampling ratios. Experimental results show that GMF-SMOTE performs better than the traditional oversampling algorithms on 20 UCI datasets. The population averages of the sensitivity and specificity of random forest (RF) on the UCI datasets synthesized by GMF-SMOTE are 97.49% and 97.02%, respectively. In addition, we record the G-mean and MCC of the RF, which are 97.32% and 94.80%, respectively, significantly better than the traditional oversampling algorithms. More importantly, the two statistical tests show that GMF-SMOTE is significantly better than the traditional oversampling algorithms.
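
For readers unfamiliar with the baseline, the sketch below shows the interpolation at the core of plain SMOTE: a synthetic point is drawn on the line segment between a minority sample and one of its k nearest minority neighbors. It is not GMF-SMOTE (no Gaussian mixture grouping, filtering, or dynamic ratios), and the function names, the default `k`, and the seeding are illustrative assumptions.

```python
import random

def nearest_neighbors(x, minority, k):
    """Return the k minority points closest to x (squared Euclidean distance), excluding x itself."""
    others = [m for m in minority if m is not x]
    others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(x, m)))
    return others[:k]

def smote_sample(x, neighbor, rng):
    """Synthesize one point on the segment between x and a minority neighbor."""
    gap = rng.random()  # uniform in [0, 1)
    return [xi + gap * (ni - xi) for xi, ni in zip(x, neighbor)]

def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic minority samples; needs at least two minority points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbor = rng.choice(nearest_neighbors(x, minority, k))
        synthetic.append(smote_sample(x, neighbor, rng))
    return synthetic
```

Because every synthetic point lies between two existing minority samples, a noisy or borderline seed produces noisy or borderline offspring, which is the issue the abstract says the filtering step targets.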

6.
J Biomed Inform ; 107: 103465, 2020 07.
Article in English | MEDLINE | ID: mdl-32512209

ABSTRACT

The problem of imbalanced data classification often arises in medical diagnosis. Traditional classification algorithms usually assume that the number of samples in each class is similar and that misclassification costs during training are equal. However, the misclassification cost of patient samples is higher than that of healthy-person samples. Therefore, how to improve the identification of patients without affecting the classification of healthy individuals is an urgent problem. To solve the problem of imbalanced data classification in medical diagnosis, we propose a hybrid sampling algorithm called RFMSE, which combines the Misclassification-oriented Synthetic minority over-sampling technique (M-SMOTE) and Edited nearest neighbor (ENN) based on Random forest (RF). The algorithm consists of three main parts. First, M-SMOTE is used to increase the number of samples in the minority class, with the over-sampling rate of M-SMOTE set to the misclassification rate of RF. Then, ENN is used to remove noisy samples from the majority class. Finally, RF is used to perform classification prediction on the samples after hybrid sampling, and the stopping criterion for the iterations is determined by the changes in the classification index (i.e., the Matthews Correlation Coefficient (MCC)). When the value of MCC drops continuously, the iterations are stopped. Extensive experiments conducted on ten UCI datasets demonstrate that RFMSE can effectively solve the problem of imbalanced data classification. Compared with traditional algorithms, our method improves F-value and MCC more effectively.


Subject(s)
Algorithms , Research Design , Humans
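
The stopping rule in the abstract above depends on the Matthews Correlation Coefficient. A minimal sketch of MCC from binary confusion counts, together with one possible reading of "MCC continuously drops", follows. The `patience` parameter and the `should_stop` helper are assumptions made for illustration, since the abstract does not say how many consecutive drops trigger the stop.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from binary confusion counts.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def should_stop(history, patience=2):
    """True when each of the last `patience` MCC values fell below the one before it."""
    if len(history) <= patience:
        return False
    recent = history[-(patience + 1):]
    return all(later < earlier for earlier, later in zip(recent, recent[1:]))
```

In an iterative resampling loop, `history` would hold the MCC measured after each round, and the loop would end once `should_stop(history)` returns true.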
7.
J Biomed Inform ; 78: 144-155, 2018 02.
Article in English | MEDLINE | ID: mdl-29137965

ABSTRACT

From the perspective of clinical decision-making in a Medical IoT-based healthcare system, effective and efficient analysis of long-term health data is an extremely important objective. However, how to deal effectively with the multi-dimensionality and high volume of data generated by Medical IoT-based healthcare systems is an issue of increasing importance in IoT healthcare data exploration and management. A classifier or predictor equipped with a good feature selection function contributes effectively to classification and prediction performance. This paper proposes a novel bagging C4.5 algorithm based on wrapper feature selection to support clinical decision-making in the medical and healthcare fields. In particular, the newly proposed sampling method, S-C4.5-SMOTE, not only overcomes the problem of data distortion but also improves overall system performance, because its mechanism aims to reduce the data size effectively without distortion by keeping datasets balanced and technically smooth. This directly supports the wrapper method of effective feature selection without the need to consider the problem of huge amounts of data, which is the novel contribution of this work.


Subject(s)
Algorithms , Clinical Decision-Making/methods , Decision Support Systems, Clinical , Machine Learning , Data Mining , Electronic Health Records , Humans