Results 1 - 20 of 112
1.
Brief Bioinform ; 25(6)2024 Sep 23.
Article in English | MEDLINE | ID: mdl-39358034

ABSTRACT

We sought to develop and validate a machine learning (ML) model for predicting multidimensional frailty based on clinical and laboratory data. Moreover, an explainable ML model utilizing SHapley Additive exPlanations (SHAP) was constructed. This study enrolled 622 patients hospitalized due to decompensating episodes at a tertiary hospital. The cohort data were randomly divided into training and test sets. External validation was carried out using 131 patients from other tertiary hospitals. The frail phenotype was defined according to a self-reported questionnaire (Frailty Index). The area under the receiver operating characteristics curve was adopted to compare the performance of five ML models. The importance of the features and interpretation of the ML models were determined using the SHAP method. The proportions of cirrhotic patients with nonfrail and frail phenotypes in combined training and test sets were 87.8% and 12.2%, respectively, while they were 88.5% and 11.5% in the external validation dataset. Five ML algorithms were used, and the random forest (RF) model exhibited substantial predictive performance. Regarding the external validation, the RF algorithm outperformed other ML models. Moreover, the SHAP method demonstrated that neutrophil-to-lymphocyte ratio, age, lymphocyte-to-monocyte ratio, ascites, and albumin served as the most important predictors for frailty. At the patient level, the SHAP force plot and decision plot exhibited a clinically meaningful explanation of the RF algorithm. We constructed an ML model (RF) providing accurate prediction of frail phenotype in decompensated cirrhosis. The explainability and generalizability may help clinicians understand the contributors to this physiologically vulnerable situation and tailor interventions.
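The abstract above compares models by the area under the ROC curve. As a minimal sketch of what that number measures (not the authors' code), the AUC can be computed as the fraction of frail/nonfrail pairs in which the frail patient receives the higher predicted score, with ties counted as half. The function name `roc_auc` and the toy labels and scores are assumptions made for illustration only.

```python
def roc_auc(labels, scores):
    """Rank-based AUC: probability that a positive case outscores a negative one.

    labels: 1 = frail (positive), 0 = nonfrail (negative); hypothetical toy data.
    scores: predicted frailty scores from any classifier.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count pairwise wins; a tie contributes half a win.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Illustrative values only, not data from the study.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
```

With classes as imbalanced as those reported (about 12% frail), AUC is a reasonable summary because it does not depend on a single decision threshold.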


Subject(s)
Frailty , Hospitalization , Liver Cirrhosis , Machine Learning , Humans , Liver Cirrhosis/complications , Female , Male , Middle Aged , Aged , Algorithms , ROC Curve
2.
J Hazard Mater ; 479: 135695, 2024 Nov 05.
Article in English | MEDLINE | ID: mdl-39217922

ABSTRACT

The capillary zone plays a crucial role in the migration and transformation of pollutants. Light nonaqueous phase liquids (LNAPLs) have become the main organic pollutants in soil and groundwater environments. However, few studies have focused on the concentration distribution characteristics and quantitative expression of LNAPL pollutants within the capillary zone. In this study, we conducted a sandbox-migration experiment using diesel oil as a typical LNAPL pollutant, with the capillary zone of silty sand as the research object. The variation characteristics of LNAPL pollutant (total petroleum hydrocarbon) concentration and environmental factors (moisture content, electrical conductivity, pH, and oxidation-reduction potential) were essentially consistent at different locations with the same height. These characteristics differed within the range of 10.0-50.0 cm and above 60.0 cm from groundwater. A model for quantitative expression of concentrations was constructed by coupling multiple environmental factors from 968 sets (7744 data points) via the random forest algorithm. The goodness of fit (R²) for both training and test sets was greater than 0.90, and the mean absolute percentage error (MAPE) was less than 16.00 %. The absolute values of relative errors in predicting concentrations at characteristic points were less than 15.00 %. The constructed model can accurately and quantitatively express and predict concentrations in the capillary zone.
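The two fit statistics quoted in the abstract above, R² and MAPE, can be sketched in a few lines. This is a minimal illustration assuming the common definitions (R² as one minus the ratio of residual to total sum of squares, MAPE as the mean of absolute relative errors in percent); the abstract does not state which variant the authors used, and the function names and sample values are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mape(observed, predicted):
    """Mean absolute percentage error, in percent. Observed values must be nonzero."""
    relative_errors = [abs((o - p) / o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(relative_errors) / len(relative_errors)


# Hypothetical concentrations, for illustration only.
observed_tph = [100.0, 200.0, 300.0, 400.0]
predicted_tph = [110.0, 190.0, 310.0, 390.0]
fit_r2 = r_squared(observed_tph, predicted_tph)
fit_mape = mape(observed_tph, predicted_tph)
```

The two metrics are complementary: R² is dominated by the high-concentration samples, while MAPE weights every sample by its own magnitude, which is why a model can pass one threshold and fail the other.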

3.
Heliyon ; 10(17): e37065, 2024 Sep 15.
Article in English | MEDLINE | ID: mdl-39286064

ABSTRACT

Maize (Zea mays) is an important staple crop for food security in Sub-Saharan Africa. However, there is a need to increase production to feed a growing population. In Ghana, this is mainly done by increasing acreage with adverse environmental consequences, rather than yield increment per unit area. Accurate prediction of maize yields and nutrient use efficiency in production is critical to making informed decisions toward economic and ecological sustainability. We trained the random forest machine learning algorithm to predict maize yield and agronomic efficiency in Ghana using soil, climate, environment, and management factors, including fertilizer application. We calibrated and evaluated the performance of the random forest machine learning algorithm using a 5 × 10-fold nested cross-validation approach. Data from 482 maize field trials consisting of 3136 georeferenced treatment plots conducted in Ghana from 1991 to 2020 were used to train the algorithm, identify important predictor variables, and quantify the uncertainties associated with the random forest predictions. The mean error, root mean squared error, model efficiency coefficient and 90 % prediction interval coverage probability were calculated. The results obtained on test data demonstrate good prediction performance for yield (MEC = 0.81) and moderate performance for agronomic efficiency (MEC = 0.63, 0.55 and 0.54 for AE-N, AE-P and AE-K, respectively). We found that climatic variables were less important predictors than soil variables for yield prediction, but temperature was of key importance to yield prediction and rainfall to agronomic efficiency. The developed random forest models provided a better understanding of the drivers of maize yield and agronomic efficiency in a tropical climate and an insight towards improving fertilizer recommendations for sustainable maize production and food security in Sub-Saharan Africa.
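The evaluation design in the abstract above can be sketched as index bookkeeping. The sketch below assumes that "5 × 10-fold nested cross-validation" means 5 outer folds for testing with 10 inner folds for tuning inside each outer training set; the abstract does not spell this out, and it may instead group plots by trial. All function names are hypothetical. A helper for the prediction interval coverage probability (PICP) is included, since the abstract reports it for 90 % intervals.

```python
import random


def k_fold_indices(indices, k, seed):
    """Shuffle the given indices and deal them into k roughly equal folds."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_cv_splits(n_samples, outer_k=5, inner_k=10, seed=0):
    """Yield (train_idx, test_idx, inner_splits) for each outer fold.

    inner_splits is a list of (fit_idx, validation_idx) pairs drawn only from
    the outer training indices, so the outer test fold never influences tuning.
    """
    outer_folds = k_fold_indices(range(n_samples), outer_k, seed)
    for i, test_idx in enumerate(outer_folds):
        train_idx = [j for f, fold in enumerate(outer_folds) if f != i for j in fold]
        inner_folds = k_fold_indices(train_idx, inner_k, seed + 1 + i)
        inner_splits = []
        for m, val_idx in enumerate(inner_folds):
            fit_idx = [j for f, fold in enumerate(inner_folds) if f != m for j in fold]
            inner_splits.append((fit_idx, val_idx))
        yield train_idx, test_idx, inner_splits


def picp(observed, lower, upper):
    """Fraction of observations that fall inside their prediction interval."""
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    return inside / len(observed)


# Hypothetical sample count, chosen only to make the fold sizes easy to read.
splits = list(nested_cv_splits(100))
```

For a well-calibrated 90 % prediction interval, `picp` on held-out data should come out close to 0.90; values far below that indicate intervals that are too narrow.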

4.
Clin Epigenetics ; 16(1): 122, 2024 Sep 07.
Article in English | MEDLINE | ID: mdl-39244604

ABSTRACT

BACKGROUND AND PURPOSE: Early detection, diagnosis, and treatment of colorectal cancer and its precancerous lesions can significantly improve patients' survival rates. The purpose of this research is to identify methylation markers specific to colorectal cancer tissues and validate their diagnostic capability in colorectal cancer and precancerous changes by measuring the level of DNA methylation in stool samples. METHOD: We analyzed samples from six cancer tissues and adjacent normal tissues and fecal samples from 758 participants, including 62 patients with interfering diseases. Bioinformatics databases were used to screen for candidate biomarkers for CRC, and quantitative methylation-specific PCR methods were applied for identification. The methylation levels of the candidate biomarkers in fecal and tissue samples were measured. Logistic regression and random forest models were built and validated using fecal sample data from one of the centers, and the independent or combined diagnostic value of the candidate biomarkers in fecal samples for CRC and precancerous lesions was analyzed. Finally, the diagnostic capability and stability of the model were validated at another medical center. RESULTS: This study identified two colorectal cancer CpG sites with tissue specificity. These two biomarkers have certain diagnostic power when used individually, but their diagnostic value for colorectal cancer and colorectal adenoma is more significant when they are used in combination. CONCLUSION: The results indicate that a DNA methylation biomarker combined diagnostic model based on two CpG sites, cg13096260 and cg12587766, has the potential for screening and diagnosing precancerous lesions and colorectal cancer. Additionally, compared to traditional diagnostic models, machine learning algorithms perform better but may yield more false-positive results, necessitating further investigation.


Subject(s)
Biomarkers, Tumor , Colorectal Neoplasms , DNA Methylation , Feces , Humans , Colorectal Neoplasms/genetics , Colorectal Neoplasms/diagnosis , DNA Methylation/genetics , Female , Male , Biomarkers, Tumor/genetics , Middle Aged , Retrospective Studies , Feces/chemistry , Aged , CpG Islands/genetics , Early Detection of Cancer/methods , Adult
5.
Sci Rep ; 14(1): 20895, 2024 Sep 08.
Article in English | MEDLINE | ID: mdl-39245664

ABSTRACT

Alpine natural heritage sites hold significant value due to their unique global resources. Studying land cover changes in these areas is crucial for maintaining and preserving their multiple values. This study takes Kalajun-Kuerdening, one of the components of Xinjiang Tianshan World Natural Heritage Site, as an example to analyze land cover changes and their driving factors in alpine heritage sites. Highlights include: (1) Between 1994 and 2023, Forest and Grassland increased by 55.96 km² and 18.16 km², with notable forest growth from 2007 to 2017. Trends in Forest changes align with forest protection policies, and a substantial amount of Bareland converted to Grassland indicates an increase in vegetation cover. (2) Elevation, precipitation, temperature, and evapotranspiration are key drivers of land cover changes, as validated by Random Forest algorithm and Geodetector model. (3) Favorable conditions for Grassland to Forest transition include annual precipitation between 275 and 375 mm, annual temperature between -2 and 3 °C, annual evapotranspiration between 580 and 750 mm, elevation between 1800 and 2600 m, and aspect between 0 and 110° and between 220 and 259.9°. Continuous monitoring of land cover changes and their driving factors in mountain heritage sites contributes to the protection of the ecological environment and provides data and information support for addressing climate change, resource management, and policy making.

6.
Gene ; 933: 148928, 2024 Sep 17.
Article in English | MEDLINE | ID: mdl-39265844

ABSTRACT

In this study, we redefine the diagnostic landscape of diabetic ulcers (DUs), a major diabetes complication. Our research uncovers new biomarkers linked to immunogenic cell death (ICD) in DUs by utilizing RNA-sequencing data of Gene Expression Omnibus (GEO) analysis combined with a comprehensive database interrogation. Employing a random forest algorithm, we have developed a diagnostic model that demonstrates improved accuracy in distinguishing DUs from normal tissue, with satisfactory results from ROC analysis. Beyond mere diagnosis, our model categorizes DUs into novel molecular classifications, which may enhance our comprehension of their underlying pathophysiology. This study bridges the gap between molecular insights and clinical practice. It sets the stage for transformative strategies in DUs management, marking a significant step forward in personalized medicine for diabetic patients.

7.
J Environ Manage ; 369: 122317, 2024 Oct.
Article in English | MEDLINE | ID: mdl-39217903

ABSTRACT

The growing use of information and communication technologies (ICT) has the potential to increase productivity and improve energy efficiency. However, digital technologies also consume energy, resulting in a complex relationship between digitalization and energy demand and an uncertain net effect. To steer digital transformation towards sustainability, it is crucial to understand the conditions under which digital technologies increase or decrease firm-level energy consumption. This study examines the drivers of this relationship, focusing on German manufacturing firms and leveraging comprehensive administrative panel data from 2009 to 2017, analyzed using the Generalized Random Forest algorithm. Our results reveal that the relationship between digitalization and energy use at the firm level is heterogeneous. However, we find that digitalization more frequently increases energy use, mainly driven by a rise in electricity consumption. This increase is lower in energy-intensive industries and higher in markets with low competition. Smaller firms in structurally weak regions show higher energy consumption growth than larger firms in economically stronger regions. Our study contributes to the literature by using a non-parametric method to identify specific firm-level and external characteristics that influence the impact of digital technologies on energy demand, highlighting the need for carefully designed digitalization policies to achieve climate goals.


Subject(s)
Electricity , Germany , Digital Technology , Industry
8.
Poult Sci ; 103(11): 104201, 2024 Nov.
Article in English | MEDLINE | ID: mdl-39197340

ABSTRACT

The differences in lipids in duck eggs between the 2 rearing systems during storage have not been fully studied. Herein, we propose untargeted lipidomics combined with a random forest (RF) algorithm to identify potential marker lipids based on ultra-performance liquid chromatography‒mass spectrometry (UPLC-MS/MS). A total of 106 and 16 differential lipids (DL) were screened in egg yolk and white, respectively. In yolk, metabolic pathway analysis of DLs revealed that glycerophospholipid metabolism and sphingolipid metabolism were the key metabolic pathways in the traditional free-range system (TFS) during storage, whereas glycosylphosphatidylinositol-anchored biosynthesis and glyceride metabolism were the key pathways in the floor-rearing system (FRS). In egg white, the key pathway in both systems is the biosynthesis of unsaturated fatty acids. Combined with the RF algorithm, 12 marker lipids were screened during storage. Therefore, this study elucidates the changes in lipids in duck eggs during storage in 2 rearing systems and provides new ideas for screening marker lipids during storage. This approach is highly important for evaluating the quality of eggs and egg products and provides guidance for duck egg production.


Subject(s)
Ducks , Lipidomics , Machine Learning , Animals , Lipidomics/methods , Animal Husbandry/methods , Food Storage , Algorithms , Egg Yolk/chemistry , Tandem Mass Spectrometry/veterinary , Ovum/chemistry , Egg White/chemistry , Lipids/analysis , Lipids/chemistry , Random Forest
9.
Sci Total Environ ; 949: 174724, 2024 Nov 01.
Article in English | MEDLINE | ID: mdl-39059649

ABSTRACT

Sustained deep emission reduction in road transportation is encountering a bottleneck. The Intelligent Transportation-Speed Guidance System (ITSGS) is anticipated to overcome this challenge and facilitate the achievement of low-carbon and clean transportation. Here, we compiled vehicle emission datasets collected from real-world road experiments and identified the mapping relationships between four pollutants (CO2, CO, NOx, and THC) and their influencing factors through machine learning. We developed random forest models for each pollutant and achieved strong predictive performance, with an R² exceeding 0.85 on the test dataset for all models. The environmental benefits of ITSGS at the urban scale were quantified by combining emission models with large-scale real trajectory data from Zibo, Shandong Province. Based on temporal and spatial analyses, we found that ITSGS has varying degrees of emission reduction potential during the morning peak, flat peak, and evening peak hours. Values can range from 5.71 %-8.16 % for CO2 emissions, 13.63 %-16.25 % for NOx emissions, 13.69 %-16.45 % for CO emissions, and 4.84 %-7.07 % for THC emissions, respectively. Additionally, ITSGS can significantly expand the area of low transient emission zones. The best time for achieving maximum environmental benefits from ITSGS is during the workday flat peak. ITSGS limits high-speed and aggressive driving behavior, thereby smoothing the driving trajectory, reducing the frequency of speed switches, and lowering road traffic emissions. The results of the ITSGS environmental benefits evaluation will provide new insights and solutions for sustainable road traffic emission reduction. SYNOPSIS: Large-scale deployment of Intelligent Transportation - Speed Guidance System is a sustainable solution to help achieve low-carbon and clean transportation.

10.
Sci Total Environ ; 945: 174093, 2024 Oct 01.
Article in English | MEDLINE | ID: mdl-38906307

ABSTRACT

Black carbon (BC) and brown carbon (BrC) over the high-altitude Tibetan Plateau (TP) can significantly influence regional and global climate change as well as glacial melting. However, obtaining plateau-scale in situ observations is challenging due to its high altitude. By integrating reanalysis data with on-site measurements, the spatial distribution of BC and BrC can be accurately estimated using the random forest algorithm (RF). In our study, the on-site observations of BC and BrC were successively conducted at four sites from 2018 to 2021. Ground-level BC and BrC concentrations were then obtained at a spatial resolution of 0.25° × 0.25° for three periods (including Periods-1980, 2000, and 2020) using RF and multi-source data. The highest annual concentrations of BC (1363.9 ± 338.7 ng/m³) and BrC (372.1 ± 96.2 ng/m³) were observed during Period-2000. BC contributed a dominant proportion of carbonaceous aerosol, with concentrations 3-4 times higher than those of BrC across the three periods. The ratios of BrC to BC decreased from Period-1980 to Period-2020, indicating the increasing importance of BC over the TP. Spatial distributions of plateau-scale BC and BrC concentrations showed heightened levels in the southeastern TP, particularly during Period-2000. These findings significantly enhance our understanding of the spatio-temporal distribution of light-absorbing carbonaceous aerosol over the TP.

11.
Appl Environ Microbiol ; 90(7): e0048224, 2024 07 24.
Article in English | MEDLINE | ID: mdl-38832775

ABSTRACT

Wood-rotting fungi play an important role in the global carbon cycle because they are the only known organisms that digest wood, the largest carbon stock in nature. In the present study, we used linear discriminant analysis and random forest (RF) machine learning algorithms to predict white- or brown-rot decay modes from the numbers of genes encoding Carbohydrate-Active enZymes with over 98% accuracy. Unlike other algorithms, RF identified specific genes involved in cellulose and lignin degradation, including auxiliary activities (AAs) family 9 lytic polysaccharide monooxygenases, glycoside hydrolase family 7 cellobiohydrolases, and AA family 2 peroxidases, as critical factors. This study sheds light on the complex interplay between genetic information and decay modes and underscores the potential of RF for comparative genomics studies of wood-rotting fungi. IMPORTANCE: Wood-rotting fungi are categorized as either white- or brown-rot modes based on the coloration of decomposed wood. The process of classification can be influenced by human biases. The random forest machine learning algorithm effectively distinguishes between white- and brown-rot fungi based on the presence of Carbohydrate-Active enZyme genes. These findings not only aid in the classification of wood-rotting fungi but also facilitate the identification of the enzymes responsible for degrading woody biomass.


Subject(s)
Machine Learning , Wood , Wood/microbiology , Algorithms , Fungal Proteins/genetics , Fungal Proteins/metabolism , Lignin/metabolism , Carbohydrate Metabolism , Fungi/genetics , Fungi/enzymology , Fungi/classification , Cellulose/metabolism , Random Forest
12.
Anal Sci ; 40(9): 1709-1722, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38836970

ABSTRACT

Coal is the primary energy source in China, widely used in energy production, industrial processes, and chemical engineering. Due to the complexity and diversity of coal quality, there is an urgent need for new technologies to achieve rapid and accurate detection and analysis of coal, aiming to improve coal resource utilization and reduce pollutant emissions. This study proposes a rapid quantitative analysis of coal using laser-induced breakdown spectroscopy combined with the random forest algorithm. Firstly, a Q-switched Nd:YAG laser at 1064 nm was employed to ablate coal samples, generating plasma, and spectral data were collected using a spectrometer. Secondly, the study explores the impact of different parameters in the preprocessing method (wavelet transform) on the predictive performance of the random forest model. It identifies elements related to coal ash content and calorific value along with their spectral information. Subsequently, to further validate the predictive performance of the model, a comparison is made with models established using support vector machine, artificial neural network, and partial least squares. Finally, under optimal parameters for spectral information preprocessing (wavelet transform with Db4 as the base function and 3 decomposition levels), a model combining wavelet transform with Random Forest is established to predict and analyze the ash content and calorific value of coal. The results demonstrate that the Wavelet Transform-Random Forest model exhibits excellent predictive performance (coal ash content: R² = 0.9470, RMSECV = 4.8594, RMSEP = 4.8450; coal calorific value: R² = 0.9485, RMSECV = 1.5996, RMSEP = 1.5949). Therefore, laser-induced breakdown spectroscopy combined with the random forest algorithm is an effective method for rapid and accurate detection and analysis of coal. The predicted coal composition values show high accuracy, providing insights and methods for coal composition monitoring and analysis.

13.
Sci Rep ; 14(1): 11690, 2024 05 22.
Article in English | MEDLINE | ID: mdl-38778144

ABSTRACT

This study explores the progression of intracerebral hemorrhage (ICH) in patients with mild to moderate traumatic brain injury (TBI). It aims to predict the risk of ICH progression using initial CT scans and identify clinical factors associated with this progression. A retrospective analysis of TBI patients between January 2010 and December 2021 was performed, focusing on initial CT evaluations and demographic, comorbid, and medical history data. ICH was categorized into intraparenchymal hemorrhage (IPH), petechial hemorrhage (PH), and subarachnoid hemorrhage (SAH). Within our study cohort, we identified a 22.2% progression rate of ICH among 650 TBI patients. The Random Forest algorithm identified variables such as petechial hemorrhage (PH) and countercoup injury as significant predictors of ICH progression. The XGBoost algorithm, incorporating key variables identified through SHAP values, demonstrated robust performance, achieving an AUC of 0.9. Additionally, an individual risk assessment diagram, utilizing significant SHAP values, visually represented the impact of each variable on the risk of ICH progression, providing personalized risk profiles. This approach, highlighted by an AUC of 0.913, underscores the model's precision in predicting ICH progression, marking a significant step towards enhancing TBI patient management through early identification of ICH progression risks.


Subject(s)
Brain Injuries, Traumatic , Disease Progression , Machine Learning , Humans , Male , Female , Brain Injuries, Traumatic/diagnostic imaging , Brain Injuries, Traumatic/pathology , Brain Injuries, Traumatic/complications , Middle Aged , Retrospective Studies , Adult , Cerebral Hemorrhage/diagnostic imaging , Cerebral Hemorrhage/pathology , Tomography, X-Ray Computed , Aged , Risk Assessment/methods
14.
J Environ Manage ; 360: 121212, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38801803

ABSTRACT

This study investigates the impact of green finance (GF) and green innovation (GI) on corporate credit rating (CR) performance in Chinese A-share listed firms from 2018 to 2021. The least absolute shrinkage and selection operators (LASSOs) machine learning algorithms are first used to select the critical drivers of corporate credit performance. Then, we applied partialing-out LASSO linear regression (POLR) and double selection LASSO linear regression (DSLR) machine learning techniques to check the impact of GF and GI on CR. The main results reveal that a 1% increase in GF diminishes CR by 0.26%, whereas GI promotes CR performance by 0.15%. Moreover, the heterogeneity analysis reveals a more significant negative effect of GF on the CR performance of heavily polluting firms, non-state-owned enterprises, and firms in the Western region. The findings raise policies for managing green finance and encouraging green innovation formation, as well as addressing company heterogeneity to support sustainability.


Subject(s)
Machine Learning , Algorithms , China
15.
Curr Biol ; 34(12): 2558-2569.e3, 2024 Jun 17.
Article in English | MEDLINE | ID: mdl-38776900

ABSTRACT

Herbivorous insects consume a large proportion of the energy flow in terrestrial ecosystems and play a major role in the dynamics of plant populations and communities. However, high-resolution, quantitative predictions of the global patterns of insect herbivory and their potential underlying drivers remain elusive. Here, we compiled and analyzed a dataset consisting of 9,682 records of the severity of insect herbivory from across natural communities worldwide to quantify its global patterns and environmental determinants. Global mapping revealed strong spatial variation in insect herbivory at the global scale, showing that insect herbivory did not significantly vary with latitude for herbaceous plants but increased with latitude for woody plants. We found that the cation-exchange capacity in soil was a main predictor of levels of herbivory on herbaceous plants, while climate largely determined herbivory on woody plants. We next used well-established scenarios for future climate change to forecast how spatial patterns of insect herbivory may be expected to change with climate change across the world. We project that herbivore pressure will intensify on herbaceous plants worldwide but would likely only increase in certain biomes (e.g., northern coniferous forests) for woody plants. Our assessment provides quantitative evidence of how environmental conditions shape the spatial pattern of insect herbivory, which enables a more accurate prediction of the vulnerabilities of plant communities and ecosystem functions in the Anthropocene.


Subject(s)
Climate Change , Herbivory , Insecta , Animals , Insecta/physiology , Ecosystem
16.
J Hazard Mater ; 473: 134693, 2024 Jul 15.
Article in English | MEDLINE | ID: mdl-38781855

ABSTRACT

Persistent cadmium exposure poses significant health risks to the Chinese population, underscored by its prevalence as an environmental contaminant. This study leverages a machine-learning model, fed with a comprehensive dataset of environmental and socio-economic factors, to delineate trends in cadmium exposure from 1980 to 2040. We uncovered that urinary cadmium levels peaked at 1.09 µg/g Cr in the mid-2000s. Encouragingly, a decline is projected to 0.92 µg/g Cr by 2025, tapering further to 0.87 µg/g Cr by 2040. Despite this trend, regions heavily influenced by industrialization, such as Hunan and Guizhou, as well as industrial counties in Jilin, report stubbornly high levels of exposure. Our demographic analysis reveals a higher vulnerability among adults and adolescents over 14, with males displaying elevated cadmium concentrations. Alarmingly, the projected data suggests that by 2040, an estimated 41% of the population will endure exposure beyond the safety threshold set by the European Food Safety Authority. Our research indicates disproportionate cadmium exposure impacts, necessitating targeted interventions and policy reforms to protect vulnerable groups and public health in China.


Subject(s)
Cadmium , Environmental Exposure , Cadmium/urine , China , Humans , Male , Adult , Adolescent , Young Adult , Female , Environmental Exposure/analysis , Child , Middle Aged , Child, Preschool , Environmental Pollutants/urine , Aged , Infant , Spatio-Temporal Analysis , Machine Learning
17.
Int J Gen Med ; 17: 2299-2309, 2024.
Article in English | MEDLINE | ID: mdl-38799198

ABSTRACT

Objective: This study aimed to explore specific biochemical indicators and construct a risk prediction model for diabetic kidney disease (DKD) in patients with type 2 diabetes (T2D). Methods: This study included 234 T2D patients, of whom 166 had DKD, at the First Hospital of Jilin University from January 2021 to July 2022. Clinical characteristics, such as age, gender, and typical hematological parameters, were collected and used for modeling. Five machine learning algorithms [Extreme Gradient Boosting (XGBoost), Gradient Boosting Machine (GBM), Support Vector Machine (SVM), Logistic Regression (LR), and Random Forest (RF)] were used to identify critical clinical and pathological features and to build a risk prediction model for DKD. Additionally, clinical data from 70 patients (nT2D = 20, nDKD = 50) were collected for external validation from the Third Hospital of Jilin University. Results: The RF algorithm demonstrated the best performance in predicting progression to DKD, identifying five major indicators: estimated glomerular filtration rate (eGFR), glycated albumin (GA), Uric acid, HbA1c, and Zinc (Zn). The prediction model showed sufficient predictive accuracy with area under the curve (AUC) values of 0.960 (95% CI: 0.936-0.984) and 0.9326 (95% CI: 0.8747-0.9885) in the internal validation set and external validation set, respectively. The diagnostic efficacy of the RF model (AUC = 0.960) was significantly higher than each of the five features screened with the highest feature importance in the RF model. Conclusion: The online DKD risk prediction model constructed using the RF algorithm was selected based on its strong performance in the internal validation.

18.
Environ Sci Technol ; 58(17): 7270-7278, 2024 Apr 30.
Article in English | MEDLINE | ID: mdl-38625742

ABSTRACT

Lead poisoning is globally concerning, yet limited testing hinders effective interventions in most countries. We aimed to create annual maps of county-specific blood lead levels in China from 1980 to 2040 using a machine learning model. Blood lead data from China were sourced from 1180 surveys published between 1980 and 2022. Additionally, regional statistical figures for 15 natural and socioeconomic variables were obtained or estimated as predictors. A machine learning model, using the random forest algorithm and 2973 generated samples, was created to predict county-specific blood lead levels in China from 1980 to 2040. Geometric mean blood lead levels in children (i.e., age 14 and under) decreased significantly from 104.4 µg/L in 1993 to an anticipated 40.3 µg/L by 2040. The number exceeding 100 µg/L declined dramatically, yet South Central China remains a hotspot. Lead exposure is similar among different groups, but overall adults and adolescents (i.e., age over 14), females, and rural residents exhibit slightly lower exposure compared to that of children, males, and urban residents, respectively. Our predictions indicated that despite the general reduction, one-fourth of Chinese counties rebounded during 2015-2020. This slower decline might be due to emerging lead sources like smelting and coal combustion; however, the primary factor driving the decline should be the reduction of a persistent source, legacy gasoline-derived lead. Our approach innovatively maps lead exposure without comprehensive surveys.


Subject(s)
Lead , Machine Learning , Lead/blood , China , Humans , Female , Male , Child , Adolescent , Environmental Exposure , Lead Poisoning/epidemiology , Lead Poisoning/blood
19.
Prev Med Rep ; 41: 102710, 2024 May.
Article in English | MEDLINE | ID: mdl-38576513

ABSTRACT

Objectives: To enhance the daily training quality of athletes without inducing significant physiological fatigue, aiming to achieve a balance between training efficiency and load. Design methods: Firstly, we developed an activity classification training model using the random forest algorithm and introduced the "effective training rate" (the ratio of effective activity time to total time) as a metric for assessing athlete training efficiency. Secondly, a method for rating athlete training load was established, involving qualitative and quantitative analyses of physiological fatigue through subjective fatigue scores and heart rate data. Lastly, an optimization system for training efficiency and load balance, utilizing multiple inertial sensors, was created. Athlete states were categorized into nine types based on the training load and efficiency ratings, with corresponding management recommendations provided. Results: Overall, this study, combining a sports activity recognition model with a physiological fatigue assessment model, has developed a training efficiency and load balance optimization system with excellent performance. The results indicate that the prediction accuracy of the sports activity recognition model is as high as 94.70%. Additionally, the physiological fatigue assessment model, utilizing average relative heart rate and average RPE score as evaluation metrics, demonstrates a good overall fit, validating the feasibility of this model. Conclusions: This study, based on relative heart rate and wearable devices to monitor athlete physiological fatigue, has developed a balanced optimization system for training efficiency and load. It provides a reference for athletes' physical health and fatigue levels, offering corresponding management recommendations for coaches and relevant professionals.
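The "effective training rate" defined in the abstract above (the ratio of effective activity time to total time) is simple to compute once each time window has an activity label. The sketch below assumes fixed-length sensor windows that an activity classifier has already labeled; the function name, the label strings, and the choice of which labels count as effective are all hypothetical, not taken from the study.

```python
def effective_training_rate(window_labels, window_seconds, effective_labels):
    """Ratio of time spent in effective activities to total session time.

    window_labels: one predicted activity label per fixed-length sensor window.
    window_seconds: duration of each window, in seconds.
    effective_labels: set of labels that count as effective training.
    """
    total_time = len(window_labels) * window_seconds
    if total_time == 0:
        raise ValueError("session contains no labeled windows")
    effective_time = sum(
        window_seconds for label in window_labels if label in effective_labels
    )
    return effective_time / total_time


# Hypothetical classifier output for five 2-second windows.
session = ["run", "rest", "jump", "rest", "run"]
rate = effective_training_rate(session, 2.0, {"run", "jump"})
```

Because the rate is built from predicted labels, its reliability is bounded by the classifier's accuracy, which is why the recognition accuracy reported in the abstract matters for this metric.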

20.
Sci Total Environ ; 925: 171366, 2024 May 15.
Article in English | MEDLINE | ID: mdl-38438049

ABSTRACT

As stepped cross sections of farmland built along contour lines, terraces are widely distributed on hillslopes. They change the original surface slope and runoff coefficient, reduce soil nutrient loss, and have become the most important soil erosion control measure in China. Accurate terrace mapping at the regional scale is crucial for soil conservation, agricultural sustainability and ecological planning. Because cloudy and rainy weather limits data availability, it is difficult to identify terrace distribution in mountainous areas using optical remote sensing images alone. In this study, we incorporated multi-spectral optical and SAR data together with terrain, texture and time-series features, and proposed a pixel-based supervised classification method based on a sample purification strategy to obtain a 10 m resolution terrace map of a plateau mountainous region. With 610 terrace/non-terrace validation samples, 10-fold cross-validation was used to test the classification results. For identified terraces, the Overall Accuracy (OA), Producer's Accuracy (PA) and User's Accuracy (UA) remain stable above 90 %, and the F1 score and Kappa coefficient show the smallest fluctuation, remaining stable in the ranges of 0.90-0.93 and 0.81-0.87, respectively. The accuracy evaluation of grid units shows that the uncertainty of the terrace distribution is mainly concentrated in the north and south of the study area. Sloping cultivated land, low-slope terraces and non-agricultural vegetation are easily confused because of the heterogeneity of terrace features and the spectral similarity among these land types. Notably, time-series and texture features, rather than terrain factors, play the key role in terrace recognition, which differs from previous studies. The proposed sample purification strategy provides a more reliable regional-scale terrace distribution map than the existing product and is potentially transferable to other mountainous areas as a robust approach for rapid terrace identification.
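The accuracy measures named in the abstract above (OA, PA, UA, F1 and Kappa) all derive from one terrace/non-terrace confusion matrix. The sketch below assumes the standard remote-sensing definitions, with Producer's Accuracy equal to recall and User's Accuracy equal to precision for the terrace class. The function name and the counts are hypothetical; the counts were chosen only so that they sum to 610, the validation sample size quoted in the abstract, and are not results from the study.

```python
def terrace_accuracy(tp, fp, fn, tn):
    """Accuracy metrics for a binary terrace / non-terrace confusion matrix.

    tp: terrace samples mapped as terrace      fp: non-terrace mapped as terrace
    fn: terrace samples mapped as non-terrace  tn: non-terrace mapped as non-terrace
    """
    n = tp + fp + fn + tn
    overall = (tp + tn) / n
    producers = tp / (tp + fn)  # share of reference terraces that were found
    users = tp / (tp + fp)      # share of mapped terraces that are correct
    f1 = 2 * producers * users / (producers + users)
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (overall - expected) / (1 - expected)
    return {"OA": overall, "PA": producers, "UA": users, "F1": f1, "Kappa": kappa}


# Hypothetical counts summing to 610 validation samples.
metrics = terrace_accuracy(tp=280, fp=25, fn=20, tn=285)
```

Kappa is lower than OA for the same matrix because it discounts the agreement a random labeling with the same class totals would achieve, which is consistent with the abstract reporting Kappa in a lower range than the other measures.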
