1.
Sci Total Environ ; 954: 176256, 2024 Sep 20.
Article in English | MEDLINE | ID: mdl-39299317

ABSTRACT

Modeling nitrate fate and transport in water sources is an essential component of predictive water quality management. Both mechanistic and data-driven models are currently in use. Mechanistic models, such as SWAT, simulate daily nitrate loads based on the results of simulating water flow. Data-driven models allow one to simulate nitrate loads and water flow independently. The performance of SWAT and a deep learning model was evaluated for cases in which the deep learning model is used (a) in independent simulations of flow series and nitrate concentration series, and (b) in both flow rate and concentration simulations to obtain nitrate load values. The data were collected at the Tuckahoe Creek watershed in Maryland, United States. The data-driven deep learning model was built using long short-term memory (LSTM) and three-dimensional convolutional networks (3D Convolutional Networks) to simulate flow rate and nitrate concentration using weather data and imagery to derive leaf area index according to land use. Models were calibrated with data over the training period 2014-2017 and validated with data over the testing period. SWAT Nash-Sutcliffe efficiency (NSE) was 0.31 and 0.40 for flow rate and -0.26 and -0.18 for the nitrate load rate over training and testing periods, respectively. Three data-driven modeling scenarios were implemented: (1) using the observed flow rate and simulated nitrate concentration, (2) using the simulated flow rate and observed nitrate concentration, and (3) using the simulated flow rate and nitrate concentration. The deep learning model performed better than SWAT in all three scenarios, with NSE from 0.49 to 0.58 for training and from 0.28 to 0.80 for testing periods, with scenario 1 showing the best results. The difference in performance was most pronounced in fall and winter seasons. Deep learning modeling can be an efficient alternative to mechanistic watershed-scale water quality models, provided that regular high-frequency data collection is implemented.
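
The Nash-Sutcliffe efficiency reported in this abstract, and the way a nitrate load follows from flow rate and concentration, can be illustrated with a minimal sketch. The function names and the units (flow in m3/s, concentration in mg/L, giving load in g/s) are assumptions chosen for illustration, not the authors' code.

```python
from typing import List, Sequence


def nash_sutcliffe_efficiency(observed: Sequence[float],
                              simulated: Sequence[float]) -> float:
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    1.0 is a perfect fit; 0.0 means the model is no better than
    predicting the observed mean; negative values are worse than the mean.
    """
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("observed series has zero variance")
    return 1.0 - residual / variance


def nitrate_load(flow_m3_s: Sequence[float],
                 conc_mg_l: Sequence[float]) -> List[float]:
    """Load as flow times concentration (assumed units: m3/s * mg/L = g/s)."""
    if len(flow_m3_s) != len(conc_mg_l):
        raise ValueError("series must be of equal length")
    return [q * c for q, c in zip(flow_m3_s, conc_mg_l)]
```

In the three scenarios described above, either argument of `nitrate_load` could be an observed or a simulated series, and the resulting load series would then be scored against the observed load with `nash_sutcliffe_efficiency`.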

2.
Water Res ; 266: 122401, 2024 Sep 06.
Article in English | MEDLINE | ID: mdl-39265215

ABSTRACT

Given the frequent association between freshwater plankton and water quality degradation, several predictive models have been devised to understand and estimate their dynamics. However, the significance of biotic and abiotic interactions has been overlooked. In this study, we aimed to address the importance of the interaction term in predicting plankton community dynamics by applying graph convolution embedded long short-term memory networks (GC-LSTM) models, which can incorporate interaction terms as graph signals. Temporal graph series comprising plankton genera or environmental drivers as node features and their relationships for edge features from two distinct water bodies, a reservoir and a river, were utilized to develop these models. To assess the predictability, the performances of the GC-LSTM models on community dynamics were compared with those of LSTM and GCN models at various lead times. Moreover, GNNExplainer was used to examine the global and local importance of the nodes and edges for all predictions and specific predictions, respectively. The GC-LSTM models outperformed the LSTM models, consistently showing higher prediction accuracy. Although all the models exhibited performance degradation at longer lead times, the GC-LSTM models consistently demonstrated better performance regarding each graph signal and plankton genus. GNNExplainer yielded interpretable explanations for important genera and interaction pairs among communities, revealing consistent importance patterns across different lead times at both global and local scales. These findings underscore the potential of the proposed modeling approach for forecasting community dynamics and emphasize the critical role of graph signals with interaction terms in plankton communities.

3.
Water Res ; 266: 122404, 2024 Sep 06.
Article in English | MEDLINE | ID: mdl-39276478

ABSTRACT

Groundwater salinization is a prevalent issue in coastal regions, yet accurately predicting and understanding its causal factors remains challenging due to the complexity of the groundwater system. Therefore, this study predicted groundwater salinity in multi-layered aquifers spanning the entire Mekong Delta (MD) region using machine learning (ML) models based on an in situ dataset and using three indicators (Cl-, pH, and HCO3-). We applied nine different decision tree-based models and evaluated their prediction performances. The models were trained using 13 input variables: weather (2), hydrogeological conditions (4), water levels (3), groundwater usage (2), and relative distance from water sources (2). Subsequently, by employing model interpretation techniques, we quantified the significance of factors within the model prediction. Performance evaluations of the ML models demonstrated that the Extra Trees model exhibited superior performance and demonstrated generalization capabilities in predicting Cl- concentration, whereas the Bagging and Random Forest models outperformed the other models in predicting pH and HCO3- concentration. The coefficients of determination were determined to be 0.94, 0.67, and 0.78 for Cl-, pH, and HCO3-, respectively. Additionally, the model interpretation effectively identified significant factors that depended on the target variables and aquifers. In particular, salinity indicators and aquifers that were strongly influenced by the artificial usage of groundwater were identified. Therefore, our research, which provides accurate spatial predictions and interpretations of groundwater salinity in the MD, has the potential to establish a foundation for formulating effective groundwater management policies to control groundwater salinization.

4.
J Hazard Mater ; 478: 135285, 2024 Oct 05.
Article in English | MEDLINE | ID: mdl-39121738

ABSTRACT

The distribution coefficient (Kd) plays a crucial role in predicting the migration behavior of radionuclides in the soil environment. However, Kd depends on the complexities of geological and environmental factors, and existing models often do not reflect the unique soil properties. We propose a multimodal technique to predict Kd values for radionuclide adsorption in soils surrounding nuclear facilities in Republic of Korea. We integrated and trained three sub-networks reflecting different data domains: soil adsorption factors for physicochemical conditions, X-ray fluorescence (XRF) data, and X-ray diffraction (XRD) spectra for inherent soil properties. Our multimodal model achieved high performance, with a coefficient of determination (R2) of 0.84 and root mean squared error (RMSE) of 0.89 for natural log-transformed Kd. This is the first study to develop a multimodal model that simultaneously incorporates inherent soil properties and adsorption factors to predict Kd. We investigated influential peaks in XRD spectra and also revealed that pH and calcium oxide (CaO) were significant variables in soil adsorption factors and XRF data, respectively. These results promote the use of a multimodal model to predict Kd values by integrating data from different domains, providing a cost-effective and novel approach to elucidate the mechanisms of radionuclide adsorption in soil.

5.
Water Res ; 261: 122067, 2024 Sep 01.
Article in English | MEDLINE | ID: mdl-39003877

ABSTRACT

The abatement of micropollutants by ozonation can be accurately calculated by measuring the exposures of molecular ozone (O3) and hydroxyl radical (•OH) (i.e., ∫[O3]dt and ∫[•OH]dt). In the actual ozonation process, ∫[O3]dt values can be calculated by monitoring the O3 decay during the process. However, calculating ∫[•OH]dt is challenging in the field, which necessitates developing models to predict ∫[•OH]dt from measurable parameters. This study demonstrates the development of machine learning models to predict ∫[•OH]dt (the output variable) from five basic input variables (pH, dissolved organic carbon concentration, alkalinity, temperature, and O3 dose) and two optional ones (∫[O3]dt and instantaneous ozone demand, IOD). To develop the models, four different machine learning methods (random forest, support vector regression, artificial neural network, and Gaussian process regression) were employed using the input and output variables measured (or determined) in 130 different natural water samples. The results indicated that incorporating ∫[O3]dt as an input variable significantly improved the accuracy of prediction models, increasing overall R2 by 0.01-0.09, depending on the machine learning method. This suggests that ∫[O3]dt plays a crucial role as a key variable reflecting the •OH-yielding characteristics of dissolved organic matter. Conversely, IOD had a minimal impact on the accuracy of the prediction models. Generally, machine-learning-based prediction models outperformed those based on the response surface methodology developed as a control. Notably, models utilizing the Gaussian process regression algorithm demonstrated the highest coefficients of determination (overall R2 = 0.91-0.95) among the prediction models.
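
The statement that ∫[O3]dt can be calculated by monitoring the O3 decay can be illustrated with a short numerical sketch. It assumes that ozone concentrations are sampled at known times and integrates them with the trapezoidal rule; the function name and the units are hypothetical, and the abstract does not state which integration method the authors used.

```python
from typing import Sequence


def ozone_exposure(times: Sequence[float],
                   concentrations: Sequence[float]) -> float:
    """Approximate the exposure, i.e. the time integral of concentration,
    with the trapezoidal rule.

    times: sampling times in strictly increasing order (e.g. seconds).
    concentrations: O3 concentrations measured at those times (e.g. mol/L).
    The result carries the product of the two units (e.g. mol/L * s).
    """
    if len(times) != len(concentrations) or len(times) < 2:
        raise ValueError("need at least two paired samples")
    exposure = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        # Area of the trapezoid between two consecutive samples.
        exposure += 0.5 * (concentrations[i] + concentrations[i - 1]) * dt
    return exposure
```

A value computed this way could serve as the optional ∫[O3]dt input variable described above, whereas ∫[•OH]dt is the quantity the machine learning models are trained to predict.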


Subject(s)
Hydroxyl Radical , Machine Learning , Ozone , Ozone/chemistry , Hydroxyl Radical/chemistry , Kinetics , Water Purification/methods
6.
Water Res ; 262: 122086, 2024 Sep 15.
Article in English | MEDLINE | ID: mdl-39032338

ABSTRACT

Artificial intelligence has been employed to simulate and optimize the performance of membrane capacitive deionization (MCDI), an emerging ion separation process. However, a real-time control for optimal MCDI operation has not been investigated yet. In this study, we aimed to develop a reinforcement learning (RL)-based control model and investigate the model to find an energy-efficient MCDI operation strategy. To fulfill the objectives, we established three long-short term memory models to predict applied voltage, outflow pH, and outflow electrical conductivity. Also, four RL agents were trained to minimize outflow concentration and energy consumption simultaneously. Consequently, actor-critic (A2C) and proximal policy optimization (PPO2) achieved the ion separation goal (<0.8 mS/cm) as they determined the electrical current and pump speed to be low. Particularly, A2C kept the parameters consistent in charging MCDI, which caused lower energy consumption (0.0128 kWh/m3) than PPO2 (0.0363 kWh/m3). To understand the decision-making process of A2C, the Shapley additive explanation based on the decision tree model estimated the influence of input parameters on the control parameters. The results of this study demonstrate the feasibility of RL-based controls in MCDI operations. Thus, we expect that the RL-based control model can improve further and enhance the efficiency of water treatment technologies.


Subject(s)
Membranes, Artificial , Water Purification/methods , Models, Theoretical , Artificial Intelligence , Electric Conductivity
7.
Water Res ; 262: 122092, 2024 Sep 15.
Article in English | MEDLINE | ID: mdl-39032339

ABSTRACT

Owing to its simplicity of measurement, effluent conductivity is one of the most studied factors in evaluations of desalination performance based on the ion concentrations in various ion adsorption processes such as capacitive deionization (CDI) or battery electrode deionization (BDI). However, this simple conversion from effluent conductivity to ion concentration is often incorrect, thereby necessitating a more congruent method for performing real-time measurements of effluent ion concentrations. In this study, a random forest (RF)-based artificial intelligence (AI) model was developed to address this shortcoming. The proposed RF model showed an excellent prediction accuracy when it was first validated in predicting the effluent conductivity for both CDI (R2 = 0.86) and BDI (R2 = 0.95) data. Moreover, the RF model successfully predicted the concentration of each ion (Na⁺, K⁺, Ca²⁺, and Cl⁻) from the conductivity values. The accuracy of the ion concentration prediction was even higher than that of the effluent conductivity prediction, likely owing to the linear correlation between the input and output variables of the dataset. The effect of the sampling interval was also evaluated for conductivity and ion concentrations, and there was no significant difference up to sampling intervals of <80 s based on the error value of the model. These findings suggest that an RF model can be used to predict ion concentrations in CDI/BDI, which may be used as core indicators in evaluating desalination performance.


Subject(s)
Artificial Intelligence , Electric Conductivity , Electrodes , Ions , Water Purification , Water Purification/methods , Models, Theoretical , Electric Power Supplies
8.
Water Res X ; 23: 100228, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38872710

ABSTRACT

The impacts of climate change on hydrology underscore the urgency of understanding watershed hydrological patterns for sustainable water resource management. The conventional physics-based fully distributed hydrological models are limited due to computational demands, particularly in the case of large-scale watersheds. Deep learning (DL) offers a promising solution for handling large datasets and extracting intricate data relationships. Here, we propose a DL modeling framework, incorporating convolutional neural networks (CNNs) to efficiently replicate physics-based model outputs at high spatial resolution. The goal was to estimate groundwater head and surface water depth in the Sabgyo Stream Watershed, South Korea. The model datasets consisted of input variables, including elevation, land cover, soil type, evapotranspiration, rainfall, and initial hydrological conditions. The initial conditions and target data were obtained from the fully distributed hydrological model HydroGeoSphere (HGS), whereas the other inputs were actual measurements in the field. By optimizing the training sample size, input design, CNN structure, and hyperparameters, we found that CNNs with residual architectures (ResNets) yielded superior performance. The optimal DL model reduces computation time by 45 times compared to the HGS model for monthly hydrological estimations over five years (RMSE 2.35 and 0.29 m for groundwater and surface water, respectively). In addition, our DL framework explored the predictive capabilities of hydrological responses to future climate scenarios. Although the proposed model is cost-effective for hydrological simulations, further enhancements are needed to improve the accuracy of long-term predictions. Ultimately, the proposed DL framework has the potential to facilitate decision-making, particularly in large-scale and complex watersheds.

9.
Water Res ; 260: 121861, 2024 Aug 15.
Article in English | MEDLINE | ID: mdl-38875854

ABSTRACT

The rapid and efficient quantification of Escherichia coli concentrations is crucial for monitoring water quality. Remote sensing techniques and machine learning algorithms have been used to detect E. coli in water and estimate its concentrations. The application of these approaches, however, is challenged by limited sample availability and unbalanced water quality datasets. In this study, we estimated the E. coli concentration in an irrigation pond in Maryland, USA, during the summer season using demosaiced natural color (red, green, and blue: RGB) imagery in the visible and infrared spectral ranges, and a set of 14 water quality parameters. We did this by deploying four machine learning models - Random Forest (RF), Gradient Boosting Machine (GBM), Extreme Gradient Boosting (XGB), and K-nearest Neighbor (KNN) - under three data utilization scenarios: water quality parameters only, combined water quality and small unmanned aircraft system (sUAS)-based RGB data, and RGB data only. To select the training and test datasets, we applied two data-splitting methods: ordinary and quantile data splitting. These methods provided a constant splitting ratio in each decile of the E. coli concentration distribution. Quantile data splitting resulted in better model performance metrics and smaller differences between the metrics for both the training and testing datasets. When trained with quantile data splitting after hyperparameter optimization, models RF, GBM, and XGB had R2 values above 0.847 for the training dataset and above 0.689 for the test dataset. The combination of water quality and RGB imagery data resulted in a higher R2 value (>0.896) for the test dataset. Shapley additive explanations (SHAP) of the relative importance of variables revealed that the visible blue spectrum intensity and water temperature were the most influential parameters in the RF model. Demosaiced RGB imagery served as a useful predictor of E. coli concentration in the studied irrigation pond.
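
The quantile data splitting described in this abstract, a constant splitting ratio within each decile of the target distribution, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the 20% test fraction, and the random seed are hypothetical and not taken from the study.

```python
import random
from typing import List, Sequence, Tuple


def quantile_split(targets: Sequence[float],
                   test_fraction: float = 0.2,
                   n_bins: int = 10,
                   seed: int = 0) -> Tuple[List[int], List[int]]:
    """Return (train_indices, test_indices) with the same test fraction
    drawn from each quantile bin of the target values.

    With n_bins=10 the bins are deciles of the target distribution.
    """
    rng = random.Random(seed)
    # Sample indices ordered by target value, from smallest to largest.
    order = sorted(range(len(targets)), key=lambda i: targets[i])
    train: List[int] = []
    test: List[int] = []
    for b in range(n_bins):
        start = b * len(order) // n_bins
        stop = (b + 1) * len(order) // n_bins
        bin_indices = order[start:stop]
        rng.shuffle(bin_indices)
        n_test = round(len(bin_indices) * test_fraction)
        test.extend(bin_indices[:n_test])
        train.extend(bin_indices[n_test:])
    return sorted(train), sorted(test)
```

Compared with an ordinary random split, this keeps rare high-concentration samples represented in both subsets, which is one plausible reason for the smaller gap between training and testing metrics reported above.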


Subject(s)
Agricultural Irrigation , Escherichia coli , Machine Learning , Ponds , Water Quality , Ponds/microbiology , Water Microbiology , Environmental Monitoring/methods , Maryland
10.
Chemosphere ; 352: 141402, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38346509

ABSTRACT

Urban surface runoff contains chemicals that can negatively affect water quality. Urban runoff studies have determined the transport dynamics of many legacy pollutants. However, less attention has been paid to determining the first-flush effects (FFE) of emerging micropollutants using suspect and non-target screening (SNTS). Therefore, this study employed suspect and non-target analyses using liquid chromatography-high resolution mass spectrometry to detect emerging pollutants in urban receiving waters during stormwater events. Time-interval sampling was used to determine occurrence trends during stormwater events. Suspect screening tentatively identified 65 substances, whose occurrence trends were then grouped using correlation analysis. Non-target peaks were prioritized through hierarchical cluster analysis, focusing on the first flush-concentrated peaks. This approach revealed 38 substances using in silico identification. Simultaneously, substances identified through homologous series observation were evaluated for their observed trends in individual events using network analysis. The results of SNTS were normalized through internal standards to assess the FFE, and most of the tentatively identified substances showed an FFE. Our findings suggested that diverse pollutants that could not be covered by target screening alone entered urban water through stormwater runoff during the first flush. This study showcases the applicability of the SNTS in evaluating the FFE of urban pollutants, offering insights for first-flush stormwater monitoring and management.


Subject(s)
Environmental Pollutants , Water Pollutants, Chemical , Water Pollutants, Chemical/analysis , Rain , Environmental Monitoring/methods , Water Movements , Environmental Pollutants/analysis , Mass Spectrometry
11.
Chemosphere ; 352: 141462, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38364923

ABSTRACT

The migration and retention of radioactive contaminants such as 137Cesium (137Cs) in various environmental media pose significant long-term storage challenges for nuclear waste. The distribution coefficient (Kd) is a critical parameter for assessing the mobility of radioactive contaminants and is influenced by various environmental conditions. This study presents machine-learning models based on the Japan Atomic Energy Agency Sorption Database (JAEA-SDB) to predict the Kd values for Cs in solid phase groups. We used three different machine learning models: random forest (RF), artificial neural network (ANN), and convolutional neural network (CNN). The models were trained on 14 input variables from the JAEA-SDB, including factors such as the Cs concentration, solid-phase properties, and solution conditions, which were preprocessed by normalization and log-transformation. The performances of the models were evaluated using the coefficient of determination (R2) and root mean squared error (RMSE). The RF, ANN, and CNN models achieved R2 values greater than 0.97, 0.86, and 0.88, respectively. We also analyzed the variable importance of RF using an out-of-bag (OOB) and a CNN with an attention module. Our results showed that the environmental media, initial radionuclide concentration, solid phase properties, and solution conditions were significant variables for Kd prediction. Our models accurately predict Kd values for different environmental conditions and can assess the environmental risk by analyzing the behavior of radionuclides in solid phase groups. The results of this study can improve safety analyses and long-term risk assessments related to waste disposal and prevent potential hazards and sources of contamination in the surrounding environment.


Subject(s)
Cesium , Radioactive Waste , Cesium/analysis , Cesium Radioisotopes/analysis , Radioactive Waste/analysis , Japan
12.
J Hazard Mater ; 468: 133762, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38402678

ABSTRACT

Assessing cyanobacteria disinfection in sewage and its compliance with international standards requires determining the concentration and viability of the cells, which can be achieved using an imaging flow cytometry device called FlowCAM. The objective of this study was to thoroughly investigate the sonolytic morphological changes and disinfection performance towards toxic cyanobacteria existing in sewage using the FlowCAM. After optimizing the process conditions, over 80% decline in cyanobacterial cell counts was observed, accompanied by an additional 10-15% of cells exhibiting injuries, as confirmed through morphological investigation. Moreover, for the first time, the experimentally collected data was utilized to build deep-learning probabilistic-neural-networks (PNN) and natural-gradient-boosting (NGBoost) models for predicting disinfection efficiency and ABD area as target outputs. The findings suggest that the NGBoost model exhibited superior prediction performance for both targets, with a high test coefficient of determination (R2 > 0.87) and lower test errors (RMSE < 7.10, MAE < 4.14). The confidence interval examination in NGBoost prediction performance showed a minute variation from the experimentally calculated values, suggesting a high accuracy in model prediction. Finally, SHAP analysis suggests that the sonolytic time alone contributes around 50% to the cyanobacteria disinfection. Overall, the findings demonstrate the effectiveness of the FlowCAM device and the potential of machine-learning modeling in predicting disinfection outcomes.


Subject(s)
Cyanobacteria , Wastewater , Disinfection , Sewage , Machine Learning
13.
J Hazard Mater ; 465: 132995, 2024 Mar 05.
Article in English | MEDLINE | ID: mdl-38039815

ABSTRACT

Photocatalytic reactions with semiconductor-based photocatalysts have been investigated extensively for application to wastewater treatment, especially dye degradation, yet the interactions between different process parameters have rarely been reported due to their complicated reaction mechanisms. Hence, this study aims to discern the impact of each factor, and of each interaction between multiple factors, on the reaction rate constant (k) using a decision tree model. The dyes selected as target pollutants were indigo and malachite green, and 5 different semiconductor-based photocatalysts with 17 different compositions were tested, which generated 34 input features and 1527 data points. The Boruta Shapley Additive exPlanations (SHAP) feature selection for the 34 inputs found that 11 inputs were significantly important. The decision tree model built on these 11 input features achieved an R2 value of 0.94. The SHAP feature importance analysis suggested that photocatalytic experimental conditions, with an importance of 59%, was the most important input category, followed by atomic composition (39%) and physicochemical properties (2%). Additionally, the effects on k of the synergy between the metal cocatalysts and important experimental conditions were confirmed by two-feature SHAP dependence plots, regardless of importance order. This work provides insight into the single and multiple factors that affect reaction rate and mechanism.

14.
Water Res ; 249: 120928, 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38043354

ABSTRACT

Climate warming is linked to earlier onset and extended duration of cyanobacterial blooms in temperate rivers. This causes an unpredictable extent of harm to the functioning of the ecosystem and public health. We used Microcystis spp. cell density data monitored for seven years (2016-2022) in ten sites across four temperate rivers of the Republic of Korea to define the phenology of cyanobacterial blooms and elucidate the climatic effect on their pattern. The day of year marking the onset, peak, and end of Microcystis growth were estimated using a Weibull function, and linear mixed-effect models were employed to analyze their relationships with environmental variables. These models identified river-specific temperatures at the beginning and end dates of cyanobacterial blooms. Furthermore, the most realistic models were employed to project future Microcystis bloom phenology, considering downscaled and quantile-mapped regional air temperatures from a general circulation model. Daily minimum and daily maximum air temperatures (mintemp and maxtemp) primarily drove the timing of the beginning and end of the bloom, respectively. The models successfully captured the spatiotemporal variations of the beginning and end dates, with mintemp and maxtemp predicted to be 24℃ (R2 = 0.68) and 16℃ (R2 = 0.35), respectively. The beginning and end dates were projected to advance considerably in the future under the Representative Concentration Pathway 2.6, 4.5, and 8.5. The simulations suggested that the largest uncertainty lies in the timing of when the bloom ends, whereas the timing of when blooming begins has less variation. Our study highlights the dependency of cyanobacterial bloom phenology on temperatures and earlier and prolonged bloom development.


Subject(s)
Cyanobacteria , Microcystis , Climate Change , Temperature , Rivers , Ecosystem , Lakes/microbiology , Eutrophication
15.
J Hazard Mater ; 465: 133196, 2024 Mar 05.
Article in English | MEDLINE | ID: mdl-38141299

ABSTRACT

Biological early warning systems (BEWS) have been used globally for surface water quality monitoring. Despite its extensive use, BEWS has exhibited limitations, including difficulties in biological interpretation and low alarm reproducibility. This study addressed these issues by applying machine learning (ML) models to eight years of in-situ BEWS data for Daphnia magna. Six ML models were adopted to predict contamination alarms from Daphnia behavioral parameters. The light gradient boosting machine model demonstrated the most significant improvement in predicting alarms from Daphnia behaviors. Compared with the traditional BEWS alarm index, the ML model enhanced the precision and recall by 29.50% and 43.41%, respectively. The speed distribution index and swimming speed were significant parameters for predicting water quality warnings. The nonlinear relationships between the monitored Daphnia behaviors and physicochemical water quality parameters (i.e., flow rate, Chlorophyll-a concentration, water temperature, and conductivity) were identified by ML models for simulating Daphnia behavior based on the water contaminants. These findings suggest that ML models have the potential to establish a robust framework for advancing the predictive capabilities of BEWS, providing a promising avenue for real-time and accurate assessment of water quality. Thereby, it can contribute to more proactive and effective water quality management strategies.


Subject(s)
Water Pollutants, Chemical , Water Quality , Animals , Daphnia magna , Reproducibility of Results , Swimming , Daphnia , Water Pollutants, Chemical/pharmacology
16.
Sci Total Environ ; 912: 169540, 2024 Feb 20.
Article in English | MEDLINE | ID: mdl-38145679

ABSTRACT

Recent advances in remote sensing techniques provide a new horizon for monitoring the spatiotemporal variations of harmful algal blooms (HABs) using hyperspectral data in inland water. In this study, a hierarchical concatenated variational autoencoder (HCVAE) is proposed as an efficient and accurate deep learning (DL) based bio-optical model. To demonstrate its usefulness in retrieving algal pigments, the HCVAE is applied to bloom-prone regions in Daecheong Lake, South Korea. By abstracting the similarity between highly related features using layer-wise clique-based latent-feature extraction, HCVAE reduces the computational loads in deriving outputs while preventing performance degradation. Graph-based clique-detection uses information theory-based criteria to group the related reflectance spectra. Consequently, six latent features were extracted from 79 spectral bands to consist of a multilevel hierarchy of HCVAE that can simultaneously estimate concentrations of chlorophyll-a (Chl-a) and phycocyanin (PC). Despite the parsimonious model architecture, the Chl-a and PC concentrations estimated by HCVAE closely agree with the measured concentrations, with test R2 values of 0.76 and 0.82, respectively. In addition, spatial distribution maps of algal pigments obtained from HCVAE using drone-borne reflectance successfully capture the blooming spots. Based on its multilevel hierarchical architecture, HCVAE can provide the importance of latent features along with their individual wavelengths using Shapley additive explanations. The most important latent features covered the spectral regions associated with both Chl-a and PC. The lightweight neural network DNNsel, which uses only the spectral bands of highest importance in latent-feature extraction, performed comparably to HCVAE. The study results demonstrate the utility of the multilevel hierarchical architecture as a comprehensive assessment model for near-real-time drone-borne sensing of HABs. Moreover, HCVAE is applicable to a wide range of environmental big data, as it can handle numerous sets of features.


Subject(s)
Cyanobacteria , Deep Learning , Unmanned Aerial Devices , Environmental Monitoring/methods , Chlorophyll A , Harmful Algal Bloom , Lakes , Plants
17.
Water Res X ; 21: 100207, 2023 Dec 01.
Article in English | MEDLINE | ID: mdl-38098887

ABSTRACT

Water quality is substantially influenced by a multitude of dynamic and interrelated variables, including climate conditions, land use, and seasonal changes. Deep learning models have demonstrated predictive power for water quality owing to their superior ability to automatically learn complex patterns and relationships from variables. Long short-term memory (LSTM), one of the deep learning models for water quality prediction, is a type of recurrent neural network that can account for longer-term traits of time-dependent data. It is the most widely applied network used to predict the time series of water quality variables. First, we reviewed applications of a standalone LSTM and discussed its calculation time, prediction accuracy, and robustness in comparison with process-driven numerical models and other machine learning methods. This review was expanded into the LSTM model with data pre-processing techniques, including the Complete Ensemble Empirical Mode Decomposition with Adaptive Noise method and Synchrosqueezed Wavelet Transform. The review then focused on the coupling of LSTM with a convolutional neural network, attention network, and transfer learning. The coupled networks demonstrated their performance over the standalone LSTM model. We also emphasized the influence of the static variables in the model and used the transformation method on the dataset. Further challenges were addressed, and the outlook for research and application of LSTM in hydrology concludes the review.

18.
Water Res ; 246: 120662, 2023 Nov 01.
Article in English | MEDLINE | ID: mdl-37804805

ABSTRACT

Early warning systems for harmful cyanobacterial blooms (HCBs) that enable precautionary control measures within water bodies and in water works are largely based on inferential time-series modelling. Among deep learning techniques, convolutional neural networks (CNNs) are widely applied for recognition of pictorial, acoustic and thermal images. Time-frequency images of environmental drivers generated by wavelets may provide crucial signals for modelling of HCBs to be recognized by CNNs. This study applies CNNs for time-series modelling of HCBs of Microcystis sp. in four South Korean rivers between 2016 and 2022 by means of time-frequency images of environmental drivers within the lead time of HCBs. After estimating the cardinal dates of beginning, peak, and ending of HCBs, wavelet analysis identified key drivers by phase analysis and generated time-frequency images of the drivers within the cardinal dates for 3, 4 and 5 years. Performances of CNNs were compared in terms of four determinants of input images: methods of estimating critical timings, the number of segments, time-series continuity, and image size. The resulting CNNs predicted high or low intensities of HCBs with a mean accuracy of 97.79 ± 0.06% and F1-score of 97.49 ± 0.06% for the training dataset, and a mean accuracy of 95.01 ± 0.06% and F1-score of 93.30 ± 0.07% for the testing dataset. Predictions of Microcystis abundances by CNNs achieved a mean MSE of 2.58 ± 2.46 and a mean R² of 0.78 ± 0.20 for the training dataset, and a mean MSE of 2.76 ± 2.42 and a mean R² of 0.55 ± 0.20 for the testing dataset. Precipitation and discharge appeared to be the best performing drivers for qualitative and quantitative predictions of HCBs, pointing to the nonstationary nature of river habitats. This study highlights the opportunities of time-series modelling by CNNs driven by wavelet-generated time-frequency images of key environmental variables for forecasting of HCBs.
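
The abstract reports accuracy and F1-score for the high/low classification and MSE and R² for the abundance predictions. The sketch below shows the usual definitions of these four metrics. It assumes R² means the coefficient of determination (the abstract does not say whether that or a squared correlation was used), and all labels and values are made up for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mse(obs, pred):
    """Mean squared error."""
    return sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)


def r_squared(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Made-up labels (1 = high bloom intensity, 0 = low) and abundances.
labels_true = [1, 0, 1, 1, 0, 0]
labels_pred = [1, 0, 0, 1, 0, 1]
abund_obs = [1.0, 2.0, 3.0, 4.0]
abund_pred = [1.5, 2.0, 2.5, 4.0]

print(accuracy(labels_true, labels_pred), f1_score(labels_true, labels_pred))
print(mse(abund_obs, abund_pred), r_squared(abund_obs, abund_pred))
```

Reporting both pairs is useful because accuracy and F1 judge the qualitative high/low call, while MSE and R² judge how closely the predicted abundances track the observations.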


Subject(s)
Cyanobacteria , Microcystis , Neural Networks, Computer , Rivers , Water
19.
Water Res ; 246: 120710, 2023 Nov 01.
Article in English | MEDLINE | ID: mdl-37857009

ABSTRACT

Several preprocessing procedures are required for the classification of microplastics (MPs) in aquatic systems using spectroscopic analysis. Procedures such as oxidation, which are employed to remove natural organic matter (NOM) from MPs, can be time- and cost-intensive. Furthermore, the identification process is prone to errors due to the subjective judgment of the operators. Therefore, in this study, deep learning (DL) was applied to improve the classification accuracies for mixtures of microplastic and natural organic matter (MP-NOM). A convolutional neural network (CNN)-based DL model with a spatial attention mechanism was adopted to classify substances from their Raman spectra. Subsequently, the classification results were compared with those obtained using conventional Raman spectral library software to evaluate the applicability of the model. Additionally, the crucial spectral band for training the DL model was investigated by applying gradient-weighted class activation mapping (Grad-CAM) as a post-processing technique. The model achieved an accuracy of 99.54%, which is much higher than the 31.44% achieved by the Raman spectral library. The Grad-CAM approach confirmed that the DL model can effectively identify MPs based on their visually prominent peaks in the Raman spectra. Furthermore, by tracking distinctive spectra without relying solely on visually prominent peaks, we can accurately classify MPs with less prominent peaks, which are characterized by a high standard deviation of intensity. These findings demonstrate the potential for automated and objective classification of MPs without the need for NOM preprocessing, indicating a promising direction for future research in microplastic classification.
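
The Grad-CAM step mentioned above can be sketched for a one-dimensional spectrum as follows. Given the feature maps of a convolutional layer and the gradients of the class score with respect to them, each channel is weighted by its position-averaged gradient, the weighted maps are summed, and a ReLU keeps only the positions that support the class. This is a minimal illustration, not the authors' code: the activations and gradients are made-up numbers (in practice they come from the trained CNN via automatic differentiation), and the final scaling to a peak of 1 is a common convention assumed here.

```python
def grad_cam_1d(activations, gradients):
    """Grad-CAM over spectral positions.

    activations, gradients: one list per channel, each indexed by position.
    """
    n_positions = len(activations[0])
    # Channel weight = average gradient of the class score over positions.
    weights = [sum(g) / len(g) for g in gradients]
    # Weighted sum of feature maps, followed by ReLU.
    cam = [
        max(0.0, sum(w * a[j] for w, a in zip(weights, activations)))
        for j in range(n_positions)
    ]
    peak = max(cam)
    # Scale to [0, 1] for display (assumed convention).
    return [v / peak for v in cam] if peak > 0 else cam


# Made-up values: two channels, four spectral positions.
activations = [
    [0.0, 1.0, 4.0, 1.0],
    [2.0, 0.5, 0.0, 0.5],
]
gradients = [
    [0.5, 0.5, 0.5, 0.5],
    [-1.0, -1.0, -1.0, -1.0],
]
cam = grad_cam_1d(activations, gradients)
print(cam)
```

Positions with a high value in `cam` correspond to the spectral bands that most increased the class score, which is how such a map can be compared with the visually prominent Raman peaks.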


Subject(s)
Deep Learning , Microplastics , Plastics , Neural Networks, Computer , Software
20.
Environ Res ; 239(Pt 1): 117217, 2023 Dec 15.
Article in English | MEDLINE | ID: mdl-37775002

ABSTRACT

Marine organic aerosols play crucial roles in global climatic systems. However, their chemical properties and relationships with various potential organic sources still need clarification. This study employed high-resolution mass spectrometry to investigate the identity, origin, and transportation of organic aerosols in pristine Antarctic environments (King Sejong Station; 62.2°S, 58.8°W), where complex ocean-cryosphere-atmosphere interactions occur. First, we classified the aerosol samples into three clusters based on their air mass transport history. Next, we investigated the relationship between organic aerosols and their potential sources, including organic matter dissolved in the open ocean, coastal waters, and runoff waters. Cluster 1 (C1), in which the aerosols mainly originated from the open ocean area (i.e., pelagic zone-influenced), exhibited a higher abundance of lipid-like and protein-like organic aerosols than cluster 3 (C3), with ratios 1.8- and 1.6-times higher, respectively. In contrast, C3, characterized by longer air mass retention over sea ice and land areas (i.e., inshore-influenced), had 2.2- and 3.4-times higher abundances of lignin-like and condensed aromatic structures (CAS)-like organic aerosols, respectively, than C1. Cluster 2 (C2) had intermediate characteristics between C1 and C3 concerning the chemical properties of the aerosols and air mass travel history. Notably, the chemical properties of the aerosols assigned to C1 are closely related to those of phytoplankton-derived organics enriched in the open ocean. In contrast, those of C3 are comparable to those of terrestrial plant-derived organics enriched in coastal and runoff waters. These findings help evaluate the source-dependent properties of organic aerosols in a changing Antarctic environment.


Subject(s)
Atmosphere , Ice Cover , Antarctic Regions , Aerosols , Lignin