Results 1 - 20 of 21
1.
Article in English | MEDLINE | ID: mdl-38781066

ABSTRACT

Numerous real-world decision or control problems involve multiple conflicting objectives whose relative importance (preference) is required to be weighed in different scenarios. While Pareto optimality is desired, environmental uncertainties (e.g., environmental changes or observational noises) may mislead the agent into performing suboptimal policies. In this article, we present a novel multiobjective optimization paradigm, robust multiobjective reinforcement learning (RMORL) considering environmental uncertainties, to train a single model that can approximate robust Pareto-optimal policies across the entire preference space. To enhance policy robustness against environmental changes, an environmental disturbance is modeled as an adversarial agent across the entire preference space via incorporating a zero-sum game into a multiobjective Markov decision process (MOMDP). Additionally, we devise an adversarial defense technique against observational perturbations, which ensures that policy variations, perturbed by adversarial attacks on state observations, remain within bounds under any specified preferences. The proposed technique is assessed in five multiobjective environments with continuous action spaces, showcasing its effectiveness through comparisons with competitive baselines, which encompass classical and state-of-the-art schemes.

2.
Nat Commun ; 14(1): 7434, 2023 Nov 16.
Article in English | MEDLINE | ID: mdl-37973874

ABSTRACT

Inverse Protein Folding (IPF) is an important task of protein design, which aims to design sequences compatible with a given backbone structure. Despite the prosperous development of algorithms for this task, existing methods tend to rely on noisy predicted residues located in the local neighborhood when generating sequences. To address this limitation, we propose an entropy-based residue selection method to remove noise in the input residue context. Additionally, we introduce ProRefiner, a memory-efficient global graph attention model to fully utilize the denoised context. Our proposed method achieves state-of-the-art performance on multiple sequence design benchmarks in different design settings. Furthermore, we demonstrate the applicability of ProRefiner in redesigning Transposon-associated transposase B, where six out of the 20 variants we propose exhibit improved gene editing activity.
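
The entropy-based residue selection described above can be illustrated with a minimal sketch. It is not the ProRefiner implementation: the function names, the toy four-letter alphabet, and the fixed threshold `max_entropy` are all assumptions made for illustration. The idea shown is only that positions whose predicted residue distribution has low Shannon entropy are kept as context, while uncertain positions are dropped.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of one discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_confident_residues(residue_probs, max_entropy):
    """Return indices of positions whose predicted distribution is confident.

    residue_probs: one probability distribution per sequence position.
    max_entropy:   hypothetical cut-off; positions above it are treated as noise.
    """
    return [
        i
        for i, probs in enumerate(residue_probs)
        if shannon_entropy(probs) <= max_entropy
    ]


# Toy predictions over a 4-letter alphabet (a real model would use 20 amino acids).
predictions = [
    [0.97, 0.01, 0.01, 0.01],  # confident prediction
    [0.25, 0.25, 0.25, 0.25],  # maximally uncertain prediction
    [0.70, 0.10, 0.10, 0.10],  # moderately uncertain prediction
]
kept = select_confident_residues(predictions, max_entropy=0.5)
```

With this threshold only the first position survives; in practice the cut-off would have to be tuned on validation data.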


Subject(s)
Algorithms , Proteins , Entropy , Proteins/genetics , Proteins/chemistry , Protein Folding
3.
IEEE Trans Cybern ; PP, 2023 Nov 21.
Article in English | MEDLINE | ID: mdl-37988210

ABSTRACT

Existing multiagent exploration works focus on how to explore in fully cooperative tasks, which is insufficient in environments with nonstationarity induced by agent interactions. To tackle this issue, we propose When to Explore (WToE), a simple yet effective variational exploration method to learn when to explore under nonstationary environments. WToE employs an interaction-oriented adaptive exploration mechanism to adapt to environmental changes. We first propose a novel graphical model that uses a latent random variable to model the step-level environmental change resulting from interaction effects. Leveraging this graphical model, we employ the supervised variational auto-encoder (VAE) framework to derive a short-term inferred policy from historical trajectories to deal with the nonstationarity. Finally, agents engage in exploration when the short-term inferred policy diverges from the current actor policy. The proposed approach theoretically guarantees the convergence of the Q-value function. In our experiments, we validate our exploration mechanism in grid examples, multiagent particle environments and the battle game of MAgent environments. The results demonstrate the superiority of WToE over multiple baselines and existing exploration methods, such as MAEXQ, NoisyNets, EITI, and PR2.
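
The exploration trigger described above (explore when the short-term inferred policy diverges from the current actor policy) can be sketched as follows. This is a hedged illustration, not the authors' algorithm: the abstract does not say how divergence is measured, so the use of KL divergence, the function names, and the fixed `threshold` are assumptions.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete action distributions of equal length.

    eps is a small constant (an assumption) that avoids division by zero.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )


def should_explore(inferred_policy, actor_policy, threshold):
    """Trigger exploration when the two policies disagree by more than threshold."""
    return kl_divergence(inferred_policy, actor_policy) > threshold


# Identical policies: no sign of environmental change, so keep exploiting.
stay = should_explore([0.5, 0.5], [0.5, 0.5], threshold=0.1)
# Strong disagreement: the environment has probably shifted, so explore.
explore = should_explore([0.9, 0.1], [0.1, 0.9], threshold=0.1)
```

Here `inferred_policy` stands in for the output of the VAE-based inference step, which is not modeled in this sketch.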

4.
Article in English | MEDLINE | ID: mdl-37581972

ABSTRACT

Deep reinforcement learning (RL) typically requires a tremendous number of training samples, which is not practical in many applications. State abstraction and world models are two promising approaches for improving sample efficiency in deep RL. However, both state abstraction and world models may degrade the learning performance. In this article, we propose an abstracted model-based policy learning (AMPL) algorithm, which improves the sample efficiency of deep RL. In AMPL, a novel state abstraction method via multistep bisimulation is first developed to learn task-related latent state spaces. Hence, the original Markov decision processes (MDPs) are compressed into abstracted MDPs. Then, a causal transformer model predictor (CTMP) is designed to approximate the abstracted MDPs and generate long-horizon simulated trajectories with a smaller multistep prediction error. Policies are efficiently learned through these trajectories within the abstracted MDPs via a modified multistep soft actor-critic algorithm with a λ-target. Moreover, theoretical analysis shows that the AMPL algorithm can improve sample efficiency during the training process. On Atari games and the DeepMind Control (DMControl) suite, AMPL surpasses current state-of-the-art deep RL algorithms in terms of sample efficiency. Furthermore, experiments on DMControl tasks with moving noise are conducted, and the results demonstrate that AMPL is robust to task-irrelevant observational distractors and significantly outperforms the existing approaches.

5.
Article in English | MEDLINE | ID: mdl-37021882

ABSTRACT

Deep reinforcement learning (DRL) and deep multiagent reinforcement learning (MARL) have achieved significant success across a wide range of domains, including game artificial intelligence (AI), autonomous vehicles, and robotics. However, DRL and deep MARL agents are widely known to be so sample inefficient that millions of interactions are usually needed even for relatively simple problem settings, thus preventing wide application and deployment in real industry scenarios. One bottleneck behind this is the well-known exploration problem, i.e., how to efficiently explore the environment and collect informative experiences that could benefit policy learning toward the optimal ones. This problem becomes more challenging in complex environments with sparse rewards, noisy distractions, long horizons, and nonstationary co-learners. In this article, we conduct a comprehensive survey on existing exploration methods for both single-agent RL and multiagent RL. We start the survey by identifying several key challenges to efficient exploration. Then, we provide a systematic survey of existing approaches by classifying them into two major categories: uncertainty-oriented exploration and intrinsic motivation-oriented exploration. Beyond the above two main branches, we also include other notable exploration methods with different ideas and techniques. In addition to algorithmic analysis, we provide a comprehensive and unified empirical comparison of different exploration methods for DRL on a set of commonly used benchmarks. According to our algorithmic and empirical investigation, we finally summarize the open problems of exploration in DRL and deep MARL and point out a few future directions.

6.
IEEE Trans Neural Netw Learn Syst ; 34(8): 3966-3978, 2023 Aug.
Article in English | MEDLINE | ID: mdl-34723813

ABSTRACT

Agents communicating with each other in a distributed manner and behaving as a group are essential in multi-agent reinforcement learning. However, real-world multi-agent systems suffer from limited communication bandwidth. If the bandwidth is fully occupied, some agents are not able to send messages promptly to others, causing decision delay and impairing cooperative effects. Recent related work has started to address the problem but still fails to maximally reduce the consumption of communication resources. In this article, we propose an event-triggered communication network (ETCNet) to enhance communication efficiency in multi-agent systems by communicating only when necessary. For different task requirements, two paradigms of the ETCNet framework, event-triggered sending network (ETSNet) and event-triggered receiving network (ETRNet), are proposed for learning efficient sending and receiving protocols, respectively. Leveraging information theory, the limited bandwidth is translated to the penalty threshold of an event-triggered strategy, which determines whether an agent at each step participates in communication or not. Then, the design of the event-triggered strategy is formulated as a constrained Markov decision problem, and reinforcement learning finds the feasible and optimal communication protocol that satisfies the limited bandwidth constraint. Experiments on typical multi-agent tasks demonstrate that ETCNet outperforms other methods in reducing bandwidth occupancy while still preserving the cooperative performance of multi-agent systems to the greatest extent.
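
The principle of communicating only when necessary can be shown with a deliberately simplified sketch. In the abstract the trigger is learned through a constrained Markov decision problem; here it is replaced by a hand-written rule (send only when the observation has drifted from the last transmitted one by more than a fixed threshold). The class name, the distance measure, and the threshold value are assumptions and do not come from ETCNet.

```python
class EventTriggeredSender:
    """Sends an observation only when it differs enough from the last one sent."""

    def __init__(self, threshold):
        self.threshold = threshold  # hypothetical trigger level
        self.last_sent = None       # last observation that was transmitted

    @staticmethod
    def _distance(a, b):
        """Largest absolute per-component difference between two observations."""
        return max(abs(x - y) for x, y in zip(a, b))

    def step(self, observation):
        """Return True if a message is sent at this step, otherwise False."""
        if (
            self.last_sent is None
            or self._distance(observation, self.last_sent) > self.threshold
        ):
            self.last_sent = list(observation)
            return True
        return False


sender = EventTriggeredSender(threshold=0.5)
stream = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.1], [1.0, 0.1], [1.1, 0.2]]
decisions = [sender.step(obs) for obs in stream]
messages_sent = sum(decisions)
```

In this toy stream only two of five steps trigger a message, which is the kind of bandwidth saving an event-triggered protocol aims for.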

7.
IEEE Trans Cybern ; 53(10): 6443-6455, 2023 Oct.
Article in English | MEDLINE | ID: mdl-35749334

ABSTRACT

In single-agent Markov decision processes, an agent can optimize its policy based on the interaction with the environment. In multiplayer Markov games (MGs), however, the interaction is nonstationary due to the behaviors of other players, so the agent has no fixed optimization objective. The challenge becomes finding equilibrium policies for all players. In this research, we treat the evolution of player policies as a dynamical process and propose a novel learning scheme for Nash equilibrium. The core is to evolve one's policy according to not just its current in-game performance, but an aggregation of its performance over history. We show that for a variety of MGs, players in our learning scheme will provably converge to a point that is an approximation to Nash equilibrium. Combined with neural networks, we develop an empirical policy optimization algorithm, which is implemented in a reinforcement-learning framework and runs in a distributed way, with each player optimizing its policy based on its own observations. We use two numerical examples to validate the convergence property on small-scale MGs, and a Pong example to show the potential on large games.

8.
Bioinformatics ; 39(1), 2023 01 01.
Article in English | MEDLINE | ID: mdl-36539203

ABSTRACT

MOTIVATION: In recent years, interest has arisen in using machine learning to improve the efficiency of automatic medical consultation and enhance patient experience. In this article, we propose two frameworks to support automatic medical consultation, namely doctor-patient dialogue understanding and task-oriented interaction. We create a new large medical dialogue dataset with multi-level fine-grained annotations and establish five independent tasks, including named entity recognition, dialogue act classification, symptom label inference, medical report generation and diagnosis-oriented dialogue policy. RESULTS: We report a set of benchmark results for each task, which shows the usability of the dataset and sets a baseline for future studies. AVAILABILITY AND IMPLEMENTATION: Both code and data are available from https://github.com/lemuria-wchen/imcs21. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Benchmarking , Machine Learning , Humans , Referral and Consultation
9.
Brief Bioinform ; 23(6), 2022 11 19.
Article in English | MEDLINE | ID: mdl-36151714

ABSTRACT

The three-dimensional genome structure plays a key role in cellular function and gene regulation. Single-cell Hi-C (high-resolution chromosome conformation capture) technology can capture genome structure information at the cell level, which provides the opportunity to study how genome structure varies among different cell types. Recently, a few methods have been designed for single-cell Hi-C clustering. In this manuscript, we perform an in-depth benchmark study of available single-cell Hi-C data clustering methods to implement an evaluation system for multiple clustering frameworks based on both human and mouse datasets. We compare eight methods in terms of visualization and clustering performance. Performance is evaluated using four benchmark metrics: adjusted Rand index, normalized mutual information, homogeneity and Fowlkes-Mallows index. Furthermore, we also evaluate the eight methods for the task of separating cells at different stages of the cell cycle based on single-cell Hi-C data.
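
Two of the four metrics named above, the adjusted Rand index and the Fowlkes-Mallows index, can be computed from pair counts alone. The sketch below is a standard-library illustration of their textbook definitions, not the benchmark's own evaluation code; the function names and toy labels are assumptions, and it assumes at least two cells.

```python
from collections import Counter
from math import comb, sqrt


def pair_counts(true_labels, pred_labels):
    """Count pairs of cells grouped together under each labeling.

    Returns (pairs together in both, pairs together in truth,
             pairs together in prediction, total number of pairs).
    """
    contingency = Counter(zip(true_labels, pred_labels))
    both = sum(comb(c, 2) for c in contingency.values())
    in_true = sum(comb(c, 2) for c in Counter(true_labels).values())
    in_pred = sum(comb(c, 2) for c in Counter(pred_labels).values())
    return both, in_true, in_pred, comb(len(true_labels), 2)


def adjusted_rand_index(true_labels, pred_labels):
    """Rand index corrected for chance; 1.0 means identical partitions."""
    both, in_true, in_pred, total = pair_counts(true_labels, pred_labels)
    expected = in_true * in_pred / total
    max_index = (in_true + in_pred) / 2
    if max_index == expected:  # degenerate case, e.g. a single cluster in both
        return 1.0
    return (both - expected) / (max_index - expected)


def fowlkes_mallows_index(true_labels, pred_labels):
    """Geometric mean of pairwise precision and recall."""
    both, in_true, in_pred, _ = pair_counts(true_labels, pred_labels)
    if in_true == 0 or in_pred == 0:
        return 0.0
    return both / sqrt(in_true * in_pred)


cell_types = [0, 0, 1, 1]      # toy ground-truth cell types
same_partition = [1, 1, 0, 0]  # same grouping with cluster IDs swapped
crossed = [0, 1, 0, 1]         # grouping that splits every true cluster

ari_same = adjusted_rand_index(cell_types, same_partition)
ari_crossed = adjusted_rand_index(cell_types, crossed)
fmi_same = fowlkes_mallows_index(cell_types, same_partition)
fmi_crossed = fowlkes_mallows_index(cell_types, crossed)
```

Both metrics ignore the cluster IDs themselves, so the relabeled partition still scores 1.0, while the crossed partition scores below chance on the adjusted Rand index.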


Subject(s)
Chromatin , Chromosomes , Humans , Mice , Animals , Cluster Analysis , Genome , Molecular Conformation
10.
Article in English | MEDLINE | ID: mdl-37015360

ABSTRACT

Domain generalization aims to learn knowledge invariant across different distributions while semantically meaningful for downstream tasks from multiple source domains, to improve the model's generalization ability on unseen target domains. The fundamental objective is to understand the underlying "invariance" behind these observational distributions and such invariance has been shown to have a close connection to causality. While many existing approaches make use of the property that causal features are invariant across domains, we consider the invariance of the average causal effect of the features to the labels. This invariance regularizes our training approach in which interventions are performed on features to enforce stability of the causal prediction by the classifier across domains. Our work thus sheds some light on the domain generalization problem by introducing invariance of the mechanisms into the learning process. Experiments on several benchmark datasets demonstrate the performance of the proposed method against SOTAs.

11.
Brief Bioinform ; 22(5), 2021 09 02.
Article in English | MEDLINE | ID: mdl-33517357

ABSTRACT

Accurately identifying potential drug-target interactions (DTIs) is a key step in drug discovery. Although many related experimental studies have been carried out for identifying DTIs in the past few decades, biological experiment-based DTI identification is still time-consuming and expensive. Therefore, it is of great significance to develop effective computational methods for identifying DTIs. In this paper, we develop a novel end-to-end learning-based framework based on heterogeneous graph convolutional networks for DTI prediction called end-to-end graph (EEG)-DTI. Given a heterogeneous network containing multiple types of biological entities (i.e. drug, protein, disease, side-effect), EEG-DTI learns the low-dimensional feature representation of drugs and targets using a graph convolutional network-based model and predicts DTIs based on the learned features. During the training process, EEG-DTI learns the feature representation of nodes in an end-to-end mode. The evaluation test shows that EEG-DTI performs better than existing state-of-the-art methods. The data and source code are available at: https://github.com/MedicineBiology-AI/EEG-DTI.


Subject(s)
Computer Simulation , Drug Development , Drug Discovery , Machine Learning , Pharmaceutical Preparations/chemistry , Software , Drug-Related Side Effects and Adverse Reactions , Humans , Proteins/chemistry , Proteins/metabolism
12.
IEEE Trans Cybern ; 51(3): 1666-1677, 2021 Mar.
Article in English | MEDLINE | ID: mdl-31425137

ABSTRACT

Most existing robust principal component analysis (PCA) and 2-D PCA (2DPCA) methods involving the l2-norm can mitigate the sensitivity to outliers in the domains of image analysis and pattern recognition. However, existing approaches neither preserve the structural information of data in the optimization objective nor have the robustness of generalized performance. To address the above problems, we propose two novel center-weight-based models, namely, centered PCA (C-PCA) and generalized centered 2DPCA with l2,p-norm minimization (GC-2DPCA), which are developed for vector- and matrix-based data, respectively. The C-PCA can preserve the structural information of data by measuring the similarity between the data points and can also retain the PCA's original desirable properties such as the rotational invariance. Furthermore, GC-2DPCA can learn efficient and robust projection matrices to suppress outliers by utilizing the variations between each row of the image matrix and employing power p of l2,1-norm. We also propose an efficient algorithm to solve the C-PCA model and an iterative optimization algorithm to solve the GC-2DPCA model, and we theoretically analyze their convergence properties. Experiments on three public databases show that our models yield significant improvements over the state-of-the-art PCA and 2DPCA approaches.

13.
Brief Bioinform ; 22(2): 2096-2105, 2021 03 22.
Article in English | MEDLINE | ID: mdl-32249297

ABSTRACT

MOTIVATION: The emergence of abundant biological networks, which benefit from the development of advanced high-throughput techniques, contributes to describing and modeling complex internal interactions among biological entities such as genes and proteins. Multiple networks provide rich information for inferring the function of genes or proteins. To extract functional patterns of genes based on multiple heterogeneous networks, network embedding-based methods, aiming to capture non-linear and low-dimensional feature representation based on network biology, have recently achieved remarkable performance in gene function prediction. However, existing methods do not consider the shared information among different networks during the feature learning process. RESULTS: Taking the correlation among the networks into account, we design a novel semi-supervised autoencoder method to integrate multiple networks and generate a low-dimensional feature representation. Then we utilize a convolutional neural network based on the integrated feature embedding to annotate unlabeled gene functions. We test our method on both yeast and human datasets and compare with three state-of-the-art methods. The results demonstrate the superior performance of our method. We not only provide a comprehensive analysis of the performance of the newly proposed algorithm but also provide a tool for extracting features of genes based on multiple networks, which can be used in the downstream machine learning task. AVAILABILITY: DeepMNE-CNN is freely available at https://github.com/xuehansheng/DeepMNE-CNN. CONTACT: jiajiepeng@nwpu.edu.cn; shang@nwpu.edu.cn; jianye.hao@tju.edu.cn.


Subject(s)
Deep Learning , Neural Networks, Computer , Algorithms , Gene Regulatory Networks , Genes, Fungal , Humans , Molecular Sequence Annotation , Yeasts/genetics
14.
Nucleic Acids Res ; 49(D1): D1413-D1419, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33010177

ABSTRACT

SC2disease (http://easybioai.com/sc2disease/) is a manually curated database that aims to provide a comprehensive and accurate resource of gene expression profiles in various cell types for different diseases. With the development of single-cell RNA sequencing (scRNA-seq) technologies, uncovering cellular heterogeneity of different tissues for different diseases has become feasible by profiling transcriptomes across cell types at the cellular level. In particular, comparing gene expression profiles between different cell types and identifying cell-type-specific genes in various diseases offers new possibilities to address biological and medical questions. However, systematic, hierarchical and vast databases of gene expression profiles in human diseases at the cellular level are lacking. Thus, we reviewed the literature prior to March 2020 for studies which used scRNA-seq to study diseases with human samples, and developed the SC2disease database to summarize all the data by different diseases, tissues and cell types. SC2disease documents 946 481 entries, corresponding to 341 cell types, 29 tissues and 25 diseases. Each entry in the SC2disease database contains comparisons of differentially expressed genes between different cell types, tissues and disease-related health status. Furthermore, we reanalyzed the gene expression matrices with a unified pipeline to improve the comparability between different studies. For each disease, we also compare cell-type-specific genes with the corresponding genes of lead single nucleotide polymorphisms (SNPs) identified in genome-wide association studies (GWAS) to implicate cell type specificity of the traits.


Subject(s)
Autism Spectrum Disorder/genetics , Autoimmune Diseases/genetics , Cardiovascular Diseases/genetics , Databases, Factual , Gastrointestinal Diseases/genetics , Neoplasms/genetics , Neurodegenerative Diseases/genetics , Virus Diseases/genetics , Algorithms , Autism Spectrum Disorder/metabolism , Autism Spectrum Disorder/pathology , Autoimmune Diseases/metabolism , Autoimmune Diseases/pathology , Cardiovascular Diseases/metabolism , Cardiovascular Diseases/pathology , Gastrointestinal Diseases/metabolism , Gastrointestinal Diseases/pathology , Gene Expression Profiling , Genetic Heterogeneity , Genome-Wide Association Study , High-Throughput Nucleotide Sequencing , Humans , Internet , Neoplasms/metabolism , Neoplasms/pathology , Neurodegenerative Diseases/metabolism , Neurodegenerative Diseases/pathology , Organ Specificity , Polymorphism, Single Nucleotide , Single-Cell Analysis/methods , Software , Transcriptome , Virus Diseases/metabolism , Virus Diseases/pathology
15.
Anal Bioanal Chem ; 412(1): 81-91, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31953713

ABSTRACT

Methods for detecting mycotoxins are very important because of the great health hazards of mycotoxins. However, there is a high background and low signal-to-noise ratio in real-time sensing, and therefore it is difficult to meet the fast, accurate, and convenient requirements for control of food quality. Here we constructed a quantitative fluorescence image analysis based on multicolor upconversion nanocrystal (UCN)-encoded microspheres for detection of ochratoxin A and zearalenone. The background-free encoding image signal of UCN-doped microspheres was captured by fluorescence microscopy under near-infrared excitation, whereas the detection image signal of phycoerythrin-labeled secondary antibodies conjugated to the microspheres was captured under blue light excitation. We custom-wrote an algorithm to analyze the two images for the same sample in 10 s, and only the gray value in the red channel of the secondary probe confirmed the quantity. The results showed that this novel detection platform performed feasible and reliable fluorescence image measurements by this method. Additionally, the limit of detection was 0.34721 ng/mL for ochratoxin A and 0.41162 ng/mL for zearalenone. We envision that this UCN encoding strategy will be usefully applied for fast, accurate, and convenient testing of multiple food contaminants to ensure the safety of the food.


Subject(s)
Microspheres , Ochratoxins/analysis , Zearalenone/analysis , Food Contamination/analysis , Immunoassay/methods , Limit of Detection , Nanoparticles/chemistry , Signal-To-Noise Ratio
16.
BMC Bioinformatics ; 20(Suppl 18): 575, 2019 Nov 25.
Article in English | MEDLINE | ID: mdl-31760945

ABSTRACT

BACKGROUND: Influenza is an infectious respiratory disease that can cause a serious public health hazard. Due to its huge threat to society, precise real-time forecasting of influenza outbreaks is of great value to the public. RESULTS: In this paper, we propose a new deep neural network structure that forecasts a real-time influenza-like illness rate (ILI%) in Guangzhou, China. Long short-term memory (LSTM) neural networks are applied to improve forecasting accuracy, given the long-term attributes and diversity of influenza epidemic data. We devise a multi-channel LSTM neural network that can draw information from multiple types of inputs. We also add an attention mechanism to improve forecasting accuracy. By using this structure, we are able to deal with relationships between multiple inputs more appropriately. Our model fully considers the information in the data set and specifically targets the practical problems of influenza epidemic forecasting in Guangzhou. CONCLUSION: We assess the performance of our model by comparing it with different neural network structures and other state-of-the-art methods. The experimental results indicate that our model has strong competitiveness and can provide effective real-time influenza epidemic forecasting.


Subject(s)
Forecasting/methods , Influenza, Human/epidemiology , Neural Networks, Computer , China/epidemiology , Disease Outbreaks , Humans , Public Health/statistics & numerical data
17.
BMC Bioinformatics ; 20(Suppl 18): 571, 2019 Nov 25.
Article in English | MEDLINE | ID: mdl-31760946

ABSTRACT

BACKGROUND: Collective cell migration is a significant and complex phenomenon that affects many basic biological processes. The coordination between leader cell and follower cell affects the rate of collective cell migration. However, there are still very few papers on the impacts of the stimulus signal released by the leader on the follower. Tracking cell movement using 3D time-lapse microscopy images provides an unprecedented opportunity to systematically study and analyze collective cell migration. RESULTS: Recently, deep reinforcement learning algorithms have become very popular. In our paper, we also use this method to train the number of cells and control signals. By experimenting with single-follower cell and multi-follower cells, it is concluded that the number of stimulation signals is proportional to the rate of collective movement of the cells. Such research provides a more diverse approach to studying biological problems. CONCLUSION: Traditional research methods are always based on real-life scenarios, but as the number of cells grows exponentially, the research process is too time-consuming. Agent-based modeling is a robust framework that approximates cells as isotropic, elastic, and sticky objects. In this paper, an agent-based modeling framework is used to establish a simulation platform for simulating collective cell migration. The goal of the platform is to build a biomimetic environment to demonstrate the importance of stimuli between the leading and following cells.


Subject(s)
Cell Movement , Cells/cytology , Time-Lapse Imaging/methods , Algorithms , Animals , Computer Simulation , Humans
18.
Bioinformatics ; 35(21): 4364-4371, 2019 11 01.
Article in English | MEDLINE | ID: mdl-30977780

ABSTRACT

MOTIVATION: A microRNA (miRNA) is a type of non-coding RNA, which plays important roles in many biological processes. Many studies have shown that miRNAs are implicated in human diseases, indicating that miRNAs might be potential biomarkers for various types of diseases. Therefore, it is important to reveal the relationships between miRNAs and diseases/phenotypes. RESULTS: We propose a novel learning-based framework, MDA-CNN, for miRNA-disease association identification. The model first captures interaction features between diseases and miRNAs based on a three-layer network including disease similarity network, miRNA similarity network and protein-protein interaction network. Then, it employs an auto-encoder to identify the essential feature combination for each pair of miRNA and disease automatically. Finally, taking the reduced feature representation as input, it uses a convolutional neural network to predict the final label. The evaluation results show that the proposed framework outperforms some state-of-the-art approaches by a large margin on both tasks of miRNA-disease association prediction and miRNA-phenotype association prediction. AVAILABILITY AND IMPLEMENTATION: The source code and data are available at https://github.com/Issingjessica/MDA-CNN. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Neural Networks, Computer , Algorithms , Humans , MicroRNAs , Software
19.
BMC Bioinformatics ; 19(Suppl 5): 119, 2018 04 11.
Article in English | MEDLINE | ID: mdl-29671391

ABSTRACT

BACKGROUND: The cooperation of cells in biological systems is similar to that of agents in cooperative multi-agent systems. Research findings in multi-agent systems literature can provide valuable inspirations to biological research. The well-coordinated states in cell systems can be viewed as desirable social norms in cooperative multi-agent systems. One important research question is how a norm can rapidly emerge with limited communication resources. RESULTS: In this work, we propose a learning approach which can trade off the agents' performance of coordinating on a consistent norm and the communication cost involved. During the learning process, the agents can dynamically adjust their coordination set according to their own observations and pick out the most crucial agents to coordinate with. In this way, our method significantly reduces the coordination dependence among agents. CONCLUSION: The experiment results show that our method can efficiently facilitate the social norm emergence among agents, and also scale well to large-scale populations.


Subject(s)
Cell Communication , Cells/metabolism , Algorithms , Humans , Models, Biological
20.
Entropy (Basel) ; 20(4), 2018 Mar 29.
Article in English | MEDLINE | ID: mdl-33265327

ABSTRACT

Networks will continue to become increasingly heterogeneous as we move toward 5G. Meanwhile, the intelligent programming of the core network makes the available radio resources changeable rather than static. In such a dynamic and heterogeneous network environment, helping terminal users select optimal networks to access is challenging. Prior implementations of network selection are usually applicable to environments with static radio resources, but they cannot handle the unpredictable dynamics in 5G network environments. To this end, this paper considers both the fluctuation of radio resources and the variation of user demand. We model the access network selection scenario as a multiagent coordination problem, in which a group of rational terminal users compete to maximize their benefits with incomplete information about the environment (no prior knowledge of network resources and other users' choices). Then, an adaptive learning based strategy is proposed, which enables users to adaptively adjust their selections in response to the gradually or abruptly changing environment. The system is experimentally shown to converge to Nash equilibrium, which also turns out to be both Pareto optimal and socially optimal. Extensive simulation results show that our approach achieves significantly better performance than two learning and non-learning based approaches in terms of load balancing, user payoff and the overall bandwidth utilization efficiency. In addition, the system shows good robustness in the presence of non-compliant terminal users.
