Search | VHL Regional Portal

1.

MultiFair: Model Fairness With Multiple Sensitive Attributes.

Tian, Huan; Liu, Bo; Zhu, Tianqing; Zhou, Wanlei; Yu, Philip S.

IEEE Trans Neural Netw Learn Syst ; PP, 2024 Apr 22.

Article in English | MEDLINE | ID: mdl-38648122

ABSTRACT

While existing fairness interventions show promise in mitigating biased predictions, most studies concentrate on single-attribute protections. Although a few methods consider multiple attributes, they either require additional constraints or prediction heads, incurring high computational overhead or jeopardizing the stability of the training process. More critically, they consider per-attribute protection approaches, raising concerns about fairness gerrymandering where certain attribute combinations remain unfair. This work aims to construct a neutral domain containing fused information across all subgroups and attributes. It delivers fair predictions as the fused input contains neutralized information for all considered attributes. Specifically, we adopt mixup operations to generate samples with fused information. However, our experiments reveal that directly adopting the operations leads to degraded prediction results. The excessive mixup operations result in unrecognizable training data. To this end, we design three distinct mixup schemes that balance information fusion across attributes while retaining distinct visual features critical for training valid models. Extensive experiments with multiple datasets and up to eight sensitive attributes demonstrate that the proposed MultiFair method can deliver fairness protections for multiple attributes while maintaining valid prediction results.
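The mixup operation that the abstract builds on can be illustrated with a minimal sketch. This shows only the generic mixup step (a convex combination of two samples and their labels), not the three MultiFair schemes, which the abstract does not specify. The function name `mixup`, the `alpha` default, and the list-based inputs are assumptions made for illustration.

```python
import random

def mixup(x_a, x_b, y_a, y_b, alpha=0.4, rng=random):
    """Generic mixup: blend two samples and their labels with one weight.

    Hypothetical helper for illustration; not the MultiFair schemes.
    """
    # Mixing weight in [0, 1], drawn from a symmetric Beta distribution.
    lam = rng.betavariate(alpha, alpha)
    # Convex combination of the two feature vectors.
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    # The same weight is applied to the (one-hot) labels.
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam
```

A weight near 0.5 blends both samples heavily, which is consistent with the abstract's remark that excessive mixing can leave training data unrecognizable.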

2.

Balancing Learning Model Privacy, Fairness, and Accuracy With Early Stopping Criteria.

Zhang, Tao; Zhu, Tianqing; Gao, Kun; Zhou, Wanlei; Yu, Philip S.

IEEE Trans Neural Netw Learn Syst ; 34(9): 5557-5569, 2023 Sep.

Article in English | MEDLINE | ID: mdl-34878980

ABSTRACT

As deep learning models mature, one of the most prescient questions we face is: what is the ideal tradeoff between accuracy, fairness, and privacy (AFP)? Unfortunately, both the privacy and the fairness of a model come at the cost of its accuracy. Hence, an efficient and effective means of fine-tuning the balance between this trinity of needs is critical. Motivated by some curious observations in privacy-accuracy tradeoffs with differentially private stochastic gradient descent (DP-SGD), where fair models sometimes result, we conjecture that fairness might be better managed as an indirect byproduct of this process. Hence, we conduct a series of analyses, both theoretical and empirical, on the impacts of implementing DP-SGD in deep neural network models through gradient clipping and noise addition. The results show that, in deep learning, the number of training epochs is central to striking a balance between AFP because DP-SGD makes the training less stable, providing the possibility of model updates at a low discrimination level without much loss in accuracy. Based on this observation, we designed two different early stopping criteria to help analysts choose the optimal epoch at which to stop training a model so as to achieve their ideal tradeoff. Extensive experiments show that our methods can achieve an ideal balance between AFP.
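The two DP-SGD ingredients named in the abstract, gradient clipping and noise addition, can be sketched as follows. This is a minimal illustration of the standard DP-SGD gradient step under assumed names (`clip`, `dp_sgd_gradient`, `max_norm`, `noise_multiplier`); it does not reproduce the paper's early stopping criteria, which the abstract does not describe.

```python
import math
import random

def clip(grad, max_norm):
    """Scale a gradient so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]

def dp_sgd_gradient(per_example_grads, max_norm=1.0,
                    noise_multiplier=1.0, rng=random):
    """Clip each per-example gradient, sum, add Gaussian noise, average."""
    clipped = [clip(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    # Noise standard deviation is proportional to the clipping bound.
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(per_example_grads) for v in noisy]
```

Because fresh noise is drawn at every step, the model reached after each epoch varies more than in ordinary SGD, which is the instability the abstract links to the choice of stopping epoch.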

3.

Model-Based Self-Advising for Multi-Agent Learning.

Ye, Dayong; Zhu, Tianqing; Zhu, Congcong; Zhou, Wanlei; Yu, Philip S.

IEEE Trans Neural Netw Learn Syst ; 34(10): 7934-7945, 2023 Oct.

Article in English | MEDLINE | ID: mdl-35157599

ABSTRACT

In multiagent learning, one of the main ways to improve learning performance is to ask for advice from another agent. Contemporary advising methods share a common limitation that a teacher agent can only advise a student agent if the teacher has experience with an identical state. However, in highly complex learning scenarios, such as autonomous driving, it is rare for two agents to experience exactly the same state, which makes the advice less of a learning aid and more of a one-time instruction. In these scenarios, with contemporary methods, agents do not really help each other learn, and the main outcome of their back-and-forth requests for advice is an exorbitant communication overhead. In human interactions, teachers are often asked for advice on what to do in situations that students are personally unfamiliar with. In these, we generally draw from similar experiences to formulate advice. This inspired us to provide agents with the same ability when asked for advice on an unfamiliar state. Hence, we propose a model-based self-advising method that allows agents to train a model based on states similar to the state in question to inform its response. As a result, the advice given can not only be used to resolve the current dilemma but also many other similar situations that the student may come across in the future via self-advising. Compared with contemporary methods, our method brings a significant improvement in learning performance with much lower communication overheads.

4.

Differential Advising in Multiagent Reinforcement Learning.

Ye, Dayong; Zhu, Tianqing; Cheng, Zishuo; Zhou, Wanlei; Yu, Philip S.

IEEE Trans Cybern ; 52(6): 5508-5521, 2022 Jun.

Article in English | MEDLINE | ID: mdl-33232260

ABSTRACT

Agent advising is one of the main approaches to improve agent learning performance by enabling agents to share advice. Existing advising methods have a common limitation that an adviser agent can offer advice to an advisee agent only if the advice is created in the same state as the advisee's state. However, in complex environments, it is a very strong requirement that two states are the same, because a state may consist of multiple dimensions and two states being the same means that all these dimensions in the two states are correspondingly identical. Therefore, this requirement may limit the applicability of existing advising methods to complex environments. In this article, inspired by the differential privacy scheme, we propose a differential advising method that relaxes this requirement by enabling agents to use advice in a state even if the advice is created in a slightly different state. Compared with the existing methods, agents using the proposed method have more opportunity to take advice from others. This article is the first to adopt the concept of differential privacy on advising to improve agent learning performance instead of addressing security issues. The experimental results demonstrate that the proposed method is more efficient in complex environments than the existing methods.

Subject(s)

Learning , Reinforcement, Psychology

5.

Differentially Private Malicious Agent Avoidance in Multiagent Advising Learning.

Ye, Dayong; Zhu, Tianqing; Zhou, Wanlei; Yu, Philip S.

IEEE Trans Cybern ; 50(10): 4214-4227, 2020 Oct.

Article in English | MEDLINE | ID: mdl-30990207

ABSTRACT

Agent advising is one of the key approaches to improve agent learning performance by enabling agents to ask for advice between each other. Existing agent advising approaches have two limitations. The first limitation is that all the agents in a system are assumed to be friendly and cooperative. However, in the real world, malicious agents may exist and provide false advice to hinder the learning performance of other agents. The second limitation is that the analysis of communication overhead in these approaches is either overlooked or simplified. However, in communication-constrained environments, communication overhead has to be carefully considered. To overcome the two limitations, this paper proposes a novel differentially private agent advising approach. Our approach employs the Laplace mechanism to add noise on the rewards used by student agents to select teacher agents. By using the differential privacy technique, the proposed approach can reduce the impact of malicious agents without identifying them. Also, by adopting the privacy budget concept, the proposed approach can naturally control communication overhead. The experimental results demonstrate the effectiveness of the proposed approach.
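The Laplace mechanism mentioned in the abstract can be sketched as below: noise with scale sensitivity/epsilon is added to each reward before a student picks a teacher. The function names, the argmax selection rule, and the `sensitivity` and `epsilon` parameters are assumptions for illustration; the abstract does not state how the paper performs the selection.

```python
import random

def laplace_noise(scale, rng=random):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_rewards(rewards, sensitivity, epsilon, rng=random):
    """Laplace mechanism: perturb each reward with scale sensitivity/epsilon."""
    scale = sensitivity / epsilon
    return [r + laplace_noise(scale, rng) for r in rewards]

def select_teacher(rewards, sensitivity, epsilon, rng=random):
    """Pick the teacher with the highest noisy reward (assumed rule)."""
    noisy = noisy_rewards(rewards, sensitivity, epsilon, rng)
    return max(range(len(noisy)), key=noisy.__getitem__)
```

A smaller `epsilon` means larger noise, so no single teacher (including a malicious one with inflated rewards) is chosen deterministically.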

6.

Null Model and Community Structure in Multiplex Networks.

Zhai, Xuemeng; Zhou, Wanlei; Fei, Gaolei; Liu, Weiyi; Xu, Zhoujun; Jiao, Chengbo; Lu, Cai; Hu, Guangmin.

Sci Rep ; 8(1): 3245, 2018 02 19.

Article in English | MEDLINE | ID: mdl-29459696

ABSTRACT

The multiple relationships among objects in complex systems can be described well by multiplex networks, which contain rich information of the connections between objects. The null model of networks, which can be used to quantify the specific nature of a network, is a powerful tool for analysing the structural characteristics of complex systems. However, the null model for multiplex networks remains largely unexplored. In this paper, we propose a null model for multiplex networks based on the node redundancy degree, which is a natural measure for describing the multiple relationships in multiplex networks. Based on this model, we define the modularity of multiplex networks to study the community structures in multiplex networks and demonstrate our theory in practice through community detection in four real-world networks. The results show that our model can reveal the community structures in multiplex networks and indicate that our null model is a useful approach for providing new insights into the specific nature of multiplex networks, which are difficult to quantify.

7.

Breast cancer prognosis risk estimation using integrated gene expression and clinical data.

Saini, Ashish; Hou, Jingyu; Zhou, Wanlei.

Biomed Res Int ; 2014: 459203, 2014.

Article in English | MEDLINE | ID: mdl-24949450

ABSTRACT

BACKGROUND: Novel prognostic markers are needed so newly diagnosed breast cancer patients do not undergo any unnecessary therapy. Various microarray gene expression datasets based studies have generated gene signatures to predict the prognosis outcomes, while ignoring the large amount of information contained in established clinical markers. Nevertheless, small sample sizes in individual microarray datasets remain a bottleneck in generating robust gene signatures that show limited predictive power. The aim of this study is to achieve high classification accuracy for the good prognosis group and then achieve high classification accuracy for the poor prognosis group. METHODS: We propose a novel algorithm called the IPRE (integrated prognosis risk estimation) algorithm. We used integrated microarray datasets from multiple studies to increase the sample sizes (~2,700 samples). The IPRE algorithm consists of a virtual chromosome for the extraction of the prognostic gene signature that has 79 genes, and a multivariate logistic regression model that incorporates clinical data along with expression data to generate the risk score formula that accurately categorizes breast cancer patients into two prognosis groups. RESULTS: The evaluation on two testing datasets showed that the IPRE algorithm achieved high classification accuracies of 82% and 87%, which was far greater than any existing algorithms.

Subject(s)

Breast Neoplasms/genetics , Microarray Analysis , Prognosis , Algorithms , Biomarkers, Tumor/genetics , Breast Neoplasms/diagnosis , Breast Neoplasms/pathology , Databases, Genetic , Female , Gene Expression Regulation, Neoplastic , Humans , Logistic Models , Risk Factors

8.

RRHGE: a novel approach to classify the estrogen receptor based breast cancer subtypes.

Saini, Ashish; Hou, Jingyu; Zhou, Wanlei.

ScientificWorldJournal ; 2014: 362141, 2014.

Article in English | MEDLINE | ID: mdl-24563630

ABSTRACT

BACKGROUND: Breast cancer is the most common type of cancer among females with a high mortality rate. It is essential to classify the estrogen receptor based breast cancer subtypes into correct subclasses, so that the right treatments can be applied to lower the mortality rate. Using gene signatures derived from gene interaction networks to classify breast cancers has proven to be more reproducible and can achieve higher classification performance. However, the interactions in the gene interaction network usually contain many false-positive interactions that do not have any biological meanings. Therefore, it is a challenge to incorporate the reliability assessment of interactions when deriving gene signatures from gene interaction networks. How to effectively extract gene signatures from available resources is critical to the success of cancer classification. METHODS: We propose a novel method to measure and extract the reliable (biologically true or valid) interactions from gene interaction networks and incorporate the extracted reliable gene interactions into our proposed RRHGE algorithm to identify significant gene signatures from microarray gene expression data for classifying ER+ and ER- breast cancer samples. RESULTS: The evaluation on real breast cancer samples showed that our RRHGE algorithm achieved higher classification accuracy than the existing approaches.

Subject(s)

Algorithms , Breast Neoplasms , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Receptors, Estrogen , Adult , Aged , Breast Neoplasms/classification , Breast Neoplasms/genetics , Breast Neoplasms/metabolism , Datasets as Topic , Female , Humans , Middle Aged , Oligonucleotide Array Sequence Analysis , Receptors, Estrogen/biosynthesis , Receptors, Estrogen/genetics

9.

Lazy collaborative filtering for data sets with missing values.

Ren, Yongli; Li, Gang; Zhang, Jun; Zhou, Wanlei.

IEEE Trans Cybern ; 43(6): 1822-34, 2013 Dec.

Article in English | MEDLINE | ID: mdl-23757575

ABSTRACT

As one of the biggest challenges in research on recommender systems, the data sparsity issue is mainly caused by the fact that users tend to rate a small proportion of items from the huge number of available items. This issue becomes even more problematic for the neighborhood-based collaborative filtering (CF) methods, as there are even lower numbers of ratings available in the neighborhood of the query item. In this paper, we aim to address the data sparsity issue in the context of neighborhood-based CF. For a given query (user, item), a set of key ratings is first identified by taking the historical information of both the user and the item into account. Then, an auto-adaptive imputation (AutAI) method is proposed to impute the missing values in the set of key ratings. We present a theoretical analysis to show that the proposed imputation method effectively improves the performance of the conventional neighborhood-based CF methods. The experimental results show that our new method of CF with AutAI outperforms six existing recommendation methods in terms of accuracy.

Subject(s)

Algorithms , Artificial Intelligence , Data Mining/methods , Databases, Factual , Information Storage and Retrieval/methods , Pattern Recognition, Automated/methods , Signal Processing, Computer-Assisted , Sample Size

10.

An effective non-parametric method for globally clustering genes from expression profiles.

Hou, Jingyu; Shi, Wei; Li, Gang; Zhou, Wanlei.

Med Biol Eng Comput ; 45(12): 1175-85, 2007 Dec.

Article in English | MEDLINE | ID: mdl-17943335

ABSTRACT

Clustering is widely used in bioinformatics to find gene correlation patterns. Although many algorithms have been proposed, these are usually confronted with difficulties in meeting the requirements of both automation and high quality. In this paper, we propose a novel algorithm for clustering genes from their expression profiles. The unique features of the proposed algorithm are twofold: it takes into consideration global, rather than local, gene correlation information in clustering processes; and it incorporates clustering quality measurement into the clustering processes to implement non-parametric, automatic and global optimal gene clustering. The evaluation on simulated and real gene data sets demonstrates the effectiveness of the algorithm.

Subject(s)

Algorithms , Gene Expression Profiling , Oligonucleotide Array Sequence Analysis , Animals , Cluster Analysis , Computational Biology , Humans

11.

Identifying cis-regulatory elements by statistical analysis and phylogenetic footprinting and analyzing their coexistence and related gene ontology.

Shi, Wei; Zhou, Wanlei; Xu, Dakang.

Physiol Genomics ; 31(3): 374-84, 2007 Nov 14.

Article in English | MEDLINE | ID: mdl-17848606

ABSTRACT

Discovery of cis-regulatory elements in gene promoters is a highly challenging research issue in computational molecular biology. This paper presents a novel approach to searching putative cis-regulatory elements in human promoters by first finding 8-mer sequences of high statistical significance from gene promoters of humans, mice, and Drosophila melanogaster, respectively, and then identifying the most conserved ones across the three species (phylogenetic footprinting). In this study, a conservation analysis on both closely related species (humans and mice) and distantly related species (humans/mice and Drosophila) is conducted not only to examine more candidates but also to improve the prediction accuracy. We have found 124 putative cis-regulatory elements and grouped these into 20 clusters. The investigation on the coexistence of these clusters in human gene promoters reveals that SP1, EGR, and NRF-1 are the dominant clusters appearing in the combinatorial combination of up to five clusters. Gene Ontology (GO) analysis also shows that many GO categories of transcription factors binding to these cis-regulatory elements match the GO categories of genes whose promoters contain these elements. Compared with previous research, the contribution of this study lies not only in the finding of new cis-regulatory elements, but also in its pioneering exploration on the coexistence of discovered elements and the GO relationship between transcription factors and regulated genes. This exploration verifies the putative cis-regulatory elements that have been found from this study and also gives new insight on the regulation mechanisms of gene expression.

Subject(s)

Phylogeny , Regulatory Sequences, Nucleic Acid , Animals , Drosophila , Humans , Mice , Promoter Regions, Genetic , Transcription Factors/metabolism

12.

Frequency distribution of TATA Box and extension sequences on human promoters.

Shi, Wei; Zhou, Wanlei.

BMC Bioinformatics ; 7 Suppl 4: S2, 2006 Dec 12.

Article in English | MEDLINE | ID: mdl-17217512

ABSTRACT

BACKGROUND: TATA box is one of the most important transcription factor binding sites. But the exact sequences of TATA box are still not very clear. RESULTS: In this study, we conduct a dedicated analysis on the frequency distribution of TATA Box and its extension sequences on human promoters. Sixteen TATA elements derived from the TATA Box motif, TATAWAWN, are classified into three distribution patterns: peak, bottom-peak, and bottom. Fourteen TATA extension sequences are predicted to be the new TATA Box elements due to their high motif factors, which indicate their statistical significance. Statistical analysis on the promoters of mice, zebrafish and drosophila melanogaster verifies seven of these elements. It is also observed that the distribution of TATA elements on the promoters of housekeeping genes are very similar with their distribution on the promoters of tissue specific genes in human. CONCLUSION: The dedicated statistical analysis on TATA box and its extension sequences yields new TATA elements. The statistical significance of these elements has been verified on random data sets by calculating their p values.

Subject(s)

Promoter Regions, Genetic/genetics , Sequence Analysis, DNA/methods , TATA Box/genetics , Transcription Factors/genetics , Base Sequence , Binding Sites , Data Interpretation, Statistical , Gene Frequency/genetics , Humans , Molecular Sequence Data , Protein Binding , Sequence Alignment/methods , Statistical Distributions
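The count of sixteen TATA elements follows from the motif TATAWAWN when it is read with the standard IUPAC nucleotide codes (W = A or T, N = any base): 2 × 2 × 4 = 16 concrete 8-mers. A minimal sketch of that expansion is given below; the function names and the simple occurrence count are assumptions for illustration and are not the statistical analysis used in the paper.

```python
from itertools import product

# Subset of the IUPAC nucleotide codes needed for this motif.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "AT", "N": "ACGT"}

def expand_motif(motif):
    """List every concrete sequence matched by an IUPAC motif."""
    return ["".join(p) for p in product(*(IUPAC[c] for c in motif))]

def count_occurrences(promoter, elements):
    """Count (possibly overlapping) occurrences of each element."""
    counts = {e: 0 for e in elements}
    for e in elements:
        for i in range(len(promoter) - len(e) + 1):
            if promoter[i:i + len(e)] == e:
                counts[e] += 1
    return counts
```

For example, `expand_motif("TATAWAWN")` yields 16 sequences, including `TATAAAAA` and `TATATATG`.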
