Results 1 - 20 of 25
1.
IEEE Trans Neural Netw Learn Syst ; 34(11): 9562-9567, 2023 Nov.
Article in English | MEDLINE | ID: mdl-35333722

ABSTRACT

The ResNet and its variants have achieved remarkable successes in various computer vision tasks. Despite their success in letting gradients flow through building blocks, the information carried by the intermediate layers within blocks is not explicitly communicated. To address this issue, in this brief, we propose to introduce a regulator module as a memory mechanism to extract complementary features of the intermediate layers, which are further fed to the ResNet. In particular, the regulator module is composed of convolutional recurrent neural networks (RNNs) [e.g., convolutional long short-term memories (LSTMs) or convolutional gated recurrent units (GRUs)], which are known to be good at extracting spatio-temporal information. We name the new regulated network the regulated residual network (RegNet). The regulator module can be easily implemented and appended to any ResNet architecture. Experimental results on three image classification datasets demonstrate the promising performance of the proposed architecture compared with the standard ResNet, the squeeze-and-excitation ResNet, and other state-of-the-art architectures.
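A minimal PyTorch sketch of the idea described above, assuming a ConvGRU-style regulator whose hidden state is carried across residual blocks and fused back into each block's input by a 1x1 convolution; the module names, channel sizes, and fusion scheme are illustrative choices of ours, not the authors' implementation.

```python
# Illustrative sketch only: a ConvGRU "regulator" whose hidden state carries
# information across residual blocks and is fused back into each block's input.
import torch
import torch.nn as nn


class ConvGRUCell(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.gates = nn.Conv2d(2 * channels, 2 * channels, 3, padding=1)  # update/reset gates
        self.cand = nn.Conv2d(2 * channels, channels, 3, padding=1)       # candidate state

    def forward(self, x, h):
        z, r = torch.chunk(torch.sigmoid(self.gates(torch.cat([x, h], 1))), 2, dim=1)
        h_tilde = torch.tanh(self.cand(torch.cat([x, r * h], 1)))
        return (1 - z) * h + z * h_tilde


class ResBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1))

    def forward(self, x):
        return torch.relu(x + self.body(x))


class RegulatedResNet(nn.Module):
    """Toy RegNet-style stack: the regulator state h is updated after every
    block and injected (1x1 conv after concatenation) into the next block."""

    def __init__(self, channels=32, num_blocks=4):
        super().__init__()
        self.blocks = nn.ModuleList([ResBlock(channels) for _ in range(num_blocks)])
        self.fuse = nn.ModuleList([nn.Conv2d(2 * channels, channels, 1) for _ in range(num_blocks)])
        self.regulator = ConvGRUCell(channels)

    def forward(self, x):
        h = torch.zeros_like(x)
        for block, fuse in zip(self.blocks, self.fuse):
            x = block(fuse(torch.cat([x, h], 1)))  # inject regulator memory
            h = self.regulator(x, h)               # update memory from block output
        return x


if __name__ == "__main__":
    net = RegulatedResNet()
    print(net(torch.randn(2, 32, 16, 16)).shape)  # torch.Size([2, 32, 16, 16])
```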

2.
IEEE Trans Pattern Anal Mach Intell ; 45(3): 3363-3377, 2023 03.
Article in English | MEDLINE | ID: mdl-35687622

ABSTRACT

Food plays a significant role in human daily life. In this paper, we are interested in learning structural representations for lengthy recipes that can benefit the recipe generation and food cross-modal retrieval tasks. Different from common vision-language data, here the food images contain mixed ingredients and the target recipes are lengthy paragraphs, for which we have no annotations on structure information. To address these limitations, we propose a novel method to learn, in an unsupervised manner, sentence-level tree structures for cooking recipes. Our approach brings together several novel ideas in a systematic framework: (1) exploiting an unsupervised learning approach to obtain sentence-level tree structure labels before training; (2) generating trees of target recipes from images under the supervision of the tree structure labels learned in (1); and (3) integrating the learned tree structures into the recipe generation and food cross-modal retrieval procedures. The proposed model produces good-quality sentence-level tree structures and coherent recipes, and achieves state-of-the-art recipe generation and food cross-modal retrieval performance on the benchmark Recipe1M dataset.


Subject(s)
Algorithms , Cooking , Humans , Language
3.
IEEE Trans Image Process ; 31: 5150-5162, 2022.
Article in English | MEDLINE | ID: mdl-35901005

ABSTRACT

Video captioning aims to interpret complex visual content as text descriptions, which requires the model to fully understand video scenes, including objects and their interactions. Prevailing methods adopt off-the-shelf object detection networks to produce object proposals and use the attention mechanism to model the relations between objects. They often miss semantic concepts that fall outside the pretrained detector's vocabulary and fail to identify the exact predicate relationships between objects. In this paper, we investigate the open research task of generating text descriptions for given videos, and propose Cross-Modal Graph (CMG) with meta concepts for video captioning. Specifically, to cover the useful semantic concepts in video captions, we weakly learn the corresponding visual regions for text descriptions, where the associated visual regions and textual words are named cross-modal meta concepts. We further build meta concept graphs dynamically with the learned cross-modal meta concepts. We also construct holistic video-level and local frame-level video graphs with the predicted predicates to model video sequence structures. We validate the efficacy of the proposed techniques with extensive experiments and achieve state-of-the-art results on two public datasets.

4.
IEEE Trans Cybern ; 52(2): 1233-1246, 2022 Feb.
Article in English | MEDLINE | ID: mdl-32559172

ABSTRACT

Deep learning methods are becoming the de facto standard for generic visual recognition in the literature. However, their adaptation to industrial scenarios, such as visual recognition for machines, production lines, etc., which consist of countless components, has not yet been well investigated. Compared with generic object detection, these scenarios contain strong structural knowledge (e.g., fixed relative positions of components, component relationships, etc.). A case worth exploring is automated visual inspection for trains, where there are various correlated components. However, the dominant object detection paradigm is limited by treating the visual features of each object region separately, without considering common-sense knowledge among objects. In this article, we propose a novel automated visual inspection framework for trains that exploits structural knowledge for train component detection, called SKTCD. SKTCD is an end-to-end trainable framework in which the visual features of train components and structural knowledge (including hierarchical scene contexts and spatial-aware component relationships) are jointly exploited for train component detection. We propose novel residual multiple gated recurrent units (Res-MGRUs) that can optimally fuse the visual features of train components and messages from the structural knowledge in a weighted-recurrent way. To verify the feasibility of SKTCD, a dataset containing high-resolution images captured from moving trains has been collected, in which 18,590 critical train components are manually annotated. Extensive experiments on this dataset and on the PASCAL VOC dataset have demonstrated that SKTCD significantly outperforms existing challenging baselines. The dataset as well as the source code can be downloaded online (https://github.com/smartprobe/SKCD).
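As a rough, simplified stand-in for the Res-MGRU fusion described above (the exact formulation is in the paper), the sketch below uses a standard GRU cell to fuse a component's visual feature with a structural-knowledge message and adds a residual path; the dimensions and the choice of which input plays the hidden-state role are our assumptions.

```python
# Simplified stand-in for the Res-MGRU idea (not the paper's exact formulation):
# a GRU cell fuses a component's visual feature with a "message" vector derived
# from structural context, and a residual path preserves the visual feature.
import torch
import torch.nn as nn


class ResidualGRUFusion(nn.Module):
    def __init__(self, feat_dim):
        super().__init__()
        self.cell = nn.GRUCell(input_size=feat_dim, hidden_size=feat_dim)

    def forward(self, visual_feat, context_message):
        # Treat the structural-knowledge message as the input and the visual
        # feature as the hidden state, then add a residual connection.
        fused = self.cell(context_message, visual_feat)
        return visual_feat + fused


if __name__ == "__main__":
    fusion = ResidualGRUFusion(feat_dim=256)
    v = torch.randn(8, 256)    # 8 component region features
    m = torch.randn(8, 256)    # messages aggregated from related components
    print(fusion(v, m).shape)  # torch.Size([8, 256])
```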


Subject(s)
Software
5.
IEEE Trans Pattern Anal Mach Intell ; 44(6): 2872-2893, 2022 06.
Article in English | MEDLINE | ID: mdl-33497329

ABSTRACT

Person re-identification (Re-ID) aims at retrieving a person of interest across multiple non-overlapping cameras. With the advancement of deep neural networks and the increasing demand for intelligent video surveillance, it has gained significantly increased interest in the computer vision community. By dissecting the components involved in developing a person Re-ID system, we categorize it into closed-world and open-world settings. The widely studied closed-world setting is usually applied under various research-oriented assumptions and has achieved inspiring success using deep learning techniques on a number of datasets. We first conduct a comprehensive overview with in-depth analysis of closed-world person Re-ID from three different perspectives: deep feature representation learning, deep metric learning, and ranking optimization. With performance saturating under the closed-world setting, the research focus for person Re-ID has recently shifted to the open-world setting, which faces more challenging issues. This setting is closer to practical applications under specific scenarios. We summarize open-world Re-ID in terms of five different aspects. By analyzing the advantages of existing methods, we design a powerful AGW baseline, achieving state-of-the-art or at least comparable performance on twelve datasets for four different Re-ID tasks. Meanwhile, we introduce a new evaluation metric (mINP) for person Re-ID, indicating the cost of finding all the correct matches, which provides an additional criterion to evaluate a Re-ID system for real applications. Finally, some important yet under-investigated open issues are discussed.
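A hedged numpy sketch of the mINP idea mentioned above, under the common reading that, for each query, the inverse negative penalty is the number of correct matches divided by the rank of the hardest (last-ranked) correct match, averaged over queries; consult the paper for the exact definition and edge cases.

```python
# Hedged sketch of mINP: for each query, locate the rank of the hardest (last)
# correct match; INP = (#correct matches) / (rank of hardest match), and mINP
# averages INP over queries. Edge-case handling here is our own choice.
import numpy as np


def mean_inp(ranked_match_flags):
    """ranked_match_flags: list of 1-D 0/1 arrays, one per query, where entry k
    is 1 if the gallery item ranked (k+1)-th is a correct match."""
    inps = []
    for flags in ranked_match_flags:
        positives = np.flatnonzero(flags)
        if positives.size == 0:           # no correct match in the gallery
            continue
        hardest_rank = positives[-1] + 1  # 1-based rank of the last correct match
        inps.append(positives.size / hardest_rank)
    return float(np.mean(inps))


if __name__ == "__main__":
    # Query 1: matches at ranks 1 and 3 -> INP = 2/3; query 2: match at rank 1 -> INP = 1.
    print(mean_inp([np.array([1, 0, 1, 0]), np.array([1, 0, 0, 0])]))  # ~0.833
```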


Subject(s)
Biometric Identification , Deep Learning , Algorithms , Biometric Identification/methods , Humans , Image Processing, Computer-Assisted/methods , Neural Networks, Computer
6.
IEEE Trans Pattern Anal Mach Intell ; 44(12): 8896-8909, 2022 12.
Article in English | MEDLINE | ID: mdl-34762585

ABSTRACT

In recent years, Siamese network-based trackers have significantly advanced the state-of-the-art in real-time tracking. Despite their success, Siamese trackers tend to suffer from high memory costs, which restrict their applicability to mobile devices with tight memory budgets. To address this issue, we propose a distilled Siamese tracking framework to learn small, fast and accurate trackers (students), which capture critical knowledge from large Siamese trackers (teachers) via a teacher-students knowledge distillation model. This model is intuitively inspired by the one-teacher-versus-multiple-students learning method typically employed in schools. In particular, our model contains a single teacher-student distillation module and a student-student knowledge sharing mechanism. The former is designed using a tracking-specific distillation strategy to transfer knowledge from a teacher to students. The latter is utilized for mutual learning between students to enable in-depth knowledge understanding. Extensive empirical evaluations on several popular Siamese trackers demonstrate the generality and effectiveness of our framework. Moreover, the results on five tracking benchmarks show that the proposed distilled trackers achieve compression rates of up to 18× and frame rates of 265 FPS, while obtaining tracking accuracy comparable to their base models.
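An illustrative loss sketch only, assuming generic MSE terms: a teacher-to-student term pulls each student's response map toward the frozen teacher's, and a student-to-student term implements mutual learning; the paper's tracking-specific distillation strategy is more elaborate than this.

```python
# Illustrative loss sketch (the paper's tracking-specific distillation is more
# elaborate): a teacher->student term and a pairwise student<->student term.
import torch
import torch.nn.functional as F


def distillation_losses(teacher_resp, student_resps, mutual_weight=0.5):
    """teacher_resp: (N, H, W) response map from the large frozen tracker.
    student_resps: list of (N, H, W) response maps from the small trackers."""
    teacher_resp = teacher_resp.detach()            # the teacher is not updated
    kd = sum(F.mse_loss(s, teacher_resp) for s in student_resps)
    mutual = 0.0
    for i, si in enumerate(student_resps):          # pairwise mutual mimicry
        for j, sj in enumerate(student_resps):
            if i != j:
                mutual = mutual + F.mse_loss(si, sj.detach())
    return kd + mutual_weight * mutual


if __name__ == "__main__":
    t = torch.randn(4, 17, 17)
    s1 = torch.randn(4, 17, 17, requires_grad=True)
    s2 = torch.randn(4, 17, 17, requires_grad=True)
    print(distillation_losses(t, [s1, s2]).item())
```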


Subject(s)
Algorithms , Learning , Humans
7.
IEEE Trans Image Process ; 31: 379-391, 2022.
Article in English | MEDLINE | ID: mdl-34874857

ABSTRACT

Existing person re-identification (Re-ID) methods usually rely heavily on large-scale, thoroughly annotated training data. However, label noise is unavoidable due to inaccurate person detection results or annotation errors in real scenes. It is extremely challenging to learn a robust Re-ID model with label noise since each identity has very limited annotated training samples. To avoid fitting to noisy labels, we propose to learn a prefatory model using a large learning rate at the early stage with a self-label refining strategy, in which the labels and network are jointly optimized. To further enhance the robustness, we introduce an online co-refining (CORE) framework with dynamic mutual learning, where networks and label predictions are optimized online and collaboratively by distilling knowledge from peer networks. Moreover, it also reduces the negative impact of noisy labels using a favorable selective consistency strategy. CORE has two primary advantages: it is robust to different noise types and unknown noise ratios, and it can be easily trained without much additional effort on architecture design. Extensive experiments on Re-ID and image classification demonstrate that CORE outperforms its counterparts by a large margin under both practical and simulated noise settings. Notably, it also improves the state-of-the-art unsupervised Re-ID performance under standard settings. Code is available at https://github.com/mangye16/ReID-Label-Noise.
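A much-simplified sketch of the co-refining idea, assuming each peer network is trained toward a convex mixture of the original (possibly noisy) label and the other network's softened prediction; the full CORE framework additionally includes selective consistency and other components not shown here.

```python
# Much-simplified co-refining sketch (the full CORE framework adds selective
# consistency and more): each peer network is supervised by a mixture of the
# possibly noisy one-hot label and the *other* network's softened prediction.
import torch
import torch.nn.functional as F


def co_refined_targets(noisy_labels, peer_logits, num_classes, alpha=0.6):
    """noisy_labels: (N,) int64 labels; peer_logits: (N, C) logits from the peer.
    alpha weights the original label against the peer's belief."""
    one_hot = F.one_hot(noisy_labels, num_classes).float()
    peer_prob = F.softmax(peer_logits.detach(), dim=1)  # no backprop into the peer here
    return alpha * one_hot + (1 - alpha) * peer_prob


def refined_ce_loss(logits, targets):
    return -(targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()


if __name__ == "__main__":
    labels = torch.tensor([0, 2, 1])
    logits_a, logits_b = torch.randn(3, 4), torch.randn(3, 4)
    # Network A is supervised by labels refined with B's predictions, and vice versa.
    loss_a = refined_ce_loss(logits_a, co_refined_targets(labels, logits_b, 4))
    loss_b = refined_ce_loss(logits_b, co_refined_targets(labels, logits_a, 4))
    print(loss_a.item(), loss_b.item())
```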


Subject(s)
Algorithms , Humans
8.
Article in English | MEDLINE | ID: mdl-34101583

ABSTRACT

Despite the success of stochastic variance-reduced gradient (SVRG) algorithms in solving large-scale problems, their stochastic gradient complexity often scales linearly with data size and is expensive for huge data. Accordingly, we propose a hybrid stochastic-deterministic minibatch proximal gradient (HSDMPG) algorithm for strongly convex problems with linear prediction structure, e.g., least squares and logistic/softmax regression. HSDMPG enjoys improved computational complexity that is data-size-independent for large-scale problems. It iteratively samples an evolving minibatch of individual losses to estimate the original problem, and efficiently minimizes the sampled smaller-sized subproblems. For a strongly convex loss of n components, HSDMPG attains an ϵ-optimization-error within [Formula: see text] stochastic gradient evaluations, where κ is the condition number, ζ = 1 for quadratic loss and ζ = 2 for generic loss. For large-scale problems, our complexity outperforms those of SVRG-type algorithms with or without dependence on data size. In particular, when ϵ = O(1/√n), which matches the intrinsic excess error of a learning model and is sufficient for generalization, our complexity for quadratic and generic losses is respectively O(n^0.5 log^2(n)) and O(n^0.5 log^3(n)), which for the first time achieves optimal generalization in less than a single pass over the data. Besides, we extend HSDMPG to online strongly convex problems and prove its higher efficiency over prior algorithms. Numerical results demonstrate the computational advantages of HSDMPG.
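A schematic illustration of the sample-then-solve pattern described above (draw an evolving minibatch, then approximately minimize the sampled subproblem from a warm start), written for a ridge-regularized least-squares loss; the step sizes, batch schedule, and inner solver are toy choices, and this is not the exact HSDMPG procedure.

```python
# Schematic only (not the exact HSDMPG algorithm): each outer round draws an
# evolving minibatch, builds the sampled least-squares subproblem, and roughly
# minimizes it with a few gradient steps, warm-starting from the current iterate.
import numpy as np


def hybrid_minibatch_solver(A, b, lam=1e-3, outer_rounds=8, inner_steps=20, lr=0.1, seed=0):
    n, d = A.shape
    rng = np.random.default_rng(seed)
    w = np.zeros(d)
    batch = max(16, n // 2 ** outer_rounds)
    for _ in range(outer_rounds):
        idx = rng.choice(n, size=min(batch, n), replace=False)  # evolving minibatch
        Ab, bb = A[idx], b[idx]
        for _ in range(inner_steps):                            # solve the sampled subproblem
            grad = Ab.T @ (Ab @ w - bb) / len(idx) + lam * w
            w -= lr * grad
        batch *= 2                                              # grow the sample
    return w


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((2000, 20))
    w_true = rng.standard_normal(20)
    b = A @ w_true + 0.01 * rng.standard_normal(2000)
    print(np.linalg.norm(hybrid_minibatch_solver(A, b) - w_true))
```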

9.
IEEE Trans Pattern Anal Mach Intell ; 43(7): 2413-2428, 2021 07.
Article in English | MEDLINE | ID: mdl-31940522

ABSTRACT

This paper conducts a systematic study of the role of visual attention in video object pattern understanding. We elaborately annotate three popular video segmentation datasets (DAVIS 16, Youtube-Objects, and SegTrack V2) with dynamic eye-tracking data in the unsupervised video object segmentation (UVOS) setting. For the first time, we quantitatively verify the high consistency of visual attention behavior among human observers, and find a strong correlation between human attention and explicit primary object judgments during dynamic, task-driven viewing. These novel observations provide in-depth insight into the underlying rationale behind video object patterns. Inspired by these findings, we decouple UVOS into two sub-tasks: UVOS-driven Dynamic Visual Attention Prediction (DVAP) in the spatiotemporal domain, and Attention-Guided Object Segmentation (AGOS) in the spatial domain. Our UVOS solution enjoys three major advantages: 1) modular training without using expensive video segmentation annotations; instead, we use more affordable dynamic fixation data to train the initial video attention module and existing fixation-segmentation paired static/image data to train the subsequent segmentation module; 2) comprehensive foreground understanding through multi-source learning; and 3) additional interpretability from the biologically inspired and assessable attention. Experiments on four popular benchmarks show that, even without using expensive video object mask annotations, our model achieves compelling performance compared with state-of-the-art methods and enjoys fast processing speed (10 fps on a single GPU). Our collected eye-tracking data and algorithm implementations have been made publicly available at https://github.com/wenguanwang/AGS.

10.
IEEE Trans Pattern Anal Mach Intell ; 43(10): 3365-3387, 2021 Oct.
Article in English | MEDLINE | ID: mdl-32217470

ABSTRACT

Image Super-Resolution (SR) is an important class of image processing techniques to enhance the resolution of images and videos in computer vision. Recent years have witnessed remarkable progress in image super-resolution using deep learning techniques. This article aims to provide a comprehensive survey of recent advances in image super-resolution using deep learning approaches. In general, we can roughly group existing studies of SR techniques into three major categories: supervised SR, unsupervised SR, and domain-specific SR. In addition, we also cover other important issues, such as publicly available benchmark datasets and performance evaluation metrics. Finally, we conclude this survey by highlighting several future directions and open issues that should be further addressed by the community.

11.
Neural Netw ; 127: 182-192, 2020 Jul.
Article in English | MEDLINE | ID: mdl-32361548

ABSTRACT

The accuracy of deep learning (e.g., convolutional neural networks) on an image classification task critically relies on the amount of labeled training data. When an image classification task on a new domain lacks labeled data but has access to cheaply available unlabeled data, unsupervised domain adaptation is a promising technique to boost performance without incurring extra labeling cost, by assuming that images from different domains share some invariant characteristics. In this paper, we propose a new unsupervised domain adaptation method named Domain-Adversarial Residual-Transfer (DART) learning of deep neural networks to tackle cross-domain image classification tasks. In contrast to existing unsupervised domain adaptation approaches, the proposed DART not only learns domain-invariant features via adversarial training, but also achieves robust domain-adaptive classification via a residual-transfer strategy, all in an end-to-end training framework. We evaluate the performance of the proposed method on cross-domain image classification tasks using several well-known benchmark data sets, on which our method clearly outperforms the state-of-the-art approaches.
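The domain-adversarial ingredient of such methods is commonly realized with a gradient reversal layer feeding a domain classifier; the minimal PyTorch sketch below shows only that ingredient (DART's residual-transfer classifier head and full training loop are not reproduced, and the layer sizes are arbitrary).

```python
# Minimal sketch of the domain-adversarial ingredient: a gradient reversal layer
# followed by a domain classifier, which pushes features to be domain-invariant.
import torch
import torch.nn as nn


class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None   # flip the gradient sign


class DomainDiscriminator(nn.Module):
    def __init__(self, feat_dim=256, lam=1.0):
        super().__init__()
        self.lam = lam
        self.net = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, 2))

    def forward(self, features):
        reversed_feat = GradReverse.apply(features, self.lam)
        return self.net(reversed_feat)        # predicts source vs. target domain


if __name__ == "__main__":
    disc = DomainDiscriminator()
    feats = torch.randn(16, 256, requires_grad=True)
    loss = nn.functional.cross_entropy(disc(feats), torch.randint(0, 2, (16,)))
    loss.backward()                           # feature gradients are sign-flipped
    print(feats.grad.shape)
```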


Subject(s)
Neural Networks, Computer , Pattern Recognition, Automated/methods , Unsupervised Machine Learning , Deep Learning/trends , Humans , Pattern Recognition, Automated/trends , Unsupervised Machine Learning/trends
12.
IEEE Trans Neural Netw Learn Syst ; 31(11): 4933-4945, 2020 Nov.
Article in English | MEDLINE | ID: mdl-31940565

ABSTRACT

The overestimation caused by function approximation is a well-known property of Q-learning algorithms, especially in single-critic models, and leads to poor performance in practical tasks. However, the opposite property, underestimation, which often occurs in Q-learning methods with double critics, has been largely left untouched. In this article, we investigate the underestimation phenomenon in the recent twin delayed deep deterministic actor-critic algorithm and theoretically demonstrate its existence. We also observe that this underestimation bias does indeed hurt performance in various experiments. Considering the opposite properties of single-critic and double-critic methods, we propose a novel triplet-average deep deterministic policy gradient algorithm that takes the weighted action value of three target critics to reduce the estimation bias. Given the connection between estimation bias and approximation error, we suggest averaging previous target values to reduce the per-update error and further improve performance. Extensive empirical results over various continuous control tasks in OpenAI Gym show that our approach outperforms the state-of-the-art methods.
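A small numeric sketch of the target-value computation as described above: a weighted average of three target critics forms the bootstrapped target, which is then smoothed with the previous target value; the weights, smoothing factor, and discount below are illustrative, not the paper's settings.

```python
# Schematic of the target computation described above (weights and the exact
# averaging schedule are illustrative, not the paper's settings): a weighted
# average of three target critics, smoothed with the previous target value.
import numpy as np


def triplet_average_target(reward, q1, q2, q3, prev_target,
                           gamma=0.99, w=(0.4, 0.3, 0.3), beta=0.5):
    """q1..q3: next-state action values from the three target critics."""
    weighted_q = w[0] * q1 + w[1] * q2 + w[2] * q3   # weighted triple estimate
    target = reward + gamma * weighted_q
    return beta * target + (1 - beta) * prev_target  # average with previous target


if __name__ == "__main__":
    print(triplet_average_target(reward=1.0, q1=10.2, q2=9.5, q3=9.9, prev_target=10.0))
```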

13.
IEEE Trans Cybern ; 50(5): 1833-1843, 2020 May.
Article in English | MEDLINE | ID: mdl-30629527

ABSTRACT

Learning graphs from data automatically has shown encouraging performance on clustering and semisupervised learning tasks. However, real data are often corrupted, which may cause the learned graph to be inexact or unreliable. In this paper, we propose a novel robust graph learning scheme to learn reliable graphs from real-world noisy data by adaptively removing noise and errors in the raw data. We show that the proposed model can also be viewed as a robust version of manifold-regularized robust principal component analysis (RPCA), where the quality of the graph plays a critical role. The proposed model significantly boosts the performance of data clustering, semisupervised classification, and data recovery, primarily due to two key factors: 1) enhanced low-rank recovery by exploiting the graph smoothness assumption, and 2) improved graph construction by exploiting the clean data recovered by RPCA. Extensive experiments on image/document clustering, object recognition, image shadow removal, and video background subtraction show that our model outperforms the previous state-of-the-art methods.
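As a rough schematic only (not necessarily the paper's exact objective), a manifold-regularized RPCA with a jointly learned affinity matrix W can be written as follows, where X is the noisy data, L the low-rank clean part, S the sparse error, l_i the i-th column of L, and λ, γ, μ trade-off weights:

```latex
% Generic manifold-regularized RPCA with a learned graph (schematic form only,
% not claimed to be the paper's exact model).
\min_{L,\,S,\,W}\;\; \lVert L \rVert_{*} \;+\; \lambda \lVert S \rVert_{1}
  \;+\; \gamma \sum_{i,j} W_{ij}\, \lVert l_i - l_j \rVert_2^2
  \;+\; \mu \lVert W \rVert_F^2
\quad \text{s.t.} \quad X = L + S,\;\; W\mathbf{1} = \mathbf{1},\;\; W \ge 0 .
```

The graph-smoothness term ties the learned graph W to the recovered clean data L, which mirrors the two factors listed in the abstract (low-rank recovery aided by graph smoothness, and graph construction aided by the RPCA-cleaned data).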

14.
J Comput Chem ; 35(15): 1111-21, 2014 Jun 05.
Article in English | MEDLINE | ID: mdl-24648309

ABSTRACT

Elastic network models (ENMs) are based on the idea that the geometry of a protein structure provides enough information for computing its fluctuations around its equilibrium conformation. This geometry is represented as an elastic network (EN), that is, a network of links between residues, with a spring associated with each link. The normal modes of the protein are then identified with the normal modes of the corresponding network of springs. Standard approaches for generating ENs rely on a cutoff distance, and there is no consensus on how to choose this cutoff. In this work, we propose instead to filter the set of all residue pairs in a protein using the concept of alpha shapes. The main alpha shape we consider is based on the Delaunay triangulation of the Cα positions; we refer to the corresponding EN as EN(∞). We show that heterogeneous anisotropic network models, called αHANMs, that are based on EN(∞) reproduce experimental B-factors very well, with correlation coefficients above 0.99 and root-mean-square deviations below 0.1 Å² for a large set of high-resolution protein structures. The construction of EN(∞) is simple to implement and may be used automatically for generating ENs for all types of ENMs.
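A short Python sketch of building the EN(∞) network described above: the links are simply the Delaunay edges of the Cα coordinates, with no cutoff. Spring constants, the αHANM fitting step, and any filtering by alpha value are omitted, and the coordinates in the example are random stand-ins for real Cα positions.

```python
# Sketch of EN(inf): edges are the Delaunay edges of the C-alpha coordinates.
import numpy as np
from scipy.spatial import Delaunay


def delaunay_elastic_network(ca_coords):
    """ca_coords: (N, 3) array of C-alpha positions. Returns a set of (i, j) links."""
    tri = Delaunay(ca_coords)
    edges = set()
    for simplex in tri.simplices:          # each 3-D simplex is a tetrahedron
        for a in range(4):
            for b in range(a + 1, 4):
                i, j = sorted((simplex[a], simplex[b]))
                edges.add((i, j))
    return edges


if __name__ == "__main__":
    coords = np.random.default_rng(0).uniform(size=(50, 3)) * 30.0  # fake C-alpha positions
    print(len(delaunay_elastic_network(coords)), "links in EN(inf)")
```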


Subject(s)
Proteins/chemistry , Algorithms , Anisotropy , Computer Simulation , Models, Chemical , Models, Molecular , Protein Conformation
15.
BMC Bioinformatics ; 15: 57, 2014 Feb 26.
Article in English | MEDLINE | ID: mdl-24568581

ABSTRACT

BACKGROUND: Binding free energy and binding hot spots at protein-protein interfaces are two important research areas for understanding protein interactions. Computational methods have been developed previously for accurate prediction of the binding free energy change upon mutation for interfacial residues. However, a large number of interrupted and unimportant atomic contacts are used in the training phase, which causes a loss of accuracy. RESULTS: This work proposes a new method, βACVASA, to predict the change of binding free energy after alanine mutations. βACVASA integrates accessible surface area (ASA) and our newly defined β contacts together into an atomic contact vector (ACV). A β contact between two atoms is a direct contact that is not interrupted by any other atom between them. A β contact's potential contribution to protein binding is also assumed to be inversely proportional to its ASA, following the water exclusion hypothesis of binding hot spots. Tested on a dataset of 396 alanine mutations, our method is found to be superior in classification performance to many other methods, including Robetta, FoldX, HotPOINT, an ACV method of β contacts without ASA integration, and ACVASA methods (similar to βACVASA but based on distance-cutoff contacts). Based on our data analysis and results, we draw the following conclusions: (i) our method is powerful in the prediction of binding free energy change after alanine mutation; (ii) β contacts are better than distance-cutoff contacts for modeling well-organized protein-binding interfaces; (iii) β contacts usually are only a small fraction of the distance-based contacts; and (iv) water exclusion is a necessary condition for a residue to become a binding hot spot. CONCLUSIONS: βACVASA is designed using the advantages of both β contacts and water exclusion. It is an excellent tool to predict binding free energy changes and binding hot spots after alanine mutation.
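A hedged geometric sketch of the β contact test plus ASA weighting as we read the abstract: two atoms form a β contact if no third atom lies inside the sphere whose diameter is the segment joining them (a Gabriel-graph-style criterion), and each contact is weighted inversely to its ASA; the paper's exact geometric definition, distance bound, and ASA treatment may differ.

```python
# Hedged sketch: Gabriel-style beta-contact test plus inverse-ASA weighting.
import numpy as np


def is_beta_contact(i, j, coords, max_dist=6.0):
    """coords: (N, 3) atom coordinates. True if atoms i and j are within max_dist
    and no other atom falls inside the sphere with segment ij as diameter."""
    center = (coords[i] + coords[j]) / 2.0
    radius = np.linalg.norm(coords[i] - coords[j]) / 2.0
    if 2.0 * radius > max_dist:
        return False
    d = np.linalg.norm(coords - center, axis=1)
    d[[i, j]] = np.inf                      # ignore the two contact atoms themselves
    return bool(np.all(d >= radius))


def contact_weight(asa_i, asa_j, eps=1e-6):
    # "Inversely proportional to ASA": buried contacts count more (water exclusion).
    return 1.0 / (asa_i + asa_j + eps)


if __name__ == "__main__":
    atoms = np.random.default_rng(3).uniform(size=(30, 3)) * 15.0  # fake coordinates
    pairs = [(i, j) for i in range(30) for j in range(i + 1, 30)
             if is_beta_contact(i, j, atoms)]
    print(len(pairs), "beta contacts")
```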


Subject(s)
Computational Biology/methods , Protein Binding , Proteins/chemistry , Proteins/metabolism , Thermodynamics , Alanine/chemistry , Alanine/metabolism , Models, Molecular , Mutation , Water
16.
IEEE Trans Pattern Anal Mach Intell ; 36(3): 536-49, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24457509

ABSTRACT

Recent years have witnessed a number of studies on distance metric learning to improve visual similarity search in content-based image retrieval (CBIR). Despite their successes, most existing methods on distance metric learning are limited in two aspects. First, they usually assume the target proximity function follows the family of Mahalanobis distances, which limits their capacity to measure the similarity of complex patterns in real applications. Second, they often cannot effectively handle the similarity measure of multimodal data that may originate from multiple sources. To overcome these limitations, this paper investigates an online kernel similarity learning framework for learning kernel-based proximity functions, which goes beyond conventional linear distance metric learning approaches. Based on this framework, we propose a novel online multiple kernel similarity (OMKS) learning method which learns a flexible nonlinear proximity function with multiple kernels to improve visual similarity search in CBIR. We evaluate the proposed technique for CBIR on a variety of image data sets, on which encouraging results show that OMKS significantly outperforms the state-of-the-art techniques.
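A simplified sketch of the multiple-kernel combination step only, using Hedge-style multiplicative weight updates driven by ranking mistakes; the actual OMKS additionally learns a similarity function within each kernel space, and the kernels and discount factor below are illustrative.

```python
# Simplified sketch of combining multiple kernel similarities online with
# Hedge-style multiplicative weights (the actual OMKS also learns a similarity
# function inside each kernel space; only the combination step is shown).
import numpy as np


def make_kernels():
    return [
        lambda x, y: float(x @ y),                                # linear
        lambda x, y: float(np.exp(-np.linalg.norm(x - y) ** 2)),  # RBF
        lambda x, y: float((x @ y + 1.0) ** 2),                   # polynomial
    ]


def online_multi_kernel(triplets, kernels, beta=0.8):
    """triplets: iterable of (query, relevant, irrelevant) feature vectors.
    A kernel is discounted whenever it ranks the irrelevant image higher."""
    w = np.ones(len(kernels))
    for q, pos, neg in triplets:
        for k, kern in enumerate(kernels):
            if kern(q, pos) <= kern(q, neg):   # this kernel made a ranking mistake
                w[k] *= beta
        w /= w.sum()
    return w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = [(rng.normal(size=8), rng.normal(size=8), rng.normal(size=8)) for _ in range(50)]
    print(online_multi_kernel(data, make_kernels()))
```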

17.
IEEE Trans Pattern Anal Mach Intell ; 36(3): 550-63, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24457510

ABSTRACT

Auto face annotation, which aims to detect human faces from a facial image and assign them proper human names, is a fundamental research problem and beneficial to many real-world applications. In this work, we address this problem by investigating a retrieval-based annotation scheme of mining massive web facial images that are freely available over the Internet. In particular, given a facial image, we first retrieve the top n similar instances from a large-scale web facial image database using content-based image retrieval techniques, and then use their labels for auto annotation. Such a scheme has two major challenges: 1) how to retrieve the similar facial images that truly match the query, and 2) how to exploit the noisy labels of the top similar facial images, which may be incorrect or incomplete due to the nature of web images. In this paper, we propose an effective Weak Label Regularized Local Coordinate Coding (WLRLCC) technique, which exploits the principle of local coordinate coding by learning sparse features, and employs the idea of graph-based weak label regularization to enhance the weak labels of the similar facial images. An efficient optimization algorithm is proposed to solve the WLRLCC problem. Moreover, an effective sparse reconstruction scheme is developed to perform the face annotation task. We conduct extensive empirical studies on several web facial image databases to evaluate the proposed WLRLCC algorithm from different aspects. The experimental results validate its efficacy. We share the two constructed databases "WDB" (714,454 images of 6,025 people) and "ADB" (126,070 images of 1,200 people) with the public. To further improve the efficiency and scalability, we also propose an offline approximation scheme (AWLRLCC) which generally maintains comparable results but significantly reduces the annotation time.
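A toy sketch of the retrieval-based annotation scheme described above, using a plain similarity-weighted vote over the weak labels of the top-n retrieved faces; WLRLCC's sparse-coding and graph-based label refinement are not reproduced here, and the feature dimensions and names are made up for illustration.

```python
# Toy retrieval-based annotation: cosine retrieval of the top-n web faces, then
# a similarity-weighted vote over their (weak) name labels. WLRLCC's label
# refinement step is intentionally not reproduced.
import numpy as np


def annotate_by_retrieval(query_feat, db_feats, db_names, top_n=5):
    """db_feats: (N, D) features of web facial images; db_names: length-N labels."""
    sims = db_feats @ query_feat / (
        np.linalg.norm(db_feats, axis=1) * np.linalg.norm(query_feat) + 1e-12)
    top = np.argsort(-sims)[:top_n]
    votes = {}
    for idx in top:                        # similarity-weighted vote on weak labels
        votes[db_names[idx]] = votes.get(db_names[idx], 0.0) + float(sims[idx])
    return max(votes, key=votes.get)


if __name__ == "__main__":
    rng = np.random.default_rng(7)
    feats = rng.normal(size=(100, 64))
    names = [f"person_{i % 10}" for i in range(100)]
    print(annotate_by_retrieval(feats[3] + 0.05 * rng.normal(size=64), feats, names))
```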


Subject(s)
Biometric Identification/methods , Databases, Factual , Face/anatomy & histology , Image Processing, Computer-Assisted/methods , Algorithms , Artificial Intelligence , Female , Humans , Internet , Male
18.
Article in English | MEDLINE | ID: mdl-26355502

ABSTRACT

Coupling graphs are newly introduced in this paper to meet many application needs, particularly in the field of bioinformatics. A coupling graph is a two-layer graph complex in which each node from one layer has at least one connection with the nodes in the other layer, and vice versa. The coupling graph model is sufficiently powerful to capture strong and inherent associations between subgraph pairs in complicated applications. The focus of this paper is on mining algorithms for frequent coupling subgraphs and their bioinformatics applications. Although existing frequent subgraph mining algorithms are competent to identify frequent subgraphs in a graph database, they perform poorly on frequent coupling subgraph mining because they generate many irrelevant subgraphs. We propose a novel graph transformation technique to transform a coupling graph into a generic graph. Based on the transformed coupling graphs, existing graph mining methods are then utilized to discover frequent coupling subgraphs. We prove that the transformation is precise and complete and that the restoration is reversible. Experiments carried out on a database containing 10,511 coupling graphs show that our proposed algorithm greatly reduces the mining time in comparison with existing subgraph mining algorithms. Moreover, we demonstrate the usefulness of frequent coupling subgraphs by applying our algorithm to make accurate predictions of epitopes in antibody-antigen binding.
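A minimal networkx sketch of the coupling-graph data structure described above: two node layers plus inter-layer coupling edges, with a check that every node is coupled to the other layer; the paper's transformation to a generic graph and the frequent-subgraph mining step are not reproduced, and the layer tags are our own convention.

```python
# Minimal coupling-graph container: two layers of nodes, intra-layer edges, and
# inter-layer "coupling" edges, with a check of the coupling-graph condition.
import networkx as nx


def build_coupling_graph(layer_a_edges, layer_b_edges, coupling_edges):
    g = nx.Graph()
    g.add_edges_from(((("A", u), ("A", v)) for u, v in layer_a_edges), layer="A")
    g.add_edges_from(((("B", u), ("B", v)) for u, v in layer_b_edges), layer="B")
    g.add_edges_from(((("A", u), ("B", v)) for u, v in coupling_edges), layer="coupling")
    # Coupling-graph condition: every node must touch at least one node of the other layer.
    for node in g.nodes:
        assert any(nbr[0] != node[0] for nbr in g.neighbors(node)), f"{node} is uncoupled"
    return g


if __name__ == "__main__":
    g = build_coupling_graph(
        layer_a_edges=[(1, 2), (2, 3)],
        layer_b_edges=[(1, 2)],
        coupling_edges=[(1, 1), (2, 2), (3, 1)])
    print(g.number_of_nodes(), g.number_of_edges())  # 5 nodes, 6 edges
```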


Subject(s)
Algorithms , Computational Biology/methods , Data Mining/methods , Epitopes, B-Lymphocyte/chemistry , Databases, Protein , Models, Molecular , Models, Statistical
19.
PLoS One ; 8(4): e59737, 2013.
Article in English | MEDLINE | ID: mdl-23630569

ABSTRACT

Specific binding between proteins plays a crucial role in molecular functions and biological processes. In previous studies, protein binding interfaces and their atomic contacts have typically been defined by simple criteria, such as distance-based definitions that use only a threshold on spatial distance. These definitions neglect the nearby atomic organization of contact atoms, and thus detect contacts that are interrupted by other atoms. It is questionable whether such interrupted contacts are as important as other contacts in protein binding. To tackle this challenge, we propose a new definition called beta (β) atomic contacts. Our definition, founded on the β-skeletons of computational geometry, requires that there be no other atom in the contact sphere defined by two contact atoms; this sphere is similar to the van der Waals spheres of atoms. Statistical analysis on a large dataset shows that β contacts are only a small fraction of conventional distance-based contacts. To empirically quantify the importance of β contacts, we design βACV, an SVM classifier with β contacts as input, to distinguish homodimers from crystal packing. We found that βACV achieves classification performance superior to SVM classifiers with distance-based contacts as input. βACV also outperforms several existing methods when evaluated on datasets from previous works. The promising empirical performance suggests that β contacts can truly identify critical specific contacts in protein binding interfaces. β contacts thus provide a more precise model of atomic organization in protein quaternary structures than distance-based contacts.


Subject(s)
Proteins/chemistry , Crystallography, X-Ray , Decision Trees , Hydrogen Bonding , Hydrophobic and Hydrophilic Interactions , Models, Chemical , Models, Molecular , Protein Interaction Domains and Motifs , Protein Structure, Quaternary , Proteins/classification , Support Vector Machine
20.
PLoS One ; 7(12): e50821, 2012.
Article in English | MEDLINE | ID: mdl-23272073

ABSTRACT

A multi-interface domain is a domain that can shape multiple, distinctive binding sites to contact many other domains, forming a hub in domain-domain interaction networks. The functions played by the multiple interfaces are usually different, but there is no strict bijection between functions and interfaces, as some subsets of interfaces play the same function. This work applies graph theory and algorithms to discover fingerprints for the multiple interfaces of a domain and to establish associations between interfaces and functions, based on a large set of multi-interface proteins from the PDB. We found that about 40% of proteins have the multi-interface property; however, the involved multi-interface domains account for only a tiny fraction (1.8%) of the total number of domains. The interfaces of these domains are distinguishable in terms of their fingerprints, indicating the functional specificity of the multiple interfaces within a domain. Furthermore, we observed that both cooperative and distinctive structural patterns, which will be useful for protein engineering, exist in the multiple interfaces of a domain.


Subject(s)
Databases, Protein , Proteins/chemistry , Algorithms , Animals , Binding Sites , Cluster Analysis , Crystallography, X-Ray/methods , Humans , Ligands , Models, Molecular , Models, Statistical , Molecular Conformation , Proteasome Endopeptidase Complex/chemistry , Protein Binding , Protein Conformation , Protein Interaction Mapping , Protein Structure, Secondary , Protein Structure, Tertiary