2.
Nat Genet ; 50(12): 1754, 2018 12.
Article in English | MEDLINE | ID: mdl-30420650

ABSTRACT

In the version of the article published, the author list is not accurate. Igor Cima and Min-Han Tan should have been authors, appearing after Mark Wong in the author list, while Paul Jongjoon Choi should not have been listed as an author. Igor Cima and Min-Han Tan both have the affiliation Institute of Bioengineering and Nanotechnology, Singapore, Singapore, and their contributions should have been noted in the Author Contributions section as "I.C. preprocessed Primary Cell Atlas data with inputs from M.-H.T." The following description of the contribution of Paul Jongjoon Choi should not have appeared: "P.J.C. supported the smFISH experiments." In the 'RCA: global panel' section of the Online Methods, the following sentence should have appeared as the second sentence, "An expression atlas of human primary cells (the Primary Cell Atlas) was preprocessed similarly to in ref. 55," with new reference 55 (Cima, I. et al. Tumor-derived circulating endothelial cell clusters in colorectal cancer. Science Transl. Med. 8, 345ra89, 2016).

3.
Nat Genet ; 49(5): 708-718, 2017 May.
Article in English | MEDLINE | ID: mdl-28319088

ABSTRACT

Intratumoral heterogeneity is a major obstacle to cancer treatment and a significant confounding factor in bulk-tumor profiling. We performed an unbiased analysis of transcriptional heterogeneity in colorectal tumors and their microenvironments using single-cell RNA-seq from 11 primary colorectal tumors and matched normal mucosa. To robustly cluster single-cell transcriptomes, we developed reference component analysis (RCA), an algorithm that substantially improves clustering accuracy. Using RCA, we identified two distinct subtypes of cancer-associated fibroblasts (CAFs). Additionally, epithelial-mesenchymal transition (EMT)-related genes were found to be upregulated only in the CAF subpopulation of tumor samples. Notably, colorectal tumors previously assigned to a single subtype on the basis of bulk transcriptomics could be divided into subgroups with divergent survival probability by using single-cell signatures, thus underscoring the prognostic value of our approach. Overall, our results demonstrate that unbiased single-cell RNA-seq profiling of tumor and matched normal samples provides a unique opportunity to characterize aberrant cell states within a tumor.


Subject(s)
Colorectal Neoplasms/genetics , Gene Expression Profiling/methods , Gene Expression Regulation, Neoplastic , Single-Cell Analysis/methods , Transcriptome , A549 Cells , Algorithms , Cell Line , Cell Line, Tumor , Cluster Analysis , Colorectal Neoplasms/pathology , Epithelial-Mesenchymal Transition/genetics , Fibroblasts/metabolism , Genetic Heterogeneity , Humans , Immunohistochemistry , In Situ Hybridization, Fluorescence , K562 Cells , Principal Component Analysis , Prognosis , Sequence Analysis, RNA/methods , Survival Analysis
4.
BMC Genomics ; 12 Suppl 3: S11, 2011 Nov 30.
Article in English | MEDLINE | ID: mdl-22479704

ABSTRACT

BACKGROUND: Granzyme B is a serine protease which cleaves at unique tetrapeptide sequences. It is involved in several signaling cross-talks with caspases and functions as a pivotal mediator in a broad range of cellular processes such as apoptosis and inflammation. The granzyme B degradome comprises proteins from a myriad of functional classes with many more expected to be discovered. However, the experimental discovery and validation of bona fide granzyme B substrates require time-consuming and laborious efforts. As such, computational methods for the prediction of substrates would be immensely helpful. RESULTS: We have compiled a dataset of 580 experimentally verified granzyme B cleavage sites and found distinctive patterns of residue conservation and position-specific residue propensities which could be useful for in silico prediction using machine learning algorithms. We trained a series of support vector machine (SVM) classifiers employing Bayes Feature Extraction to predict cleavage sites using sequence windows of diverse lengths and compositions. The SVM classifiers achieved accuracy and AROC scores of 71.00% to 86.50% and 0.78 to 0.94, respectively, on independent test sets. We have applied our prediction method to the Chikungunya viral proteome and identified several regulatory domains of viral proteins as potential sites of granzyme B cleavage, suggesting direct antiviral activity of granzyme B during host-viral innate immune responses. CONCLUSIONS: We have compiled a comprehensive dataset of granzyme B cleavage sites and developed an accurate SVM-based prediction method utilizing Bayes Feature Extraction to identify novel substrates of granzyme B in silico. The prediction server is available online, together with reference datasets and supplementary materials.


Subject(s)
Computational Biology , Databases, Factual , Granzymes/metabolism , Bayes Theorem , Chikungunya virus/metabolism , Proteome/metabolism , Support Vector Machine , Viral Proteins/metabolism
5.
BMC Genomics ; 11 Suppl 4: S21, 2010 Dec 02.
Article in English | MEDLINE | ID: mdl-21143805

ABSTRACT

BACKGROUND: The identification of B-cell epitopes on antigens has been a subject of intense research as the knowledge of these markers has great implications for the development of peptide-based diagnostics, therapeutics and vaccines. As experimental approaches are often laborious and time-consuming, in silico methods for prediction of these immunogenic regions are critical. Such efforts, however, have been significantly hindered by high variability in the length and composition of the epitope sequences, making naïve modeling methods difficult to apply. RESULTS: We analyzed two benchmark datasets and found that linear B-cell epitopes possess distinctive residue conservation and position-specific residue propensities which could be exploited for epitope discrimination in silico. We developed a support vector machine (SVM) prediction model employing Bayes Feature Extraction to predict linear B-cell epitopes of diverse lengths (12- to 20-mers). The best SVM classifier achieved an accuracy of 74.50% and AROC of 0.84 on an independent test set and was shown to outperform existing linear B-cell epitope prediction algorithms. In addition, we applied our model to a dataset of antigenic proteins with experimentally verified epitopes and found it to be generally effective for discriminating the epitopes from non-epitopes. CONCLUSION: We developed an SVM prediction model utilizing Bayes Feature Extraction and showed that it was effective in discriminating epitopes from non-epitopes in benchmark datasets and annotated antigenic proteins. A web server for predicting linear B-cell epitopes was developed and is available, together with supplementary materials, at http://www.immunopred.org/bayesb/index.html.
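
The abstract above refers to position-specific residue propensities without defining them, and it does not describe Bayes Feature Extraction in detail. As a rough illustration only, and not the authors' method, the sketch below computes a generic per-position log-odds score for each residue from equal-length positive and negative peptides. The function names, the pseudocount and the toy peptides are all assumptions made for this example.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

def position_propensities(positives, negatives, pseudocount=1.0):
    """Per-position log-odds of each residue in positive vs. negative peptides.

    Generic illustration; all peptides are assumed to have the same length.
    """
    length = len(positives[0])
    table = []
    for pos in range(length):
        pos_counts = Counter(p[pos] for p in positives)
        neg_counts = Counter(n[pos] for n in negatives)
        scores = {}
        for aa in ALPHABET:
            # Smoothed residue frequencies at this position in each class.
            p = (pos_counts[aa] + pseudocount) / (len(positives) + pseudocount * len(ALPHABET))
            q = (neg_counts[aa] + pseudocount) / (len(negatives) + pseudocount * len(ALPHABET))
            scores[aa] = math.log(p / q)
        table.append(scores)
    return table

def encode(peptide, table):
    """Turn a peptide into a numeric feature vector, one log-odds value per position."""
    return [table[i][aa] for i, aa in enumerate(peptide)]

# Hypothetical toy peptides (not taken from any benchmark dataset).
positives = ["ACD", "ACE"]
negatives = ["GHK", "GHR"]
table = position_propensities(positives, negatives)
features = encode("ACD", table)
```

A vector such as `features` is the kind of numeric input a classifier like an SVM could be trained on; residues enriched in the positive set score above zero and residues enriched in the negative set score below zero.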


Subject(s)
Bayes Theorem , Epitopes, B-Lymphocyte , Algorithms , Antigens/chemistry , Antigens/immunology , Benchmarking , Computer Simulation , Epitopes, B-Lymphocyte/immunology , Internet , Peptides/chemistry , Peptides/immunology , Predictive Value of Tests
6.
PLoS One ; 5(6): e11267, 2010 Jun 23.
Article in English | MEDLINE | ID: mdl-20585645

ABSTRACT

BACKGROUND: Symptomatic infection by dengue virus (DENV) can range from dengue fever (DF) to dengue haemorrhagic fever (DHF); however, the determinants of DF or DHF progression are not completely understood. It is hypothesised that host innate immune response factors are involved in modulating the disease outcome and that the expression levels of genes involved in this response could be used as early prognostic markers for disease severity. METHODOLOGY/PRINCIPAL FINDINGS: mRNA expression levels of genes involved in DENV innate immune responses were measured using quantitative real time PCR (qPCR). Here, we present a novel application of the support vector machines (SVM) algorithm to analyze the expression pattern of 12 genes in peripheral blood mononuclear cells (PBMCs) of 28 dengue patients (13 DHF and 15 DF) during acute viral infection. The SVM model was trained using gene expression data of these genes and achieved the highest accuracy of approximately 85% with leave-one-out cross-validation. Through selective removal of gene expression data from the SVM model, we have identified seven genes (MYD88, TLR7, TLR3, MDA5, IRF3, IFN-alpha and CLEC5A) that may be central in differentiating DF patients from DHF, with MYD88 and TLR7 observed to be the most important. Though the individual removal of expression data of five other genes had no impact on the overall accuracy, a significant combined role was observed when the SVM model of the two main genes (MYD88 and TLR7) was re-trained to include the five genes, increasing the overall accuracy to approximately 96%. CONCLUSIONS/SIGNIFICANCE: Here, we present a novel use of the SVM algorithm to classify DF and DHF patients, as well as to elucidate the significance of the various genes involved. It was observed that seven genes are critical in classifying DF and DHF patients: TLR3, MDA5, IRF3, IFN-alpha, CLEC5A, and the two most important, MYD88 and TLR7. While these preliminary results are promising, further experimental investigation is necessary to validate their specific roles in dengue disease.
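
The leave-one-out cross-validation mentioned in the abstract above can be sketched as follows: each patient is held out once, the model is trained on the remaining patients, and the held-out prediction is scored. To keep the example self-contained, a simple nearest-centroid classifier stands in for the SVM used in the study; the expression values are invented and are not study data.

```python
from statistics import mean

def nearest_centroid_predict(train_X, train_y, x):
    """Predict the label whose class centroid is closest to x (squared Euclidean).

    Stand-in classifier for illustration; the study itself used an SVM.
    """
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        centroid = [mean(col) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def leave_one_out_accuracy(X, y):
    """Hold out each sample once, train on the rest, and score the held-out prediction."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)

# Hypothetical expression values for two genes in six patients (not study data).
X = [[1.0, 0.9], [1.2, 1.1], [0.8, 1.0], [3.0, 3.2], [3.1, 2.9], [2.8, 3.0]]
y = ["DF", "DF", "DF", "DHF", "DHF", "DHF"]
accuracy = leave_one_out_accuracy(X, y)
```

Removing one gene column from `X` and re-running `leave_one_out_accuracy` is one simple way to probe how much that gene contributes, in the spirit of the selective-removal analysis the abstract describes.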


Subject(s)
Dengue/classification , Gene Expression , Dengue/genetics , Dengue/immunology , Humans , Immunity, Innate/genetics , RNA, Messenger/genetics
7.
BMC Genomics ; 10 Suppl 3: S6, 2009 Dec 03.
Article in English | MEDLINE | ID: mdl-19958504

ABSTRACT

BACKGROUND: Caspases belong to a class of cysteine proteases which function as critical effectors in cellular processes such as apoptosis and inflammation by cleaving substrates immediately after unique tetrapeptide sites. With hundreds of reported substrates and many more expected to be discovered, the elucidation of the caspase degradome will be an important milestone in the study of these proteases in human health and disease. Several computational methods for predicting caspase cleavage sites have been developed recently for identifying potential substrates. However, as most of these methods are based primarily on the detection of the tetrapeptide cleavage sites - a factor necessary but not sufficient for predicting in vivo substrate cleavage - prediction outcomes will inevitably include many false positives. RESULTS: In this paper, we show that structural factors such as the presence of disorder and solvent exposure in the vicinity of the cleavage site are important and can be used to enhance results from cleavage site prediction. We constructed a two-step model incorporating cleavage site prediction and these factors to predict caspase substrates. Sequences are first predicted for cleavage sites using CASVM or GraBCas. Predicted cleavage sites are then scored, ranked and filtered against a cut-off based on their propensities for locating in disordered and solvent-exposed regions. Using an independent dataset of caspase substrates, the model was shown to achieve greater positive predictive values compared to CASVM or GraBCas alone, and was able to reduce the false positives pool by up to 13% and 53% respectively while retaining all true positives. We applied our prediction model to the family of receptor tyrosine kinases (RTKs) and highlighted several members as potential caspase targets. The results suggest that RTKs may be generally regulated by caspase cleavage and, in some cases, promote the induction of apoptotic cell death - a function distinct from their role as transducers of survival and growth signals. CONCLUSION: As a step towards the prediction of in vivo caspase substrates, we have developed an accurate method incorporating cleavage site prediction and structural factors. The multi-factor model augments existing methods and complements experimental efforts to define the caspase degradome on a systems-wide basis.
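
The two-step model described in the abstract above (predict sites, then score, rank and filter them against a cut-off) can be illustrated with a minimal sketch. The scoring rule here, a plain average of a disorder score and a solvent-exposure score, is an assumption made for the example and is not the scoring used in the paper. The site positions, scores and cut-off are likewise hypothetical, and the step-1 predictions are given as a literal rather than produced by CASVM or GraBCas.

```python
def filter_sites(predicted_sites, cutoff):
    """Score each predicted site, keep those at or above the cut-off, rank high to low.

    The score is the mean of the disorder and exposure values (an assumed rule).
    """
    scored = [
        (site, (info["disorder"] + info["exposure"]) / 2)
        for site, info in predicted_sites.items()
    ]
    kept = [(site, score) for site, score in scored if score >= cutoff]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

def positive_predictive_value(predicted, true_sites):
    """Fraction of predicted sites that are true cleavage sites: TP / (TP + FP)."""
    tp = sum(1 for site in predicted if site in true_sites)
    return tp / len(predicted) if predicted else 0.0

# Hypothetical step-1 output: site position -> structural propensities in [0, 1].
step1 = {
    120: {"disorder": 0.9, "exposure": 0.8},
    245: {"disorder": 0.2, "exposure": 0.3},
    310: {"disorder": 0.7, "exposure": 0.6},
    402: {"disorder": 0.1, "exposure": 0.4},
}
true_sites = {120, 310}  # hypothetical experimentally verified sites

ranked = filter_sites(step1, cutoff=0.5)
ppv_before = positive_predictive_value(list(step1), true_sites)
ppv_after = positive_predictive_value([site for site, _ in ranked], true_sites)
```

In this toy case the filter removes both false positives while retaining both true positives, so the positive predictive value rises from 0.5 to 1.0; real data would rarely separate this cleanly.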


Subject(s)
Caspases/chemistry , Models, Biological , Proteome/analysis , Amino Acid Sequence , Caspases/metabolism , Humans , Molecular Sequence Data , Protein Structure, Secondary
8.
Bioinformatics ; 23(23): 3241-3, 2007 Dec 01.
Article in English | MEDLINE | ID: mdl-17599937

ABSTRACT

UNLABELLED: Caspases belong to a unique class of cysteine proteases which function as critical effectors of apoptosis, inflammation and other important cellular processes. Caspases cleave substrates at specific tetrapeptide sites after a highly conserved aspartic acid residue. Prediction of such cleavage sites will complement structural and functional studies on substrate cleavage as well as the discovery of new substrates. We have recently developed a support vector machines (SVM) method to address this issue. Our algorithm achieved an accuracy ranging from 81.25 to 97.92%, making it one of the best methods currently available. CASVM is the web server implementation of our SVM algorithms, written in Perl and hosted on a Linux platform. The server can be used for predicting non-canonical caspase substrate cleavage sites. We have also included a relational database containing experimentally verified caspase substrates retrievable using accession IDs, keywords or sequence similarity. AVAILABILITY: http://www.casbase.org/casvm/index.html


Subject(s)
Artificial Intelligence , Caspases/chemistry , Internet , Pattern Recognition, Automated/methods , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Software , Algorithms , Amino Acid Sequence , Binding Sites , Enzyme Activation , Molecular Sequence Data , Protein Binding , Substrate Specificity
9.
BMC Bioinformatics ; 7 Suppl 5: S14, 2006 Dec 18.
Article in English | MEDLINE | ID: mdl-17254298

ABSTRACT

BACKGROUND: Caspases belong to a class of cysteine proteases which function as critical effectors in apoptosis and inflammation by cleaving substrates immediately after unique sites. Prediction of such cleavage sites will complement structural and functional studies on substrate cleavage as well as the discovery of new substrates. Recently, different computational methods have been developed to predict the cleavage sites of caspase substrates with varying degrees of success. As the support vector machines (SVM) algorithm has been shown to be useful in several biological classification problems, we have implemented an SVM-based method to investigate its applicability to this domain. RESULTS: A set of unique caspase substrate cleavage sites was obtained from the literature and used for evaluating the SVM method. Datasets containing (i) the tetrapeptide cleavage sites, (ii) the tetrapeptide cleavage sites augmented by the two adjacent residues, the P1' and P2' amino acids, and (iii) the tetrapeptide cleavage sites with ten additional upstream and downstream flanking residues (where available) were tested. The SVM method achieved an accuracy ranging from 81.25% to 97.92% on independent test sets. The SVM method successfully predicted the cleavage of a novel caspase substrate and its mutants. CONCLUSION: This study presents an SVM approach for predicting caspase substrate cleavage sites based on the cleavage sites and the downstream and upstream flanking sequences. The method shows an improvement over existing methods and may be useful for predicting hitherto undiscovered cleavage sites.
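
The three dataset variants listed in the abstract above differ only in how much sequence around the cleavage site is kept. The sketch below shows one way to cut those windows from a protein sequence, assuming the standard convention that the tetrapeptide spans positions P4 to P1 and that caspases cleave after an aspartate (D) at P1. The function names and the example sequence are hypothetical, and this is not the authors' preprocessing code.

```python
def candidate_sites(sequence):
    """Indices (0-based) of aspartate residues with at least three residues upstream,
    i.e. positions that could serve as P1 of a full P4-P1 tetrapeptide."""
    return [i for i, aa in enumerate(sequence) if aa == "D" and i >= 3]

def cleavage_windows(sequence, p1_index, flank=10):
    """Return the three window variants around a site whose P1 residue is at p1_index."""
    start = p1_index - 3   # P4 position
    end = p1_index + 1     # one past P1
    if start < 0 or end > len(sequence):
        raise ValueError("no full P4-P1 tetrapeptide at this position")
    return {
        # (i) the tetrapeptide P4-P1
        "tetrapeptide": sequence[start:end],
        # (ii) the tetrapeptide plus P1' and P2'
        "with_p1p2_prime": sequence[start:end + 2],
        # (iii) the tetrapeptide plus up to `flank` residues on each side, where available
        "with_flanks": sequence[max(0, start - flank):end + flank],
    }

# Hypothetical 16-residue sequence containing a DEVD motif.
seq = "MKTAYDEVDGSLLQRA"
sites = candidate_sites(seq)
windows = cleavage_windows(seq, 8)
```

Python slicing truncates silently at the sequence ends, which matches the "where available" qualifier for the flanking residues: short flanks simply yield a shorter window.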


Subject(s)
Algorithms , Amino Acid Motifs , Caspases/metabolism , Computational Biology/methods , Sequence Analysis, Protein/methods , Amino Acid Sequence , Binding Sites , Molecular Sequence Data