Results 1 - 5 of 5
1.
Oncotarget ; 7(3): 2555-71, 2016 Jan 19.
Article in English | MEDLINE | ID: mdl-26700623

ABSTRACT

The selection of therapeutic targets is a critical aspect of antibody-drug conjugate research and development. In this study, we applied computational methods to select candidate targets overexpressed in three major breast cancer subtypes as compared with a range of vital organs and tissues. Microarray data corresponding to over 8,000 tissue samples were collected from the public domain. Breast cancer samples were classified into molecular subtypes using an iterative ensemble approach combining six classification algorithms and three feature selection techniques, including a novel kernel density-based method. This feature selection method was used in conjunction with differential expression and subcellular localization information to assemble a primary list of targets. A total of 50 cell membrane targets were identified, including one target for which an antibody-drug conjugate is in clinical use, and six targets for which antibody-drug conjugates are in clinical trials for the treatment of breast cancer and other solid tumors. In addition, 50 extracellular proteins were identified as potential targets for non-internalizing strategies and alternative modalities. Candidate targets linked with the epithelial-to-mesenchymal transition were identified by analyzing differential gene expression in epithelial and mesenchymal tumor-derived cell lines. Overall, these results show that mining human gene expression data has the power to select and prioritize breast cancer antibody-drug conjugate targets, and the potential to lead to new and more effective cancer therapeutics.
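The filtering step described in this abstract (differential expression combined with subcellular localization) can be illustrated with a minimal sketch. It is not the authors' pipeline: the kernel density-based feature selection is not specified in the abstract and is omitted here. The function name `candidate_targets`, the log2 fold-change cutoff, and the data layout are all assumptions made for illustration.

```python
from statistics import median


def candidate_targets(tumor, normal, localization,
                      min_log2_fc=1.0, compartment="membrane"):
    """Return genes overexpressed in tumor relative to every normal tissue.

    tumor:        {gene: [log2 expression values in tumor samples]}
    normal:       {gene: {tissue: [log2 expression values]}}
    localization: {gene: compartment label}
    All three structures and the cutoff are hypothetical.
    """
    hits = []
    for gene, tumor_values in tumor.items():
        # Keep only genes annotated to the compartment of interest.
        if localization.get(gene) != compartment:
            continue
        # Compare against the normal tissue with the highest median expression,
        # so a gene passes only if it exceeds all vital tissues.
        highest_normal = max(median(values) for values in normal[gene].values())
        if median(tumor_values) - highest_normal >= min_log2_fc:
            hits.append(gene)
    return sorted(hits)
```

Comparing against the highest-expressing normal tissue rather than a pooled average is one conservative design choice; a real analysis would also need a statistical test and multiple-testing correction.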


Subject(s)
Antibodies, Monoclonal/metabolism , Antineoplastic Agents/metabolism , Biomarkers, Tumor/genetics , Breast Neoplasms/classification , Computational Biology/methods , Immunoconjugates/genetics , Breast Neoplasms/drug therapy , Breast Neoplasms/genetics , Breast Neoplasms/pathology , Drug Delivery Systems , Epithelial Cells , Epithelial-Mesenchymal Transition , Female , Gene Expression Profiling , Humans , Tumor Cells, Cultured
2.
BMC Bioinformatics ; 13: 54, 2012 Apr 04.
Article in English | MEDLINE | ID: mdl-22475802

ABSTRACT

BACKGROUND: Nowadays, it is possible to collect expression levels of a set of genes from a set of biological samples during a series of time points. Such data have three dimensions: gene-sample-time (GST). Thus they are called 3D microarray gene expression data. To take advantage of the 3D data collected, and to fully understand the biological knowledge hidden in the GST data, novel subspace clustering algorithms have to be developed to effectively address the biological problem in the corresponding space. RESULTS: We developed a subspace clustering algorithm called Order Preserving Triclustering (OPTricluster), for 3D short time-series data mining. OPTricluster is able to identify 3D clusters with coherent evolution from a given 3D dataset using a combinatorial approach on the sample dimension, and the order preserving (OP) concept on the time dimension. The fusion of the two methodologies allows one to study similarities and differences between samples in terms of their temporal expression profile. OPTricluster has been successfully applied to four case studies: immune response in mice infected by malaria (Plasmodium chabaudi), systemic acquired resistance in Arabidopsis thaliana, similarities and differences between inner and outer cotyledon in Brassica napus during seed development, and whole seed development in Brassica napus. These studies showed that OPTricluster is robust to noise and is able to detect the similarities and differences between biological samples. CONCLUSIONS: Our analysis showed that OPTricluster generally outperforms other well-known clustering algorithms such as TRICLUSTER, gTRICLUSTER and K-means; it is robust to noise and can effectively mine the biological knowledge hidden in the 3D short time-series gene expression data.
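The two ideas named in this abstract, order preservation on the time dimension and a combinatorial search on the sample dimension, can be sketched as follows. This is a simplified illustration and not the published OPTricluster implementation: ties between time points are broken by time index, no noise threshold is applied, and the function names and data layout are assumptions.

```python
from collections import defaultdict
from itertools import combinations


def order_pattern(profile):
    """Return the time-point indices sorted by ascending expression value.

    Ties are resolved by time index because sorted() is stable.
    """
    return tuple(sorted(range(len(profile)), key=lambda t: profile[t]))


def conserved_patterns(data, samples):
    """Group genes whose order pattern is identical across all given samples.

    data: {sample: {gene: [expression over time]}} (hypothetical layout)
    """
    first, rest = samples[0], samples[1:]
    groups = defaultdict(list)
    for gene, profile in data[first].items():
        pattern = order_pattern(profile)
        if all(gene in data[s] and order_pattern(data[s][gene]) == pattern
               for s in rest):
            groups[pattern].append(gene)
    return dict(groups)


def triclusters(data, min_samples=2):
    """Enumerate sample subsets and report the gene groups conserved in each."""
    names = sorted(data)
    result = {}
    for size in range(min_samples, len(names) + 1):
        for subset in combinations(names, size):
            found = conserved_patterns(data, subset)
            if found:
                result[subset] = found
    return result
```

Enumerating every sample subset grows exponentially with the number of samples, so a sketch like this is only practical for the small sample counts typical of short time-series experiments.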


Subject(s)
Algorithms , Data Mining , Gene Expression Profiling , Oligonucleotide Array Sequence Analysis , Animals , Arabidopsis , Brassica napus/genetics , Brassica napus/growth & development , Cluster Analysis , Cotyledon/metabolism , Malaria/immunology , Mice
3.
BMC Bioinformatics ; 11: 229, 2010 May 06.
Article in English | MEDLINE | ID: mdl-20459620

ABSTRACT

BACKGROUND: Modern high-throughput experimental techniques such as DNA microarrays often result in large lists of genes. Computational biology tools such as clustering are then used to group together genes based on their similarity in expression profiles. Genes in each group are probably functionally related. The functional relevance among the genes in each group is usually characterized by utilizing available biological knowledge in public databases such as Gene Ontology (GO), KEGG pathways, association between a transcription factor (TF) and its target genes, and/or gene networks. RESULTS: We developed GOAL: Gene Ontology AnaLyzer, a software tool specifically designed for the functional evaluation of gene groups. GOAL implements and supports efficient and statistically rigorous functional interpretations of gene groups through its integration with available GO, TF-gene association data, and association with KEGG pathways. In order to facilitate more specific functional characterization of a gene group, we implement three GO-tree search strategies rather than one as in most existing GO analysis tools. Furthermore, GOAL offers flexibility in deployment. It can be used as a standalone tool, a plug-in to other computational biology tools, or a web server application. CONCLUSION: We developed a functional evaluation software tool, GOAL, to perform functional characterization of a gene group. GOAL offers three GO-tree search strategies and combines strengths in function integration, portability, and visualization with flexibility in deployment. Furthermore, GOAL can be used to evaluate and compare gene groups output by computational biology tools such as clustering algorithms.
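The abstract does not say which statistical test GOAL applies. A common choice for scoring over-representation of a GO term in a gene group is the one-sided hypergeometric test, sketched below as an assumption. The function name and parameter names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, group_size, hits):
    """Probability of seeing at least `hits` annotated genes in the group.

    population: number of genes in the background set
    annotated:  background genes annotated with the term
    group_size: number of genes in the group being evaluated
    hits:       genes in the group annotated with the term
    """
    upper = min(annotated, group_size)
    total = comb(population, group_size)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(annotated, i) * comb(population - annotated, group_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total
```

In practice the p-values for all tested terms would need multiple-testing correction, which this sketch leaves out.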


Subject(s)
Genes , Genomics/methods , Software , Databases, Genetic , Gene Expression Profiling/methods , Gene Regulatory Networks , Oligonucleotide Array Sequence Analysis
4.
J Bioinform Comput Biol ; 8(1): 19-38, 2010 Feb.
Article in English | MEDLINE | ID: mdl-20183872

ABSTRACT

An unsupervised multi-strategy approach has been developed to identify informative genes from high-throughput genomic data. Several statistical methods have been used in the field to identify differentially expressed genes. Since different methods generate different lists of genes, it is very challenging to determine the most reliable gene list and the appropriate method. This paper presents a multi-strategy method in which a combination of several data analysis techniques is applied to a given dataset, and a confidence measure is established to select genes from the gene lists generated by these techniques to form the core of our final selection. The remaining genes, which form the peripheral region, are subject to exclusion from or inclusion in the final selection. This paper demonstrates the methodology through its application to an in-house cancer genomics dataset and a public dataset. The results indicate that our method provides a more reliable list of genes, which are validated using biological knowledge, biological experiments, and literature search. We further evaluated our multi-strategy method by consolidating two pairs of independent datasets; each pair is for the same disease but was generated by different labs using different platforms. The results showed that our method produced far better results.
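The core and peripheral split described in this abstract can be illustrated with a minimal sketch. The abstract does not define the confidence measure, so this example assumes the simplest one: the fraction of techniques whose list contains the gene. The function name and the threshold are hypothetical.

```python
from collections import Counter


def consensus_selection(gene_lists, core_threshold=1.0):
    """Split genes into a core set and a peripheral set by agreement.

    gene_lists:     one list of selected genes per analysis technique
    core_threshold: minimum fraction of techniques that must select a gene
                    for it to enter the core (assumed measure)
    """
    # Count each gene once per technique, even if a list repeats it.
    counts = Counter(gene for genes in gene_lists for gene in set(genes))
    n_methods = len(gene_lists)
    confidence = {gene: count / n_methods for gene, count in counts.items()}
    core = sorted(g for g, c in confidence.items() if c >= core_threshold)
    peripheral = sorted(g for g, c in confidence.items() if c < core_threshold)
    return core, peripheral, confidence
```

Peripheral genes are returned rather than discarded, since the abstract describes them as candidates for later inclusion or exclusion.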


Subject(s)
Gene Expression Profiling/statistics & numerical data , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Animals , Artificial Intelligence , Cell Transformation, Neoplastic/drug effects , Cell Transformation, Neoplastic/genetics , Computational Biology , Databases, Genetic , Decision Trees , Genomics/statistics & numerical data , Humans , Leukemia, Myeloid, Acute/genetics , Mice , Transforming Growth Factor beta/pharmacology
5.
Int J Comput Biol Drug Des ; 1(3): 275-94, 2008.
Article in English | MEDLINE | ID: mdl-20054993

ABSTRACT

Current breast cancer predictive signatures are not unique. Can we use this fact to our advantage to improve prediction? From the machine learning perspective, it is well known that combining multiple classifiers can improve classification performance. We propose an ensemble machine learning approach which consists of choosing feature subsets and learning predictive models from them. We then combine models based on certain model fusion criteria and we also introduce a tuning parameter to control sensitivity. Our method significantly improves classification performance with a particular emphasis on sensitivity which is critical to avoid misclassifying poor prognosis patients as good prognosis.
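The idea of fusing models built on different feature subsets, with a parameter that controls sensitivity, can be sketched with a simple vote threshold. The abstract does not specify the fusion criteria or the base learners, so the threshold rule `tau`, the toy base models, and all names below are assumptions.

```python
def make_threshold_model(features, cutoff):
    """Build a toy base model over one feature subset.

    It predicts 1 (poor prognosis) when the mean of the selected features
    exceeds the cutoff. This stands in for a real trained classifier.
    """
    def model(sample):
        score = sum(sample[f] for f in features) / len(features)
        return 1 if score > cutoff else 0
    return model


def ensemble_predict(models, sample, tau=0.5):
    """Predict 1 when at least a fraction `tau` of the models vote 1.

    Lowering tau raises sensitivity: fewer votes are needed to flag a
    patient as poor prognosis, at the cost of more false positives.
    """
    votes = sum(model(sample) for model in models)
    return 1 if votes / len(models) >= tau else 0
```

With three models and a sample that only one model flags, `tau=0.5` yields a good-prognosis call, while `tau=0.3` flags the patient as poor prognosis.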


Subject(s)
Artificial Intelligence , Breast Neoplasms/genetics , Breast Neoplasms/mortality , Survival Analysis , Algorithms , Breast Neoplasms/classification , Computational Biology , Computer Simulation , Databases, Genetic , Female , Gene Expression Profiling/statistics & numerical data , Humans , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Prognosis