1.
Cancers (Basel) ; 14(5), 2022 Feb 24.
Article in English | MEDLINE | ID: mdl-35267493

ABSTRACT

Cancer tissue-of-origin specific biomarkers are needed for effective diagnosis, monitoring, and treatment of cancers. In this study, we analyzed transcriptomics data from 37 cancer types provided by The Cancer Genome Atlas (TCGA) to identify cancer tissue-of-origin specific gene expression signatures. We developed a deep neural network model to classify cancers based on gene expression data. The model achieved a predictive accuracy of >97% across cancer types, indicating the presence of distinct cancer tissue-of-origin specific gene expression signatures. We interpreted the model using Shapley additive explanations to identify specific gene signatures that significantly contributed to cancer-type classification. We evaluated the model and the validity of gene signatures using an independent test data set from the International Cancer Genome Consortium. In conclusion, we present a robust neural network model for accurate classification of cancers based on gene expression data and also provide a list of gene signatures that are valuable for developing biomarker panels for determining cancer tissue-of-origin. These gene signatures serve as valuable biomarkers for determining tissue-of-origin for cancers of unknown primary.
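
As a rough illustration of the attribution idea behind Shapley additive explanations, the sketch below computes exact Shapley values for a hypothetical three-gene scoring function by enumerating feature subsets. The function `toy_score`, the expression values, and the all-zero baseline are assumptions for illustration only; the study's actual network and SHAP configuration are not described in this abstract, and practical tools approximate these values rather than enumerating every subset.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample by enumerating feature subsets."""
    n = len(x)
    features = range(n)

    def value(subset):
        # Features in the subset take the sample's value; the rest take the baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in features]
        return predict(mixed)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy "model": a score for one cancer type from three gene expression values.
def toy_score(expr):
    return 2.0 * expr[0] + 0.5 * expr[1] * expr[2]


sample = [1.0, 2.0, 3.0]      # assumed expression values
baseline = [0.0, 0.0, 0.0]    # assumed reference sample
contributions = shapley_values(toy_score, sample, baseline)
```

The contributions sum to the difference between the score for the sample and the score for the baseline, which is the property that makes per-gene attributions comparable across cancer types.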

2.
EBioMedicine ; 76: 103759, 2022 Feb.
Article in English | MEDLINE | ID: mdl-35033986

ABSTRACT

BACKGROUND: While blood transfusion is an essential cornerstone of hematological care, patients requiring repetitive transfusion remain at persistent risk of alloimmunization due to the diversity of human blood group polymorphisms. Despite the promise, user-friendly methods to accurately identify blood types from next-generation sequencing data are currently lacking. To address this unmet need, we have developed RBCeq, a novel genetic blood typing algorithm to accurately identify 36 blood group systems. METHODS: RBCeq can predict complex blood groups, such as RH and ABO, that require identification of small indels and copy number variants. RBCeq also reports clinically significant, rare, and novel variants with potential clinical relevance that may lead to the identification of novel blood group alleles. FINDINGS: The RBCeq algorithm demonstrated 99·07% concordance when validated on 402 samples, which included 29 antigens with serology validation and 9 antigens with SNP-array validation in 14 blood group systems, and 59 antigens validated against manually predicted phenotypes from variant call files. We have also developed a user-friendly web server that generates detailed blood typing reports with advanced visualization (https://www.rbceq.org/). INTERPRETATION: RBCeq will assist blood banks and immunohematology laboratories by overcoming existing methodological limitations like scalability, reproducibility, and accuracy when genotyping and phenotyping in multi-ethnic populations. This Amazon Web Services (AWS) cloud-based platform has the potential to reduce pre-transfusion testing time and to increase sample processing throughput, ultimately improving quality of patient care. FUNDING: This work was supported in part by Advance Queensland Research Fellowship, MRFF Genomics Health Futures Mission (76,757), and the Australian Red Cross Lifeblood.
The Australian governments fund the Australian Red Cross Lifeblood for the provision of blood, blood products and services to the Australian community.


Subject(s)
Blood Group Antigens , Blood Grouping and Crossmatching , Algorithms , Australia , Blood Group Antigens/genetics , Genotype , Humans , Reproducibility of Results
3.
Front Genet ; 9: 250, 2018.
Article in English | MEDLINE | ID: mdl-30065749

ABSTRACT

Assay for Transposase Accessible Chromatin with high-throughput sequencing (ATAC-seq) is a powerful genomic technology that is used for the global mapping and analysis of open chromatin regions. However, for users to process and analyze such data they either have to use a number of complicated bioinformatic tools or attempt to use the currently available ATAC-seq analysis software, which is not very user-friendly and lacks visualization of the ATAC-seq results. Because of these issues, biologists with minimal bioinformatics background who wish to process and analyze their own ATAC-seq data will find these tasks difficult and ultimately will need to seek help from bioinformatics experts. Moreover, none of the available tools provides a complete solution for ATAC-seq data analysis. Therefore, to enable non-programming researchers to analyze ATAC-seq data on their own, we developed a tool called Graphical User interface for the Analysis and Visualization of ATAC-seq data (GUAVA). GUAVA is standalone software that provides users with a seamless solution from beginning to end, including adapter trimming, read mapping, the identification and differential analysis of ATAC-seq peaks, functional annotation, and the visualization of ATAC-seq results. We believe GUAVA will be a highly useful and time-saving tool for analyzing ATAC-seq data for biologists with minimal or no bioinformatics background. Since GUAVA can also operate from the command line, it can easily be integrated into existing pipelines, thus providing flexibility to users with computational experience.

4.
Database (Oxford) ; 2018, 2018 Jan 01.
Article in English | MEDLINE | ID: mdl-29992322

ABSTRACT

The identification and functional characterization of novel biomarkers in cancer requires survival analysis and gene expression analysis of both patient samples and cell line models. To help facilitate this process, we have developed KM-Express. KM-Express holds an extensive manually curated transcriptomic profile of 45 different datasets for prostate and breast cancer with phenotype and pathoclinical information, spanning from clinical samples to cell lines. KM-Express also contains The Cancer Genome Atlas datasets for 30 other cancer types with matching cell line expression data for 23 of them. We present KM-Express as a hypothesis generation tool for researchers to identify potential new prognostic RNA biomarkers as well as targets for further downstream functional cell-based studies. Specifically, KM-Express allows users to compare the expression level of genes in different groups of patients based on molecular, genetic, clinical and pathological status. Moreover, KM-Express aids the design of biological experiments based on the expression profile of the genes in different cell lines. Thus, KM-Express provides a one-stop analysis from bench work to clinical prospects. We have used this tool to successfully evaluate the prognostic potential of previously published biomarkers for prostate cancer and breast cancer. We believe KM-Express will accelerate the translation of biomedical research from bench to bed. Database URL: http://ec2-52-201-246-161.compute-1.amazonaws.com/kmexpress/index.php.
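
The survival analysis mentioned above is commonly done with the Kaplan-Meier product-limit estimator, which the tool's name suggests but the abstract does not state explicitly. The following is a minimal sketch under that assumption; the function `kaplan_meier` and the follow-up data are hypothetical and are not taken from KM-Express.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up time per patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    # Process distinct follow-up times in ascending order.
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Product-limit step: multiply by the fraction surviving this time point.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Patients with an event or censored at t leave the risk set afterwards.
        at_risk -= leaving
    return curve


# Hypothetical follow-up data (months) for a group of patients with high expression of a gene.
high_times = [2, 3, 3, 5, 8]
high_events = [1, 1, 0, 1, 0]
high_curve = kaplan_meier(high_times, high_events)
```

Running the same function on a low-expression group and comparing the two curves is the kind of patient-group comparison the abstract describes; a significance test such as the log-rank test would be a separate step.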


Subject(s)
Biomarkers, Tumor/genetics , Breast Neoplasms/genetics , Gene Expression Regulation, Neoplastic , Internet , Prostatic Neoplasms/genetics , Software , Cell Line, Tumor , Databases, Genetic , Female , Humans , Male , Reproducibility of Results , Survival Analysis
5.
Front Plant Sci ; 7: 1203, 2016.
Article in English | MEDLINE | ID: mdl-27582746

ABSTRACT

Andrographis paniculata is an important medicinal plant containing various bioactive terpenoids and flavonoids. Despite its importance in herbal medicine, no ready-to-use transcript sequence information for this plant is available in public databases. This study deals mainly with the sequencing of RNA from A. paniculata leaf using the Illumina HiSeq™ 2000 platform, followed by de novo transcriptome assembly. A total of 189.22 million high-quality paired reads were generated and 170,724 transcripts were predicted in the primary assembly. Secondary assembly generated a transcriptome size of ~88 Mb with 83,800 clustered transcripts. Based on similarity searches against the plant non-redundant protein database, gene ontology, and eukaryotic orthologous groups, 49,363 transcripts were annotated, constituting up to 58.91% of the identified unigenes. Annotation of transcripts using the Kyoto Encyclopedia of Genes and Genomes database revealed 5606 transcripts plausibly involved in 140 pathways, including biosynthesis of terpenoids and other secondary metabolites. Transcription factor analysis showed 6767 unique transcripts belonging to 97 different transcription factor families. A total of 124 CYP450 transcripts belonging to seven divergent clans were identified. The transcriptome revealed 146 different transcripts coding for enzymes involved in the biosynthesis of terpenoids, of which 35 contained terpene synthase motifs. This study also revealed 32,341 simple sequence repeats (SSRs) in 23,168 transcripts. The assembled transcriptome sequences of A. paniculata generated in this study are made available, for the first time, in the TSA database, providing useful information for functional and comparative genomic analysis as well as for identification of key enzymes involved in the various pathways of secondary metabolism.
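
To make the SSR count above more concrete, here is a minimal sketch of how perfect simple sequence repeats can be located in a transcript with a regular expression. The abstract names neither the tool nor the thresholds used, so the minimum repeat counts in `MIN_REPEATS`, the function `find_ssrs`, and the example sequence are all assumptions for illustration.

```python
import re

# Assumed thresholds (not from the abstract): minimum repeat count per motif length.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # ([ACGT]{k}) captures a motif; \1{min_rep-1,} requires tandem copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT").
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return sorted(hits)


# Hypothetical transcript fragment containing (AT)6 and (GAA)5.
transcript = "GGC" + "AT" * 6 + "CCG" + "GAA" * 5 + "TT"
ssrs = find_ssrs(transcript)
```

Start positions are 0-based. Dedicated SSR tools also handle compound and imperfect repeats, which this sketch ignores.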

6.
BMC Genomics ; 16: 692, 2015 Sep 15.
Article in English | MEDLINE | ID: mdl-26369665

ABSTRACT

BACKGROUND: Developing drought-tolerant rice varieties with higher yield under water-stressed conditions provides a viable solution to the serious yield-reduction impact of drought. Understanding the molecular regulation of this polygenic trait is crucial for the eventual success of rice molecular breeding programmes. microRNAs have received tremendous attention recently due to their importance in negative regulation. In plants, apart from regulating developmental and physiological processes, microRNAs have also been associated with different biotic and abiotic stresses. Hence, we chose to analyze the differential expression profiles of microRNAs in three drought-treated rice varieties: Vandana (drought-tolerant), Aday Sel (drought-tolerant) and IR64 (drought-susceptible) in greenhouse conditions via high-throughput sequencing. RESULTS: Twenty-six novel microRNA candidates involved in the regulation of diverse biological processes were identified based on the detection of miRNA*. Out of their 110 predicted targets, we confirmed 16 targets from 5 novel microRNA candidates. In the differential expression analysis, mature microRNA members from 49 families of known Oryza sativa microRNAs were differentially expressed in leaf and stem respectively, with over 28 families having at least one similar mature microRNA member differentially expressed in both tissues. Via the sequence profiling data of leaf samples, we identified osa-miR397a/b, osa-miR398b, osa-miR408-5p and osa-miR528-5p as being down-regulated in the two drought-tolerant rice varieties and up-regulated in the drought-susceptible variety. These microRNAs are known to be involved in regulating starch metabolism, antioxidant defence, respiration and photosynthesis. A wide range of biological processes were found to be regulated by the target genes of all the identified differentially expressed microRNAs in both tissues, namely root development (5.3-5.7%), cell transport (13.2-18.4%), response to stress (10.5-11.3%), lignin catabolic process (3.8-5.3%), metabolic processes (32.1-39.5%), oxidation-reduction process (9.4-13.2%) and DNA replication (5.7-7.9%). The predicted target genes of osa-miR166e-3p, osa-miR166h-5p*, osa-miR169r-3p* and osa-miR397a/b were found to be annotated to several of the aforementioned biological processes. CONCLUSIONS: The experimental design of this study, which features rice varieties with different drought tolerance and tissue specificity (leaf and stem), has provided new microRNA profiling information. The potential regulatory importance of the microRNA genes mentioned above and their target genes requires further functional analyses.


Subject(s)
Adaptation, Biological/genetics , Droughts , Gene Expression Regulation, Plant , MicroRNAs/genetics , Multigene Family , Oryza/genetics , Chromosome Mapping , Computational Biology/methods , Databases, Genetic , Gene Expression Profiling , Gene Regulatory Networks , High-Throughput Nucleotide Sequencing , Molecular Sequence Annotation , Oryza/metabolism , Plant Leaves/genetics , Plant Leaves/metabolism , Signal Transduction , Transcriptome