Search | VHL Regional Portal

1.

27 MHz constant field dielectric warming of kidneys cryopreserved by vitrification.

Wowk, Brian; Phan, John; Pagotan, Roberto; Galvez, Erika; Fahy, Gregory M.

Cryobiology ; 115: 104893, 2024 Jun.

Article in English | MEDLINE | ID: mdl-38609033

ABSTRACT

Organs cryopreserved by vitrification are exposed to the lowest possible concentration of cryoprotectants for the least time necessary to successfully avoid ice formation. Faster cooling and warming rates enable lower concentrations and perfusion times, reducing toxicity. Since warming rates necessary to avoid ice formation during recovery from vitrification are typically faster than cooling rates necessary for vitrification, warming speed is a major determining factor for successful vitrification. Dielectric warming uses an oscillating electric field to directly heat water and cryoprotectant molecules inside organs to achieve warming that's faster and more uniform than can be achieved by heat conduction from the organ surface. This work studied 27 MHz dielectric warming of rabbit kidneys perfused with M22 vitrification solution. The 27 MHz frequency was chosen because its long wavelength and penetration depth are suitable for human organs, because it had an anticipated favorable temperature of maximum dielectric absorption in M22, and because it's an allocated frequency for industrial and amateur use with inexpensive amplifiers available. Previously vitrified kidneys were warmed from -100 °C by placement in a 27 MHz electric field formed between parallel capacitor plates in a resonant circuit. Power was varied during warming to maintain constant electric field amplitude between the plates. Maximum power absorption occurred near -70 °C, with a peak warming rate near 150 °C/min in 50 mL total volume with approximately 500 W power. After some optimization, it was possible to warm ~13 g vitrified kidneys with unprecedentedly little injury from medullary ice formation and a favorable serum creatinine trend after transplant. Distinct behaviors of power absorption and system tuning observed as a function of temperature during warming are promising for non-invasive thermometry and future automated control of the warming process at even faster rates with user-defined temperature dependence.
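
As a rough consistency check on the figures quoted in this abstract, the sketch below converts the reported peak warming rate to kelvin per second and computes the effective heat capacity implied by the energy balance P = C * dT/dt. It assumes that all of the roughly 500 W ends up as heat in the 50 mL load, which the abstract does not state, so the result is only an upper bound. The function name and its parameters are illustrative.

```python
def implied_heat_capacity(power_w, rate_c_per_min):
    """Return (rate in K/s, heat capacity in J/K) from P = C * dT/dt.

    Assumes every watt of applied power ends up as heat in the load,
    which the abstract does not claim; treat C as an upper bound.
    """
    rate_k_per_s = rate_c_per_min / 60.0
    return rate_k_per_s, power_w / rate_k_per_s


rate, capacity = implied_heat_capacity(power_w=500.0, rate_c_per_min=150.0)
per_ml = capacity / 50.0  # 50 mL total volume, as quoted in the abstract
print(f"{rate:.2f} K/s, {capacity:.0f} J/K, {per_ml:.1f} J/(mL*K)")
```

With the abstract's numbers this gives 2.5 K/s and at most about 200 J/K, or 4 J/(mL*K). Any power lost outside the load would lower the true value.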

Subjects

Cryopreservation , Cryoprotective Agents , Kidney , Vitrification , Animals , Rabbits , Cryopreservation/methods , Cryoprotective Agents/chemistry , Hot Temperature , Organ Preservation/methods , Organ Preservation/instrumentation

2.

Finding Candida auris in public metagenomic repositories.

Mario-Vasquez, Jorge E; Bagal, Ujwal R; Lowe, Elijah; Morgulis, Aleksandr; Phan, John; Sexton, D Joseph; Shiryev, Sergey; Slatkevicius, Rytis; Welsh, Rory; Litvintseva, Anastasia P; Blumberg, Matthew; Agarwala, Richa; Chow, Nancy A.

PLoS One ; 19(1): e0291406, 2024.

Article in English | MEDLINE | ID: mdl-38241320

ABSTRACT

Candida auris is a newly emerged multidrug-resistant fungus capable of causing invasive infections with high mortality. Despite intense efforts to understand how this pathogen rapidly emerged and spread worldwide, its environmental reservoirs are poorly understood. Here, we present a collaborative effort between the U.S. Centers for Disease Control and Prevention, the National Center for Biotechnology Information, and GridRepublic (a volunteer computing platform) to identify C. auris sequences in publicly available metagenomic datasets. We developed the MetaNISH pipeline that uses SRPRISM to align sequences to a set of reference genomes and computes a score for each reference genome. We used MetaNISH to scan ~300,000 SRA metagenomic runs from 2010 onwards and identified five datasets containing C. auris reads. Finally, GridRepublic has implemented a prospective C. auris molecular monitoring system using MetaNISH and volunteer computing.

Subjects

Candida , Candidiasis , Humans , Candida/genetics , Candidiasis/microbiology , Candida auris , Prospective Studies , Metagenomics , Antifungal Agents/therapeutic use

3.

A cloud-based resource for genome coordinate-based exploration and large-scale analysis of chromosome aberrations and gene fusions in cancer.

Wang, Janet; Zheng, Jeanne; Lee, Elaine E; Aguilar, Boris; Phan, John; Abdilleh, Kawther; Taylor, Ronald C; Longabaugh, William; Johansson, Bertil; Mertens, Fredrik; Mitelman, Felix; Pot, David; LaFramboise, Thomas.

Genes Chromosomes Cancer ; 62(8): 441-448, 2023 Aug.

Article in English | MEDLINE | ID: mdl-36695636

ABSTRACT

Cytogenetic analysis provides important information on the genetic mechanisms of cancer. The Mitelman Database of Chromosome Aberrations and Gene Fusions in Cancer (Mitelman DB) is the largest catalog of acquired chromosome aberrations, presently comprising >70 000 cases across multiple cancer types. Although this resource has enabled the identification of chromosome abnormalities leading to specific cancers and cancer mechanisms, a large-scale, systematic analysis of these aberrations and their downstream implications has been difficult due to the lack of a standard, automated mapping from aberrations to genomic coordinates. We previously introduced CytoConverter as a tool that automates such conversions. CytoConverter has now been updated with improved interpretation of karyotypes and has been integrated with the Mitelman DB, providing a comprehensive mapping of the 70 000+ cases to genomic coordinates, as well as visualization of the frequencies of chromosomal gains and losses. Importantly, all CytoConverter-generated genomic coordinates are publicly available in Google BigQuery, a cloud-based data warehouse, facilitating data exploration and integration with other datasets hosted by the Institute for Systems Biology Cancer Gateway in the Cloud (ISB-CGC) Resource. We demonstrate the use of BigQuery for integrative analysis of Mitelman DB with other cancer datasets, including a comparison of the frequency of imbalances identified in Mitelman DB cases with those found in The Cancer Genome Atlas (TCGA) copy number datasets. This solution provides opportunities to leverage the power of cloud computing for low-cost, scalable, and integrated analysis of chromosome aberrations and gene fusions in cancer.

Subjects

Cloud Computing , Neoplasms , Humans , Chromosome Aberrations , Karyotyping , Neoplasms/genetics , Gene Fusion

4.

MycoSNP: A Portable Workflow for Performing Whole-Genome Sequencing Analysis of Candida auris.

Bagal, Ujwal R; Phan, John; Welsh, Rory M; Misas, Elizabeth; Wagner, Darlene; Gade, Lalitha; Litvintseva, Anastasia P; Cuomo, Christina A; Chow, Nancy A.

Methods Mol Biol ; 2517: 215-228, 2022.

Article in English | MEDLINE | ID: mdl-35674957

ABSTRACT

Candida auris is an urgent public health threat characterized by high rates of drug resistance and rapid spread in healthcare settings worldwide. As part of the C. auris response, molecular surveillance has helped public health officials track the global spread and investigate local outbreaks. Here, we describe whole-genome sequencing analysis methods used for routine C. auris molecular surveillance in the United States; methods include reference selection, reference preparation, quality assessment and control of sequencing reads, read alignment, and single-nucleotide polymorphism calling and filtration. We also describe the newly developed pipeline MycoSNP, a portable workflow for performing whole-genome sequencing analysis of fungal organisms including C. auris.

Subjects

Candida auris , Candidiasis , Antifungal Agents/therapeutic use , Candida auris/genetics , Candidiasis/microbiology , Humans , United States , Whole Genome Sequencing , Workflow

5.

SL-Cloud: A Cloud-based resource to support synthetic lethal interaction discovery.

Tercan, Bahar; Qin, Guangrong; Kim, Taek-Kyun; Aguilar, Boris; Phan, John; Longabaugh, William; Pot, David; Kemp, Christopher J; Chambwe, Nyasha; Shmulevich, Ilya.

F1000Res ; 11: 493, 2022.

Article in English | MEDLINE | ID: mdl-36761837

ABSTRACT

Synthetic lethal interactions (SLIs), genetic interactions in which the simultaneous inactivation of two genes leads to a lethal phenotype, are promising targets for therapeutic intervention in cancer, as exemplified by the recent success of PARP inhibitors in treating BRCA1/2-deficient tumors. We present SL-Cloud, a new component of the Institute for Systems Biology Cancer Gateway in the Cloud (ISB-CGC), that provides an integrated framework of cloud-hosted data resources and curated workflows to enable facile prediction of SLIs. This resource addresses two main challenges related to SLI inference: the need to wrangle and preprocess large multi-omic datasets and the availability of multiple comparable prediction approaches. SL-Cloud enables customizable computational inference of SLIs and testing of prediction approaches across multiple datasets. We anticipate that cancer researchers will find utility in this tool for discovery of SLIs to support further investigation into potential drug targets for anticancer therapies.

Subjects

Cloud Computing , Neoplasms , Humans , Neoplasms/genetics , Systems Biology , Multiomics

6.

Impact of RNA-seq data analysis algorithms on gene expression estimation and downstream prediction.

Tong, Li; Wu, Po-Yen; Phan, John H; Hassazadeh, Hamid R; Tong, Weida; Wang, May D.

Sci Rep ; 10(1): 17925, 2020 Oct 21.

Article in English | MEDLINE | ID: mdl-33087762

ABSTRACT

To use next-generation sequencing technology such as RNA-seq for medical and health applications, choosing proper analysis methods for biomarker identification remains a critical challenge for most users. The US Food and Drug Administration (FDA) has led the Sequencing Quality Control (SEQC) project to conduct a comprehensive investigation of 278 representative RNA-seq data analysis pipelines consisting of 13 sequence mapping, three quantification, and seven normalization methods. In this article, we focused on the impact of the joint effects of RNA-seq pipelines on gene expression estimation as well as the downstream prediction of disease outcomes. First, we developed and applied three metrics (i.e., accuracy, precision, and reliability) to quantitatively evaluate each pipeline's performance on gene expression estimation. We then investigated the correlation between the proposed metrics and the downstream prediction performance using two real-world cancer datasets (i.e., SEQC neuroblastoma dataset and the NIH/NCI TCGA lung adenocarcinoma dataset). We found that RNA-seq pipeline components jointly and significantly impacted the accuracy of gene expression estimation, and its impact was extended to the downstream prediction of these cancer outcomes. Specifically, RNA-seq pipelines that produced more accurate, precise, and reliable gene expression estimation tended to perform better in the prediction of disease outcome. In the end, we provided scenarios as guidelines for users to use these three metrics to select sensible RNA-seq pipelines for the improved accuracy, precision, and reliability of gene expression estimation, which lead to the improved downstream gene expression-based prediction of disease outcome.

Subjects

Gene Expression , High-Throughput Nucleotide Sequencing/methods , Neoplasms/genetics , Data Analysis , Datasets as Topic , Humans , Microarray Analysis , Predictive Value of Tests , Prognosis , Quality Control

7.

Using targeted next-generation sequencing to characterize genetic differences associated with insecticide resistance in Culex quinquefasciatus populations from the southern U.S.

Kothera, Linda; Phan, John; Ghallab, Enas; Delorey, Mark; Clark, Rebecca; Savage, Harry M.

PLoS One ; 14(7): e0218397, 2019.

Article in English | MEDLINE | ID: mdl-31269040

ABSTRACT

Resistance to insecticides can hamper the control of mosquitoes such as Culex quinquefasciatus, known to vector arboviruses such as West Nile virus and others. The strong selective pressure exerted on a mosquito population by the use of insecticides can result in heritable genetic changes associated with resistance. We sought to characterize genetic differences between insecticide resistant and susceptible Culex quinquefasciatus mosquitoes using targeted DNA sequencing. To that end, we developed a panel of 122 genes known or hypothesized to be involved in insecticide resistance, and used an Ion Torrent PGM sequencer to sequence 125 unrelated individuals from seven populations in the southern U.S. whose resistance phenotypes to permethrin and malathion were known from previous CDC bottle bioassay testing. Data analysis consisted of discovering single nucleotide polymorphisms (SNPs) and genes with evidence of copy number variants (CNVs) statistically associated with resistance. Ten of the seventeen genes found to be present in higher copy numbers were experimentally validated with real-time PCR. Of those, six, including the gene with the knock-down resistance (kdr) mutation, showed evidence of a ≥1.5-fold increase compared to control DNA. The SNP analysis revealed 228 unique SNPs that had significant p-values for both a Fisher's Exact Test and the Cochran-Armitage Test for Trend. We calculated the population frequency for each of the 64 nonsynonymous SNPs in this group. Several genes not previously well characterized represent potential candidates for diagnostic assays when further validation is conducted.
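
For readers unfamiliar with Fisher's Exact Test, which this abstract names as one of the two SNP association tests, the following is a minimal standard-library sketch of the two-sided test on a 2x2 table. It is not the authors' analysis code, the abstract does not describe how the tables were built, and the counts in the example are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(k):
        # Probability that the top-left cell equals k given fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / total

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Hypothetical counts: rows are resistant / susceptible mosquitoes,
# columns are carriers / non-carriers of one SNP allele.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))
```

A SNP would count as associated only if this p-value and the trend-test p-value both fall below the chosen significance level. The abstract does not give that level.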

Subjects

Culex/genetics , Insecticide Resistance , Insecticides/pharmacology , Malathion/pharmacology , Mutation , Permethrin/pharmacology , Single Nucleotide Polymorphism , Animals , Arizona , High-Throughput Nucleotide Sequencing , Insecticide Resistance/genetics , Louisiana , Texas

8.

Discovery of Lipidome Alterations Following Traumatic Brain Injury via High-Resolution Metabolomics.

Hogan, Scott R; Phan, John H; Alvarado-Velez, Melissa; Wang, May Dongmei; Bellamkonda, Ravi V; Fernández, Facundo M; LaPlaca, Michelle C.

J Proteome Res ; 17(6): 2131-2143, 2018 Jun 01.

Article in English | MEDLINE | ID: mdl-29671324

ABSTRACT

Traumatic brain injury (TBI) can occur across wide segments of the population, presenting in a heterogeneous manner that makes diagnosis inconsistent and management challenging. Biomarkers offer the potential to objectively identify injury status, severity, and phenotype by measuring the relative concentrations of endogenous molecules in readily accessible biofluids. Through a data-driven, discovery approach, novel biomarker candidates for TBI were identified in the serum lipidome of adult male Sprague-Dawley rats in the first week following moderate controlled cortical impact (CCI). Serum samples were analyzed in positive and negative modes by ultraperformance liquid chromatography-mass spectrometry (UPLC-MS). A predictive panel for the classification of injured and uninjured sera samples, consisting of 26 dysregulated species belonging to a variety of lipid classes, was developed with a cross-validated accuracy of 85.3% using omniClassifier software to optimize feature selection. Polyunsaturated fatty acids (PUFAs) and PUFA-containing diacylglycerols were found to be upregulated in sera from injured rats, while changes in sphingolipids and other membrane phospholipids were also observed, many of which map to known secondary injury pathways. Overall, the identified biomarker panel offers viable molecular candidates representing lipids that may readily cross the blood-brain barrier (BBB) and aid in the understanding of TBI pathophysiology.

Subjects

Biomarkers/blood , Traumatic Brain Injuries/metabolism , Lipid Metabolism , Metabolomics/methods , Animals , Traumatic Brain Injuries/blood , Traumatic Brain Injuries/diagnosis , Liquid Chromatography , Male , Rats , Sprague-Dawley Rats , Software , Tandem Mass Spectrometry

9.

Integration of Multi-Modal Biomedical Data to Predict Cancer Grade and Patient Survival.

Phan, John H; Hoffman, Ryan; Kothari, Sonal; Wu, Po-Yen; Wang, May D.

IEEE EMBS Int Conf Biomed Health Inform ; 2016: 577-580, 2016 Feb.

Article in English | MEDLINE | ID: mdl-27493999

ABSTRACT

The Big Data era in biomedical research has resulted in large-cohort data repositories such as The Cancer Genome Atlas (TCGA). These repositories routinely contain hundreds of matched patient samples for genomic, proteomic, imaging, and clinical data modalities, enabling holistic and multi-modal integrative analysis of human disease. Using TCGA renal and ovarian cancer data, we conducted a novel investigation of multi-modal data integration by combining histopathological image and RNA-seq data. We compared the performances of two integrative prediction methods: majority vote and stacked generalization. Results indicate that integration of multiple data modalities improves prediction of cancer grade and outcome. Specifically, stacked generalization, a method that integrates multiple data modalities to produce a single prediction result, outperforms both single-data-modality prediction and majority vote. Moreover, stacked generalization reveals the contribution of each data modality (and specific features within each data modality) to the final prediction result and may provide biological insights to explain prediction performance.
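
To make the contrast between the two integration methods named in this abstract concrete, here is a minimal standard-library sketch. It is not the authors' implementation: the logistic-regression meta-learner, the centering of probabilities at 0.5, and all function names are assumptions chosen for illustration.

```python
import math


def majority_vote(labels):
    """Return 1 if more than half of the base-model labels are 1, else 0."""
    return 1 if 2 * sum(labels) > len(labels) else 0


def fit_stacker(base_probs, labels, lr=0.5, epochs=500):
    """Fit a logistic-regression meta-learner on base-model probabilities.

    base_probs: one tuple per sample, one probability per base model.
    Returns (weights, bias); a larger weight means the meta-learner
    leans more heavily on that base model.
    """
    n_models = len(base_probs[0])
    weights, bias = [0.0] * n_models, 0.0
    n = len(labels)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_models, 0.0
        for probs, y in zip(base_probs, labels):
            x = [p - 0.5 for p in probs]  # centre so 0.5 means "no opinion"
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            err = 1.0 / (1.0 + math.exp(-z)) - y
            grad_b += err
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


def stacked_predict(weights, bias, probs):
    """Combine one sample's base-model probabilities into a 0/1 label."""
    z = bias + sum(w * (p - 0.5) for w, p in zip(weights, probs))
    return 1 if z > 0 else 0
```

Majority vote treats every base model equally, whereas the fitted weights indicate how much each modality contributes to the combined prediction. In practice the meta-learner should be trained on out-of-fold base-model predictions so that it does not reward overfit base models.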

10.

A Multi-Modal Graph-Based Semi-Supervised Pipeline for Predicting Cancer Survival.

Hassanzadeh, Hamid Reza; Phan, John H; Wang, May D.

Proceedings (IEEE Int Conf Bioinformatics Biomed) ; 2016: 184-189, 2016 Dec.

Article in English | MEDLINE | ID: mdl-32655981

ABSTRACT

Cancer survival prediction is an active area of research that can help prevent unnecessary therapies and improve patients' quality of life. Gene expression profiling is being widely used in cancer studies to discover informative biomarkers that aid prediction of different clinical endpoints. We use multiple modalities of data derived from RNA deep-sequencing (RNA-seq) to predict survival of cancer patients. Despite the wealth of information available in expression profiles of cancer tumors, fulfilling the aforementioned objective remains a big challenge, for the most part, due to the paucity of data samples compared to the high dimension of the expression profiles. As such, analysis of transcriptomic data modalities calls for state-of-the-art big-data analytics techniques that can maximally use all the available data to discover the relevant information hidden within a significant amount of noise. In this paper, we propose a pipeline that predicts cancer patients' survival by exploiting the structure of the input (manifold learning) and by leveraging the unlabeled samples using Laplacian support vector machines, a graph-based semi-supervised learning (GSSL) paradigm. We show that under certain circumstances, no single modality per se will result in the best accuracy, and that by fusing different models together via a stacked generalization strategy, we may boost the accuracy synergistically. We apply our approach to two cancer datasets and present promising results. We maintain that a similar pipeline can be used for predictive tasks where labeled samples are expensive to acquire.

11.

Comparison of RNA-seq and microarray-based models for clinical endpoint prediction.

Zhang, Wenqian; Yu, Ying; Hertwig, Falk; Thierry-Mieg, Jean; Zhang, Wenwei; Thierry-Mieg, Danielle; Wang, Jian; Furlanello, Cesare; Devanarayan, Viswanath; Cheng, Jie; Deng, Youping; Hero, Barbara; Hong, Huixiao; Jia, Meiwen; Li, Li; Lin, Simon M; Nikolsky, Yuri; Oberthuer, André; Qing, Tao; Su, Zhenqiang; Volland, Ruth; Wang, Charles; Wang, May D; Ai, Junmei; Albanese, Davide; Asgharzadeh, Shahab; Avigad, Smadar; Bao, Wenjun; Bessarabova, Marina; Brilliant, Murray H; Brors, Benedikt; Chierici, Marco; Chu, Tzu-Ming; Zhang, Jibin; Grundy, Richard G; He, Min Max; Hebbring, Scott; Kaufman, Howard L; Lababidi, Samir; Lancashire, Lee J; Li, Yan; Lu, Xin X; Luo, Heng; Ma, Xiwen; Ning, Baitang; Noguera, Rosa; Peifer, Martin; Phan, John H; Roels, Frederik; Rosswog, Carolina.

Genome Biol ; 16: 133, 2015 Jun 25.

Article in English | MEDLINE | ID: mdl-26109056

ABSTRACT

BACKGROUND: Gene expression profiling is being widely applied in cancer research to identify biomarkers for clinical endpoint prediction. Since RNA-seq provides a powerful tool for transcriptome-based applications beyond the limitations of microarrays, we sought to systematically evaluate the performance of RNA-seq-based and microarray-based classifiers in this MAQC-III/SEQC study for clinical endpoint prediction using neuroblastoma as a model. RESULTS: We generate gene expression profiles from 498 primary neuroblastomas using both RNA-seq and 44 k microarrays. Characterization of the neuroblastoma transcriptome by RNA-seq reveals that more than 48,000 genes and 200,000 transcripts are being expressed in this malignancy. We also find that RNA-seq provides much more detailed information on specific transcript expression patterns in clinico-genetic neuroblastoma subgroups than microarrays. To systematically compare the power of RNA-seq and microarray-based models in predicting clinical endpoints, we divide the cohort randomly into training and validation sets and develop 360 predictive models on six clinical endpoints of varying predictability. Evaluation of factors potentially affecting model performances reveals that prediction accuracies are most strongly influenced by the nature of the clinical endpoint, whereas technological platforms (RNA-seq vs. microarrays), RNA-seq data analysis pipelines, and feature levels (gene vs. transcript vs. exon-junction level) do not significantly affect performances of the models. CONCLUSIONS: We demonstrate that RNA-seq outperforms microarrays in determining the transcriptomic characteristics of cancer, while RNA-seq and microarray-based models perform similarly in clinical endpoint prediction. Our findings may be valuable to guide future studies on the development of gene expression-based predictive models and their implementation in clinical practice.

Subjects

Gene Expression Profiling , Neuroblastoma/genetics , Oligonucleotide Array Sequence Analysis , RNA Sequence Analysis , Adolescent , Adult , Child , Preschool Child , Endpoint Determination , Female , Humans , Infant , Newborn Infant , Male , Genetic Models , Neuroblastoma/classification , Neuroblastoma/diagnosis , Cultured Tumor Cells , Young Adult

12.

Effect of low-expression gene filtering on detection of differentially expressed genes in RNA-seq data.

Sha, Ying; Phan, John H; Wang, May D.

Annu Int Conf IEEE Eng Med Biol Soc ; 2015: 6461-4, 2015.

Article in English | MEDLINE | ID: mdl-26737772

ABSTRACT

We compare methods for filtering RNA-seq low-expression genes and investigate the effect of filtering on detection of differentially expressed genes (DEGs). Although RNA-seq technology has improved the dynamic range of gene expression quantification, low-expression genes may be indistinguishable from sampling noise. The presence of noisy, low-expression genes can decrease the sensitivity of detecting DEGs. Thus, identification and filtering of these low-expression genes may improve DEG detection sensitivity. Using the SEQC benchmark dataset, we investigate the effect of different filtering methods on DEG detection sensitivity. Moreover, we investigate the effect of RNA-seq pipelines on optimal filtering thresholds. Results indicate that the filtering threshold that maximizes the total number of DEGs closely corresponds to the threshold that maximizes DEG detection sensitivity. Transcriptome reference annotation, expression quantification method, and DEG detection method are statistically significant RNA-seq pipeline factors that affect the optimal filtering threshold.
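
The threshold-selection idea reported in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: using the mean count across samples as the filtering statistic is an assumption, and `count_degs` is a hypothetical caller-supplied function standing in for whatever DEG detection method is used.

```python
def filter_low_expression(counts, threshold):
    """Keep genes whose mean count across samples is at least `threshold`."""
    return {
        gene: values
        for gene, values in counts.items()
        if sum(values) / len(values) >= threshold
    }


def best_threshold(counts, thresholds, count_degs):
    """Return the threshold whose filtered gene set yields the most DEGs.

    `count_degs` is a caller-supplied function (hypothetical here) that
    runs a DEG detection method on the filtered counts and returns the
    number of genes called differentially expressed.
    """
    return max(thresholds, key=lambda t: count_degs(filter_low_expression(counts, t)))
```

Choosing the threshold by DEG count follows the abstract's observation that this threshold closely tracks the one maximizing detection sensitivity, which is useful when no ground truth is available to measure sensitivity directly.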

Subjects

RNA/analysis , RNA Sequence Analysis , Transcriptome , Brain/metabolism , Humans , RNA/chemistry , Real-Time Polymerase Chain Reaction

13.

The impact of RNA-seq aligners on gene expression estimation.

Yang, Cheng; Wu, Po-Yen; Tong, Li; Phan, John H; Wang, May D.

ACM BCB ; 2015: 462-471, 2015 Sep.

Article in English | MEDLINE | ID: mdl-27583310

ABSTRACT

While numerous RNA-seq data analysis pipelines are available, research has shown that the choice of pipeline influences the results of differentially expressed gene detection and gene expression estimation. Gene expression estimation is a key step in RNA-seq data analysis, since the accuracy of gene expression estimates profoundly affects the subsequent analysis. Generally, gene expression estimation involves sequence alignment and quantification, and accurate gene expression estimation requires accurate alignment. However, the impact of aligners on gene expression estimation remains unclear. We address this need by constructing nine pipelines consisting of nine spliced aligners and one quantifier. We then use simulated data to investigate the impact of aligners on gene expression estimation. To evaluate alignment, we introduce three alignment performance metrics, (1) the percentage of reads aligned, (2) the percentage of reads aligned with zero mismatch (ZeroMismatchPercentage), and (3) the percentage of reads aligned with at most one mismatch (ZeroOneMismatchPercentage). We then evaluate the impact of alignment performance on gene expression estimation using three metrics, (1) gene detection accuracy, (2) the number of genes falsely quantified (FalseExpNum), and (3) the number of genes with falsely estimated fold changes (FalseFcNum). We found that among various pipelines, FalseExpNum and FalseFcNum are correlated. Moreover, FalseExpNum is linearly correlated with the percentage of reads aligned and ZeroMismatchPercentage, and FalseFcNum is linearly correlated with ZeroMismatchPercentage. Because of this correlation, the percentage of reads aligned and ZeroMismatchPercentage may be used to assess the performance of gene expression estimation for all RNA-seq datasets.
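
The three alignment performance metrics defined in this abstract can be computed from per-read mismatch counts, as in the sketch below. The abstract does not say whether the percentages are taken over all reads or only over aligned reads; this sketch assumes all reads. The key name `PercentAligned` is hypothetical, while the other two names come from the abstract.

```python
def alignment_metrics(mismatches):
    """Compute three alignment metrics from per-read mismatch counts.

    `mismatches` holds one entry per read: the number of mismatches in
    its alignment, or None if the read was not aligned. Percentages use
    the total number of reads as the denominator (an assumption).
    """
    total = len(mismatches)
    aligned = [m for m in mismatches if m is not None]

    def pct(n):
        return 100.0 * n / total

    return {
        "PercentAligned": pct(len(aligned)),
        "ZeroMismatchPercentage": pct(sum(m == 0 for m in aligned)),
        "ZeroOneMismatchPercentage": pct(sum(m <= 1 for m in aligned)),
    }
```

For example, five reads with mismatch counts 0, 0, 1, 2 and one unaligned read give 80% aligned, 40% with zero mismatches, and 60% with at most one mismatch.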

14.

A semi-supervised method for predicting cancer survival using incomplete clinical data.

Hassanzadeh, Hamid Reza; Phan, John H; Wang, May D.

Annu Int Conf IEEE Eng Med Biol Soc ; 2015: 210-3, 2015.

Article in English | MEDLINE | ID: mdl-26736237

ABSTRACT

Prediction of survival for cancer patients is an open area of research. However, many studies in this area focus on datasets with a large number of patients. We present a novel method that is specifically designed to address the challenge of data scarcity, which is often the case for cancer datasets. Our method is able to use unlabeled data to improve classification by adopting a semi-supervised training approach to learn an ensemble classifier. The results of applying our method to three cancer datasets show the promise of semi-supervised learning for prediction of cancer survival.

Subjects

Algorithms , Factual Databases , Neoplasms/mortality , Female , Humans , Kidney Neoplasms/mortality , Ovarian Neoplasms/mortality , Pancreatic Neoplasms/mortality , Prognosis

15.

Detection of blur artifacts in histopathological whole-slide images of endomyocardial biopsies.

Phan, John H; Bhatia, Ajay K; Cundiff, Caitlin A; Shehata, Bahig M; Wang, May D.

Annu Int Conf IEEE Eng Med Biol Soc ; 2015: 727-30, 2015.

Article in English | MEDLINE | ID: mdl-26736365

ABSTRACT

Histopathological whole-slide images (WSIs) have emerged as an objective and quantitative means for image-based disease diagnosis. However, WSIs may contain acquisition artifacts that affect downstream image feature extraction and quantitative disease diagnosis. We develop a method for detecting blur artifacts in WSIs using distributions of local blur metrics. As features, these distributions enable accurate classification of WSI regions as sharp or blurry. We evaluate our method using over 1000 portions of an endomyocardial biopsy (EMB) WSI. Results indicate that local blur metrics accurately detect blurry image regions.
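
The abstract does not say which local blur metrics were used. As one common stand-in, the sketch below computes the variance of a 4-neighbour Laplacian on each tile of a grayscale image and returns the sorted per-tile values, which could serve as a distribution-based feature for a region. The metric choice, the tiling scheme, and the function names are all assumptions.

```python
def laplacian_variance(tile):
    """Variance of the 4-neighbour Laplacian over a grayscale tile.

    `tile` is a list of equal-length rows of pixel intensities and must
    be at least 3x3. Low variance suggests few sharp edges, i.e. a
    possibly blurry tile.
    """
    responses = []
    for r in range(1, len(tile) - 1):
        for c in range(1, len(tile[0]) - 1):
            responses.append(
                tile[r - 1][c] + tile[r + 1][c] + tile[r][c - 1] + tile[r][c + 1]
                - 4 * tile[r][c]
            )
    mean = sum(responses) / len(responses)
    return sum((v - mean) ** 2 for v in responses) / len(responses)


def blur_metric_distribution(image, tile_size):
    """Return the sorted local blur metric of every full tile in `image`."""
    metrics = []
    for top in range(0, len(image) - tile_size + 1, tile_size):
        for left in range(0, len(image[0]) - tile_size + 1, tile_size):
            tile = [row[left:left + tile_size] for row in image[top:top + tile_size]]
            metrics.append(laplacian_variance(tile))
    return sorted(metrics)
```

A classifier for sharp versus blurry regions would then be trained on summaries of these distributions, such as quantiles or histogram bins. That step is omitted here because the abstract gives no details about it.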

Subjects

Heart , Artifacts , Biopsy , Humans

16.

Cardiovascular transcriptomics and epigenomics using next-generation sequencing: challenges, progress, and opportunities.

Wu, Po-Yen; Chandramohan, Raghu; Phan, John H; Mahle, William T; Gaynor, J William; Maher, Kevin O; Wang, May D.

Circ Cardiovasc Genet ; 7(5): 701-10, 2014 Oct.

Article in English | MEDLINE | ID: mdl-25518043

Subjects

Cardiovascular Diseases/genetics , Cardiovascular Diseases/metabolism , Epigenomics/methods , Epigenomics/trends , High-Throughput Nucleotide Sequencing/methods , High-Throughput Nucleotide Sequencing/trends , Chromatin Immunoprecipitation , Computational Biology/methods , Humans , RNA Sequence Analysis , Transcriptome

17.

Detecting and correcting systematic variation in large-scale RNA sequencing data.

Li, Sheng; Labaj, Pawel P; Zumbo, Paul; Sykacek, Peter; Shi, Wei; Shi, Leming; Phan, John; Wu, Po-Yen; Wang, May; Wang, Charles; Thierry-Mieg, Danielle; Thierry-Mieg, Jean; Kreil, David P; Mason, Christopher E.

Nat Biotechnol ; 32(9): 888-95, 2014 Sep.

Article in English | MEDLINE | ID: mdl-25150837

ABSTRACT

High-throughput RNA sequencing (RNA-seq) enables comprehensive scans of entire transcriptomes, but best practices for analyzing RNA-seq data have not been fully defined, particularly for data collected with multiple sequencing platforms or at multiple sites. Here we used standardized RNA samples with built-in controls to examine sources of error in large-scale RNA-seq studies and their impact on the detection of differentially expressed genes (DEGs). Analysis of variations in guanine-cytosine content, gene coverage, sequencing error rate and insert size allowed identification of decreased reproducibility across sites. Moreover, commonly used methods for normalization (cqn, EDASeq, RUV2, sva, PEER) varied in their ability to remove these systematic biases, depending on sample complexity and initial data quality. Normalization methods that combine data from genes across sites are strongly recommended to identify and remove site-specific effects and can substantially improve RNA-seq studies.

Subjects

RNA Sequence Analysis/methods , Quality Control , Reproducibility of Results

18.

Removing batch effects from histopathological images for enhanced cancer diagnosis.

Kothari, Sonal; Phan, John H; Stokes, Todd H; Osunkoya, Adeboye O; Young, Andrew N; Wang, May D.

IEEE J Biomed Health Inform ; 18(3): 765-72, 2014 May.

Article in English | MEDLINE | ID: mdl-24808220

ABSTRACT

Researchers have developed computer-aided decision support systems for translational medicine that aim to objectively and efficiently diagnose cancer using histopathological images. However, the performance of such systems is confounded by nonbiological experimental variations or "batch effects" that can commonly occur in histopathological data, especially when images are acquired using different imaging devices and patient samples. This is even more problematic in large-scale studies in which cross-laboratory sharing of large volumes of data is necessary. Batch effects can change quantitative morphological image features and decrease the prediction performance. Using four batches of renal tumor images, we compare one image-level and five feature-level batch effect removal methods. Principal component variation analysis shows that batch is a large source of variance in image features. Results show that feature-level normalization methods reduce batch-contributed variance to almost zero. Moreover, feature-level normalization, especially ComBatN, improves cross-batch and combined-batch prediction performance. Compared to no normalization, ComBatN improves performance in 83% and 90% of cross-batch and combined-batch prediction models, respectively.
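
To illustrate what feature-level batch normalization means in this abstract, the sketch below z-scores a single image feature within each batch. This is deliberately not ComBatN or any of the five methods the paper compares, none of which the abstract describes in detail. It is a simpler stand-in that removes per-batch shifts in mean and scale, and the function name is illustrative.

```python
from statistics import mean, pstdev


def standardize_per_batch(values, batches):
    """Z-score one feature within each batch.

    values: feature value per image; batches: batch label per image.
    A constant batch is only mean-centred, to avoid dividing by zero.
    """
    normalized = [0.0] * len(values)
    for batch in set(batches):
        idx = [i for i, b in enumerate(batches) if b == batch]
        mu = mean(values[i] for i in idx)
        sigma = pstdev(values[i] for i in idx) or 1.0
        for i in idx:
            normalized[i] = (values[i] - mu) / sigma
    return normalized
```

After this step two batches whose raw feature values differ only by an offset or a scale factor become directly comparable, which is the property that cross-batch prediction relies on.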

Subjects

Histocytochemistry/methods , Computer-Assisted Image Interpretation/methods , Medical Informatics Applications , Neoplasms/diagnosis , Neoplasms/pathology , Cluster Analysis , Humans , Computer-Assisted Image Processing , Neoplasms/chemistry

19.

Investigation of factors affecting RNA-seq gene expression calls.

Harati, Sahar; Phan, John H; Wang, May D.

Annu Int Conf IEEE Eng Med Biol Soc ; 2014: 5232-5, 2014.

Article in English | MEDLINE | ID: mdl-25571173

ABSTRACT

RNA-seq enables quantification of the human transcriptome. Estimation of gene expression is a fundamental issue in the analysis of RNA-seq data. However, there is an inherent ambiguity in distinguishing between genes with very low expression and experimental or transcriptional noise. We conducted an exploratory investigation of some factors that may affect gene expression calls. We observed that the distributions of reads that map to exonic, intronic, and intergenic regions are distinct. These distributions may provide useful insights into the behavior of gene expression noise. Moreover, we observed that these distributions are qualitatively similar between two sequence mapping algorithms. Finally, we examined the relationship between gene length and gene expression calls, and observed that they are correlated. This preliminary investigation is important for RNA-seq gene expression analysis because it may lead to more effective algorithms for distinguishing between true gene expression and experimental or transcriptional noise.

Subjects

Gene Expression Profiling , RNA Sequence Analysis/methods , Intergenic DNA/genetics , Exons/genetics , Gene Expression Regulation , Humans , Introns/genetics , Transcriptome/genetics

20.

omniClassifier: a Desktop Grid Computing System for Big Data Prediction Modeling.

Phan, John H; Kothari, Sonal; Wang, May D.

ACM BCB ; 2014: 514-523, 2014 Sep.

Article in English | MEDLINE | ID: mdl-27532062

ABSTRACT

Robust prediction models are important for numerous science, engineering, and biomedical applications. However, best-practice procedures for optimizing prediction models can be computationally complex, especially when choosing models from among hundreds or thousands of parameter choices. Computational complexity has further increased with the growth of data in these fields, concurrent with the era of "Big Data". Grid computing is a potential solution to the computational challenges of Big Data. Desktop grid computing, which uses idle CPU cycles of commodity desktop machines, coupled with commercial cloud computing resources can enable research labs to gain easier and more cost-effective access to vast computing resources. We have developed omniClassifier, a multi-purpose prediction modeling application that provides researchers with a tool for conducting machine learning research within the guidelines of recommended best practices. omniClassifier is implemented as a desktop grid computing system using the Berkeley Open Infrastructure for Network Computing (BOINC) middleware. In addition to describing implementation details, we use various gene expression datasets to demonstrate the potential scalability of omniClassifier for efficient and robust Big Data prediction modeling. A prototype of omniClassifier can be accessed at http://omniclassifier.bme.gatech.edu/.
