1.
Bioinformatics ; 40(3), 2024 Mar 04.
Article in English | MEDLINE | ID: mdl-38407991

ABSTRACT

MOTIVATION: Complex tissues are dynamic ecosystems consisting of molecularly distinct yet interacting cell types. Computational deconvolution aims to dissect bulk tissue data into cell type compositions and cell-specific expressions. With few exceptions, most existing deconvolution tools exploit supervised approaches requiring various types of references that may be unreliable or even unavailable for specific tissue microenvironments. RESULTS: We previously developed a fully unsupervised deconvolution method, Convex Analysis of Mixtures (CAM), that enables estimation of cell type composition and expression from bulk tissues. We now introduce the CAM3.0 tool, which improves this framework with three new and highly efficient algorithms, namely, radius-fixed clustering to identify reliable markers, linear programming to detect an initial scatter simplex, and a smart floating search for the optimum latent variable model. The comparative experimental results obtained from both realistic simulations and case studies show that the CAM3.0 tool can help biologists more accurately identify known or novel cell markers, determine cell proportions, and estimate cell-specific expressions, complementing the existing tools particularly when study- or datatype-specific references are unreliable or unavailable. AVAILABILITY AND IMPLEMENTATION: The open-source R scripts of CAM3.0 are freely available at https://github.com/ChiungTingWu/CAM3/ (https://github.com/Bioconductor/Contributions/issues/3205). A user's guide and a vignette are provided.


Subjects
Algorithms , Ecosystem , Gene Expression Profiling/methods , RNA Sequence Analysis/methods
2.
Bioinform Adv ; 2(1): vbac076, 2022.
Article in English | MEDLINE | ID: mdl-36330358

ABSTRACT

Motivation: Data normalization is essential to ensure accurate inference and comparability of gene expression measures across samples or conditions. Ideally, gene expression data should be rescaled based on consistently expressed reference genes. However, when normalizing biologically diverse samples, the most commonly used reference genes exhibit striking expression variability, and size-factor or distribution-based normalization methods can be problematic when the amount of asymmetry in differential expression is significant. Results: We report an efficient and accurate data-driven method, Cosine score-based iterative normalization (Cosbin), to normalize biologically diverse samples. Based on the Cosine scores of cross-condition expression patterns, the Cosbin pipeline iteratively eliminates asymmetric differentially expressed genes, identifies consistently expressed genes, and calculates sample-wise normalization factors. We demonstrate the superior performance and enhanced utility of Cosbin compared with six representative peer methods using both simulation and real multi-omics expression datasets. Implemented in open-source R scripts and specifically designed to address normalization bias due to significant asymmetry in differential expression across multiple conditions, the Cosbin tool complements rather than replaces the existing methods and will allow biologists to more accurately detect true molecular signals among diverse phenotypic groups. Availability and implementation: The R scripts of the Cosbin pipeline are freely available at https://github.com/MinjieSh/Cosbin. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
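
Illustrative sketch (not the Cosbin implementation): the abstract does not give the formula for the Cosine score, so the snippet below assumes one plausible reading, in which a gene's vector of per-condition mean expression is compared with the all-ones direction. Under that assumption, a consistently expressed gene scores close to 1 and a gene expressed in a single condition scores lower. The function name `cosine_to_reference` and the toy values are hypothetical.

```python
import math


def cosine_to_reference(condition_means):
    """Cosine between a gene's per-condition means and the all-ones vector.

    Assumption for illustration only: "consistently expressed" is read as
    "pointing along the all-ones direction". This is not taken from the
    Cosbin source code.
    """
    norm = math.sqrt(sum(x * x for x in condition_means))
    if norm == 0:
        return 0.0  # an all-zero gene carries no directional information
    k = len(condition_means)
    return sum(condition_means) / (norm * math.sqrt(k))


# Toy genes: one flat across three conditions, one specific to the first.
flat_gene = [5.0, 5.0, 5.0]
specific_gene = [10.0, 0.0, 0.0]
print(cosine_to_reference(flat_gene))
print(cosine_to_reference(specific_gene))
```

In an iterative scheme of the kind the abstract describes, genes with low scores would be set aside and normalization factors recomputed from the remaining genes; those steps are not sketched here because the abstract does not specify them.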

3.
Bioinform Adv ; 2(1): vbac037, 2022.
Article in English | MEDLINE | ID: mdl-35673616

ABSTRACT

Motivation: Ideally, a molecularly distinct subtype would be composed of molecular features that are expressed uniquely in the subtype of interest but in no others, so-called marker genes (MGs). MGs play a critical role in the characterization, classification or deconvolution of tissue or cell subtypes. We and others have recognized that the test statistics used by most methods do not exactly satisfy the MG definition and often identify inaccurate MGs. Results: We report an efficient and accurate data-driven method, formulated as a Cosine-based One-sample Test (COT) in scatter space, to detect MGs among many subtypes using subtype expression profiles. Fundamentally different from existing approaches, the test statistic in COT precisely matches the mathematical definition of an ideal MG. We demonstrate the performance and utility of COT on both simulated and real gene expression and proteomics data. The open source Python/R tool will allow biologists to efficiently detect MGs and perform a more comprehensive and unbiased molecular characterization of tissue or cell subtypes in many biomedical contexts. Nevertheless, COT complements rather than replaces existing methods. Availability and implementation: The Python COT software, with a detailed user's manual and a vignette, is freely available at https://github.com/MintaYLu/COT. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
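
Illustrative sketch (not the COT software): by the definition in the abstract, an ideal marker gene is expressed in one subtype and in no others, so its vector of subtype mean expression lies along a single coordinate axis. The snippet assumes the cosine is measured against the axis of the most highly expressed subtype; the exact COT statistic and its significance assessment are not described in the abstract. The name `marker_cosine` is hypothetical.

```python
import math


def marker_cosine(subtype_means):
    """Return (index of top subtype, cosine to that subtype's axis).

    The cosine between a vector x and the unit axis e_k is x[k] / ||x||.
    A value of 1.0 means the gene is expressed in subtype k only.
    Illustrative assumption, not the published test statistic.
    """
    norm = math.sqrt(sum(x * x for x in subtype_means))
    if norm == 0:
        return None, 0.0  # no expression anywhere, so no candidate subtype
    top = max(range(len(subtype_means)), key=lambda k: subtype_means[k])
    return top, subtype_means[top] / norm


print(marker_cosine([0.0, 8.0, 0.0]))  # expressed only in subtype 1
print(marker_cosine([3.0, 4.0, 0.0]))  # shared between subtypes 0 and 1
```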

4.
Sci Rep ; 12(1): 1067, 2022 01 20.
Article in English | MEDLINE | ID: mdl-35058491

ABSTRACT

Missing values are a major issue in quantitative proteomics analysis. While many methods have been developed for imputing missing values in high-throughput proteomics data, a comparative assessment of imputation accuracy remains inconclusive, mainly because mechanisms contributing to true missing values are complex and existing evaluation methodologies are imperfect. Moreover, few studies have provided an outlook on future methodological development. We first re-evaluate the performance of eight representative methods targeting three typical missing mechanisms. These methods are compared on both simulated and masked missing values embedded within real proteomics datasets, and performance is evaluated using three quantitative measures. We then introduce fused regularization matrix factorization, a low-rank global matrix factorization framework, capable of integrating local similarity derived from additional data types. We also explore a biologically inspired latent variable modeling strategy, convex analysis of mixtures, for missing value imputation and present preliminary experimental results. While some winners emerged from our comparative assessment, the evaluation is intrinsically imperfect because performance is evaluated indirectly on artificial missing or masked values rather than authentic missing values. Nevertheless, we show that our fused regularization matrix factorization provides a novel incorporation of external and local information, and the exploratory implementation of convex analysis of mixtures presents a biologically plausible new approach.


Subjects
Statistical Data Interpretation , Proteomics/statistics & numerical data , Algorithms , Proteomics/methods
5.
Bioinformatics ; 38(5): 1403-1410, 2022 02 07.
Article in English | MEDLINE | ID: mdl-34904628

ABSTRACT

MOTIVATION: Complex biological tissues are often a heterogeneous mixture of several molecularly distinct cell subtypes. Both subtype compositions and subtype-specific (STS) expressions can vary across biological conditions. Computational deconvolution aims to dissect patterns of bulk tissue data into subtype compositions and STS expressions. Existing deconvolution methods can only estimate averaged STS expressions in a population, while many downstream analyses such as inferring co-expression networks in particular subtypes require subtype expression estimates in individual samples. However, individual-level deconvolution is a mathematically underdetermined problem because there are more variables than observations. RESULTS: We report a sample-wise Convex Analysis of Mixtures (swCAM) method that can estimate subtype proportions and STS expressions in individual samples from bulk tissue transcriptomes. We extend our previous CAM framework to include a new term accounting for between-sample variations and formulate swCAM as a nuclear-norm and ℓ2,1-norm regularized matrix factorization problem. We determine hyperparameter values using cross-validation with random entry exclusion and obtain a swCAM solution using an efficient alternating direction method of multipliers. Experimental results on realistic simulation data show that swCAM can accurately estimate STS expressions in individual samples and successfully extract co-expression networks in particular subtypes that are otherwise unobtainable using bulk data. In two real-world applications, swCAM analysis of bulk RNASeq data from brain tissue of cases and controls with bipolar disorder or Alzheimer's disease identified significant changes in cell proportion, expression pattern and co-expression module in patient neurons. Comparative evaluation of swCAM versus peer methods is also provided. AVAILABILITY AND IMPLEMENTATION: The R Scripts of swCAM are freely available at https://github.com/Lululuella/swCAM. A user's guide and a vignette are provided. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Gene Expression Profiling , Transcriptome , Humans , Gene Expression Profiling/methods , Computer Simulation
6.
Sci Rep ; 11(1): 332, 2021 01 11.
Article in English | MEDLINE | ID: mdl-33432005

ABSTRACT

Among multiple subtypes of tissue or cell, subtype-specific differentially-expressed genes (SDEGs) are defined as being most-upregulated in only one subtype but not in any other. Detecting SDEGs plays a critical role in the molecular characterization and deconvolution of multicellular complex tissues. Classic differential analysis assumes a null hypothesis whose test statistic is not subtype-specific, thus can produce a high false positive rate and/or lower detection power. Here we first introduce a One-Versus-Everyone Fold Change (OVE-FC) test for detecting SDEGs. We then propose a scaled test statistic (OVE-sFC) for assessing the statistical significance of SDEGs that applies a mixture null distribution model and a tailored permutation test. The OVE-FC/sFC test was validated on both type 1 error rate and detection power using extensive simulation data sets generated from real gene expression profiles of purified subtype samples. The OVE-FC/sFC test was then applied to two benchmark gene expression data sets of purified subtype samples and detected many known or previously unknown SDEGs. Subsequent supervised deconvolution results on synthesized bulk expression data, obtained using the SDEGs detected from the independent purified expression data by the OVE-FC/sFC test, showed superior performance in deconvolution accuracy when compared with popular peer methods.


Subjects
Computational Biology/methods , Gene Expression Profiling , Genetic Models
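
Illustrative sketch (not the authors' code): the abstract for this entry defines an SDEG as most-upregulated in exactly one subtype and in no other. One way to read a "one-versus-everyone" fold change is to compare the top subtype with every other subtype and keep the smallest ratio, which equals the ratio of the highest to the second-highest subtype mean. The snippet assumes linear-scale, non-negative subtype means and at least two subtypes; the name `ove_fc` is hypothetical, and the scaled statistic (OVE-sFC) and permutation test are not sketched.

```python
def ove_fc(subtype_means):
    """Return (index of top subtype, smallest fold change versus the others).

    Assumes linear-scale, non-negative means and at least two subtypes.
    The smallest top-versus-other ratio is top / runner-up, so a large
    value means the top subtype exceeds every other one. This is one
    reading of the abstract, not the published definition.
    """
    order = sorted(range(len(subtype_means)),
                   key=lambda k: subtype_means[k], reverse=True)
    top, runner_up = order[0], order[1]
    if subtype_means[runner_up] <= 0:
        return top, float("inf")  # no other subtype expresses the gene
    return top, subtype_means[top] / subtype_means[runner_up]


print(ove_fc([2.0, 8.0, 4.0]))  # top subtype 1, twice the runner-up
print(ove_fc([1.0, 0.0]))       # expressed in subtype 0 only
```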
7.
Bioinformatics ; 36(12): 3927-3929, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32219387

ABSTRACT

SUMMARY: We develop a fully unsupervised deconvolution method to dissect complex tissues into molecularly distinctive tissue or cell subtypes based on bulk expression profiles. We implement an R package, deconvolution by Convex Analysis of Mixtures (debCAM), that can automatically detect tissue/cell-specific markers, determine the number of constituent subtypes, calculate subtype proportions in individual samples and estimate tissue/cell-specific expression profiles. We demonstrate the performance and biomedical utility of debCAM on gene expression, methylation, proteomics and imaging data. With enhanced data preprocessing and prior knowledge incorporation, the debCAM software tool will allow biologists to perform a more comprehensive and unbiased characterization of tissue remodeling in many biomedical contexts. AVAILABILITY AND IMPLEMENTATION: http://bioconductor.org/packages/debCAM. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Proteomics , Software , Gene Expression
8.
Bioinformatics ; 36(9): 2862-2871, 2020 05 01.
Article in English | MEDLINE | ID: mdl-31950989

ABSTRACT

MOTIVATION: Liquid chromatography-mass spectrometry (LC-MS) is a standard method for proteomics and metabolomics analysis of biological samples. Unfortunately, it suffers from various changes in the retention times (RT) of the same compound in different samples, and these must be subsequently corrected (aligned) during data processing. Classic alignment methods such as in the popular XCMS package often assume a single time-warping function for each sample. Thus, the potentially varying RT drift for compounds with different masses in a sample is neglected in these methods. Moreover, the systematic change in RT drift across run order is often not considered by alignment algorithms. Therefore, these methods cannot effectively correct all misalignments. For a large-scale experiment involving many samples, the existence of misalignment becomes inevitable and concerning. RESULTS: Here, we describe an integrated reference-free profile alignment method, neighbor-wise compound-specific Graphical Time Warping (ncGTW), that can detect misaligned features and align profiles by leveraging expected RT drift structures and compound-specific warping functions. Specifically, ncGTW uses individualized warping functions for different compounds and assigns constraint edges on warping functions of neighboring samples. Validated with both realistic synthetic data and internal quality control samples, ncGTW applied to two large-scale metabolomics LC-MS datasets identifies many misaligned features and successfully realigns them. These features would otherwise be discarded or uncorrected using existing methods. The ncGTW software tool is developed currently as a plug-in to detect and realign misaligned features present in standard XCMS output. AVAILABILITY AND IMPLEMENTATION: An R package of ncGTW is freely available at Bioconductor and https://github.com/ChiungTingWu/ncGTW. A detailed user's manual and a vignette are provided within the package. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Metabolomics , Tandem Mass Spectrometry , Algorithms , Liquid Chromatography , Proteomics , Software
9.
Bioinformatics ; 36(5): 1599-1606, 2020 03 01.
Article in English | MEDLINE | ID: mdl-31596456

ABSTRACT

MOTIVATION: Synapses are essential to neural signal transmission. Therefore, quantification of synapses and related neurites from images is vital to gain insights into the underlying pathways of brain functionality and diseases. Despite the wide availability of synaptic punctum imaging data, several issues are impeding satisfactory quantification of these structures by current tools. First, the antibodies used for labeling synapses are not perfectly specific to synapses. These antibodies may exist in neurites or other cell compartments. Second, the brightness of different neurites and synaptic puncta is heterogeneous due to the variation of antibody concentration and synapse-intrinsic differences. Third, images often have a low signal-to-noise ratio due to constraints of experiment facilities and availability of sensitive antibodies. These issues make the detection of synapses challenging and necessitate developing a new tool to easily and accurately quantify synapses. RESULTS: We present an automatic probability-principled synapse detection algorithm and integrate it into our synapse quantification tool SynQuant. Derived from the theory of order statistics, our method controls the false discovery rate and improves the power of detecting synapses. SynQuant is unsupervised, works for both 2D and 3D data, and can handle multiple staining channels. Through extensive experiments on one synthetic and three real datasets with ground truth annotation or manual labeling, SynQuant was demonstrated to outperform peer specialized unsupervised synapse detection tools as well as generic spot detection methods. AVAILABILITY AND IMPLEMENTATION: Java source code, Fiji plug-in, and test data are available at https://github.com/yu-lab-vt/SynQuant. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Microscopy , Synapses , Algorithms , Software
10.
Sci Rep ; 9(1): 2455, 2019 02 21.
Article in English | MEDLINE | ID: mdl-30792419

ABSTRACT

Most genetic or environmental factors work together in determining complex disease risk. Detecting gene-environment interactions may allow us to elucidate novel and targetable molecular mechanisms by which environmental exposures modify genetic effects. Unfortunately, standard logistic regression (LR) assumes a convenient mathematical structure for the null hypothesis that, however, results in both poor detection power and poor type 1 error control, and it is also susceptible to confounding from missing factors, imperfect surrogates, and disease heterogeneity. Here we describe a new baseline framework, the asymmetric independence model (AIM) in case-control studies, and provide mathematical proofs and simulation studies verifying its validity across a wide range of conditions. We show that AIM, unlike LR, mathematically preserves the asymmetric nature of maintaining health versus acquiring a disease, and thus is more powerful and robust in detecting synergistic interactions. We present examples from four clinically discrete domains where AIM identified interactions that were previously either inconsistent or recognized with less statistical certainty.


Subjects
Esophageal Neoplasms/genetics , Single Nucleotide Polymorphism , Venous Thrombosis/genetics , Algorithms , Case-Control Studies , Gene-Environment Interaction , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Logistic Models , Genetic Models
11.
Sensors (Basel) ; 14(8): 13548-55, 2014 Jul 25.
Article in English | MEDLINE | ID: mdl-25068864

ABSTRACT

We demonstrate a novel method for reducing saturation artifacts in spectral-domain optical coherence tomography (SD-OCT) systems. This method is based on a two-level SD-OCT system with a dual-line charge-coupled device (CCD) camera. We compensate the saturated signal detected by the first line using the unsaturated signal detected by the second line. The Fourier transform of the compensated spectrum shows effective suppression of saturation artifacts. This method was also successfully performed on phantom material and skin on a human finger. Our method causes neither back-scattering power loss nor signal-to-noise ratio (SNR) degradation. The only difference between the traditional system and our two-level system is our utilization of the dual-line CCD camera; no additional devices or complex designs are needed.


Subjects
Optical Coherence Tomography/methods , Algorithms , Artifacts , Fingers/physiology , Fourier Analysis , Humans , Computer-Assisted Image Processing/methods , Computer-Assisted Signal Processing/instrumentation , Signal-to-Noise Ratio
12.
Opt Express ; 20(27): 28418-30, 2012 Dec 17.
Article in English | MEDLINE | ID: mdl-23263077

ABSTRACT

The significantly less stringent operation of a two-reference swept-source optical coherence tomography (OCT) system for suppressing the mirror image is demonstrated based on the spatially localized image processing method. With this method, the phase difference between the two reference signals is not limited to 90 degrees. Based on the current experimental operation, the mirror image can be effectively suppressed as long as the phase difference is larger than 20 degrees. In other words, the adjustment of the beam splitter orientation for controlling the phase difference becomes much more flexible. Also, based on a phantom experiment, the combination of the spatially localized mirror image suppression method with the two-reference OCT operation leads to the implementation of full-range optical Doppler tomography.


Subjects
Artifacts , Image Enhancement/instrumentation , Lenses , Lighting/instrumentation , Optical Coherence Tomography/instrumentation , Equipment Design , Equipment Failure Analysis
13.
Biomed Opt Express ; 3(7): 1632-46, 2012 Jul 01.
Article in English | MEDLINE | ID: mdl-22808434

ABSTRACT

A procedure for computer analysis of optical coherence tomography (OCT) images of normal and precancerous oral mucosae is demonstrated. It reasonably plots the boundary between the epithelium (EP) and lamina propria (LP) layers, determines the EP thickness, and estimates the range of dysplastic cell distribution based on standard deviation (SD) mapping. In this study, 54 normal oral mucosa, 39 oral mild dysplasia, and 44 oral moderate dysplasia OCT images are processed to evaluate the diagnosis statistics. Based on SD mapping in an OCT image, the laterally averaged range percentage of the 70% SD maximum level in the EP layer is found to be a reasonably good threshold for differentiating moderate from mild oral dysplasia lesions. The sensitivity and specificity in diagnosis statistics can reach 82% and 90%, respectively.

14.
Opt Express ; 20(8): 8270-83, 2012 Apr 09.
Article in English | MEDLINE | ID: mdl-22513539

ABSTRACT

The theory and experimental results of a computation time-saving mirror image suppression method in Fourier-domain optical coherence tomography, which utilizes the property of reversed system phase shift between the real and mirror images, for differentiating one from the other are demonstrated. By solving a set of two equations based on a reasonable approximation, the real image signal can be obtained. The theoretical backgrounds and the improved real image quality of the average and iteration procedures in this method are particularly illustrated. Also, the mirror image suppression ratios under various process conditions, including different process iteration numbers and different system phase shifts between two neighboring A-mode scans, are evaluated. Meanwhile, the mirror image suppression results based on our method are compared with those obtained from the widely used BM-scan technique. It is found that when a process procedure of two iterations is used, the mirror image suppression quality based on our method can be higher than that obtained from the BM-scan technique. The computation time of our method is significantly shorter than that of the BM-scan technique.

15.
Opt Lett ; 36(15): 2889-91, 2011 Aug 01.
Article in English | MEDLINE | ID: mdl-21808348

ABSTRACT

A method, novel to our knowledge, for effective mirror image suppression in Fourier-domain optical coherence tomography based on a phase shift between neighboring A-mode scans is demonstrated. By realizing that the phase shifts of the real and mirror images are mutually reversed and assuming that the real image intensities of the two successive A-mode scans are the same, we can solve a set of two coupled equations to obtain the real image signals. The images based on the scanning of a high-resolution spectral-domain optical coherence tomography system are processed to show effective mirror image suppression results. Compared with a similar method of broad application, our approach has the advantages of shorter process time and higher flexibility in selecting the concerned image portions for processing.


Subjects
Fourier Analysis , Computer-Assisted Image Processing/methods , Optical Coherence Tomography/methods , Adipose Tissue/cytology , Animals , Humans , Skin/cytology , Swine
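
Illustrative sketch (not the authors' algorithm): the abstract for this entry states that the real and mirror images acquire mutually reversed phase shifts between neighboring A-mode scans, and that the real image is assumed unchanged between the two scans, giving two coupled equations. The snippet assumes the simplest signal model consistent with that description, s1 = R + M and s2 = R·exp(iφ) + M·exp(−iφ), where R is the real-image term, M the mirror-image term, and φ the known phase shift. The model form, the symbol names, and the function `recover_real_image` are assumptions for illustration.

```python
import cmath
import math


def recover_real_image(s1, s2, phi):
    """Solve s1 = R + M, s2 = R*e^{i phi} + M*e^{-i phi} for R.

    Multiplying the first equation by e^{-i phi} and subtracting it from
    the second removes M:  s2 - s1*e^{-i phi} = R * 2i*sin(phi).
    The signal model is an assumed reading of the abstract.
    """
    denom = 2j * math.sin(phi)
    if abs(denom) < 1e-12:
        raise ValueError("phase shift too close to a multiple of pi")
    return (s2 - s1 * cmath.exp(-1j * phi)) / denom


# Synthetic check with made-up complex amplitudes at one depth position.
R_true, M_true, phi = 2 + 1j, 0.5 - 0.3j, 0.7
s1 = R_true + M_true
s2 = R_true * cmath.exp(1j * phi) + M_true * cmath.exp(-1j * phi)
print(recover_real_image(s1, s2, phi))
```

The division by 2i·sin(φ) shows why, in this simplified model, the separation degrades as the phase shift approaches a multiple of π.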
16.
Opt Express ; 19(27): 26117-31, 2011 Dec 19.
Article in English | MEDLINE | ID: mdl-22274200

ABSTRACT

An improved image processing procedure for suppressing the phase noise due to a motion artifact acquired during optical coherence tomography scanning and effectively illustrating the blood vessel distribution in a living tissue is demonstrated. This new processing procedure and the widely used procedure for micro-angiography application are based on the selection of high-frequency components in the spatial-frequency spectrum of B-mode scanning (x-space), which are contributed from the image portions of moving objects. However, by switching the processing order between the x-space and k-space, the new processing procedure more effectively suppresses the phase noise due to a motion artifact. After the blood vessel positions are precisely acquired based on the new processing procedure, the projected blood flow speed can be more accurately calibrated based on a previously reported method. The demonstrated new procedure is useful for clinical micro-angiography application, in which a stepping motor that generates motion artifacts is usually used in the scanning probe.


Subjects
Angiography/instrumentation , Artifacts , Image Enhancement/instrumentation , Optical Coherence Tomography/instrumentation , Equipment Design , Equipment Failure Analysis , Miniaturization , Motion