1.
PLoS Comput Biol ; 10(3): e1003494, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24603409

ABSTRACT

We show that existing RNA-seq, DNase-seq, and ChIP-seq data exhibit overdispersed per-base read count distributions that are not matched to existing computational method assumptions. To compensate for this overdispersion we introduce a nonparametric and universal method for processing per-base sequencing read count data called FIXSEQ. We demonstrate that FIXSEQ substantially improves the performance of existing RNA-seq, DNase-seq, and ChIP-seq analysis tools when compared with existing alternatives.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Algorithms , Area Under Curve , Chromatin/chemistry , Chromatin Immunoprecipitation , Computational Biology , Computer Simulation , DNA/chemistry , Humans , K562 Cells , Likelihood Functions , Poisson Distribution , RNA/chemistry , Sequence Analysis, DNA , Software , Transcription Factors/chemistry
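The overdispersion described in this abstract can be illustrated with a variance-to-mean ratio (dispersion index), which is close to 1 for Poisson-distributed counts and larger than 1 for overdispersed ones. The sketch below is only a diagnostic under that assumption; it is not the FIXSEQ method, and the function name and the example counts are hypothetical, not taken from the paper.

```python
from statistics import mean, pvariance


def dispersion_index(counts):
    """Variance-to-mean ratio of per-base read counts.

    About 1 for Poisson-like counts; values well above 1 suggest
    overdispersion. Illustrative diagnostic only, not FIXSEQ.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("mean count is zero; dispersion index is undefined")
    return pvariance(counts) / m


# Hypothetical per-base counts with the same mean (2.5 reads per base).
even_counts = [2, 3, 2, 3, 2, 3, 2, 3]      # reads spread evenly
bursty_counts = [0, 0, 0, 20, 0, 0, 0, 0]   # reads piled on one base

print(dispersion_index(even_counts))    # below 1: less variable than Poisson
print(dispersion_index(bursty_counts))  # far above 1: overdispersed
```

Both example vectors have the same mean, so a method that assumes Poisson counts would treat them alike even though their variances differ widely.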
2.
Proc Natl Acad Sci U S A ; 108(41): 16916-21, 2011 Oct 11.
Article in English | MEDLINE | ID: mdl-21949369

ABSTRACT

The goal of dimensionality reduction is to embed high-dimensional data in a low-dimensional space while preserving structure in the data relevant to exploratory data analysis such as clusters. However, existing dimensionality reduction methods often either fail to separate clusters due to the crowding problem or can only separate clusters at a single resolution. We develop a new approach to dimensionality reduction: tree preserving embedding. Our approach uses the topological notion of connectedness to separate clusters at all resolutions. We provide a formal guarantee of cluster separation for our approach that holds for finite samples. Our approach requires no parameters and can handle general types of data, making it easy to use in practice and suggesting new strategies for robust data visualization.


Subject(s)
Data Interpretation, Statistical , Algorithms , Cluster Analysis , Handwriting , Models, Statistical , Radar , Sequence Analysis, Protein/statistics & numerical data
3.
BMC Bioinformatics ; 10: 19, 2009 Jan 15.
Article in English | MEDLINE | ID: mdl-19146673

ABSTRACT

BACKGROUND: Network visualization would serve as a useful first step for analysis. However, current graph layout algorithms for biological pathways are insensitive to biologically important information, e.g. subcellular localization, biological node and graph attributes, and/or are not available for large-scale networks, e.g. more than 10000 elements. RESULTS: To overcome these problems, we propose the use of a biologically important graph metric, betweenness, a measure of network flow. This metric is highly correlated with many biological phenomena such as lethality and clusters. We devise a new fast parallel algorithm for calculating betweenness to minimize the preprocessing cost. Using this metric, we also invent a node and edge betweenness based fast layout algorithm (BFL). BFL places the high-betweenness nodes at optimal positions and allows the low-betweenness nodes to reach suboptimal positions. Furthermore, BFL reduces the runtime by combining a sequential insertion algorithm with betweenness. For a graph with n nodes, this approach reduces the expected runtime of the algorithm to O(n^2) when considering edge crossings, and to O(n log n) when considering only density and edge lengths. CONCLUSION: Our BFL algorithm is compared against fast graph layout algorithms and approaches requiring intensive optimizations. For gene networks, we show that our algorithm is faster than all layout algorithms tested while providing readability on par with intensive optimization algorithms. We achieve a 1.4 second runtime for a graph with 4000 nodes and 12000 edges on a standard desktop computer.


Subject(s)
Algorithms , Computer Graphics , Metabolic Networks and Pathways , Pattern Recognition, Automated , User-Computer Interface
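Node betweenness, the metric this abstract builds on, counts how many shortest paths between other node pairs pass through a given node. A minimal sketch follows, using the standard sequential Brandes algorithm for unweighted, undirected graphs. It is not the parallel algorithm or the BFL layout from the paper; the function name `betweenness` and the adjacency-dictionary input format are assumptions made for illustration.

```python
from collections import deque


def betweenness(adj):
    """Node betweenness for an unweighted, undirected graph.

    adj maps each node to a list of its neighbours (assumed symmetric).
    Standard Brandes algorithm; shown for illustration only.
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s: count shortest paths (sigma)
        # and record each node's predecessors on those paths.
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:              # first visit to w
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:   # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back towards s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: c / 2.0 for v, c in score.items()}


# Hypothetical toy network: a path a - b - c.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(betweenness(path))  # {'a': 0.0, 'b': 1.0, 'c': 0.0}
```

In this toy path only the pair (a, c) has an intermediate node, so b receives the full score. A layout along the lines the abstract describes would place such high-betweenness nodes first.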