1.
Sci Rep ; 11(1): 21609, 2021 11 03.
Article in English | MEDLINE | ID: mdl-34732744

ABSTRACT

The concept of depth induces an ordering from centre outwards in multivariate data. Most depth definitions are infeasible for dimensions larger than three or four, but the Modified Band Depth (MBD) is a notable exception that has proven to be a valuable tool in the analysis of high-dimensional gene expression data. This depth definition relates the centrality of each individual to its (partial) inclusion in all possible bands formed by elements of the data set. We assess (dis)similarity between pairs of observations by accounting for such bands and constructing binary matrices associated with each pair. From these, contingency tables are calculated and used to derive standard similarity indices. Our approach is computationally efficient and can be applied to bands formed by any number of observations from the data set. We have evaluated the performance of several band-based similarity indices with respect to that of other classical distances in standard classification and clustering tasks in a variety of simulated and real data sets. However, the use of the method is not restricted to these indices; the extension to other similarity coefficients is straightforward. Our experiments show the benefits of our technique, with some of the selected indices outperforming, among others, the Euclidean distance.


Subject(s)
Algorithms , Biomarkers, Tumor/genetics , Data Interpretation, Statistical , Neoplasms/genetics , Cluster Analysis , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Humans , Neoplasms/classification , Neoplasms/pathology
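
As a rough illustration of the depth notion described in this abstract, the sketch below computes the Modified Band Depth for bands formed by pairs of observations: for each pair, it records the fraction of coordinates in which every observation lies between the two band-defining observations, and averages over all pairs. It is a naive quadratic-time sketch written for this listing, not the authors' efficient implementation, and the function name `modified_band_depth` is hypothetical. The band-based similarity indices of the paper are not reproduced here.

```python
from itertools import combinations

import numpy as np


def modified_band_depth(data):
    """Naive MBD with bands of two observations (hypothetical helper).

    data: array of shape (n_observations, n_coordinates), n_observations >= 2.
    Returns one depth value per observation, in [0, 1].
    """
    data = np.asarray(data, dtype=float)
    n = data.shape[0]
    depth = np.zeros(n)
    pairs = list(combinations(range(n), 2))
    for i, j in pairs:
        # The band is the coordinate-wise envelope of observations i and j.
        lower = np.minimum(data[i], data[j])
        upper = np.maximum(data[i], data[j])
        # Fraction of coordinates in which each observation lies in the band.
        inside = (data >= lower) & (data <= upper)
        depth += inside.mean(axis=1)
    return depth / len(pairs)


if __name__ == "__main__":
    sample = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
    print(modified_band_depth(sample))  # the middle observation is the deepest
```

Observations with larger depth are more central, so sorting by the returned values gives the centre-outwards ordering the abstract refers to.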
2.
Phys Rev E ; 100(5-1): 052128, 2019 Nov.
Article in English | MEDLINE | ID: mdl-31870030

ABSTRACT

A system of smooth "frozen" Janus-type disks is studied. Such disks cannot rotate and are divided by their diameter into two sides of different inelasticities. Taking as a reference a system of colored elastic disks, we find differences in the behavior of the collisions once the anisotropy is included. A homogeneous state, akin to the homogeneous cooling state of granular gases, is seen to arise, and the singular behavior of both the collisions and the precollisional correlations is highlighted.

3.
Phys Rev E ; 99(6-1): 060901, 2019 Jun.
Article in English | MEDLINE | ID: mdl-31330601

ABSTRACT

We report the emergence of a giant Mpemba effect in the uniformly heated gas of inelastic rough hard spheres: The initially hotter sample may cool sooner than the colder one, even when the initial temperatures differ by more than one order of magnitude. In order to understand this behavior, it suffices to consider the simplest Maxwellian approximation for the velocity distribution in a kinetic approach. The largeness of the effect stems from the fact that the rotational and translational temperatures, which obey two coupled evolution equations, are comparable. Our theoretical predictions agree very well with molecular dynamics and direct simulation Monte Carlo data.

4.
Bioinformatics ; 33(24): 4001-4003, 2017 Dec 15.
Article in English | MEDLINE | ID: mdl-28961761

ABSTRACT

SUMMARY: clustComp is an open source Bioconductor package that implements different techniques for the comparison of two gene expression clustering results. These include flat versus flat and hierarchical versus flat comparisons. The visualization of the similarities is provided by means of a bipartite graph, whose layout is heuristically optimized. Its flexibility allows a suitable visualization for both small and large datasets. AVAILABILITY AND IMPLEMENTATION: The package is available at http://bioconductor.org/packages/clustComp/ and contains a 'vignette' outlining the typical use of the algorithms. CONTACT: etorrent@est-econ.uc3m.es. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Gene Expression Profiling/methods , Software , Algorithms , Cluster Analysis
5.
PLoS One ; 11(6): e0157484, 2016.
Article in English | MEDLINE | ID: mdl-27322383

ABSTRACT

Rapid accumulation and availability of gene expression datasets in public repositories have enabled large-scale meta-analyses of combined data. The richness of cross-experiment data has provided new biological insights, including identification of new cancer genes. In this study, we compiled a human gene expression dataset from ∼40,000 publicly available Affymetrix HG-U133Plus2 arrays. After strict quality control and data normalisation the data was quantified in an expression matrix of ∼20,000 genes and ∼28,000 samples. To enable different ways of sample grouping, existing annotations were subjected to systematic ontology-assisted categorisation and manual curation. Groups like normal tissues, neoplasmic tissues, cell lines, homoeotic cells and incompletely differentiated cells were created. Unsupervised analysis of the data confirmed global structure of expression consistent with earlier analysis but with more details revealed due to increased resolution. A suitable mixed-effects linear model was used to further investigate gene expression in solid tissue tumours, and to compare these with the respective healthy solid tissues. The analysis identified 1,285 genes with systematic expression change in cancer. The list is significantly enriched with known cancer genes from large, public, peer-reviewed databases, whereas the remaining ones are proposed as new cancer gene candidates. The compiled dataset is publicly available in the ArrayExpress Archive. It contains the most diverse collection of biological samples, making it the largest systematically annotated gene expression dataset of its kind in the public domain.


Subject(s)
Biomarkers, Tumor/biosynthesis , Gene Expression Regulation, Neoplastic , Neoplasm Proteins/biosynthesis , Neoplasms/genetics , Biomarkers, Tumor/genetics , Cell Cycle/genetics , Cell Differentiation/genetics , Cell Division/genetics , Computational Biology , DNA Replication/genetics , Databases, Genetic , Humans , Neoplasm Proteins/genetics , Neoplasms/pathology , Oligonucleotide Array Sequence Analysis , Principal Component Analysis , Protein Array Analysis
6.
BMC Bioinformatics ; 14: 237, 2013 Jul 25.
Article in English | MEDLINE | ID: mdl-23885712

ABSTRACT

BACKGROUND: The use of DNA microarrays and oligonucleotide chips of high density in modern biomedical research provides complex, high dimensional data which have been proven to convey crucial information about gene expression levels and to play an important role in disease diagnosis. Therefore, there is a need for developing new, robust statistical techniques to analyze these data. RESULTS: depthTools is an R package for a robust statistical analysis of gene expression data, based on an efficient implementation of a feasible notion of depth, the Modified Band Depth. This software includes several visualization and inference tools successfully applied to high dimensional gene expression data. A user-friendly interface is also provided via an R-commander plugin. CONCLUSION: We illustrate the utility of the depthTools package, which could be used, for instance, to achieve a better understanding of genome-level variation between tumors and to facilitate the development of personalized treatments.


Subject(s)
Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Software , Algorithms , Genome , Humans , Male , Prostatic Neoplasms/genetics , Prostatic Neoplasms/metabolism
7.
Nucleic Acids Res ; 41(10): e110, 2013 May 01.
Article in English | MEDLINE | ID: mdl-23563154

ABSTRACT

Rapid accumulation of large and standardized microarray data collections is opening up novel opportunities for holistic characterization of genome function. The limited scalability of current preprocessing techniques has, however, formed a bottleneck for full utilization of these data resources. Although short oligonucleotide arrays constitute a major source of genome-wide profiling data, scalable probe-level techniques have been available only for few platforms based on pre-calculated probe effects from restricted reference training sets. To overcome these key limitations, we introduce a fully scalable online-learning algorithm for probe-level analysis and pre-processing of large microarray atlases involving tens of thousands of arrays. In contrast to the alternatives, our algorithm scales up linearly with respect to sample size and is applicable to all short oligonucleotide platforms. The model can use the most comprehensive data collections available to date to pinpoint individual probes affected by noise and biases, providing tools to guide array design and quality control. This is the only available algorithm that can learn probe-level parameters based on sequential hyperparameter updates at small consecutive batches of data, thus circumventing the extensive memory requirements of the standard approaches and opening up novel opportunities to take full advantage of contemporary microarray collections.


Subject(s)
Algorithms , Oligonucleotide Array Sequence Analysis , Bayes Theorem , Gene Expression Profiling , Humans
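
The idea of learning parameters through sequential hyperparameter updates on small consecutive batches can be illustrated with a textbook conjugate update. The sketch below assumes zero-mean Gaussian residuals with unknown variance and an inverse-gamma prior on that variance; each batch only increments the two hyperparameters, so memory use does not grow with the number of arrays. This is a generic illustration under those assumptions, not the model or code of the paper, and the names `update_inverse_gamma` and `fit_in_batches` are hypothetical.

```python
import numpy as np


def update_inverse_gamma(alpha, beta, residuals):
    """Conjugate update of an Inv-Gamma(alpha, beta) prior on a variance.

    Assumes the residuals are zero-mean Gaussian with that unknown variance.
    """
    residuals = np.asarray(residuals, dtype=float)
    new_alpha = alpha + residuals.size / 2.0
    new_beta = beta + 0.5 * np.sum(residuals ** 2)
    return new_alpha, new_beta


def fit_in_batches(residuals, batch_size, alpha0=1.0, beta0=1.0):
    """Process the data batch by batch, carrying only (alpha, beta) forward."""
    residuals = np.asarray(residuals, dtype=float)
    alpha, beta = alpha0, beta0
    for start in range(0, residuals.size, batch_size):
        batch = residuals[start:start + batch_size]
        alpha, beta = update_inverse_gamma(alpha, beta, batch)
    return alpha, beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(scale=2.0, size=1000)
    alpha, beta = fit_in_batches(data, batch_size=50)
    # Posterior mean of the variance, defined for alpha > 1.
    print(beta / (alpha - 1.0))
```

Because the update is additive, processing the data in batches yields the same posterior hyperparameters as processing everything at once, which is what makes a linear-time, low-memory pass possible in this simplified setting.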
8.
Biostatistics ; 11(2): 254-64, 2010 Apr.
Article in English | MEDLINE | ID: mdl-20064844

ABSTRACT

Microarray experiments provide data on the expression levels of thousands of genes and, therefore, statistical methods applicable to the analysis of such high-dimensional data are needed. In this paper, we propose robust nonparametric tools for the description and analysis of microarray data based on the concept of functional depth, which measures the centrality of an observation within a sample. We show that this concept can be easily adapted to high-dimensional observations and, in particular, to gene expression data. This allows the development of the following depth-based inference tools: (1) a scale curve for measuring and visualizing the dispersion of a set of points, (2) a rank test for deciding if 2 groups of multidimensional observations come from the same population, and (3) supervised classification techniques for assigning a new sample to one of G given groups. We apply these methods to microarray data, and to simulated data including contaminated models, and show that they are robust, efficient, and competitive with other procedures proposed in the literature, outperforming them in some situations.


Subject(s)
Biometry/methods , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Algorithms , Computer Simulation , Gene Expression/genetics , Humans , Leukemia, Myeloid, Acute/metabolism , Male , Precursor Cell Lymphoblastic Leukemia-Lymphoma/metabolism , Prostatic Neoplasms/metabolism , Statistics, Nonparametric
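
One simple way to turn a depth into a supervised classifier is the maximum-depth rule: compute the depth of the new sample with respect to each group and assign it to the group in which it is deepest. The sketch below applies that generic rule with a naive Modified Band Depth based on pairs of observations. It is not claimed to be one of the classification techniques proposed in the paper, and the names `mbd_of_point` and `max_depth_classify` are hypothetical.

```python
from itertools import combinations

import numpy as np


def mbd_of_point(x, sample):
    """Naive MBD (bands of two observations) of x with respect to sample.

    sample: array of shape (n_observations, n_coordinates), n_observations >= 2.
    """
    x = np.asarray(x, dtype=float)
    sample = np.asarray(sample, dtype=float)
    total = 0.0
    n_pairs = 0
    for i, j in combinations(range(sample.shape[0]), 2):
        lower = np.minimum(sample[i], sample[j])
        upper = np.maximum(sample[i], sample[j])
        # Fraction of coordinates of x that fall inside this band.
        total += np.mean((x >= lower) & (x <= upper))
        n_pairs += 1
    return total / n_pairs


def max_depth_classify(x, groups):
    """Assign x to the group label in which its depth is largest.

    groups: dict mapping a label to an array of training observations.
    """
    depths = {label: mbd_of_point(x, sample) for label, sample in groups.items()}
    return max(depths, key=depths.get)


if __name__ == "__main__":
    groups = {
        "low": [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]],
        "high": [[10.0, 10.0], [11.0, 11.0], [12.0, 12.0]],
    }
    print(max_depth_classify([1.0, 1.5], groups))
```

Ties are broken by dictionary order in this sketch; a practical classifier would need an explicit tie-breaking rule.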
9.
Bioinformatics ; 21(21): 3993-9, 2005 Nov 01.
Article in English | MEDLINE | ID: mdl-16141251

ABSTRACT

MOTIVATION: Clustering is one of the most widely used methods in unsupervised gene expression data analysis. The use of different clustering algorithms or different parameters often produces rather different results on the same data. Biological interpretation of multiple clustering results requires understanding how different clusters relate to each other. It is particularly non-trivial to compare the results of a hierarchical and a flat, e.g. k-means, clustering. RESULTS: We present a new method for comparing and visualizing relationships between different clustering results, either flat versus flat, or flat versus hierarchical. When comparing a flat clustering to a hierarchical clustering, the algorithm cuts different branches in the hierarchical tree at different levels to optimize the correspondence between the clusters. The optimization function is based on graph layout aesthetics or on mutual information. The clusters are displayed using a bipartite graph where the edges are weighted proportionally to the number of common elements in the respective clusters and the weighted number of crossings is minimized. The performance of the algorithm is tested using simulated and real gene expression data. The algorithm is implemented in the online gene expression data analysis tool Expression Profiler. AVAILABILITY: http://www.ebi.ac.uk/expressionprofiler


Subject(s)
Algorithms , Cluster Analysis , Computer Graphics , Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Pattern Recognition, Automated/methods , User-Computer Interface
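
The two quantities this abstract relies on for flat versus flat comparison, the number of common elements between clusters (the edge weights of the bipartite graph) and the mutual information between the two clusterings, can be sketched as follows. The function names are hypothetical, and the branch-cutting and crossing-minimisation steps of the published method are not reproduced.

```python
import numpy as np


def contingency_table(labels_a, labels_b):
    """Count the elements shared by each pair of clusters.

    Entry (r, c) is the number of items placed in the r-th cluster of the
    first clustering and the c-th cluster of the second one.
    """
    _, a_idx = np.unique(np.asarray(labels_a), return_inverse=True)
    _, b_idx = np.unique(np.asarray(labels_b), return_inverse=True)
    a_idx, b_idx = a_idx.ravel(), b_idx.ravel()
    table = np.zeros((a_idx.max() + 1, b_idx.max() + 1), dtype=int)
    np.add.at(table, (a_idx, b_idx), 1)
    return table


def mutual_information(table):
    """Mutual information (in nats) of the two clusterings behind the table."""
    joint = table / table.sum()
    marginal_a = joint.sum(axis=1, keepdims=True)
    marginal_b = joint.sum(axis=0, keepdims=True)
    expected = marginal_a @ marginal_b
    nonzero = joint > 0
    return float(np.sum(joint[nonzero] * np.log(joint[nonzero] / expected[nonzero])))


if __name__ == "__main__":
    first = [0, 0, 1, 1]
    second = ["x", "x", "y", "y"]
    weights = contingency_table(first, second)
    print(weights)
    print(mutual_information(weights))
```

Identical partitions give the maximum mutual information for the given cluster sizes, while independent partitions give zero.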
10.
Nucleic Acids Res ; 32(Web Server issue): W465-70, 2004 Jul 01.
Article in English | MEDLINE | ID: mdl-15215431

ABSTRACT

Expression Profiler (EP, http://www.ebi.ac.uk/expressionprofiler) is a web-based platform for microarray gene expression and other functional genomics-related data analysis. The new architecture, Expression Profiler: next generation (EP:NG), modularizes the original design and allows individual analysis-task-related components to be developed by different groups and yet still work together seamlessly and share the same user interface look and feel. Data analysis components for gene expression data preprocessing, missing value imputation, filtering, clustering methods, visualization, significant gene finding, between group analysis and other statistical components are available from the EBI (European Bioinformatics Institute) web site. The web-based design of Expression Profiler supports data sharing and collaborative analysis in a secure environment. Developed tools are integrated with the microarray gene expression database ArrayExpress and form the exploratory analytical front-end to those data. EP:NG is an open-source project, encouraging broad distribution and further extensions from the scientific community.


Subject(s)
Gene Expression Profiling , Oligonucleotide Array Sequence Analysis , Software , Genomics , Internet , User-Computer Interface