Results 1 - 4 of 4
1.
PLoS One ; 9(10): e111445, 2014.
Article in English | MEDLINE | ID: mdl-25347727

ABSTRACT

In this paper we analyse the word frequency profiles of a set of works from the Shakespearean era to uncover patterns of relationship between them, highlighting the connections within authorial canons. We used a text corpus comprising 256 plays and poems from the 16th and 17th centuries, with 17 works of uncertain authorship. Our clustering approach is based on the Jensen-Shannon divergence and a graph partitioning algorithm, and our results show that authors' characteristic styles are very powerful factors in explaining the variation of word use, frequently transcending cross-cutting factors like the differences between tragedy and comedy, early and late works, and plays and poems. Our method also provides an empirical guide to the authorship of plays and poems where this is unknown or disputed.


Subject(s)
Authorship/history , Drama/history , Models, Theoretical , Poetry as Topic/history , Cluster Analysis , England , History, 16th Century , History, 17th Century
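The abstract above names the Jensen-Shannon divergence as the distance between word frequency profiles. A minimal sketch of that measure follows, assuming a naive lower-cased, whitespace-split tokenisation; the function names `word_profile` and `jensen_shannon_divergence` are illustrative and are not taken from the paper, and the graph partitioning step is not shown.

```python
import math
from collections import Counter


def word_profile(text):
    """Relative word frequencies of a text (lower-cased, whitespace-split).

    The tokenisation is an assumption made for this sketch only.
    """
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence, in bits, between two frequency profiles.

    p and q are dicts mapping word -> probability. The result is 0 for
    identical profiles and 1 for profiles with no words in common.
    """
    jsd = 0.0
    for word in set(p) | set(q):
        pw = p.get(word, 0.0)
        qw = q.get(word, 0.0)
        mw = 0.5 * (pw + qw)  # mixture of the two profiles
        if pw > 0:
            jsd += 0.5 * pw * math.log2(pw / mw)
        if qw > 0:
            jsd += 0.5 * qw * math.log2(qw / mw)
    return jsd
```

Computing this value for every pair of works would give a symmetric distance matrix, which is the kind of input a graph-based clustering step could then partition.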
2.
Neurogenetics ; 15(3): 201-12, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24928144

ABSTRACT

'Neuroinflammation' has become a widely applied term in the basic and clinical neurosciences but there is no generally accepted neuropathological tissue correlate. Inflammation, which is characterized by the presence of perivascular infiltrates of cells of the adaptive immune system, is indeed seen in the central nervous system (CNS) under certain conditions. Authors who refer to microglial activation as neuroinflammation confuse this issue because autoimmune neuroinflammation serves as a synonym for multiple sclerosis, the prototypical inflammatory disease of the CNS. We have asked whether a data-driven, unbiased in silico approach may help to clarify the nomenclatorial confusion. Specifically, we have examined whether unsupervised analysis of microarray data obtained from human cerebral cortex of Alzheimer's, Parkinson's and schizophrenia patients would reveal a degree of relatedness between these diseases and recognized inflammatory conditions including multiple sclerosis. Our results using two different data analysis methods provide strong evidence against this hypothesis, demonstrating that very different sets of genes are involved. Consequently, the designations inflammation and neuroinflammation are not interchangeable. They represent different categories not only at the histophenotypic but also at the transcriptomic level. Therefore, non-autoimmune neuroinflammation remains a term in need of definition.


Subject(s)
Alzheimer Disease/genetics , Encephalitis/genetics , Multiple Sclerosis/genetics , Parkinson Disease/genetics , Schizophrenia/genetics , Transcriptome , Cluster Analysis , Computational Biology , Computer Simulation , Gene Expression Profiling , Humans , Immunoglobulins/metabolism , Inflammation/genetics , Intercellular Signaling Peptides and Proteins/metabolism
3.
PLoS One ; 7(9): e45535, 2012.
Article in English | MEDLINE | ID: mdl-23029078

ABSTRACT

BACKGROUND: One primary goal of transcriptomic studies is identifying gene expression patterns correlating with disease progression. This is usually achieved by considering transcripts that independently pass an arbitrary threshold (e.g. p<0.05). In diseases involving severe perturbations of multiple molecular systems, such as Alzheimer's disease (AD), this univariate approach often results in a large list of seemingly unrelated transcripts. We utilised a powerful multivariate clustering approach to identify clusters of RNA biomarkers strongly associated with markers of AD progression. We discuss the value of considering pairs of transcripts which, in contrast to individual transcripts, helps avoid natural human transcriptome variation that can overshadow disease-related changes. METHODOLOGY/PRINCIPAL FINDINGS: We re-analysed a dataset of hippocampal transcript levels in nine controls and 22 patients with varying degrees of AD. A large-scale clustering approach determined groups of transcript probe sets that correlate strongly with measures of AD progression, including both clinical and neuropathological measures and quantifiers of the characteristic transcriptome shift from control to severe AD. This enabled identification of restricted groups of highly correlated probe sets from an initial list of 1,372 previously published by our group. We repeated this analysis on an expanded dataset that included all pair-wise combinations of the 1,372 probe sets. As clustering of this massive dataset is unfeasible using standard computational tools, we adapted and re-implemented a clustering algorithm that uses an external-memory algorithmic approach. This identified various pairs that strongly correlated with markers of AD progression and highlighted important biological pathways potentially involved in AD pathogenesis.
CONCLUSIONS/SIGNIFICANCE: Our analyses demonstrate that, although there exists a relatively large molecular signature of AD progression, only a small number of transcripts recurrently cluster with different markers of AD progression. Furthermore, considering the relationship between two transcripts can highlight important biological relationships that are missed when considering either transcript in isolation.


Subject(s)
Alzheimer Disease/genetics , Gene Expression Profiling , Transcriptome , Algorithms , Alzheimer Disease/pathology , Biomarkers , Cluster Analysis , Computational Biology/methods , Databases, Genetic , Disease Progression , Humans , Molecular Sequence Annotation , Reproducibility of Results
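The idea of scoring pairs of transcripts against a progression marker can be illustrated with a small sketch. The abstract does not say how a pair is turned into a single feature, so the code below assumes the difference of the two expression levels and ranks pairs by absolute Pearson correlation; both choices, and the names `pearson` and `rank_pairs`, are assumptions made for illustration rather than the authors' method. For 1,372 probe sets there are 1,372 × 1,371 / 2 = 940,506 unordered pairs, which gives a sense of why the expanded dataset is large.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def rank_pairs(expression, marker):
    """Rank transcript pairs by |correlation| of their difference with a marker.

    expression: dict mapping transcript name -> list of levels, one per sample.
    marker: list of marker values, one per sample (hypothetical progression score).
    Returns a list of (score, name_a, name_b), highest score first.
    """
    scored = []
    for a, b in combinations(sorted(expression), 2):
        # Assumed pair feature: per-sample difference of the two transcripts.
        diff = [u - v for u, v in zip(expression[a], expression[b])]
        scored.append((abs(pearson(diff, marker)), a, b))
    return sorted(scored, reverse=True)
```

In a toy case where two transcripts share large sample-to-sample variation but differ by a term that tracks the marker, neither transcript correlates well on its own, while their difference does.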
4.
PLoS One ; 7(8): e44000, 2012.
Article in English | MEDLINE | ID: mdl-22937144

ABSTRACT

BACKGROUND: The analysis of biological networks has become a major challenge due to the recent development of high-throughput techniques that are rapidly producing very large data sets. The exploding volumes of biological data demand extreme computational power and special computing facilities (i.e. super-computers). An inexpensive solution, such as General Purpose computation based on Graphics Processing Units (GPGPU), can be adapted to tackle this challenge, but the limited internal memory of the device can pose a new scalability problem. Efficient data and computational parallelism with partitioning is required to provide a fast and scalable solution to this problem. RESULTS: We propose an efficient parallel formulation of the k-Nearest Neighbour (kNN) search problem, which is a popular method for classifying objects in several fields of research, such as pattern recognition, machine learning and bioinformatics. Although the kNN search is very simple and straightforward, its performance degrades dramatically for large data sets, since the task is computationally intensive. The proposed approach is not only fast but also scalable to large-scale instances. Based on our approach, we implemented a software tool GPU-FS-kNN (GPU-based Fast and Scalable k-Nearest Neighbour) for CUDA enabled GPUs. The basic approach is simple and adaptable to other available GPU architectures. We observed speed-ups of 50-60 times compared with CPU implementation on a well-known breast microarray study and its associated data sets. CONCLUSION: Our GPU-based Fast and Scalable k-Nearest Neighbour search technique (GPU-FS-kNN) provides a significant performance improvement for nearest neighbour computation in large-scale networks. Source code and the software tool are available under the GNU General Public License (GPL) at https://sourceforge.net/p/gpufsknn/.


Subject(s)
Computational Biology/methods , Programming Languages , Software , Algorithms , Cluster Analysis
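The partitioning idea described in the abstract above, processing the data in pieces that fit a limited device memory, can be sketched on the CPU. This is not the GPU-FS-kNN implementation: it is a plain Python illustration under the assumption of squared Euclidean distance, with hypothetical names `knn_chunked` and `chunk_size`. Only one chunk of candidate distances and the running best k per query are held at a time.

```python
import heapq


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_chunked(points, k, chunk_size):
    """Brute-force kNN over all points, scanning the reference set in chunks.

    Returns, for each point, the indices of its k nearest neighbours
    (excluding itself), nearest first. Ties are broken by lower index.
    """
    n = len(points)
    # best[i] holds up to k (distance, index) pairs found so far for point i.
    best = [[] for _ in range(n)]
    for start in range(0, n, chunk_size):
        chunk = range(start, min(start + chunk_size, n))
        for i in range(n):
            # Distances from point i to the current chunk only; on a GPU this
            # is the part that would be computed in parallel per chunk.
            candidates = [
                (squared_distance(points[i], points[j]), j)
                for j in chunk
                if j != i
            ]
            # Merge the chunk's candidates with the running top k.
            best[i] = heapq.nsmallest(k, best[i] + candidates)
    return [[j for _, j in neighbours] for neighbours in best]
```

Because each chunk is merged into a running top-k list, the result does not depend on `chunk_size`; only the peak amount of intermediate data does.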