Results 1 - 10 of 10
1.
J Biomed Inform ; 147: 104510, 2023 11.
Article in English | MEDLINE | ID: mdl-37797704

ABSTRACT

Single-cell RNA sequencing experiments produce data that are useful for identifying different cell types, including uncharacterized and rare ones. This enables us to study the specific functional roles of these cells in different microenvironments and contexts. After identifying a (novel) cell type of interest, it is essential to build succinct marker panels, composed of a few genes referring to cell surface proteins and clusters of differentiation molecules, able to discriminate the desired cells from the other cell populations. In this work, we propose a fully automatic framework called MAGNETO, which can help construct optimal marker panels starting from a single-cell gene expression matrix and a cell type identity for each cell. MAGNETO builds effective marker panels by solving a tailored bi-objective optimization problem, where the first objective regards the identification of the genes able to isolate a specific cell type, while the second, conflicting objective concerns the minimization of the total number of genes included in the panel. Our results on three public datasets show that MAGNETO can build marker panels that identify the cell populations of interest better than state-of-the-art approaches. Finally, our results demonstrate that fine-tuning MAGNETO makes it possible to obtain marker panels with different specificity levels.


Subject(s)
Single-Cell Analysis , Transcriptome , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Gene Expression Profiling/methods , Cell Differentiation
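
The trade-off described in the MAGNETO abstract (isolating a cell type well versus keeping the panel small) can be pictured as a Pareto-front selection. The sketch below is only an illustration of that idea, not MAGNETO's actual procedure: the candidate panel names, error values, and sizes are invented, and both objectives are assumed to be minimized.

```python
from typing import List, Tuple

# (label, classification_error, panel_size); all values are hypothetical.
Candidate = Tuple[str, float, int]

def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    return a[1] <= b[1] and a[2] <= b[2] and (a[1] < b[1] or a[2] < b[2])

def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]

candidates = [
    ("panel_A", 0.10, 3),
    ("panel_B", 0.05, 6),
    ("panel_C", 0.12, 4),  # dominated by panel_A (worse error, larger panel)
    ("panel_D", 0.05, 8),  # dominated by panel_B (same error, larger panel)
    ("panel_E", 0.30, 1),
]
front = pareto_front(candidates)
print([c[0] for c in front])
```

Each panel left on the front represents a different compromise between accuracy and panel size, which is one way to read the abstract's remark about panels with different specificity levels.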
2.
Methods Mol Biol ; 2584: 293-310, 2023.
Article in English | MEDLINE | ID: mdl-36495457

ABSTRACT

Single-cell studies are enabling our understanding of the molecular processes of normal cell development and the onset of several pathologies. For instance, single-cell RNA sequencing (scRNA-Seq) measures the transcriptome-wide gene expression at a single-cell resolution, allowing for studying the heterogeneity among the cells of the same population and revealing complex and rare cell populations. On the other hand, single-cell Assay for Transposase-Accessible Chromatin using sequencing (scATAC-Seq) can be used to define transcriptional and epigenetic changes by analyzing the chromatin accessibility at the single-cell level. However, the integration of multi-omics data remains one of the most difficult tasks in bioinformatics. In this chapter, we focus on the combination of scRNA-Seq and scATAC-Seq data to perform an integrative analysis of the single-cell transcriptome and chromatin accessibility of human fetal progenitors.


Subject(s)
Single-Cell Analysis , Single-Cell Gene Expression Analysis , Humans , Transcriptome , Computational Biology , Chromatin/genetics
3.
PLoS Comput Biol ; 17(9): e1009410, 2021 09.
Article in English | MEDLINE | ID: mdl-34499658

ABSTRACT

Mathematical models of biochemical networks can greatly facilitate the comprehension of the mechanisms at the basis of cellular processes, as well as the formulation of hypotheses that can be tested by means of targeted laboratory experiments. However, two issues might hamper the achievement of fruitful outcomes. On the one hand, detailed mechanistic models can involve hundreds or thousands of molecular species and their intermediate complexes, as well as hundreds or thousands of chemical reactions, a situation generally occurring in rule-based modeling. On the other hand, the computational analysis of a model typically requires the execution of a large number of simulations for its calibration, or to test the effect of perturbations. As a consequence, the computational capabilities of modern Central Processing Units can be easily overtaken, possibly making the modeling of biochemical networks a worthless or ineffective effort. To overcome the limitations of the current state-of-the-art simulation approaches, we present in this paper FiCoS, a novel "black-box" deterministic simulator that effectively realizes both a fine-grained and a coarse-grained parallelization on Graphics Processing Units. In particular, FiCoS exploits two different integration methods, namely, the Dormand-Prince and the Radau IIA, to efficiently solve both non-stiff and stiff systems of coupled Ordinary Differential Equations. We tested the performance of FiCoS against different deterministic simulators, by considering models of increasing size and by running analyses with increasing computational demands. FiCoS was able to speed up the computations by up to 855×, proving to be a promising solution for the simulation and analysis of large-scale models of complex biological processes.


Subject(s)
Biochemical Phenomena , Software , Systems Biology , Algorithms , Autophagy , Computational Biology , Computer Graphics , Computer Simulation , Humans , Mathematical Concepts , Metabolic Networks and Pathways , Models, Biological , Protein Biosynthesis , Synthetic Biology
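
The FiCoS abstract pairs an explicit method (Dormand-Prince) with an implicit one (Radau IIA) to cover non-stiff and stiff systems. A minimal sketch of why the distinction matters follows. It uses the two simplest stand-ins, explicit Euler and implicit (backward) Euler, the latter being the one-stage member of the Radau IIA family. It is not the scheme FiCoS implements; the test equation y' = -lam * y and all parameter values are assumptions chosen for illustration.

```python
# Illustrative only: explicit vs. implicit Euler on the stiff test equation y' = -lam * y.
# These are stand-ins chosen for brevity, not the Dormand-Prince or
# higher-order Radau IIA schemes named in the abstract.

def explicit_euler(lam: float, y0: float, h: float, steps: int) -> float:
    y = y0
    for _ in range(steps):
        y = y + h * (-lam * y)   # y_{n+1} = (1 - h*lam) * y_n
    return y

def implicit_euler(lam: float, y0: float, h: float, steps: int) -> float:
    y = y0
    for _ in range(steps):
        y = y / (1.0 + h * lam)  # solves y_{n+1} = y_n - h*lam*y_{n+1}
    return y

# h*lam = 10 lies far outside the stability region of explicit Euler.
lam, y0, h, steps = 1000.0, 1.0, 0.01, 10
y_explicit = explicit_euler(lam, y0, h, steps)
y_implicit = implicit_euler(lam, y0, h, steps)
print(y_explicit, y_implicit)
```

With this step size the explicit iterate grows in magnitude at every step, while the implicit iterate decays towards zero as the true solution does. That behaviour is the usual motivation for switching to an implicit solver when a system is stiff.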
4.
BMC Bioinformatics ; 22(1): 309, 2021 Jun 08.
Article in English | MEDLINE | ID: mdl-34103004

ABSTRACT

BACKGROUND: Single-cell RNA sequencing (scRNA-Seq) experiments are gaining ground to study the molecular processes that drive normal development as well as the onset of different pathologies. Finding an effective and efficient low-dimensional representation of the data is one of the most important steps in the downstream analysis of scRNA-Seq data, as it could provide a better identification of known or putatively novel cell-types. Another step that still poses a challenge is the integration of different scRNA-Seq datasets. Though standard computational pipelines to gain knowledge from scRNA-Seq data exist, a further improvement could be achieved by means of machine learning approaches. RESULTS: Autoencoders (AEs) have been effectively used to capture the non-linearities among gene interactions of scRNA-Seq data, so that the deployment of AE-based tools might represent the way forward in this context. We introduce here scAEspy, a unifying tool that embodies: (1) four of the most advanced AEs, (2) two novel AEs that we purposely developed, (3) different loss functions. We show that scAEspy can be coupled with various batch-effect removal tools to integrate data from different scRNA-Seq platforms, in order to better identify the cell-types. We benchmarked scAEspy against the most used batch-effect removal tools, showing that our AE-based strategies outperform the existing solutions. CONCLUSIONS: scAEspy is a user-friendly tool that enables using the most recent and promising AEs to analyse scRNA-Seq data by only setting up two user-defined parameters. Thanks to its modularity, scAEspy can be easily extended to accommodate new AEs to further improve the downstream analysis of scRNA-Seq data. Considering the relevant results we achieved, scAEspy can be considered as a starting point to build a more comprehensive toolkit designed to integrate multiple single-cell omics.


Subject(s)
RNA , Single-Cell Analysis , Machine Learning , RNA/genetics , Sequence Analysis, RNA , Exome Sequencing
5.
Cell Stem Cell ; 28(3): 472-487.e7, 2021 03 04.
Article in English | MEDLINE | ID: mdl-33352111

ABSTRACT

Regulation of hematopoiesis during human development remains poorly defined. Here we applied single-cell RNA sequencing (scRNA-seq) and single-cell assay for transposase-accessible chromatin sequencing (scATAC-seq) to over 8,000 human immunophenotypic blood cells from fetal liver and bone marrow. We inferred their differentiation trajectory and identified three highly proliferative oligopotent progenitor populations downstream of hematopoietic stem cells (HSCs)/multipotent progenitors (MPPs). Along this trajectory, we observed opposing patterns of chromatin accessibility and differentiation that coincided with dynamic changes in the activity of distinct lineage-specific transcription factors. Integrative analysis of chromatin accessibility and gene expression revealed extensive epigenetic but not transcriptional priming of HSCs/MPPs prior to their lineage commitment. Finally, we refined and functionally validated the sorting strategy for the HSCs/MPPs and achieved around 90% enrichment. Our study provides a useful framework for future investigation of human developmental hematopoiesis in the context of blood pathologies and regenerative medicine.


Subject(s)
Chromatin Immunoprecipitation Sequencing , Hematopoiesis , Cell Lineage/genetics , Hematopoiesis/genetics , Hematopoietic Stem Cells , Humans , RNA-Seq , Single-Cell Analysis
6.
Appl Sci (Basel) ; 10(18)2020 Sep 02.
Article in English | MEDLINE | ID: mdl-34306736

ABSTRACT

Advances in microscopy imaging technologies have enabled the visualization of live-cell dynamic processes using time-lapse microscopy imaging. However, modern methods exhibit several limitations related to the training phases and to time constraints, hindering their application in laboratory practice. In this work, we present a novel method, named Automated Cell Detection and Counting (ACDC), designed for activity detection of fluorescently labeled cell nuclei in time-lapse microscopy. ACDC overcomes the limitations of the methods in the literature by first applying bilateral filtering on the original image to smooth the input cell images while preserving edge sharpness, and then by exploiting the watershed transform and morphological filtering. Moreover, ACDC represents a feasible solution for laboratory practice, as it can leverage multi-core architectures in computer clusters to efficiently handle large-scale imaging datasets. Indeed, our Parent-Workers implementation of ACDC achieves up to a 3.7× speed-up compared to the sequential counterpart. ACDC was tested on two distinct cell imaging datasets to assess its accuracy and effectiveness on images with different characteristics. We achieved an accurate cell-count and nuclei segmentation without relying on large-scale annotated datasets, a result confirmed by the average Dice Similarity Coefficients of 76.84 and 88.64 and the Pearson coefficients of 0.99 and 0.96, calculated against the manual cell counting, on the two tested datasets.
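
The ACDC abstract reports segmentation quality as a Dice Similarity Coefficient. The sketch below shows how that metric is computed for two binary masks. The tiny masks are invented for illustration, and the result is returned as a fraction in [0, 1], whereas the abstract appears to quote percentages.

```python
from typing import List

Mask = List[List[int]]  # binary image: 1 = nucleus pixel, 0 = background

def dice_coefficient(pred: Mask, truth: Mask) -> float:
    """Dice = 2 * |A intersect B| / (|A| + |B|) over the foreground pixels."""
    a = {(r, c) for r, row in enumerate(pred) for c, v in enumerate(row) if v}
    b = {(r, c) for r, row in enumerate(truth) for c, v in enumerate(row) if v}
    if not a and not b:
        return 1.0  # convention assumed here: two empty masks agree perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))

# Hypothetical 2x3 masks: three foreground pixels each, two of them shared.
predicted = [[0, 1, 1],
             [0, 1, 0]]
manual    = [[0, 1, 0],
             [0, 1, 1]]
score = dice_coefficient(predicted, manual)
print(round(score, 4))
```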

7.
Comput Methods Programs Biomed ; 176: 159-172, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31200903

ABSTRACT

BACKGROUND AND OBJECTIVES: Image segmentation represents one of the most challenging issues in medical image analysis to distinguish among different adjacent tissues in a body part. In this context, appropriate image pre-processing tools can improve the result accuracy achieved by computer-assisted segmentation methods. Taking into consideration images with a bimodal intensity distribution, image binarization can be used to classify the input pictorial data into two classes, given a threshold intensity value. Unfortunately, adaptive thresholding techniques for two-class segmentation work properly only for images characterized by bimodal histograms. We aim at overcoming these limitations and automatically determining a suitable optimal threshold for bimodal Magnetic Resonance (MR) images, by designing an intelligent image analysis framework tailored to effectively assist the physicians during their decision-making tasks. METHODS: In this work, we present a novel evolutionary framework for image enhancement, automatic global thresholding, and segmentation, which is here applied to different clinical scenarios involving bimodal MR image analysis: (i) uterine fibroid segmentation in MR guided Focused Ultrasound Surgery, and (ii) brain metastatic cancer segmentation in neuro-radiosurgery therapy. Our framework exploits MedGA as a pre-processing stage. MedGA is an image enhancement method based on Genetic Algorithms that improves the threshold selection, obtained by the efficient Iterative Optimal Threshold Selection algorithm, between the underlying sub-distributions in a nearly bimodal histogram. RESULTS: The results achieved by the proposed evolutionary framework were quantitatively evaluated, showing that the use of MedGA as a pre-processing stage outperforms the conventional image enhancement methods (i.e., histogram equalization, bi-histogram equalization, Gamma transformation, and sigmoid transformation), in terms of both MR image enhancement and segmentation evaluation metrics. CONCLUSIONS: Thanks to this framework, MR image segmentation accuracy is considerably increased, allowing for measurement repeatability in clinical workflows. The proposed computational solution could be well-suited for other clinical contexts requiring MR image analysis and segmentation, aiming at providing useful insights for differential diagnosis and prognosis.


Subject(s)
Brain Neoplasms/diagnostic imaging , Image Processing, Computer-Assisted/methods , Leiomyoma/diagnostic imaging , Magnetic Resonance Imaging , Algorithms , Computer Simulation , Decision Making , Female , Humans , Neurosurgery , Radiosurgery , Software
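
The abstract of entry 7 relies on an Iterative Optimal Threshold Selection step for a nearly bimodal histogram. A minimal sketch of the classic iterative scheme (often attributed to Ridler and Calvard) is given below, on the assumption that this is the kind of procedure meant. The pixel values, tolerance, and function name are hypothetical, and the MedGA enhancement stage itself is not shown.

```python
from typing import List

def iterative_threshold(pixels: List[float], tol: float = 0.5, max_iter: int = 100) -> float:
    """Repeatedly set the threshold to the midpoint of the two class means."""
    t = sum(pixels) / len(pixels)              # start from the global mean
    for _ in range(max_iter):
        low = [p for p in pixels if p <= t]    # darker class
        high = [p for p in pixels if p > t]    # brighter class
        if not low or not high:
            break                              # one class is empty: keep current t
        new_t = 0.5 * (sum(low) / len(low) + sum(high) / len(high))
        if abs(new_t - t) < tol:
            return new_t                       # converged
        t = new_t
    return t

# Hypothetical intensities forming two well-separated groups.
pixels = [10, 12, 11, 13, 200, 210, 205, 198]
threshold = iterative_threshold(pixels)
binary = [1 if p > threshold else 0 for p in pixels]
print(threshold, binary)
```

The better separated the two intensity groups are, the more reliably the midpoint falls between them, which is consistent with the abstract's point that enhancing the image first improves threshold selection.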
8.
BMC Bioinformatics ; 20(Suppl 4): 172, 2019 Apr 18.
Article in English | MEDLINE | ID: mdl-30999845

ABSTRACT

BACKGROUND: In order to fully characterize the genome of an individual, the reconstruction of the two distinct copies of each chromosome, called haplotypes, is essential. The computational problem of inferring the full haplotype of a cell starting from read sequencing data is known as haplotype assembly, and consists of assigning all heterozygous Single Nucleotide Polymorphisms (SNPs) to exactly one of the two chromosomes. Indeed, the knowledge of complete haplotypes is generally more informative than analyzing single SNPs and plays a fundamental role in many medical applications. RESULTS: To reconstruct the two haplotypes, we addressed the weighted Minimum Error Correction (wMEC) problem, which is a successful approach for haplotype assembly. This NP-hard problem consists of computing the two haplotypes that partition the sequencing reads into two disjoint sub-sets, with the least number of corrections to the SNP values. To this aim, we propose here GenHap, a novel computational method for haplotype assembly based on Genetic Algorithms, yielding optimal solutions by means of a global search process. In order to evaluate the effectiveness of our approach, we ran GenHap on two synthetic (yet realistic) datasets, based on the Roche/454 and PacBio RS II sequencing technologies. We compared the performance of GenHap against HapCol, an efficient state-of-the-art algorithm for haplotype phasing. Our results show that GenHap always obtains high accuracy solutions (in terms of haplotype error rate), and is up to 4× faster than HapCol in the case of Roche/454 instances and up to 20× faster when compared on the PacBio RS II dataset. Finally, we assessed the performance of GenHap on two different real datasets. CONCLUSIONS: Future-generation sequencing technologies, producing longer reads with higher coverage, can highly benefit from GenHap, thanks to its capability of efficiently solving large instances of the haplotype assembly problem. Moreover, the optimization approach proposed in GenHap can be extended to the study of allele-specific genomic features, such as expression, methylation and chromatin conformation, by exploiting multi-objective optimization techniques. The source code and the full documentation are available at the following GitHub repository: https://github.com/andrea-tango/GenHap.


Subject(s)
Algorithms , Computational Biology/methods , Haplotypes/genetics , Databases, Genetic , Humans , Time Factors
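
The wMEC objective named in the GenHap abstract can be made concrete with a small cost function: given a candidate haplotype and its complement, each read is charged the cheaper of the two weighted correction costs. The sketch below assumes a simple encoding that is not taken from the paper: reads are strings over '0', '1' and '-' (position not covered), and weights are per-position integers. It evaluates one candidate only and does not perform the genetic search.

```python
from typing import List

def wmec_cost(reads: List[str], weights: List[List[int]], hap: str) -> int:
    """Weighted correction cost of assigning every read to hap or to its complement."""
    comp = "".join("1" if allele == "0" else "0" for allele in hap)
    total = 0
    for read, w in zip(reads, weights):
        # Cost of forcing the read to agree with each haplotype; gaps cost nothing.
        cost_h = sum(wi for r, h, wi in zip(read, hap, w) if r != "-" and r != h)
        cost_c = sum(wi for r, c, wi in zip(read, comp, w) if r != "-" and r != c)
        total += min(cost_h, cost_c)
    return total

# Hypothetical instance: four reads over four heterozygous SNP positions.
reads = ["01--", "-110", "10-1", "0-11"]
haplotype = "0110"
unit_weights = [[1, 1, 1, 1] for _ in reads]
# Same reads, but the last position of the last read is a high-confidence call.
confident = [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 3]]

print(wmec_cost(reads, unit_weights, haplotype))  # unweighted MEC cost
print(wmec_cost(reads, confident, haplotype))     # weighted cost
```

With unit weights the function reduces to the plain MEC count. Raising the weight of a confident base call makes it more expensive to correct, which can change which haplotype a read is assigned to.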
9.
BMC Bioinformatics ; 18(1): 246, 2017 May 10.
Article in English | MEDLINE | ID: mdl-28486952

ABSTRACT

BACKGROUND: Mathematical modeling and in silico analysis are widely acknowledged as complementary tools to biological laboratory methods, to achieve a thorough understanding of emergent behaviors of cellular processes in both physiological and perturbed conditions. However, the simulation of large-scale models, consisting of hundreds or thousands of reactions and molecular species, can rapidly overtake the capabilities of Central Processing Units (CPUs). The purpose of this work is to exploit alternative high-performance computing solutions, such as Graphics Processing Units (GPUs), to allow the investigation of these models at reduced computational costs. RESULTS: LASSIE is a "black-box" GPU-accelerated deterministic simulator, specifically designed for large-scale models and not requiring any expertise in mathematical modeling, simulation algorithms or GPU programming. Given a reaction-based model of a cellular process, LASSIE automatically generates the corresponding system of Ordinary Differential Equations (ODEs), assuming mass-action kinetics. The numerical solution of the ODEs is obtained by automatically switching between the Runge-Kutta-Fehlberg method in the absence of stiffness, and the Backward Differentiation Formulae of first order in the presence of stiffness. The computational performance of LASSIE is assessed using a set of randomly generated synthetic reaction-based models of increasing size, ranging from 64 to 8192 reactions and species, and compared to a CPU-implementation of the LSODA numerical integration algorithm. CONCLUSIONS: LASSIE adopts a novel fine-grained parallelization strategy to distribute on the GPU cores all the calculations required to solve the system of ODEs. By virtue of this implementation, LASSIE achieves up to 92× speed-up with respect to LSODA, therefore reducing the running time from approximately 1 month down to 8 h to simulate models consisting of, for instance, four thousand reactions and species. Notably, thanks to its smaller memory footprint, LASSIE is able to perform fast simulations of even larger models, for which the tested CPU-implementation of LSODA failed to reach termination. LASSIE is therefore expected to make an important breakthrough in Systems Biology applications, for the execution of faster and in-depth computational analyses of large-scale models of complex biological systems.


Subject(s)
Algorithms , Biochemical Phenomena , Computer Graphics , Computer Simulation , Kinetics , Time Factors , User-Computer Interface
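
The LASSIE abstract describes deriving a system of ODEs from a reaction-based model under mass-action kinetics. The sketch below illustrates that derivation step on the CPU in plain Python. The reaction encoding, species names, and rate constants are assumptions made for the example, and neither the GPU parallelization nor the numerical integrators are shown.

```python
from typing import Dict, List, Tuple

# A reaction is (reactants, products, rate_constant); the dicts map species -> stoichiometry.
Reaction = Tuple[Dict[str, int], Dict[str, int], float]

def mass_action_rhs(reactions: List[Reaction], state: Dict[str, float]) -> Dict[str, float]:
    """Return dX/dt for every species under mass-action kinetics."""
    dxdt = {s: 0.0 for s in state}
    for reactants, products, k in reactions:
        flux = k
        for s, n in reactants.items():
            flux *= state[s] ** n      # rate = k * product of reactant concentrations
        for s, n in reactants.items():
            dxdt[s] -= n * flux        # reactants are consumed
        for s, n in products.items():
            dxdt[s] += n * flux        # products are formed
    return dxdt

# Hypothetical reversible binding: A + B -> C (k = 2), C -> A + B (k = 1).
reactions = [
    ({"A": 1, "B": 1}, {"C": 1}, 2.0),
    ({"C": 1}, {"A": 1, "B": 1}, 1.0),
]
state = {"A": 1.0, "B": 2.0, "C": 3.0}
rates = mass_action_rhs(reactions, state)
print(rates)
```

Each reaction contributes one flux term to every species it touches, so the right-hand side can be evaluated reaction by reaction. That independence is what makes a fine-grained parallel evaluation plausible for models with thousands of reactions.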
10.
Brief Bioinform ; 18(5): 870-885, 2017 09 01.
Article in English | MEDLINE | ID: mdl-27402792

ABSTRACT

Several studies in Bioinformatics, Computational Biology and Systems Biology rely on the definition of physico-chemical or mathematical models of biological systems at different scales and levels of complexity, ranging from the interaction of atoms in single molecules up to genome-wide interaction networks. Traditional computational methods and software tools developed in these research fields share a common trait: they can be computationally demanding on Central Processing Units (CPUs), therefore limiting their applicability in many circumstances. To overcome this issue, general-purpose Graphics Processing Units (GPUs) are gaining increasing attention from the scientific community, as they can considerably reduce the running time required by standard CPU-based software, and allow more intensive investigations of biological systems. In this review, we present a collection of GPU tools recently developed to perform computational analyses in life science disciplines, emphasizing the advantages and the drawbacks in the use of these parallel architectures. The complete list of GPU-powered tools reviewed here is available at http://bit.ly/gputools.


Subject(s)
Systems Biology , Algorithms , Computer Graphics , Software