Results 1 - 8 of 8
1.
Cell Death Dis ; 14(12): 841, 2023 Dec 18.
Article in English | MEDLINE | ID: mdl-38110334

ABSTRACT

Long non-coding RNAs (lncRNAs) comprise the most representative transcriptional units of the mammalian genome. They are associated with organ development linked with the emergence of cardiovascular diseases. We used bioinformatic approaches, machine learning algorithms, systems biology analyses, and statistical techniques to define co-expression modules linked to heart development and cardiovascular diseases. We also uncovered differentially expressed transcripts in subpopulations of cardiomyocytes. From this work, we identified eight cardiac cell types; several new coding, lncRNA, and pcRNA markers; and two cardiomyocyte subpopulations at four different time points (ventricle E9.5, left ventricle E11.5, right ventricle E14.5 and left atrium P0) that harbored co-expressed gene modules enriched in mitochondrial, heart development and cardiovascular diseases. Our results provide evidence for the role of particular lncRNAs in heart development and highlight the use of co-expression modular approaches in the functional definition of cell types.


Subject(s)
Cardiovascular Diseases , RNA, Long Noncoding , Animals , Mice , RNA, Long Noncoding/genetics , Gene Expression Profiling/methods , Organogenesis , Myocytes, Cardiac , Mammals/genetics
2.
Tomography ; 9(3): 1120-1132, 2023 Jun 10.
Article in English | MEDLINE | ID: mdl-37368544

ABSTRACT

In breast tomosynthesis, multiple low-dose projections are acquired in a single scanning direction over a limited angular range to produce cross-sectional planes through the breast for three-dimensional imaging interpretation. We built a next-generation tomosynthesis system capable of multidirectional source motion with the intent to customize scanning motions around "suspicious findings". Customized acquisitions can improve the image quality in areas that require increased scrutiny, such as breast cancers, architectural distortions, and dense clusters. In this paper, virtual clinical trial techniques were used to analyze whether a finding or area at high risk of masking cancers can be detected in a single low-dose projection and thus be used for motion planning. This represents a step towards customizing the subsequent low-dose projection acquisitions autonomously, guided by the first low-dose projection; we call this technique "self-steering tomosynthesis." A U-Net was used to classify the low-dose projections into "risk classes" in simulated breasts with soft-tissue lesions; class probabilities were modified using post hoc Dirichlet calibration (DC). DC improved the multiclass segmentation (Dice = 0.43 vs. 0.28 before DC) and significantly reduced false positives (FPs) from the class of the highest risk of masking (sensitivity = 81.3% at 2 FPs per image vs. 76.0%). This simulation-based study demonstrated the feasibility of identifying suspicious areas using a single low-dose projection for self-steering tomosynthesis.
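The Dice score reported above measures overlap between a predicted label map and a reference label map. The sketch below shows one common way to compute it per class and average it. It is an illustration only: the function names, the flattened label lists, and the choice to average over classes that are present are assumptions, and the abstract does not say how the authors aggregated Dice across classes.

```python
def dice_per_class(pred, truth, classes):
    """Return {class: Dice} for two equal-length sequences of class labels.

    Dice for class c is 2 * |pred==c AND truth==c| / (|pred==c| + |truth==c|).
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    scores = {}
    for c in classes:
        n_pred = sum(1 for x in pred if x == c)
        n_truth = sum(1 for y in truth if y == c)
        n_both = sum(1 for x, y in zip(pred, truth) if x == c and y == c)
        denom = n_pred + n_truth
        # Assumption: a class absent from both maps is skipped, not scored.
        if denom:
            scores[c] = 2 * n_both / denom
    return scores


def mean_dice(pred, truth, classes):
    """Unweighted mean of the per-class Dice scores (an assumed aggregation)."""
    scores = dice_per_class(pred, truth, classes)
    return sum(scores.values()) / len(scores) if scores else 0.0


# Hypothetical flattened label maps with three risk classes (0, 1, 2).
example_pred = [0, 0, 1, 1, 2, 2]
example_truth = [0, 1, 1, 1, 2, 0]
example_scores = dice_per_class(example_pred, example_truth, [0, 1, 2])
```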


Subject(s)
Breast Neoplasms , Mammography , Humans , Female , Mammography/methods , Cross-Sectional Studies , Breast/diagnostic imaging , Breast Neoplasms/diagnostic imaging , Breast Neoplasms/pathology , Imaging, Three-Dimensional/methods
3.
F1000Res ; 10: 323, 2021.
Article in English | MEDLINE | ID: mdl-34164114

ABSTRACT

Non-coding RNAs (ncRNAs) are important players in the cellular regulation of organisms from different kingdoms. One of the key steps in ncRNA research is the ability to distinguish coding from non-coding sequences. We applied seven machine learning algorithms (Naive Bayes, SVM, KNN, Random Forest, XGBoost, ANN and DL) across 15 model organisms from different evolutionary branches. Then, we created a stand-alone and web server tool (RNAmining) to distinguish coding and non-coding sequences, selecting the algorithm with the best performance (XGBoost). First, we used coding/non-coding sequences downloaded from Ensembl (April 14th, 2020). Then, the coding/non-coding sequences were balanced, their tri-nucleotide counts were analysed, and the counts were normalized by sequence length. In total, we built 180 models. All machine learning tests were performed using 10-fold cross-validation, and we selected the algorithm with the best results (XGBoost) to implement in RNAmining. Best F1-scores ranged from 97.56% to 99.57% depending on the organism. Moreover, we benchmarked RNAmining against other tools already in the literature (CPAT, CPC2, RNAcon and Transdecoder) and our results outperformed them, opening opportunities for the development of RNAmining, which is freely available at https://rnamining.integrativebioinformatics.me/.
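The feature step described above (tri-nucleotide counts normalized by sequence length) can be sketched as follows. This is a minimal illustration, not the RNAmining implementation: the function name, the overlapping sliding window, the conversion of U to T, and the skipping of windows with ambiguous bases are all assumptions made for the example.

```python
from itertools import product

# All 64 tri-nucleotides in a fixed order, so every sequence maps to a
# feature vector with the same layout.
TRINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=3)]


def trinucleotide_features(seq):
    """Return 64 tri-nucleotide counts divided by the sequence length."""
    seq = seq.upper().replace("U", "T")  # assumption: treat RNA as DNA letters
    counts = dict.fromkeys(TRINUCLEOTIDES, 0)
    # Assumption: overlapping windows with a step of one base.
    for i in range(len(seq) - 2):
        kmer = seq[i:i + 3]
        if kmer in counts:  # windows containing N or other symbols are skipped
            counts[kmer] += 1
    length = len(seq)
    if length == 0:
        return [0.0] * len(TRINUCLEOTIDES)
    return [counts[k] / length for k in TRINUCLEOTIDES]


# Hypothetical toy sequence: windows are ATG, TGA, GAT, ATG.
example_features = trinucleotide_features("ATGATG")
```

Vectors of this kind could then be passed to any of the classifiers named in the abstract; the fixed ordering of TRINUCLEOTIDES keeps the columns comparable between sequences.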


Subject(s)
Machine Learning , RNA , Algorithms , Bayes Theorem , Support Vector Machine
4.
BMC Res Notes ; 13(1): 338, 2020 Jul 14.
Article in English | MEDLINE | ID: mdl-32665017

ABSTRACT

OBJECTIVE: Data normalization and clustering are mandatory steps in gene expression and downstream analyses, respectively. However, user-friendly implementations of these methodologies are available exclusively under expensive licensing agreements or as stand-alone scripts, which poses a major obstacle for users with limited computational skills. RESULTS: We developed an online tool called CORAZON (Correlations Analyses Zipper Online), which implements three unsupervised learning methods to cluster gene expression datasets in a friendly environment. It allows the usage of eight gene expression normalization/transformation methodologies and the attribute's influence. The normalizations that require gene length can be applied only to RNA-seq data, whereas the others can be used with microarray and/or NanoString data. The performance of the clustering methodologies was evaluated through five models, with accuracies between 92 and 100%. We applied our tool to obtain functional insights into non-coding RNAs (ncRNAs) based on Gene Ontology enrichment of clusters in a dataset generated by the ENCODE project. The clusters in which the majority of transcripts are coding genes were enriched in the Cellular, Metabolic, Transports, and Systems Development categories, whereas the ncRNAs were enriched in the Detection of Stimulus, Sensory Perception, Immunological System, and Digestion categories. The CORAZON source code is freely available at https://gitlab.com/integrativebioinformatics/corazon and the web server can be accessed at http://corazon.integrativebioinformatics.me .


Subject(s)
Computers , Software , Cluster Analysis , Gene Expression Profiling , Gene Ontology , Internet , RNA, Untranslated
5.
PLoS One ; 11(1): e0146352, 2016.
Article in English | MEDLINE | ID: mdl-26731657

ABSTRACT

Genomic Islands (GIs) are regions of bacterial genomes that are acquired from other organisms by the phenomenon of horizontal transfer. These regions are often responsible for many important acquired adaptations of the bacteria, with great impact on their evolution and behavior. Nevertheless, these adaptations are usually associated with pathogenicity, antibiotic resistance, degradation and metabolism. Identification of such regions is of medical and industrial interest. For this reason, different approaches for genomic island prediction have been proposed. However, none of them is capable of precisely predicting the complete repertoire of GIs in a genome. The difficulties arise from changes in the performance of different algorithms in the face of the variety of nucleotide distributions in different species. In this paper, we present a novel method to predict GIs that is built upon the mean shift clustering algorithm. It does not require any information regarding the number of clusters, and the bandwidth parameter is automatically calculated based on a heuristic approach. The method was implemented in a new user-friendly tool named MSGIP (Mean Shift Genomic Island Predictor). Genomes of bacteria with GIs discussed in other papers were used to evaluate the proposed method. The application of this tool revealed the same GIs predicted by other methods and also different novel unpredicted islands. A detailed investigation of the different features related to typical GI elements inserted in these new regions confirmed its effectiveness. Stand-alone and user-friendly versions of this new methodology are available at http://msgip.integrativebioinformatics.me.
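To illustrate why mean shift needs no preset number of clusters, here is a minimal one-dimensional sketch with a flat kernel. It is not the MSGIP implementation: the function name, the input values, and the rule for merging modes are assumptions, and the bandwidth is passed in by hand because the abstract does not describe the heuristic used to compute it.

```python
def mean_shift_1d(points, bandwidth, max_iter=100, tol=1e-6):
    """Cluster 1-D values with flat-kernel mean shift.

    Returns (centers, labels); the number of centers is discovered,
    not supplied by the caller.
    """
    modes = []
    for x in points:
        for _ in range(max_iter):
            # Flat kernel: every point within `bandwidth` has equal weight.
            window = [p for p in points if abs(p - x) <= bandwidth]
            if not window:  # defensive guard; not expected in 1-D
                break
            shifted = sum(window) / len(window)
            if abs(shifted - x) < tol:
                break
            x = shifted
        modes.append(x)

    # Assumption: modes closer than one bandwidth belong to the same cluster.
    centers, labels = [], []
    for m in modes:
        for i, c in enumerate(centers):
            if abs(m - c) <= bandwidth:
                labels.append(i)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return centers, labels


# Hypothetical per-window scores forming two groups; the values are made up.
example_points = [1.0, 1.2, 0.8, 10.0, 10.3, 9.7]
example_centers, example_labels = mean_shift_1d(example_points, bandwidth=1.0)
```

The bandwidth controls how many clusters emerge: a larger value merges nearby groups, a smaller one splits them, which is why an automatic heuristic for it matters in practice.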


Subject(s)
Genome, Bacterial , Genomic Islands , Genomics/methods , Algorithms , Cluster Analysis
7.
Bioinformatics ; 28(18): 2297-303, 2012 Sep 15.
Article in English | MEDLINE | ID: mdl-22730432

ABSTRACT

MOTIVATION: Blood cell development is thought to be controlled by a circuit of transcription factors (TFs) and chromatin modifications that determine the cell fate through activating cell type-specific expression programs. To shed light on the interplay between histone marks and TFs during blood cell development, we model gene expression from regulatory signals by means of combinations of sparse linear regression models. RESULTS: The mixture of sparse linear regression models was able to improve the gene expression prediction in relation to the use of a single linear model. Moreover, it performed an efficient selection of regulatory signals even when analyzing all TFs with known motifs (>600). The method identified interesting roles for histone modifications and a selection of TFs related to blood development and chromatin remodelling. AVAILABILITY: The method and datasets are available from http://www.cin.ufpe.br/~igcf/SparseMix. CONTACT: igcf@cin.ufpe.br SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Blood Cells/metabolism , Epigenesis, Genetic , Transcription, Genetic , Animals , Bayes Theorem , Binding Sites , Cell Differentiation/genetics , Embryonic Stem Cells/metabolism , Histones/metabolism , Linear Models , Mice , Promoter Regions, Genetic , Transcription Factors/metabolism
8.
BMC Bioinformatics ; 12 Suppl 1: S29, 2011 Feb 15.
Article in English | MEDLINE | ID: mdl-21342559

ABSTRACT

BACKGROUND: The differentiation process from stem cells to fully differentiated cell types is controlled by the interplay of chromatin modifications and transcription factor activity. Histone modifications or transcription factors frequently act in a multi-functional manner, with a given DNA motif or histone modification conveying both transcriptional repression and activation depending on its location in the promoter and other regulatory signals surrounding it. RESULTS: To account for the possible multi-functionality of regulatory signals, we model the observed gene expression patterns by a mixture of linear regression models. We apply the approach to identify the underlying histone modifications and transcription factors guiding gene expression of differentiated CD4+ T cells. The method improves the gene expression prediction in relation to the use of a single linear model, as often used by previous approaches. Moreover, it recovered the known role of the modifications H3K4me3 and H3K27me3 in activating cell-specific genes and of some transcription factors related to CD4+ T cell differentiation.


Subject(s)
CD4-Positive T-Lymphocytes/cytology , Cell Differentiation , Histones/metabolism , Transcription Factors/metabolism , Bayes Theorem , CD4-Positive T-Lymphocytes/metabolism , DNA/genetics , DNA/metabolism , Gene Expression Regulation , Histones/genetics , Linear Models , Protein Binding , Transcription Factors/genetics