Results 1 - 6 of 6
1.
J Cell Sci ; 136(1), 2023 Jan 01.
Article in English | MEDLINE | ID: mdl-36482762

ABSTRACT

Multiple test corrections are a fundamental step in the analysis of differentially expressed genes, as the number of tests performed would otherwise inflate the false discovery rate (FDR). Recent methods for P-value correction involve a regression model in order to include covariates that are informative of the power of the test. Here, we present Progressive proportions plot (Prog-Plot), a visual tool to identify the functional relationship between the covariate and the proportion of P-values consistent with the null hypothesis. The relationship between the proportion of P-values and the covariate to be included is needed, but there are no available tools to verify it. The approach presented here aims at having an objective way to specify regression models instead of relying on prior knowledge.
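
For orientation, the kind of correction this abstract refers to can be sketched with the classic Benjamini-Hochberg adjustment. This is a minimal illustration of covariate-free FDR control, not the Prog-Plot tool or the regression-based methods the abstract describes; the function name `benjamini_hochberg` and the example P-values are assumptions made for this sketch.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order.

    Illustrative baseline only: it ignores covariates, unlike the
    regression-based corrections mentioned in the abstract.
    """
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, keeping the adjusted values monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical P-values from four tests.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A test is then called significant at FDR level q when its adjusted P-value is at most q. Covariate-aware methods replace the single assumed null proportion with one that varies with the covariate, which is the relationship Prog-Plot is meant to help specify.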

2.
MethodsX ; 9: 101733, 2022.
Article in English | MEDLINE | ID: mdl-35637693

ABSTRACT

Machine learning methods were considered efficient in identifying single nucleotide polymorphisms (SNPs) underlying a trait of interest. This study aimed to construct predictive models using machine learning algorithms, to identify loci that best explain the variance in milk traits of dairy cattle. Further objectives involved validating the results by comparison with reported relevant regions and retrieving the pathways overrepresented by the genes flanking relevant SNPs. Regression models using XGBoost (XGB), LightGBM (LGB), and Random Forest (RF) algorithms were trained using estimated breeding values for milk production (EBVM), milk fat content (EBVF) and milk protein content (EBVP) as phenotypes and genotypes on 40417 SNPs as predictor variables. To evaluate their efficiency, metrics for actual vs. predicted values were determined in validation folds (XGB and LGB) and out-of-bag data (RF). Fewer than 4500 relevant SNPs were retrieved for each trait. Among the genes flanking them, signaling and transmembrane transporter activities were overrepresented. The models trained:
• Predicted breeding values for animals not included in the dataset.
• Were efficient in identifying a subset of SNPs explaining phenotypic variation.
The results obtained using XGB and LGB algorithms agreed with previous results. Therefore, the method proposed could be applied for future association studies on milk traits.

3.
J Appl Genet ; 59(1): 1-8, 2018 Feb.
Article in English | MEDLINE | ID: mdl-29190011

ABSTRACT

The objective of this study was to analyze the relevance of relationship information on the identification of low heritability quantitative trait loci (QTLs) from a genome-wide association study (GWAS) and on the genomic prediction of complex traits in human, animal and cross-pollinating populations. The simulation-based data sets included 50 samples of 1000 individuals of seven populations derived from a common population with linkage disequilibrium. The populations had non-inbred and inbred progeny structure (50 to 200) with varying number of members (5 to 20). The individuals were genotyped for 10,000 single nucleotide polymorphisms (SNPs) and phenotyped for a quantitative trait controlled by 10 QTLs and 90 minor genes showing dominance. The SNP density was 0.1 cM and the narrow sense heritability was 25%. The QTL heritabilities ranged from 1.1 to 2.9%. We applied mixed model approaches for both GWAS and genomic prediction using pedigree-based and genomic relationship matrices. For GWAS, the observed false discovery rate was kept below the significance level of 5%, the power of detection for the low heritability QTLs ranged from 14 to 50%, and the average bias between significant SNPs and a QTL ranged from less than 0.01 to 0.23 cM. The QTL detection power was consistently higher using the genomic relationship matrix. Regardless of population and training set size, genomic prediction provided higher prediction accuracy for complex traits when compared to pedigree-based prediction. The accuracy of genomic prediction when there is relatedness between individuals in the training set and the reference population is much higher than the value for unrelated individuals.


Subject(s)
Genome-Wide Association Study , Quantitative Trait Loci , Quantitative Trait, Heritable , Animals , Computer Simulation , Genetics, Population , Genotype , Humans , Linkage Disequilibrium , Models, Genetic , Pedigree , Plants , Polymorphism, Single Nucleotide
4.
J Am Stat Assoc ; 113(523): 1028-1039, 2018.
Article in English | MEDLINE | ID: mdl-31249430

ABSTRACT

The identification of reproducible signals from the results of replicate high-throughput experiments is an important part of modern biological research. Often little is known about the dependence structure and the marginal distribution of the data, motivating the development of a nonparametric approach to assess reproducibility. The procedure, which we call the maximum rank reproducibility (MaRR) procedure, uses a maximum rank statistic to parse reproducible signals from noise without making assumptions about the distribution of reproducible signals. Because it uses the rank scale this procedure can be easily applied to a variety of data types. One application is to assess the reproducibility of RNA-seq technology using data produced by the sequencing quality control (SEQC) consortium, which coordinated a multi-laboratory effort to assess reproducibility across three RNA-seq platforms. Our results on simulations and SEQC data show that the MaRR procedure effectively controls false discovery rates, has desirable power properties, and compares well to existing methods. Supplementary materials for this article are available online.
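
The rank-based idea in this abstract can be illustrated with a small sketch that computes a per-feature maximum rank across two replicates. It assumes exactly two replicates, no tied scores, and that rank 1 marks the strongest signal; the function names and example scores are hypothetical, and the sketch stops at the statistic itself rather than reproducing the full MaRR procedure or its FDR control.

```python
def ranks_descending(scores):
    """Rank scores so that the largest score gets rank 1 (assumes no ties)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for position, i in enumerate(order, start=1):
        ranks[i] = position
    return ranks


def maximum_ranks(replicate_1, replicate_2):
    """Return, for each feature, the larger of its two within-replicate ranks."""
    if len(replicate_1) != len(replicate_2):
        raise ValueError("replicates must score the same features")
    ranks_1 = ranks_descending(replicate_1)
    ranks_2 = ranks_descending(replicate_2)
    return [max(a, b) for a, b in zip(ranks_1, ranks_2)]


if __name__ == "__main__":
    # Hypothetical signal scores for four features in two replicates.
    print(maximum_ranks([9.0, 5.0, 1.0, 7.0], [8.0, 2.0, 6.0, 7.5]))
```

A feature receives a small maximum rank only when it ranks highly in both replicates, which is why a statistic of this kind can separate reproducible signals from noise without distributional assumptions.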

5.
Biochim Biophys Acta ; 1844(1 Pt A): 63-76, 2014 Jan.
Article in English | MEDLINE | ID: mdl-23467006

ABSTRACT

Data processing, management and visualization are central and critical components of a state of the art high-throughput mass spectrometry (MS)-based proteomics experiment, and are often some of the most time-consuming steps, especially for labs without much bioinformatics support. The growing interest in the field of proteomics has triggered an increase in the development of new software libraries, including freely available and open-source software. From database search analysis to post-processing of the identification results, even though the objectives of these libraries and packages can vary significantly, they usually share a number of features. Common use cases include the handling of protein and peptide sequences, the parsing of results from various proteomics search engines output files, and the visualization of MS-related information (including mass spectra and chromatograms). In this review, we provide an overview of the existing software libraries, open-source frameworks and also, we give information on some of the freely available applications which make use of them. This article is part of a Special Issue entitled: Computational Proteomics in the Post-Identification Era. Guest Editors: Martin Eisenacher and Christian Stephan.


Subject(s)
Proteomics , Tandem Mass Spectrometry/methods , Computational Biology , Software
6.
Mutat Res ; 756(1-2): 46-55, 2013 Aug 30.
Article in English | MEDLINE | ID: mdl-23817105

ABSTRACT

The genetic heterogeneity presented by different cell lines derived from glioblastoma (GBM) seems to influence their responses to antitumoral agents. Although GBM tumors present several genomic alterations, it has been assumed that TP53, frequently mutated in GBM, may to some extent be responsible for differences in cellular responses to antitumor agents, but this is not clear yet. To directly determine the impact of TP53 on GBM response to ionizing radiation, we compared the transcription profiles of four GBM cell lines (two with wild-type (WT) TP53 and two with mutant (MT) TP53) after 8 Gy of gamma-rays. Transcript profiles of cells analyzed 30 min and 6 h after irradiation showed that WT TP53 cells presented a higher number of modulated genes than MT TP53 cells. Our findings also indicate that there are several pathways (apoptosis, DNA repair/stress response, cytoskeleton organization and macromolecule metabolic process) in radiation responses of GBM cell lines that were modulated only in WT TP53 cells (30 min and 6 h). Interestingly, the majority of differentially expressed genes did not present the TP53 binding site, suggesting secondary effects of TP53 on transcription. We conclude that radiation-induced changes in transcription profiles of irradiated GBM cell lines mainly depend on the functional status of TP53.


Subject(s)
Biomarkers/metabolism , Gene Expression Profiling , Glioblastoma/genetics , Mutation/genetics , Radiation, Ionizing , Tumor Suppressor Protein p53/genetics , Adult , Fluorescent Antibody Technique , Glioblastoma/metabolism , Glioblastoma/pathology , Humans , Oligonucleotide Array Sequence Analysis , RNA, Messenger/genetics , Real-Time Polymerase Chain Reaction , Reverse Transcriptase Polymerase Chain Reaction , Sarcomeres/chemistry , Sarcomeres/metabolism , Tumor Cells, Cultured , Tumor Suppressor Protein p53/deficiency