Results 1 - 20 of 93
1.
J Comput Biol ; 30(3): 323-336, 2023 03.
Article in English | MEDLINE | ID: mdl-36322888

ABSTRACT

Information theory-based measures of variable dependency (previously published) have been implemented into a software package, MIST. The design of the software and its potential uses are described, and a demonstration is presented in the discovery of modifier alleles of the ApoE gene in affecting Alzheimer's disease (AD) by analyzing the UK Biobank dataset. The modifier genes uncovered overlap strongly with genes found to be associated with AD. Others include many known to influence AD. We discuss a range of uses of the dependency calculations using MIST that can uncover additional genetic effects in similar complex datasets, like higher degrees of interaction and phenotypic pleiotropy.


Subject(s)
Alzheimer Disease , Humans , Alleles , Alzheimer Disease/genetics , Information Theory , Apolipoproteins E/genetics , Genotype
2.
iScience ; 25(8): 104653, 2022 Aug 19.
Article in English | MEDLINE | ID: mdl-35958027

ABSTRACT

The Extracellular RNA Communication Consortium (ERCC) is an NIH-funded program aiming to promote the development of new technologies, resources, and knowledge about exRNAs and their carriers. After Phase 1 (2013-2018), Phase 2 of the program (ERCC2, 2019-2023) aims to fill critical gaps in knowledge and technology to enable rigorous and reproducible methods for separation and characterization of both bulk populations of exRNA carriers and single EVs. ERCC2 investigators are also developing new bioinformatic pipelines to promote data integration through the exRNA atlas database. ERCC2 has established several Working Groups (Resource Sharing, Reagent Development, Data Analysis and Coordination, Technology Development, Nomenclature, and Scientific Outreach) to promote collaboration between ERCC2 members and the broader scientific community. We expect that ERCC2's current and future achievements will significantly improve our understanding of exRNA biology and the development of accurate and efficient exRNA-based diagnostic, prognostic, and theranostic biomarker assays.

3.
Front Neurosci ; 15: 720778, 2021.
Article in English | MEDLINE | ID: mdl-34580583

ABSTRACT

A history of traumatic brain injury (TBI) increases the odds of developing Alzheimer's disease (AD). The long latent period between injury and dementia makes it difficult to study molecular changes initiated by TBI that may increase the risk of developing AD. MicroRNA (miRNA) levels are altered in TBI at acute times post-injury (<4 weeks), and in AD. We hypothesized that miRNA levels in cerebrospinal fluid (CSF) following TBI in veterans may be indicative of increased risk for developing AD. Our population of interest is cognitively normal veterans with a history of one or more mild TBI (mTBI) at a chronic time following TBI. We measured miRNA levels in CSF from three groups of participants: (1) community controls with no lifetime history of TBI (ComC); (2) deployed Iraq/Afghanistan veterans with no lifetime history of TBI (DepC), and (3) deployed Iraq/Afghanistan veterans with a history of repetitive blast mTBI (DepTBI). CSF samples were collected at the baseline visit in a longitudinal, multimodal assessment of Gulf War veterans, and represent a heterogenous group of male veterans and community controls. The average time since the last blast mTBI experienced was 4.7 ± 2.2 years [1.5 - 11.5]. Statistical analysis of TaqMan™ miRNA array data revealed 18 miRNAs with significant differential expression in the group comparisons: 10 between DepTBI and ComC, 7 between DepC and ComC, and 8 between DepTBI and DepC. We also identified 8 miRNAs with significant differential detection in the group comparisons: 5 in DepTBI vs. ComC, 3 in DepC vs. ComC, and 2 in DepTBI vs. DepC. When we applied our previously developed multivariable dependence analysis, we found 13 miRNAs (6 of which are altered in levels or detection) that show dependencies with participant phenotypes, e.g., ApoE. Target prediction and pathway analysis with miRNAs differentially expressed in DepTBI vs. either DepC or ComC identified canonical pathways highly relevant to TBI including senescence and ephrin receptor signaling, respectively. This study shows that both TBI and deployment result in persistent changes in CSF miRNA levels that are relevant to known miRNA-mediated AD pathology, and which may reflect early events in AD.

4.
BMC Bioinformatics ; 22(1): 180, 2021 Apr 07.
Article in English | MEDLINE | ID: mdl-33827420

ABSTRACT

BACKGROUND: Permutation testing is often considered the "gold standard" for multi-test significance analysis, as it is an exact test requiring few assumptions about the distribution being computed. However, it can be computationally very expensive, particularly in its naive form in which the full analysis pipeline is re-run after permuting the phenotype labels. This can become intractable in multi-locus genome-wide association studies (GWAS), in which the number of potential interactions to be tested is combinatorially large. RESULTS: In this paper, we develop an approach for permutation testing in multi-locus GWAS, specifically focusing on SNP-SNP-phenotype interactions using multivariable measures that can be computed from frequency count tables, such as those based in Information Theory. We find that the computational bottleneck in this process is the construction of the count tables themselves, and that this step can be eliminated at each iteration of the permutation testing by transforming the count tables directly. This leads to a speed-up by a factor of over 10^3 for a typical permutation test compared to the naive approach. Additionally, this approach is insensitive to the number of samples making it suitable for datasets with large number of samples. CONCLUSIONS: The proliferation of large-scale datasets with genotype data for hundreds of thousands of individuals enables new and more powerful approaches for the detection of multi-locus genotype-phenotype interactions. Our approach significantly improves the computational tractability of permutation testing for these studies. Moreover, our approach is insensitive to the large number of samples in these modern datasets. The code for performing these computations and replicating the figures in this paper is freely available at https://github.com/kunert/permute-counts .


Subject(s)
Epistasis, Genetic , Genome-Wide Association Study , Polymorphism, Single Nucleotide , Genotype , Humans , Phenotype
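
Note on entry 4: the "naive form" of permutation testing mentioned in the abstract can be illustrated with a small sketch. This is only the baseline that the paper improves on, not the authors' count-table transformation. It assumes mutual information as the dependency measure, and the function names (mutual_information, naive_permutation_pvalue) are hypothetical. Note that the count table is rebuilt on every permutation, which is the step the abstract identifies as the bottleneck.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) of two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # the frequency count table
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def naive_permutation_pvalue(genotypes, phenotypes, n_perm=1000, seed=0):
    """Naive permutation test: shuffle the labels, rebuild the table, recompute."""
    rng = random.Random(seed)
    observed = mutual_information(genotypes, phenotypes)
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # The count table is reconstructed from scratch in every iteration.
        if mutual_information(genotypes, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    geno = [0, 1, 2] * 20
    pheno = [0, 1, 2] * 20  # toy data: phenotype fully determined by genotype
    print(naive_permutation_pvalue(geno, pheno, n_perm=200, seed=1))
```

The cost of each iteration here grows with the number of samples, which is why a method that operates on the count tables directly can be insensitive to sample size.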
5.
Sci Rep ; 11(1): 3627, 2021 02 11.
Article in English | MEDLINE | ID: mdl-33574451

ABSTRACT

Our aim was to investigate the associations between erythrocyte fatty acids and the risk of islet autoimmunity in children. The Environmental Determinants of Diabetes in the Young Study (TEDDY) is a longitudinal cohort study of children at high genetic risk for type 1 diabetes (n = 8676) born between 2004 and 2010 in the U.S., Finland, Sweden, and Germany. A nested case-control design comprised 398 cases with islet autoimmunity and 1178 sero-negative controls matched for clinical site, family history, and gender. Fatty acid composition was measured in erythrocytes collected at the age of 3, 6, and 12 months and then annually up to 6 years of age. Conditional logistic regression models were adjusted for HLA risk genotype, ancestry, and weight z-score. Higher eicosapentaenoic and docosapentaenoic acid (n-3 polyunsaturated fatty acids) levels during infancy and conjugated linoleic acid after infancy were associated with a lower risk of islet autoimmunity. Furthermore, higher levels of some even-chain saturated (SFA) and monounsaturated fatty acids (MUFA) were associated with increased risk. Fatty acid status in early life may signal the risk for islet autoimmunity: n-3 fatty acids in particular may be protective, while increased levels of some SFAs and MUFAs may precede islet autoimmunity.


Subject(s)
Autoimmunity , Erythrocytes/metabolism , Fatty Acids/metabolism , Islets of Langerhans/immunology , Breast Feeding , Case-Control Studies , Child , Child, Preschool , Female , Humans , Infant , Male , Risk Factors
6.
J Comput Biol ; 28(6): 527-559, 2021 06.
Article in English | MEDLINE | ID: mdl-33395537

ABSTRACT

Quantitative genetics has evolved dramatically in the past century, and the proliferation of genetic data, in quantity as well as type, enables the characterization of complex interactions and mechanisms beyond the scope of its theoretical foundations. In this article, we argue that revisiting the framework for analysis is important and we begin to lay the foundations of an alternative formulation of quantitative genetics based on information theory. Information theory can provide sensitive and unbiased measures of statistical dependencies among variables, and it provides a natural mathematical language for an alternative view of quantitative genetics. In the previous work, we examined the information content of discrete functions and applied this approach and methods to the analysis of genetic data. In this article, we present a framework built around a set of relationships that both unifies the information measures for the discrete functions and uses them to express key quantitative genetic relationships. Information theory measures of variable interdependency are used to identify significant interactions, and a general approach is described for inferring functional relationships in genotype and phenotype data. We present information-based measures of the genetic quantities: penetrance, heritability, and degrees of statistical epistasis. Our scope here includes the consideration of both two- and three-variable dependencies and independently segregating variants, which captures additive effects, genetic interactions, and two-phenotype pleiotropy. This formalism and the theoretical approach naturally apply to higher multivariable interactions and complex dependencies, and can be adapted to account for population structure, linkage, and nonrandomly segregating markers. This article thus focuses on presenting the initial groundwork for a full formulation of quantitative genetics based on information theory.


Subject(s)
Information Theory , Models, Genetic , Databases, Genetic , Genome, Fungal , Genome-Wide Association Study/methods , Genomics/methods , Polymorphism, Single Nucleotide , Saccharomyces cerevisiae
7.
Entropy (Basel) ; 22(12)2020 Nov 24.
Article in English | MEDLINE | ID: mdl-33266517

ABSTRACT

Information theory provides robust measures of multivariable interdependence, but classically does little to characterize the multivariable relationships it detects. The Partial Information Decomposition (PID) characterizes the mutual information between variables by decomposing it into unique, redundant, and synergistic components. This has been usefully applied, particularly in neuroscience, but there is currently no generally accepted method for its computation. Independently, the Information Delta framework characterizes non-pairwise dependencies in genetic datasets. This framework has developed an intuitive geometric interpretation for how discrete functions encode information, but lacks some important generalizations. This paper shows that the PID and Delta frameworks are largely equivalent. We equate their key expressions, allowing for results in one framework to apply towards open questions in the other. For example, we find that the approach of Bertschinger et al. is useful for the open Information Delta question of how to deal with linkage disequilibrium. We also show how PID solutions can be mapped onto the space of delta measures. Using Bertschinger et al. as an example solution, we identify a specific plane in delta-space on which this approach's optimization is constrained, and compute it for all possible three-variable discrete functions of a three-letter alphabet. This yields a clear geometric picture of how a given solution decomposes information.
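
Note on entry 7: a small worked example may help show why pairwise measures alone miss some multivariable dependencies. The sketch below uses the textbook XOR function (Z = X xor Y with uniform inputs); it is an illustration chosen here, not taken from the paper, and the function names are hypothetical. The quantity it computes, I(X,Y;Z) - I(X;Z) - I(Y;Z), equals synergy minus redundancy in PID terms, so it is not itself a full decomposition.

```python
import math
from collections import Counter
from itertools import product


def entropy(samples):
    """Shannon entropy (in bits) of the empirical distribution of samples."""
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in Counter(samples).values())


def mutual_information(a, b):
    """I(A;B) = H(A) + H(B) - H(A,B), in bits."""
    return entropy(a) + entropy(b) - entropy(list(zip(a, b)))


def xor_example():
    """Return (I(X;Z), I(Y;Z), I(X,Y;Z)) for Z = X xor Y with uniform inputs."""
    rows = [(x, y, x ^ y) for x, y in product((0, 1), repeat=2)]
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    zs = [r[2] for r in rows]
    i_xz = mutual_information(xs, zs)
    i_yz = mutual_information(ys, zs)
    i_xyz = mutual_information(list(zip(xs, ys)), zs)
    return i_xz, i_yz, i_xyz


if __name__ == "__main__":
    i_xz, i_yz, i_xyz = xor_example()
    print("I(X;Z) =", i_xz)
    print("I(Y;Z) =", i_yz)
    print("I(X,Y;Z) =", i_xyz)
    print("synergy minus redundancy =", i_xyz - i_xz - i_yz)
```

Each input alone carries no information about Z, while the pair carries one full bit, which is the kind of non-pairwise dependency both frameworks are meant to characterize.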

8.
PLoS One ; 15(12): e0242684, 2020.
Article in English | MEDLINE | ID: mdl-33270668

ABSTRACT

The genetic mechanisms of childhood development in its many facets remain largely undeciphered. In the population of healthy infants studied in the Growing Up in Singapore Towards Healthy Outcomes (GUSTO) program, we have identified a range of dependencies among the observed phenotypes of fetal and early childhood growth, neurological development, and a number of genetic variants. We have quantified these dependencies using our information theory-based methods. The genetic variants show dependencies with single phenotypes as well as pleiotropic effects on more than one phenotype and thereby point to a large number of brain-specific and brain-expressed gene candidates. These dependencies provide a basis for connecting a range of variants with a spectrum of phenotypes (pleiotropy) as well as with each other. A broad survey of known regulatory expression characteristics, and other function-related information from the literature for these sets of candidate genes allowed us to assemble an integrated body of evidence, including a partial regulatory network, that points towards the biological basis of these general dependencies. Notable among the implicated loci are RAB11FIP4 (next to NF1), MTMR7 and PLD5, all highly expressed in the brain; DNMT1 (DNA methyl transferase), highly expressed in the placenta; and PPP1R12B and DMD (dystrophin), known to be important growth and development genes. While we cannot specify and decipher the mechanisms responsible for the phenotypes in this study, a number of connections for further investigation of fetal and early childhood growth and neurological development are indicated. These results and this approach open the door to new explorations of early human development.


Subject(s)
Child Development , Fetal Development/genetics , Nervous System/growth & development , Child , Chromatin/genetics , Epistasis, Genetic , Gene Expression Profiling , Gene Expression Regulation, Developmental , Gene Regulatory Networks , Genetic Loci , Genetic Pleiotropy , Genome-Wide Association Study , Genotype , Humans , Linkage Disequilibrium/genetics , Phenotype , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics
9.
Cell Rep ; 29(12): 4212-4222.e5, 2019 12 17.
Article in English | MEDLINE | ID: mdl-31851944

ABSTRACT

Given the increasing interest in their use as disease biomarkers, the establishment of reproducible, accurate, sensitive, and specific platforms for microRNA (miRNA) quantification in biofluids is of high priority. We compare four platforms for these characteristics: small RNA sequencing (RNA-seq), FirePlex, EdgeSeq, and nCounter. For a pool of synthetic miRNAs, coefficients of variation for technical replicates are lower for EdgeSeq (6.9%) and RNA-seq (8.2%) than for FirePlex (22.4%); nCounter replicates are not performed. Receiver operating characteristic analysis for distinguishing present versus absent miRNAs shows small RNA-seq (area under curve 0.99) is superior to EdgeSeq (0.97), nCounter (0.94), and FirePlex (0.81). Expected differences in expression of placenta-associated miRNAs in plasma from pregnant and non-pregnant women are observed with RNA-seq and EdgeSeq, but not FirePlex or nCounter. These results indicate that differences in performance among miRNA profiling platforms impact ability to detect biological differences among samples and thus their relative utility for research and clinical use.


Subject(s)
Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing/methods , MicroRNAs/blood , MicroRNAs/genetics , Placenta/metabolism , Sequence Analysis, RNA/methods , Adult , Female , Humans , Male , Middle Aged , Pregnancy , ROC Curve , Reproducibility of Results , Young Adult
10.
Front Comput Neurosci ; 13: 75, 2019.
Article in English | MEDLINE | ID: mdl-31736734

ABSTRACT

Resting state networks (RSNs) extracted from functional magnetic resonance imaging (fMRI) scans are believed to reflect the intrinsic organization and network structure of brain regions. Most traditional methods for computing RSNs typically assume these functional networks are static throughout the duration of a scan lasting 5-15 min. However, they are known to vary on timescales ranging from seconds to years; in addition, the dynamic properties of RSNs are affected in a wide variety of neurological disorders. Recently, there has been a proliferation of methods for characterizing RSN dynamics, yet it remains a challenge to extract reproducible time-resolved networks. In this paper, we develop a novel method based on dynamic mode decomposition (DMD) to extract networks from short windows of noisy, high-dimensional fMRI data, allowing RSNs from single scans to be resolved robustly at a temporal resolution of seconds. After validating the method on a synthetic dataset, we analyze data from 120 individuals from the Human Connectome Project and show that unsupervised clustering of DMD modes discovers RSNs at both the group (gDMD) and the single subject (sDMD) levels. The gDMD modes closely resemble canonical RSNs. Compared to established methods, sDMD modes capture individualized RSN structure that both better resembles the population RSN and better captures subject-level variation. We further leverage this time-resolved sDMD analysis to infer occupancy and transitions among RSNs with high reproducibility. This automated DMD-based method is a powerful tool to characterize spatial and temporal structures of RSNs in individual subjects.

11.
EMBO J ; 38(11)2019 06 03.
Article in English | MEDLINE | ID: mdl-31053596

ABSTRACT

Extracellular RNAs (exRNAs) in biofluids have attracted great interest as potential biomarkers. Although extracellular microRNAs in blood plasma are extensively characterized, extracellular messenger RNA (mRNA) and long non-coding RNA (lncRNA) studies are limited. We report that plasma contains fragmented mRNAs and lncRNAs that are missed by standard small RNA-seq protocols due to lack of 5' phosphate or presence of 3' phosphate. These fragments were revealed using a modified protocol ("phospho-RNA-seq") incorporating RNA treatment with T4-polynucleotide kinase, which we compared with standard small RNA-seq for sequencing synthetic RNAs with varied 5' and 3' ends, as well as human plasma exRNA. Analyzing phospho-RNA-seq data using a custom, high-stringency bioinformatic pipeline, we identified mRNA/lncRNA transcriptome fingerprints in plasma, including tissue-specific gene sets. In a longitudinal study of hematopoietic stem cell transplant patients, bone marrow- and liver-enriched exRNA genes were tracked with bone marrow recovery and liver injury, respectively, providing proof-of-concept validation as a biomarker approach. By enabling access to an unexplored realm of mRNA and lncRNA fragments, phospho-RNA-seq opens up new possibilities for plasma transcriptomic biomarker development.


Subject(s)
Biomarkers/blood , Cell-Free Nucleic Acids/analysis , MicroRNAs/blood , RNA, Long Noncoding/analysis , RNA, Messenger/analysis , RNA-Seq/methods , Biomarkers/analysis , Blood Chemical Analysis/methods , Cell-Free Nucleic Acids/blood , Computational Biology/methods , Gene Expression Profiling/methods , Humans , MicroRNAs/analysis , RNA, Long Noncoding/blood , RNA, Messenger/blood , Sequence Analysis, RNA/methods
12.
J Clin Med ; 8(5)2019 May 07.
Article in English | MEDLINE | ID: mdl-31067715

ABSTRACT

Recently, microRNAs (miRNAs) in circulating extracellular vesicles (EVs) have emerged as a source of potential biomarkers for various pathophysiological conditions, including metabolic disorders such as diabetes. Type 2 diabetes mellitus (T2DM) is the most prevalent form of diabetes in the USA, with 30 million diagnosed patients. Identifying miRNA biomarkers that can be used to assess response to glucose-lowering treatments would be useful. Using patient plasma samples from a subset of the Danish Metagenomics of the Human Intestinal Tract (MetaHIT) cohort, we characterized miRNAs from whole plasma, plasma-derived EVs, and EV-depleted plasma by small RNA-sequencing to identify T2DM-associated miRNAs. We identified several miRNAs that exhibited concentration changes between controls and non-metformin-treated T2DM patients, and we validated a subset of these by quantitative reverse transcription-polymerase chain reaction (qRT-PCR). The results showed that the concentrations of many T2DM-affected miRNAs in EVs (but not in whole or EV-depleted plasma) decreased to levels close to those of healthy controls following metformin treatment. Among other potential uses of these differentially expressed miRNAs, some might be useful in assessing the response to metformin in T2DM patients.

13.
G3 (Bethesda) ; 9(7): 2071-2088, 2019 07 09.
Article in English | MEDLINE | ID: mdl-31109921

ABSTRACT

We describe an information-theory-based method and associated software for computationally identifying sister spores derived from the same meiotic tetrad. The method exploits specific DNA sequence features of tetrads that result from meiotic centromere and allele segregation patterns. Because the method uses only the genomic sequence, it alleviates the need for tetrad-specific barcodes or other genetic modifications to the strains. Using this method, strains derived from randomly arrayed spores can be efficiently grouped back into tetrads.


Subject(s)
Computational Biology/methods , Software , Yeasts/physiology , Alleles , Chromosome Segregation , Gene Expression Regulation, Fungal , Meiosis , Recombination, Genetic , Reproducibility of Results , Spores, Fungal
14.
Cell ; 177(2): 231-242, 2019 04 04.
Article in English | MEDLINE | ID: mdl-30951667

ABSTRACT

The Extracellular RNA Communication Consortium (ERCC) was launched to accelerate progress in the new field of extracellular RNA (exRNA) biology and to establish whether exRNAs and their carriers, including extracellular vesicles (EVs), can mediate intercellular communication and be utilized for clinical applications. Phase 1 of the ERCC focused on exRNA/EV biogenesis and function, discovery of exRNA biomarkers, development of exRNA/EV-based therapeutics, and construction of a robust set of reference exRNA profiles for a variety of biofluids. Here, we present progress by ERCC investigators in these areas, and we discuss collaborative projects directed at development of robust methods for EV/exRNA isolation and analysis and tools for sharing and computational analysis of exRNA profiling data.


Subject(s)
Cell-Free Nucleic Acids/genetics , Cell-Free Nucleic Acids/metabolism , Extracellular Vesicles/genetics , Biomarkers , Humans , Knowledge Bases , MicroRNAs/genetics , RNA/genetics
15.
Cell ; 177(2): 463-477.e15, 2019 04 04.
Article in English | MEDLINE | ID: mdl-30951672

ABSTRACT

To develop a map of cell-cell communication mediated by extracellular RNA (exRNA), the NIH Extracellular RNA Communication Consortium created the exRNA Atlas resource (https://exrna-atlas.org). The Atlas version 4P1 hosts 5,309 exRNA-seq and exRNA qPCR profiles from 19 studies and a suite of analysis and visualization tools. To analyze variation between profiles, we apply computational deconvolution. The analysis leads to a model with six exRNA cargo types (CT1, CT2, CT3A, CT3B, CT3C, CT4), each detectable in multiple biofluids (serum, plasma, CSF, saliva, urine). Five of the cargo types associate with known vesicular and non-vesicular (lipoprotein and ribonucleoprotein) exRNA carriers. To validate utility of this model, we re-analyze an exercise response study by deconvolution to identify physiologically relevant response pathways that were not detected previously. To enable wide application of this model, as part of the exRNA Atlas resource, we provide tools for deconvolution and analysis of user-provided case-control studies.


Subject(s)
Cell Communication/physiology , RNA/metabolism , Adult , Body Fluids/chemistry , Cell-Free Nucleic Acids/metabolism , Circulating MicroRNA/metabolism , Extracellular Vesicles/metabolism , Female , Humans , Male , Reproducibility of Results , Sequence Analysis, RNA/methods , Software
16.
Entropy (Basel) ; 21(1)2019 Jan 18.
Article in English | MEDLINE | ID: mdl-33266804

ABSTRACT

Relations between common information measures include the duality relations based on Möbius inversion on lattices, which are the direct consequence of the symmetries of the lattices of the sets of variables (subsets ordered by inclusion). In this paper we use the lattice and functional symmetries to provide a unifying formalism that reveals some new relations and systematizes the symmetries of the information functions. To our knowledge, this is the first systematic examination of the full range of relationships of this class of functions. We define operators on functions on these lattices based on the Möbius inversions that map functions into one another, which we call Möbius operators, and show that they form a simple group isomorphic to the symmetric group S3. Relations among the set of functions on the lattice are transparently expressed in terms of the operator algebra, and, when applied to the information measures, can be used to derive a wide range of relationships among diverse information measures. The Möbius operator algebra is then naturally generalized, which yields an even wider range of new relationships.
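
Note on entry 16: for readers unfamiliar with Möbius inversion on the lattice of subsets ordered by inclusion, the standard pair of relations is f(S) = sum of g(T) over all T contained in S, and g(S) = sum over all T contained in S of (-1)^(|S|-|T|) f(T). The sketch below implements only this generic inversion pair; it is not the operator algebra defined in the paper, and the function names are hypothetical.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of items as a frozenset, smallest subsets first."""
    items = tuple(items)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def zeta_transform(g, universe):
    """f(S) = sum of g(T) over all subsets T of S, for every S in the lattice."""
    return {s: sum(g[t] for t in subsets(s)) for s in subsets(universe)}


def mobius_inversion(f, universe):
    """g(S) = sum over subsets T of S of (-1)^(|S| - |T|) * f(T)."""
    return {
        s: sum((-1) ** (len(s) - len(t)) * f[t] for t in subsets(s))
        for s in subsets(universe)
    }


if __name__ == "__main__":
    universe = "abc"
    # Arbitrary integer-valued function on the lattice (illustrative only).
    g = {s: len(s) ** 2 + 1 for s in subsets(universe)}
    f = zeta_transform(g, universe)
    recovered = mobius_inversion(f, universe)
    print(recovered == g)
```

Applying the inversion to the summed function recovers the original one, which is the duality that relations among joint entropies and interaction-type measures rely on.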

17.
J Comput Biol ; 26(2): 152-171, 2019 02.
Article in English | MEDLINE | ID: mdl-30495984

ABSTRACT

Missing values in complex biological data sets have significant impacts on our ability to correctly detect and quantify interactions in biological systems and to infer relationships accurately. In this article, we propose a useful metaphor to show that information theory measures, such as mutual information and interaction information, can be employed directly for evaluating multivariable dependencies even if data contain some missing values. The metaphor is that of thinking of variable dependencies as information channels between and among variables. In this view, missing data can be thought of as noise that reduces the channel capacity in predictable ways. We extract the available information in the data even if there are missing values and use the notion of channel capacity to assess the reliability of the result. This avoids the common practice (in the absence of prior knowledge of random imputation) of eliminating samples entirely, thus losing the information they can provide. We show how this reliability function can be implemented for pairs of variables, and generalize it for an arbitrary number of variables. Illustrations of the reliability functions for several cases are provided using simulated data.


Subject(s)
Databases, Genetic/standards , Information Theory , Multivariate Analysis , Sequence Analysis, DNA/methods , Animals , Data Accuracy , Humans , Reproducibility of Results , Sequence Analysis, DNA/standards
18.
Front Microbiol ; 9: 2015, 2018.
Article in English | MEDLINE | ID: mdl-30214435

ABSTRACT

Bacterial outer membrane vesicles (OMVs), as well as OMV-associated small RNAs, have been demonstrated to play a role in host-pathogen interactions. The presence of larger RNA transcripts in OMVs has been less studied and their potential role in host-pathogen interactions remains largely unknown. Here we analyze RNA from OMVs secreted by Salmonella enterica serovar Typhimurium (S. Typhimurium) cultured under different conditions, which mimic host-pathogen interactions. S. Typhimurium was grown to exponential and stationary growth phases in minimal growth control medium (phosphate-carbon-nitrogen, PCN), as well as in acidic and phosphate-depleted PCN, comparable to the macrophage environment and inducing therefore the expression of Salmonella pathogenicity island 2 (SPI-2) genes. Moreover, Salmonella pathogenicity island 1 (SPI-1), which is required for virulence during the intestinal phase of infection, was induced by culturing S. Typhimurium to the stationary phase in Lysogeny Broth (LB). For each condition, we identified OMV-associated RNAs that are enriched in the extracellular environment relative to the intracellular space. All RNA classes could be observed, but a vast majority of rRNA was exported in all conditions in variable proportions with a notable decrease in LB SPI-1 inducing media. Several mRNAs and ncRNAs were specifically enriched in/on OMVs dependent on the growth conditions. Important to note is that some RNAs showed identical read coverage profiles intracellularly and extracellularly, whereas distinct coverage patterns were observed for other transcripts, suggesting a specific processing or degradation. Moreover, PCR experiments confirmed that distinct RNAs were present in or on OMVs as full-length transcripts (IsrB-1/2; IsrA; ffs; SsrS; CsrC; pSLT035; 10Sa; rnpB; STM0277; sseB; STM0972; STM2606), whereas others seemed to be rather present in a processed or degraded form. Finally, we show by a digestion protection assay that OMVs are able to prevent enzymatic degradation of given full-length transcripts (SsrS, CsrC, 10Sa, and rnpB). In summary, we show that OMV-associated RNA is clearly different in distinct culture conditions and that at least a fraction of the extracellular RNA is associated as full-length transcripts with OMVs, indicating that some RNAs are protected by OMVs and thereby leaving open the possibility that those might be functionally active.

20.
Nat Biotechnol ; 36(8): 746-757, 2018 09.
Article in English | MEDLINE | ID: mdl-30010675

ABSTRACT

RNA-seq is increasingly used for quantitative profiling of small RNAs (for example, microRNAs, piRNAs and snoRNAs) in diverse sample types, including isolated cells, tissues and cell-free biofluids. The accuracy and reproducibility of the currently used small RNA-seq library preparation methods have not been systematically tested. Here we report results obtained by a consortium of nine labs that independently sequenced reference, 'ground truth' samples of synthetic small RNAs and human plasma-derived RNA. We assessed three commercially available library preparation methods that use adapters of defined sequence and six methods using adapters with degenerate bases. Both protocol- and sequence-specific biases were identified, including biases that reduced the ability of small RNA-seq to accurately measure adenosine-to-inosine editing in microRNAs. We found that these biases were mitigated by library preparation methods that incorporate adapters with degenerate bases. MicroRNA relative quantification between samples using small RNA-seq was accurate and reproducible across laboratories and methods.


Subject(s)
MicroRNAs/genetics , Sequence Analysis, RNA/methods , Adenosine/genetics , Humans , Inosine/genetics , MicroRNAs/blood , MicroRNAs/standards , RNA Editing , Reference Standards , Reproducibility of Results