Results 1 - 12 of 12
1.
Insects ; 13(7), 2022 Jul 11.
Article in English | MEDLINE | ID: mdl-35886794

ABSTRACT

We provide here an updated description of the REDfly (Regulatory Element Database for Fly) database of transcriptional regulatory elements, a unique resource that provides regulatory annotation for the genome of Drosophila and other insects. The genomic sequences regulating insect gene expression, namely transcriptional cis-regulatory modules (CRMs, e.g., "enhancers") and transcription factor binding sites (TFBSs), are not currently curated by any other major database resources. However, knowledge of such sequences is important, as CRMs play critical roles with respect to disease as well as normal development, phenotypic variation, and evolution. Characterized CRMs also provide useful tools for both basic and applied research, including developing methods for insect control. REDfly, which is the most detailed existing platform for metazoan regulatory-element annotation, includes over 40,000 experimentally verified CRMs and TFBSs along with their DNA sequences, their associated genes, and the expression patterns they direct. Here, we briefly describe REDfly's contents and data model, with an emphasis on the new features implemented since 2020. We then provide an illustrated walk-through of several common REDfly search use cases.

2.
PLoS One ; 13(6): e0198883, 2018.
Article in English | MEDLINE | ID: mdl-29924841

ABSTRACT

The Machine Recognition of Crystallization Outcomes (MARCO) initiative has assembled roughly half a million annotated images of macromolecular crystallization experiments from various sources and setups. Here, state-of-the-art machine learning algorithms are trained and tested on different parts of this data set. We find that more than 94% of the test images can be correctly labeled, irrespective of their experimental origin. Because crystal recognition is key to high-density screening and the systematic analysis of crystallization experiments, this approach opens the door to both industrial and fundamental research applications.


Subject(s)
Crystallization , Crystallography, X-Ray , Image Processing, Computer-Assisted , Neural Networks, Computer , Algorithms , Datasets as Topic
3.
Nucleic Acids Res ; 45(13): 7965-7983, 2017 Jul 27.
Article in English | MEDLINE | ID: mdl-28535252

ABSTRACT

Uridine insertion/deletion RNA editing is an essential process in kinetoplastid parasites whereby mitochondrial mRNAs are modified through the specific insertion and deletion of uridines to generate functional open reading frames, many of which encode components of the mitochondrial respiratory chain. The roles of numerous non-enzymatic editing factors have remained opaque given the limitations of conventional methods to interrogate the order and mechanism by which editing progresses and thus the roles of individual proteins. Here, we examined whole populations of partially edited sequences using high-throughput sequencing and a novel bioinformatic platform, the Trypanosome RNA Editing Alignment Tool (TREAT), to elucidate the roles of three proteins in the RNA Editing Mediator Complex (REMC). We determined that the factors examined function in the progression of editing through a gRNA; however, they have distinct roles and REMC is likely heterogeneous in composition. We provide the first evidence that editing can proceed through numerous paths within a single gRNA and that non-linear modifications are essential, generating commonly observed junction regions. Our data support a model in which RNA editing is executed via multiple paths that necessitate successive re-modification of junction regions facilitated, in part, by the REMC variant containing TbRGG2 and MRB8180.


Subject(s)
Protozoan Proteins/genetics , Protozoan Proteins/metabolism , RNA Editing/genetics , RNA, Guide, Kinetoplastida/genetics , RNA, Guide, Kinetoplastida/metabolism , RNA, Protozoan/genetics , RNA, Protozoan/metabolism , Trypanosoma brucei brucei/genetics , Trypanosoma brucei brucei/metabolism , Base Sequence , Cell Line , Models, Biological , RNA Interference , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism
4.
J Appl Crystallogr ; 49(Pt 6): 2082-2090, 2016 Dec 01.
Article in English | MEDLINE | ID: mdl-27980513

ABSTRACT

Haptic interfaces have become common in consumer electronics. They enable easy interaction and information entry without the use of a mouse or keyboard. The work presented here illustrates the application of a haptic interface to crystallization screening in order to provide a natural means for visualizing and selecting results. By linking this to a cloud-based database and web-based application program interface, the same application shifts the approach from 'point and click' to 'touch and share', where results can be selected, annotated and discussed collaboratively. In the crystallographic application, given a suitable crystallization plate, beamline and robotic end effector, the resulting information can be used to close the loop between screening and X-ray analysis, allowing a direct and efficient 'screen to beam' approach. The application is not limited to the area of crystallization screening; 'touch and share' can be used by any information-rich scientific analysis and geographically distributed collaboration.

5.
RNA ; 22(5): 677-95, 2016 May.
Article in English | MEDLINE | ID: mdl-26908922

ABSTRACT

Uridine insertion/deletion RNA editing in kinetoplastids entails the addition and deletion of uridine residues throughout the length of mitochondrial transcripts to generate translatable mRNAs. This complex process requires the coordinated use of several multiprotein complexes as well as the sequential use of noncoding template RNAs called guide RNAs. The majority of steady-state mitochondrial mRNAs are partially edited and often contain regions of mis-editing, termed junctions, whose role is unclear. Here, we report a novel method for sequencing entire populations of pre-edited, partially edited, and fully edited RNAs and analyzing editing characteristics across populations using a new bioinformatics tool, the Trypanosome RNA Editing Alignment Tool (TREAT). Using TREAT, we examined populations of two transcripts, RPS12 and ND7-5', in wild-type Trypanosoma brucei. We provide evidence that the majority of partially edited sequences contain junctions, that intrinsic pause sites arise during the progression of editing, and that the mechanisms that mediate pausing in the generation of canonical fully edited sequences are distinct from those that mediate the ends of junction regions. Furthermore, we identify alternatively edited sequences that constitute plausible alternative open reading frames and identify substantial variability in the 5' UTRs of both canonical and alternatively edited sequences. This work is the first to use high-throughput sequencing to examine full-length sequences of whole populations of partially edited transcripts. Our method is highly applicable to current questions in the RNA editing field, including defining mechanisms of action for editing factors and identifying potential alternatively edited sequences.


Subject(s)
High-Throughput Nucleotide Sequencing , RNA Editing , RNA, Messenger/genetics , Trypanosoma brucei brucei/genetics , Algorithms , Animals
6.
PLoS One ; 9(7): e101123, 2014.
Article in English | MEDLINE | ID: mdl-24988076

ABSTRACT

X-ray crystallography is the predominant method for obtaining atomic-scale information about biological macromolecules. Despite the success of the technique, obtaining well diffracting crystals still critically limits going from protein to structure. In practice, the crystallization process proceeds through knowledge-informed empiricism. Better physico-chemical understanding remains elusive because of the large number of variables involved, hence little guidance is available to systematically identify solution conditions that promote crystallization. To help determine relationships between macromolecular properties and their crystallization propensity, we have trained statistical models on samples for 182 proteins supplied by the Northeast Structural Genomics consortium. Gaussian processes, which capture trends beyond the reach of linear statistical models, distinguish between two main physico-chemical mechanisms driving crystallization. One is characterized by low levels of side chain entropy and has been extensively reported in the literature. The other identifies specific electrostatic interactions not previously described in the crystallization context. Because evidence for two distinct mechanisms can be gleaned both from crystal contacts and from solution conditions leading to successful crystallization, the model offers future avenues for optimizing crystallization screens based on partial structural information. The availability of crystallization data coupled with structural outcomes analyzed through state-of-the-art statistical models may thus guide macromolecular crystallization toward a more rational basis.


Subject(s)
Crystallography, X-Ray/methods , Databases, Protein , Models, Chemical , Proteins/chemistry
7.
PLoS One ; 9(6): e100782, 2014.
Article in English | MEDLINE | ID: mdl-24971458

ABSTRACT

Many bioscience fields employ high-throughput methods to screen multiple biochemical conditions. The analysis of these becomes tedious without a degree of automation. Crystallization, a rate-limiting step in biological X-ray crystallography, is one of these fields. Screening of multiple potential crystallization conditions (cocktails) is the most effective method of probing a protein's phase diagram and guiding crystallization, but the interpretation of results can be time-consuming. To aid this empirical approach, a cocktail distance coefficient was developed to quantitatively compare macromolecule crystallization conditions and outcome. These coefficients were evaluated against an existing similarity metric developed for crystallization, the C6 metric, using both virtual crystallization screens and a comparison of two related 1,536-cocktail high-throughput crystallization screens. Hierarchical clustering was employed to visualize one of these screens, and the crystallization results from an exopolyphosphatase-related protein from Bacteroides fragilis (BfR192) were overlaid on this clustering. This demonstrated a strong correlation between certain chemically related clusters and crystal lead conditions. While this analysis was not used to guide the initial crystallization optimization, it led to the re-evaluation of unexplained peaks in the electron density map of the protein and to the insertion and correct placement of sodium, potassium and phosphate atoms in the structure. With these in place, the resulting structure of the putative active site demonstrated features consistent with active sites of other phosphatases which are involved in binding the phosphoryl moieties of nucleotide triphosphates. The new distance coefficient, CDcoeff, appears to be robust in this application, and coupled with hierarchical clustering and the overlay of crystallization outcome, reveals information of biological relevance. While tested with a single example, the potential applications related to crystallography appear promising and the distance coefficient, clustering, and hierarchical visualization of results undoubtedly have applications in wider fields.


Subject(s)
Bacterial Proteins/chemistry , Macromolecular Substances/chemistry , Bacteroides fragilis/metabolism , Catalytic Domain , Cluster Analysis , Crystallization , Crystallography, X-Ray , Hydrogen-Ion Concentration , Models, Theoretical , Phosphates/chemistry , Polyethylene Glycols/chemistry , Potassium/chemistry , Sodium/chemistry
8.
Adv Bioinformatics ; 2013: 790567, 2013.
Article in English | MEDLINE | ID: mdl-24223587

ABSTRACT

Introduction. The microarray datasets from the MicroArray Quality Control (MAQC) project have enabled the assessment of the precision and comparability of microarrays and of various other microarray analysis methods. However, to date no studies that we are aware of have reported the performance of missing value imputation schemes on the MAQC datasets. In this study, we use the MAQC Affymetrix datasets to evaluate several imputation procedures in Affymetrix microarrays. Results. We evaluated several cutting-edge imputation procedures and compared them using different error measures. We randomly deleted 5% and 10% of the data and imputed the missing values using imputation tests. We performed 1000 simulations and averaged the results. The results for both 5% and 10% deletion are similar. Among the imputation methods, we observe that the local least squares method with k = 4 is most accurate under the error measures considered. The k-nearest neighbor method with k = 1 has the highest error rate among imputation methods and error measures. Conclusions. We conclude that, for imputing missing values in Affymetrix microarray datasets using the MAS 5.0 preprocessing scheme, the local least squares method with k = 4 has the best overall performance and the k-nearest neighbor method with k = 1 has the worst overall performance. These results hold true for both 5% and 10% missing values.
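
The delete-impute-score loop described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the row-based k-nearest-neighbor imputer is a simplified stand-in, RMSE is assumed as the error measure because the abstract does not name the ones used, and every function name here is hypothetical.

```python
import math
import random


def mask_entries(matrix, fraction, rng):
    """Copy `matrix`, hiding a random `fraction` of its cells as None."""
    cells = [(i, j) for i, row in enumerate(matrix) for j in range(len(row))]
    # Hide at least one cell so the error below is always defined.
    hidden = rng.sample(cells, max(1, round(fraction * len(cells))))
    masked = [row[:] for row in matrix]
    for i, j in hidden:
        masked[i][j] = None
    return masked, hidden


def row_distance(a, b):
    """RMS difference over columns observed in both rows (inf if none are shared)."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))


def knn_impute(masked, k):
    """Fill each hidden cell with the mean of that column over the k nearest rows."""
    filled = [row[:] for row in masked]
    for i, row in enumerate(masked):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = sorted(
                (row_distance(row, other), other[j])
                for r, other in enumerate(masked)
                if r != i and other[j] is not None
            )[:k]
            if donors:
                filled[i][j] = sum(v for _, v in donors) / len(donors)
            else:
                filled[i][j] = 0.0  # arbitrary fallback: no row observes column j
    return filled


def rmse(truth, filled, hidden):
    """Root-mean-square error over the cells that were hidden."""
    total = sum((truth[i][j] - filled[i][j]) ** 2 for i, j in hidden)
    return math.sqrt(total / len(hidden))


def average_error(matrix, fraction, k, runs, seed=0):
    """Repeat the delete-impute-score cycle `runs` times and average the error."""
    rng = random.Random(seed)
    errors = []
    for _ in range(runs):
        masked, hidden = mask_entries(matrix, fraction, rng)
        errors.append(rmse(matrix, knn_impute(masked, k), hidden))
    return sum(errors) / runs
```

Calling `average_error(expression_matrix, 0.05, 1, 1000)` and `average_error(expression_matrix, 0.10, 1, 1000)` would mirror the 5% and 10% deletion settings with 1000 simulations; `expression_matrix` is a placeholder for a complete list-of-lists of floats.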

9.
BMC Bioinformatics ; 14: 13, 2013 Jan 16.
Article in English | MEDLINE | ID: mdl-23323884

ABSTRACT

BACKGROUND: Gene fusions are the result of chromosomal aberrations and encode chimeric RNA (fusion transcripts) that play an important role in cancer genesis. Recent advances in high-throughput transcriptome sequencing have given rise to computational methods for new fusion discovery. The ability to simulate fusion transcripts is essential for testing and improving those tools. RESULTS: To address this need, we developed FUSIM (FUsion SIMulator), a software tool for simulating fusion transcripts. The simulation of events known to create fusion genes and their resulting chimeric proteins is supported, including inter-chromosome translocation, trans-splicing, complex chromosomal rearrangements, and transcriptional read-through events. CONCLUSIONS: FUSIM provides the ability to assemble a dataset of fusion transcripts useful for testing and benchmarking applications in fusion gene discovery.


Subject(s)
Gene Fusion , RNA/genetics , Software , Computer Simulation , Gene Expression Profiling , High-Throughput Nucleotide Sequencing , Humans , Mutant Chimeric Proteins/genetics , RNA/metabolism , Sequence Analysis, RNA
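
As a rough illustration of what simulating a fusion transcript involves, the sketch below joins the leading exons of one gene to the trailing exons of another at exon boundaries. It is not FUSIM's algorithm or interface; the function names, the exon-list representation, and the choice to break only at exon boundaries are assumptions made for the example.

```python
import random


def fuse_transcripts(exons_5p, exons_3p, keep_5p, start_3p):
    """Join the first `keep_5p` exons of the 5' gene to the exons of the
    3' gene from index `start_3p` onward, returning one sequence string."""
    if not 1 <= keep_5p <= len(exons_5p):
        raise ValueError("keep_5p must select at least one 5' exon")
    if not 0 <= start_3p < len(exons_3p):
        raise ValueError("start_3p must select at least one 3' exon")
    return "".join(exons_5p[:keep_5p]) + "".join(exons_3p[start_3p:])


def random_fusion(exons_5p, exons_3p, rng):
    """Pick both breakpoints at random and return them with the fused sequence."""
    keep_5p = rng.randint(1, len(exons_5p))
    start_3p = rng.randint(0, len(exons_3p) - 1)
    return keep_5p, start_3p, fuse_transcripts(exons_5p, exons_3p, keep_5p, start_3p)
```

Recording the chosen breakpoints alongside each simulated sequence gives a ground truth against which a fusion-discovery tool's calls could be compared.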
10.
BMC Genomics ; 13: 44, 2012 Jan 25.
Article in English | MEDLINE | ID: mdl-22276777

ABSTRACT

BACKGROUND: Single nucleotide polymorphisms (SNPs) can lead to the susceptibility and onset of diseases through their effects on gene expression at the posttranscriptional level. Recent findings indicate that SNPs could create, destroy, or modify the efficiency of miRNA binding to the 3'UTR of a gene, resulting in gene dysregulation. With the rapidly growing number of published disease-associated SNPs (dSNPs), there is a strong need for resources specifically recording dSNPs on the 3'UTRs and their nucleotide distance from miRNA target sites. We present here miRdSNP, a database incorporating three important areas: dSNPs, miRNA target sites, and diseases. DESCRIPTION: miRdSNP provides a unique database of dSNPs on the 3'UTRs of human genes manually curated from PubMed. The current release includes 786 dSNP-disease associations for 630 unique dSNPs and 204 disease types. miRdSNP annotates genes with experimentally confirmed targeting by miRNAs and indexes miRNA target sites predicted by TargetScan and PicTar as well as potential miRNA target sites newly generated by dSNPs. A robust web interface and search tools are provided for studying the proximity of miRNA binding sites to dSNPs in relation to human diseases. Searches can be dynamically filtered by gene name, miRBase ID, target prediction algorithm, disease, and any nucleotide distance between dSNPs and miRNA target sites. Results can be viewed at the sequence level showing the annotated locations for miRNA target sites and dSNPs on the entire 3'UTR sequences. The integration of dSNPs with the UCSC Genome browser is also supported. CONCLUSION: miRdSNP provides a comprehensive data source of dSNPs and robust tools for exploring their distance from miRNA target sites on the 3'UTRs of human genes. miRdSNP enables researchers to further explore the molecular mechanism of gene dysregulation for dSNPs at the posttranscriptional level. miRdSNP is freely available on the web at http://mirdsnp.ccr.buffalo.edu.


Subject(s)
Databases, Genetic , Disease/genetics , MicroRNAs/metabolism , Polymorphism, Single Nucleotide , 3' Untranslated Regions , Algorithms , Humans , Internet , Software
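
The distance filter mentioned in this abstract can be pictured with a small sketch. The abstract does not say how miRdSNP measures distance, so the convention below (zero when the SNP falls inside the site, otherwise the gap to the nearer site edge, in 1-based 3'UTR coordinates) is an assumption, as are the `TargetSite` type and both function names.

```python
from typing import List, NamedTuple


class TargetSite(NamedTuple):
    mirna: str  # miRNA label (hypothetical identifiers in the usage below)
    start: int  # first base of the site on the 3'UTR, 1-based inclusive
    end: int    # last base of the site on the 3'UTR, 1-based inclusive


def snp_site_distance(snp_pos: int, site: TargetSite) -> int:
    """Nucleotide gap between a SNP and a target site; 0 if the SNP is inside it."""
    if snp_pos < site.start:
        return site.start - snp_pos
    if snp_pos > site.end:
        return snp_pos - site.end
    return 0


def sites_within(snp_pos: int, sites: List[TargetSite], max_distance: int) -> List[TargetSite]:
    """Keep only the sites lying within `max_distance` nucleotides of the SNP."""
    return [s for s in sites if snp_site_distance(snp_pos, s) <= max_distance]
```

For example, with sites at 100-107 and 300-306, a SNP at position 120 lies 13 nucleotides from the first site, so `sites_within(120, sites, 20)` keeps only that one.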
11.
Comp Funct Genomics ; 2011: 910769, 2011.
Article in English | MEDLINE | ID: mdl-22110399

ABSTRACT

MicroRNAs (miRNAs) regulate gene expression posttranscriptionally. Although previous efforts have demonstrated the functional importance of target sites on miRNAs, little is known about the influence of the rest of the 3' untranslated regions (3'UTRs) of target genes on microRNA function. We conducted a genome-wide study and found that the entire 3'UTR sequences could also play important roles in miRNA function in addition to miRNA target sites. This was evidenced by the fact that human single nucleotide polymorphisms (SNPs) on both the seed target region and the rest of the 3'UTRs of miRNA target genes were under significantly stronger negative selection when compared to non-miRNA target genes. We also discovered that the flanking nucleotides on both sides of miRNA target sites were subject to moderately strong selection. A local sequence region of ~67 nucleotides with symmetric structure is herein defined. Additionally, from gene expression analysis, we found that SNPs and miRNA target sites on target sequences may interactively affect gene expression.

12.
Int J Bioinform Res Appl ; 6(6): 584-93, 2010.
Article in English | MEDLINE | ID: mdl-21354964

ABSTRACT

While the technologies for high dimensional data have been advancing, a lack of adequate visualisation tools to accommodate the results and an inability to integrate multiple sources of data have emerged. The move towards multi-disciplinary work and collaborative research underscores the need for visualisation and analysis tools that are platform independent and customisable. iGenomicViewer, through the use of customisable tool-tips that may include links and images, allows for a greater level of data integration for genomic data in a variety of formats. The iGenomicViewer is a freely available R software package which allows users to generate interactive, platform-independent plots of genomic data.


Subject(s)
Genome , Genomics/methods , Software , Computer Graphics , Databases, Genetic , User-Computer Interface