Results 1 - 7 of 7
2.
Bioinformatics ; 28(1): 136-7, 2012 Jan 01.
Article in English | MEDLINE | ID: mdl-22072385

ABSTRACT

SUMMARY: MR-Tandem adapts the popular X!Tandem peptide search engine to work with Hadoop MapReduce for reliable parallel execution of large searches. MR-Tandem runs on any Hadoop cluster but offers special support for Amazon Web Services for creating inexpensive on-demand Hadoop clusters, enabling search volumes that might not otherwise be feasible with the compute resources a researcher has at hand. MR-Tandem is designed to drop in wherever X!Tandem is already in use and requires no modification to existing X!Tandem parameter files, and only minimal modification to X!Tandem-based workflows. AVAILABILITY AND IMPLEMENTATION: MR-Tandem is implemented as a lightly modified X!Tandem C++ executable and a Python script that drives Hadoop clusters including Amazon Web Services (AWS) Elastic MapReduce (EMR), using the modified X!Tandem program as a Hadoop Streaming mapper and reducer. The modified X!Tandem C++ source code is Artistic licensed, supports pluggable scoring, and is available as part of the Sashimi project at http://sashimi.svn.sourceforge.net/viewvc/sashimi/trunk/trans_proteomic_pipeline/extern/xtandem/. The MR-Tandem Python script is Apache licensed and available as part of the Insilicos Cloud Army project at http://ica.svn.sourceforge.net/viewvc/ica/trunk/mr-tandem/. Full documentation and a Windows installer that configures MR-Tandem, Python and all necessary packages are available at this same URL. CONTACT: brian.pratt@insilicos.com
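
The Hadoop Streaming contract referred to in this abstract is simple: the mapper and reducer are ordinary programs that read lines from standard input and write tab-separated key/value lines to standard output, and the reducer receives its input sorted by key. The following is a minimal Python sketch of that contract only. It is not MR-Tandem's code; the record layout (`spectrum_id<TAB>score`) and the function names are hypothetical.

```python
from itertools import groupby


def map_lines(lines):
    """Mapper: emit 'key<TAB>value' for each well-formed input line."""
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 2:
            continue  # skip malformed records
        spectrum_id, score = parts
        yield f"{spectrum_id}\t{score}"


def reduce_lines(lines):
    """Reducer: input must already be sorted by key; keep the best score per key.

    Assumes every line is well-formed mapper output (exactly one tab).
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        best = max(float(value) for _, value in group)
        yield f"{key}\t{best}"
```

In a real streaming job each function would be fed `sys.stdin` and its output printed line by line; the framework performs the sort between the two stages.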


Subject(s)
Protein Processing, Post-Translational , Proteins/analysis , Proteins/metabolism , Search Engine , Software , Cluster Analysis , Programming Languages , Software/economics
3.
Mol Cell Proteomics ; 10(12): M111.007690, 2011 Dec.
Article in English | MEDLINE | ID: mdl-21876204

ABSTRACT

The combination of tandem mass spectrometry and sequence database searching is the method of choice for the identification of peptides and the mapping of proteomes. Over the last several years, the volume of data generated in proteomic studies has increased dramatically, which challenges the computational approaches previously developed for these data. Furthermore, a multitude of search engines have been developed that identify different, overlapping subsets of the sample peptides from a particular set of tandem mass spectrometry spectra. We present iProphet, the new addition to the widely used open-source suite of proteomic data analysis tools, the Trans-Proteomic Pipeline. Applied in tandem with PeptideProphet, it provides a more accurate representation of the multilevel nature of shotgun proteomic data. iProphet combines the evidence from multiple identifications of the same peptide sequences across different spectra, experiments, precursor ion charge states, and modified states. It also allows accurate and effective integration of the results from multiple database search engines applied to the same data. The use of iProphet in the Trans-Proteomic Pipeline increases the number of correctly identified peptides at a constant false discovery rate as compared with both PeptideProphet and another state-of-the-art tool, Percolator. As the main outcome, iProphet permits the calculation of accurate posterior probabilities and false discovery rate estimates at the level of sequence-identical peptide identifications, which in turn leads to more accurate probability estimates at the protein level. Fully integrated with the Trans-Proteomic Pipeline, it supports all commonly used MS instruments, search engines, and computer platforms.
The performance of iProphet is demonstrated on two publicly available data sets: data from a human whole cell lysate proteome profiling experiment representative of typical proteomic data sets, and from a set of Streptococcus pyogenes experiments more representative of organism-specific composite data sets.
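
A common way to derive a false discovery rate estimate from posterior probabilities, which is the general idea behind the estimates mentioned in this abstract, is to average the error probabilities (1 - p) of all identifications accepted at a given threshold. The sketch below illustrates that calculation only. It is not iProphet's implementation, and the function name is hypothetical.

```python
def estimated_fdr(probabilities, threshold):
    """Estimate the FDR among identifications with probability >= threshold.

    An accepted identification with posterior probability p is expected to be
    wrong with probability (1 - p), so the expected fraction of false
    identifications is the mean of (1 - p) over the accepted set.
    """
    accepted = [p for p in probabilities if p >= threshold]
    if not accepted:
        return 0.0  # nothing accepted, so no false discoveries
    return sum(1.0 - p for p in accepted) / len(accepted)
```

Lowering the threshold accepts more identifications but raises the estimate, which is why comparisons between tools are made at a fixed false discovery rate.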


Subject(s)
Data Interpretation, Statistical , Peptide Fragments/chemistry , Proteome/chemistry , Software , Algorithms , Amino Acid Sequence , Humans , Jurkat Cells , Probability , Proteomics , Search Engine , Streptococcus pyogenes , Tandem Mass Spectrometry
4.
Mol Cell Proteomics ; 10(1): R110.000133, 2011 Jan.
Article in English | MEDLINE | ID: mdl-20716697

ABSTRACT

Mass spectrometry is a fundamental tool for discovery and analysis in the life sciences. With the rapid advances in mass spectrometry technology and methods, it has become imperative to provide a standard output format for mass spectrometry data that will facilitate data sharing and analysis. Initially, the efforts to develop a standard format for mass spectrometry data resulted in multiple formats, each designed with a different underlying philosophy. To resolve the issues associated with having multiple formats, vendors, researchers, and software developers convened under the banner of the HUPO PSI to develop a single standard. The new data format incorporated many of the desirable technical attributes from the previous data formats, while adding a number of improvements, including features such as a controlled vocabulary with validation tools to ensure consistent usage of the format, improved support for selected reaction monitoring data, and immediately available implementations to facilitate rapid adoption by the community. The resulting standard data format, mzML, is a well-tested open-source format for mass spectrometer output files that can be readily utilized by the community and easily adapted for incremental advances in mass spectrometry technology.
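
mzML is an XML format in which most metadata is expressed as `cvParam` elements that point to controlled-vocabulary accessions. The sketch below reads such elements with Python's standard library. The XML fragment is hand-written and heavily simplified for illustration, not a schema-valid mzML file, and the accessions shown should be checked against the current PSI-MS controlled vocabulary before being relied on.

```python
import xml.etree.ElementTree as ET

# Hand-written, heavily simplified fragment for illustration only;
# a real mzML file has many more required elements and attributes.
FRAGMENT = """
<spectrum xmlns="http://psi.hupo.org/ms/mzml" id="scan=1" index="0">
  <cvParam cvRef="MS" accession="MS:1000511" name="ms level" value="2"/>
  <cvParam cvRef="MS" accession="MS:1000127" name="centroid spectrum" value=""/>
</spectrum>
"""

NS = {"mz": "http://psi.hupo.org/ms/mzml"}


def cv_params(xml_text):
    """Return {accession: (name, value)} for cvParam children of the root element."""
    root = ET.fromstring(xml_text)
    return {
        p.get("accession"): (p.get("name"), p.get("value"))
        for p in root.findall("mz:cvParam", NS)
    }
```

Keying on the accession rather than the human-readable name is the point of a controlled vocabulary: names can be reworded, while accessions stay stable.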


Subject(s)
Databases, Protein/standards , Mass Spectrometry/methods , Mass Spectrometry/standards , Software/standards , Reference Standards , Reproducibility of Results
5.
Proteomics ; 10(6): 1150-9, 2010 Mar.
Article in English | MEDLINE | ID: mdl-20101611

ABSTRACT

The Trans-Proteomic Pipeline (TPP) is a suite of software tools for the analysis of MS/MS data sets. The tools encompass most of the steps in a proteomic data analysis workflow in a single, integrated software system. Specifically, the TPP supports all steps from spectrometer output file conversion to protein-level statistical validation, including quantification by stable isotope ratios. We describe here the full workflow of the TPP and the tools therein, along with an example on a sample data set, demonstrating that the setup and use of the tools are straightforward and well supported and do not require specialized informatic resources or knowledge.


Subject(s)
Databases, Protein , Proteomics/methods , Software , Computational Biology , Information Storage and Retrieval , Isotope Labeling , Sequence Analysis, Protein , Tandem Mass Spectrometry
6.
Proteomics ; 10(6): 1190-5, 2010 Mar.
Article in English | MEDLINE | ID: mdl-20082347

ABSTRACT

Electron transfer dissociation (ETD) is an alternative fragmentation technique to CID that has recently become commercially available. ETD has several advantages over CID. It is less prone to fragmenting amino acid side chains, especially those that are modified, thus yielding fragment ion spectra with more uniform peak intensities. Further, precursor ions of longer peptides and higher charge states can be fragmented and identified. However, analysis of ETD spectra has a few important differences that require the optimization of the software packages used for the analysis of CID data or the development of specialized tools. We have adapted the Trans-Proteomic Pipeline to process ETD data. Specifically, we have added support for fragment ion spectra from high-charge precursors, compatibility with charge-state estimation algorithms, provisions for the use of the Lys-C protease, capabilities for ETD spectrum library building, and updates to the data formats to differentiate CID and ETD spectra. We show the results of processing data sets from several different types of ETD instruments and demonstrate that application of the ETD-enhanced Trans-Proteomic Pipeline can increase the number of spectrum identifications at a fixed false discovery rate by as much as 100% over native output from a single sequence search engine.
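
When the precursor charge is not known in advance, an analysis has to consider several candidate charge states, each implying a different neutral peptide mass. The sketch below shows only that arithmetic for positive ions, M = (m/z - m_proton) x z. The function names and the default charge range are hypothetical and are not taken from the Trans-Proteomic Pipeline.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton (rounded)


def neutral_mass(mz, charge):
    """Neutral (uncharged) mass of a positive ion observed at m/z with the given charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (mz - PROTON_MASS) * charge


def candidate_masses(mz, charges=range(2, 7)):
    """Neutral mass for each candidate charge when the true charge is unknown."""
    return {z: neutral_mass(mz, z) for z in charges}
```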


Subject(s)
Computational Biology/methods , Peptides/analysis , Proteomics/methods , Software , Tandem Mass Spectrometry/methods
7.
J Proteome Res ; 8(10): 4396-405, 2009 Oct.
Article in English | MEDLINE | ID: mdl-19603829

ABSTRACT

Multiple reaction monitoring mass spectrometry (MRM-MS) is a targeted analysis method that has been increasingly viewed as an avenue to explore proteomes with unprecedented sensitivity and throughput. We have developed a software tool, called MaRiMba, to automate the creation of explicitly defined MRM transition lists required to program triple quadrupole mass spectrometers in such analyses. MaRiMba creates MRM transition lists from downloaded or custom-built spectral libraries, restricts output to specified proteins or peptides, and filters based on precursor peptide and product ion properties. MaRiMba can also create MRM lists containing corresponding transitions for isotopically heavy peptides, for which the precursor and product ions are adjusted according to user specifications. This open-source application is operated through a graphical user interface incorporated into the Trans-Proteomic Pipeline, and it outputs the final MRM list to a text file for upload to MS instruments. To illustrate the use of MaRiMba, we used the tool to design and execute an MRM-MS experiment in which we targeted the proteins of a well-defined and previously published standard mixture.
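
The filtering and heavy-peptide steps described in this abstract can be illustrated with a small sketch. It assumes a label on the C-terminal residue, so the precursor and the y ions shift by the label mass divided by their charge while b ions are unchanged. The `Transition` class, the function names, and all numeric values are hypothetical; this is not MaRiMba's code.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Transition:
    peptide: str
    precursor_mz: float
    precursor_charge: int
    product_mz: float
    product_charge: int
    ion_type: str  # "y" or "b"


def filter_transitions(transitions, min_product_mz, ion_types=("y",)):
    """Keep transitions whose product ion passes simple property filters."""
    return [
        t for t in transitions
        if t.product_mz >= min_product_mz and t.ion_type in ion_types
    ]


def heavy_transition(t, label_mass_shift):
    """Return the heavy counterpart of t for a label on the C-terminal residue.

    The precursor always carries the label. y ions contain the C-terminus, so
    they shift too; b ions do not contain it and keep their m/z.
    """
    product_shift = label_mass_shift / t.product_charge if t.ion_type == "y" else 0.0
    return replace(
        t,
        precursor_mz=t.precursor_mz + label_mass_shift / t.precursor_charge,
        product_mz=t.product_mz + product_shift,
    )
```

For example, with a made-up mass shift of 8.0 Da, a doubly charged precursor moves by 4.0 m/z and a singly charged y ion by 8.0 m/z.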


Subject(s)
Databases, Protein , Mass Spectrometry/methods , Proteomics/methods , User-Computer Interface , Animals , Lung/chemistry , Male , Mice , Mice, Inbred BALB C , Peptides/chemistry , Proteins/chemistry , Reproducibility of Results , Systems Biology