Search | VHL Regional Portal

SIRIUS 4: a rapid tool for turning tandem mass spectra into metabolite structure information.

Dührkop, Kai; Fleischauer, Markus; Ludwig, Marcus; Aksenov, Alexander A; Melnik, Alexey V; Meusel, Marvin; Dorrestein, Pieter C; Rousu, Juho; Böcker, Sebastian.

Nat Methods ; 16(4): 299-302, 2019 Apr.

Article in English | MEDLINE | ID: mdl-30886413

ABSTRACT

Mass spectrometry is a predominant experimental technique in metabolomics and related fields, but metabolite structural elucidation remains highly challenging. We report SIRIUS 4 (https://bio.informatik.uni-jena.de/sirius/), which provides a fast computational approach for molecular structure identification. SIRIUS 4 integrates CSI:FingerID for searching in molecular structure databases. Using SIRIUS 4, we achieved identification rates of more than 70% on challenging metabolomics datasets.

Subject(s)

Metabolomics/methods; Molecular Structure; Signal Processing, Computer-Assisted; Tandem Mass Spectrometry/methods; Algorithms; Bayes Theorem; Biomarkers; Cluster Analysis; Computational Biology/methods; Computer Graphics; Databases, Factual; Electronic Data Processing; Internet; Isotopes; Likelihood Functions; Metabolome; Neural Networks, Computer; Programming Languages; User-Computer Interface

Predicting the Presence of Uncommon Elements in Unknown Biomolecules from Isotope Patterns.

Meusel, Marvin; Hufsky, Franziska; Panter, Fabian; Krug, Daniel; Müller, Rolf; Böcker, Sebastian.

Anal Chem ; 88(15): 7556-66, 2016 Aug 02.

Article in English | MEDLINE | ID: mdl-27398867

ABSTRACT

The determination of the molecular formula is one of the earliest and most important steps when investigating the chemical nature of an unknown compound. Common approaches use the isotopic pattern of a compound measured using mass spectrometry. Computational methods to determine the molecular formula from this isotopic pattern require a fixed set of elements. Considering all possible elements severely increases running times and more importantly the chance for false positive identifications as the number of candidate formulas for a given target mass rises significantly if the constituting elements are not prefiltered. This negative effect grows stronger for compounds of higher molecular mass as the effect of a single atom on the overall isotopic pattern grows smaller. On the other hand, hand-selected restrictions on this set of elements may prevent the identification of the correct molecular formula. Thus, it is a crucial step to determine the set of elements most likely comprising the compound prior to the assignment of an elemental formula to an exact mass. In this paper, we present a method to determine the presence of certain elements (sulfur, chlorine, bromine, boron, and selenium) in the compound from its (high mass accuracy) isotopic pattern. We limit ourselves to biomolecules, in the sense of products from nature or synthetic products with potential bioactivity. The classifiers developed here predict the presence of an element with a very high sensitivity and high specificity. We evaluate classifiers on three real-world data sets with 663 isotope patterns in total: 184 isotope patterns containing sulfur, 187 containing chlorine, 14 containing bromine, one containing boron, one containing selenium. In no case do we make a false negative prediction; for chlorine, bromine, boron, and selenium, we make ten false positive predictions in total. 
We also demonstrate the impact of our method on the identification of molecular formulas, in particular on the number of considered candidates and running time. The element prediction will be part of the next SIRIUS release, available from https://bio.informatik.uni-jena.de/software/sirius/.
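The abstract's point that a larger element set inflates the number of candidate formulas can be illustrated with a small brute-force enumeration. This is only a sketch under stated assumptions, not the method of the paper or of SIRIUS: the function name `decompose`, the 5 ppm tolerance, and the absence of any chemical plausibility filter are all choices made for this illustration. The monoisotopic masses are standard values rounded to seven decimals.

```python
# Monoisotopic masses in Da (standard values, rounded).
MONO_MASS = {
    "C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146,
    "S": 31.9720707, "Cl": 34.9688527, "Br": 78.9183376,
}


def decompose(target_mass, elements, ppm=5.0):
    """Enumerate formulas over `elements` whose mass matches target_mass.

    Hypothetical helper for illustration: plain brute force with no
    valence or ratio filtering. "H" is assumed to be in `elements`;
    hydrogen fills whatever mass the heavier atoms leave over.
    """
    tol = target_mass * ppm * 1e-6
    heavy = [e for e in elements if e != "H"]
    found = []

    def recurse(index, remaining, counts):
        if index == len(heavy):
            # Closest hydrogen count for the remaining mass.
            h = round(remaining / MONO_MASS["H"])
            if h >= 0 and abs(remaining - h * MONO_MASS["H"]) <= tol:
                formula = dict(counts)
                if h:
                    formula["H"] = h
                found.append(formula)
            return
        element = heavy[index]
        n = 0
        while n * MONO_MASS[element] <= remaining + tol:
            if n:
                counts[element] = n
            recurse(index + 1, remaining - n * MONO_MASS[element], counts)
            n += 1
        counts.pop(element, None)  # reset before returning to the caller

    recurse(0, target_mass, {})
    return found


# Same target mass (glucose, C6H12O6), two different element sets.
basic = decompose(180.06339, ["C", "H", "N", "O"])
extended = decompose(180.06339, ["C", "H", "N", "O", "S", "Cl", "Br"])
print(len(basic), len(extended))
```

Every formula found with the small element set is also found with the large one, so the candidate list can only grow as elements are added. Predicting beforehand which elements are present keeps the search on the smaller list.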

Subject(s)

Chemical Phenomena , Elements , Isotopes/chemistry , Machine Learning , Algorithms , Datasets as Topic , Mass Spectrometry , Molecular Weight

Searching molecular structure databases with tandem mass spectra using CSI:FingerID.

Dührkop, Kai; Shen, Huibin; Meusel, Marvin; Rousu, Juho; Böcker, Sebastian.

Proc Natl Acad Sci U S A ; 112(41): 12580-5, 2015 Oct 13.

Article in English | MEDLINE | ID: mdl-26392543

ABSTRACT

Metabolites provide a direct functional signature of cellular state. Untargeted metabolomics experiments usually rely on tandem MS to identify the thousands of compounds in a biological sample. Today, the vast majority of metabolites remain unknown. We present a method for searching molecular structure databases using tandem MS data of small molecules. Our method computes a fragmentation tree that best explains the fragmentation spectrum of an unknown molecule. We use the fragmentation tree to predict the molecular structure fingerprint of the unknown compound using machine learning. This fingerprint is then used to search a molecular structure database such as PubChem. Our method is shown to improve on the competing methods for computational metabolite identification by a considerable margin.
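The final step described above, searching a structure database with a predicted fingerprint, can be sketched as a ranking problem. The example below is a simplified stand-in and not the scoring used by CSI:FingerID: it represents each fingerprint as a set of "on" bit indices and ranks candidates by Tanimoto (Jaccard) similarity. The names `tanimoto` and `rank_candidates` and the toy database are hypothetical and do not describe real compounds.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def rank_candidates(predicted, database):
    """Return (name, score) pairs sorted from best to worst match."""
    scored = [(name, tanimoto(predicted, bits)) for name, bits in database.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data: bit indices are made up for illustration only.
predicted_fingerprint = {1, 4, 7, 9}
toy_database = {
    "candidate_a": {1, 4, 7, 9, 12},
    "candidate_b": {1, 2, 3},
    "candidate_c": {4, 7},
}

for name, score in rank_candidates(predicted_fingerprint, toy_database):
    print(f"{name}: {score:.2f}")
```

In practice the predicted fingerprint comes from a machine learning model and is uncertain bit by bit, so a real scoring function would weight bits by prediction reliability rather than treating them all equally as this sketch does.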

Subject(s)

Databases, Protein; Machine Learning; Mass Spectrometry; Metabolomics; Animals; Humans
