Results 1 - 20 of 47
1.
Asian J Psychiatr ; 98: 104079, 2024 May 28.
Article in English | MEDLINE | ID: mdl-38838458

ABSTRACT

BACKGROUND: To improve the efficacy of taVNS, using fMRI to identify predictive neuroimaging markers would help screen for suitable MDD patients before treatment. METHODS: A total of 86 MDD patients were recruited, and all subjects completed clinical scales and a resting-state functional magnetic resonance imaging (fMRI) scan before and after 8 weeks of taVNS treatment. A two-stage feature selection strategy combining machine learning and statistical methods was used to screen for the critical brain functional connections (FCs) that were significantly associated with efficacy prediction, and an efficacy prediction model was then constructed for taVNS treatment of MDD. Finally, the model was validated by separating responding from non-responding patients. RESULTS: taVNS produced promising clinical efficacy in the treatment of mild and moderate MDD. Eleven FCs were selected and found to be associated with the cortico-striatal-pallidum-thalamic loop, the hippocampus, the cerebellum, and the HAMD-17 scores. The prediction model for the efficacy of taVNS treatment was built on these FCs. The R-squared of the regression model for predicting the HAMD-17 reduction rate was 0.44, and the AUC for classifying responding and non-responding patients was 0.856. CONCLUSION: The study demonstrates the validity and feasibility of combining neuroimaging and machine learning techniques to predict the efficacy of taVNS in MDD, and provides an effective solution for personalized and precise treatment of MDD.

2.
J Proteome Res ; 23(5): 1702-1712, 2024 May 03.
Article in English | MEDLINE | ID: mdl-38640356

ABSTRACT

Several lossy compressors have achieved superior compression rates for mass spectrometry (MS) data at the cost of storage precision. The impact of precision loss on MS data processing has not yet been thoroughly evaluated, although such an evaluation is critical for the future development of lossy compressors. We first evaluated different storage precisions (32 bit and 64 bit) in lossless mzML files. We then applied 10 truncation transformations to generate precision-lossy files: five relative errors for intensities and five absolute errors for m/z values. MZmine3 and XCMS were used for feature detection and GNPS for compound annotation. Lastly, we compared precision, recall, F1-score, and file size between lossy and lossless files under different conditions. Overall, the discrepancy between 32 and 64 bit precision was under 1%. We propose an absolute m/z error of 10⁻⁴ and a relative intensity error of 2 × 10⁻² for a 5% error threshold (F1-scores above 95%). For a stricter 1% error threshold (F1-scores above 99%), an absolute m/z error of 2 × 10⁻⁵ and a relative intensity error of 2 × 10⁻³ are advised. This guidance aims to help researchers improve lossy compression algorithms and minimize the negative effects of precision loss on downstream data processing.


Subjects
Data Compression, Mass Spectrometry, Metabolomics, Mass Spectrometry/methods, Metabolomics/methods, Metabolomics/statistics & numerical data, Data Compression/methods, Software, Humans, Algorithms
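The precision thresholds in the abstract above can be made concrete with a small sketch. It assumes one possible way to impose an absolute error bound on m/z (snapping to a uniform grid) and a relative error bound on intensity (snapping to a logarithmic grid); the function names are hypothetical, and the paper's actual truncation transformations may differ.

```python
import math


def truncate_mz(mz: float, abs_error: float) -> float:
    """Snap an m/z value to a uniform grid so that |result - mz| <= abs_error."""
    step = 2.0 * abs_error  # rounding to the nearest grid point errs by at most step / 2
    return round(mz / step) * step


def truncate_intensity(intensity: float, rel_error: float) -> float:
    """Snap a positive intensity to a log-spaced grid; relative error stays within rel_error."""
    if intensity <= 0.0:
        return 0.0
    step = math.log1p(rel_error)  # grid spacing in log space
    return math.exp(round(math.log(intensity) / step) * step)


# Values taken from the 5% error threshold quoted in the abstract.
mz_lossy = truncate_mz(445.120025, 1e-4)
intensity_lossy = truncate_intensity(15234.7, 2e-2)
```

Lossy values produced this way repeat more often and have fewer significant digits, which is what lets a general-purpose compressor shrink the file further.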
3.
Anal Bioanal Chem ; 416(14): 3305-3312, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38642098

ABSTRACT

Metformin (MET) and sitagliptin (STG) are widely used as the first-line and long-term oral hypoglycemic agents for managing type 2 diabetes mellitus (T2DM). However, the current lack of convenient and rapid measurement methods poses a challenge for individualized management. This study developed a point-of-care (POC) assay method utilizing a miniature mass spectrometer, enabling rapid and accurate quantification of MET and STG concentrations in human blood and urine. By combining the miniature mass spectrometer with paper spray ionization, this method simplifies the process into three to four steps, requires minimal amounts of bodily fluids (50 µL of blood and 2 µL of urine), and is able to obtain quantification results within approximately 2 min. Stable isotope-labeled internal standards were employed to enhance the accuracy and stability of measurement. The MS/MS responses exhibited good linear relationship with concentration, with relative standard deviations (RSDs) below 25%. It has the potential to provide immediate treatment feedback and decision support for patients and healthcare professionals in clinical practice.


Subjects
Hypoglycemic Agents, Metformin, Point-of-Care Systems, Sitagliptin Phosphate, Humans, Sitagliptin Phosphate/blood, Sitagliptin Phosphate/urine, Metformin/blood, Metformin/urine, Hypoglycemic Agents/urine, Hypoglycemic Agents/blood, Limit of Detection, Tandem Mass Spectrometry/methods, Type 2 Diabetes Mellitus/drug therapy, Type 2 Diabetes Mellitus/blood, Type 2 Diabetes Mellitus/urine, Mass Spectrometry/methods, Reproducibility of Results
4.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38426325

ABSTRACT

Accurate metabolite annotation and false discovery rate (FDR) control remain challenging in large-scale metabolomics. Recent progress leveraging proteomics experiences and interdisciplinary inspirations has provided valuable insights. While target-decoy strategies have been introduced, generating reliable decoy libraries is difficult due to metabolite complexity. Moreover, continuous bioinformatics innovation is imperative to improve the utilization of expanding spectral resources while reducing false annotations. Here, we introduce the concept of ion entropy for metabolomics and propose two entropy-based decoy generation approaches. Assessment of public databases validates ion entropy as an effective metric to quantify ion information in massive metabolomics datasets. Our entropy-based decoy strategies outperform current representative methods in metabolomics and achieve superior FDR estimation accuracy. Analysis of 46 public datasets provides instructive recommendations for practical application.


Subjects
Algorithms, Tandem Mass Spectrometry, Entropy, Tandem Mass Spectrometry/methods, Metabolomics/methods, Computational Biology/methods, Protein Databases
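The target-decoy strategy mentioned in the abstract above can be illustrated with a generic sketch. It shows only the standard estimate (decoy matches divided by target matches above a score threshold); it does not implement the paper's entropy-based decoy generation, and the function names and example scores are hypothetical.

```python
def estimate_fdr(matches, threshold):
    """matches: list of (score, is_decoy). Returns decoys / targets among scores >= threshold."""
    targets = sum(1 for score, is_decoy in matches if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


def score_cutoff_for_fdr(matches, max_fdr):
    """Lowest score threshold whose estimated FDR stays within max_fdr, or None if none does."""
    best = None
    for threshold in sorted({score for score, _ in matches}, reverse=True):
        if estimate_fdr(matches, threshold) <= max_fdr:
            best = threshold
    return best


# Hypothetical annotation scores; True marks a match against the decoy library.
matches = [(0.95, False), (0.90, False), (0.85, False),
           (0.80, True), (0.75, False), (0.60, True)]
fdr_at_07 = estimate_fdr(matches, 0.7)       # 1 decoy / 4 targets
cutoff = score_cutoff_for_fdr(matches, 0.3)  # lowest threshold with FDR <= 30%
```

The quality of this estimate depends on how closely decoy scores mimic false target matches, which is why decoy generation is the hard part.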
5.
BMC Bioinformatics ; 25(1): 60, 2024 Feb 06.
Article in English | MEDLINE | ID: mdl-38321388

ABSTRACT

BACKGROUND: As a gold-standard quantitative technique based on mass spectrometry, multiple reaction monitoring (MRM) has been widely used in proteomics and metabolomics. In the analysis of MRM data, no peak-picking algorithm achieves perfect accuracy, so manual inspection is necessary to correct errors. In large cohort analyses, the time required for manual inspection is often considerable. Apart from the commercial software that comes with mass spectrometers, the free and open-source Skyline is the most popular software for quantitative omics. However, Skyline is not optimized for manual inspection of hundreds of samples, and its interactive experience also needs improvement. RESULTS: Here we introduce MRMPro, a web-based MRM data analysis platform for efficient manual inspection. MRMPro supports analysis of MRM and scheduled MRM data acquired by mass spectrometers from mainstream vendors. To speed up manual inspection, we implemented a collaborative review system based on a cloud architecture that allows multiple users to review data through their browsers. To reduce bandwidth usage and improve data retrieval speed, we propose an MRM data compression algorithm that reduces data volume by more than 60% and 80% compared with vendor and mzML formats, respectively. To improve the efficiency of manual inspection, we propose a retention time drift estimation algorithm based on chromatogram similarity. The estimated retention time drifts are then used for peak alignment and automatic EIC grouping. Compared with Skyline, MRMPro has higher quantification accuracy and better manual inspection support. CONCLUSIONS: In this study, we proposed MRMPro to improve the usability of manual calibration for MRM data analysis. MRMPro is free for non-commercial use. Researchers can access MRMPro through http://mrmpro.csibio.com/ . All major mass spectrometry formats (wiff, raw, mzML, etc.) can be analyzed on the platform. The final identification results can be exported to a common .xlsx format for subsequent analysis.


Subjects
Algorithms, Data Compression, Humans, Calibration, Mass Spectrometry/methods, Software, Internet
6.
BMC Bioinformatics ; 24(1): 489, 2023 Dec 20.
Article in English | MEDLINE | ID: mdl-38124029

ABSTRACT

BACKGROUND: Plate design is a necessary and time-consuming operation for GC/LC-MS-based sample preparation. The implementation of the inter-batch balancing algorithm and the intra-batch randomization algorithm can have a significant impact on the final results. For researchers without programming skills, a stable and efficient online service for plate design is necessary. RESULTS: Here we describe InjectionDesign, a free online plate design service focused on GC/LC-MS-based multi-omics experiment design. It offers the ability to separate the position design from the sequence design, making the output more compatible with the requirements of a modern mass spectrometer-based laboratory. In addition, it has implemented an optimized block randomization algorithm, which can be better applied to sample stratification with block randomization for an unbalanced distribution. It is easy to use, with built-in support for common instrument models and quick export to a worksheet. CONCLUSIONS: InjectionDesign is an open-source project based on Java. Researchers can get the source code for the project from Github: https://github.com/CSi-Studio/InjectionDesign . A free web service is also provided: http://www.injection.design .


Subjects
Liquid Chromatography-Mass Spectrometry, Tandem Mass Spectrometry, Random Allocation, Liquid Chromatography, Software
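The inter-batch balancing and intra-batch randomization described in the abstract above can be sketched generically. This is not InjectionDesign's optimized algorithm; it is a plain stratified block randomization under the assumption that each sample carries one group label, and `design_batches` is a hypothetical name.

```python
import random
from collections import defaultdict


def design_batches(samples, n_batches, seed=0):
    """samples: list of (sample_id, group). Returns n_batches lists of sample ids."""
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for sample_id, group in samples:
        by_group[group].append(sample_id)

    batches = [[] for _ in range(n_batches)]
    slot = 0
    for group in sorted(by_group):
        members = by_group[group]
        rng.shuffle(members)  # random choice of which sample lands in which batch
        for sample_id in members:
            # Deal round-robin so every group is spread evenly across batches.
            batches[slot % n_batches].append(sample_id)
            slot += 1
    for batch in batches:
        rng.shuffle(batch)  # intra-batch randomization of the injection order
    return batches


# Hypothetical unbalanced cohort: 12 cases and 6 controls over 3 batches.
samples = ([(f"case{i}", "case") for i in range(12)]
           + [(f"ctrl{i}", "control") for i in range(6)])
batches = design_batches(samples, 3, seed=1)
```

Continuing the round-robin counter across groups keeps batch sizes within one sample of each other even when group sizes are unequal.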
7.
BMC Bioinformatics ; 24(1): 431, 2023 Nov 14.
Article in English | MEDLINE | ID: mdl-37964228

ABSTRACT

BACKGROUND: Liquid chromatography-mass spectrometry is widely used in untargeted metabolomics for composition profiling. In multi-run analysis scenarios, features of each run are aligned into consensus features by feature alignment algorithms to observe the intensity variations across runs. However, most of the existing feature alignment methods focus more on accurate retention time correction, while underestimating the importance of feature matching. None of the existing methods can comprehensively consider feature correspondences among all runs and achieve optimal matching. RESULTS: To comprehensively analyze feature correspondences among runs, we propose G-Aligner, a graph-based feature alignment method for untargeted LC-MS data. In the feature matching stage, G-Aligner treats features and potential correspondences as nodes and edges in a multipartite graph, considers the multi-run feature matching problem an unbalanced multidimensional assignment problem, and provides three combinatorial optimization algorithms to find optimal matching solutions. In comparison with the feature alignment methods in OpenMS, MZmine2 and XCMS on three public metabolomics benchmark datasets, G-Aligner achieved the best feature alignment performance on all the three datasets with up to 9.8% and 26.6% increase in accurately aligned features and analytes, and helped all comparison software obtain more accurate results on their self-extracted features by integrating G-Aligner to their analysis workflow. G-Aligner is open-source and freely available at https://github.com/CSi-Studio/G-Aligner under a permissive license. Benchmark datasets, manual annotation results, evaluation methods and results are available at https://doi.org/10.5281/zenodo.8313034 CONCLUSIONS: In this study, we proposed G-Aligner to improve feature matching accuracy for untargeted metabolomics LC-MS data. 
G-Aligner comprehensively considers potential feature correspondences between all runs, converting the feature matching problem into a multidimensional assignment problem (MAP). In evaluations on three public metabolomics benchmark datasets, G-Aligner achieved the highest alignment accuracy on both manually annotated features and features extracted by popular software, demonstrating the effectiveness and robustness of the algorithm.


Subjects
Software, Tandem Mass Spectrometry, Liquid Chromatography/methods, Tandem Mass Spectrometry/methods, Algorithms, Metabolomics/methods
8.
Math Biosci Eng ; 20(9): 17197-17219, 2023 Sep 04.
Article in English | MEDLINE | ID: mdl-37920052

ABSTRACT

With the continuous improvement of biological detection technology, the scale of biological data keeps increasing, which overloads central computing servers. Edge computing in 5G networks can provide higher processing performance for large-scale biological data analysis, reduce bandwidth consumption and improve data security. An appropriate data compression and reading strategy is key to implementing edge computing. We introduce a column storage strategy for mass spectrum data so that some analysis scenarios can be completed by edge computing. Data produced by mass spectrometry are typical biological big data: a single blood sample analysed by mass spectrometry can produce a 10-gigabyte file. By introducing the column storage strategy and combining it with prior knowledge of mass spectrometry, the structure of the mass spectrum data is reorganized and the resulting file is effectively compressed. Data can be processed immediately near the scientific instrument, reducing the bandwidth requirements and the load on the central server. Here, we present Aird-Slice, a mass spectrum data format using the column storage strategy. Aird-Slice reduces volume by 48% compared to vendor files and speeds up the critical computational step of ion chromatogram extraction by an average of 116 times over the test dataset. Aird-Slice provides the ability to analyze biological data using an edge computing architecture on 5G networks.


Subjects
Big Data, Data Compression, Data Analysis
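Why a column layout speeds up ion chromatogram extraction can be shown with a toy example. The sketch below is not the Aird-Slice format; it assumes spectra arrive as (retention time, peak list) rows and regroups them by binned m/z, so that extracting a chromatogram touches only a few bins instead of every spectrum. All names and the bin width are illustrative.

```python
from collections import defaultdict


def to_columns(spectra, bin_width=0.01):
    """spectra: list of (rt, [(mz, intensity), ...]). Returns {mz_bin: [(rt, intensity), ...]}."""
    columns = defaultdict(list)
    for rt, peaks in spectra:
        for mz, intensity in peaks:
            columns[round(mz / bin_width)].append((rt, intensity))
    return columns


def extract_xic(columns, target_mz, tol, bin_width=0.01):
    """Sum intensities per retention time over the bins covering target_mz +/- tol.

    The tolerance is applied at bin granularity, so it is approximate.
    """
    lo = round((target_mz - tol) / bin_width)
    hi = round((target_mz + tol) / bin_width)
    trace = defaultdict(float)
    for mz_bin in range(lo, hi + 1):
        for rt, intensity in columns.get(mz_bin, ()):
            trace[rt] += intensity
    return sorted(trace.items())


# Two hypothetical spectra stored row-wise (one row per retention time).
spectra = [
    (1.0, [(100.00, 5.0), (200.00, 7.0)]),
    (2.0, [(100.00, 6.0), (300.00, 1.0)]),
]
columns = to_columns(spectra)
xic = extract_xic(columns, 100.0, 0.01)
```

In a row layout the same query has to scan every peak of every spectrum; here the cost depends only on the number of peaks inside the requested m/z window.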
9.
Org Biomol Chem ; 21(42): 8516-8520, 2023 Nov 01.
Article in English | MEDLINE | ID: mdl-37853833

ABSTRACT

It is challenging to enantioselectively construct molecules bearing multiple nonadjacent stereocenters, in contrast to those bearing a single stereocenter or adjacent stereocenters. Herein, we report an enantio- and diastereoselective synthesis of substituted chiral allenes with nonadjacent axial and two central chiral centers through a combination of retro-oxa-Michael addition and palladium-catalyzed asymmetric allenylic alkylation. This methodology exhibits good functional-group compatibility, and the corresponding allenylic alkylated compounds, including flavonoid frameworks, are obtained with good yields and diastereoselectivities and excellent enantioselectivities (all >95% ee). Furthermore, the scalability of the current synthetic protocol was proven by performing a gram-scale reaction.

10.
Org Lett ; 25(41): 7540-7544, 2023 Oct 20.
Article in English | MEDLINE | ID: mdl-37812068

ABSTRACT

A diarylborinic acid-catalyzed ring opening of cis-4-hydroxymethyl-1,2-cyclopentene oxides was developed with N-nucleophiles including anilines, benzotriazole, and alkylamines, as well as S-nucleophiles, affording 1,2,4-trisubstituted cyclopentane compounds containing a quaternary carbon center. The mechanism study indicated that the "half-cage" structure formed by the epoxide substrate and the catalyst prevents the nucleophiles from attacking the inner side of the "half-cage", resulting in the desired ring-opening product.

11.
Metabolomics ; 19(6): 57, 2023 Jun 08.
Article in English | MEDLINE | ID: mdl-37289291

ABSTRACT

INTRODUCTION: Metabolomics analysis based on liquid chromatography-mass spectrometry (LC-MS) has been a prevalent method in the metabolic field. However, accurately quantifying all the metabolites in large metabolomics sample cohorts is challenging. The analysis efficiency is restricted by the abilities of software in many labs, and the lack of spectra for some metabolites also hinders metabolite identification. OBJECTIVES: Develop software that performs semi-targeted metabolomics analysis with an optimized workflow to improve quantification accuracy. The software also supports web-based technologies and increases laboratory analysis efficiency. A spectral curation function is provided to promote the prosperity of homemade MS/MS spectral libraries in the metabolomics community. METHODS: MetaPro is developed based on an industrial-grade web framework and a computation-oriented MS data format to improve analysis efficiency. Algorithms from mainstream metabolomics software are integrated and optimized for more accurate quantification results. A semi-targeted analysis workflow is designed based on the concept of combining artificial judgment and algorithm inference. RESULTS: MetaPro supports semi-targeted analysis workflow and functions for fast QC inspection and self-made spectral library curation with easy-to-use interfaces. With curated authentic or high-quality spectra, it can improve identification accuracy using different peak identification strategies. It demonstrates practical value in analyzing large amounts of metabolomics samples. CONCLUSION: We offer MetaPro as a web-based application characterized by fast batch QC inspection and credible spectral curation towards high-throughput metabolomics data. It aims to resolve the analysis difficulty in semi-targeted metabolomics.


Subjects
Metabolomics, Tandem Mass Spectrometry, Metabolomics/methods, Liquid Chromatography/methods, Tandem Mass Spectrometry/methods, Software, Internet
12.
Chem Sci ; 14(20): 5477-5482, 2023 May 24.
Article in English | MEDLINE | ID: mdl-37234894

ABSTRACT

The development of a new strategy for the construction of chiral cyclic sulfide-containing multiple stereogenic centers is highly desirable. Herein, by the combination of base-promoted retro-sulfa-Michael addition and palladium-catalyzed asymmetric allenylic alkylation, the streamlined synthesis of chiral thiochromanones containing two central chiralities (including a quaternary stereogenic center) and an axial chirality (allene unit) was successfully realized with up to 98% yield, 49.0 : 1 dr and >99% ee.

13.
Bioinformatics ; 39(5)2023 May 04.
Article in English | MEDLINE | ID: mdl-37071700

ABSTRACT

MOTIVATION: Liquid chromatography coupled with high-resolution mass spectrometry is widely used in composition profiling in untargeted metabolomics research. While retaining complete sample information, mass spectrometry (MS) data naturally have the characteristics of high dimensionality, high complexity, and huge data volume. In mainstream quantification methods, none of the existing methods can perform direct 3D analysis on lossless profile MS signals. All software simplify calculations by dimensionality reduction or lossy grid transformation, ignoring the full 3D signal distribution of MS data and resulting in inaccurate feature detection and quantification. RESULTS: On the basis that the neural network is effective for high-dimensional data analysis and can discover implicit features from large amounts of complex data, in this work, we propose 3D-MSNet, a novel deep learning-based model for untargeted feature extraction. 3D-MSNet performs direct feature detection on 3D MS point clouds as an instance segmentation task. After training on a self-annotated 3D feature dataset, we compared our model with nine popular software (MS-DIAL, MZmine 2, XCMS Online, MarkerView, Compound Discoverer, MaxQuant, Dinosaur, DeepIso, PointIso) on two metabolomics and one proteomics public benchmark datasets. Our 3D-MSNet model outperformed other software with significant improvement in feature detection and quantification accuracy on all evaluation datasets. Furthermore, 3D-MSNet has high feature extraction robustness and can be widely applied to profile MS data acquired with various high-resolution mass spectrometers with various resolutions. AVAILABILITY AND IMPLEMENTATION: 3D-MSNet is an open-source model and is freely available at https://github.com/CSi-Studio/3D-MSNet under a permissive license. Benchmark datasets, training dataset, evaluation methods, and results are available at https://doi.org/10.5281/zenodo.6582912.


Subjects
Deep Learning, Cloud Computing, Mass Spectrometry, Software, Liquid Chromatography, Metabolomics/methods
14.
Angew Chem Int Ed Engl ; 62(16): e202301337, 2023 Apr 11.
Article in English | MEDLINE | ID: mdl-36802127

ABSTRACT

Here we report the first palladium-catalyzed asymmetric hydrogenolysis of readily available aryl triflates via desymmetrization and kinetic resolution for facile construction of axially chiral biaryl scaffolds with excellent enantioselectivities and s selectivity factors. The axially chiral monophosphine ligands could be prepared from these chiral biaryl compounds and were further applied to palladium-catalyzed asymmetric allylic alkylation with excellent ee values and high branched and linear ratio, which demonstrated the potential utility of this methodology.

15.
Angew Chem Int Ed Engl ; 61(34): e202205623, 2022 Aug 22.
Article in English | MEDLINE | ID: mdl-35764533

ABSTRACT

Compared with heteroarenes, homogeneous asymmetric hydrogenation of all-carbon aromatic rings is a longstanding challenge in organic synthesis due to the strong aromaticity and difficult enantioselective control. Herein, we report the rhodium/diphosphine-catalyzed asymmetric hydrogenation of all-carbon aromatic rings, affording a series of axially chiral cyclic compounds with high enantioselectivity through desymmetrization or kinetic resolution. In addition, the central-chiral cyclic compounds were also obtained by asymmetric hydrogenation of phenanthrenes bearing a directing group. The key to success is the introduction of chiral diphosphine ligands with steric hindrance and strong electron-donating properties. The axially chiral monophosphine ligands could be obtained by simple conversion of the hydrogenation products bearing the phosphine atom.

17.
J Org Chem ; 87(11): 7521-7530, 2022 Jun 03.
Article in English | MEDLINE | ID: mdl-35605190

ABSTRACT

A ruthenium-catalyzed asymmetric transfer hydrogenation of 2,3-disubstituted flavanones was developed for the construction of three contiguous stereocenters under basic conditions through a combination of dynamic kinetic resolution and retro-oxa-Michael addition, giving chiral flavanols with excellent enantioselectivities and diastereoselectivities. The reaction proceeded via a base-catalyzed retro-oxa-Michael addition to racemize two stereogenic centers simultaneously in concert with a highly enantioselective ketone transfer hydrogenation step. The asymmetric transfer hydrogenation could be achieved at gram scale without loss of the activity and enantioselectivity.


Subjects
Flavanones, Catalysis, Hydrogenation, Kinetics, Stereoisomerism
18.
Sci Rep ; 12(1): 5384, 2022 Mar 30.
Article in English | MEDLINE | ID: mdl-35354909

ABSTRACT

As the pervasive, standardized format for interchange and deposition of raw mass spectrometry (MS) proteomics and metabolomics data, text-based mzML is used inefficiently on various analysis platforms because of the sheer volume of samples and its limited read/write speed. Research on compression algorithms rarely provides a flexible scheme for random file reading. Database-based solutions guarantee efficient random file reading, but their compression and third-party software support remain insufficient. While preserving decompression efficiency, we propose "Stack-ZDPD", an encoding scheme optimized for storing raw MS data and designed for "Aird", a computation-oriented format with fast access and decoding whose core compression algorithm is "ZDPD". Stack-ZDPD reduces the volume of data stored in mzML format by around 80% or more, depending on the data acquisition pattern, and the compression ratio is approximately 30% compared to ZDPD for data generated using Time of Flight technology. Our approach is available in AirdPro for file conversion and in the Java API Aird-SDK for data parsing.


Subjects
Data Compression, Algorithms, Mass Spectrometry/methods, Proteomics/methods, Software
19.
Bioinformatics ; 38(6): 1525-1531, 2022 Mar 04.
Article in English | MEDLINE | ID: mdl-34999750

ABSTRACT

MOTIVATION: Peptide identification of data-independent acquisition (DIA) mass spectrometry applying the peptide-centric approach heavily relies on the spectral library matching, such as the fragment intensity similarity. If the intensity similarity is calculated through all possible fragment ions of a targeted peptide instead of just a few fragment ions provided by the spectral library, the matching will be more comprehensive and reliable, and thus the identification will be more confident. In addition, the emergence of high precision spectrum predictors, like Prosit, also makes it possible to capitalize on the predicted spectrum, which contains all possible fragment ion intensities, to calculate the intensity similarity for DIA data. RESULTS: In this work, we propose Alpha-Tri, a neural-network-based model to calculate intensity similarity as a post-processing score using the predicted spectrum, measured spectrum and correlation spectrum (triple-spectrum). The predicted spectrum is generated by Prosit, the measured spectrum is retrieved from the apex of the chromatograms of all possible fragment ions and the correlation spectrum is used to indicate the present probabilities of these fragment ions as the link between the precursor and its fragment ions is lost in DIA. By adopting a data-driven method, Alpha-Tri is able to learn the intensity similarity from the triple-spectrum. This learned value is appended to initial scores from DIA-NN, allowing the ensuing statistical validation tool to report more peptides at the same false discovery rate (FDR). In our evaluation of the HeLa dataset with gradient lengths ranging from 0.5 to 2 h, Alpha-Tri delivered 3.0-7.2% gains in peptide detections at 1% FDR. On LFQbench dataset, a mixed-species dataset with known ratios, Alpha-Tri identified more peptides and proteins fell within the valid ratio ranges by up to 8.6% and 7.6%, respectively, compared with DIA-NN solely. 
AVAILABILITY AND IMPLEMENTATION: The original datasets for benchmarks are downloaded from the ProteomeXchange with the identifiers PXD005573, PXD000954 and PXD002952. Source code is available at https://github.com/YuAirLab/Alpha-Tri.


Subjects
Peptides, Tandem Mass Spectrometry, Tandem Mass Spectrometry/methods, Peptides/chemistry, Proteins, Neural Networks (Computer), Software, Ions
20.
BMC Bioinformatics ; 23(1): 35, 2022 Jan 12.
Article in English | MEDLINE | ID: mdl-35021987

ABSTRACT

BACKGROUND: As the precision of mass spectrometry (MS) increases, MS file sizes grow rapidly. Beyond the widely used open format mzML, near-lossless and lossless compression algorithms and formats have emerged for scenarios with different precision requirements. The data precision is often related to the instrument and subsequent processing algorithms. Unlike storage-oriented formats, which focus more on lossless compression rate, computation-oriented formats concentrate as much on decoding speed as on compression rate. RESULTS: Here we introduce "Aird", an open-source, computation-oriented format with controllable precision, flexible indexing strategies, and a high compression rate. Aird provides a novel compressor called Zlib-Diff-PforDelta (ZDPD) for m/z data. Compared with Zlib alone, m/z data size is about 55% lower on average in Aird. With the high-speed decoding and encoding performance of the single instruction multiple data technology used in ZDPD, Aird takes only 33% of the decoding time of Zlib. We downloaded seven datasets from ProteomeXchange and MetaboLights, acquired on different SCIEX, Thermo, and Agilent instruments. We then converted the raw data into the mzML, mgf, and mz5 file formats with MSConvert and compared them with the Aird format. Aird uses JavaScript Object Notation for metadata storage. Aird-SDK is written in Java, and AirdPro is a GUI client written in C# for converting vendor files. They are freely available at https://github.com/CSi-Studio/Aird-SDK and https://github.com/CSi-Studio/AirdPro . CONCLUSIONS: As MS acquisition modes evolve, MS data characteristics are also constantly changing. New data features can enable more effective compression methods and new index modes that achieve high search performance. MS data storage will also become more specialized and customized. ZDPD uses multiple digital features of MS data, and researchers can also use it in other formats such as mzML. Aird is designed to be a computation-oriented data format with high scalability, a high compression rate, and fast decoding speed.


Subjects
Data Compression, Algorithms, Humans, Mass Spectrometry, Software
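The idea behind a difference-based compressor for m/z data, as named in the abstract above, can be sketched in a few lines. This is a simplified stand-in and not the ZDPD implementation: it integerizes m/z at a chosen precision, delta-encodes the sorted integers, and applies zlib, while the PforDelta and SIMD stages are omitted. The function names and the 32-bit packing are assumptions made for the example.

```python
import struct
import zlib


def encode_mz(mz_values, precision=1e-4):
    """Integerize sorted m/z values, delta-encode them, then compress with zlib."""
    ints = [round(mz / precision) for mz in mz_values]
    deltas = []
    previous = 0
    for value in ints:
        deltas.append(value - previous)  # the first delta is the first value itself
        previous = value
    # Assumes every delta fits in a signed 32-bit integer.
    return zlib.compress(struct.pack(f"<{len(deltas)}i", *deltas))


def decode_mz(blob, precision=1e-4):
    """Invert encode_mz: decompress, undo the deltas, and rescale to m/z."""
    raw = zlib.decompress(blob)
    deltas = struct.unpack(f"<{len(raw) // 4}i", raw)
    values, total = [], 0
    for delta in deltas:
        total += delta
        values.append(total * precision)
    return values


# Hypothetical sorted m/z array from a single spectrum.
mz = [100.0001, 100.0503, 250.1234, 999.9999]
restored = decode_mz(encode_mz(mz))
```

Because m/z values within a spectrum are sorted, the deltas are small non-negative integers with many repeated byte patterns, which is the property a downstream integer codec and zlib can exploit; the `precision` argument plays the role of the controllable precision the abstract mentions.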