Results 1 - 13 of 13
1.
Proteomics ; 24(12-13): e2300001, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38402423

ABSTRACT

MALDI mass spectrometry imaging (MALDI imaging) uniquely advances cancer research by measuring the spatial distribution of endogenous and exogenous molecules directly from tissue sections. These molecular maps provide valuable insights into basic and translational cancer research, including tumor biology, tumor microenvironment, biomarker identification, drug treatment, and patient stratification. Despite its advantages, MALDI imaging is underutilized in studying rare cancers. Sarcomas, a group of malignant mesenchymal tumors, pose unique challenges in medical research due to their complex heterogeneity and low incidence, resulting in understudied subtypes with suboptimal management and outcomes. In this review, we explore the applicability of MALDI imaging in sarcoma research, showcasing its value in understanding this highly heterogeneous and challenging rare cancer. We summarize all MALDI imaging studies in sarcoma to date and highlight their impact on key research fields, including molecular signatures, cancer heterogeneity, and drug studies. We address specific challenges encountered when employing MALDI imaging for sarcomas and propose solutions, such as using formalin-fixed paraffin-embedded tissues and multiplexed experiments, along with considerations for multi-site studies and digital data sharing practices. Through this review, we aim to spark collaboration between MALDI imaging researchers and clinical colleagues, to deploy the unique capabilities of MALDI imaging in the context of sarcoma.


Subject(s)
Sarcoma , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods , Humans , Sarcoma/diagnostic imaging , Sarcoma/pathology , Biomarkers, Tumor/analysis , Rare Diseases/diagnostic imaging , Rare Diseases/pathology , Tumor Microenvironment
2.
Nat Methods ; 20(12): 1883-1886, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37996752

ABSTRACT

Cardinal v.3 is an open-source software for reproducible analysis of mass spectrometry imaging experiments. A major update from its previous versions, Cardinal v.3 supports most mass spectrometry imaging workflows. Its analytical capabilities include advanced data processing such as mass recalibration, advanced statistical analyses such as single-ion segmentation and rough annotation-based classification, and memory-efficient analyses of large-scale multitissue experiments.


Subject(s)
Image Processing, Computer-Assisted , Software , Mass Spectrometry/methods
3.
Expert Rev Proteomics ; 20(11): 251-266, 2023.
Article in English | MEDLINE | ID: mdl-37787106

ABSTRACT

INTRODUCTION: Continuous advances in mass spectrometry (MS) technologies have enabled deeper and more reproducible proteome characterization and a better understanding of biological systems when integrated with other 'omics data. Bioinformatic resources meeting the analysis requirements of increasingly complex MS-based proteomic data and associated multi-omic data are critically needed. These requirements include availability of software that spans diverse types of analyses, scalability for large-scale, compute-intensive applications, and mechanisms to ease adoption of the software. AREAS COVERED: The Galaxy ecosystem meets these requirements by offering a multitude of open-source tools for MS-based proteomics analyses and applications, all in an adaptable, scalable, and accessible computing environment. A thriving global community maintains this software and the associated training resources to empower researcher-driven analyses. EXPERT OPINION: The community-supported Galaxy ecosystem remains a crucial contributor to basic biological and clinical studies using MS-based proteomics. In addition to the current status of Galaxy-based resources, we describe ongoing developments for meeting emerging challenges in MS-based proteomic informatics. We hope this review will catalyze increased use of Galaxy by researchers employing MS-based proteomics and inspire software developers to join the community and implement new tools, workflows, and associated training content that will add further value to this already rich ecosystem.


Subject(s)
Proteomics , Humans , Computational Biology/methods , Mass Spectrometry/methods , Proteomics/methods , Software
4.
bioRxiv ; 2023 Feb 21.
Article in English | MEDLINE | ID: mdl-36865170

ABSTRACT

Cardinal v3 is an open source software for reproducible analysis of mass spectrometry imaging experiments. A major update from its previous versions, Cardinal v3 supports most mass spectrometry imaging workflows. Its analytical capabilities include advanced data processing such as mass re-calibration, advanced statistical analyses such as single-ion segmentation and rough annotation-based classification, and memory-efficient analyses of large-scale multi-tissue experiments.

5.
Bioinformatics ; 39(2), 2023 02 03.
Article in English | MEDLINE | ID: mdl-36744928

ABSTRACT

MOTIVATION: Mass Spectrometry Imaging (MSI) analyzes complex biological samples such as tissues. It simultaneously characterizes the ions present in the tissue in the form of mass spectra, and the spatial distribution of the ions across the tissue in the form of ion images. Unsupervised clustering of ion images facilitates the interpretation in the spectral domain, by identifying groups of ions with similar spatial distributions. Unfortunately, many current methods for clustering ion images ignore the spatial features of the images, and are therefore unable to learn these features for clustering purposes. Alternative methods extract spatial features using deep neural networks pre-trained on natural image tasks; however, this is often inadequate since ion images are substantially noisier than natural images. RESULTS: We contribute a deep clustering approach for ion images that accounts for both spatial contextual features and noise. In evaluations on a simulated dataset and on four experimental datasets of different tissue types, the proposed method grouped ions from the same source into the same cluster more frequently than existing methods. We further demonstrated that using ion image clustering as a pre-processing step facilitated the interpretation of a subsequent spatial segmentation as compared to using either all the ions or one ion at a time. As a result, the proposed approach facilitated the interpretability of MSI data in both the spectral domain and the spatial domain. AVAILABILITY AND IMPLEMENTATION: The data and code are available at https://github.com/DanGuo1223/mzClustering. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Neural Networks, Computer , Mass Spectrometry/methods , Cluster Analysis , Ions/analysis
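
The idea of grouping ions with similar spatial distributions can be illustrated with a much simpler baseline than the deep clustering approach described in the abstract above. The sketch below is not the authors' method: it assumes ion images are given as flattened intensity lists, uses plain Pearson correlation as the similarity, and applies a greedy grouping rule. The function names, the threshold, and the toy m/z keys are all hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equally sized, flattened ion images."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant image carries no spatial pattern
    return cov / sqrt(var_a * var_b)


def group_ion_images(images, threshold=0.8):
    """Greedy grouping: an ion joins the first cluster whose seed image
    correlates with it at or above the threshold (an assumed cutoff)."""
    clusters = []  # each cluster is a list of keys; the first key is the seed
    for mz, img in images.items():
        for cluster in clusters:
            if pearson(images[cluster[0]], img) >= threshold:
                cluster.append(mz)
                break
        else:
            clusters.append([mz])
    return clusters


# Toy 2x3 ion images, flattened row by row (hypothetical data).
toy_images = {
    "mz_500.1": [9, 8, 9, 1, 0, 1],  # bright in the top row
    "mz_501.1": [8, 9, 8, 0, 1, 0],  # same pattern, slightly different noise
    "mz_742.5": [0, 1, 0, 9, 8, 9],  # bright in the bottom row
}
clusters = group_ion_images(toy_images)
```

A pixel-wise correlation like this ignores spatial context and is sensitive to noise, which is the limitation the abstract attributes to many existing methods.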
6.
PLoS Comput Biol ; 19(1): e1010752, 2023 01.
Article in English | MEDLINE | ID: mdl-36622853

ABSTRACT

There is an ongoing explosion of scientific datasets being generated, brought on by recent technological advances in many areas of the natural sciences. As a result, the life sciences have become increasingly computational in nature, and bioinformatics has taken on a central role in research studies. However, basic computational skills, data analysis, and stewardship are still rarely taught in life science educational programs, resulting in a skills gap in many of the researchers tasked with analysing these big datasets. In order to address this skills gap and empower researchers to perform their own data analyses, the Galaxy Training Network (GTN) has previously developed the Galaxy Training Platform (https://training.galaxyproject.org), an open access, community-driven framework for the collection of FAIR (Findable, Accessible, Interoperable, Reusable) training materials for data analysis utilizing the user-friendly Galaxy framework as its primary data analysis platform. Since its inception, this training platform has thrived, with the number of tutorials and contributors growing rapidly, and the range of topics extending beyond life sciences to include topics such as climatology, cheminformatics, and machine learning. While initially aimed at supporting researchers directly, the GTN framework has proven to be an invaluable resource for educators as well. We have focused our efforts in recent years on adding increased support for this growing community of instructors. We have added new features to facilitate the use of the materials in a classroom setting, simplified the contribution flow for new materials, and added a set of train-the-trainer lessons. Here, we present the latest developments in the GTN project, aimed at facilitating the use of the Galaxy Training materials by educators, and its usage in different learning environments.


Subject(s)
Computational Biology , Software , Humans , Computational Biology/methods , Data Analysis , Research Personnel
7.
J Proteome Res ; 21(6): 1558-1565, 2022 06 03.
Article in English | MEDLINE | ID: mdl-35503992

ABSTRACT

Quantitative mass spectrometry-based proteomics has become a high-throughput technology for the identification and quantification of thousands of proteins in complex biological samples. Two frequently used tools, MaxQuant and MSstats, allow for the analysis of raw data and finding proteins with differential abundance between conditions of interest. To enable accessible and reproducible quantitative proteomics analyses in a cloud environment, we have integrated MaxQuant (including TMTpro 16/18plex), Proteomics Quality Control (PTXQC), MSstats, and MSstatsTMT into the open-source Galaxy framework. This enables the web-based analysis of label-free and isobaric labeling proteomics experiments via Galaxy's graphical user interface on public clouds. MaxQuant and MSstats in Galaxy can be applied in conjunction with thousands of existing Galaxy tools and integrated into standardized, sharable workflows. Galaxy tracks all metadata and intermediate results in analysis histories, which can be shared privately for collaborations or publicly, allowing full reproducibility and transparency of published analysis. To further increase accessibility, we provide detailed hands-on training materials. The integration of MaxQuant and MSstats into the Galaxy framework enables their usage in a reproducible way on accessible large computational infrastructures, hence realizing the foundation for high-throughput proteomics data science for everyone.


Subject(s)
Proteomics , Software , Cloud Computing , Mass Spectrometry/methods , Proteins/analysis , Proteomics/methods , Reproducibility of Results
8.
Clin Proteomics ; 19(1): 8, 2022 Apr 19.
Article in English | MEDLINE | ID: mdl-35439943

ABSTRACT

BACKGROUND: Mass spectrometry imaging (MSI) derives spatial molecular distribution maps directly from clinical tissue specimens and thus bears great potential for assisting pathologists with diagnostic decisions or personalized treatments. Unfortunately, progress in translational MSI is often hindered by insufficient quality control and lack of reproducible data analysis. Raw data and analysis scripts are rarely publicly shared. Here, we demonstrate the application of the Galaxy MSI tool set for the reproducible analysis of a urothelial carcinoma dataset. METHODS: Tryptic peptides were imaged in a cohort of 39 formalin-fixed, paraffin-embedded human urothelial cancer tissue cores with a MALDI-TOF/TOF device. The complete data analysis was performed in a fully transparent and reproducible manner on the European Galaxy Server. Annotations of tumor and stroma were performed by a pathologist and transferred to the MSI data to allow for supervised classifications of tumor vs. stroma tissue areas as well as for muscle-infiltrating and non-muscle infiltrating urothelial carcinomas. For putative peptide identifications, m/z features were matched to the MSiMass list. RESULTS: Rigorous quality control in combination with careful pre-processing enabled reduction of m/z shifts and intensity batch effects. High classification accuracy was found for both tumor vs. stroma and muscle-infiltrating vs. non-muscle infiltrating urothelial tumors. Some of the most discriminative m/z features for each condition could be assigned a putative identity: stromal tissue was characterized by collagen peptides and tumor tissue by histone peptides. Immunohistochemistry confirmed an increased histone H2A abundance in the tumor compared to the stroma tissues. The muscle-infiltration status was distinguished via MSI by peptides from intermediate filaments such as cytokeratin 7 in non-muscle infiltrating carcinomas and vimentin in muscle-infiltrating urothelial carcinomas, which was confirmed by immunohistochemistry. To make the study fully reproducible and to advocate the criteria of FAIR (findability, accessibility, interoperability, and reusability) research data, we share the raw data, spectra annotations as well as all Galaxy histories and workflows. Data are available via ProteomeXchange with identifier PXD026459 and Galaxy results via https://github.com/foellmelanie/Bladder_MSI_Manuscript_Galaxy_links . CONCLUSION: Here, we show that translational MSI data analysis in a fully transparent and reproducible manner is possible and we would like to encourage the community to join our efforts.

9.
Gigascience ; 11, 2022 02 15.
Article in English | MEDLINE | ID: mdl-35166338

ABSTRACT

BACKGROUND: Data-independent acquisition (DIA) has become an important approach in global, mass spectrometric proteomic studies because it provides in-depth insights into the molecular variety of biological systems. However, DIA data analysis remains challenging owing to the high complexity and large data and sample size, which require specialized software and vast computing infrastructures. Most available open-source DIA software necessitates basic programming skills and covers only a fraction of a complete DIA data analysis. In consequence, DIA data analysis often requires usage of multiple software tools and compatibility thereof, severely limiting the usability and reproducibility. FINDINGS: To overcome this hurdle, we have integrated a suite of open-source DIA tools in the Galaxy framework for reproducible and version-controlled data processing. The DIA suite includes OpenSwath, PyProphet, diapysef, and swath2stats. We have compiled functional Galaxy pipelines for DIA processing, which provide a web-based graphical user interface to these pre-installed and pre-configured tools for their use on freely accessible, powerful computational resources of the Galaxy framework. This approach also enables seamless sharing of workflows with full configuration in addition to sharing raw data and results. We demonstrate the usability of an all-in-one DIA pipeline in Galaxy by the analysis of a spike-in case study dataset. Additionally, extensive training material is provided to further increase access for the proteomics community. CONCLUSION: The integration of an open-source DIA analysis suite in the web-based and user-friendly Galaxy framework in combination with extensive training material empowers a broad community of researchers to perform reproducible and transparent DIA data analysis.


Subject(s)
Computational Biology , Proteomics , Computational Biology/methods , Mass Spectrometry , Proteomics/methods , Reproducibility of Results , Software
10.
Nat Commun ; 12(1): 5854, 2021 10 06.
Article in English | MEDLINE | ID: mdl-34615866

ABSTRACT

The amount of public proteomics data is rapidly increasing but there is no standardized format to describe the sample metadata and their relationship with the dataset files in a way that fully supports their understanding or reanalysis. Here we propose to develop the transcriptomics data format MAGE-TAB into a standard representation for proteomics sample metadata. We implement MAGE-TAB-Proteomics in a crowdsourcing project to manually curate over 200 public datasets. We also describe tools and libraries to validate and submit sample metadata-related information to the PRIDE repository. We expect that these developments will improve the reproducibility and facilitate the reanalysis and integration of public proteomics datasets.


Subject(s)
Data Analysis , Databases, Protein , Metadata , Proteomics , Big Data , Humans , Reproducibility of Results , Software , Transcriptome
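
To make the notion of tabular sample metadata linked to dataset files concrete, here is a minimal sketch of reading a tab-delimited sample table and mapping each sample to its data files. It is an illustration under assumptions, not the validation tooling the abstract refers to: the required column names, the helper `read_sample_table`, and the example rows are hypothetical.

```python
import csv
import io

# Column names assumed for illustration; a real metadata standard defines its own.
REQUIRED_COLUMNS = [
    "source name",
    "characteristics[organism]",
    "comment[data file]",
]


def read_sample_table(text):
    """Parse a tab-delimited sample table and map each sample to its data files.

    Raises ValueError if any assumed required column is missing.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    files_by_sample = {}
    for row in reader:
        sample = row["source name"]
        files_by_sample.setdefault(sample, []).append(row["comment[data file]"])
    return files_by_sample


# Hypothetical three-row table: two runs for sample_1, one for sample_2.
example = (
    "source name\tcharacteristics[organism]\tcomment[data file]\n"
    "sample_1\tHomo sapiens\trun_01.raw\n"
    "sample_1\tHomo sapiens\trun_02.raw\n"
    "sample_2\tHomo sapiens\trun_03.raw\n"
)
files_by_sample = read_sample_table(example)
```

Keeping one row per sample-file pair makes the relationship between samples and files explicit, which is the property the abstract identifies as necessary for reanalysis.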
11.
Bioinformatics ; 36(Suppl_1): i300-i308, 2020 07 01.
Article in English | MEDLINE | ID: mdl-32657378

ABSTRACT

MOTIVATION: Mass spectrometry imaging (MSI) characterizes the molecular composition of tissues at spatial resolution, and has a strong potential for distinguishing tissue types, or disease states. This can be achieved by supervised classification, which takes as input MSI spectra, and assigns class labels to subtissue locations. Unfortunately, developing such classifiers is hindered by the limited availability of training sets with subtissue labels as the ground truth. Subtissue labeling is prohibitively expensive, and only rough annotations of the entire tissues are typically available. Classifiers trained on data with approximate labels have sub-optimal performance. RESULTS: To alleviate this challenge, we contribute a semi-supervised approach mi-CNN. mi-CNN implements multiple instance learning with a convolutional neural network (CNN). The multiple instance aspect enables weak supervision from tissue-level annotations when classifying subtissue locations. The convolutional architecture of the CNN captures contextual dependencies between the spectral features. Evaluations on simulated and experimental datasets demonstrated that mi-CNN improved the subtissue classification as compared to traditional classifiers. We propose mi-CNN as an important step toward accurate subtissue classification in MSI, enabling rapid distinction between tissue types and disease states. AVAILABILITY AND IMPLEMENTATION: The data and code are available at https://github.com/Vitek-Lab/mi-CNN_MSI.


Subject(s)
Neural Networks, Computer , Mass Spectrometry
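
The multiple instance idea in the abstract above, where only tissue-level labels supervise the classification of subtissue locations, can be sketched with the standard multiple-instance assumption that a bag (tissue) is positive if at least one instance (spectrum) is positive. This is not mi-CNN itself: a fixed linear scorer stands in for the convolutional network, and the weights, bias, and toy spectra are hypothetical.

```python
from math import exp


def instance_score(spectrum, weights, bias):
    """Probability-like score for one spectrum from a linear model plus sigmoid
    (a stand-in for a trained network)."""
    z = sum(w * x for w, x in zip(weights, spectrum)) + bias
    return 1.0 / (1.0 + exp(-z))


def bag_score(bag, weights, bias):
    """Tissue-level score: max over instance scores, so a single
    high-scoring location is enough to flag the whole tissue."""
    return max(instance_score(s, weights, bias) for s in bag)


# Hypothetical two-feature spectra; feature 0 is elevated in diseased regions.
weights = [2.0, -2.0]
bias = 0.0
healthy_tissue = [[0.1, 0.9], [0.2, 0.8]]
diseased_tissue = [[0.1, 0.9], [0.9, 0.1]]  # only one location looks diseased

healthy_score = bag_score(healthy_tissue, weights, bias)
diseased_score = bag_score(diseased_tissue, weights, bias)
```

Because the bag score depends only on the highest-scoring instance, a tissue-level label can drive learning even when most locations in a diseased tissue look normal.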
12.
Gigascience ; 8(12), 2019 12 01.
Article in English | MEDLINE | ID: mdl-31816088

ABSTRACT

BACKGROUND: Mass spectrometry imaging is increasingly used in biological and translational research because it has the ability to determine the spatial distribution of hundreds of analytes in a sample. Being at the interface of proteomics/metabolomics and imaging, the acquired datasets are large and complex and often analyzed with proprietary software or in-house scripts, which hinders reproducibility. Open source software solutions that enable reproducible data analysis often require programming skills and are therefore not accessible to many mass spectrometry imaging (MSI) researchers. FINDINGS: We have integrated 18 dedicated mass spectrometry imaging tools into the Galaxy framework to allow accessible, reproducible, and transparent data analysis. Our tools are based on Cardinal, MALDIquant, and scikit-image and enable all major MSI analysis steps such as quality control, visualization, preprocessing, statistical analysis, and image co-registration. Furthermore, we created hands-on training material for use cases in proteomics and metabolomics. To demonstrate the utility of our tools, we re-analyzed a publicly available N-linked glycan imaging dataset. By providing the entire analysis history online, we highlight how the Galaxy framework fosters transparent and reproducible research. CONCLUSION: The Galaxy framework has emerged as a powerful analysis platform for the analysis of MSI data with ease of use and access, together with high levels of reproducibility and transparency.


Subject(s)
Computational Biology/education , Metabolomics/methods , Proteomics/methods , Computational Biology/methods , Data Analysis , Humans , Mass Spectrometry , Reproducibility of Results , Software , Translational Research, Biomedical
13.
Clin Proteomics ; 15: 11, 2018.
Article in English | MEDLINE | ID: mdl-29527141

ABSTRACT

BACKGROUND: Proteomic analyses of clinical specimens often rely on human tissues preserved through formalin-fixation and paraffin embedding (FFPE). Minimal sample consumption is the key to preserve the integrity of pathological archives but also to deal with minimal invasive core biopsies. This has been achieved by using the acid-labile surfactant RapiGest in combination with a direct trypsinization (DTR) strategy. A critical comparison of the DTR protocol with the most commonly used filter aided sample preparation (FASP) protocol is lacking. Furthermore, it is unknown how common histological stainings influence the outcome of the DTR protocol. METHODS: Four single consecutive murine kidney tissue specimens were prepared with the DTR approach or with the FASP protocol using both 10 and 30 k filter devices and analyzed by label-free, quantitative liquid chromatography-tandem mass spectrometry (LC-MS/MS). We compared the different protocols in terms of proteome coverage, relative label-free quantitation, missed cleavages, physicochemical properties and gene ontology term annotations of the proteins. Additionally, we probed compatibility of the DTR protocol for the analysis of commonly used histological stainings, namely hematoxylin & eosin (H&E), hematoxylin and hemalaun. These were proteomically compared to an unstained control by analyzing four human tonsil FFPE tissue specimens per condition. RESULTS: On average, the DTR protocol identified 1841 ± 22 proteins in a single, non-fractionated LC-MS/MS analysis, whereas these numbers were 1857 ± 120 and 1970 ± 28 proteins for the FASP 10 and 30 k protocols, respectively. The DTR protocol showed 15% more missed cleavages, which did not adversely affect quantitation and intersample comparability. Hematoxylin or hemalaun staining did not adversely impact the performance of the DTR protocol. A minor perturbation was observed for H&E staining, decreasing overall protein identification by 13%. CONCLUSIONS: In essence, the DTR protocol can keep up with the FASP protocol in terms of qualitative and quantitative reproducibility and performed almost as well in terms of proteome coverage and missed cleavages. We highlight the suitability of the DTR protocol as a viable and straightforward alternative to the FASP protocol for proteomics-based clinical research.
