Search | VHL Regional Portal

An Adaptive Pipeline To Maximize Isobaric Tagging Data in Large-Scale MS-Based Proteomics.

Corthésy, John; Theofilatos, Konstantinos; Mavroudi, Seferina; Macron, Charlotte; Cominetti, Ornella; Remlawi, Mona; Ferraro, Francesco; Núñez Galindo, Antonio; Kussmann, Martin; Likothanassis, Spiridon; Dayon, Loïc.

J Proteome Res ; 17(6): 2165-2173, 2018 06 01.

Article in English | MEDLINE | ID: mdl-29695160

ABSTRACT

Isobaric tagging is the method of choice in mass-spectrometry-based proteomics for comparing several conditions at a time. Despite its multiplexing capabilities, some drawbacks appear when multiple experiments are merged for comparison in large sample-size studies due to the presence of missing values, which result from the stochastic nature of the data-dependent acquisition mode. Another indirect cause of data incompleteness might derive from the proteomic-typical data-processing workflow that first identifies proteins in individual experiments and then only quantifies those identified proteins, leaving a large number of unmatched spectra with quantitative information unexploited. Inspired by untargeted metabolomic and label-free proteomic workflows, we developed a quantification-driven bioinformatic pipeline (Quantify then Identify (QtI)) that optimizes the processing of isobaric tandem mass tag (TMT) data from large-scale studies. This pipeline includes innovative features, such as peak filtering with a self-adaptive preprocessing pipeline optimization method, Peptide Match Rescue, and Optimized Post-Translational Modification. QtI outperforms a classical benchmark workflow in terms of quantification and identification rates, significantly reducing missing data while preserving unmatched features for quantitative comparison. The number of unexploited tandem mass spectra was reduced by 77 and 62% for two human cerebrospinal fluid and plasma data sets, respectively.

Subjects

Proteomics/methods , Staining and Labeling/methods , Tandem Mass Spectrometry/methods , Workflow , Algorithms , Cerebrospinal Fluid/chemistry , Computational Biology , Datasets as Topic , Humans , Plasma/chemistry , Protein Processing, Post-Translational
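
The missing-value problem described in the abstract above can be illustrated with a minimal sketch. It is not part of the QtI pipeline. The function name `missing_fraction` and the input layout (one dictionary per experiment, mapping a protein identifier to a quantified value) are assumptions made for illustration only.

```python
def missing_fraction(experiments):
    """Fraction of empty cells in the merged protein-by-experiment table.

    `experiments` is a list of dicts (hypothetical layout), each mapping a
    protein identifier to the value quantified in that experiment.
    """
    # Rows of the merged table: every protein seen in at least one experiment.
    all_proteins = set().union(*(e.keys() for e in experiments))
    total_cells = len(all_proteins) * len(experiments)
    if total_cells == 0:
        return 0.0
    # A cell is filled only if the protein was quantified in that experiment.
    filled_cells = sum(len(e) for e in experiments)
    return 1.0 - filled_cells / total_cells


# Two toy experiments that share only one protein (P2).
toy = [{"P1": 1.0, "P2": 2.0}, {"P2": 1.5, "P3": 0.7}]
print(missing_fraction(toy))
```

In this toy case the merged table has three proteins and two experiments, and two of the six cells are empty. The share of empty cells tends to grow as more experiments with partly different identifications are merged.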

Which clustering algorithm is better for predicting protein complexes?

Moschopoulos, Charalampos N; Pavlopoulos, Georgios A; Iacucci, Ernesto; Aerts, Jan; Likothanassis, Spiridon; Schneider, Reinhard; Kossida, Sophia.

BMC Res Notes ; 4: 549, 2011 Dec 20.

Article in English | MEDLINE | ID: mdl-22185599

ABSTRACT

BACKGROUND: Protein-protein interactions (PPIs) play a key role in determining the outcome of most cellular processes. The correct identification and characterization of protein interactions and the networks they comprise is critical for understanding the molecular mechanisms within the cell. Large-scale techniques such as pull-down assays and tandem affinity purification are used to detect protein interactions in an organism. Today, relatively new high-throughput methods such as yeast two-hybrid, mass spectrometry, microarrays, and phage display are also used to reveal protein interaction networks. RESULTS: In this paper we evaluated four clustering algorithms on six interaction datasets. We parameterized the MCL, Spectral, RNSC and Affinity Propagation algorithms and applied them to six PPI datasets produced experimentally by Yeast 2 Hybrid (Y2H) and Tandem Affinity Purification (TAP) methods. The predicted clusters, taken as putative protein complexes, were then compared and benchmarked against known complexes stored in published databases. CONCLUSIONS: While results may differ with parameterization, the MCL and RNSC algorithms seem to be more promising and more accurate at predicting PPI complexes. Moreover, they predict more complexes than the other reviewed algorithms in absolute numbers. On the other hand, the spectral clustering algorithm achieves the highest valid prediction rate in our experiments. However, it is nearly always outperformed by both RNSC and MCL in terms of geometrical accuracy, and it generates fewer valid clusters than any other reviewed algorithm. This article demonstrates various metrics for evaluating the accuracy of such predictions, as presented in the full text. Supplementary material can be found at: http://www.bioacademy.gr/bioinformatics/projects/ppireview.htm.
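
The benchmarking step described above, comparing predicted clusters with known complexes, can be sketched as follows. The abstract does not define its metrics, so this is only one plausible reading. It assumes the squared-overlap score that is common in the protein-complex literature, and the threshold of 0.2, the function names, and the set-based input layout are all hypothetical.

```python
def overlap_score(cluster, known_complex):
    """Squared-overlap score |A ∩ B|^2 / (|A| * |B|) for two non-empty sets."""
    shared = len(cluster & known_complex)
    return shared * shared / (len(cluster) * len(known_complex))


def valid_prediction_rate(clusters, known_complexes, threshold=0.2):
    """Share of predicted clusters that match at least one known complex.

    The threshold is an assumed cut-off, not a value taken from the paper.
    """
    if not clusters:
        return 0.0
    valid = sum(
        1
        for cluster in clusters
        if any(overlap_score(cluster, k) >= threshold for k in known_complexes)
    )
    return valid / len(clusters)


# Toy data: the first cluster overlaps a known complex, the second does not.
predicted = [{"A", "B", "C"}, {"X", "Y"}]
reference = [{"A", "B", "C", "D"}, {"M", "N"}]
print(valid_prediction_rate(predicted, reference))
```

A rate defined this way rewards algorithms that output few but well-matched clusters. That is consistent with the abstract's observation that the highest valid prediction rate can coincide with the lowest number of valid clusters.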

GIBA: a clustering tool for detecting protein complexes.

Moschopoulos, Charalampos N; Pavlopoulos, Georgios A; Schneider, Reinhard; Likothanassis, Spiridon D; Kossida, Sophia.

BMC Bioinformatics ; 10 Suppl 6: S11, 2009 Jun 16.

Article in English | MEDLINE | ID: mdl-19534736

ABSTRACT

BACKGROUND: In recent years, high-throughput experimental methods have been developed that generate large datasets of protein-protein interactions (PPIs). However, owing to the experimental methodologies, these datasets contain errors, mainly false positives, which reduce the quality of any derived information. Typically these datasets can be modeled as graphs, where vertices represent proteins and edges the pairwise PPIs, making it easy to apply automated clustering methods to detect protein complexes or other biologically significant functional groupings. METHODS: In this paper, a clustering tool called GIBA (named after the first characters of its developers' nicknames) is presented. GIBA applies a two-step procedure to a given dataset of protein-protein interaction data. First, a clustering algorithm is applied to the interaction data; a filtering step then generates the final candidate list of predicted complexes. RESULTS: The efficiency of GIBA is demonstrated through the analysis of six yeast protein interaction datasets in comparison with four other available algorithms. We compared the results of the different methods using five performance metrics. Moreover, we examined how the parameters of the methods that constitute the filter affect the final results. CONCLUSION: GIBA is an effective and easy-to-use tool for detecting protein complexes in experimentally measured protein-protein interaction networks. The results show that GIBA achieves higher prediction accuracy than previously published methods.

Subjects

Algorithms , Computational Biology/methods , Multiprotein Complexes/analysis , Protein Interaction Mapping/methods , Software , Cluster Analysis , Databases, Protein
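
The two-step shape described in the GIBA abstract above, clustering followed by filtering, can be sketched in a few lines. The abstract names neither the clustering algorithm nor the filter criteria. Connected components stand in for the clustering step here, and minimum size and edge density stand in for the filter. These choices, the function names, and the default thresholds are illustrative assumptions, not GIBA's actual method.

```python
from collections import defaultdict


def connected_components(edges):
    """Stand-in clustering step: group proteins that are linked by any path."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return components


def density(cluster, edges):
    """Observed internal edges divided by the maximum possible number.

    Assumes `edges` holds unique undirected pairs without self-loops.
    """
    n = len(cluster)
    if n < 2:
        return 0.0
    internal = sum(1 for a, b in edges if a in cluster and b in cluster)
    return 2 * internal / (n * (n - 1))


def cluster_then_filter(edges, min_size=3, min_density=0.5):
    """Step 1: cluster the graph. Step 2: keep only large, dense clusters."""
    return [
        c
        for c in connected_components(edges)
        if len(c) >= min_size and density(c, edges) >= min_density
    ]


# Toy network: a triangle, an isolated pair, and a sparse five-protein chain.
toy_edges = [
    ("A", "B"), ("B", "C"), ("A", "C"),
    ("X", "Y"),
    ("P", "Q"), ("Q", "R"), ("R", "S"), ("S", "T"),
]
print(cluster_then_filter(toy_edges))
```

Only the triangle survives the filter. The pair is too small and the chain is too sparse, which shows how a filtering step can discard clusters that are unlikely to be real complexes.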

Kernel-based self-organized maps trained with supervised bias for gene expression data analysis.

Papadimitriou, Stergios; Likothanassis, Spiridon D.

J Bioinform Comput Biol ; 1(4): 647-80, 2004 Jan.

Article in English | MEDLINE | ID: mdl-15290758

ABSTRACT

Self-Organized Maps (SOMs) are a popular approach for analyzing genome-wide expression data. However, most SOM-based approaches ignore prior knowledge about functional gene categories. In addition, they usually develop topographic maps with disjoint and uniform activation regions that correspond to a hard clustering of the patterns at their nodes. We present a novel self-organizing map, the Kernel Supervised Dynamic Grid Self-Organized Map (KSDG-SOM). This model adapts its parameters in a kernel space. Gaussian kernels are used, and their mean and variance components are adapted to optimize the fit to the input density. The KSDG-SOM also grows dynamically up to a size defined by statistical criteria. It can incorporate a priori information about the known functional characteristics of genes. This information forms a supervised bias during cluster formation, and the model can potentially revise incorrect functional labels. The new method overcomes the main drawback of most existing clustering methods, which lack a mechanism for dynamic extension based on a balance between unsupervised and supervised drives.

Subjects

Gene Expression Profiling/statistics & numerical data , Algorithms , Bias , Cluster Analysis , Computational Biology , Data Interpretation, Statistical , Models, Statistical , Neural Networks, Computer , Oligonucleotide Array Sequence Analysis/statistics & numerical data
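
For readers unfamiliar with the baseline that the KSDG-SOM abstract above builds on, the following is a minimal sketch of one training step of a plain SOM on a one-dimensional grid with a Gaussian neighborhood function. It is not the KSDG-SOM itself: it has no kernel-space adaptation, no dynamic growth, and no supervised bias. The function name `som_step` and the default learning rate and neighborhood width are assumptions made for illustration.

```python
import math


def som_step(weights, x, learning_rate=0.5, sigma=1.0):
    """One update of a plain 1-D SOM (illustrative baseline only).

    `weights` is a list of node weight vectors ordered along the grid and
    `x` is one input pattern, for example one gene's expression profile.
    Returns the index of the best matching unit and the updated weights.
    """
    # Best matching unit: the node whose weights are closest to the input.
    distances = [
        sum((w_i - x_i) ** 2 for w_i, x_i in zip(w, x)) for w in weights
    ]
    bmu = distances.index(min(distances))

    updated = []
    for j, w in enumerate(weights):
        # Gaussian neighborhood: nodes near the BMU on the grid move more.
        h = math.exp(-((j - bmu) ** 2) / (2 * sigma ** 2))
        updated.append(
            [w_i + learning_rate * h * (x_i - w_i) for w_i, x_i in zip(w, x)]
        )
    return bmu, updated


# Three nodes on a line; the input coincides with the middle node.
nodes = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
bmu, nodes = som_step(nodes, [1.0, 1.0])
print(bmu, nodes)
```

Each pattern is assigned to a single best matching unit, which is the hard clustering the abstract refers to. According to the abstract, the KSDG-SOM departs from this baseline by adapting Gaussian kernel parameters to the input density and by adding a supervised bias.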
