Results 1 - 20 of 38
1.
FEBS Open Bio ; 2024 Jun 14.
Article in English | MEDLINE | ID: mdl-38877295

ABSTRACT

Peptides are attracting a growing interest as therapeutic agents. This trend stems from their cost-effectiveness and reduced immunogenicity, compared to antibodies or recombinant proteins, but also from their ability to dock and interfere with large protein-protein interaction surfaces, and their higher specificity and better biocompatibility relative to organic molecules. Many tools have been developed to understand, predict, and engineer peptide function. However, most state-of-the-art approaches treat peptides only as linear entities and disregard their structural arrangement. Yet, structural details are critical for peptide properties such as solubility, stability, or binding affinities. Recent advances in peptide structure prediction have successfully addressed the scarcity of confidently determined peptide structures. This review will explore different therapeutic and biotechnological applications of peptides and their assemblies, emphasizing the importance of integrating structural information to advance these endeavors effectively.

2.
Comput Struct Biotechnol J ; 23: 1951-1958, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38736697

ABSTRACT

NanoString nCounter is a medium-throughput technology used in mRNA and miRNA differential expression studies. It offers several advantages, including the absence of an amplification step and the ability to analyze low-grade samples. Despite its considerable strengths, the popularity of the nCounter platform in experimental research stabilized in 2022 and 2023, and this trend may continue in the upcoming years. Such stagnation could potentially be attributed to the absence of a standardized analytical pipeline or the indication of optimal processing methods for nCounter data analysis. To standardize the description of the nCounter data analysis workflow, we divided it into five distinct steps: data pre-processing, quality control, background correction, normalization and differential expression analysis. Next, we evaluated eleven R packages dedicated to nCounter data processing to point out functionalities belonging to these steps and provide comments on their applications in studies of mRNA and miRNA samples.

3.
Nucleic Acids Res ; 2024 May 13.
Article in English | MEDLINE | ID: mdl-38738618

ABSTRACT

Protein aggregation is behind the genesis of incurable diseases and imposes constraints on drug discovery and the industrial production and formulation of proteins. Over the years, we have been advancing the Aggrescan3D (A3D) method, aiming to deepen our comprehension of protein aggregation and assist the engineering of protein solubility. Since its inception, A3D has become one of the most popular structure-based aggregation predictors because of its performance, modular functionalities, RESTful service for extensive screenings, and intuitive user interface. Building on this foundation, we introduce Aggrescan4D (A4D), significantly extending A3D's functionality. A4D is aimed at predicting the pH-dependent aggregation of protein structures, and features an evolutionary-informed automatic mutation protocol to engineer protein solubility without compromising structure and stability. It also integrates precalculated results for the nearly 500,000 jobs in the A3D Model Organisms Database and structure retrieval from the AlphaFold database. Overall, A4D constitutes a comprehensive tool for understanding, predicting, and designing solutions for specific protein aggregation challenges. The A4D web server and extensive documentation are available at https://biocomp.chem.uw.edu.pl/a4d/. This website is free and open to all users without a login requirement.

4.
Spectrochim Acta A Mol Biomol Spectrosc ; 313: 124094, 2024 May 15.
Article in English | MEDLINE | ID: mdl-38503257

ABSTRACT

The most studied functional amyloid is CsgA, the major curli subunit protein, which is produced by numerous strains of Enterobacteriaceae. Although CsgA sequences are highly conserved, they exhibit species diversity, which reflects the specific evolutionary and functional adaptability of the major curli subunit. Herein, we performed bioinformatics analyses to uncover the differences in the amyloidogenic properties of the R4 fragments in Escherichia coli and Salmonella enterica and proposed four mutants for more detailed studies: M1, M2, M3, and M4. The mutated sequences were characterized by various experimental techniques, such as circular dichroism, ATR-FTIR, FT-Raman, thioflavin T, transmission electron microscopy and confocal microscopy. Additionally, molecular dynamics simulations were performed to determine the role of buffer ions in the aggregation process. Our results demonstrated that the aggregation kinetics, fibril morphology, and overall structure of the peptide were significantly affected by the positions of charged amino acids within the repeat sequences of CsgA. Notably, substituting glycine with lysine resulted in the formation of distinctive spherically packed globular aggregates. The differences in morphology observed are attributed to the influence of phosphate ions, which disrupt the local electrostatic interaction network of the polypeptide chains. This study provides knowledge on the preferential formation of amyloid fibrils based on charge states within the polypeptide chain.


Subjects
Escherichia coli Proteins, Escherichia coli Proteins/chemistry, Amino Acid Substitution, Amyloid/chemistry, Escherichia coli/genetics, Escherichia coli/metabolism, Peptides/chemistry, Ions
5.
Bioinformatics ; 40(3), 2024 Mar 04.
Article in English | MEDLINE | ID: mdl-38377398

ABSTRACT

MOTIVATION: Missing values are commonly observed in metabolomics data from mass spectrometry. Imputing them is crucial because it assures data completeness, increases the statistical power of analyses, prevents inaccurate results, and improves the quality of exploratory analysis, statistical modeling, and machine learning. Numerous Missing Value Imputation Algorithms (MVIAs) employ heuristics or statistical models to replace missing information with estimates. In the context of metabolomics data, we identified 52 MVIAs implemented across 70 R functions. Nevertheless, the usage of those 52 established methods poses challenges due to package dependency issues, lack of documentation, and their instability. RESULTS: Our R package, 'imputomics', provides a convenient wrapper around 41 (plus random imputation as a baseline model) out of 52 MVIAs in the form of a command-line tool and a web application. In addition, we propose a novel functionality for selecting MVIAs recommended for metabolomics data with the best performance or execution time. AVAILABILITY AND IMPLEMENTATION: 'imputomics' is freely available as an R package (github.com/BioGenies/imputomics) and a Shiny web application (biogenies.info/imputomics-ws). The documentation is available at biogenies.info/imputomics.


Subjects
Metabolomics, Software, Metabolomics/methods, Algorithms, Computers, Mass Spectrometry/methods
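
To make the idea of a heuristic missing value imputation algorithm concrete, the following is a minimal Python sketch of half-minimum imputation, a common simple heuristic for values that are missing because they fall below the detection limit. It is an illustration only: it does not use or reproduce the 'imputomics' package, and the function name `half_min_impute` and the intensity values are hypothetical.

```python
import math


def half_min_impute(table):
    """Replace missing values (None or NaN) in each column with half of that
    column's smallest observed value.

    `table` maps a metabolite name to a list of intensities.
    This is an illustrative heuristic, not the imputomics API.
    """
    def is_missing(x):
        return x is None or (isinstance(x, float) and math.isnan(x))

    imputed = {}
    for name, values in table.items():
        observed = [x for x in values if not is_missing(x)]
        if not observed:
            # Nothing to estimate from; leave the column unchanged.
            imputed[name] = list(values)
            continue
        fill = min(observed) / 2
        imputed[name] = [fill if is_missing(x) else x for x in values]
    return imputed


# Hypothetical intensities (illustrative values only).
data = {
    "metabolite_a": [10.0, None, 4.0, 8.0],
    "metabolite_b": [float("nan"), 3.0, 6.0, None],
}
result = half_min_impute(data)
```

A heuristic like this is fast but ignores relationships between metabolites, which is one reason a comparison across many algorithms, as described in the abstract, is useful.
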
6.
Nucleic Acids Res ; 52(D1): D360-D367, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37897355

ABSTRACT

Protein aggregation has been associated with aging and different pathologies and represents a bottleneck in the industrial production of biotherapeutics. Numerous past studies performed in Escherichia coli and other model organisms have made it possible to dissect the biophysical principles underlying this process. This knowledge fuelled the development of computational tools, such as Aggrescan 3D (A3D), to forecast and re-design protein aggregation. Here, we present the A3D Model Organism Database (A3D-MODB) http://biocomp.chem.uw.edu.pl/A3D2/MODB, a comprehensive resource for the study of structural protein aggregation in the proteomes of 12 key model species spanning distant biological clades. In addition to A3D predictions, this resource incorporates information useful for contextualizing protein aggregation, including membrane protein topology and structural model confidence, as an indirect reporter of protein disorder. The database is openly accessible without any need for registration. We foresee A3D-MODB evolving into a central hub for conducting comprehensive, multi-species analyses of protein aggregation, fostering the development of protein-based solutions for medical, biotechnological, agricultural and industrial applications.


Subjects
Protein Databases, Protein Aggregates, Proteome, Humans, Animals
7.
Database (Oxford) ; 2023, 2023 Nov 27.
Article in English | MEDLINE | ID: mdl-38011719

ABSTRACT

Parkinson's disease (PD) is the second most prevalent neurodegenerative disorder, yet effective treatments able to stop or delay disease progression remain elusive. The aggregation of a presynaptic protein, α-synuclein (aSyn), is the primary neurological hallmark of PD and, thus, a promising target for therapeutic intervention. However, the lack of consensus on the molecular properties required to specifically bind the toxic species formed during aSyn aggregation has hindered the development of therapeutic molecules. Recently, we defined and experimentally validated a peptide architecture that demonstrated high affinity and selectivity in binding to aSyn toxic oligomers and fibrils, effectively preventing aSyn pathogenic aggregation. Human peptides with such properties may have neuroprotective activities and hold great therapeutic interest. Driven by this idea, here, we developed a discriminative algorithm for the screening of human endogenous neuropeptides, antimicrobial peptides and diet-derived bioactive peptides with the potential to inhibit aSyn aggregation. We identified over 100 unique biogenic peptide candidates and assembled a comprehensive database (aSynPEP-DB) that collects their physicochemical features, source datasets and additional therapeutic-relevant information, including their sites of expression and associated pathways. In addition, we provide access to the discriminative algorithm to extend its application to the screening of artificial peptides or new peptide datasets. aSynPEP-DB is a unique repository of peptides with the potential to modulate aSyn aggregation, serving as a platform for the identification of previously unexplored therapeutic agents. Database URL: https://asynpepdb.ppmclab.com/.


Subjects
Neurodegenerative Diseases, Parkinson Disease, Humans, alpha-Synuclein/chemistry, alpha-Synuclein/metabolism, Parkinson Disease/drug therapy, Parkinson Disease/genetics, Parkinson Disease/metabolism, Peptides
8.
Sci Rep ; 13(1): 8365, 2023 May 24.
Article in English | MEDLINE | ID: mdl-37225726

ABSTRACT

Due to their complex history, plastids possess proteins encoded in the nuclear and plastid genome. Moreover, these proteins localize to various subplastid compartments. Since protein localization is associated with its function, prediction of subplastid localization is one of the most important steps in plastid protein annotation, providing insight into their potential function. Therefore, we create a novel manually curated data set of plastid proteins and build an ensemble model for prediction of protein subplastid localization. Moreover, we discuss problems associated with the task, e.g. data set sizes and homology reduction. PlastoGram classifies proteins as nuclear- or plastid-encoded and predicts their localization considering: envelope, stroma, thylakoid membrane or thylakoid lumen; for the latter, the import pathway is also predicted. We also provide an additional function to differentiate nuclear-encoded inner and outer membrane proteins. PlastoGram is available as a web server at https://biogenies.info/PlastoGram and as an R package at https://github.com/BioGenies/PlastoGram. The code used for described analyses is available at https://github.com/BioGenies/PlastoGram-analysis.


Subjects
Chloroplast Proteins, Plastid Genomes, Membrane Proteins, Molecular Sequence Annotation, Thylakoids
9.
Microb Genom ; 9(4), 2023 Apr.
Article in English | MEDLINE | ID: mdl-37103985

ABSTRACT

Enterohaemolysin (Ehx) and alpha-haemolysin are virulence-associated factors (VAFs) causing the haemolytic phenotype in Escherichia coli. It has been shown that chromosomally and plasmid-encoded alpha-haemolysin are characteristic of specific pathotypes, virulence-associated factors and hosts. However, the prevalence of alpha- and enterohaemolysin does not overlap in the majority of pathotypes. Therefore, this study focuses on the characterization of the haemolytic E. coli population associated with multiple pathotypes in human and animal infectious diseases. Using a genomics approach, we investigated characteristic features of the enterohaemolysin-encoding strains to identify factors differentiating enterohaemolysin-positive from alpha-haemolysin-positive E. coli populations. To shed light on the functionality of Ehx subtypes, we analysed Ehx-coding genes and inferred EhxA phylogeny. The two haemolysins are associated with a different repertoire of adhesins, iron acquisition or toxin systems. Alpha-haemolysin is predominantly found either in uropathogenic E. coli (UPEC), where it is predicted to be chromosomally encoded, or in nonpathogenic and undetermined E. coli pathotypes, where it is typically predicted to be plasmid-encoded. Enterohaemolysin is mainly associated with Shiga toxin-producing E. coli (STEC) and enterohaemorrhagic E. coli (EHEC) and predicted to be plasmid-encoded. Both types of haemolysin are found in atypical enteropathogenic E. coli (aEPEC). Moreover, we identified a new EhxA subtype present exclusively in genomes with VAFs characteristic of nonpathogenic E. coli. This study reveals a complex relationship between haemolytic E. coli of diverse pathotypes, providing a framework for understanding the potential role of haemolysin in pathogenesis.


Subjects
Enterohemorrhagic Escherichia coli, Escherichia coli Proteins, Animals, Humans, Hemolysin Proteins/genetics, Escherichia coli Proteins/genetics, Genomics, Virulence Factors/genetics
10.
Int J Mol Sci ; 24(1), 2023 Jan 02.
Article in English | MEDLINE | ID: mdl-36614244

ABSTRACT

Amyloids and antimicrobial peptides (AMPs) have many similarities, e.g., both kill microorganisms by destroying their membranes, form aggregates, and modulate the innate immune system. Given these similarities and the fact that the antimicrobial properties of short amyloids have not yet been investigated, we chose a group of potentially antimicrobial short amyloids to verify their impact on bacterial and eukaryotic cells. We used AmpGram, a best-performing AMP classification model, and selected ten amyloids with the highest AMP probability for our experimental research. Our results indicate that four tested amyloids: VQIVCK, VCIVYK, KCWCFT, and GGYLLG, formed aggregates under the conditions routinely used to evaluate peptide antimicrobial properties, but none of the tested amyloids exhibited antimicrobial or cytotoxic properties. Accordingly, they should be included in the negative datasets to train the next-generation AMP prediction models, based on experimentally confirmed AMP and non-AMP sequences. In the article, we also emphasize the importance of reporting non-AMPs, given that only a handful of such sequences have been officially confirmed.


Subjects
Anti-Infective Agents, Antimicrobial Cationic Peptides, Antimicrobial Cationic Peptides/pharmacology, Antimicrobial Cationic Peptides/chemistry, Anti-Infective Agents/pharmacology, Bacteria
11.
Nucleic Acids Res ; 51(D1): D352-D357, 2023 Jan 06.
Article in English | MEDLINE | ID: mdl-36243982

ABSTRACT

Information about the impact of interactions between amyloid proteins on their fibrillization propensity is scattered among many experimental articles and presented in unstructured form. We manually curated information located in almost 200 publications (selected out of 562 initially considered), obtaining details of 883 experimentally studied interactions between 46 amyloid proteins or peptides. We also proposed a novel standardized terminology for the description of amyloid-amyloid interactions, which is included in our database, covering all currently known types of such a cross-talk, including inhibition of fibrillization, cross-seeding and other phenomena. The new approach allows for more specific studies on amyloids and their interactions, by providing very well-defined data. AmyloGraph, an online database presenting information on amyloid-amyloid interactions, is available at (http://AmyloGraph.com/). Its functionalities are also accessible as the R package (https://github.com/KotulskaLab/AmyloGraph). AmyloGraph is the only publicly available repository for experimentally determined amyloid-amyloid interactions.


Subjects
Amyloid, Amyloidogenic Proteins, Amyloidogenic Proteins/metabolism, Peptides, Protein Databases
12.
Comput Struct Biotechnol J ; 20: 6526-6533, 2022.
Article in English | MEDLINE | ID: mdl-36467580

ABSTRACT

Peptides are known to possess a plethora of beneficial properties and activities: antimicrobial, anticancer, anti-inflammatory or the ability to cross the blood-brain barrier are only a few examples of their functional diversity. For this reason, bioinformaticians are constantly developing and upgrading models to predict their activity in silico, generating a steadily increasing number of available tools. Although these efforts have provided fruitful outcomes in the field, the vast and diverse amount of resources for peptide prediction can turn a simple prediction into an overwhelming searching process to find the optimal tool. This minireview aims at providing a systematic and accessible analysis of the complex ecosystem of peptide activity prediction, showcasing the variability of existing models for peptide assessment, their domain specialization and popularity. Moreover, we also assess the reproducibility of such bioinformatics tools and describe tendencies observed in their development. The list of tools is available under https://biogenies.info/peptide-prediction-list/.

13.
Brief Bioinform ; 23(5), 2022 Sep 20.
Article in English | MEDLINE | ID: mdl-35988923

ABSTRACT

Antimicrobial peptides (AMPs) are a heterogeneous group of short polypeptides that target not only microorganisms but also viruses and cancer cells. Due to their lower selection for resistance compared with traditional antibiotics, AMPs have been attracting ever-growing attention from researchers, including bioinformaticians. Machine learning represents the most cost-effective method for novel AMP discovery and consequently many computational tools for AMP prediction have been recently developed. In this article, we investigate the impact of negative data sampling on model performance and benchmarking. We generated 660 predictive models using 12 machine learning architectures, a single positive data set and 11 negative data sampling methods; the architectures and methods were defined on the basis of published AMP prediction software. Our results clearly indicate that similar training and benchmark data sets, i.e. produced by the same or a similar negative data sampling method, positively affect model performance. Consequently, all the benchmark analyses that have been performed for AMP prediction models are significantly biased and, moreover, we do not know which model is the most accurate. To provide researchers with reliable information about the performance of AMP predictors, we also created a web server AMPBenchmark for fair model benchmarking. AMPBenchmark is available at http://BioGenies.info/AMPBenchmark.


Subjects
Antimicrobial Peptides, Benchmarking, Anti-Bacterial Agents, Peptides/chemistry
14.
Biomolecules ; 12(5), 2022 Apr 20.
Article in English | MEDLINE | ID: mdl-35625541

ABSTRACT

Human S100B is a small, multifunctional protein. Its activity, inside and outside cells, contributes to the biology of the brain, muscle, skin, and adipocyte tissues. Overexpression of S100B occurs in Down Syndrome, Alzheimer's disease, Creutzfeldt-Jakob disease, schizophrenia, multiple sclerosis, brain tumors, epilepsy, melanoma, myocardial infarction, muscle disorders, and sarcopenia. Modulating the activities of S100B that are related to human diseases, without disturbing its physiological functions, is vital for drug and therapy design. This work focuses on the extracellular activity of S100B and one of its receptors, the Receptor for Advanced Glycation End products (RAGE). The functional outcome of extracellular S100B depends partially on the activation of intracellular signaling pathways. Here, we used Biotin Switch Technique enrichment and mass-spectrometry-based proteomics to show that the appearance of the S100B protein in the extracellular milieu of mammalian Chinese Hamster Ovary (CHO) cells, and expression of the membrane-bound RAGE receptor, lead to changes in the intracellular S-nitrosylation of more than a hundred proteins. Treatment of the wild-type CHO cells with nanomolar or micromolar concentrations of extracellular S100B modulates the sets of S-nitrosylation targets inside cells. The cellular S-nitrosome is tuned differently, depending on the presence or absence of stable RAGE receptor expression. The presented results are a proof-of-concept study, suggesting that S-nitrosylation, like other post-translational modifications, should be considered in future research, and in developing tailored therapies for S100B and RAGE receptor-related diseases.


Subjects
Protein S, Immunological Receptors, Animals, CHO Cells, Cricetinae, Cricetulus, Humans, Protein S/metabolism, Receptor for Advanced Glycation End Products/metabolism, Immunological Receptors/metabolism, S100 Calcium Binding Protein beta Subunit
15.
Microb Genom ; 7(12), 2021 Dec.
Article in English | MEDLINE | ID: mdl-34939560

ABSTRACT

Since the discovery of haemolysis, many studies have focused on a deeper understanding of this phenotype in Escherichia coli and its association with other virulence genes, diseases and pathogenic attributes/functions in the host. Our virulence-associated factor profiling and genome-wide association analysis of genomes of haemolytic and nonhaemolytic E. coli unveiled a high prevalence of adhesins, iron acquisition genes and toxins in haemolytic bacteria. In the case of fimbriae with high prevalence, we analysed sequence variation of FimH, EcpD and CsgA, and showed that different adhesin variants were present in the analysed groups, indicating altered adhesive capabilities of haemolytic and nonhaemolytic E. coli. Analysis of over 1000 haemolytic E. coli genomes revealed that they are pathotypically, genetically and antigenically diverse, but their adhesin and iron acquisition repertoire is associated with the genome placement of the hlyCABD cluster. Haemolytic E. coli with chromosome-encoded alpha-haemolysin had a high frequency of P, S, Auf fimbriae and multiple iron acquisition systems such as aerobactin, yersiniabactin, salmochelin, Fec, Sit, Bfd and hemin uptake systems. Haemolytic E. coli with plasmid-encoded alpha-haemolysin had an adhesin profile similar to that of nonpathogenic E. coli, with a high prevalence of Stg, Yra, Ygi, Ycb, Ybg, Ycf, Sfm, F9 fimbriae, Paa, Lda, intimin and type 3 secretion system encoding genes. Analysis of HlyCABD sequence variation revealed the presence of variants associated with genome placement and pathotype.


Subjects
Escherichia coli Adhesins/genetics, Escherichia coli/genetics, Hemolysin Proteins/genetics, Iron/metabolism, Escherichia coli/metabolism, Escherichia coli Proteins/genetics, Fimbriae Proteins/genetics, Humans, Molecular Chaperones/genetics, Multigene Family, Mutation, Plasmids/genetics
16.
Protein Sci ; 30(9): 1854-1870, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34075639

ABSTRACT

Cross seeding between amyloidogenic proteins in the gut is receiving increasing attention as a possible mechanism for initiation or acceleration of amyloid formation by aggregation-prone proteins such as αSN, which is central in the development of Parkinson's disease (PD). This is particularly pertinent in view of the growing number of functional (i.e., benign and useful) amyloid proteins discovered in bacteria. Here we identify two amyloidogenic proteins, Pr12 and Pr17, in fecal matter from PD transgenic rats and their wild type counterparts, based on their stability against dissolution by formic acid (FA). Both proteins show robust aggregation into ThT-positive aggregates that contain higher-order β-sheets and have a fibrillar morphology, indicative of amyloid proteins. In addition, Pr17 aggregates formed in vitro showed significant resistance against FA, suggesting an ability to form highly stable amyloid. Treatment with proteinase K revealed a protected core of approx. 9 kDa. Neither Pr12 nor Pr17, however, affected αSN aggregation in vitro. Thus, amyloidogenicity does not per se lead to an ability to cross-seed fibrillation of αSN. Our results support the use of proteomics and FA to identify amyloidogenic proteins in complex mixtures and suggest that there may be numerous functional amyloid proteins in microbiomes.


Subjects
Amyloid/chemistry, Amyloidogenic Proteins/chemistry, Bacterial Proteins/chemistry, Gastrointestinal Microbiome/genetics, Microbial Consortia/genetics, Parkinson Disease/microbiology, Amino Acid Sequence, Amyloid/isolation & purification, Amyloidogenic Proteins/isolation & purification, Animals, Bacterial Proteins/isolation & purification, Benzothiazoles/chemistry, Biofilms/growth & development, Animal Disease Models, Endopeptidase K/chemistry, Feces/chemistry, Feces/microbiology, Female, Formates/chemistry, Humans, Hydrogen-Ion Concentration, Parkinson Disease/metabolism, Parkinson Disease/pathology, Protein Aggregates, Rats, Transgenic Rats, Urea/chemistry, alpha-Synuclein/chemistry, alpha-Synuclein/metabolism
17.
Int J Mol Sci ; 22(10), 2021 May 12.
Article in English | MEDLINE | ID: mdl-34066237

ABSTRACT

CsgA is an aggregating protein from bacterial biofilms, representing a class of functional amyloids. Its amyloid propensity is defined by five fragments (R1-R5) of the sequence, representing non-perfect repeats. Gate-keeper amino acid residues, specific to each fragment, define the fragment's propensity for self-aggregation and aggregating characteristics of the whole protein. We study the self-aggregation and secondary structures of the repeat fragments of Salmonella enterica and Escherichia coli and comparatively analyze their potential effects on these proteins in a bacterial biofilm. Using bioinformatics predictors, ATR-FTIR and FT-Raman spectroscopy techniques, circular dichroism, and transmission electron microscopy, we confirmed self-aggregation of R1, R3, R5 fragments, as previously reported for Escherichia coli, although with different temporal characteristics for each species. We also observed that the aggregation propensity of the R4 fragment of Salmonella enterica differs from that of Escherichia coli. Our studies showed that amyloid structures of CsgA repeats are more easily formed and more durable in Salmonella enterica than those in Escherichia coli.


Subjects
Amyloid/chemistry, Bacterial Proteins/metabolism, Escherichia coli Proteins/metabolism, Escherichia coli/metabolism, Salmonella enterica/metabolism, Amino Acid Sequence, Bacterial Proteins/genetics, Escherichia coli/genetics, Escherichia coli/growth & development, Escherichia coli Proteins/genetics, Protein Aggregates, Protein Conformation, Salmonella enterica/genetics, Salmonella enterica/growth & development, Sequence Homology
18.
Ann Transl Med ; 9(7): 528, 2021 Apr.
Article in English | MEDLINE | ID: mdl-33987226

ABSTRACT

BACKGROUND: DNA double-strand breaks can be counted as discrete foci by imaging techniques. In personalized medicine and pharmacology, the analysis of counting data is relevant for numerous applications, e.g., for cancer and aging research and the evaluation of drug efficacy. By default, it is assumed to follow the Poisson distribution. This assumption, however, may lead to biased results and faulty conclusions in datasets with excess zero values (zero-inflation), a variance larger than the mean (overdispersion), or both. In such cases, the assumption of a Poisson distribution would skew the estimation of mean and variance, and other models like the negative binomial (NB), zero-inflated Poisson or zero-inflated NB distributions should be employed. The model chosen has an influence on the parameter estimation (mean value and confidence interval). Yet the choice of the suitable distribution model is not trivial. METHODS: To support, simplify and objectify this process, we have developed the countfitteR software as an R package. We used a Bayesian approach for distribution model selection and the shiny web application framework for interactive data analysis. RESULTS: We show the application of our software based on examples of DNA double-strand break count data from phenotypic imaging by multiplex fluorescence microscopy. In analyzing numerous datasets of molecular pharmacological markers (phosphorylated histone H2AX and p53 binding protein), countfitteR demonstrated an equal or superior statistical performance compared to the usually employed two-step procedure, with an overall power of up to 98%. In addition, it still gave information in cases with no result at all from the two-step procedure. In our data sample we found that the NB distribution was the most frequent, with the Poisson distribution taking second place. CONCLUSIONS: countfitteR can perform an automated distribution model selection and thus support the data analysis and lead to objective statistically verifiable estimated values. Originally designed for the analysis of foci in biomedical image data, countfitteR can be used in a variety of areas where non-Poisson distributed counting data is prevalent.
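
The two warning signs named in the abstract, overdispersion and zero-inflation, can be checked with simple descriptive statistics before any model is fitted. The Python sketch below is an illustration under stated assumptions: it is not part of countfitteR and does not implement its Bayesian model selection, and the function name `describe_counts` and the foci counts are hypothetical. It relies only on the fact that a Poisson distribution has equal mean and variance and assigns probability exp(-mean) to a zero count.

```python
import math
from statistics import mean, variance


def describe_counts(counts):
    """Summarise a list of non-negative integer counts.

    Illustrative diagnostics only; not the countfitteR method.
    """
    n = len(counts)
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    observed_zeros = sum(1 for c in counts if c == 0)
    # Under a Poisson model with rate m, P(count == 0) = exp(-m).
    expected_zeros = n * math.exp(-m)
    return {
        "mean": m,
        "variance": v,
        # A ratio well above 1 points to overdispersion.
        "dispersion_index": v / m if m > 0 else float("nan"),
        "observed_zeros": observed_zeros,
        "expected_zeros_poisson": expected_zeros,
    }


# Hypothetical foci counts per cell (illustrative values, not study data).
foci = [0, 0, 0, 0, 1, 0, 2, 0, 7, 0, 3, 0, 0, 12, 0, 1]
summary = describe_counts(foci)
```

In this made-up sample the variance is several times the mean and there are far more zeros than a Poisson model with the same mean would predict, which is the kind of situation where the abstract recommends considering negative binomial or zero-inflated models instead.
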

19.
Sci Rep ; 11(1): 8934, 2021 Apr 26.
Article in English | MEDLINE | ID: mdl-33903613

ABSTRACT

Several disorders are related to amyloid aggregation of proteins, for example Alzheimer's or Parkinson's diseases. Amyloid proteins form fibrils of aggregated beta structures. This is preceded by formation of oligomers, the most cytotoxic species. Determining amyloidogenicity is tedious and costly. The most reliable identification of amyloids is obtained with high resolution microscopies, such as electron microscopy or atomic force microscopy (AFM). More frequently, less expensive and faster methods are used, especially infrared (IR) spectroscopy or Thioflavin T staining. Different experimental methods are not always concurrent, especially when amyloid peptides do not readily form fibrils but oligomers. This may lead to peptide misclassification and mislabeling. Several bioinformatics methods have been proposed for in-silico identification of amyloids, many of them based on machine learning. The effectiveness of these methods heavily depends on accurate annotation of the reference training data obtained from in-vitro experiments. We study how robust bioinformatics methods are to weak supervision, that is, to imperfect training data. AmyloGram and three other amyloid predictors were applied. The results showed that a certain degree of misannotation in the reference data can be eliminated by the bioinformatics tools, even if the misannotated data belonged to their training set. The computational results are supported by new experiments with IR and AFM methods.


Subjects
Amyloid, Computational Biology, Computer Simulation, Peptides, Protein Aggregates/genetics, Amyloid/chemistry, Amyloid/genetics, Humans, Atomic Force Microscopy, Peptides/chemistry, Peptides/genetics, Infrared Spectrophotometry
20.
J Proteome Res ; 20(4): 2083-2088, 2021 Apr 02.
Article in English | MEDLINE | ID: mdl-33661648

ABSTRACT

The study of microbiomes has gained in importance over the past few years and has led to the emergence of the fields of metagenomics, metatranscriptomics, and metaproteomics. While initially focused on the study of biodiversity within these communities, the emphasis has increasingly shifted to the study of (changes in) the complete set of functions available in these communities. A key tool to study this functional complement of a microbiome is Gene Ontology (GO) term analysis. However, comparing large sets of GO terms is not an easy task due to the deeply branched nature of GO, which limits the utility of exact term matching. To solve this problem, we here present MegaGO, a user-friendly tool that relies on semantic similarity between GO terms to compute the functional similarity between multiple data sets. MegaGO is high performing: Each set can contain thousands of GO terms, and results are calculated in a matter of seconds. MegaGO is available as a web application at https://megago.ugent.be and is installable via pip as a standalone command line tool and reusable software library. All code is open source under the MIT license and is available at https://github.com/MEGA-GO/.


Subjects
Microbiota, Software, Computational Biology, Gene Ontology, Metagenomics, Semantics
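
The abstract contrasts exact term matching with similarity-based comparison of term sets. The Python sketch below illustrates that general idea on a tiny made-up ontology. Everything in it is an assumption for illustration: the term names and the `PARENTS` hierarchy are hypothetical, the pairwise measure (overlap of ancestor sets) is a simple stand-in, and the best-match average used to combine pairwise scores is one common choice; the abstract does not state which measures MegaGO uses, and this is not its code.

```python
# Hypothetical miniature ontology: each term maps to its direct parents.
PARENTS = {
    "root": [],
    "metabolism": ["root"],
    "transport": ["root"],
    "lipid_metabolism": ["metabolism"],
    "sugar_metabolism": ["metabolism"],
    "ion_transport": ["transport"],
}


def ancestors(term):
    """Return the term itself plus all terms reachable through parent links."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def term_similarity(a, b):
    """Jaccard overlap of the two ancestor sets (a simple stand-in measure)."""
    anc_a, anc_b = ancestors(a), ancestors(b)
    return len(anc_a & anc_b) / len(anc_a | anc_b)


def set_similarity(set_a, set_b):
    """Best-match average: pair each term with its most similar term in the
    other set, then average the best scores over both directions.
    Both sets are assumed to be non-empty."""
    best_a = [max(term_similarity(a, b) for b in set_b) for a in set_a]
    best_b = [max(term_similarity(b, a) for a in set_a) for b in set_b]
    return (sum(best_a) + sum(best_b)) / (len(best_a) + len(best_b))


sample_1 = ["lipid_metabolism", "ion_transport"]
sample_2 = ["sugar_metabolism", "ion_transport"]
score = set_similarity(sample_1, sample_2)
```

Exact matching would count only `ion_transport` as shared between the two samples, whereas the similarity-based score also gives partial credit to the two related metabolism terms because they share ancestors.
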