Search | VHL Regional Portal

1.

Deep learning-based localization algorithms on fluorescence human brain 3D reconstruction: a comparative study using stereology as a reference.

Checcucci, Curzio; Wicinski, Bridget; Mazzamuto, Giacomo; Scardigli, Marina; Ramazzotti, Josephine; Brady, Niamh; Pavone, Francesco S; Hof, Patrick R; Costantini, Irene; Frasconi, Paolo.

Sci Rep ; 14(1): 14629, 2024 06 25.

Article in English | MEDLINE | ID: mdl-38918523

ABSTRACT

3D reconstruction of human brain volumes at high resolution is now possible thanks to advancements in tissue clearing methods and fluorescence microscopy techniques. Analyzing the massive data produced with these approaches requires automatic methods able to perform fast and accurate cell counting and localization. Recent advances in deep learning have enabled the development of various tools for cell segmentation. However, accurate quantification of neurons in the human brain presents specific challenges, such as high pixel intensity variability, autofluorescence, non-specific fluorescence and very large size of data. In this paper, we provide a thorough empirical evaluation of three techniques based on deep learning (StarDist, CellPose and BCFind-v2, an updated version of BCFind) using a recently introduced three-dimensional stereological design as a reference for large-scale insights. As a representative problem in human brain analysis, we focus on a 4-cm³ portion of Broca's area. We aim at helping users in selecting appropriate techniques depending on their research objectives. To this end, we compare methods along various dimensions of analysis, including correctness of the predicted density and localization, computational efficiency, and human annotation effort. Our results suggest that deep learning approaches are very effective, have a high throughput providing each cell 3D location, and obtain results comparable to the estimates of the adopted stereological design.

Subject(s)

Brain , Deep Learning , Imaging, Three-Dimensional , Humans , Imaging, Three-Dimensional/methods , Brain/diagnostic imaging , Algorithms , Neurons/cytology , Microscopy, Fluorescence/methods

2.

Corrigendum to "Machine learning approach for prediction of outcomes in anticoagulated patients with atrial fibrillation" [International Journal of Cardiology 407 (2024) 132088].

Bernardini, Andrea; Bindini, Luca; Antonucci, Emilia; Berteotti, Martina; Giusti, Betti; Testa, Sophie; Palareti, Gualtiero; Poli, Daniela; Frasconi, Paolo; Marcucci, Rossella.

Int J Cardiol ; 410: 132226, 2024 Sep 01.

Article in English | MEDLINE | ID: mdl-38851912

3.

Machine learning approach for prediction of outcomes in anticoagulated patients with atrial fibrillation.

Bernardini, Andrea; Bindini, Luca; Antonucci, Emilia; Berteotti, Martina; Giusti, Betti; Testa, Sophie; Palareti, Gualtiero; Poli, Daniela; Frasconi, Paolo; Marcucci, Rossella.

Int J Cardiol ; 407: 132088, 2024 Jul 15.

Article in English | MEDLINE | ID: mdl-38657869

ABSTRACT

BACKGROUND: The accuracy of available prediction tools for clinical outcomes in patients with atrial fibrillation (AF) remains modest. Machine Learning (ML) has been used to predict outcomes in the AF population, but not in a population entirely on anticoagulant therapy. METHODS AND AIMS: Different supervised ML models were applied to predict all-cause death, cardiovascular (CV) death, major bleeding and stroke in anticoagulated patients with AF, processing data from the multicenter START-2 Register. RESULTS: 11078 AF patients (male n = 6029, 54.3%) were enrolled with a median follow-up period of 1.5 years [IQR 1.0-2.6]. Patients on Vitamin K Antagonists (VKA) were 5135 (46.4%) and 5943 (53.6%) were on Direct Oral Anticoagulants (DOAC). Using Multi-Gate Mixture of Experts, cross-validated AUCs of 0.779 ± 0.016 and 0.745 ± 0.022 were obtained, respectively, for the prediction of all-cause death and CV-death in the overall population. The best ML model outperformed CHA2DS2-VASc and HAS-BLED for all-cause death prediction (p < 0.001 for both). When compared to HAS-BLED, Gradient Boosting improved major bleeding prediction in DOAC patients (0.711 vs. 0.586, p < 0.001). A very low number of events during follow-up (52) resulted in a suboptimal ischemic stroke prediction (best AUC of 0.606 ± 0.117 in overall population). Body mass index, age, renal function, platelet count and hemoglobin levels were the most important variables for ML prediction. CONCLUSIONS: In AF patients, ML models showed good discriminative ability to predict all-cause death, regardless of the type of anticoagulation strategy, and major bleeding on DOAC therapy, outperforming the CHA2DS2-VASc and HAS-BLED scores for risk prediction in these populations.

Subject(s)

Anticoagulants , Atrial Fibrillation , Machine Learning , Humans , Atrial Fibrillation/drug therapy , Atrial Fibrillation/complications , Male , Female , Aged , Anticoagulants/therapeutic use , Stroke/prevention & control , Stroke/epidemiology , Stroke/etiology , Aged, 80 and over , Registries , Middle Aged , Follow-Up Studies , Predictive Value of Tests , Hemorrhage/chemically induced , Hemorrhage/epidemiology , Treatment Outcome , Risk Assessment/methods

4.

Two-Dimensional Aortic Size Normalcy: A Novelty Detection Approach.

Frasconi, Paolo; Baracchi, Daniele; Giusti, Betti; Kura, Ada; Spaziani, Gaia; Cherubini, Antonella; Favilli, Silvia; Di Lenarda, Andrea; Pepe, Guglielmina; Nistri, Stefano.

Diagnostics (Basel) ; 11(2), 2021 Feb 02.

Article in English | MEDLINE | ID: mdl-33540834

ABSTRACT

Background: To develop a tool for assessing normalcy of the thoracic aorta (TA) by echocardiography, based on either a linear regression model (Z-score), or a machine learning technique, namely one-class support vector machine (OC-SVM) (Q-score). Methods: TA diameters were measured in 1112 prospectively enrolled healthy subjects, aged 5 to 89 years. Considering sex, age and body surface area we developed two calculators based on the traditional Z-score and the novel Q-score. The calculators were compared in 198 adults with TA > 40 mm, and in 466 patients affected by either Marfan syndrome or bicuspid aortic valve (BAV). Results: Q-score attained a better Area Under the Curve (0.989; 95% CI 0.984-0.993, sensitivity = 97.5%, specificity = 95.4%) than Z-score (0.955; 95% CI 0.942-0.967, sensitivity = 81.3%, specificity = 93.3%; p < 0.0001) in patients with TA > 40 mm. The prevalence of TA dilatation in Marfan and BAV patients was higher when defined as Z-score > 2 than as Q-score < 4% (73.4% vs. 50.09%, p < 0.00001). Conclusions: Q-score is a novel tool for assessing TA normalcy based on a model requiring fewer assumptions about the distribution of the relevant variables. Notably, diameters do not need to depend linearly on anthropometric measurements. Additionally, Q-score can capture the joint distribution of these variables with all four diameters simultaneously, thus accounting for the overall aortic shape. This approach results in a lower rate of predicted TA abnormalcy in patients at risk of TA aneurysm. Further prognostic studies will be necessary for assessing the relative effectiveness of Q-score versus Z-score.

5.

Classification of Cancer Pathology Reports: A Large-Scale Comparative Study.

Martina, Stefano; Ventura, Leonardo; Frasconi, Paolo.

IEEE J Biomed Health Inform ; 24(11): 3085-3094, 2020 11.

Article in English | MEDLINE | ID: mdl-32749978

ABSTRACT

We report on the application of state-of-the-art deep learning techniques to the automatic and interpretable assignment of ICD-O3 topography and morphology codes to free-text cancer reports. We present results on a large dataset (more than 80 000 labeled and 1 500 000 unlabeled anonymized reports written in Italian and collected from hospitals in Tuscany over more than a decade) and with a large number of classes (134 morphological classes and 61 topographical classes). We compare alternative architectures in terms of prediction accuracy and interpretability and show that our best model achieves a multiclass accuracy of 90.3% on topography site assignment and 84.8% on morphology type assignment. We found that in this context hierarchical models are not better than flat models and that an element-wise maximum aggregator is slightly better than attentive models on site classification. Moreover, the maximum aggregator offers a way to interpret the classification process.
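As a rough illustration of why an element-wise maximum aggregator lends itself to interpretation, the sketch below pools token vectors by taking the per-feature maximum and records which token supplied each maximum. It is not the authors' model: the function name `max_aggregate`, the example tokens and the three-dimensional vectors are all hypothetical.

```python
from typing import List, Tuple


def max_aggregate(token_vectors: List[List[float]]) -> Tuple[List[float], List[int]]:
    """Return the element-wise maximum over token vectors and, for each
    feature, the index of the token that supplied that maximum."""
    if not token_vectors:
        raise ValueError("need at least one token vector")
    dims = len(token_vectors[0])
    pooled: List[float] = []
    winners: List[int] = []
    for d in range(dims):
        column = [vec[d] for vec in token_vectors]
        # Index of the first token reaching the maximum for feature d.
        best = max(range(len(column)), key=column.__getitem__)
        pooled.append(column[best])
        winners.append(best)
    return pooled, winners


# Hypothetical representations of a four-token report (illustrative values only).
tokens = ["adenocarcinoma", "of", "the", "colon"]
vectors = [
    [0.9, 0.1, 0.2],
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1],
    [0.2, 0.8, 0.3],
]
pooled, winners = max_aggregate(vectors)
print(pooled)                        # [0.9, 0.8, 0.3]
print([tokens[i] for i in winners])  # ['adenocarcinoma', 'colon', 'colon']
```

Because each pooled feature traces back to exactly one token, the winning tokens can be read as the parts of the text that drove the document representation.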

Subject(s)

Neoplasms , Humans

6.

Publisher Correction: Whole-Brain Vasculature Reconstruction at the Single Capillary Level.

Di Giovanna, Antonino Paolo; Tibo, Alessandro; Silvestri, Ludovico; Müllenbroich, Marie Caroline; Costantini, Irene; Allegra Mascaro, Anna Letizia; Sacconi, Leonardo; Frasconi, Paolo; Pavone, Francesco Saverio.

Sci Rep ; 9(1): 8765, 2019 Jun 14.

Article in English | MEDLINE | ID: mdl-31201354

ABSTRACT

A correction to this article has been published and is linked from the HTML and PDF versions of this paper. The error has been fixed in the paper.

7.

Whole-Brain Vasculature Reconstruction at the Single Capillary Level.

Di Giovanna, Antonino Paolo; Tibo, Alessandro; Silvestri, Ludovico; Müllenbroich, Marie Caroline; Costantini, Irene; Allegra Mascaro, Anna Letizia; Sacconi, Leonardo; Frasconi, Paolo; Pavone, Francesco Saverio.

Sci Rep ; 8(1): 12573, 2018 08 22.

Article in English | MEDLINE | ID: mdl-30135559

ABSTRACT

The distinct organization of the brain's vascular network ensures that it is adequately supplied with oxygen and nutrients. However, despite this fundamental role, a detailed reconstruction of the brain-wide vasculature at the capillary level remains elusive, due to insufficient image quality using the best available techniques. Here, we demonstrate a novel approach that improves vascular demarcation by combining CLARITY with a vascular staining approach that can fill the entire blood vessel lumen and imaging with light-sheet fluorescence microscopy. This method significantly improves image contrast, particularly in depth, thereby allowing reliable application of automatic segmentation algorithms, which play an increasingly important role in high-throughput imaging of the terabyte-sized datasets now routinely produced. Furthermore, our novel method is compatible with endogenous fluorescence, thus allowing simultaneous investigations of vasculature and genetically targeted neurons. We believe our new method will be valuable for future brain-wide investigations of the capillary network.

Subject(s)

Brain/blood supply , Capillaries/diagnostic imaging , Image Processing, Computer-Assisted , Microscopy, Fluorescence , Animals , Brain/cytology , Capillaries/physiology , Male , Mice , Mice, Inbred C57BL , Neovascularization, Physiologic , Neurons/cytology , Signal-To-Noise Ratio , Tomography

8.

Shift Aggregate Extract Networks.

Orsini, Francesco; Baracchi, Daniele; Frasconi, Paolo.

Front Robot AI ; 5: 42, 2018.

Article in English | MEDLINE | ID: mdl-33500928

ABSTRACT

We introduce an architecture based on deep hierarchical decompositions to learn effective representations of large graphs. Our framework extends classic R-decompositions used in kernel methods, enabling nested part-of-part relations. Unlike recursive neural networks, which unroll a template on input graphs directly, we unroll a neural network template over the decomposition hierarchy, allowing us to deal with the high degree variability that typically characterizes social network graphs. Deep hierarchical decompositions are also amenable to domain compression, a technique that reduces both space and time complexity by exploiting symmetries. We show empirically that our approach is able to outperform current state-of-the-art graph classification methods on large social network datasets, while at the same time being competitive on small chemobiological benchmark datasets.

9.

RNAcommender: genome-wide recommendation of RNA-protein interactions.

Corrado, Gianluca; Tebaldi, Toma; Costa, Fabrizio; Frasconi, Paolo; Passerini, Andrea.

Bioinformatics ; 32(23): 3627-3634, 2016 12 01.

Article in English | MEDLINE | ID: mdl-27503225

ABSTRACT

MOTIVATION: Information about RNA-protein interactions is a vital pre-requisite to tackle the dissection of RNA regulatory processes. Despite the recent advances of the experimental techniques, the currently available RNA interactome involves a small portion of the known RNA binding proteins. The importance of determining RNA-protein interactions, coupled with the scarcity of the available information, calls for in silico prediction of such interactions. RESULTS: We present RNAcommender, a recommender system capable of suggesting RNA targets to unexplored RNA binding proteins, by propagating the available interaction information taking into account the protein domain composition and the RNA predicted secondary structure. Our results show that RNAcommender is able to successfully suggest RNA interactors for RNA binding proteins using little or no interaction evidence. RNAcommender was tested on a large dataset of human RBP-RNA interactions, showing a good ranking performance (average AUC ROC of 0.75) and significant enrichment of correct recommendations for 75% of the tested RBPs. RNAcommender can be a valid tool to assist researchers in identifying potential interacting candidates for the majority of RBPs with uncharacterized binding preferences. AVAILABILITY AND IMPLEMENTATION: The software is freely available at http://rnacommender.disi.unitn.it CONTACT: gianluca.corrado@unitn.it or andrea.passerini@unitn.it SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

RNA-Binding Proteins/chemistry , RNA/chemistry , Software , Humans , Protein Binding

10.

Quantitative neuroanatomy of all Purkinje cells with light sheet microscopy and high-throughput image analysis.

Silvestri, Ludovico; Paciscopi, Marco; Soda, Paolo; Biamonte, Filippo; Iannello, Giulio; Frasconi, Paolo; Pavone, Francesco S.

Front Neuroanat ; 9: 68, 2015.

Article in English | MEDLINE | ID: mdl-26074783

ABSTRACT

Characterizing the cytoarchitecture of the mammalian central nervous system on a brain-wide scale is becoming a compelling need in neuroscience. For example, realistic modeling of brain activity requires the definition of quantitative features of large neuronal populations in the whole brain. Quantitative anatomical maps will also be crucial to classify the cytoarchitectonic abnormalities associated with neuronal pathologies in a highly reproducible and reliable manner. In this paper, we apply recent advances in optical microscopy and image analysis to characterize the spatial distribution of Purkinje cells (PCs) across the whole cerebellum. Light sheet microscopy was used to image with micron-scale resolution a fixed and cleared cerebellum of an L7-GFP transgenic mouse, in which all PCs are fluorescently labeled. A fast and scalable algorithm for fully automated cell identification was applied on the image to extract the position of all the fluorescent PCs. This vectorized representation of the cell population allows a thorough characterization of the complex three-dimensional distribution of the neurons, highlighting the presence of gaps inside the lamellar organization of PCs, whose density is believed to play a significant role in autism spectrum disorders. Furthermore, clustering analysis of the localized somata permits dividing the whole cerebellum into groups of PCs with high spatial correlation, suggesting new possibilities of anatomical partition. The quantitative approach presented here can be extended to study the distribution of different types of cell in many brain regions and across the whole encephalon, providing a robust base for building realistic computational models of the brain, and for unbiased morphological tissue screening in presence of pathologies and/or drug treatments.

11.

Computer-based automatic identification of neurons in gigavoxel-sized 3D human brain images.

Soda, Paolo; Acciai, Ludovica; Cordelli, Ermanno; Costantini, Irene; Sacconi, Leonardo; Pavone, Francesco Saverio; Conti, Valerio; Guerrini, Renzo; Frasconi, Paolo; Iannello, Giulio.

Annu Int Conf IEEE Eng Med Biol Soc ; 2015: 7724-7, 2015.

Article in English | MEDLINE | ID: mdl-26738082

ABSTRACT

Achieving a comprehensive knowledge of the human brain cytoarchitecture is a fundamental step to understand how the nervous system works, i.e., one of the greatest challenges of 21st-century science. The recent development of biological tissue labeling and automated microscopic imaging systems has made it possible to acquire images at micro-resolution, which produces a huge quantity of data that cannot be manually analyzed. In the case of mammalian brains, automatic methods to extract objective information at the microscale have been applied until now to mouse, macaque and cat 3D volume images. Here we report a method to automatically localize neurons in a sample of human brain removed during a surgical procedure for the treatment of drug-resistant epilepsy in a child with hemimegalencephaly, whose neurons and neurites were fluorescence labelled and finally imaged using a two-photon fluorescence microscope. The method provides the map of both parvalbuminergic neurons and all other cell nuclei with a satisfactory F-score measured using more than two thousand human-labelled somata.

Subject(s)

Brain/cytology , Imaging, Three-Dimensional/methods , Neuroimaging/methods , Neurons/cytology , Humans

12.

Large-scale automated identification of mouse brain cells in confocal light sheet microscopy images.

Frasconi, Paolo; Silvestri, Ludovico; Soda, Paolo; Cortini, Roberto; Pavone, Francesco S; Iannello, Giulio.

Bioinformatics ; 30(17): i587-93, 2014 Sep 01.

Article in English | MEDLINE | ID: mdl-25161251

ABSTRACT

MOTIVATION: Recently, confocal light sheet microscopy has enabled high-throughput acquisition of whole mouse brain 3D images at the micron scale resolution. This poses the unprecedented challenge of creating accurate digital maps of the whole set of cells in a brain. RESULTS: We introduce a fast and scalable algorithm for fully automated cell identification. We obtained the whole digital map of Purkinje cells in mouse cerebellum consisting of a set of 3D cell center coordinates. The method is accurate and we estimated an F1 measure of 0.96 using 56 representative volumes, totaling 1.09 GVoxel and containing 4138 manually annotated soma centers. AVAILABILITY AND IMPLEMENTATION: Source code and its documentation are available at http://bcfind.dinfo.unifi.it/. The whole pipeline of methods is implemented in Python and makes use of Pylearn2 and modified parts of Scikit-learn. Brain images are available on request. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Brain/cytology , Imaging, Three-Dimensional/methods , Microscopy, Confocal/methods , Neurons/cytology , Algorithms , Animals , Mice

13.

Markov logic networks for optical chemical structure recognition.

Frasconi, Paolo; Gabbrielli, Francesco; Lippi, Marco; Marinai, Simone.

J Chem Inf Model ; 54(8): 2380-90, 2014 Aug 25.

Article in English | MEDLINE | ID: mdl-25068386

ABSTRACT

Optical chemical structure recognition is the problem of converting a bitmap image containing a chemical structure formula into a standard structured representation of the molecule. We introduce a novel approach to this problem based on the pipelined integration of pattern recognition techniques with probabilistic knowledge representation and reasoning. Basic entities and relations (such as textual elements, points, lines, etc.) are first extracted by a low-level processing module. A probabilistic reasoning engine based on Markov logic, embodying chemical and graphical knowledge, is subsequently used to refine these pieces of information. An annotated connection table of atoms and bonds is finally assembled and converted into a standard chemical exchange format. We report a successful evaluation on two large image data sets, showing that the method compares favorably with the current state-of-the-art, especially on degraded low-resolution images. The system is available as a web server at http://mlocsr.dinfo.unifi.it.

Subject(s)

Markov Chains , Pattern Recognition, Automated/statistics & numerical data , Small Molecule Libraries/chemistry , Software , Computer Graphics , Databases, Chemical , Image Processing, Computer-Assisted

14.

Predicting metal-binding sites from protein sequence.

Passerini, Andrea; Lippi, Marco; Frasconi, Paolo.

IEEE/ACM Trans Comput Biol Bioinform ; 9(1): 203-13, 2012.

Article in English | MEDLINE | ID: mdl-21606549

ABSTRACT

Prediction of binding sites from sequence can significantly help toward determining the function of uncharacterized proteins on a genomic scale. The task is highly challenging due to the enormous amount of alternative candidate configurations. Previous research has only considered this prediction problem starting from 3D information. When starting from sequence alone, only methods that predict the bonding state of selected residues are available. The sole exception consists of pattern-based approaches, which rely on very specific motifs and cannot be applied to discover truly novel sites. We develop new algorithmic ideas based on structured-output learning for determining transition-metal-binding sites coordinated by cysteines and histidines. The inference step (retrieving the best scoring output) is intractable for general output types (i.e., general graphs). However, under the assumption that no residue can coordinate more than one metal ion, we prove that metal binding has the algebraic structure of a matroid, allowing us to employ a very efficient greedy algorithm. We test our predictor in a highly stringent setting where the training set consists of protein chains belonging to SCOP folds different from the ones used for accuracy estimation. In this setting, our predictor achieves 56 percent precision and 60 percent recall in the identification of ligand-ion bonds.
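To illustrate why a matroid structure makes greedy inference attractive, the following is a minimal sketch of a greedy selection under the stated assumption that no residue coordinates more than one metal ion. It is not the paper's predictor: it assumes, for illustration only, that each candidate residue-ion bond has an independent additive score, and the residue labels, ion labels and scores are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]  # (residue, ion); labels are hypothetical


def greedy_assignment(scores: Dict[Edge, float]) -> List[Edge]:
    """Greedily pick residue-ion bonds in decreasing score order, keeping a
    bond only if its residue is not already bound to another ion."""
    chosen: List[Edge] = []
    used_residues: Set[str] = set()
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    for (residue, ion), score in ranked:
        if score <= 0:
            break  # remaining bonds cannot increase the total score
        if residue not in used_residues:
            chosen.append((residue, ion))
            used_residues.add(residue)
    return chosen


# Illustrative scores only; a real system would obtain them from a trained model.
scores = {
    ("C12", "ion1"): 2.0,
    ("C12", "ion2"): 1.5,
    ("H40", "ion1"): 1.2,
    ("C55", "ion2"): 0.7,
    ("H61", "ion1"): -0.4,
}
print(greedy_assignment(scores))
# [('C12', 'ion1'), ('H40', 'ion1'), ('C55', 'ion2')]
```

For a matroid with additive weights, this kind of greedy pass returns a maximum-weight feasible set, which is what removes the need for an exhaustive search over configurations.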

Subject(s)

Binding Sites , Computational Biology/methods , Metals , Proteins , Sequence Analysis, Protein/methods , Amino Acid Sequence , Databases, Protein , Metals/chemistry , Metals/metabolism , Molecular Sequence Data , Proteins/chemistry , Proteins/metabolism

15.

MetalDetector v2.0: predicting the geometry of metal binding sites from protein sequence.

Passerini, Andrea; Lippi, Marco; Frasconi, Paolo.

Nucleic Acids Res ; 39(Web Server issue): W288-92, 2011 Jul.

Article in English | MEDLINE | ID: mdl-21576237

ABSTRACT

MetalDetector identifies CYS and HIS involved in transition metal protein binding sites, starting from sequence alone. A major new feature of release 2.0 is the ability to predict which residues are jointly involved in the coordination of the same metal ion. The server is available at http://metaldetector.dsi.unifi.it/v2.0/.

Subject(s)

Metalloproteins/chemistry , Metals/chemistry , Software , Binding Sites , Cysteine/chemistry , Histidine/chemistry , Internet , Sequence Analysis, Protein

16.

Characterization of metalloproteins by high-throughput X-ray absorption spectroscopy.

Shi, Wuxian; Punta, Marco; Bohon, Jen; Sauder, J Michael; D'Mello, Rhijuta; Sullivan, Mike; Toomey, John; Abel, Don; Lippi, Marco; Passerini, Andrea; Frasconi, Paolo; Burley, Stephen K; Rost, Burkhard; Chance, Mark R.

Genome Res ; 21(6): 898-907, 2011 Jun.

Article in English | MEDLINE | ID: mdl-21482623

ABSTRACT

High-throughput X-ray absorption spectroscopy was used to measure transition metal content based on quantitative detection of X-ray fluorescence signals for 3879 purified proteins from several hundred different protein families generated by the New York SGX Research Center for Structural Genomics. Approximately 9% of the proteins analyzed showed the presence of transition metal atoms (Zn, Cu, Ni, Co, Fe, or Mn) in stoichiometric amounts. The method is highly automated and highly reliable based on comparison of the results to crystal structure data derived from the same protein set. To leverage the experimental metalloprotein annotations, we used a sequence-based de novo prediction method, MetalDetector, to identify Cys and His residues that bind to transition metals for the redundancy reduced subset of 2411 sequences sharing <70% sequence identity and having at least one His or Cys. As the HT-XAS identifies metal type and protein binding, while the bioinformatics analysis identifies metal-binding residues, the results were combined to identify putative metal-binding sites in the proteins and their associated families. We explored the combination of this data with homology models to generate detailed structure models of metal-binding sites for representative proteins. Finally, we used extended X-ray absorption fine structure data from two of the purified Zn metalloproteins to validate predicted metalloprotein binding site structures. This combination of experimental and bioinformatics approaches provides comprehensive active site analysis on the genome scale for metalloproteins as a class, revealing new insights into metalloprotein structure and function.

Subject(s)

Metalloproteins/chemistry , Software , X-Ray Absorption Spectroscopy/methods , Binding Sites/genetics , Computational Biology/methods , Fluorescence , Genomics/methods , Metals, Heavy/analysis , Synchrotrons

17.

Prediction of protein beta-residue contacts by Markov logic networks with grounding-specific weights.

Lippi, Marco; Frasconi, Paolo.

Bioinformatics ; 25(18): 2326-33, 2009 Sep 15.

Article in English | MEDLINE | ID: mdl-19592394

ABSTRACT

MOTIVATION: Accurate prediction of contacts between beta-strand residues can significantly contribute towards ab initio prediction of the 3D structure of many proteins. Contacts in the same protein are highly interdependent. Therefore, significant improvements can be expected by applying statistical relational learners that overcome the usual machine learning assumption that examples are independent and identically distributed. Furthermore, the dependencies among beta-residue contacts are subject to strong regularities, many of which are known a priori. In this article, we take advantage of Markov logic, a statistical relational learning framework that is able to capture dependencies between contacts, and constrain the solution according to domain knowledge expressed by means of weighted rules in a logical language. RESULTS: We introduce a novel hybrid architecture based on neural and Markov logic networks with grounding-specific weights. On a non-redundant dataset, our method achieves 44.9% F1 measure, with 47.3% precision and 42.7% recall, which is significantly better (P < 0.01) than previously reported performance obtained by 2D recursive neural networks. Our approach also significantly improves the number of chains for which beta-strands are nearly perfectly paired (36% of the chains are predicted with F1 ≥ 70% on coarse map). It also outperforms more general contact predictors on recent CASP 2008 targets.
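The F1 measure quoted in this abstract is the harmonic mean of precision and recall, so the three reported figures can be checked against each other. A minimal sketch follows; the function name `f1_score` is chosen here for illustration, and the inputs are the precision and recall values reported above.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract above.
f1 = f1_score(0.473, 0.427)
print(f"F1 = {f1:.1%}")  # F1 = 44.9%
```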

Subject(s)

Markov Chains , Neural Networks, Computer , Proteins/chemistry , Computational Biology/methods , Databases, Protein , Protein Conformation

18.

MetalDetector: a web server for predicting metal-binding sites and disulfide bridges in proteins from sequence.

Lippi, Marco; Passerini, Andrea; Punta, Marco; Rost, Burkhard; Frasconi, Paolo.

Bioinformatics ; 24(18): 2094-5, 2008 Sep 15.

Article in English | MEDLINE | ID: mdl-18635571

ABSTRACT

UNLABELLED: The web server MetalDetector classifies histidine residues in proteins into one of two states (free or metal bound) and cysteines into one of three states (free, metal bound or disulfide bridged). A decision tree integrates predictions from two previously developed methods (DISULFIND and Metal Ligand Predictor). Cross-validated performance assessment indicates that our server predicts disulfide bonding state at 88.6% precision and 85.1% recall, while it identifies cysteines and histidines in transition metal-binding sites at 79.9% precision and 76.8% recall, and at 60.8% precision and 40.7% recall, respectively. AVAILABILITY: Freely available at http://metaldetector.dsi.unifi.it. SUPPLEMENTARY INFORMATION: Details and data can be found at http://metaldetector.dsi.unifi.it/help.php.

Subject(s)

Computational Biology/methods , Cysteine/chemistry , Disulfides/chemistry , Histidine/chemistry , Metalloproteins/chemistry , Sequence Analysis, Protein , Amino Acid Sequence , Binding Sites , Computer Simulation , Databases, Protein , Disulfides/metabolism , Internet , Metalloproteins/metabolism , Molecular Sequence Data , Sequence Alignment

19.

A simplified approach to disulfide connectivity prediction from protein sequences.

Vincent, Marc; Passerini, Andrea; Labbé, Matthieu; Frasconi, Paolo.

BMC Bioinformatics ; 9: 20, 2008 Jan 14.

Article in English | MEDLINE | ID: mdl-18194539

ABSTRACT

BACKGROUND: Prediction of disulfide bridges from protein sequences is useful for characterizing structural and functional properties of proteins. Several methods based on different machine learning algorithms have been applied to solve this problem and public domain prediction services exist. These methods are however still potentially subject to significant improvements both in terms of prediction accuracy and overall architectural complexity. RESULTS: We introduce new methods for predicting disulfide bridges from protein sequences. The methods take advantage of two new decomposition kernels for measuring the similarity between protein sequences according to the amino acid environments around cysteines. Disulfide connectivity is predicted in two passes. First, a binary classifier is trained to predict whether a given protein chain has at least one intra-chain disulfide bridge. Second, a multiclass classifier (implemented by 1-nearest neighbor) is trained to predict connectivity patterns. The two passes can be easily cascaded to obtain connectivity prediction from sequence alone. We report an extensive experimental comparison on several data sets that have been previously employed in the literature to assess the accuracy of cysteine bonding state and disulfide connectivity predictors. CONCLUSION: We reach state-of-the-art results on bonding state prediction with a simple method that classifies chains rather than individual residues. The prediction accuracy reached by our connectivity prediction method compares favorably with respect to all but the most complex other approaches. On the other hand, our method does not need any model selection or hyperparameter tuning, a property that makes it less prone to overfitting and prediction accuracy overestimation.

Subject(s)

Disulfides/chemistry , Models, Chemical , Models, Molecular , Proteins/chemistry , Sequence Analysis, Protein/methods , Algorithms , Amino Acid Sequence , Binding Sites , Computer Simulation , Molecular Sequence Data , Protein Binding

20.

Classification of small molecules by two- and three-dimensional decomposition kernels.

Ceroni, Alessio; Costa, Fabrizio; Frasconi, Paolo.

Bioinformatics ; 23(16): 2038-45, 2007 Aug 15.

Article in English | MEDLINE | ID: mdl-17550912

ABSTRACT

MOTIVATION: Several kernel-based methods have been recently introduced for the classification of small molecules. Most available kernels on molecules are based on 2D representations obtained from chemical structures, but far less work has focused so far on the definition of effective kernels that can also exploit 3D information. RESULTS: We introduce new ideas for building kernels on small molecules that can effectively use and combine 2D and 3D information. We tested these kernels in conjunction with support vector machines for binary classification on the 60 NCI cancer screening datasets as well as on the NCI HIV data set. Our results show that 3D information leveraged by these kernels can consistently improve prediction accuracy in all datasets. AVAILABILITY: An implementation of the small molecule classifier is available from http://www.dsi.unifi.it/neural/src/3DDK.

Subject(s)

Biomarkers, Tumor/chemistry , Models, Chemical , Models, Molecular , Neoplasm Proteins/chemistry , Neoplasm Proteins/ultrastructure , Neoplasms/metabolism , Sequence Analysis, Protein/methods , Algorithms , Amino Acid Sequence , Computer Simulation , Molecular Sequence Data , Neoplasm Proteins/classification , Pattern Recognition, Automated/methods , Protein Conformation
