1.
Sensors (Basel) ; 23(4)2023 Feb 07.
Article in English | MEDLINE | ID: mdl-36850472

ABSTRACT

Driver fatigue reduces the safety of traditional driving and limits the widespread adoption of self-driving cars; hence, the monitoring and early detection of drivers' drowsiness plays a key role in driving automation. When representing the drowsiness indicators as large feature vectors, fitting a machine learning model to the problem becomes challenging, and the problem's perspicuity decreases, making dimensionality reduction crucial in practice. For this reason, we propose an embedded feature selection algorithm that can be later utilized as a building block in the system development of a neural network-based drowsiness detector. We have adopted a technique: a so-called Feature Prune Layer is placed in front of the first layer in the architecture; as a result, its weights change regarding the importance of the corresponding input features and are deleted iteratively until the desired number is reached. We test the algorithm on EEG data, as it is one of the best indicators of drowsiness based on the literature. The proposed FS algorithm is able to reduce the original feature set by 95% with only 1% degradation in precision, while the precision increases by 1.5% and 2.7% respectively when selecting the top 10% and top 20% of the initial features. Moreover, the proposed method outperforms the widely popular Principal Component Analysis and the Chi-squared test when reducing the original feature set by 95%: it achieves 24.3% and 3.2% higher precision respectively.
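The iterative pruning idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the network, loss, or retraining schedule, so the code below assumes a single logistic unit behind the gate layer, plain gradient descent, retraining from scratch after each deletion, and one feature removed per round. The names `train_gates` and `prune_features` are hypothetical.

```python
import math
import random


def train_gates(X, y, epochs=100, lr=0.3):
    """Fit gate layer -> logistic unit by gradient descent; return the gate weights.

    Assumption: the "prune layer" is modelled as one multiplicative weight per
    input feature, placed in front of a single logistic output unit.
    """
    n, d = len(X), len(X[0])
    gate = [1.0] * d   # prune layer, one weight per input feature
    w = [0.0] * d      # weights of the downstream logistic unit
    b = 0.0
    for _ in range(epochs):
        g_w, g_gate, g_b = [0.0] * d, [0.0] * d, 0.0
        for row, target in zip(X, y):
            z = b + sum(x * g * wi for x, g, wi in zip(row, gate, w))
            # Gradient of the mean log-loss with respect to z for this sample.
            err = (1.0 / (1.0 + math.exp(-z)) - target) / n
            g_b += err
            for j, x in enumerate(row):
                g_w[j] += err * x * gate[j]
                g_gate[j] += err * x * w[j]
        w = [wi - lr * g for wi, g in zip(w, g_w)]
        gate = [gi - lr * g for gi, g in zip(gate, g_gate)]
        b -= lr * g_b
    return gate


def prune_features(X, y, n_keep, **train_kwargs):
    """Delete the feature with the smallest |gate| until n_keep features remain."""
    active = list(range(len(X[0])))  # original indices of surviving features
    while len(active) > n_keep:
        sub = [[row[j] for j in active] for row in X]
        gate = train_gates(sub, y, **train_kwargs)
        weakest = min(range(len(active)), key=lambda k: abs(gate[k]))
        del active[weakest]
    return active


if __name__ == "__main__":
    # Synthetic data: only features 0 and 1 determine the label.
    rng = random.Random(0)
    X = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(200)]
    y = [1.0 if row[0] + row[1] > 0 else 0.0 for row in X]
    print(prune_features(X, y, n_keep=2))
```

In this toy setup the gate weights of informative inputs grow away from their initial value of 1 while those of uninformative inputs barely move, which is what makes the gate magnitude usable as a ranking criterion.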

2.
Bioinformatics ; 33(22): 3682-3684, 2017 Nov 15.
Article in English | MEDLINE | ID: mdl-29036655

ABSTRACT

MOTIVATION: It is commonplace that intrinsically disordered proteins (IDPs) are involved in crucial interactions in the living cell. However, the study of protein complexes formed exclusively by IDPs is hindered by the lack of data and such analyses remain sporadic. Systematic studies benefited other types of protein-protein interactions paving a way from basic science to therapeutics; yet these efforts require reliable datasets that are currently lacking for synergistically folding complexes of IDPs. RESULTS: Here we present the Mutual Folding Induced by Binding (MFIB) database, the first systematic collection of complexes formed exclusively by IDPs. MFIB contains an order of magnitude more data than any dataset used in corresponding studies and offers a wide coverage of known IDP complexes in terms of flexibility, oligomeric composition and protein function from all domains of life. The included complexes are grouped using a hierarchical classification and are complemented with structural and functional annotations. MFIB is backed by a firm development team and infrastructure, and together with possible future community collaboration it will provide the cornerstone for structural and functional studies of IDP complexes. AVAILABILITY AND IMPLEMENTATION: MFIB is freely accessible at http://mfib.enzim.ttk.mta.hu/. The MFIB application is hosted by Apache web server and was implemented in PHP. To enrich querying features and to enhance backend performance a MySQL database was also created. CONTACT: simon.istvan@ttk.mta.hu, meszaros.balint@ttk.mta.hu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Computational Biology/methods , Databases, Factual , Intrinsically Disordered Proteins/metabolism , Protein Folding , Humans , Intrinsically Disordered Proteins/chemistry , Protein Binding
3.
Nucleic Acids Res ; 45(D1): D325-D330, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27924015

ABSTRACT

The TSTMP database is designed to help the target selection of human transmembrane proteins for structural genomics projects and structure modeling studies. Currently, there are only 60 known 3D structures among the polytopic human transmembrane proteins and about a further 600 could be modeled using existing structures. Although there are a great number of human transmembrane protein structures left to be determined, surprisingly only a small fraction of these proteins have 'selected' (or above) status according to the current version of the TargetDB/TargetTrack database. This figure is even worse regarding those transmembrane proteins that would contribute the most to the structural coverage of the human transmembrane proteome. The database was built by sorting out proteins from the human transmembrane proteome with known structure and searching for suitable model structures for the remaining proteins by combining the results of a state-of-the-art transmembrane-specific fold recognition algorithm and a sequence similarity search algorithm. Proteins were searched for homologues among the human transmembrane proteins in order to select targets whose successful structure determination would lead to the best structural coverage of the human transmembrane proteome. The pipeline constructed for creating the TSTMP database guarantees that the database is kept up to date. The database is available at http://tstmp.enzim.ttk.mta.hu.
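One way to picture the target selection described above is as a coverage problem: solving one structure makes its homologues modelable, so targets can be ranked by how many not-yet-covered proteins each would add. The sketch below uses a simple greedy ranking. It is an illustration under assumptions, not the TSTMP pipeline: the function name `rank_targets`, the input format, and the greedy strategy itself are all hypothetical.

```python
def rank_targets(homologs, already_modelable=frozenset()):
    """Greedy ranking of structure-determination targets by coverage gain.

    homologs: dict mapping each candidate protein to an iterable of proteins
              that could be modelled from its structure (hypothetical input).
    already_modelable: proteins that already have a structure or a template.
    Returns a list of (target, number of newly covered proteins).
    """
    covered = set(already_modelable)
    # A solved target always covers itself in addition to its homologues.
    remaining = {t: set(h) | {t} for t, h in homologs.items() if t not in covered}
    ranking = []
    while remaining:
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=lambda t: len(remaining[t] - covered))
        gain = len(remaining[best] - covered)
        if gain == 0:
            break  # every remaining candidate is already covered
        ranking.append((best, gain))
        covered |= remaining.pop(best)
    return ranking


if __name__ == "__main__":
    toy = {"A": ["B", "C"], "B": ["A"], "C": ["A"], "D": ["E"], "E": ["D"], "F": []}
    print(rank_targets(toy))  # [('A', 3), ('D', 2), ('F', 1)]
```

Greedy selection is a common heuristic for coverage problems of this kind; it does not guarantee the globally optimal set of targets.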


Subject(s)
Computational Biology/methods , Databases, Protein , Genomics/methods , Membrane Proteins , Humans , Membrane Proteins/chemistry , Membrane Proteins/genetics , Models, Molecular , Protein Conformation , Proteome , Proteomics/methods , Structure-Activity Relationship , Web Browser
4.
Nucleic Acids Res ; 43(W1): W408-12, 2015 Jul 01.
Article in English | MEDLINE | ID: mdl-25943549

ABSTRACT

The Consensus Constrained TOPology prediction (CCTOP; http://cctop.enzim.ttk.mta.hu) server is a web-based application providing transmembrane topology prediction. In addition to utilizing 10 different state-of-the-art topology prediction methods, the CCTOP server incorporates topology information from existing experimental and computational sources available in the PDBTM, TOPDB and TOPDOM databases using the probabilistic framework of a hidden Markov model. The server provides the option to precede the topology prediction with signal peptide prediction and transmembrane-globular protein discrimination. The initial result can be recalculated by (de)selecting any of the prediction methods or mapped experiments or by adding user-specified constraints. CCTOP showed superior performance to existing approaches. The reliability of each prediction is also calculated, which correlates with the accuracy of the per-protein topology prediction. The prediction results and the collected experimental information are visualized on the CCTOP home page and can be downloaded in XML format. Programmatic access to the CCTOP server is also available, and an example client-side script is provided.
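The general idea of decoding a topology under constraints can be shown with a toy example. The sketch below is not the CCTOP model: it assumes a three-state alphabet (I = inside, M = membrane, O = outside), uniform scores for allowed transitions, and per-residue state probabilities that stand in for a consensus of predictors. Constraints (for example, an experimentally localized residue) are applied by forbidding all other states at that position. All names and numbers are hypothetical.

```python
import math

STATES = ("I", "M", "O")  # inside, membrane, outside (toy three-state model)
# Inside and outside are only reachable from each other through the membrane.
ALLOWED = {("I", "I"), ("I", "M"), ("M", "M"), ("M", "I"),
           ("M", "O"), ("O", "O"), ("O", "M")}


def constrained_viterbi(scores, constraints=None):
    """Best state path given per-residue log-scores and fixed positions.

    scores: list of dicts mapping state -> log-score, one dict per residue.
    constraints: dict mapping residue index -> required state.
    """
    constraints = constraints or {}

    def emit(i, s):
        forced = constraints.get(i)
        if forced is not None and forced != s:
            return -math.inf  # state contradicts the constraint
        return scores[i][s]

    best = [{s: emit(0, s) for s in STATES}]
    back = []
    for i in range(1, len(scores)):
        row, ptr = {}, {}
        for s in STATES:
            prev = max((p for p in STATES if (p, s) in ALLOWED),
                       key=lambda p: best[-1][p])
            row[s] = best[-1][prev] + emit(i, s)
            ptr[s] = prev
        best.append(row)
        back.append(ptr)

    # Trace back from the best final state.
    state = max(STATES, key=lambda s: best[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return "".join(reversed(path))


if __name__ == "__main__":
    probs = [
        {"I": 0.7, "M": 0.1, "O": 0.2},
        {"I": 0.6, "M": 0.3, "O": 0.1},
        {"I": 0.1, "M": 0.8, "O": 0.1},
        {"I": 0.1, "M": 0.8, "O": 0.1},
        {"I": 0.2, "M": 0.2, "O": 0.6},
        {"I": 0.2, "M": 0.1, "O": 0.7},
    ]
    scores = [{s: math.log(p) for s, p in row.items()} for row in probs]
    print(constrained_viterbi(scores))            # IIMMOO
    print(constrained_viterbi(scores, {0: "O"}))  # OMMMOO
```

Fixing the first residue to the outside changes the decoded path at the following position as well, because a direct outside-to-inside transition is not allowed in this toy model.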


Subject(s)
Membrane Proteins/chemistry , Software , Algorithms , Humans , Internet , Protein Conformation
5.
Biol Direct ; 10: 31, 2015 May 28.
Article in English | MEDLINE | ID: mdl-26018427

ABSTRACT

BACKGROUND: Transmembrane proteins have important roles in cells, as they are involved in energy production, signal transduction, cell-cell interaction, cell-cell communication and more. In human cells, they are frequently targets for pharmaceuticals; therefore, knowledge about their properties and structure is crucial. The topology of transmembrane proteins provides low-resolution structural information, which can be a starting point for either laboratory experiments or modelling their 3D structures. RESULTS: Here, we present a database of the human α-helical transmembrane proteome, including the predicted and/or experimentally established topology of each transmembrane protein, together with the reliability of the prediction. In order to distinguish transmembrane proteins in the proteome as well as for topology prediction, we used a newly developed consensus method (CCTOP) that incorporates recent state-of-the-art methods, with tested accuracies on a novel human benchmark protein set. CCTOP utilizes all available structure and topology data as well as bioinformatic evidence for topology prediction in a probabilistic framework provided by the hidden Markov model. This method shows the highest accuracy (98.5% for discriminating between transmembrane and non-transmembrane proteins and 84% for per-protein topology prediction) among the dozen tested topology prediction methods. Analysis of the human proteome with CCTOP indicates that it contains 4998 (26%) transmembrane proteins. Besides predicting topology, reliability of the predictions is estimated as well, and it is demonstrated that the per-protein prediction accuracies of more than 60% of the predictions are over 98% on the benchmark sets and most probably on the predicted human transmembrane proteome too. CONCLUSIONS: Here, we present the most accurate prediction of the human transmembrane proteome together with the experimental topology data. These data, as well as various statistics about the human transmembrane proteins and their topologies, can be downloaded from and visualized at the website of the human transmembrane proteome (http://htp.enzim.hu).


Subject(s)
Membrane Proteins/chemistry , Proteome , Algorithms , Cell Communication , Computational Biology , Databases, Protein , Humans , Markov Chains , Probability , Protein Conformation , Protein Sorting Signals , Signal Transduction
6.
Nucleic Acids Res ; 43(Database issue): D283-9, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25392424

ABSTRACT

The Topology Data Bank of Transmembrane Proteins (TOPDB, http://topdb.enzim.ttk.mta.hu) contains experimentally determined topology data of transmembrane proteins. Recently, we have updated TOPDB from several sources and utilized a newly developed topology prediction algorithm to determine the most reliable topology using the results of experiments as constraints. In addition to collecting the experimentally determined topology data published in the last couple of years, we gathered topographies defined by the TMDET algorithm using 3D structures from the PDBTM. Results of global topology analysis of various organisms, as well as topology data generated by high-throughput techniques, such as the sequential positions of N- or O-glycosylations, were incorporated into the TOPDB database. Moreover, a new algorithm was developed to integrate scattered topology data from various publicly available databases and a new method was introduced to measure the reliability of predicted topologies. We show that reliability values highly correlate with the per-protein topology accuracy of the utilized prediction method. Altogether, more than 52,000 new topology data and more than 2600 new transmembrane proteins have been collected since the last public release of the TOPDB database.


Subject(s)
Databases, Protein , Membrane Proteins/chemistry , Protein Conformation