1.
Appl Intell (Dordr) ; 52(1): 71-80, 2022.
Article in English | MEDLINE | ID: mdl-34764595

ABSTRACT

Common compartmental models for COVID-19 are based on a priori knowledge and numerous assumptions. Additionally, they do not systematically incorporate asymptomatic cases. Our study aimed to provide a framework for data-driven approaches by leveraging the strengths of grey-box system theory, or grey-box identification, known for its robustness in problem solving under partial, incomplete, or uncertain data. Empirical data on confirmed cases and deaths, extracted from an open-source repository, were used to develop the SEAIRD compartment model. Adjustments were made to fit current knowledge of COVID-19 behavior. The model was implemented and solved using an ordinary differential equation solver and an optimization tool. A cross-validation technique was applied, and the coefficient of determination R² was computed to evaluate the goodness of fit of the model. Key epidemiological parameters were finally estimated, and we provided the rationale for the construction of the SEAIRD model. When applied to Brazil's cases, SEAIRD produced excellent agreement with the data, with R² ≥ 90%. The probability of COVID-19 transmission was generally high (≥ 95%). On the basis of 20 days of modeling data, the incidence rate of COVID-19 was as low as 3 infected cases per 100,000 exposed persons in Brazil and France. Within the same time frame, the fatality rate of COVID-19 was highest in France (16.4%), followed by Brazil (6.9%), and lowest in Russia (≤ 1%). SEAIRD represents an asset for modeling infectious diseases in their dynamically stable phase, especially for new viruses when pathophysiological knowledge is very limited. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s10489-021-02379-2.
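
The abstract above does not give the SEAIRD equations, so the following is only a minimal sketch of what such a model could look like. It assumes the compartments are susceptible, exposed, asymptomatic, infected, recovered, and dead; the flow structure, the parameter names (`beta`, `sigma`, `p`, `gamma_a`, `gamma_i`, `mu`), and the fixed-step Runge-Kutta integrator are all assumptions made for illustration, not the authors' implementation. The parameter-fitting (optimization) step is omitted; `r_squared` shows only how the goodness of fit could be scored once predictions exist.

```python
def seaird_derivs(state, beta, sigma, p, gamma_a, gamma_i, mu):
    """Right-hand side of an assumed SEAIRD system (hypothetical flows)."""
    s, e, a, i, r, d = state
    n = s + e + a + i + r                      # living population
    new_exposed = beta * s * (a + i) / n       # both A and I are assumed infectious
    return (
        -new_exposed,                          # dS/dt
        new_exposed - sigma * e,               # dE/dt
        p * sigma * e - gamma_a * a,           # dA/dt: fraction p stays asymptomatic
        (1 - p) * sigma * e - (gamma_i + mu) * i,  # dI/dt
        gamma_a * a + gamma_i * i,             # dR/dt
        mu * i,                                # dD/dt
    )


def rk4_step(state, dt, **params):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(x + h * k for x, k in zip(base, slope))

    k1 = seaird_derivs(state, **params)
    k2 = seaird_derivs(shifted(state, k1, dt / 2), **params)
    k3 = seaird_derivs(shifted(state, k2, dt / 2), **params)
    k4 = seaird_derivs(shifted(state, k3, dt), **params)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(state, days, dt=0.1, **params):
    """Return one state per day, starting with the initial state."""
    steps_per_day = round(1 / dt)
    trajectory = [state]
    for _ in range(days):
        for _ in range(steps_per_day):
            state = rk4_step(state, dt, **params)
        trajectory.append(state)
    return trajectory


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - q) ** 2 for o, q in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot
```

In a grey-box setting, an optimizer would repeatedly call `simulate` with candidate parameters and compare the predicted cases and deaths against the reported series.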

3.
BMC Bioinformatics ; 19(Suppl 13): 466, 2019 Feb 04.
Article in English | MEDLINE | ID: mdl-30717663

ABSTRACT

BACKGROUND: With recent advances in high-throughput experimental procedures, biologists are gathering huge quantities of data. A main priority in bioinformatics and computational biology is to provide system-level analytical tools capable of keeping pace with the ever-growing production of high-throughput biological data while taking into account its biological context. In gene expression data analysis, genes have widely been considered as independent components. However, a systemic view shows that they act synergistically in living cells, forming functional complexes and, more generally, a biological system. RESULTS: In this paper, we propose LATNET, a signal transformation framework that, starting from initial large-scale gene expression data, generates new representations based on latent network-based relationships between the genes. LATNET aims to leverage system-level relations between the genes as an underlying hidden structure to derive the new transformed latent signals. We present a concrete implementation of our framework, based on a gene regulatory network structure and two signal transformation approaches, to quantify latent network-based activity of regulators, as well as gene perturbation signals. The new gene/regulator signals are at the level of each sample of the input data and, thus, could be used directly in place of the initial expression signals for major bioinformatics analyses, including diagnosis and personalized medicine. CONCLUSION: Multiple patterns could be hidden or weakly observed in expression data. LATNET helps uncover latent signals that could emphasize hidden patterns based on the relations between the genes and thus enhance the performance of gene expression-based analysis algorithms. We use LATNET for the analysis of real-world gene expression data of bladder cancer, and we show the efficiency of our transformation framework as compared to using the initial expression data.


Subject(s)
Data Analysis , Gene Expression Regulation , Gene Regulatory Networks , Algorithms , Area Under Curve , Computational Biology/methods , Databases, Genetic , Humans
4.
IEEE/ACM Trans Comput Biol Bioinform ; 16(5): 1537-1549, 2019.
Article in English | MEDLINE | ID: mdl-28961123

ABSTRACT

Modeling the interface region of a protein complex paves the way for understanding its dynamics and functionalities. Existing works model the interface region of a complex using different approaches, such as the residue composition at the interface region, the geometry of the interface residues, or the structural alignment of interface regions. These approaches are useful for ranking a set of docked conformations or for building a scoring function for protein-protein docking, but they do not provide a generic and scalable technique for the extraction of interface patterns leading to functional motif discovery. In this work, we model the interface region of a protein complex by graphs and extract interface patterns of the given complex in the form of frequent subgraphs. To achieve this, we develop a scalable algorithm for frequent subgraph mining. We show that a systematic review of the mined subgraphs provides an effective method for the discovery of functional motifs that exist along the interface region of a given protein complex. In our experiments, we use three PDB protein structure datasets. The first two datasets are composed of PDB structures from different conformations of two dimeric protein complexes: HIV-1 protease (329 structures) and triosephosphate isomerase (TIM) (86 structures). The third dataset is a collection of different enzyme protein structures from the six top-level enzyme classes, namely: Oxidoreductase, Transferase, Hydrolase, Lyase, Isomerase, and Ligase. We show that for the first two datasets, our method captures the locking mechanism at the dimeric interface by taking into account the spatial positioning of the interfacial residues through graphs. Indeed, our frequent subgraph mining-based approach discovers the patterns representing the dimerization lock, which is formed at the base of the structure, in 323 of the 329 HIV-1 protease structures. Similarly, for the 86 TIM structures, our approach discovers the dimerization lock formation in 50 structures. For the enzyme structures, we show that we are able to capture the functional motifs (active sites) that are specific to each of the six top-level classes of enzymes through frequent subgraphs.
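
To make the idea of interface graphs and frequent patterns concrete, here is a deliberately simplified sketch. It mines only single-edge patterns (pairs of residue labels in contact across the interface), which is the first level of frequent subgraph mining, not the scalable algorithm described in the abstract. The input layout, the function names, and the 8.0 distance cutoff are assumptions chosen for illustration.

```python
from collections import Counter
from itertools import product
from math import dist


def interface_edges(chain_a, chain_b, cutoff=8.0):
    """Labeled contacts between two chains.

    Each chain is a list of (residue_name, (x, y, z)) tuples; this input
    format and the cutoff value are assumptions of the sketch.
    """
    edges = set()
    for (name_a, pos_a), (name_b, pos_b) in product(chain_a, chain_b):
        if dist(pos_a, pos_b) <= cutoff:
            # Sort the labels so that ("ASP", "THR") and ("THR", "ASP") coincide.
            edges.add(tuple(sorted((name_a, name_b))))
    return edges


def frequent_edge_patterns(structures, min_support, cutoff=8.0):
    """Keep edge patterns present in at least `min_support` structures."""
    support = Counter()
    for chain_a, chain_b in structures:
        # A set per structure ensures each pattern counts once per structure.
        support.update(interface_edges(chain_a, chain_b, cutoff))
    return {pattern: n for pattern, n in support.items() if n >= min_support}
```

A full miner would grow these frequent edges into larger connected subgraphs and test each candidate for isomorphism, which is where scalability becomes the main difficulty.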


Subject(s)
Amino Acid Motifs , Computational Biology/methods , Proteins , Algorithms , Data Mining , Databases, Protein , Models, Molecular , Protein Conformation , Protein Subunits , Proteins/chemistry , Proteins/metabolism
5.
J Comput Biol ; 26(6): 561-571, 2019 06.
Article in English | MEDLINE | ID: mdl-30517022

ABSTRACT

Studying protein structures is a major asset for understanding the molecular mechanisms of life. The number of publicly available protein structures has grown extremely large. Yet, the classification of a protein structure remains a difficult, costly, and time-consuming task. Exploring spatial information on protein structures can provide important functional and structural insights. In this context, spatial motifs may correspond to relevant fragments, which might be very useful for a better understanding of proteins. In this article, we propose AntMot, a fast algorithm for finding spatial motifs in protein three-dimensional structures by extending the Karp-Miller-Rosenberg repetition finder, originally dedicated to sequences. The extracted motifs, termed ant-motifs, follow an ant-like shape that is composed of a backbone fragment from the primary structure, enriched with spatial refinements. We show that these motifs are biologically sound, and we used them as descriptors in the classification of several benchmark datasets. Experimental results show that our approach presents a trade-off between sequential motifs and subgraph motifs in terms of the number of extracted substructures, while providing a significant enhancement in classification accuracy over sequential and frequent-subgraph motifs as well as alignment-based approaches.
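
For readers unfamiliar with the Karp-Miller-Rosenberg (KMR) idea that AntMot builds on, the sketch below shows the sequence-only version: positions are grouped into equivalence classes for short fragments, and classes for longer fragments are derived by pairing two overlapping shorter ones. The function names are hypothetical, and the spatial refinements that define ant-motifs are not reproduced here.

```python
from collections import defaultdict


def kmr_classes(seq, length):
    """Class number of every fragment of `length` in `seq`.

    Two positions receive the same number exactly when the fragments
    starting there are identical.
    """
    ids = {sym: n for n, sym in enumerate(sorted(set(seq)))}
    classes = [ids[sym] for sym in seq]        # classes for fragments of length 1
    k = 1
    while k < length:
        step = min(k, length - k)              # grow from k to k + step (at most 2k)
        # A fragment of length k + step is determined by the two length-k
        # fragments starting at i and i + step (they overlap when step < k).
        pairs = [
            (classes[i], classes[i + step])
            for i in range(len(seq) - (k + step) + 1)
        ]
        ids = {pair: n for n, pair in enumerate(sorted(set(pairs)))}
        classes = [ids[pair] for pair in pairs]
        k += step
    return classes


def repeated_fragments(seq, length):
    """Map each fragment occurring more than once to its start positions."""
    groups = defaultdict(list)
    for pos, cls in enumerate(kmr_classes(seq, length)):
        groups[cls].append(pos)
    return {
        seq[positions[0]:positions[0] + length]: positions
        for positions in groups.values()
        if len(positions) > 1
    }
```

Because each round at most doubles the fragment length, only a logarithmic number of rounds is needed to reach a given length.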


Subject(s)
Computational Biology/methods , Data Mining/methods , Proteins/chemistry , Algorithms , Amino Acid Motifs , Databases, Protein , Protein Conformation
6.
BioData Min ; 9: 30, 2016.
Article in English | MEDLINE | ID: mdl-27688811

ABSTRACT

BACKGROUND: Studying the functions and structures of proteins is important for understanding the molecular mechanisms of life. The number of publicly available protein structures has grown extremely large. Still, the classification of a protein structure remains a difficult, costly, and time-consuming task. The difficulties are often due to the essential role of spatial and topological structures in the classification of protein structures. RESULTS: We propose ProtNN, a novel classification approach for protein 3D structures. Given an unannotated query protein structure and a set of annotated proteins, ProtNN assigns to the query protein the class with the highest number of votes across the k nearest neighbor reference proteins, where k is a user-defined parameter. The search for the nearest annotated structures is based on a protein-graph representation model and pairwise similarities between vector embeddings of the query and the reference protein structures in structural and topological spaces. CONCLUSIONS: We demonstrate through an extensive experimental evaluation that ProtNN is able to accurately classify several datasets with an extremely fast runtime compared to state-of-the-art approaches. We further show that ProtNN is able to scale up to a whole PDB dataset in single-process mode with no parallelization, with a runtime gain on the order of thousands of times compared to state-of-the-art approaches.
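
The voting rule described above can be sketched in a few lines. This is not ProtNN itself: the embeddings are assumed to be precomputed numeric vectors, Euclidean distance stands in for whatever similarity the authors use, and the function name is hypothetical.

```python
from collections import Counter
from math import dist


def knn_classify(query, references, k=3):
    """Return the majority class among the k references nearest to `query`.

    `references` is a list of (embedding, label) pairs, where each embedding
    is a sequence of numbers of the same length as `query` (an assumption).
    """
    nearest = sorted(references, key=lambda ref: dist(query, ref[0]))[:k]
    votes = Counter(label for _, label in nearest)
    # On a tie, Counter keeps first-seen order, so the label of the
    # closest tied neighbor wins.
    return votes.most_common(1)[0][0]
```

Since k is user-defined, an odd value is a common choice for two-class problems because it avoids ties altogether.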

7.
J Comput Biol ; 21(2): 162-72, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24117330

ABSTRACT

One of the most powerful techniques to study proteins is to look for recurrent fragments (also called substructures), then use them as patterns to characterize the proteins under study. Although protein sequences have been extensively studied in the literature, studying protein three-dimensional (3D) structures can reveal relevant structural and functional information that may not be derived from protein sequences alone. An emerging trend consists of parsing protein 3D structures into graphs of amino acids. Hence, the search for recurrent substructures is formulated as a process of frequent subgraph discovery where each subgraph represents a 3D motif. In this scope, several efficient approaches for frequent 3D motif discovery have been proposed in the literature. However, the set of discovered 3D motifs is too large to be efficiently analyzed and explored in any further process. In this article, we propose a novel pattern selection approach that shrinks the large number of frequent 3D motifs by selecting a subset of representative ones. Existing pattern selection approaches do not exploit domain knowledge. In our approach, by contrast, we incorporate the evolutionary information of amino acids defined in substitution matrices in order to select the representative 3D motifs. We show the effectiveness of our approach on a number of real datasets. Our experimental results show that considering the substitution between amino acids allows our approach to detect many similarities between patterns that are ignored by current subgraph selection approaches, and that it is able to considerably decrease the number of 3D motifs while enhancing their interestingness.
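
The selection idea above, treating two motifs as redundant when their amino acids are mutually substitutable, can be illustrated with a toy sketch. Everything here is an assumption: motifs are reduced to equal-length strings of one-letter residue codes (graph topology is ignored), `TOY_SCORES` holds illustrative values rather than entries from a real substitution matrix, and the greedy first-come selection is not the authors' procedure.

```python
# Illustrative substitution scores (hypothetical, not a real matrix).
TOY_SCORES = {
    frozenset("IL"): 2,
    frozenset("IV"): 3,
    frozenset("DE"): 2,
    frozenset("KR"): 2,
}


def substitutable(a, b, threshold=1):
    """True if residues a and b are identical or score at least `threshold`."""
    if a == b:
        return True
    return TOY_SCORES.get(frozenset((a, b)), -1) >= threshold


def similar(motif_1, motif_2, threshold=1):
    """Two motifs are similar if every aligned residue pair is substitutable."""
    return len(motif_1) == len(motif_2) and all(
        substitutable(a, b, threshold) for a, b in zip(motif_1, motif_2)
    )


def select_representatives(motifs, threshold=1):
    """Greedily keep a motif only if no kept motif is already similar to it."""
    representatives = []
    for motif in motifs:
        if not any(similar(motif, rep, threshold) for rep in representatives):
            representatives.append(motif)
    return representatives
```

With this toy table, `select_representatives(["ILD", "VLE", "KKK", "RKK"])` keeps only `"ILD"` and `"KKK"`, since the other two differ from them only by substitutable residues.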


Subject(s)
Amino Acid Motifs , Amino Acids/chemistry , Computational Biology/methods , Computer Graphics , Proteins/chemistry , Data Mining , Databases, Protein , Protein Conformation