1.
Sci Data ; 10(1): 818, 2023 Nov 22.
Article in English | MEDLINE | ID: mdl-37993460

ABSTRACT

Land artificialization is a serious problem of civilization. Urban planning and natural risk management aim to address it. In France, these practices rely on Local Land Plans (PLU - Plan Local d'Urbanisme) and Natural Risk Prevention Plans (PPRn - Plan de Prévention des Risques naturels), which contain land use rules. To facilitate automatic extraction of the rules, we manually annotated a number of those documents concerning Montpellier, a rapidly evolving agglomeration exposed to natural risks. We defined a format for labeled examples in which each entry includes a title and a subtitle. In addition, we proposed a hierarchical representation of class labels to generalize the use of our corpus. Our corpus, consisting of 1934 textual segments, each labeled with one of four classes (Verifiable, Non-verifiable, Informative and Not pertinent), is the first corpus in the French language in the fields of urban planning and natural risk management. Along with presenting the corpus, we tested a state-of-the-art approach for text classification to demonstrate its usability for automatic rule extraction.

2.
J Cheminform ; 15(1): 116, 2023 Nov 29.
Article in English | MEDLINE | ID: mdl-38031134

ABSTRACT

This paper presents a novel approach called Pharmacophore Activity Delta for extracting outstanding pharmacophores from a chemogenomic dataset, with a specific focus on a kinase target known as BCR-ABL. As an initial step, the method constructs a Hasse diagram, referred to as the pharmacophore network, from the subgraph partial order; this leads to the identification of pharmacophores for further evaluation. A pharmacophore is classified as a 'Pharmacophore Activity Delta' if its capability to discriminate between active and inactive molecules deviates significantly (by at least δ standard deviations) from the mean capability of its related pharmacophores. Among the 1479 molecules associated with BCR-ABL binding data, 130 Pharmacophore Activity Deltas were identified. The pharmacophore network reveals distinct regions associated with active and inactive molecules. The study includes a discussion of representative key areas linked to different pharmacophores, emphasizing structure-activity relationships.
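
The δ-standard-deviation criterion described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the discrimination score, the way "related" pharmacophores are chosen, the use of the population standard deviation, and all names (`activity_deltas`, `score`, `related`) are assumptions made for illustration only.

```python
from statistics import mean, pstdev


def activity_deltas(score, related, delta=2.0):
    """Flag items whose score deviates by at least `delta` standard
    deviations from the mean score of their related items.

    score   -- dict: pharmacophore id -> discrimination score (hypothetical)
    related -- dict: pharmacophore id -> list of related ids (hypothetical)
    """
    flagged = []
    for p, neighbours in related.items():
        values = [score[q] for q in neighbours]
        if len(values) < 2:
            continue  # too few neighbours to estimate a spread
        mu = mean(values)
        sigma = pstdev(values)  # assumption: population standard deviation
        if sigma == 0:
            continue  # no spread among neighbours, criterion undefined here
        if abs(score[p] - mu) >= delta * sigma:
            flagged.append(p)
    return flagged


# Toy data: "A" stands out from its neighbours, "B" does not.
score = {"A": 0.9, "B": 0.5, "C": 0.52, "D": 0.48, "E": 0.5}
related = {
    "A": ["B", "C", "D", "E"],
    "B": ["A", "C", "D", "E"],
}
flagged = activity_deltas(score, related, delta=2.0)
```

In this toy example only "A" is flagged, because its neighbours have nearly identical scores while its own score is far from their mean.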

3.
Mol Inform ; 42(3): e2200232, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36529710

ABSTRACT

Maximum common substructures (MCS) have received a lot of attention in the chemoinformatics community. They are typically used as a similarity measure between molecules, showing high predictive performance when used in classification tasks, while being easily explainable substructures. In the present work, we applied the Pairwise Maximum Common Subgraph Feature Generation (PMCSFG) algorithm to automatically detect toxicophores (structural alerts) and to compute fingerprints based on MCS. We present a comparison between our MCS-based fingerprints and 12 well-known chemical fingerprints when used as features in machine learning models. We provide an experimental evaluation and discuss the usefulness of the different methods on mutagenicity data. The features generated by the MCS method achieve state-of-the-art performance when predicting mutagenicity, while being more interpretable than traditional chemical fingerprints.


Subjects
Algorithms; Mutagens; Mutagens/chemistry; Mutagenesis; Machine Learning
4.
Mol Inform ; 36(10), 2017 Oct.
Article in English | MEDLINE | ID: mdl-28590546

ABSTRACT

This article introduces a new type of structural fragment called a geometrical pattern. Such geometrical patterns are defined as molecular graphs that include a labelling of atoms together with constraints on interatomic distances. The discovery of geometrical patterns in a chemical dataset relies on the induction of multiple decision trees combined in random forests. Each computational step corresponds to a refinement of a preceding set of constraints, extending a previous geometrical pattern. This paper focuses on the mutagenicity of chemicals via the definition of structural alerts in relation to these geometrical patterns. An experimental assessment of the main geometrical patterns follows, showing how they can efficiently give rise to the definition of a chemical feature related to a chemical function or a chemical property. Geometrical patterns provide a valuable and innovative approach that brings new information for discovering and assessing structural characteristics in relation to a particular biological phenotype.
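
To make the notion of "labelled atoms plus constraints on interatomic distances" concrete, here is a small sketch of how one candidate match could be checked. The representation (a label list, a dictionary of distance intervals, atoms as label and 3D coordinate pairs) and the function name `matches` are hypothetical and are not taken from the article; the decision-tree and random-forest discovery step is not covered.

```python
from math import dist


def matches(labels, constraints, atoms, mapping):
    """Check one candidate assignment of pattern nodes to atoms.

    labels      -- labels[i] is the atom label required for pattern node i
    constraints -- dict: (i, j) -> (low, high) allowed distance interval
    atoms       -- list of (label, (x, y, z)) tuples for one conformation
    mapping     -- mapping[i] is the index in `atoms` assigned to node i
    """
    # Every pattern node must map to an atom carrying the same label.
    for node, atom_index in enumerate(mapping):
        if atoms[atom_index][0] != labels[node]:
            return False
    # Every constrained pair must lie within its distance interval.
    for (i, j), (low, high) in constraints.items():
        d = dist(atoms[mapping[i]][1], atoms[mapping[j]][1])
        if not (low <= d <= high):
            return False
    return True


# Toy pattern: an N and an O separated by 2.0 to 3.0 distance units.
labels = ["N", "O"]
constraints = {(0, 1): (2.0, 3.0)}
atoms = [
    ("N", (0.0, 0.0, 0.0)),
    ("C", (1.4, 0.0, 0.0)),
    ("O", (2.5, 0.0, 0.0)),
]
ok = matches(labels, constraints, atoms, [0, 2])
label_mismatch = matches(labels, constraints, atoms, [0, 1])
```

Refining a pattern, in this representation, would amount to narrowing an interval or adding a new labelled node with its own constraints; that reading is an interpretation of the abstract, not a description of the published algorithm.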


Subjects
Mutagenesis/physiology; Carcinogens/chemistry; Mutagenesis/genetics; Mutagenicity Tests; Mutagens/chemistry; Structure-Activity Relationship
5.
J Biomed Semantics ; 6: 27, 2015.
Article in English | MEDLINE | ID: mdl-25992265

ABSTRACT

BACKGROUND: Discovering gene interactions and their characterizations from biological text collections is a crucial issue in bioinformatics. Indeed, text collections are large and it is very difficult for biologists to take full advantage of this amount of knowledge. Natural Language Processing (NLP) methods have been applied to extract background knowledge from biomedical texts. Some existing NLP approaches are based on handcrafted rules and are thus time-consuming and often devoted to a specific corpus. Machine-learning-based NLP methods give good results but generate outcomes that are not really understandable by a user. RESULTS: We take advantage of a hybridization of data mining and natural language processing to propose an original symbolic method that automatically produces patterns conveying gene interactions and their characterizations. Our method therefore detects not only gene interactions but also semantic information on the extracted interactions (e.g., modalities, biological contexts, interaction types). Only a limited resource is required: the text collection that is used as a training corpus. Our approach gives results comparable to those of state-of-the-art methods and is even better for gene interaction detection in AIMed. CONCLUSIONS: Experiments show how our approach enables the discovery of interactions and their characterizations. To the best of our knowledge, there are few methods that automatically extract the interactions together with the associated semantic information. The extracted gene interactions from PubMed are available through a simple web interface at https://bingotexte.greyc.fr/. The software is available at https://bingo2.greyc.fr/?q=node/22.

6.
J Chem Inf Model ; 55(5): 925-40, 2015 May 26.
Article in English | MEDLINE | ID: mdl-25871768

ABSTRACT

This study is dedicated to the introduction of a novel method that automatically extracts potential structural alerts from a data set of molecules. These triggering structures can be further used for knowledge discovery and classification purposes. Computation of the structural alerts results from an implementation of a sophisticated workflow that integrates a graph mining tool guided by growth rate and stability. The growth rate is a well-established measurement of contrast between classes. Moreover, the extracted patterns correspond to formal concepts; the most robust patterns, named the stable emerging patterns (SEPs), can then be identified thanks to their stability, a new notion originating from the domain of formal concept analysis. All of these elements are explained in the paper from the point of view of computation. The method was applied to a molecular data set on mutagenicity. The experimental results demonstrate its efficiency: it automatically outputs a manageable number of structural patterns that are strongly related to mutagenicity. Moreover, a part of the resulting structures corresponds to already known structural alerts. Finally, an in-depth chemical analysis relying on these structures demonstrates how the method can initiate promising processes of chemical knowledge discovery.
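
The growth rate mentioned in this abstract is commonly defined, in the emerging-pattern literature, as the ratio of a pattern's support in a target class to its support in the other class. The sketch below illustrates that ratio only. As a simplifying assumption, molecules are represented as sets of fragment labels and pattern occurrence is plain set inclusion, standing in for the subgraph matching a graph mining tool would perform; stability is not computed. All names and data are hypothetical.

```python
import math


def support(pattern, dataset):
    """Fraction of items in `dataset` that contain every element of `pattern`."""
    if not dataset:
        return 0.0
    return sum(1 for item in dataset if pattern <= item) / len(dataset)


def growth_rate(pattern, target, background):
    """Support in `target` divided by support in `background`.

    Convention assumed here: 0 when the pattern occurs in neither class,
    infinity when it occurs only in the target class.
    """
    s_target = support(pattern, target)
    s_background = support(pattern, background)
    if s_background == 0:
        return math.inf if s_target > 0 else 0.0
    return s_target / s_background


# Toy data: each molecule is a set of fragment labels (illustrative only).
mutagens = [
    frozenset({"nitro", "aromatic"}),
    frozenset({"nitro"}),
    frozenset({"aromatic"}),
    frozenset({"nitro", "amine"}),
]
non_mutagens = [
    frozenset({"aromatic"}),
    frozenset({"amine"}),
    frozenset({"nitro", "aromatic"}),
    frozenset({"aromatic", "amine"}),
]

gr_nitro = growth_rate(frozenset({"nitro"}), mutagens, non_mutagens)
gr_nitro_amine = growth_rate(frozenset({"nitro", "amine"}), mutagens, non_mutagens)
```

Here the "nitro" pattern occurs in three of four toy mutagens and one of four toy non-mutagens, giving a growth rate of 3, while the "nitro" plus "amine" pattern occurs only among the toy mutagens and therefore has an infinite growth rate.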


Subjects
Data Mining/methods; Drug Discovery; Mutagens/chemistry; Pattern Recognition, Automated/methods
7.
J Chem Inf Model ; 50(8): 1330-9, 2010 Aug 23.
Article in English | MEDLINE | ID: mdl-20726596

ABSTRACT

Starting from a random set of structures taken from the European Chemical Bureau (ECB) Web site, an estimation of the classification by acute category in ecotoxicology was carried out. This estimation was based on two approaches. One approach starts with global quantitative structure-activity relationship (QSAR) equations, analyzes the results and defines an interpretation in terms of overall results and mode of action. The other starts with the notion of emerging fragments and more specifically with the introduction of a particular concept: the jumping fragments. This publication studies the scope and limitations of each approach for the classification of the derivatives. A promising combination of the two methods is proposed for the classification and also for providing new information about the importance to ecotoxicity of specific chemical fragments considered alone or in association with others.


Subjects
Ecotoxicology/methods; Environmental Pollutants/chemistry; Environmental Pollutants/adverse effects; Models, Biological; Molecular Structure; Quantitative Structure-Activity Relationship
8.
In Silico Biol ; 8(2): 157-75, 2008.
Article in English | MEDLINE | ID: mdl-18928203

ABSTRACT

Current analyses of co-expressed genes are often based on global approaches such as clustering or bi-clustering. An alternative is to employ local methods and search for patterns, that is, sets of genes displaying specific expression properties in a set of situations. The main bottleneck of this type of analysis is twofold: computational costs and an overwhelming number of candidate patterns that can hardly be exploited further. A timely application of background knowledge available in literature databases, biological ontologies and other sources can help to focus on the most plausible patterns only. The paper proposes, implements and tests a flexible constraint-based framework that enables the effective mining and representation of meaningful over-expression patterns representing intrinsic associations among genes and biological situations. The framework can be simultaneously applied to a wide spectrum of genomic data, and we demonstrate that it allows new biological hypotheses with clinical implications to be generated.


Subjects
Algorithms; Computational Biology/methods; Databases, Genetic; Pattern Recognition, Automated/methods; Sequence Analysis, DNA/methods; Software; Gene Expression Profiling/methods; Genomics/methods; Humans; Oligonucleotide Array Sequence Analysis/methods