ABSTRACT
Two new encoding strategies, wedge and twist codes, both based on DNA helical parameters, are introduced to represent DNA sequences in artificial neural network (ANN)-based modeling of biological systems. The performance of the new coding strategies has been evaluated in three case studies involving mapping (modeling) and classification applications of ANNs. The proposed coding schemes have been compared rigorously with existing coding strategies and shown to outperform them, especially when only limited data are available for building the ANN models.
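As a rough illustration of what a helical-parameter encoding could look like, the sketch below maps each overlapping dinucleotide step of a sequence to one numeric value taken from a lookup table. The abstract does not say how the wedge and twist codes are actually constructed, so the one-value-per-step layout, the function name `encode_steps`, and the table itself are assumptions; the table holds arbitrary placeholder numbers, not measured wedge or twist angles.

```python
from itertools import product

BASES = "ACGT"
# All 16 dinucleotide steps in a fixed order: AA, AC, AG, AT, CA, ...
DINUCLEOTIDES = ["".join(pair) for pair in product(BASES, repeat=2)]


def encode_steps(seq, step_table):
    """Map each overlapping dinucleotide step of seq to its table value.

    A sequence of length n yields n - 1 numbers, one per step.
    step_table is a hypothetical dict such as {"AA": value, "AC": value, ...}.
    """
    seq = seq.upper()
    return [step_table[seq[i:i + 2]] for i in range(len(seq) - 1)]


# PLACEHOLDER values only: each step simply gets its index in the list.
# Real wedge or twist angles would have to come from the literature.
placeholder_table = {step: float(i) for i, step in enumerate(DINUCLEOTIDES)}

if __name__ == "__main__":
    print(encode_steps("AACG", placeholder_table))
```

Passing the table in as an argument keeps the encoder independent of any particular parameter set, so the same function could serve for a wedge table or a twist table.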
Subjects
DNA/chemistry; DNA/genetics; Neural Networks, Computer; Sequence Analysis, DNA/methods; Algorithms; Computer Simulation; Nucleic Acid Conformation; Promoter Regions, Genetic; Sequence Analysis, DNA/statistics & numerical data
ABSTRACT
In the present paper, a hybrid technique combining an artificial neural network (ANN) and a genetic algorithm (GA) is proposed for modeling and optimizing complex biological systems. In this approach, an ANN first approximates (models) the nonlinear relationship(s) between its input and output example data sets. Next, the GA, a stochastic optimization technique, searches the input space of the ANN to optimize the ANN output. The efficacy of this formalism has been tested in a case study involving optimization of DNA curvature characterized in terms of the RL value. Using the ANN-GA methodology, a number of sequences possessing high RL values have been obtained and analyzed to verify the existence of features known to be responsible for curvature. A couple of sequences have also been tested experimentally. The experimental results validate, qualitatively and also near-quantitatively, the solutions obtained using the hybrid formalism. The ANN-GA technique is a useful tool for obtaining, ahead of experimentation, sequences that yield high RL values. The methodology is general and can be employed for optimizing other biological features.
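The two-stage idea described above (a model first, then a GA search over the model's inputs) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the "ANN" here is a tiny network with fixed random weights standing in for a trained RL-value model, sequences are one-hot encoded, and the GA operators (truncation selection, one-point crossover, point mutation, two-member elitism) and all parameter values are illustrative choices.

```python
import math
import random

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a flat vector with four entries per base."""
    return [1.0 if b == base else 0.0 for b in seq for base in BASES]


def make_surrogate(seq_len, hidden=6, seed=0):
    """Return a stand-in for a trained ANN (ASSUMPTION: random fixed weights)."""
    rng = random.Random(seed)
    n_in = 4 * seq_len
    w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(hidden)]
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]

    def predict(seq):
        x = one_hot(seq)
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
        return sum(w * hi for w, hi in zip(w2, h))

    return predict


def genetic_search(fitness, seq_len, pop_size=30, generations=40,
                   mutation_rate=0.05, seed=1):
    """Search sequence space for inputs that maximize fitness(seq)."""
    rng = random.Random(seed)
    pop = ["".join(rng.choice(BASES) for _ in range(seq_len))
           for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        elite = ranked[:2]                # carried over unchanged
        parents = ranked[:pop_size // 2]  # truncation selection
        children = []
        while len(children) < pop_size - len(elite):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, seq_len)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = "".join(
                rng.choice(BASES) if rng.random() < mutation_rate else c
                for c in child)              # point mutation
            children.append(child)
        pop = elite + children
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


if __name__ == "__main__":
    model = make_surrogate(seq_len=12)
    best_seq, trace = genetic_search(model, seq_len=12)
    print(best_seq, trace[0], trace[-1])
```

Because the elite members are copied into the next generation unchanged, the best model output never decreases from one generation to the next. In a real application, `make_surrogate` would be replaced by a network trained on measured data.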
Subjects
DNA/chemistry; Nucleic Acid Conformation; Algorithms; Computer Simulation; Models, Genetic; Mutation; Neural Networks, Computer
ABSTRACT
MOTIVATION: Our aim is to use an artificial neural network (ANN) to predict DNA curvature in terms of retardation anomaly. RESULTS: An ANN has been developed that captures the roles of phasing, increased helix flexibility, runs of poly(A) tracts, and flanking base pair effects in determining the extent of DNA curvature. The network predictions agree with known experimental results and also explain how base pairs other than ApA affect the curvature. The results suggest that an ANN can be used as a model-free tool for studying DNA curvature. AVAILABILITY: The optimal weights and the procedure to compute the retardation anomaly value are available on request from the authors. CONTACT: bdk@ems.ncl.res.in