1.
Evol Bioinform Online ; 9: 387-416, 2013.
Article in English | MEDLINE | ID: mdl-24250217

ABSTRACT

Essential proteins comprise the minimum set of proteins required to support cell life. Identifying essential proteins is important for understanding the cellular processes of an organism. However, identifying essential proteins experimentally is extremely time-consuming and labor-intensive, so alternative methods must be developed. This study had two goals: identifying the important features and building learning machines for discriminating essential proteins. Data for Saccharomyces cerevisiae and Escherichia coli were used. We first collected information from a variety of sources. We then proposed a modified backward feature selection method and built support vector machine (SVM) predictors based on the selected features. To evaluate performance, we conducted cross-validations on the original imbalanced data set and on a down-sampled balanced data set. Statistical tests were applied to the performance associated with the obtained feature subsets to confirm their significance. In the first data set, our best values of F-measure and Matthews correlation coefficient (MCC) were 0.549 and 0.495, respectively, in the imbalanced experiments. For the balanced experiment, the best values of F-measure and MCC were 0.770 and 0.545, respectively. In the second data set, our best values of F-measure and MCC were 0.421 and 0.407, respectively, in the imbalanced experiments. For the balanced experiment, the best values of F-measure and MCC were 0.718 and 0.448, respectively. The experimental results show that our selected features are compact and that performance improved. Users can also run predictions at the following internet address: http://bio2.cse.nsysu.edu.tw/esspredict.aspx.
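
For readers unfamiliar with the two metrics reported above, the following minimal sketch computes the F-measure and MCC from confusion-matrix counts using their standard definitions. The function names and the example counts are hypothetical and are not taken from the paper; the zero-denominator fallbacks are a common convention, assumed here.

```python
import math


def f_measure(tp: int, fp: int, fn: int) -> float:
    """F-measure (F1): harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient, in the range [-1, 1]."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: return 0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical counts for an imbalanced data set (few positives).
    tp, fp, fn, tn = 50, 30, 50, 870
    print(f"F-measure = {f_measure(tp, fp, fn):.3f}")
    print(f"MCC       = {mcc(tp, tn, fp, fn):.3f}")
```

Unlike the F-measure, MCC also takes true negatives into account, which is one reason both are often reported together for imbalanced data.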

2.
Evol Bioinform Online ; 9: 163-84, 2013.
Article in English | MEDLINE | ID: mdl-23641141

ABSTRACT

The prediction of RNA secondary structures has drawn much attention from both biologists and computer scientists. Many useful tools have been developed for this purpose, each with its own strengths and weaknesses. We therefore propose a tool-choice method, based on support vector machines (SVM), that integrates three prediction tools: pknotsRG, RNAStructure, and NUPACK. Our method first extracts features from the target RNA sequence and adopts two information-theoretic feature selection methods for feature ranking. We propose a method to combine feature selection and classifier fusion in an incremental manner. Our test data set contains 720 RNA sequences, of which 225 pseudoknotted RNA sequences were obtained from PseudoBase and 495 nested RNA sequences were obtained from RNA SSTRAND. The method serves as a preprocessing step for analyzing RNA sequences before the RNA secondary structure prediction tools are employed. In addition, the performance of various configurations is subjected to statistical tests to examine their significance. The best base-pair accuracy achieved is 75.5%, which is obtained by the proposed incremental method and is significantly higher than the 68.8% achieved by the best single predictor, pknotsRG.

3.
Int J Med Chem ; 2013: 743139, 2013.
Article in English | MEDLINE | ID: mdl-25505989

ABSTRACT

Human estrogen receptor (ER) isoforms, ERα and ERβ, have long been an important focus in the field of biology. To better understand the structural features associated with the binding of ERα ligands to ERα and to modulate their function, several QSAR models, including CoMFA, CoMSIA, SVR, and LR methods, have been employed to predict the inhibitory activity of 68 raloxifene derivatives. In the SVR and LR modeling, 11 descriptors were selected through feature ranking and sequential feature addition/deletion to generate equations that predict the inhibitory activity toward ERα. Among four descriptors that consistently appear in the various generated equations, two agree with the CoMFA and CoMSIA steric fields, and the other two can be correlated with a calculated electrostatic potential of ERα.

4.
Biosystems ; 90(1): 242-52, 2007.
Article in English | MEDLINE | ID: mdl-17029765

ABSTRACT

An algorithm based on a modified sticker model, combined with an advanced MEMS-based microarray technology, is demonstrated to solve the SAT problem, which has long served as a benchmark in DNA computing. Conventional DNA computing algorithms need an initial data pool that covers correct and incorrect answers and then execute a series of separation procedures to destroy the unwanted ones. In contrast, we built solutions in parts, satisfying one clause per step, and eventually solved the entire Boolean formula step by step. No time-consuming sample preparation procedures or delicate sample-application equipment were required for the computing process. Moreover, experimental results show that the bound DNA sequences can withstand the chemical solutions used during the computing processes, so the proposed method should be useful in dealing with large-scale problems.


Subject(s)
Computational Biology/methods , Computers, Molecular , DNA/chemistry , Molecular Biology/methods , Oligonucleotide Array Sequence Analysis , Algorithms , Computer Simulation , Computing Methodologies , Mathematics , Models, Genetic , Models, Theoretical , Nucleic Acid Conformation , Nucleotides
5.
Biosystems ; 81(1): 1-9, 2005 Jul.
Article in English | MEDLINE | ID: mdl-15917122

ABSTRACT

Among various DNA computing algorithms, it is very common to first create an initial data pool that covers correct and incorrect answers, followed by a series of selection processes to destroy the incorrect ones. The surviving DNA sequences are read as the solutions to the problem. However, algorithms based on such a brute-force search are limited by the problem size. That is, as the number of parameters in the studied problem grows, the algorithm eventually becomes infeasible owing to the tremendous size of the initial data pool. In this theoretical work, we modify a well-known sticker model to design an algorithm for the SAT problem that does not require an initial data pool. We propose to build solution sequences in parts, satisfying one clause per step, and eventually solve the whole Boolean formula after a number of steps. Accordingly, the size of the data pool grows from one sort of molecule to the number of solution assignments. The proposed algorithm is expected to provide a solution to the SAT problem and to become practical as the problem size scales up.
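
The clause-by-clause idea described above can be illustrated with a purely software analogy. The sketch below is not the authors' sticker-model procedure and models no DNA chemistry; it only assumes that the pool starts with a single empty partial assignment and that each step extends partial assignments so the next clause is satisfied. The function name `solve_incrementally` and the integer literal encoding (positive for a true variable, negative for a negated one) are illustrative choices.

```python
def solve_incrementally(clauses):
    """Build satisfying partial assignments one clause at a time.

    Each clause is a tuple of non-zero ints: k means variable k is true,
    -k means variable k is false. Returns a list of dicts mapping
    variable -> bool; variables missing from a dict may take any value.
    """
    pool = {frozenset()}  # start from one "empty" partial assignment
    for clause in clauses:
        next_pool = set()
        for partial in pool:
            assignment = dict(partial)
            # Keep the partial assignment if it already satisfies the clause.
            if any(assignment.get(abs(lit)) == (lit > 0) for lit in clause):
                next_pool.add(partial)
                continue
            # Otherwise extend it once per still-unassigned literal.
            for lit in clause:
                var = abs(lit)
                if var not in assignment:
                    next_pool.add(partial | {(var, lit > 0)})
            # Partials that cannot be extended are simply dropped.
        pool = next_pool
    return [dict(p) for p in pool]


if __name__ == "__main__":
    # (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
    formula = [(1, 2), (-1, 3), (-2, -3)]
    for solution in solve_incrementally(formula):
        print(sorted(solution.items()))
```

In this analogy the pool never holds a candidate that violates an already-processed clause, which mirrors the abstract's contrast with generating every candidate up front and filtering afterwards.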


Subject(s)
Computational Biology/methods , DNA/chemistry , Models, Genetic , Algorithms , Base Sequence , Computer Simulation , Computers, Molecular , Computing Methodologies , Models, Theoretical , Molecular Sequence Data , Nucleic Acid Conformation , Statistics as Topic , Systems Biology , Systems Theory