Chinese Journal of Biochemistry and Molecular Biology ; (12): 1153-1167, 2023.
Article in Chinese | WPRIM | ID: wpr-1015620

ABSTRACT

The DNA double-strand break (DSB) is a severe form of DNA damage in cells and is closely linked to genomic instability in contexts including cancer, abnormal recombination, and neuronal development. Because of cost and technical barriers, high-resolution DSB maps produced by high-throughput sequencing remain scarce, which hinders our understanding of DSB distribution in the genomes of different species. We therefore developed classification models based on random forest (RF), support vector machine (SVM), and logistic regression (LR) classifiers to predict DSB loci across the whole genome of human NHEK cells. In addition to the epigenetic features and DNA shape features commonly used in previous prediction studies, we found that DNA sequence features (k-mer frequency, GC content, GC-skew, and mutual information) can also characterize DSB sites. Prediction accuracy improved further when DNA physical properties, chemical shifts, and autocorrelation information were included. With all of these features combined, logistic regression (LR) gave the best prediction performance (AUC = 0.97), which is comparable to a previously reported prediction (AUC = 0.964). In addition, an optimal feature set of 294 features was obtained by incremental feature search, and the corresponding AUC reached 0.974.
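
As an illustration of the sequence features named in the abstract, the following minimal Python sketch computes GC content, GC-skew, and k-mer frequencies for a single DNA window. It is not the authors' implementation: the function names, the default k = 2, the handling of non-ACGT bases, and the common (G - C) / (G + C) definition of GC-skew are all assumptions made here, and mutual information is left out because the abstract does not say how it was defined.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def gc_content(seq):
    """Fraction of G and C among the A/C/G/T bases of the window."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    total = g + c + seq.count("A") + seq.count("T")
    return (g + c) / total if total else 0.0


def gc_skew(seq):
    """(G - C) / (G + C); returns 0.0 when the window has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def kmer_frequencies(seq, k=2):
    """Relative frequency of every possible k-mer over the ACGT alphabet.

    k-mers containing other symbols (for example N) are skipped; this is
    an assumption of the sketch, not something stated in the abstract.
    """
    seq = seq.upper()
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    valid = [km for km in kmers if set(km) <= set(ALPHABET)]
    counts = Counter(valid)
    total = len(valid)
    all_kmers = ("".join(p) for p in product(ALPHABET, repeat=k))
    return {km: (counts[km] / total if total else 0.0) for km in all_kmers}


def sequence_features(seq, k=2):
    """Feature vector: [GC content, GC-skew, k-mer frequencies in sorted order]."""
    freqs = kmer_frequencies(seq, k)
    return [gc_content(seq), gc_skew(seq)] + [freqs[km] for km in sorted(freqs)]
```

Calling `sequence_features(window)` on each candidate window yields a fixed-length vector (2 + 4^k values) that could be concatenated with other feature groups before being passed to a classifier.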
