Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 7 de 7
Filtrar
Mais filtros










Base de dados
Intervalo de ano de publicação
1.
Small Methods ; : e2301585, 2024 May 29.
Artigo em Inglês | MEDLINE | ID: mdl-38807543

RESUMO

DNA-based data storage is a new technology in computational and synthetic biology, that offers a solution for long-term, high-density data archiving. Given the critical importance of medical data in advancing human health, there is a growing interest in developing an effective medical data storage system based on DNA. Data integrity, accuracy, reliability, and efficient retrieval are all significant concerns. Therefore, this study proposes an Effective DNA Storage (EDS) approach for archiving medical MRI data. The EDS approach incorporates three key components (i) a novel fraction strategy to address the critical issue of rotating encoding, which often leads to data loss due to single base error propagation; (ii) a novel rule-based quaternary transcoding method that satisfies bio-constraints and ensure reliable mapping; and (iii) an indexing technique designed to simplify random search and access. The effectiveness of this approach is validated through computer simulations and biological experiments, confirming its practicality. The EDS approach outperforms existing methods, providing superior control over bio-constraints and reducing computational time. The results and code provided in this study open new avenues for practical DNA storage of medical MRI data, offering promising prospects for the future of medical data archiving and retrieval.

2.
JMIR Res Protoc ; 13: e52205, 2024 Feb 08.
Artigo em Inglês | MEDLINE | ID: mdl-38329783

RESUMO

BACKGROUND: A considerable number of minors in the United States are diagnosed with developmental or psychiatric conditions, potentially influenced by underdiagnosis factors such as cost, distance, and clinician availability. Despite the potential of digital phenotyping tools with machine learning (ML) approaches to expedite diagnoses and enhance diagnostic services for pediatric psychiatric conditions, existing methods face limitations because they use a limited set of social features for prediction tasks and focus on a single binary prediction, resulting in uncertain accuracies. OBJECTIVE: This study aims to propose the development of a gamified web system for data collection, followed by a fusion of novel crowdsourcing algorithms with ML behavioral feature extraction approaches to simultaneously predict diagnoses of autism spectrum disorder and attention-deficit/hyperactivity disorder in a precise and specific manner. METHODS: The proposed pipeline will consist of (1) gamified web applications to curate videos of social interactions adaptively based on the needs of the diagnostic system, (2) behavioral feature extraction techniques consisting of automated ML methods and novel crowdsourcing algorithms, and (3) the development of ML models that classify several conditions simultaneously and that adaptively request additional information based on uncertainties about the data. RESULTS: A preliminary version of the web interface has been implemented, and a prior feature selection method has highlighted a core set of behavioral features that can be targeted through the proposed gamified approach. CONCLUSIONS: The prospect for high reward stems from the possibility of creating the first artificial intelligence-powered tool that can identify complex social behaviors well enough to distinguish conditions with nuanced differentiators such as autism spectrum disorder and attention-deficit/hyperactivity disorder. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): PRR1-10.2196/52205.

3.
Comput Biol Med ; 165: 107404, 2023 10.
Artigo em Inglês | MEDLINE | ID: mdl-37666064

RESUMO

DNA data storage is a promising technology that utilizes computer simulation, and synthetic biology, offering high-density and reliable digital information storage. It is challenging to store massive data in a small amount of DNA without losing the original data since nonspecific hybridization errors occur frequently and severely affect the reliability of stored data. This study proposes a novel biologically optimized encoding model for DNA data storage (BO-DNA) to overcome the reliability problem. BO-DNA model is developed by a new rule-based mapping method to avoid data drop during the transcoding of binary data to premier nucleotides. A customized optimization algorithm based on a tent chaotic map is applied to maximize the lower bounds that help to minimize the nonspecific hybridization errors. The robustness of BO-DNA is computed by four bio-constraints to confirm the reliability of newly generated DNA sequences. Experimentally, different medical images are encoded and decoded successfully with 12%-59% improved lower bounds and optimally constrained-based DNA sequences reported with 1.77bit/nt average density. BO-DNA's results demonstrate substantial advantages in constructing reliable DNA data storage.


Assuntos
Algoritmos , DNA , Simulação por Computador , Reprodutibilidade dos Testes , DNA/genética
4.
Front Genet ; 14: 1158337, 2023.
Artigo em Inglês | MEDLINE | ID: mdl-37021008

RESUMO

DNA is a practical storage medium with high density, durability, and capacity to accommodate exponentially growing data volumes. A DNA sequence structure is a biocomputing problem that requires satisfying bioconstraints to design robust sequences. Existing evolutionary approaches to DNA sequences result in errors during the encoding process that reduces the lower bounds of DNA coding sets used for molecular hybridization. Additionally, the disordered DNA strand forms a secondary structure, which is susceptible to errors during decoding. This paper proposes a computational evolutionary approach based on a synergistic moth-flame optimizer by Levy flight and opposition-based learning mutation strategies to optimize these problems by constructing reverse-complement constraints. The MFOS aims to attain optimal global solutions with robust convergence and balanced search capabilities to improve DNA code lower bounds and coding rates for DNA storage. The ability of the MFOS to construct DNA coding sets is demonstrated through various experiments that use 19 state-of-the-art functions. Compared with the existing studies, the proposed approach with three different bioconstraints substantially improves the lower bounds of the DNA codes by 12-28% and significantly reduces errors.

5.
Sci Rep ; 12(1): 8842, 2022 05 25.
Artigo em Inglês | MEDLINE | ID: mdl-35614133

RESUMO

Today's growing phishing websites pose significant threats due to their extremely undetectable risk. They anticipate internet users to mistake them as genuine ones in order to reveal user information and privacy, such as login ids, pass-words, credit card numbers, etc. without notice. This paper proposes a new approach to solve the anti-phishing problem. The new features of this approach can be represented by URL character sequence without phishing prior knowledge, various hyperlink information, and textual content of the webpage, which are combined and fed to train the XGBoost classifier. One of the major contributions of this paper is the selection of different new features, which are capable enough to detect 0-h attacks, and these features do not depend on any third-party services. In particular, we extract character level Term Frequency-Inverse Document Frequency (TF-IDF) features from noisy parts of HTML and plaintext of the given webpage. Moreover, our proposed hyperlink features determine the relationship between the content and the URL of a webpage. Due to the absence of publicly available large phishing data sets, we needed to create our own data set with 60,252 webpages to validate the proposed solution. This data contains 32,972 benign webpages and 27,280 phishing webpages. For evaluations, the performance of each category of the proposed feature set is evaluated, and various classification algorithms are employed. From the empirical results, it was observed that the proposed individual features are valuable for phishing detection. However, the integration of all the features improves the detection of phishing sites with significant accuracy. The proposed approach achieved an accuracy of 96.76% with only 1.39% false-positive rate on our dataset, and an accuracy of 98.48% with 2.09% false-positive rate on benchmark dataset, which outperforms the existing baseline approaches.


Assuntos
Algoritmos , Segurança Computacional , Benchmarking , Coleta de Dados , Privacidade
6.
PeerJ Comput Sci ; 8: e877, 2022.
Artigo em Inglês | MEDLINE | ID: mdl-35494855

RESUMO

Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) have been successfully applied to Natural Language Processing (NLP), especially in sentiment analysis. NLP can execute numerous functions to achieve significant results through RNN and CNN. Likewise, previous research shows that RNN achieved meaningful results than CNN due to extracting long-term dependencies. Meanwhile, CNN has its advantage; it can extract high-level features using its local fixed-size context at the input level. However, integrating these advantages into one network is challenging because of overfitting in training. Another problem with such models is the consideration of all the features equally. To this end, we propose an attention-based sentiment analysis using CNN and two independent bidirectional RNN networks to address the problems mentioned above and improve sentiment knowledge. Firstly, we apply a preprocessor to enhance the data quality by correcting spelling mistakes and removing noisy content. Secondly, our model utilizes CNN with max-pooling to extract contextual features and reduce feature dimensionality. Thirdly, two independent bidirectional RNN, i.e., Long Short-Term Memory and Gated Recurrent Unit are used to capture long-term dependencies. We also applied the attention mechanism to the RNN layer output to emphasize each word's attention level. Furthermore, Gaussian Noise and Dropout as regularization are applied to avoid the overfitting problem. Finally, we verify the model's robustness on four standard datasets. Compared with existing improvements on the most recent neural network models, the experiment results show that our model significantly outperformed the state-of-the-art models.

7.
Artigo em Inglês | MEDLINE | ID: mdl-35328897

RESUMO

Breast cancer death rates are higher than any other cancer in American women. Machine learning-based predictive models promise earlier detection techniques for breast cancer diagnosis. However, making an evaluation for models that efficiently diagnose cancer is still challenging. In this work, we proposed data exploratory techniques (DET) and developed four different predictive models to improve breast cancer diagnostic accuracy. Prior to models, four-layered essential DET, e.g., feature distribution, correlation, elimination, and hyperparameter optimization, were deep-dived to identify the robust feature classification into malignant and benign classes. These proposed techniques and classifiers were implemented on the Wisconsin Diagnostic Breast Cancer (WDBC) and Breast Cancer Coimbra Dataset (BCCD) datasets. Standard performance metrics, including confusion matrices and K-fold cross-validation techniques, were applied to assess each classifier's efficiency and training time. The models' diagnostic capability improved with our DET, i.e., polynomial SVM gained 99.3%, LR with 98.06%, KNN acquired 97.35%, and EC achieved 97.61% accuracy with the WDBC dataset. We also compared our significant results with previous studies in terms of accuracy. The implementation procedure and findings can guide physicians to adopt an effective model for a practical understanding and prognosis of breast cancer tumors.


Assuntos
Neoplasias da Mama , Feminino , Humanos , Algoritmos , Mama , Neoplasias da Mama/diagnóstico , Aprendizado de Máquina , Máquina de Vetores de Suporte
SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA
...