Results 1 - 5 of 5
1.
Article | IMSEAR | ID: sea-221381

ABSTRACT

Biomedical entity recognition from medical documents is a difficult research area, yet it lays the groundwork for extracting large amounts of biomedical information from unstructured text into structured formats. Existing work implements named entity recognition for diseases using a sequence labelling framework. The performance of this strategy, however, is not always adequate, and it frequently cannot fully exploit the semantic information in the dataset. In this work, the Syndrome Diseases Named Entity problem is formulated as sequence labelling with multi-context learning. By using well-designed texts/queries, this formulation can incorporate more prior information and decode it using techniques such as conditional random fields (CRF). We performed experiments on three biomedical datasets: BC5CDR-Disease, JNLPBA and NCBI-Disease. The results show that, compared with other techniques, our method is effective, achieving accuracies of 96.70%, 98.65% and 96.72%, respectively.
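The CRF decoding step mentioned in this abstract can be illustrated with a minimal Viterbi sketch. It is not the authors' implementation: the tag set (`O`, `B`, `I`), the emission scores, and the transition penalty are all made-up values chosen only to show how a linear-chain decoder picks the best tag sequence.

```python
from typing import Dict, List, Tuple


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence for a linear-chain model.

    emissions[t][tag] is the score of `tag` at position t;
    transitions[(prev, cur)] is the score of moving from prev to cur
    (missing pairs are treated as 0.0).
    """
    if not emissions:
        return []
    # best[t][tag] = best score of any path that ends in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in tags}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            prev = max(
                tags,
                key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0),
            )
            scores[cur] = (
                best[-1][prev]
                + transitions.get((prev, cur), 0.0)
                + emissions[t][cur]
            )
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical scores: the penalty on O -> I forces a B before the I tag.
TAGS = ["O", "B", "I"]
EMISSIONS = [
    {"O": 1.0, "B": 0.9, "I": 0.0},
    {"O": 0.0, "B": 0.0, "I": 1.0},
]
TRANSITIONS = {("O", "I"): -10.0}
decoded = viterbi_decode(EMISSIONS, TRANSITIONS, TAGS)
```

With the penalty in place the decoder prefers `B, I` even though `O` has the higher emission score at the first position, which is the kind of label consistency a CRF layer adds on top of per-token scores.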

2.
Genomics, Proteomics & Bioinformatics ; (4): 52-64, 2020.
Article in English | WPRIM | ID: wpr-829027

ABSTRACT

Proteases are enzymes that cleave and hydrolyse the peptide bonds between two specific amino acid residues of target substrate proteins. Protease-controlled proteolysis plays a key role in the degradation and recycling of proteins, which is essential for various physiological processes. Thus, solving the substrate identification problem will have important implications for the precise understanding of functions and physiological roles of proteases, as well as for therapeutic target identification and pharmaceutical applicability. Consequently, there is a great demand for bioinformatics methods that can predict novel substrate cleavage events with high accuracy by utilizing both sequence and structural information. In this study, we present Procleave, a novel bioinformatics approach for predicting protease-specific substrates and specific cleavage sites by taking into account both their sequence and 3D structural information. Structural features of known cleavage sites were represented by discrete values using a LOWESS data-smoothing optimization method, which turned out to be critical for the performance of Procleave. The optimal approximations of all structural parameter values were encoded in a conditional random field (CRF) computational framework, alongside sequence and chemical group-based features. Here, we demonstrate the outstanding performance of Procleave through extensive benchmarking and independent tests. Procleave is capable of correctly identifying most cleavage sites in the case study. Importantly, when applied to the human structural proteome encompassing 17,628 protein structures, Procleave suggests a number of potential novel target substrates and their corresponding cleavage sites of different proteases. Procleave is implemented as a webserver and is freely accessible at http://procleave.erc.monash.edu/.

3.
Academic Journal of Second Military Medical University ; (12): 497-506, 2019.
Article in Chinese | WPRIM | ID: wpr-837969

ABSTRACT

Objective To propose a conditional random field (CRF) model based on a new word segmentation method, Re-entity, and to compare it with the bi-directional long short-term memory neural network (BiLSTM)-CRF and the Lattice-long short-term memory neural network (LSTM). Methods After analyzing existing entity recognition methods, we proposed a CRF method based on Re-entity, together with BiLSTM-CRF and Lattice-LSTM, for task one of the China Conference on Knowledge Graph and Semantic Computing in 2018 (CCKS2018), Chinese clinical named entity recognition, and trained character vector sets at different parameter levels on different corpora. Comparative experiments on model performance were carried out across the different neural network models for each method. Finally, a comparative study was carried out with different input lengths, namely the sentence level and the text level. Results The Re-entity method can improve the performance of the CRF model. The Lattice-LSTM model with sentence-level input achieved a strict F1-measure of 89.75% on this task, which was higher than the highest F1-measure (89.25%) on task one of CCKS2018. Conclusion The CRF model based on Re-entity can effectively improve the recognition rate of traditional Chinese medicines in electronic medical records by using normalized Chinese clinical drug. The Re-entity method can reduce the error accumulation caused by word segmentation in data preprocessing. The lattice structure can better combine the latent semantic information of characters and word sequences. At the same time, sentence-level input can effectively improve the recognition accuracy of neural network models.
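The strict F1-measure reported above counts a predicted entity as correct only when both its boundaries and its type match the gold annotation exactly. The sketch below shows one common way to compute it from BIO tag sequences. It assumes a BIO scheme and uses hypothetical labels (`DRUG`, `SYM`); it is not the official CCKS2018 scorer.

```python
from typing import List, Set, Tuple

Entity = Tuple[int, int, str]  # (start, end_exclusive, type)


def extract_entities(tags: List[str]) -> Set[Entity]:
    """Collect (start, end, type) spans from a BIO tag sequence."""
    entities: Set[Entity] = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any entity still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if continues:
            continue
        if start is not None:
            entities.add((start, i, etype))
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
        else:
            # "O" or a stray "I-" tag: no entity is open.
            start, etype = None, None
    return entities


def strict_f1(gold_tags: List[str], pred_tags: List[str]) -> float:
    """F1 over entities that match exactly in boundaries and type."""
    gold = extract_entities(gold_tags)
    pred = extract_entities(pred_tags)
    true_pos = len(gold & pred)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred)
    recall = true_pos / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: the first prediction has a boundary error.
GOLD = ["B-DRUG", "I-DRUG", "O", "B-SYM"]
PRED = ["B-DRUG", "O", "O", "B-SYM"]
score = strict_f1(GOLD, PRED)
```

Here the truncated `DRUG` span earns no credit under strict matching, so only one of the two predicted entities counts as a true positive.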

4.
Academic Journal of Second Military Medical University ; (12): 903-908, 2018.
Article in Chinese | WPRIM | ID: wpr-838165

ABSTRACT

Objective To recognize cancer regions in pathological slices of gastric cancer using a deep learning-based segmentation algorithm. Methods The U-Net network was used as the basic structure to design a deeper segmentation algorithm, deeper U-Net (DU-Net), for gastric cancer pathological slices. The datasets were divided into several small blocks by the region overlapping segmentation method. The blocks were first segmented by the pre-trained DU-Net model, and new samples were re-synthesized using an image classifier to remove false positive samples. The new samples were trained repeatedly by repeated learning methods, and the segmentation results were processed with a fully connected conditional random field (CRF). Finally, the segmentation images of gastric cancer were obtained and validated. Results After three rounds of repeated learning, the mean accuracy of the DU-Net model for pathological slices of gastric cancer was 91.5%, and the mean intersection over union (IoU) was 88.4%. Compared with the basic DU-Net model without repeated learning, the mean accuracy and mean IoU of the DU-Net network increased by 2.9% and 5.6%, respectively. Conclusion The deep learning-based segmentation algorithm for pathological slices of gastric cancer can accurately recognize cancer regions, improve the generalization ability and robustness of the model, and can be used for computer-assisted diagnosis of gastric cancer.
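The intersection over union (IoU) metric used above can be sketched for binary masks as follows. The abstract does not say whether the mean is taken over images or over classes; this sketch assumes a per-image average, and the tiny masks are invented for illustration.

```python
from typing import List, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # 2D grid of 0 (background) / 1 (cancer)


def iou(pred: Mask, truth: Mask) -> float:
    """Intersection over union of two equally sized binary masks."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Two empty masks agree perfectly, so define their IoU as 1.0.
    return intersection / union if union else 1.0


def mean_iou(pairs: List[Tuple[Mask, Mask]]) -> float:
    """Average IoU over (prediction, ground truth) pairs (assumed per image)."""
    return sum(iou(pred, truth) for pred, truth in pairs) / len(pairs)


# Hypothetical 2x2 masks: one overlapping pixel out of three marked pixels.
PRED_MASK = [[1, 1], [0, 0]]
TRUE_MASK = [[1, 0], [1, 0]]
example_iou = iou(PRED_MASK, TRUE_MASK)
```

Unlike pixel accuracy, IoU ignores pixels that both masks label as background, so it is less flattering when the cancer region is small relative to the slice.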

5.
World Science and Technology-Modernization of Traditional Chinese Medicine ; (12): 70-77, 2017.
Article in Chinese | WPRIM | ID: wpr-513107

ABSTRACT

Clinical cases of TCM serve as important clinical data that record, in text form, the whole process of interaction between doctors and patients. However, in the context of big data, there is a lack of research on using the information contained in clinical cases. Therefore, in this paper we studied a method for extracting symptom terms from the history of present illness in TCM clinical records, in order to lay the foundation for further use of clinical cases. First, 12,367 history-of-present-illness records were obtained by random selection and expert review. According to disease type, they were divided into experimental groups: 4,838 records in the diabetes group, 7,529 records in the spleen and stomach disease group, and all 12,367 records in the mixed (combined) group. A glossary of symptom terms covering 22,996 words was compiled. Then, five feature templates, such as sliding window features, prefix and suffix characters, and lexical features, were selected. A CRF model was adopted to carry out the named entity extraction experiment. In the open test, the performance of the diabetes, spleen and stomach disease, and mixed groups was (0.83, 0.8, 0.82), (0.9, 0.9, 0.89) and (0.88, 0.87, 0.87), respectively, while the results were (0.83, 0.82, 0.83), (0.95, 0.95, 0.95) and (0.93, 0.92, 0.92) in ten-fold cross validation. In conclusion, the results showed that the CRF algorithm is an excellent sequence labeling algorithm and is applicable to the task of extracting symptom named entities from the history of present illness.
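The feature templates named in this abstract (sliding window, prefix and suffix characters, lexical features) can be illustrated with a small sketch. The abstract does not give the actual templates, so the window size, the feature names, the sample tokens, and the two-entry glossary below are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Union

FeatureValue = Union[str, bool]


def token_features(
    tokens: List[str],
    i: int,
    lexicon: Set[str],
    window: int = 1,
) -> Dict[str, FeatureValue]:
    """Build a feature dict for tokens[i], as a CRF toolkit might consume."""
    token = tokens[i]
    features: Dict[str, FeatureValue] = {
        "token": token,
        "prefix": token[:1],   # first character of the token
        "suffix": token[-1:],  # last character of the token
        # Lexical feature: does the token appear in the symptom glossary?
        "in_lexicon": token in lexicon,
    }
    # Sliding window: neighbouring tokens, padded at the sentence edges.
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = i + offset
        inside = 0 <= j < len(tokens)
        features[f"token[{offset:+d}]"] = tokens[j] if inside else "<PAD>"
    return features


# Hypothetical segmented input: "patient", "dry mouth", "polydipsia".
TOKENS = ["患者", "口干", "多饮"]
GLOSSARY = {"口干", "多饮"}  # stand-in for the 22,996-word symptom glossary
feats = token_features(TOKENS, 1, GLOSSARY)
```

Each position in a record yields one such dictionary, and the sequence of dictionaries together with the gold labels is what a CRF trainer would typically take as input.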
