Search | BVS Regional Portal

1.

Deep Learning Model for Prediction of Bronchopulmonary Dysplasia in Preterm Infants Using Chest Radiographs.

Chou, Hao-Yang; Lin, Yung-Chieh; Hsieh, Sun-Yuan; Chou, Hsin-Hung; Lai, Cheng-Shih; Wang, Bow; Tsai, Yi-Shan.

J Imaging Inform Med ; 2024 Mar 18.

Article in English | MEDLINE | ID: mdl-38499706

ABSTRACT

Bronchopulmonary dysplasia (BPD) is common in preterm infants and may result in pulmonary vascular disease, compromising lung function. This study aimed to employ artificial intelligence (AI) techniques to help physicians accurately diagnose BPD in preterm infants in a timely and efficient manner. This retrospective study involves two datasets: a lung region segmentation dataset comprising 1491 chest radiographs of infants, and a BPD prediction dataset comprising 1021 chest radiographs of preterm infants. Transfer learning of a pre-trained machine learning model was employed for lung region segmentation and image fusion for BPD prediction to enhance the performance of the AI model. The lung segmentation model uses transfer learning to achieve a dice score of 0.960 for preterm infants with ≤ 168 h postnatal age. The BPD prediction model exhibited superior diagnostic performance compared to that of experts and demonstrated consistent performance for chest radiographs obtained at ≤ 24 h postnatal age, and those obtained at 25 to 168 h postnatal age. This study is the first to use deep learning on preterm chest radiographs for lung segmentation to develop a BPD prediction model with an early detection time of less than 24 h. Additionally, this study compared the model's performance according to both NICHD and Jensen criteria for BPD. Results demonstrate that the AI model surpasses the diagnostic accuracy of experts in predicting lung development in preterm infants.
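
The Dice score reported for the lung segmentation model measures the overlap between a predicted mask and a reference mask. The following is a minimal sketch of that metric only, not the authors' code; the function name `dice_score` and the toy list-of-lists mask format are assumptions made for illustration.

```python
def dice_score(pred_mask, true_mask):
    """Dice coefficient 2*|A and B| / (|A| + |B|) for two binary masks.

    Masks are equal-shaped 2D lists of 0/1 values (hypothetical toy format).
    """
    pred = [bool(v) for row in pred_mask for v in row]
    true = [bool(v) for row in true_mask for v in row]
    if len(pred) != len(true):
        raise ValueError("masks must have the same number of pixels")
    overlap = sum(p and t for p, t in zip(pred, true))
    total = sum(pred) + sum(true)
    if total == 0:
        return 1.0  # convention assumed here: two empty masks agree perfectly
    return 2.0 * overlap / total


# Toy example: the prediction covers two pixels, the reference covers one of them.
print(dice_score([[1, 1], [0, 0]], [[1, 0], [0, 0]]))
```

A score of 1 means the two masks coincide exactly, and 0 means they share no pixels.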

2.

A novel deep learning-based algorithm combining histopathological features with tissue areas to predict colorectal cancer survival from whole-slide images.

Li, Yan-Jun; Chou, Hsin-Hung; Lin, Peng-Chan; Shen, Meng-Ru; Hsieh, Sun-Yuan.

J Transl Med ; 21(1): 731, 2023 10 17.

Article in English | MEDLINE | ID: mdl-37848862

ABSTRACT

BACKGROUND: Many methodologies for selecting histopathological images, such as sampling image patches or segmenting histology from regions of interest (ROIs) or whole-slide images (WSIs), have been utilized to develop survival models. With gigapixel WSIs exhibiting diverse histological appearances, obtaining clinically prognostic and explainable features remains challenging. Therefore, we propose a novel deep learning-based algorithm combining tissue areas with histopathological features to predict cancer survival. METHODS: The Cancer Genome Atlas Colon Adenocarcinoma (TCGA-COAD) dataset was used in this investigation. A deep convolutional survival model (DeepConvSurv) extracted histopathological information from the image patches of nine different tissue types, including tumors, lymphocytes, stroma, and mucus. The tissue map of the WSIs was segmented using image processing techniques that involved localizing and quantifying the tissue region. Six survival models were evaluated, with the concordance index (C-index) as the evaluation metric. RESULTS: We extracted 128 histopathological features from four histological types and five tissue area features from WSIs to predict colorectal cancer survival. Our method performed better in six distinct survival models than the Whole Slide Histopathological Images Survival Analysis framework (WSISA), which adaptively sampled patches using K-means from WSIs. The best performance using histopathological features was 0.679 using LASSO-Cox. Compared to histopathological features alone, tissue area features increased the C-index by 2.5%. Based on histopathological features and tissue area features, our approach achieved performance of 0.704 with RIDGE-Cox. CONCLUSIONS: A deep learning-based algorithm combining histopathological features with tissue area proved clinically relevant and effective for predicting cancer survival.
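
The C-index used as the evaluation metric can be illustrated with a small sketch. This follows the common Harrell definition and is not taken from the paper; the function name `concordance_index` and the convention that a higher score means higher predicted risk are assumptions.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    times: observed follow-up times.
    events: 1 if the event occurred, 0 if the observation was censored.
    risk_scores: higher score = higher predicted risk (assumed convention).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had the event before time j.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in predicted risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which values such as 0.679 and 0.704 are read.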

Subjects

Adenocarcinoma , Colonic Neoplasms , Deep Learning , Humans , Algorithms , Image Processing, Computer-Assisted

3.

Slice-Fusion: Reducing False Positives in Liver Tumor Detection for Mask R-CNN.

Tu, Deng-Yao; Lin, Peng-Chan; Chou, Hsin-Hung; Shen, Meng-Ru; Hsieh, Sun-Yuan.

IEEE/ACM Trans Comput Biol Bioinform ; 20(5): 3267-3277, 2023.

Article in English | MEDLINE | ID: mdl-37027274

ABSTRACT

Automatic liver tumor detection from computed tomography (CT) makes clinical examinations more accurate. However, deep learning-based detection algorithms are characterized by high sensitivity and low precision, which hinders diagnosis given that false-positive tumors must first be identified and excluded. These false positives arise because detection models incorrectly identify partial volume artifacts as lesions, which in turn stems from the inability to learn the perihepatic structure from a global perspective. To overcome this limitation, we propose a novel slice-fusion method that mines the global structural relationship between the tissues in the target CT slices and fuses the features of adjacent slices according to the importance of the tissues. Furthermore, we design a new network based on our slice-fusion method and the Mask R-CNN detection model, called Pinpoint-Net. We evaluated the proposed model on the Liver Tumor Segmentation Challenge (LiTS) dataset and our liver metastases dataset. Experiments demonstrated that our slice-fusion method not only enhances tumor detection ability by reducing the number of false-positive tumors smaller than 10 mm, but also improves segmentation performance. Without bells and whistles, a single Pinpoint-Net showed outstanding performance in liver tumor detection and segmentation on the LiTS test dataset compared with other state-of-the-art models.

Subjects

Image Processing, Computer-Assisted , Liver Neoplasms , Humans , Image Processing, Computer-Assisted/methods , Algorithms , Liver Neoplasms/diagnostic imaging , Abdomen

4.

Pathogenicity Prediction of Single Amino Acid Variants With Machine Learning Model Based on Protein Structural Energies.

Wu, Tzu-Hsuan; Lin, Peng-Chan; Chou, Hsin-Hung; Shen, Meng-Ru; Hsieh, Sun-Yuan.

IEEE/ACM Trans Comput Biol Bioinform ; 20(1): 606-615, 2023.

Article in English | MEDLINE | ID: mdl-34962874

ABSTRACT

The most popular tools for predicting pathogenicity of single amino acid variants (SAVs) were developed based on sequence-based techniques. SAVs may change protein structure and function. In the context of van der Waals force and disulfide bridge calculations, no method directly predicts the impact of mutations on the energies of the protein structure. Here, we combined machine learning methods and energy scores of protein structures calculated by Rosetta Energy Function 2015 to predict SAV pathogenicity. The accuracy level of our model (0.76) is higher than that of six prediction tools. Further analyses revealed that the differential reference energies, attractive energies, and solvation of polar atoms between wildtype and mutant side-chains played essential roles in distinguishing benign from pathogenic variants. These features indicated the physicochemical properties of amino acids, which were observed in 3D structures instead of sequences. We added 16 features to Rhapsody (the prediction tool we used for our data set) and consequently improved its performance. The results indicated that these energy scores were more appropriate and more detailed representations of the pathogenicity of SAVs.

Subjects

Amino Acids , Proteins , Amino Acids/chemistry , Virulence , Proteins/chemistry , Mutation/genetics , Thermodynamics

5.

Novel Algorithm for Improved Protein Classification Using Graph Similarity.

Chou, Hsin-Hung; Hsu, Ching-Tien; Hsu, Chin-Wei; Yao, Kai-Hsun; Wang, Hao-Ching; Hsieh, Sun-Yuan.

IEEE/ACM Trans Comput Biol Bioinform ; 19(6): 3135-3143, 2022.

Article in English | MEDLINE | ID: mdl-34748498

ABSTRACT

Considerable sequence data are produced in genome annotation projects that relate to molecular levels, structural similarities, and molecular and biological functions. In structural genomics, the most essential task involves resolving protein structures efficiently with hardware or software, understanding these structures, and assigning their biological functions. Understanding the characteristics and functions of proteins enables the exploration of the molecular mechanisms of life. In this paper, we examine the problems of protein classification. Because they perform similar biological functions, proteins in the same family usually share similar structural characteristics. We employed this premise in designing a classification algorithm. In this algorithm, auxiliary graphs are used to represent proteins, with every amino acid in a protein corresponding to a vertex in a graph. Moreover, the links between amino acids correspond to the edges between the vertices. The proposed algorithm classifies proteins according to the similarities in their graphical structures. The proposed algorithm is efficient and accurate in distinguishing proteins from different families and outperformed related algorithms experimentally.

Subjects

Algorithms , Proteins , Humans , Proteins/genetics , Proteins/chemistry , Software , Genome

6.

A Novel Branch-and-Bound Algorithm for the Protein Folding Problem in the 3D HP Model.

Chou, Hsin-Hung; Hsu, Ching-Tien; Chen, Li-Hsuan; Lin, Yue-Cheng; Hsieh, Sun-Yuan.

IEEE/ACM Trans Comput Biol Bioinform ; 18(2): 455-462, 2021.

Article in English | MEDLINE | ID: mdl-31403440

ABSTRACT

The protein folding problem (PFP) is an important issue in bioinformatics and biochemical physics. One of the most widely studied models of protein folding is the hydrophobic-polar (HP) model introduced by Dill. The PFP in the three-dimensional (3D) lattice HP model has been shown to be NP-complete; the proposed algorithms for solving the problem can therefore only find near-optimal energy structures for most long benchmark sequences within acceptable time periods. In this paper, we propose a novel algorithm based on the branch-and-bound approach to solve the PFP in the 3D lattice HP model. For ten 48-monomer benchmark sequences, our proposed algorithm finds the lowest energies so far within computation times comparable to those of previous methods.
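
To make the objective concrete, the energy of a conformation in the lattice HP model can be computed as below. This is a minimal sketch of the standard scoring rule (each pair of H monomers that are lattice neighbours but not adjacent in the chain contributes -1), not the authors' branch-and-bound algorithm; the function name `hp_energy` and the tuple-based coordinate format are assumptions.

```python
def hp_energy(sequence, coords):
    """Energy of a conformation in the cubic-lattice HP model.

    sequence: string over {'H', 'P'}.
    coords: list of integer (x, y, z) tuples, one per monomer.
    """
    if len(sequence) != len(coords):
        raise ValueError("one coordinate per monomer is required")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    for a, b in zip(coords, coords[1:]):
        if sum(abs(p - q) for p, q in zip(a, b)) != 1:
            raise ValueError("consecutive monomers must be lattice neighbours")
    energy = 0
    for i in range(len(coords)):
        for j in range(i + 2, len(coords)):  # skip chain neighbours
            if sequence[i] == "H" and sequence[j] == "H":
                # Manhattan distance 1 means the two sites are lattice neighbours.
                if sum(abs(p - q) for p, q in zip(coords[i], coords[j])) == 1:
                    energy -= 1
    return energy


# A four-monomer chain bent into a square brings the two end H monomers together.
print(hp_energy("HPPH", [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]))
```

Solving the PFP means finding the self-avoiding walk that minimizes this value for a given sequence.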

Subjects

Computational Biology/methods , Models, Molecular , Protein Conformation , Protein Folding , Algorithms , Proteins/chemistry , Proteins/metabolism

7.

Plasma proteome plus site-specific N-glycoprofiling for hepatobiliary carcinomas.

Chang, Ting-Tsung; Cheng, Ji-Hong; Tsai, Hung-Wen; Young, Kung-Chia; Hsieh, Sun-Yuan; Ho, Cheng-Hsun.

J Pathol Clin Res ; 5(3): 199-212, 2019 07.

Article in English | MEDLINE | ID: mdl-31136099

ABSTRACT

Hepatobiliary cancer is the third leading cause of cancer death worldwide. Appropriate markers for early diagnosis, monitoring of disease progression, and prediction of postsurgical outcome are still lacking. As the majority of circulating N-glycoproteins originate from the hepatobiliary system, we sought to explore new markers by assessing the dynamics of the N-glycoproteome in plasma samples from patients with hepatocellular carcinoma (HCC), cholangiocarcinoma (CCA), or combined HCC and CCA (cHCC-CCA). Using a mass spectrometry-based quantitative proteomic approach, we found that 57 of 5358 identified plasma proteins were differentially expressed in hepatobiliary cancers. The levels of four essential proteins, including complement C3 and apolipoprotein C-III in HCC, galectin-3-binding protein in CCA, and 72 kDa inositol polyphosphate 5-phosphatase in cHCC-CCA, were highly correlated with tumor stage, tumor grade, recurrence-free survival, and overall survival. Postproteomic site-specific N-glycan analyses showed that human complement C3 bears high-mannose and hybrid glycoforms rather than complex glycoforms at Asn85. The abundance of complement C3 with mannose-5 or mannose-6 glycoform at Asn85 was associated with HCC tumor grade. Furthermore, stepwise Cox regression analyses revealed that HCC patients with a hybrid glycoform at Asn85 of complement C3 had a lower postsurgery tumor recurrence rate or mortality rate than those with a low amount of complement C3 protein. In conclusion, our data show that particular plasma N-glycoproteins with specific N-glycan compositions could be potential noninvasive markers to evaluate oncological status and prognosis of hepatobiliary cancers.

Subjects

Bile Duct Neoplasms/blood , Biomarkers, Tumor/blood , Carcinoma, Hepatocellular/blood , Cholangiocarcinoma/blood , Glycoproteins/blood , Liver Neoplasms/blood , Adult , Aged , Female , Humans , Male , Middle Aged , Proteomics

8.

eDRAM: Effective early disease risk assessment with matrix factorization on a large-scale medical database: A case study on rheumatoid arthritis.

Chin, Chu-Yu; Hsieh, Sun-Yuan; Tseng, Vincent S.

PLoS One ; 13(11): e0207579, 2018.

Article in English | MEDLINE | ID: mdl-30475847

ABSTRACT

Recently, a number of analytical approaches for probing medical databases have been developed to assist in disease risk assessment and to determine the association of a clinical condition with others, so that better and intelligent healthcare can be provided. The early assessment of disease risk is an emerging topic in medical informatics. If diseases are detected at an early stage, prognosis can be improved and medical resources can be used more efficiently. For example, if rheumatoid arthritis (RA) is detected at an early stage, appropriate medications can be used to prevent bone deterioration. In early disease risk assessment, finding important risk factors from large-scale medical databases and performing individual disease risk assessment have been challenging tasks. A number of recent studies have considered risk factor analysis approaches, such as association rule mining, sequential rule mining, regression, and expert advice. In this study, to improve disease risk assessment, machine learning and matrix factorization techniques were integrated to discover important and implicit risk factors. A novel framework is proposed that can effectively assess early disease risks, and RA is used as a case study. This framework comprises three main stages: data preprocessing, risk factor optimization, and early disease risk assessment. This is the first study integrating matrix factorization and machine learning for disease risk assessment that is applied to a nation-wide and longitudinal medical diagnostic database. In the experimental evaluations, a cohort established from a large-scale medical database was used that included 1007 RA-diagnosed patients and 921,192 control patients examined over a nine-year follow-up period (2000-2008). The evaluation results demonstrate that the proposed approach is more efficient and stable for disease risk assessment than state-of-the-art methods.

Subjects

Arthritis, Rheumatoid/diagnosis , Machine Learning , Adult , Aged , Arthritis, Rheumatoid/pathology , Cohort Studies , Databases, Factual , Female , Humans , Male , Middle Aged , Prognosis , Risk Assessment , Risk Factors

9.

Detecting exact breakpoints of deletions with diversity in hepatitis B viral genomic DNA from next-generation sequencing data.

Cheng, Ji-Hong; Liu, Wen-Chun; Chang, Ting-Tsung; Hsieh, Sun-Yuan; Tseng, Vincent S.

Methods ; 129: 24-32, 2017 10 01.

Article in English | MEDLINE | ID: mdl-28802713

ABSTRACT

Many studies have suggested that deletions in the hepatitis B virus (HBV) genome are associated with the development of progressive liver diseases, even ultimately resulting in hepatocellular carcinoma (HCC). Among the methods for detecting deletions from next-generation sequencing (NGS) data, few consider the characteristics of the virus, such as high evolution rates and high divergence among the different HBV genomes. Sequencing highly divergent HBV genome sequences using NGS technology outputs millions of reads. Thus, detecting exact breakpoints of deletions from these big and complex data incurs very high computational cost. We proposed a novel analytical method named VirDelect (Virus Deletion Detect), which uses split-read alignment to detect exact breakpoints and a diversity variable to account for high divergence in single-end read data, such that the computational cost can be reduced without losing accuracy. We used four simulated read datasets and two real paired-end read datasets of HBV genome sequences to verify VirDelect's accuracy by score functions. The experimental results show that VirDelect outperforms the state-of-the-art method Pindel in terms of accuracy score for all simulated datasets, and VirDelect had only two base errors even in real datasets. VirDelect is also shown to deliver high accuracy in analyzing single-end read data as well as paired-end data. VirDelect can serve as an accurate and efficient bioinformatics tool for physiologists and is applicable to further analyses of genomes with characteristics similar to those of HBV in genome length and high divergence. The software program of VirDelect can be downloaded at https://sourceforge.net/projects/virdelect/.

Subjects

Carcinoma, Hepatocellular/genetics , Gene Deletion , Hepatitis B virus/genetics , Hepatitis B/genetics , Carcinoma, Hepatocellular/virology , Genetic Variation , Genome, Viral/genetics , Hepatitis B/virology , Hepatitis B virus/pathogenicity , High-Throughput Nucleotide Sequencing , Humans

10.

A Faster cDNA Microarray Gene Expression Data Classifier for Diagnosing Diseases.

Hsieh, Sun-Yuan; Chou, Yu-Chun.

IEEE/ACM Trans Comput Biol Bioinform ; 13(1): 43-54, 2016.

Article in English | MEDLINE | ID: mdl-26336139

ABSTRACT

Profiling cancer molecules has several advantages; however, using microarray technology in routine clinical diagnostics is challenging for physicians. The classification of microarray data has two main limitations: 1) the data set is unreliable for building classifiers; and 2) the classifiers exhibit poor performance. Current microarray classification algorithms typically yield a high rate of false-positive cases, which is unacceptable in diagnostic applications. Numerous algorithms have been developed to detect false-positive cases; however, they require considerable computation time. To address this problem, this study enhanced a previously proposed gene expression graph (GEG)-based classifier to shorten the computation time. The modified classifier filters genes by using an edge weight to determine their significance, thereby facilitating accurate comparison and classification. This study experimentally compared the proposed classifier with a GEG-based classifier by using real data and benchmark tests. The results show that the proposed classifier is faster at detecting false positives.

Subjects

DNA, Complementary/genetics , Gene Expression Profiling/methods , Molecular Diagnostic Techniques/methods , Oligonucleotide Array Sequence Analysis/methods , Algorithms , DNA, Complementary/analysis , Humans , Models, Statistical

11.

A new branch and bound method for the protein folding problem under the 2D-HP model.

Hsieh, Sun-Yuan; Lai, De-Wei.

IEEE Trans Nanobioscience ; 10(2): 69-75, 2011 Jun.

Article in English | MEDLINE | ID: mdl-21742572

ABSTRACT

The protein folding problem is a fundamental problem in computational molecular biology and biochemical physics. The previously best known branch and bound method for the protein folding problem may find optimal or near-optimal energy structures from the benchmark sequences, but the total computation time is rather lengthy because it usually needs to run a great number of simulation tests or else loses accuracy. In this paper, we develop a new branch and bound method for the protein folding problem under the two-dimensional HP model to overcome these drawbacks. By using benchmark sequences for evaluation, we demonstrate that the performance of our method is superior to that of previously known methods. Moreover, our method is simple, flexible, and easy to implement for the protein folding problem.
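
The general branch and bound idea for the 2D HP model can be sketched as follows. This is a generic textbook-style illustration with a deliberately naive bound, not the method of the paper; the function name `fold_hp_2d` and the bound (at most 2 new contacts per remaining non-terminal H monomer, 3 for a terminal one) are assumptions chosen for simplicity, and the search is only practical for short sequences.

```python
def fold_hp_2d(sequence):
    """Branch-and-bound search for a minimum-energy fold on the 2D square lattice.

    Returns (energy, coords), where coords lists one (x, y) tuple per monomer.
    """
    n = len(sequence)
    if n == 0:
        return 0, []
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    best = {"energy": 1, "coords": None}  # 1 is above any reachable energy

    def max_future_gain(k):
        # Optimistic number of H-H contacts still obtainable from monomers k..n-1.
        gain = 0
        for idx in range(k, n):
            if sequence[idx] == "H":
                gain += 3 if idx == n - 1 else 2
        return gain

    def extend(coords, occupied, energy):
        k = len(coords)
        if k == n:
            if energy < best["energy"]:
                best["energy"], best["coords"] = energy, list(coords)
            return
        # Bound: prune when even the optimistic completion cannot beat the incumbent.
        if energy - max_future_gain(k) >= best["energy"]:
            return
        x, y = coords[-1]
        for dx, dy in moves:
            pos = (x + dx, y + dy)
            if pos in occupied:
                continue  # keep the walk self-avoiding
            gained = 0
            if sequence[k] == "H":
                for ex, ey in moves:
                    idx = occupied.get((pos[0] + ex, pos[1] + ey))
                    # Count H neighbours that are not the chain predecessor.
                    if idx is not None and idx != k - 1 and sequence[idx] == "H":
                        gained += 1
            occupied[pos] = k
            coords.append(pos)
            extend(coords, occupied, energy - gained)
            coords.pop()
            del occupied[pos]

    extend([(0, 0)], {(0, 0): 0}, 0)
    return best["energy"], best["coords"]
```

A tighter bound prunes more of the search tree, which is where practical methods of this kind differ from the sketch.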

Subjects

Algorithms , Computational Biology/methods , Models, Chemical , Protein Folding , Proteins/chemistry , Amino Acid Sequence , Amino Acids , Hydrophobic and Hydrophilic Interactions , Models, Molecular , Molecular Sequence Data , Protein Conformation , Proteins/metabolism

12.

An improved heuristic algorithm for finding motif signals in DNA sequences.

Huang, Chao-Wen; Lee, Wun-Shiun; Hsieh, Sun-Yuan.

IEEE/ACM Trans Comput Biol Bioinform ; 8(4): 959-75, 2011.

Article in English | MEDLINE | ID: mdl-20855921

ABSTRACT

The planted (l, d)-motif search problem is a mathematical abstraction of the DNA functional site discovery task. In this paper, we propose a heuristic algorithm that can find planted (l, d)-signals in a given set of DNA sequences. Evaluations on simulated data sets demonstrate that the proposed algorithm outperforms current widely used motif finding algorithms. We also report the results of experiments on real biological data sets.
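
The planted (l, d)-motif condition itself is easy to state in code: a length-l candidate is a valid motif if every input sequence contains some l-mer within Hamming distance d of it. The sketch below only checks that condition for a given candidate and is not the heuristic proposed in the paper; the function names `hamming` and `is_planted_motif` are hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def is_planted_motif(candidate, sequences, d):
    """True if every sequence holds an l-mer within Hamming distance d of candidate."""
    l = len(candidate)
    for seq in sequences:
        windows = (seq[i:i + l] for i in range(len(seq) - l + 1))
        if not any(hamming(candidate, w) <= d for w in windows):
            return False
    return True


# Toy example with l = 3: "CGT" occurs with at most one mismatch in each sequence.
print(is_planted_motif("CGT", ["AACGTT", "TTCGAA", "GGCCTG"], d=1))
```

The search problem is hard because the candidate is unknown and the number of possible l-mers grows as 4^l.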

Subjects

Algorithms , Computational Biology/methods , Consensus Sequence , DNA/chemistry , Pattern Recognition, Automated/methods , Sequence Analysis, DNA/methods , Animals , Base Sequence , Binding Sites , Cattle , Chickens , Computer Simulation , DNA/genetics , Humans , Mice , Molecular Sequence Data , Nucleic Acid Conformation , Swine , Tetraodontiformes
