1.
Sci Rep ; 11(1): 9817, 2021 05 10.
Article in English | MEDLINE | ID: mdl-33972606

ABSTRACT

Fast diagnosis and treatment of pneumothorax, a collapsed or dropped lung, is crucial to avoid fatalities. Pneumothorax is typically detected on a chest X-ray image through visual inspection by experienced radiologists. However, the detection rate is quite low because small lung collapses are difficult to identify by visual inspection. Therefore, there is an urgent need for automated detection systems to assist radiologists. Although deep learning classifiers generally deliver high accuracy in many applications, they may not be useful in clinical practice due to the lack of high-quality, representative labeled image sets. Alternatively, searching the archive of past cases to find matching images may serve as a "virtual second opinion" by providing access to the metadata of matched, evidently diagnosed cases. To use image search as a triaging or diagnosis assistant, we must first tag all chest X-ray images with expressive identifiers, i.e., deep features. Then, given a query chest X-ray image, the majority vote among the top k retrieved images can provide a more explainable output. In this study, we searched a repository of more than 550,000 chest X-ray images. We developed the Autoencoding Thorax Net (AutoThorax-Net for short) for image search in chest radiographs. Experimental results show that image search based on AutoThorax-Net features can achieve high identification performance, providing a path towards real-world deployment. We achieved an AUC of 92% for a semi-automated search in 194,608 images (pneumothorax and normal) and an AUC of 82% for a fully automated search in 551,383 images (normal, pneumothorax, and many other chest diseases).
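
The retrieval step in this abstract (rank archived images by feature similarity, then take the majority label of the top k) can be illustrated with a minimal sketch. The Euclidean distance, the function name `top_k_majority_vote`, and the 2-D toy features are assumptions made for illustration; the abstract does not state which distance or feature size AutoThorax-Net uses.

```python
import math
from collections import Counter


def top_k_majority_vote(query, archive, k=3):
    """Return (majority label, vote share) among the k nearest archive items.

    archive is a list of (feature_vector, label) pairs. Euclidean distance
    is an assumption of this sketch, not a detail taken from the abstract.
    """
    ranked = sorted(archive, key=lambda item: math.dist(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    label, count = votes.most_common(1)[0]
    return label, count / k


# Made-up 2-D "deep features"; real feature vectors would be much longer.
archive = [
    ([0.9, 0.1], "pneumothorax"),
    ([0.8, 0.2], "pneumothorax"),
    ([0.1, 0.9], "normal"),
    ([0.2, 0.8], "normal"),
    ([0.7, 0.3], "pneumothorax"),
]
label, share = top_k_majority_vote([0.85, 0.15], archive, k=3)
print(label, share)
```

The retrieved neighbors can be shown alongside the vote, which is what makes this output easier to explain than a bare classifier score. With more than two classes a tie is possible, and this sketch does not define a tie-breaking rule.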

2.
Annu Int Conf IEEE Eng Med Biol Soc ; 2020: 2186-2189, 2020 07.
Article in English | MEDLINE | ID: mdl-33018440

ABSTRACT

Chest radiography has become the modality of choice for diagnosing pneumonia. However, analyzing chest X-ray images may be tedious and time-consuming, and may require expert knowledge that is not available in less-developed regions. Therefore, computer-aided diagnosis systems are needed. Recently, many classification systems based on deep learning have been proposed. Despite their success, the high development cost of deep networks is still a hurdle for deployment. Deep transfer learning (or simply transfer learning) has the merit of reducing the development cost by borrowing architectures from trained models, followed by slight fine-tuning of some layers. Nevertheless, whether deep transfer learning is more effective than training from scratch in the medical setting remains a research question for many applications. In this work, we investigate the use of deep transfer learning to classify pneumonia in chest X-ray images. Experimental results demonstrated that, with slight fine-tuning, deep transfer learning brings a performance advantage over training from scratch. Three models, ResNet-50, Inception V3, and DenseNet121, were trained separately through transfer learning and from scratch. The former achieved an area under the curve (AUC) 4.1% to 52.5% larger than that obtained by the latter, suggesting the effectiveness of deep transfer learning for classifying pneumonia in chest X-ray images.
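
The comparison in this abstract rests on the AUC. As a minimal sketch, the AUC can be computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counting one half. The function name `roc_auc` and all scores below are made up for illustration and are not results from the paper. The abstract does not say whether "larger" means a relative or an absolute difference; the sketch computes the relative one as an assumption.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy scores for six images (1 = pneumonia, 0 = normal); purely illustrative.
labels = [1, 1, 1, 0, 0, 0]
scratch_scores = [0.6, 0.4, 0.7, 0.5, 0.3, 0.65]
transfer_scores = [0.9, 0.7, 0.8, 0.4, 0.2, 0.75]

auc_scratch = roc_auc(labels, scratch_scores)
auc_transfer = roc_auc(labels, transfer_scores)
relative_gain = (auc_transfer - auc_scratch) / auc_scratch
print(f"{auc_scratch:.3f} {auc_transfer:.3f} {relative_gain:.1%}")
```

The pairwise form is quadratic in the number of cases, which is fine for a sketch; a rank-based computation scales better on real test sets.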


Subjects
Deep Learning , Pneumonia , Diagnosis, Computer-Assisted , Humans , Pneumonia/diagnostic imaging , Radiography , X-Rays
3.
BMC Med Genomics ; 11(Suppl 5): 103, 2018 Nov 20.
Article in English | MEDLINE | ID: mdl-30453949

ABSTRACT

BACKGROUND: A protein family has similar and diverse functions that are locally conserved. An aligned pattern cluster (APC) can reflect the conserved functionality. Discovering aligned residue associations (ARAs) in APCs can reveal subtle inner working characteristics of the conserved regions of protein families. However, ARAs corresponding to different functionalities, subgroups, or classes could be entangled because of multiple subtle, entwined factors.

METHODS: To discover and disentangle patterns from mixed-mode datasets, such as APCs in which the residues are replaced by a list of their fundamental biochemical properties, this paper presents a novel method, Extended Aligned Residual Association Discovery and Disentanglement (E-ARADD). E-ARADD discretizes the numerical data to transform the mixed-mode dataset into an event-value dataset, constructs an ARA Frequency Matrix, and then converts it into an adjusted Statistical Residual (SR) Vector Space (SRV) that captures statistical deviation from randomness. Applying Principal Component (PC) Decomposition to the SRV yields PCs ranked by their variance. Finally, the disentangled ARAs are discovered when the projections on a PC are re-projected to a vector space with the same basis vectors as the SRV.

RESULTS: Experiments on synthetic, cytochrome c, and class A scavenger receptor data have shown that E-ARADD can a) disentangle the entwined ARAs in APCs (with residues or biochemical properties) and b) reveal subtle AR clusters relating to classes, subtle subgroups, or specific functionalities.

CONCLUSIONS: E-ARADD can discover and disentangle ARs and ARAs entangled in functionality and location of protein families to reveal functional subgroups and subgroup characteristics of biologically conserved regions. Experimental results on synthetic data provide proof-of-concept validation of successful disentanglement that reveals class-associated ARAs with or without class labels as input. Experiments on cytochrome c data proved the efficacy of E-ARADD in handling both types of residue data. Our novel methodology is able to discover and disentangle not only ARs and ARAs in specific statistical/functional spaces (PCs and RSRVs), but also their locations in the protein family functional domains. The success of E-ARADD shows its great potential for proteomic research, drug discovery, and precision and personalized genetic medicine.
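
The step that turns a frequency matrix into statistical residuals can be sketched as follows. The abstract does not give the formula, so this sketch assumes the standard adjusted standardized residual for a contingency table: observed minus expected, divided by the square root of expected × (1 − row share) × (1 − column share). The function name `adjusted_residuals` and the 2×2 toy counts are hypothetical.

```python
import math


def adjusted_residuals(freq):
    """Adjusted standardized residuals of a contingency table (list of rows).

    Assumes the common definition; the paper's exact formula may differ.
    """
    n = sum(map(sum, freq))
    row_totals = [sum(row) for row in freq]
    col_totals = [sum(col) for col in zip(*freq)]
    residuals = []
    for i, row in enumerate(freq):
        out_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            variance = (expected
                        * (1 - row_totals[i] / n)
                        * (1 - col_totals[j] / n))
            out_row.append((observed - expected) / math.sqrt(variance))
        residuals.append(out_row)
    return residuals


# Toy co-occurrence counts of two residue values at two aligned positions.
freq = [[30, 10],
        [10, 30]]
sr = adjusted_residuals(freq)
print(sr)
```

Values far from zero mark residue pairs that co-occur more (positive) or less (negative) often than independence would predict. A row or column that holds every observation gives zero variance, which this sketch does not guard against.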


Subjects
Computational Biology/methods , Algorithms , Cluster Analysis , Cytochromes c/chemistry , Cytochromes c/metabolism , Principal Component Analysis
4.
IEEE Trans Nanobioscience ; 17(3): 209-218, 2018 07.
Article in English | MEDLINE | ID: mdl-29994222

ABSTRACT

Functional region identification is of fundamental importance for protein sequence analysis. Such knowledge provides better scientific understanding and could assist drug discovery. To date, domain annotation is one approach, but it needs to leverage existing databases. For de novo discovery, motif discovery locates and aligns locally homologous sub-sequences to obtain a position-weight matrix (PWM), which is a fixed-length representation model, whereas the size of protein functional regions varies. It thus requires a computationally expensive exhaustive search to obtain a PWM whose width falls in the optimal range. This paper presents a new method known as pattern-directed aligned pattern clustering (PD-APCn) to discover and align patterns in conserved protein functional regions. It adopts an aligned pattern cluster (APC) with patterns of variable length and strong support to direct the incremental APC expansion. It allows substitution and frame-shift mutations until a robust termination condition is reached. The concept of a breakpoint gap is introduced to identify spots of mutations, such as substitutions and frame shifts. Experiments on synthetic data sets with different sizes and noise levels showed that PD-APCn outperforms MEME, with much higher recall and F-measure and a computational speed 665 times faster than MEME. When applied to the Cytochrome C and Ubiquitin families, it found all key binding sites within the APCs.
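
To make the fixed-length limitation of a PWM concrete, here is a minimal sketch that builds one from aligned segments as per-position relative frequencies. The function name `position_weight_matrix` and the toy segments are made up for illustration; this is the baseline representation the abstract contrasts against, not the PD-APCn algorithm.

```python
from collections import Counter


def position_weight_matrix(aligned, alphabet):
    """Per-position relative frequencies of each symbol in aligned segments."""
    width = len(aligned[0])
    if any(len(segment) != width for segment in aligned):
        raise ValueError("a PWM requires equal-length aligned segments")
    pwm = []
    for pos in range(width):
        counts = Counter(segment[pos] for segment in aligned)
        pwm.append({a: counts[a] / len(aligned) for a in alphabet})
    return pwm


# Made-up aligned segments, all of width 5.
segments = ["CAACH", "CSQCH", "CAQCH", "CGACH"]
pwm = position_weight_matrix(segments, "ACGHQS")
for pos, column in enumerate(pwm):
    print(pos, {a: p for a, p in column.items() if p > 0})
```

The length check is the constraint in question: a segment with an insertion or deletion cannot be added without first choosing a new width and realigning, which is why the search over widths becomes expensive.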


Subjects
Computational Biology/methods , Pattern Recognition, Automated/methods , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Algorithms , Cluster Analysis , Databases, Protein , Humans , Proteins/chemistry , Proteins/genetics
5.
Proteomes ; 6(1), 2018 Feb 08.
Article in English | MEDLINE | ID: mdl-29419792

ABSTRACT

A protein family has similar and diverse functions locally conserved as aligned sequence segments. Further discovering their association patterns could reveal subtle family subgroup characteristics. Since aligned residue associations (ARAs) in Aligned Pattern Clusters (APCs) are complex and intertwined due to entangled function, factors, and variance in the source environment, we have recently developed a novel method, Aligned Residue Association Discovery and Disentanglement (ARADD), to solve this problem. ARADD first obtains an ARA Frequency Matrix from an APC and converts it to an adjusted statistical residual vector space (SRV). It then disentangles the SRV into Principal Components (PCs) and re-projects their vectors to an SRV to reveal succinct orthogonal AR groups. In this study, we applied ARADD to class A scavenger receptors (SR-A), a subclass of a diverse protein family that binds to modified lipoproteins and has diverse biological functionalities that are not explicitly known. Our experimental results demonstrated that ARADD can unveil subtle subgroups in sequence segments with diverse functionality and highly variable sequence lengths. We also demonstrated that the ARAs captured in a Position Weight Matrix or an APC were entangled in biological function and domain location but were disentangled by ARADD to reveal different subclasses without knowing their actual occurrence positions.
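
The decomposition and re-projection steps can be illustrated with a small sketch: find the leading principal component of a set of vectors, project each vector onto it, and express the result back in the original coordinates. Power iteration, the function names, and the toy vectors are assumptions of this sketch; the abstract does not describe how ARADD computes its PCs, and a full decomposition would return every component rather than only the first.

```python
import math


def leading_component(rows, iterations=100):
    """Return (mean, unit vector) of the first principal component.

    Uses power iteration on the sample covariance matrix.
    """
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    v = [1.0] * d  # arbitrary start; must not be orthogonal to the answer
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


def reproject(rows, mean, v):
    """Project each row onto v, then map the score back to the original axes."""
    out = []
    for r in rows:
        score = sum((r[j] - mean[j]) * v[j] for j in range(len(v)))
        out.append([mean[j] + score * v[j] for j in range(len(v))])
    return out


# Toy residual vectors: the first two coordinates vary together.
srv = [[2.0, 2.0, 0.0], [-2.0, -2.0, 0.0], [1.0, 1.0, 0.1], [-1.0, -1.0, -0.1]]
mean, pc1 = leading_component(srv)
reprojected = reproject(srv, mean, pc1)
print(pc1)
print(reprojected[0])
```

In the re-projected vectors only the associations carried by the chosen component remain, expressed on the same axes as the input, which is the sense in which one source of variation is separated from the others.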

6.
Methods ; 110: 26-34, 2016 11 01.
Article in English | MEDLINE | ID: mdl-27476008

ABSTRACT

Predicting Protein-Protein Interaction (PPI) is important for making new discoveries in the molecular mechanisms inside a cell. Traditionally, new PPIs are identified through biochemical experiments, but such methods are labor-intensive, expensive, time-consuming, and technically ineffective due to high false positive rates. Sequence-based prediction is currently the most readily applicable and cost-effective method. It exploits known PPI databases to construct classifiers for predicting unknown PPIs based only on sequence data, without requiring any other prior knowledge. Among existing sequence-based methods, most feature-based methods use exact sequence patterns of fixed length as features, a constraint that is biologically unrealistic. SVM with Pairwise String Kernel renders better prediction performance. However, it is difficult to interpret biologically because it is kernel-based and computes no concrete feature values. Here we have developed a novel method, WeMine-P2P, to overcome these drawbacks. By assuming that the regions/sites that mediate PPI are more conserved, WeMine-P2P first discovers and locates the conserved sequence patterns in protein sequences in the form of Aligned Pattern Clusters (APCs), allowing pattern variations with variable length. It then pairs up all APCs into a set of Co-Occurring APC (cAPC) pairs and computes a cAPC-PPI score for each cAPC pair on all PPI pairs. It further constructs a feature vector composed of all cAPC pairs with their cAPC-PPI scores for each PPI pair and uses these vectors to construct a PPI predictor. Through 40 independent experiments, we showed that (1) WeMine-P2P outperforms the well-known algorithm PIPE2, which also utilizes co-occurring amino acid sequence segments but does not allow variable lengths and pattern variations; (2) WeMine-P2P achieves satisfactory PPI prediction performance, comparable to that of the SVM-based methods, particularly among unseen protein sequences, with a potential 1280× reduction in feature dimension; (3) unlike SVM-based methods, WeMine-P2P renders interpretable biological features, from which we observed that co-occurring sequence patterns from the compositional bias regions are more discriminative. WeMine-P2P is extendable to predict other biosequence interactions, such as Protein-DNA interactions.
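
The idea of building one feature per pair of pattern clusters can be sketched as below. The abstract does not define the cAPC-PPI score, so this sketch substitutes a plain 0/1 co-occurrence indicator as a stand-in. The function names, pattern clusters, and sequences are hypothetical.

```python
from itertools import combinations_with_replacement


def occurs(pattern_cluster, sequence):
    """True if any pattern variant in the cluster appears in the sequence."""
    return any(variant in sequence for variant in pattern_cluster)


def cooccurrence_features(clusters, seq_a, seq_b):
    """One 0/1 feature per unordered cluster pair for a protein pair.

    A stand-in for the paper's cAPC-PPI score, whose formula is not given.
    """
    features = []
    for i, j in combinations_with_replacement(range(len(clusters)), 2):
        forward = occurs(clusters[i], seq_a) and occurs(clusters[j], seq_b)
        reverse = occurs(clusters[j], seq_a) and occurs(clusters[i], seq_b)
        features.append(1 if forward or reverse else 0)
    return features


# Two made-up pattern clusters, each holding variants of one conserved pattern.
clusters = [{"CAACH", "CSQCH"}, {"KKGE", "KRGE"}]
features = cooccurrence_features(clusters, "MGDCAACHTV", "LPKRGEQW")
print(features)  # feature order: pairs (0,0), (0,1), (1,1)
```

Because each feature names a specific pair of patterns, a classifier trained on such vectors can be inspected pattern by pattern, which is the interpretability advantage the abstract claims over kernel-based methods.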


Subjects
Computational Biology/methods , Protein Interaction Mapping/methods , Protein Interaction Maps/genetics , Sequence Analysis, Protein/methods , Algorithms , Amino Acid Sequence/genetics