Pesquisa | Portal Regional da BVS (teste)

An automated framework for NMR resonance assignment through simultaneous slice picking and spin system forming.

Abbas, Ahmed; Guo, Xianrong; Jing, Bing-Yi; Gao, Xin.

J Biomol NMR ; 59(2): 75-86, 2014 Jun.

Artigo em Inglês | MEDLINE | ID: mdl-24748536

RESUMO

Despite significant advances in automated nuclear magnetic resonance-based protein structure determination, the high numbers of false positives and false negatives among the peaks selected by fully automated methods remain a problem. These false positives and negatives impair the performance of resonance assignment methods. One of the main reasons for this problem is that the computational research community often considers peak picking and resonance assignment to be two separate problems, whereas spectroscopists use expert knowledge to pick peaks and assign their resonances at the same time. We propose a novel framework that simultaneously conducts slice picking and spin system forming, an essential step in resonance assignment. Our framework then employs a genetic algorithm, directed by both connectivity information and amino acid typing information from the spin systems, to assign the spin systems to residues. The inputs to our framework can be as few as two commonly used spectra, i.e., CBCA(CO)NH and HNCACB. Different from the existing peak picking and resonance assignment methods that treat peaks as the units, our method is based on 'slices', which are one-dimensional vectors in three-dimensional spectra that correspond to certain ([Formula: see text]) values. Experimental results on both benchmark simulated data sets and four real protein data sets demonstrate that our method significantly outperforms the state-of-the-art methods while using a less number of spectra than those methods. Our method is freely available at http://sfb.kaust.edu.sa/Pages/Software.aspx.

Assuntos

Algoritmos , Ressonância Magnética Nuclear Biomolecular , Simulação por Computador , Humanos , Modelos Moleculares , Curva ROC , Saccharomyces cerevisiae/metabolismo , Thermotoga maritima/metabolismo

Automatic peak selection by a Benjamini-Hochberg-based algorithm.

Abbas, Ahmed; Kong, Xin-Bing; Liu, Zhi; Jing, Bing-Yi; Gao, Xin.

PLoS One ; 8(1): e53112, 2013.

Artigo em Inglês | MEDLINE | ID: mdl-23308147

RESUMO

A common issue in bioinformatics is that computational methods often generate a large number of predictions sorted according to certain confidence scores. A key problem is then determining how many predictions must be selected to include most of the true predictions while maintaining reasonably high precision. In nuclear magnetic resonance (NMR)-based protein structure determination, for instance, computational peak picking methods are becoming more and more common, although expert-knowledge remains the method of choice to determine how many peaks among thousands of candidate peaks should be taken into consideration to capture the true peaks. Here, we propose a Benjamini-Hochberg (B-H)-based approach that automatically selects the number of peaks. We formulate the peak selection problem as a multiple testing problem. Given a candidate peak list sorted by either volumes or intensities, we first convert the peaks into [Formula: see text]-values and then apply the B-H-based algorithm to automatically select the number of peaks. The proposed approach is tested on the state-of-the-art peak picking methods, including WaVPeak [1] and PICKY [2]. Compared with the traditional fixed number-based approach, our approach returns significantly more true peaks. For instance, by combining WaVPeak or PICKY with the proposed method, the missing peak rates are on average reduced by 20% and 26%, respectively, in a benchmark set of 32 spectra extracted from eight proteins. The consensus of the B-H-selected peaks from both WaVPeak and PICKY achieves 88% recall and 83% precision, which significantly outperforms each individual method and the consensus method without using the B-H algorithm. The proposed method can be used as a standard procedure for any peak picking method and straightforwardly applied to some other prediction selection problems in bioinformatics. The source code, documentation and example data of the proposed method is available at http://sfb.kaust.edu.sa/pages/software.aspx.

Assuntos

Algoritmos , Ressonância Magnética Nuclear Biomolecular/métodos , Proteínas/química , Biologia Computacional/métodos , Software

WaVPeak: picking NMR peaks through wavelet-based smoothing and volume-based filtering.

Liu, Zhi; Abbas, Ahmed; Jing, Bing-Yi; Gao, Xin.

Bioinformatics ; 28(7): 914-20, 2012 Apr 01.

Artigo em Inglês | MEDLINE | ID: mdl-22328784

RESUMO

MOTIVATION: Nuclear magnetic resonance (NMR) has been widely used as a powerful tool to determine the 3D structures of proteins in vivo. However, the post-spectra processing stage of NMR structure determination usually involves a tremendous amount of time and expert knowledge, which includes peak picking, chemical shift assignment and structure calculation steps. Detecting accurate peaks from the NMR spectra is a prerequisite for all following steps, and thus remains a key problem in automatic NMR structure determination. RESULTS: We introduce WaVPeak, a fully automatic peak detection method. WaVPeak first smoothes the given NMR spectrum by wavelets. The peaks are then identified as the local maxima. The false positive peaks are filtered out efficiently by considering the volume of the peaks. WaVPeak has two major advantages over the state-of-the-art peak-picking methods. First, through wavelet-based smoothing, WaVPeak does not eliminate any data point in the spectra. Therefore, WaVPeak is able to detect weak peaks that are embedded in the noise level. NMR spectroscopists need the most help isolating these weak peaks. Second, WaVPeak estimates the volume of the peaks to filter the false positives. This is more reliable than intensity-based filters that are widely used in existing methods. We evaluate the performance of WaVPeak on the benchmark set proposed by PICKY (Alipanahi et al., 2009), one of the most accurate methods in the literature. The dataset comprises 32 2D and 3D spectra from eight different proteins. Experimental results demonstrate that WaVPeak achieves an average of 96%, 91%, 88%, 76% and 85% recall on (15)N-HSQC, HNCO, HNCA, HNCACB and CBCA(CO)NH, respectively. When the same number of peaks are considered, WaVPeak significantly outperforms PICKY. AVAILABILITY: WaVPeak is an open source program. The source code and two test spectra of WaVPeak are available at http://faculty.kaust.edu.sa/sites/xingao/Pages/Publications.aspx. The online server is under construction. CONTACT: statliuzhi@xmu.edu.cn; ahmed.abbas@kaust.edu.sa; majing@ust.hk; xin.gao@kaust.edu.sa.

Assuntos

Processamento Eletrônico de Dados/métodos , Espectroscopia de Ressonância Magnética , Proteínas/química , Software , Análise de Ondaletas

Empirical likelihood-based confidence intervals for the sensitivity of a continuous-scale diagnostic test at a fixed level of specificity.

Davis, Angela E; Jing, Bing-Yi.

Stat Methods Med Res ; 20(3): 217-31, 2011 Jun.

Artigo em Inglês | MEDLINE | ID: mdl-19654172

RESUMO

For a continuous-scale diagnostic test, it is often of interest to find the range of the sensitivity of the test at the cut-off that yields a desired specificity. In this article, we first define a profile empirical likelihood ratio for the sensitivity of a continuous-scale diagnostic test and show that its limiting distribution is a scaled chi-square distribution. We then propose two new empirical likelihood-based confidence intervals for the sensitivity of the test at a fixed level of specificity by using the scaled chi-square distribution. Simulation studies are conducted to compare the finite sample performance of the newly proposed intervals with the existing intervals for the sensitivity in terms of coverage probability. A real example is used to illustrate the application of the recommended methods.

Assuntos

Testes Diagnósticos de Rotina/estatística & dados numéricos , Área Sob a Curva , Antígeno CA-19-9/sangue , Distribuição de Qui-Quadrado , Intervalos de Confiança , Humanos , Funções Verossimilhança , Modelos Estatísticos , Neoplasias Pancreáticas/sangue , Neoplasias Pancreáticas/diagnóstico , Sensibilidade e Especificidade

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA