1.
Oncol Lett ; 16(4): 4713-4720, 2018 Oct.
Article in English | MEDLINE | ID: mdl-30214605

ABSTRACT

Using whole-exome sequencing (WES) for the detection of chromosomal aberrations from tumor samples has become increasingly popular, as it is cost-effective and time-efficient. However, factors present in WES tumor samples, including diversity in exon size, batch effects and tumor impurity, can complicate the identification of somatic mutations in each region of the exon. To address these issues, the authors of the present study have developed a novel method, PECNV, for the detection of genomic copy number variants and loss of heterozygosity in WES datasets. PECNV combines the normalized logarithm ratio of read counts (Log Ratio) and B allele frequency (BAF), and then employs the expectation-maximization (EM) algorithm to estimate the parameters of the models. A comprehensive assessment of PECNV was performed by analyzing simulated datasets contaminated with different proportions of normal cells and eight real primary triple-negative breast cancer samples. PECNV demonstrated superior results compared with ExomeCNV and EXCAVATOR for the detection of genomic aberrations in WES data.
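
The two signals named in this abstract can be illustrated with a minimal sketch. It uses the generic textbook definitions of a log2 read-count ratio and a B allele frequency; the function names, the library-size normalization and the example numbers are assumptions made here for illustration and are not taken from PECNV itself.

```python
import math

def log_ratio(tumor_reads, normal_reads, tumor_total, normal_total):
    """Log2 ratio of library-size-normalized read counts for one exon."""
    tumor_frac = tumor_reads / tumor_total
    normal_frac = normal_reads / normal_total
    return math.log2(tumor_frac / normal_frac)

def b_allele_frequency(b_reads, total_reads):
    """Fraction of reads at a heterozygous site that carry the B allele."""
    return b_reads / total_reads

# Hypothetical exon: equal library sizes, tumor depth 1.5x the normal depth.
lr = log_ratio(150, 100, 1_000_000, 1_000_000)
# Hypothetical heterozygous site: 30 of 90 reads support the B allele.
baf = b_allele_frequency(30, 90)
print(round(lr, 3), round(baf, 3))
```

A positive log ratio together with a BAF shifted away from 0.5 is the kind of joint evidence a copy-number caller would model; how PECNV weighs the two is not described in the abstract.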

2.
Mol Biosyst ; 12(7): 2224-32, 2016 06 21.
Article in English | MEDLINE | ID: mdl-27153230

ABSTRACT

Recently, accumulating studies have indicated that microRNAs (miRNAs) play an important role in exploring the pathogenesis of various human diseases at the molecular level and may result in the design of specific tools for diagnosis, treatment evaluation and prevention. Experimental identification of disease-related miRNAs is time-consuming and labour-intensive. Hence, there is a pressing need for efficient computational methods to detect more potential miRNA-disease associations. Currently, several computational approaches for identifying disease-related miRNAs on the miRNA-disease network have gained much attention by integrating miRNA functional similarities and disease semantic similarities. However, these methods rarely consider the network topological similarity of the miRNA-disease association network. Here, we develop an improved computational method named NTSMDA that is based on the topological similarity of the known miRNA-disease network to uncover more potential disease-related miRNAs. We achieve an AUC of 89.4% in a leave-one-out cross-validation experiment, demonstrating the excellent predictive performance of NTSMDA. Furthermore, highly ranked predicted miRNA-disease associations for breast neoplasms, lung neoplasms and prostatic neoplasms are manually confirmed using different related databases and the literature, providing evidence for the good performance and potential value of the NTSMDA method in inferring miRNA-disease associations. The R code and readme file of NTSMDA can be downloaded from .
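
The AUC reported here summarizes how well held-out known associations are ranked above unconfirmed candidates. The following is a minimal sketch of that metric only, computed by pairwise comparison; the function name and the toy scores are illustrative assumptions, and it is not the NTSMDA R code.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy ranking: label 1 marks a held-out known association, 0 a candidate.
perfect = auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
mixed = auc([0.9, 0.1, 0.8, 0.3], [1, 1, 0, 0])
print(perfect, mixed)
```

In a leave-one-out setting, each known association would be hidden in turn, scored by the predictor, and pooled into `scores` and `labels` before calling `auc`.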


Subject(s)
Computational Biology/methods , Genetic Association Studies , Genetic Predisposition to Disease , MicroRNAs/genetics , Algorithms , Databases, Nucleic Acid , Gene Regulatory Networks , Humans , ROC Curve , Reproducibility of Results
3.
Biomed Mater Eng ; 26 Suppl 1: S1845-53, 2015.
Article in English | MEDLINE | ID: mdl-26405956

ABSTRACT

Application of next-generation sequencing (NGS) technology has demonstrated that most tumor samples exhibit intra-tumor heterogeneity. Here we propose SAPPH (Somatic Aberrations Prediction for Paired Heterogeneous tumor samples), a new method for estimating tumor somatic copy number aberrations as well as inferring tumor subclone proportions from heterogeneous tumor sequencing data. This method is based on CBS and a local proportion clustering strategy. When SAPPH is applied to simulated tumor samples, the agreement between the results produced by SAPPH and the sequencing signals suggests that SAPPH can find the solution that best fits the signal distributions. We benchmark the performance of SAPPH and show that it outperforms existing methods in estimating tumor copy number aberrations.


Subject(s)
Clone Cells , DNA Copy Number Variations/genetics , DNA, Neoplasm/genetics , Genes, Neoplasm/genetics , High-Throughput Nucleotide Sequencing/methods , Neoplasms/genetics , Base Sequence , Computer Simulation , Gene Dosage/genetics , Humans , Models, Genetic , Models, Statistical , Molecular Sequence Data , Mutation/genetics , Sequence Analysis, DNA/methods
4.
PLoS One ; 10(6): e0129835, 2015.
Article in English | MEDLINE | ID: mdl-26111017

ABSTRACT

BACKGROUND: The tumor single nucleotide polymorphism (SNP) array is a common platform for investigating cancer genomic aberrations and functionally important altered genes. Original SNP array signals are usually corrupted by noise, and need to be de-convoluted into an absolute copy number profile by analytical methods. Unfortunately, in contrast with the popularity of the tumor Affymetrix SNP array, the methods specifically designed for this platform are still limited. The complicated characteristics of noise in the signals are one of the difficulties in dissecting tumor Affymetrix SNP array data, as they inevitably blur the distinction between aberrations and create an obstacle to copy number aberration (CNA) identification. RESULTS: We propose a tool named TAFFYS for comprehensive analysis of tumor Affymetrix SNP array data. TAFFYS introduces a wavelet-based de-noising approach and a copy number-specific signal variance model for suppressing and modelling the noise in signals. Then a hidden Markov model is employed for copy number inference. Finally, by using the absolute copy number profile, the statistical significance of each aberration region is calculated in terms of different aberration types, including amplification, deletion and loss of heterozygosity (LOH). The results show that the copy number-specific variance model and the wavelet de-noising algorithm fit well with the Affymetrix SNP array signals, leading to more accurate estimation for diluted tumor samples (even with only 30% of cancer cells) than other existing methods. The examinations also demonstrate good compatibility and extensibility across different Affymetrix SNP array platforms. Application to 35 breast tumor samples shows that TAFFYS can automatically dissect the tumor samples and reveal statistically significant aberration regions where cancer-related genes are located.
CONCLUSIONS: TAFFYS provides an efficient and convenient tool for identifying copy number alterations and allelic imbalance and for assessing recurrent aberrations in tumor Affymetrix SNP array data.


Subject(s)
Genomics/methods , Neoplasms/genetics , Oligonucleotide Array Sequence Analysis/methods , Polymorphism, Single Nucleotide , DNA Copy Number Variations , Gene Expression Regulation, Neoplastic , Humans , Loss of Heterozygosity , Software
5.
IET Syst Biol ; 8(2): 24-32, 2014 Apr.
Article in English | MEDLINE | ID: mdl-25014222

ABSTRACT

The authors describe an integrated method for analysing cancer driver aberrations and disrupted pathways by using tumour single nucleotide polymorphism (SNP) arrays. The authors' new method adopts a novel statistical model to explicitly quantify the SNP signals, and thereby infers the genomic aberrations, including copy number alteration and loss of heterozygosity. Examination of the dilution series dataset shows that this method can correctly identify the genomic aberrations even in the presence of severe normal cell contamination in the tumour sample. Furthermore, with the results of the aberration identification obtained from multiple tumour samples, a permutation-based approach is proposed for identifying the statistically significant driver aberrations, which are further combined with the known signalling pathways for pathway enrichment analysis. By applying the approach to 286 hepatocellular tumour samples, they successfully uncover numerous driver aberration regions across the cancer genome, for example, chromosomes 4p and 5q, which harbour many known hepatocellular cancer related genes such as alpha-fetoprotein (AFP) and ectodermal-neural cortex (ENC1). In addition, they identify nine disrupted pathways that are highly enriched by the driver aberrations, including the systemic lupus erythematosus pathway and the vascular endothelial growth factor (VEGF) signalling pathway. These results support the feasibility and the utility of the proposed method for the characterisation of the cancer genome and the downstream analysis of the driver aberrations and the disrupted signalling pathways.


Subject(s)
Carcinoma, Hepatocellular/genetics , Liver Neoplasms/genetics , Neoplasms/genetics , Polymorphism, Single Nucleotide , Algorithms , Alleles , Chromosome Aberrations , Computational Biology , Data Interpretation, Statistical , Gene Expression Regulation, Neoplastic , Genotype , Humans , Loss of Heterozygosity , Markov Chains , Sequence Analysis, DNA , Signal Transduction
6.
PLoS One ; 9(1): e87212, 2014.
Article in English | MEDLINE | ID: mdl-24498045

ABSTRACT

Genomic copy number alteration and allelic imbalance are distinct features of cancer cells, and recent advances in genotyping technology have greatly boosted research on the cancer genome. However, the complicated nature of tumors usually hampers the dissection of SNP arrays. In this study, we describe a bioinformatic tool, named GIANT, for genome-wide identification of somatic aberrations from paired normal-tumor samples measured with SNP arrays. By efficiently incorporating genotype information from the matched normal sample, it accurately detects different types of aberrations in the cancer genome, even for aneuploid tumor samples with severe normal cell contamination. Furthermore, it allows for discovery of recurrent aberrations with critical biological properties in tumorigenesis by using a statistical significance test. We demonstrate the superior performance of the proposed method on various datasets including tumor replicate pairs, simulated SNP arrays and dilution series of normal-cancer cell lines. Results show that GIANT has the potential to detect genomic aberrations even when the cancer cell proportion is as low as 5-10%. Application to a large number of paired tumor samples delivers a genome-wide profile of the statistical significance of the various aberrations, including amplification, deletion and LOH. We believe that GIANT represents a powerful bioinformatic tool for interpreting complex genomic aberrations, thus assisting both academic study and the clinical treatment of cancer.


Subject(s)
DNA Copy Number Variations/genetics , Genome, Human/genetics , Genome-Wide Association Study/methods , Polymorphism, Single Nucleotide/genetics , Cell Line , Cell Line, Tumor , Genotype , Humans
7.
Amino Acids ; 46(4): 1069-78, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24452754

ABSTRACT

Reversible protein phosphorylation is one of the most important post-translational modifications and regulates various biological cellular processes. Identification of kinase-specific phosphorylation sites is helpful for understanding the phosphorylation mechanism and regulation processes. Although a number of computational approaches have been developed, few studies currently consider the hierarchical structure of kinases, and most of the existing tools use only local sequence information to construct predictive models. In this work, we conduct a systematic and hierarchy-specific investigation of protein phosphorylation site prediction in which protein kinases are clustered into hierarchical structures with four levels: kinase, subfamily, family and group. To enhance phosphorylation site prediction at all hierarchical levels, functional information of proteins, including gene ontology (GO) and protein-protein interaction (PPI), is adopted in addition to primary sequence to construct prediction models based on random forest. Analysis of selected GO and PPI features shows that functional information is critical in determining protein phosphorylation sites at every hierarchical level. Furthermore, the prediction results on Phospho.ELM and an additional testing dataset demonstrate that the proposed method remarkably outperforms existing phosphorylation prediction methods at all hierarchical levels. The proposed method is freely available at http://bioinformatics.ustc.edu.cn/phos_pred/.


Subject(s)
Protein Kinases/metabolism , Proteins/chemistry , Proteins/metabolism , Amino Acid Motifs , Computational Biology , Databases, Protein , Gene Regulatory Networks , Humans , Phosphorylation , Protein Interaction Maps
8.
PLoS One ; 8(10): e78197, 2013.
Article in English | MEDLINE | ID: mdl-24205155

ABSTRACT

BACKGROUND: As one of the most common types of co-regulatory motifs, feed-forward loops (FFLs) control many cell functions and play an important role in human cancers. Therefore, it is crucial to reconstruct and analyze cancer-related FFLs that are controlled by a transcription factor (TF) and a microRNA (miRNA) simultaneously, in order to find out how miRNAs and TFs cooperate with each other in cancer cells and how they contribute to carcinogenesis. Current FFL studies rely on predicted regulation information and therefore suffer from false positives in their prediction results. More critically, FFLs generated by existing approaches cannot represent the dynamic and conditional regulation relationships under different experimental conditions. METHODOLOGY/PRINCIPAL FINDINGS: In this study, we proposed a novel filter-wrapper feature selection method to accurately identify co-regulatory mechanisms by incorporating prior information from predicted regulatory interactions with parallel miRNA/mRNA expression datasets. By applying this method, we reconstructed 208 and 110 TF-miRNA co-regulatory FFLs from human pan-cancer and prostate datasets, respectively. Further analysis of these cancer-related FFLs showed that the top-ranking TF STAT3 and miRNA hsa-let-7e are key regulators implicated in human cancers, whose regulated targets are significantly enriched in cellular process regulation and signaling pathways involved in carcinogenesis. CONCLUSIONS/SIGNIFICANCE: In this study, we introduced an efficient computational approach to reconstruct co-regulatory FFLs by accurately identifying gene co-regulatory interactions. The strength of the proposed feature selection method lies in the fact that it can precisely filter out false positives in predicted regulatory interactions by quantitatively modeling the complex co-regulation of target genes mediated by TFs and miRNAs simultaneously.
Moreover, the proposed feature selection method can be generally applied to other gene regulation studies using parallel expression data with respect to different biological contexts.
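
To make the FFL motif concrete, the sketch below enumerates one common form of TF-miRNA feed-forward loop: a TF regulates a miRNA, and both regulate the same target gene. The function name, the dictionary layout and the placeholder identifiers are assumptions for illustration; the filter-wrapper feature selection step that the abstract describes for validating edges is not implemented here.

```python
def find_ffls(tf_to_mirna, tf_to_gene, mirna_to_gene):
    """Return sorted (TF, miRNA, gene) triples forming a TF-driven FFL."""
    ffls = []
    for tf, mirnas in tf_to_mirna.items():
        tf_targets = tf_to_gene.get(tf, set())
        for mirna in mirnas:
            # A loop closes when the TF and its miRNA share a target gene.
            shared = tf_targets & mirna_to_gene.get(mirna, set())
            ffls.extend((tf, mirna, gene) for gene in shared)
    return sorted(ffls)

# Placeholder regulatory edges (hypothetical names, not real interactions).
loops = find_ffls(
    tf_to_mirna={"TF1": {"miR-a"}},
    tf_to_gene={"TF1": {"g1", "g2"}},
    mirna_to_gene={"miR-a": {"g2", "g3"}},
)
print(loops)
```

Because every edge in such an enumeration comes from predicted interactions, the number of loops found is only as reliable as the edge sets passed in, which is the false-positive problem the abstract raises.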


Subject(s)
Gene Expression Regulation, Neoplastic/genetics , Gene Regulatory Networks/genetics , MicroRNAs/genetics , Neoplasms/genetics , Transcription Factors/genetics , Carcinogenesis/genetics , Humans , Models, Genetic , STAT3 Transcription Factor/genetics , Signal Transduction/genetics
9.
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi ; 28(3): 579-81, 586, 2011 Jun.
Article in Chinese | MEDLINE | ID: mdl-21774227

ABSTRACT

A new method for the automatic detection of brain lesions based on wavelet feature vectors of CT images is proposed in the present paper. Firstly, we created training samples by manually segmenting normal CT images into gray matter, white matter and cerebrospinal fluid sub-images. Then, we obtained the cluster centers using the FCM clustering algorithm. When detecting lesions, the CT images to be examined were automatically segmented into sub-images, with a certain degree of over-segmentation allowed under the premise of ensuring accuracy as much as possible. Then we extended these sub-images and extracted their features to compute the distances to the cluster centers and to determine whether they belonged to one of the three kinds of normal samples or, otherwise, to lesions. The proposed method was verified by experiments.


Subject(s)
Brain Neoplasms/diagnostic imaging , Image Interpretation, Computer-Assisted/methods , Intracranial Hemorrhages/diagnostic imaging , Tomography, X-Ray Computed/methods , Wavelet Analysis , Brain/diagnostic imaging , Electronic Data Processing/methods , Humans
10.
Article in English | MEDLINE | ID: mdl-19963990

ABSTRACT

In this paper, we present a method that detects intracranial space-occupying lesions in two-dimensional (2D) high-resolution CT (HRCT) images of the brain. The statistical texture atlas technique localizes anatomical variation in the gray-level distribution of brain images and, in turn, identifies the regions with lesions. The statistical texture atlas involves 147 HRCT slices of normal individuals, and its construction is extremely time-consuming. To improve the performance of atlas construction, we have implemented the pixel-wise texture extraction procedure on an Nvidia 8800GTX GPU with the Compute Unified Device Architecture (CUDA) platform. Experimental results indicate that the extracted texture feature is distinctive and robust enough, and is suitable for detecting uniform and mixed density space-occupying lesions. In addition, a significant speedup over a straightforward CPU version was achieved with CUDA.


Subject(s)
Brain Diseases/diagnostic imaging , Brain Diseases/diagnosis , Diagnosis, Computer-Assisted/methods , Tomography, X-Ray Computed/methods , Algorithms , Biomedical Engineering , Biostatistics , Brain/anatomy & histology , Brain/diagnostic imaging , Computer Graphics , Diagnosis, Computer-Assisted/statistics & numerical data , Humans , Reference Values , Tomography, X-Ray Computed/statistics & numerical data
11.
Zhongguo Yi Liao Qi Xie Za Zhi ; 33(3): 172-5, 2009 Mar.
Article in Chinese | MEDLINE | ID: mdl-19771889

ABSTRACT

Based on an in-depth analysis of existing fingerprint identification algorithms, this article proposes an integrated solution for adopting fingerprint identification technology in the Electronic Medical Records System (EMRS). It may improve the security of the EMRS and effectively raise the working efficiency of physicians.


Subject(s)
Dermatoglyphics , Medical Records Systems, Computerized , Algorithms , Humans
12.
Zhongguo Yi Liao Qi Xie Za Zhi ; 33(2): 83-6, 149, 2009 Mar.
Article in Chinese | MEDLINE | ID: mdl-19565789

ABSTRACT

The user experience (EX) of current Electronic Medical Record (EMR) systems needs to be improved. This paper proposes a new method to enhance the EX of EMR systems. Firstly, system templates and text characterization are used to structure the EMR data. Then, the structured data are mined using association rule mining over incrementally updated data to find associations between the elements of the EMR template and the values of those elements. Finally, with the help of the mined results, EMR users are able to input data effectively and quickly.
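
Association rules of the kind described here are usually scored by support and confidence. The sketch below computes both over a handful of structured records; the record contents, element names and function names are hypothetical, and the incremental-update mining described in the abstract is not reproduced.

```python
def support(records, items):
    """Fraction of records that contain every item in `items`."""
    items = set(items)
    return sum(1 for r in records if items <= set(r)) / len(records)

def confidence(records, antecedent, consequent):
    """Estimated P(consequent | antecedent) over the records."""
    both = set(antecedent) | set(consequent)
    return support(records, both) / support(records, antecedent)

# Hypothetical structured EMR entries as "element=value" items.
records = [
    {"complaint=cough", "exam=chest x-ray"},
    {"complaint=cough", "exam=chest x-ray"},
    {"complaint=cough", "exam=blood test"},
    {"complaint=headache", "exam=blood test"},
]
sup = support(records, {"complaint=cough"})
conf = confidence(records, {"complaint=cough"}, {"exam=chest x-ray"})
print(sup, round(conf, 3))
```

A data-entry form could use a high-confidence rule to pre-fill or suggest the consequent value once the antecedent has been entered, which is one way to read the abstract's claim about faster input.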


Subject(s)
Data Mining/methods , Electronic Health Records , Medical Records Systems, Computerized , Information Systems , User-Computer Interface
13.
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi ; 26(2): 258-63, 2009 Apr.
Article in Chinese | MEDLINE | ID: mdl-19499782

ABSTRACT

Grey system theory was applied to the analysis of the electroencephalogram (EEG) to extract features of driving fatigue in this study. The GM(1,1) model was built for EEG collected during simulated driving experiments. At the same time, the data on steering wheel movements and subjective fatigue level were analyzed as a reference. The experimental results reveal that the covariance of the GM(1,1) parameters a and b, cov(a,b), coincides with the standard deviation of steering wheel movements. This indicates that grey system theory is effective for EEG analysis and that the parameters of GM(1,1) reflect changes in driving fatigue well.
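
For readers unfamiliar with GM(1,1), the sketch below estimates its two parameters using the standard formulation: the series is accumulated, background values are taken as means of consecutive accumulated values, and a and b are fitted by least squares to x0[k] + a * z1[k] = b. The function name and the toy input are assumptions; the abstract does not describe the EEG windowing or preprocessing, so none is shown.

```python
from itertools import accumulate

def fit_gm11(x0):
    """Least-squares estimate of (a, b) in x0[k] + a * z1[k] = b."""
    if len(x0) < 3:
        raise ValueError("need at least three samples")
    x1 = list(accumulate(x0))  # accumulated generating series
    # Background values: mean of consecutive accumulated values.
    z = [(x1[k - 1] + x1[k]) / 2 for k in range(1, len(x1))]
    y = x0[1:]
    n = len(z)
    sz, szz = sum(z), sum(v * v for v in z)
    sy, szy = sum(y), sum(zv * yv for zv, yv in zip(z, y))
    det = n * szz - sz * sz  # determinant of the 2x2 normal equations
    a = (sz * sy - n * szy) / det
    b = (szz * sy - sz * szy) / det
    return a, b

# Toy positive series: a constant signal gives a = 0 and b = that constant.
a, b = fit_gm11([2.0, 2.0, 2.0, 2.0])
print(a, b)
```

Fitting `fit_gm11` on successive signal windows would yield a sequence of (a, b) pairs, from which a statistic such as cov(a,b) could then be computed.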


Subject(s)
Automobile Driving/psychology , Computer Simulation , Electroencephalography , Fatigue/physiopathology , Models, Theoretical , Adult , Electroencephalography/methods , Humans , Male
14.
Zhongguo Yi Liao Qi Xie Za Zhi ; 33(1): 7-10, 2009 Jan.
Article in Chinese | MEDLINE | ID: mdl-19459342

ABSTRACT

To detect lesions in brain CT images automatically, a statistical atlas of attribute vectors (SAAV) was designed and created to describe the multiple features of medical images. By comparing the features of the study image with those of the SAAV, we successfully detected various kinds of brain lesions. It was demonstrated that the algorithm is effective in detecting various kinds of lesions found on brain CT images. Further studies are needed to make the algorithm more acceptable.


Subject(s)
Algorithms , Brain Diseases/diagnostic imaging , Image Processing, Computer-Assisted , Numerical Analysis, Computer-Assisted , Humans , Tomography, X-Ray Computed
15.
Bioinformation ; 2(7): 301-3, 2008 Apr 11.
Article in English | MEDLINE | ID: mdl-18478083

ABSTRACT

Gene selection aims to detect the most significantly expressed genes under different conditions in expression data. The current challenge in gene selection is the comparison of a large number of genes with limited patient samples. Thus it is not a trivial task for simple statistical analysis. Various statistical measurements are adopted by filter methods applied in gene selection studies. Their ability to discriminate phenotypes is crucial in classification and selection. Here we describe the standard deviation error distribution (SDED) method for gene selection. It utilizes within-class and among-class variations in gene expression data. We tested the method using 4 leukemia datasets available in the public domain. The method was compared with the GS2 and CHO methods. The prediction accuracies of SDED are better than those of both GS2 and CHO for different datasets: they are 0.8-4.2% and 1.6-8.4% higher than those of GS2 and CHO, respectively. The related OMIM annotations and KEGG pathway analyses verified that SDED can pick out 4.0% and 6.1% more genes with biological significance than GS2 and CHO, respectively.

16.
Zhongguo Yi Liao Qi Xie Za Zhi ; 31(3): 185-8, 2007 May.
Article in Chinese | MEDLINE | ID: mdl-17672364

ABSTRACT

A spectrometer is one of the most important parts of a Magnetic Resonance Imaging (MRI) system. This paper describes the design of a digital MRI spectrometer. It is constructed on a PXI platform with several data acquisition boards and a high-resolution timing board. All functions of an MRI spectrometer are realized by specially designed software. The software architecture and its implementation details are discussed, and experimental results are presented.


Subject(s)
Magnetic Resonance Imaging/instrumentation , Magnetic Resonance Imaging/methods , Equipment Design , Signal Processing, Computer-Assisted , Software Design
17.
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi ; 24(2): 249-52, 2007 Apr.
Article in Chinese | MEDLINE | ID: mdl-17591235

ABSTRACT

Detrended fluctuation analysis (DFA) is suited to studying long-range power-law correlations in non-stationary time series. In this paper, to elucidate the characteristics of different sleep stages, DFA is adopted to analyze physiological data collected during sleep. Parameters such as the electroencephalogram (EEG), R-R interval sequence and stroke volume (SV) are analyzed, and the scaling exponent alpha is calculated. The experimental results reveal that the values of alpha differ considerably between sleep stages; that EEG and SV follow similar rules, with alpha increasing as sleep deepens; and that the R-R interval sequence shows the inverse, with alpha decreasing as sleep deepens. These results indicate that DFA is practical for the analysis of physiological parameters.
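
The scaling exponent discussed in this abstract can be sketched with a textbook first-order DFA: integrate the mean-centred signal, remove a linear trend in each non-overlapping window of length n, and take alpha as the slope of log F(n) against log n. The function names, window sizes and the white-noise test signal are assumptions for illustration; the study's actual signals and settings are not given in the abstract.

```python
import math
import random

def _linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def dfa_alpha(series, scales):
    """Estimate the DFA scaling exponent with linear (order-1) detrending."""
    mean = sum(series) / len(series)
    profile, total = [], 0.0
    for v in series:  # integrate the mean-centred signal
        total += v - mean
        profile.append(total)
    log_n, log_f = [], []
    for n in scales:
        t = list(range(n))
        sq_sum, count = 0.0, 0
        # Non-overlapping windows; a trailing partial window is dropped.
        for start in range(0, len(profile) - n + 1, n):
            window = profile[start:start + n]
            slope, icpt = _linear_fit(t, window)
            sq_sum += sum((w - (slope * i + icpt)) ** 2
                          for i, w in zip(t, window))
            count += n
        log_n.append(math.log(n))
        log_f.append(0.5 * math.log(sq_sum / count))  # log RMS fluctuation
    alpha, _ = _linear_fit(log_n, log_f)
    return alpha

# Uncorrelated Gaussian noise is expected to give alpha near 0.5.
rng = random.Random(0)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
alpha = dfa_alpha(white_noise, scales=[16, 32, 64, 128])
print(round(alpha, 2))
```

Values of alpha above 0.5 indicate persistent long-range correlations and values below 0.5 indicate anti-persistence, which is the scale on which the sleep-stage differences in the abstract would be read.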


Subject(s)
Electrocardiography/statistics & numerical data , Electroencephalography/statistics & numerical data , Signal Processing, Computer-Assisted , Sleep Stages/physiology , Data Interpretation, Statistical , Humans , Polysomnography
18.
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi ; 24(2): 444-8, 2007 Apr.
Article in Chinese | MEDLINE | ID: mdl-17591278

ABSTRACT

This paper is devoted to predicting the transmembrane helices in proteins by statistical modeling. A novel segment-training algorithm for hidden Markov modeling, based on the biological characteristics of transmembrane proteins, is introduced for training and for predicting the topological characteristics of transmembrane helices, such as location and orientation. Compared to the standard Baum-Welch training algorithm, this algorithm has lower complexity, while its prediction performance is better than or at least comparable to that of other existing methods. In a 10-fold cross-validation test on a database containing 160 transmembrane proteins, an HMM trained with this algorithm outperformed two other prediction methods, TMHMM and MEMSAT; the novel method was validated by its prediction sensitivity (97.0%) and correct location (91.3%). The results showed that this algorithm is an efficient and reasonable supplement to the modeling and prediction of transmembrane helices.


Subject(s)
Algorithms , Membrane Proteins/chemistry , Models, Statistical , Protein Conformation , Data Interpretation, Statistical , Mathematical Computing
19.
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi ; 23(5): 960-3, 2006 Oct.
Article in Chinese | MEDLINE | ID: mdl-17121331

ABSTRACT

Mental workload research is important to people's health and work efficiency. Psychophysiological measures such as electroencephalography (EEG), ECG and respiration measures can be used to predict mental workload level. A multi-channel phase-space reconstruction method is proposed in this paper, which rearranges the signal series by their correlation coefficients and selects the time delay by signal determinism. A study of determinism and correlation dimension on simulated data shows good performance. The results for EEG series show clear consistency with variations in workload level. The method is useful for nonlinear analysis of multi-channel signals and for mental workload detection.


Subject(s)
Electroencephalography , Mental Processes/physiology , Task Performance and Analysis , Adult , Algorithms , Humans , Nonlinear Dynamics , Signal Processing, Computer-Assisted , Workload
20.
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi ; 23(5): 1109-13, 2006 Oct.
Article in Chinese | MEDLINE | ID: mdl-17121365

ABSTRACT

Residues in protein sequences can be classified into two (exposed/buried) or three (exposed/intermediate/buried) states according to their relative solvent accessibility. A Markov chain model (MCM) was adopted for statistical modeling and prediction. Different orders of MCM and classification thresholds were explored to find the best parameters. Prediction results for two different data sets and different cut-off thresholds were evaluated and compared with some existing methods, such as neural network, information theory and support vector machine approaches. The best prediction accuracies achieved by the MCM method were 78.9% for the two-state prediction problem and 67.7% for the three-state prediction problem. A comprehensive comparison of all these results shows that the prediction accuracy and the correlation coefficient of the MCM method are better than or comparable to those obtained by the other prediction methods. At the same time, the method has the advantage of lower computational complexity and shorter running time.


Subject(s)
Computational Biology/methods , Markov Chains , Models, Chemical , Models, Molecular , Proteins/classification , Sequence Analysis, Protein/methods , Algorithms , Databases, Protein , Proteins/chemistry , Solubility