1.
World J Pediatr ; 2024 Feb 24.
Article in English | MEDLINE | ID: mdl-38401044

ABSTRACT

INTRODUCTION: Methylmalonic acidemia (MMA) is a disorder of autosomal recessive inheritance, with an estimated prevalence of 1:50,000. First-tier clinical diagnostic tests often return many false positives [five false positive (FP): one true positive (TP)]. In this work, our goal was to refine a classification model that can minimize the number of false positives, currently an unmet need in the upstream diagnostics of MMA. METHODS: We developed machine learning multivariable screening models for MMA with utility as a secondary-tier tool for false positives reduction. We utilized mass spectrometry-based features consisting of 11 amino acids and 31 carnitines derived from dried blood samples of neonatal patients, followed by additional ratio feature construction. Feature selection strategies (selection by filter, recursive feature elimination, and learned vector quantization) were used to determine the input set for evaluating the performance of 14 classification models to identify a candidate model set for an ensemble model development. RESULTS: Our work identified computational models that explore metabolic analytes to reduce the number of false positives without compromising sensitivity. The best results [area under the receiver operating characteristic curve (AUROC) of 97%, sensitivity of 92%, and specificity of 95%] were obtained utilizing an ensemble of the algorithms random forest, C5.0, sparse linear discriminant analysis, and autoencoder deep neural network stacked with the algorithm stochastic gradient boosting as the supervisor. The model achieved a good performance trade-off for a screening application with 6% false-positive rate (FPR) at 95% sensitivity, 35% FPR at 99% sensitivity, and 39% FPR at 100% sensitivity. CONCLUSIONS: The classification results and approach of this research can be utilized by clinicians globally, to improve the overall discovery of MMA in pediatric patients. 
The improved method, when adjusted to 100% precision, can be used to further inform the diagnostic process journey of MMA and help reduce the burden for patients and their families.
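
The operating points quoted in the abstract above (false-positive rate at a fixed sensitivity) can be read off any scored validation set. The following is a minimal sketch of that calculation only, not the authors' pipeline; the function name, the `scores`/`labels` inputs, and the toy values are illustrative assumptions.

```python
def fpr_at_sensitivity(scores, labels, target_sensitivity):
    """Find the highest score threshold whose sensitivity reaches the target.

    A sample is called positive when score >= threshold.
    Returns (threshold, sensitivity, false_positive_rate).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")

    # Lowering the threshold step by step can only raise sensitivity,
    # so the first threshold that meets the target is the strictest one.
    for threshold in sorted(set(scores), reverse=True):
        sensitivity = sum(s >= threshold for s in positives) / len(positives)
        if sensitivity >= target_sensitivity:
            fpr = sum(s >= threshold for s in negatives) / len(negatives)
            return threshold, sensitivity, fpr
    raise ValueError("target sensitivity must be at most 1.0")


# Hypothetical model scores and true labels (1 = confirmed case).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0]
print(fpr_at_sensitivity(toy_scores, toy_labels, 1.0))
```

Demanding higher sensitivity forces a lower threshold, which is why the reported false-positive rate climbs as the sensitivity target approaches 100%.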

2.
Orphanet J Rare Dis ; 19(1): 51, 2024 Feb 08.
Article in English | MEDLINE | ID: mdl-38331897

ABSTRACT

BACKGROUND: Pitt-Hopkins syndrome (PTHS) is a neurodevelopmental disorder that remains underdiagnosed and its clinical presentations and mutation profiles in a diverse population are yet to be evaluated. This retrospective study aims to investigate the clinical and genetic characteristics of Chinese patients with PTHS. METHODS: The clinical, biochemical, genetic, therapeutic, and follow-up data of 47 pediatric patients diagnosed with PTHS between 2018 and 2021 were retrospectively analyzed. RESULTS: The Chinese PTHS patients presented with specific facial features and exhibited global developmental delay of wide severity range. The locus heterogeneity of the TCF4 gene in the patients was highlighted, emphasizing the significance of genetic studies for accurate diagnosis, albeit no significant correlations between genotype and phenotype were observed in this cohort. The study also reports the outcomes of patients who underwent therapeutic interventions, such as ketogenic diets and biomedical interventions. CONCLUSIONS: The findings of this retrospective analysis expand the phenotypic and molecular spectra of PTHS patients. The study underscores the need for a long-term prospective follow-up study to assess potential therapeutic interventions.


Subject(s)
Intellectual Disability , Child , Humans , Retrospective Studies , Follow-Up Studies , Prospective Studies , Transcription Factor 4/genetics , Intellectual Disability/genetics , Intellectual Disability/diagnosis , Hyperventilation/genetics , Hyperventilation/diagnosis , Facies , China
4.
Orphanet J Rare Dis ; 18(1): 102, 2023 05 02.
Article in English | MEDLINE | ID: mdl-37189159

ABSTRACT

BACKGROUND: The peroxisome is a ubiquitous single membrane-enclosed organelle with an important metabolic role. Peroxisomal disorders represent a class of medical conditions caused by deficiencies in peroxisome function and are segmented into enzyme-and-transporter defects (defects in single peroxisomal proteins) and peroxisome biogenesis disorders (defects in the peroxin proteins, critical for normal peroxisome assembly and biogenesis). In this study, we employed multivariate supervised and non-supervised statistical methods and utilized mass spectrometry data of neurological patients, peroxisomal disorder patients (X-linked adrenoleukodystrophy and Zellweger syndrome), and healthy controls to analyze the role of common metabolites in peroxisomal disorders, to develop and refine classification models of X-linked adrenoleukodystrophy and Zellweger syndrome, and to explore analytes with utility in rapid screening and diagnostics. RESULTS: t-SNE, PCA, and (sparse) PLS-DA, applied to mass spectrometry data of patients and healthy controls, were utilized in this study. The performance of exploratory PLS-DA models was assessed to determine a suitable number of latent components and variables to retain for sparse PLS-DA models. Reduced-feature (sparse) PLS-DA models achieved excellent classification performance for X-linked adrenoleukodystrophy and Zellweger syndrome patients. CONCLUSIONS: Our study demonstrated metabolic differences between healthy controls, neurological patients, and peroxisomal disorder (X-linked adrenoleukodystrophy and Zellweger syndrome) patients, refined classification models, and showed the potential utility of hexacosanoylcarnitine (C26:0-carnitine) as a screening analyte for Chinese patients in the context of a multivariate discriminant model predictive of peroxisomal disorders.


Subject(s)
Adrenoleukodystrophy , Peroxisomal Disorders , Zellweger Syndrome , Child , Humans , Adrenoleukodystrophy/diagnosis , East Asian People , Multivariate Analysis , Peroxisomal Disorders/diagnosis , Peroxisomal Disorders/metabolism , Zellweger Syndrome/diagnosis , Zellweger Syndrome/metabolism , China
5.
Commun Biol ; 5(1): 975, 2022 09 16.
Article in English | MEDLINE | ID: mdl-36114280

ABSTRACT

The quality control of variants from whole-genome sequencing data is vital in clinical diagnosis and human genetics research. However, current filtering methods (Frequency, Hard-Filter, VQSR, GARFIELD, and VEF) were developed for particular variant callers and have certain limitations. In particular, the number of true variants these methods eliminate far exceeds the number of false variants they remove. Here, we present FVC, an adaptive method for quality control of genetic variants from different analysis pipelines, and validate it on the variants generated by four popular variant callers (GATK HaplotypeCaller, Mutect2, Varscan2, and DeepVariant). FVC consistently exhibited the best performance. It removed far more false variants than the current state-of-the-art filtering methods and recalled ~51-99% of the true variants filtered out by the other methods. Once trained, FVC can be conveniently integrated into a user-specific variant calling pipeline.


Subject(s)
Exome , High-Throughput Nucleotide Sequencing , High-Throughput Nucleotide Sequencing/methods , Humans , Polymorphism, Single Nucleotide , Software , Whole Genome Sequencing
6.
Methods Inf Med ; 60(5-06): 123-132, 2021 12.
Article in English | MEDLINE | ID: mdl-34695871

ABSTRACT

BACKGROUND: AI-enabled Clinical Decision Support Systems (AI + CDSSs) were heralded to contribute greatly to the advancement of health care services. Increasing amounts of funding and technical expertise are being invested in projects and proposals targeting the building and implementation of such systems. Therefore, understanding the actual system implementation status in clinical practice is imperative. OBJECTIVES: The aim of the study is to understand (1) the current situation of AI + CDSS clinical implementations in Chinese hospitals and (2) concerns regarding current and future AI + CDSS implementations. METHODS: We investigated 160 tertiary hospitals from six provinces and province-level cities. Descriptive analysis, two-sided Fisher exact test, and Mann-Whitney U-test were utilized for analysis. RESULTS: Thirty-eight of the surveyed hospitals (23.75%) had implemented AI + CDSSs. There were statistically significant differences in grade, scale, and medical volume between the two groups of hospitals (implemented vs. not-implemented AI + CDSSs, p <0.05). On the 5-point Likert scale, 81.58% (31/38) of respondents rated their overall satisfaction with the systems as "just neutral" to "satisfied." The three most common concerns were system function improvement and integration into the clinical process, data quality and availability, and methodological bias. CONCLUSION: While AI + CDSSs were not yet widespread in Chinese clinical settings, professionals recognize the potential benefits and challenges regarding in-hospital AI + CDSSs.


Subject(s)
Decision Support Systems, Clinical , Artificial Intelligence , China , Hospitals , Surveys and Questionnaires
7.
Neurogenetics ; 22(3): 161-169, 2021 07.
Article in English | MEDLINE | ID: mdl-34128147

ABSTRACT

Pitt-Hopkins syndrome is an underdiagnosed neurodevelopmental disorder which is characterized by specific facial features, early-onset developmental delay, and moderate to severe intellectual disability. The genetic cause, a deficiency of the TCF4 gene, has been established; however, the underlying pathological mechanisms of this disease are still unclear. Herein, we report four unrelated children with different de novo mutations (T606A, K607E, R578C, and V617I) located at highly conserved sites and with clinical phenotypes which present variable degrees of developmental delay and intellectual disability. Three of these four missense mutations have not yet been reported. The patient with V617I mutation exhibits mild intellectual disability and has attained more advanced motor and verbal skills, which is significantly different from other cases reported to date. Molecular dynamics simulations are used to explore the atomic level mechanism of how missense mutations impair the functions of TCF4. Mutations T606A, K607E, and R578C are found to affect DNA binding directly or indirectly, while V617I only induces subtle conformational changes, which is consistent with the milder clinical phenotype of the corresponding patient. The study expands the mutation spectrum and phenotypic characteristics of Pitt-Hopkins syndrome, and reinforces the genotype-phenotype correlation and strengthens the understanding of phenotype variability, which is helpful for further investigation of pathogenetic mechanisms and improved genetic counseling.


Subject(s)
Genetic Association Studies , Hyperventilation/genetics , Intellectual Disability/genetics , Mutation, Missense/genetics , Phenotype , Child , Child, Preschool , Facies , Female , Genetic Association Studies/methods , Genotype , Humans , Infant , Male , Transcription Factor 4/genetics
8.
Orphanet J Rare Dis ; 16(1): 262, 2021 06 08.
Article in English | MEDLINE | ID: mdl-34103049

ABSTRACT

BACKGROUND: Rare diseases are ailments which impose a heavy burden on individual patients and global society as a whole. The rare disease management landscape is not a smooth one: a rare disease is quite often hard to diagnose, treat, and investigate. In China, the country's rapid economic rise and development has brought an increased focus on rare diseases. At present, there is a growing focus placed on the importance and public health priority of rare diseases and on improving awareness, definitions, and treatments. METHODS: In this work we utilized clinical data from the Shanghai HIE System to characterize the status of 33 rare diseases with effective treatment in Shanghai for the time period of 2013-2016. RESULTS AND CONCLUSION: First, we describe the total number of patients, year-to-year change in new patients with diagnosis in one of the target diseases and the distribution of gender and age for the top six (by patient number) diseases of the set of 33 rare diseases. Second, we describe the hospitalization burden in terms of in-hospital ratio, length of stay, and medical expenses during hospitalization. Finally, rare disease period prevalence is calculated for the rare diseases set.


Subject(s)
Hospitalization , Rare Diseases , China , Hospitals , Humans , Prevalence
9.
J Med Internet Res ; 23(6): e25929, 2021 06 02.
Article in English | MEDLINE | ID: mdl-34076581

ABSTRACT

BACKGROUND: Clinical decision support systems are designed to utilize medical data, knowledge, and analysis engines and to generate patient-specific assessments or recommendations to health professionals in order to assist decision making. Artificial intelligence-enabled clinical decision support systems aid the decision-making process through an intelligent component. Well-defined evaluation methods are essential to ensure the seamless integration and contribution of these systems to clinical practice. OBJECTIVE: The purpose of this study was to develop and validate a measurement instrument and test the interrelationships of evaluation variables for an artificial intelligence-enabled clinical decision support system evaluation framework. METHODS: An artificial intelligence-enabled clinical decision support system evaluation framework consisting of 6 variables was developed. A Delphi process was conducted to develop the measurement instrument items. Cognitive interviews and pretesting were performed to refine the questions. Web-based survey response data were analyzed to remove irrelevant questions from the measurement instrument, to test dimensional structure, and to assess reliability and validity. The interrelationships of relevant variables were tested and verified using path analysis, and a 28-item measurement instrument was developed. Measurement instrument survey responses were collected from 156 respondents. RESULTS: The Cronbach α of the measurement instrument was 0.963, and its content validity was 0.943. Values of average variance extracted ranged from 0.582 to 0.756, and values of the heterotrait-monotrait ratio ranged from 0.376 to 0.896. The final model had a good fit (χ²₂₆=36.984; P=.08; comparative fit index 0.991; goodness-of-fit index 0.957; root mean square error of approximation 0.052; standardized root mean square residual 0.028). Variables in the final model accounted for 89% of the variance in the user acceptance dimension. CONCLUSIONS: User acceptance is the central dimension of artificial intelligence-enabled clinical decision support system success. Acceptance was directly influenced by perceived ease of use, information quality, service quality, and perceived benefit. Acceptance was also indirectly influenced by system quality and information quality through perceived ease of use. User acceptance and perceived benefit were interrelated.


Subject(s)
Decision Support Systems, Clinical , Artificial Intelligence , Humans , Reproducibility of Results , Surveys and Questionnaires
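
The Cronbach α reported in the abstract above summarizes the internal consistency of a multi-item instrument. As a rough illustration of the standard formula, α = k/(k-1) · (1 - Σ item variances / variance of total scores), here is a minimal sketch. The function name and the toy response matrix are assumptions, and this is not the authors' analysis code.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses sample variances; undefined (division by zero) when every
    respondent has the same total score.
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")

    # Variance of each item (column) across respondents.
    item_variances = [
        variance([row[j] for row in responses]) for j in range(n_items)
    ]
    # Variance of each respondent's summed score.
    total_variance = variance([sum(row) for row in responses])

    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 4 respondents x 2 Likert items.
toy_responses = [[1, 2], [2, 1], [3, 4], [4, 3]]
print(cronbach_alpha(toy_responses))
```

Items that move together inflate the variance of the total score relative to the summed item variances, which pushes α toward 1.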
10.
Genomics Proteomics Bioinformatics ; 19(4): 534-548, 2021 08.
Article in English | MEDLINE | ID: mdl-33713851

ABSTRACT

Transcriptional regulators (TRs) participate in essential processes in cancer pathogenesis and are critical therapeutic targets. Identification of drug response-related TRs from cell line-based compound screening data is often challenging due to low mRNA abundance of TRs, protein modifications, and other confounders (CFs). In this study, we developed a regression-based pharmacogenomic and ChIP-seq data integration method (RePhine) to infer the impact of TRs on drug response through integrative analyses of pharmacogenomic and ChIP-seq data. RePhine was evaluated in simulation and pharmacogenomic data and was applied to pan-cancer datasets with the goal of biological discovery. In simulation data with added noises or CFs and in pharmacogenomic data, RePhine demonstrated an improved performance in comparison with three commonly used methods (including Pearson correlation analysis, logistic regression model, and gene set enrichment analysis). Utilizing RePhine and Cancer Cell Line Encyclopedia data, we observed that RePhine-derived TR signatures could effectively cluster drugs with different mechanisms of action. RePhine predicted that loss-of-function of EZH2/PRC2 reduces cancer cell sensitivity toward the BRAF inhibitor PLX4720. Experimental validation confirmed that pharmacological EZH2 inhibition increases the resistance of cancer cells to PLX4720 treatment. Our results support that RePhine is a useful tool for inferring drug response-related TRs and for potential therapeutic applications. The source code for RePhine is freely available at https://github.com/coexps/RePhine.


Subject(s)
Neoplasms , Transcription Factors , Humans , Neoplasms/drug therapy , Neoplasms/genetics , Pharmacogenetics , Protein Processing, Post-Translational , Software , Transcription Factors/genetics , Transcription Factors/metabolism
11.
FEBS Open Bio ; 11(7): 1841-1853, 2021 07.
Article in English | MEDLINE | ID: mdl-33085832

ABSTRACT

Understanding the regulation of cardiac muscle contraction at a molecular level is crucial for the development of therapeutics for heart conditions. Despite the availability of atomic structures of the protein components of cardiac muscle thin filaments, detailed insights into their dynamics and response to calcium are yet to be fully depicted. In this study, we used molecular dynamics simulations of the core domains of the cardiac muscle protein troponin to characterize the equilibrium dynamics of its calcium-bound and calcium-free forms, with a focus on elements of cardiac muscle contraction activation and deactivation, that is, calcium binding to the cardiac troponin Ca2+ -binding subunit (TnC) and the release of the switch region of the troponin inhibitory subunit (TnI) from TnC. The process of calcium binding to the TnC binding site is described as a three-step process commencing with calcium capture by the binding site residues, followed by cooperative residue interplay bringing the calcium ion to the binding site, and finally, calcium-water exchange. Furthermore, we uncovered a set of TnC-TnI interdomain interactions that are critical for TnC N-lobe hydrophobic pocket dynamics. Absence of these interactions allows the closure of the TnC N-lobe hydrophobic pocket while the TnI switch region remains expelled, whereas if the interactions are maintained, the hydrophobic pocket remains open. Modification of these interactions may fine-tune the ability of the TnC N-lobe hydrophobic pocket to close or remain open, modulate cardiac contractility and present potential therapy-relevant targets.


Subject(s)
Calcium , Troponin C , Calcium/metabolism , Molecular Dynamics Simulation , Signal Transduction , Troponin C/chemistry , Troponin C/metabolism , Troponin I/chemistry , Troponin I/metabolism
12.
Front Bioeng Biotechnol ; 8: 573866, 2020.
Article in English | MEDLINE | ID: mdl-33195135

ABSTRACT

Nuclei segmentation is a fundamental but challenging task in histopathological image analysis. One of the main problems is the existence of overlapping regions, which increases the difficulty of separating individual nuclei. In this study, to segment both nuclei and overlapping regions, we introduce a nuclei segmentation method based on a two-stage learning framework consisting of two connected Stacked U-Nets (SUNets). Each proposed SUNet consists of four parallel backbone nets, which are merged by the attention generation model. In the first stage, a Stacked U-Net is utilized to predict pixel-wise segmentation of nuclei. The output binary map is concatenated with the RGB values of the original images to form the input of the second-stage SUNet. Due to the sizable imbalance of overlapping and background regions, the first network is trained with cross-entropy loss, while the second network is trained with focal loss. We applied the method to two publicly available datasets and achieved state-of-the-art performance for nuclei segmentation: mean Aggregated Jaccard Index (AJI) results were 0.5965 and 0.6210, and F1 scores were 0.8247 and 0.8060, respectively; our method also segmented the overlapping regions between nuclei, with average AJI = 0.3254. The proposed two-stage learning framework outperforms many current segmentation methods, and the consistently good segmentation performance on images from different organs indicates the generalized adaptability of our approach.
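
The abstract above contrasts cross-entropy loss (first network) with focal loss (second network, where overlapping pixels are rare). The per-pixel sketch below illustrates the general binary focal loss, FL = -α_t (1 - p_t)^γ log(p_t). The abstract does not give `gamma` or `alpha`, so the defaults here are assumptions, as are the function names.

```python
import math


def binary_cross_entropy(p, y, eps=1e-12):
    """Cross-entropy for one pixel: p is the predicted probability of class 1."""
    p_t = p if y == 1 else 1.0 - p  # probability assigned to the true class
    return -math.log(max(p_t, eps))


def binary_focal_loss(p, y, gamma=2.0, alpha=None, eps=1e-12):
    """Focal loss for one pixel; gamma=0 and alpha=None recovers cross-entropy.

    gamma down-weights well-classified pixels; alpha (optional) re-weights
    the positive class against the negative class.
    """
    p_t = p if y == 1 else 1.0 - p
    class_weight = 1.0 if alpha is None else (alpha if y == 1 else 1.0 - alpha)
    return -class_weight * (1.0 - p_t) ** gamma * math.log(max(p_t, eps))


# An easy pixel (p_t = 0.9) is down-weighted far more than a hard one (p_t = 0.1).
for prob in (0.9, 0.1):
    print(prob, binary_cross_entropy(prob, 1), binary_focal_loss(prob, 1))
```

Because the modulating factor (1 - p_t)^γ shrinks the contribution of confidently classified background pixels, the rare overlapping-region pixels dominate the gradient, which is the usual motivation for focal loss under class imbalance.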

13.
Front Genet ; 11: 1023, 2020.
Article in English | MEDLINE | ID: mdl-33005184

ABSTRACT

Lung cancer is one of the most common human cancers both in incidence and mortality, with prognosis particularly poor in metastatic cases. Metastasis in lung cancer is a multifarious process driven by a complex regulatory landscape involving many mechanisms, genes, and proteins. Membrane proteins play a crucial role in the metastatic journey both inside tumor cells and the extra-cellular matrix and are a viable area of research focus with the potential to uncover biomarkers and drug targets. In this work we performed membrane proteome analysis of highly and poorly metastatic lung cells which integrated genomic, proteomic, and transcriptional data. A total of 1,762 membrane proteins were identified, and within this set, there were 163 proteins with significant changes between the two cell lines. We applied the Tied Diffusion through Interacting Events method to integrate the differentially expressed disease-related microRNAs and functionally dys-regulated membrane protein information to further explore the role of key membrane proteins and microRNAs in a multi-omics context. Hsa-miR-137 was revealed as a key gene involved in the activity of membrane proteins by targeting MET and PXN, affecting membrane proteins through a protein-protein interaction mechanism. Furthermore, we found that the membrane proteins CDH2, EGFR, ITGA3, ITGA5, ITGB1, and CALR may have a significant effect on cancer prognosis and outcomes, which was further validated in vitro. Our study provides a multi-omics-based network method of integrating microRNA and membrane proteome information, and uncovers differential molecular signatures of highly and poorly metastatic lung cancer cells; these molecules may serve as potential targets for giant-cell lung metastasis treatment and prognosis.

14.
Interdiscip Sci ; 12(4): 547-554, 2020 Dec.
Article in English | MEDLINE | ID: mdl-33113078

ABSTRACT

A substantial body of research is focused on improving the understanding of the relationship between genotypes and phenotypes. Genotype-phenotype studies have shown promise in improving disease diagnosis in humans, and identification of specific clinical phenotypes may be helpful in developing more effective therapeutic and diagnostic strategies. To expand on the existing paradigm of evaluating genotypes and phenotypes, we present an investigation of the correlation between biological processes as represented by genomic information and phenotypes in human disease. We focus on monogenic diseases and link biological process and phenotype utilizing information from the Online Mendelian Inheritance in Man, the Gene Ontology, and the Human Phenotype Ontology comprehensive genomic, phenotypic, and disease information resources. Our study uncovers 4661 statistically significant associations and identifies novel correlations between biological processes and phenotypes. We find new relationships between unique phenotype-genotype pairs related to cardiovascular diseases and hypertelorism, which suggests that differences between certain phenotype-genotype associations may be the key to the divergence of corresponding phenotypes. Although the application of correlating genotype, phenotype, and biological processes may help to guide diagnosis and treatment of diseases, further investigation and more specific gene ontology descriptions are still required to elucidate mechanisms of action.


Subject(s)
Biological Ontologies , Databases, Genetic , Gene Ontology , Genetic Association Studies , Humans , Phenotype
15.
Front Mol Biosci ; 7: 115, 2020.
Article in English | MEDLINE | ID: mdl-32733913

ABSTRACT

Phenylketonuria (PKU) is a common genetic metabolic disorder that affects the infant's nerve development and manifests as abnormal behavior and developmental delay as the child grows. Currently, triple-quadrupole mass spectrometry (TQ-MS) is a common high-accuracy clinical PKU screening method. However, there is a high false-positive rate associated with this modality, and its reduction can provide a diagnostic and economic benefit to both pediatric patients and health providers. Machine learning methods have the advantage of utilizing high-dimensional and complex features, which can be obtained from the patient's metabolic patterns and interrogated for clinically relevant knowledge. In this study, using TQ-MS screening data of more than 600,000 patients collected at the Newborn Screening Center of Shanghai Children's Hospital, we derived a dataset containing 256 PKU-suspected cases. We then developed a machine learning logistic regression analysis model with the aim of minimizing false-positive rates in the results of the initial PKU test. The model attained a 95-100% sensitivity, the specificity was improved 53.14%, and positive predictive value increased from 19.14 to 32.16%. Our study shows that machine learning models may be used as a pediatric diagnostic aid to reduce the number of suspected cases and to help eliminate patient recall. Our study can serve as a future reference for the selection and evaluation of computational screening methods.
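
The abstract above reports gains in specificity and positive predictive value (PPV) from a second-tier model. The arithmetic behind those two measures can be sketched as follows; the counts are purely hypothetical round numbers chosen for illustration and are not taken from the study.

```python
def positive_predictive_value(true_positives, false_positives):
    """Share of flagged cases that are real: TP / (TP + FP)."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no cases were flagged")
    return true_positives / flagged


def specificity(true_negatives, false_positives):
    """Share of unaffected cases correctly cleared: TN / (TN + FP)."""
    unaffected = true_negatives + false_positives
    if unaffected == 0:
        raise ValueError("no unaffected cases")
    return true_negatives / unaffected


# Hypothetical first-tier result: 20 true cases among 100 flagged newborns.
print(positive_predictive_value(20, 80))
# Hypothetical second tier clears 40 of the 80 false positives, misses no case.
print(positive_predictive_value(20, 40))
print(specificity(40, 40))
```

With sensitivity held fixed, every false positive the second tier removes raises PPV, which is why false-positive reduction translates directly into fewer unnecessary recalls.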

16.
G3 (Bethesda) ; 10(8): 2801-2809, 2020 08 05.
Article in English | MEDLINE | ID: mdl-32532800

ABSTRACT

Despite continuous updates of the human reference genome, there are still hundreds of unresolved gaps which account for about 5% of the total sequence length. Given the availability of whole genome de novo assemblies, especially those derived from long-read sequencing data, gap-closing sequences can be determined. By comparing 17 de novo long-read sequencing assemblies with the human reference genome, we identified a total of 1,125 gap-closing sequences for 132 (16.9% of 783) gaps and added up to 2.2 Mb novel sequences to the human reference genome. More than 90% of the non-redundant sequences could be verified by unmapped reads from the Simons Genome Diversity Project dataset. In addition, 15.6% of the non-reference sequences were found in at least one of four non-human primate genomes. We further demonstrated that the non-redundant sequences had high content of simple repeats and satellite sequences. Moreover, 43 (32.6%) of the 132 closed gaps were shown to be polymorphic; such sequences may play an important biological role and can be useful in the investigation of human genetic diversity.


Subject(s)
Genome, Human , High-Throughput Nucleotide Sequencing , Humans , Sequence Analysis, DNA
17.
Orphanet J Rare Dis ; 14(1): 233, 2019 10 22.
Article in English | MEDLINE | ID: mdl-31640704

ABSTRACT

BACKGROUND: It is estimated that at present there are over 10 million rare disease patients in China. Recently, increased policy focus has been placed on rare disease management. Improved disease definitions and the release of local and national rare disease lists are some of the steps already taken. Despite these developments, few Chinese rare disease-related epidemiology and economic studies exist, thus hindering assessment of the true burden of rare diseases. For a rare disease with an effective treatment, this is a particularly important aspect due to the often-high associated cost. OBJECTIVE: The goal of this study is to address the data scarcity on the subject of the economic impact of rare diseases in China. We aim to address an existing knowledge gap and to provide a timely analysis of the economic burden of 23 rare diseases in Shanghai, China. METHODS: We utilized the data from the Health Information Exchange system of Shanghai and employed statistical modeling to analyze the economic burden of rare diseases with an effective treatment in Shanghai. RESULTS: First, we described the actual direct medical expenditure and analyzed its associated factors. Second, we found that age, disease type, number of complications, and payment type were significantly associated with rare disease direct medical costs. Third, a generalized linear model was employed to estimate the annual direct cost. The mean direct medical cost was estimated as ¥9588 (US$1521) for inpatients and ¥1060 (US$168) for outpatients, and was over ¥15 million (~US$2.4 million) per year overall. CONCLUSION: Our study is one of the first to quantify the economic burden of an extensive set of rare diseases in Shanghai and China. Our results can serve to inform healthcare-focused policy making, contribute to the increase of public awareness, and incentivize development of rare-disease strategies and treatments specific to the Chinese context.


Subject(s)
Cost of Illness , Health Care Costs , Health Expenditures , Rare Diseases/economics , Rare Diseases/epidemiology , Adolescent , Adult , China/epidemiology , Cross-Sectional Studies , Health Policy , Humans , Middle Aged , Young Adult
18.
Front Genet ; 10: 729, 2019.
Article in English | MEDLINE | ID: mdl-31543893

ABSTRACT

Function annotation efforts provide a foundation for our understanding of cellular processes and the functioning of the living cell. This motivates high-throughput computational methods to characterize new protein members of a particular function. Research work has focused on discriminative machine-learning methods, which promise to make efficient, de novo predictions of protein function. Furthermore, available function annotation exists predominantly for individual proteins rather than residues, of which only a subset is necessary for the conveyance of a particular function. This limits discriminative approaches to predicting functions for which there is sufficient residue-level annotation, e.g., identification of DNA-binding proteins, or where an excellent global representation can be divined. Complete understanding of the various functions of proteins requires discovery and functional annotation at the residue level. Herein, we cast this problem into the setting of multiple-instance learning, which only requires knowledge of the protein's function yet identifies functionally relevant residues and need not rely on homology. We developed a new multiple-instance learning algorithm derived from AdaBoost and benchmarked this algorithm against two well-studied protein function prediction tasks: annotating proteins that bind DNA and RNA. This algorithm outperforms certain previous approaches in annotating protein function while identifying functionally relevant residues involved in binding both DNA and RNA, and on one protein-DNA benchmark, it achieves near perfect classification.
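
In the multiple-instance setting described above, only the protein (the "bag") carries a label, while the residues (the "instances") do not. A common way to connect the two, shown here as a minimal sketch of the standard assumption rather than the authors' AdaBoost-derived algorithm, is to call a bag positive when its highest-scoring instance crosses a threshold. The function name, the threshold, and the scores are hypothetical.

```python
def bag_prediction(instance_scores, threshold=0.5):
    """Predict a bag label from per-instance scores under the standard MIL assumption.

    Returns (is_positive, index_of_top_instance); the top instance doubles
    as the candidate functionally relevant residue.
    """
    # max() raises ValueError on an empty bag, which is the desired behavior.
    top_index = max(range(len(instance_scores)), key=instance_scores.__getitem__)
    return instance_scores[top_index] >= threshold, top_index


# Hypothetical per-residue scores for two proteins.
print(bag_prediction([0.1, 0.8, 0.3]))
print(bag_prediction([0.1, 0.2]))
```

This is what lets a model trained only on protein-level labels still point at individual residues: the instance that drives the bag decision is reported alongside it.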

19.
Genes (Basel) ; 9(8)2018 Jul 27.
Article in English | MEDLINE | ID: mdl-30060537

ABSTRACT

Inflammation and fibrosis in human liver are often precursors to hepatocellular carcinoma (HCC), yet neither is easily modeled in animals. We previously generated transgenic mice with hepatocyte-specific expression of herpes simplex virus thymidine kinase (HSV-tk). These mice would develop hepatitis with the administration of ganciclovir (GCV) (Zhang, 2005). However, our HSV-tk transgenic mice developed hepatitis and HCC tumors as early as six months of age even without GCV administration. We analyzed the transcriptome of the HSV-tk HCC tumor and hepatitis tissue using microarray analysis to investigate the possible causes of HCC. Gene Ontology (GO) enrichment analysis showed that the up-regulated genes in the HCC tissue mainly include the immune-inflammatory and cell cycle genes. The down-regulated genes in HCC tumors are mainly concentrated in the regions related to lipid metabolism. Gene set enrichment analysis (GSEA) showed that immune-inflammatory-related signals in the HSV-tk mice are up-regulated compared to those in Notch mice. Our study suggests that the immune system and inflammation play an important role in HCC development in HSV-tk mice. Specifically, increased expression of immune-inflammatory-related genes is characteristic of HSV-tk mice, and inflammation-induced cell cycle activation may be a precursory step to cancer. The HSV-tk mouse provides a suitable model for the study of the relationship between immune-inflammation and HCC and of the underlying mechanisms, with a view to future therapeutic applications.

20.
Interdiscip Sci ; 10(4): 836-847, 2018 Dec.
Article in English | MEDLINE | ID: mdl-30039492

ABSTRACT

Lung cancers are broadly classified into small cell lung cancers and non-small cell lung cancers (NSCLC). Lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) are two common subtypes of NSCLC, and despite the fact that both occur in lung tissues, these two subtypes show a number of different pathological characteristics. To investigate the differences and seek potential therapy targets, we used bioinformatics methods to analyze RNA-Seq data from different aspects. Previous studies and comparative pathway enrichment analysis on publicly available data showed that the genes expressed or inhibited in important pathways differ between the two cancer subtypes. Some of these genes not only affect cell function through their expression, but also regulate the expression of other genes by binding to a specific DNA sequence. This kind of gene is called a transcription factor (TF) or sequence-specific DNA-binding factor. Transcription factors play important roles in controlling gene expression in carcinoma pathways. Our results revealed transcription factors that may cause differential expression of genes in cellular pathways of LUAD and LUSC, which provide new clues for study and treatment. One such TF is NFE2L2, which may regulate genes in the Wnt signaling pathway and the MAPK signaling pathway, thus leading to an increase in cell growth, cell division, and gene transcription. Another TF, XBP1, is highly correlated with genes related to cell adhesion molecules and cytokine-cytokine receptor interaction pathways that may further affect the immune system. Moreover, the two TFs and their highly correlated genes also show similar patterns in an independent GEO data set.


Subject(s)
Adenocarcinoma of Lung/genetics , Carcinoma, Squamous Cell/genetics , Gene Expression Regulation, Neoplastic , Lung Neoplasms/genetics , Signal Transduction/genetics , Transcription Factors/metabolism , Adenocarcinoma of Lung/pathology , Carcinoma, Squamous Cell/pathology , Gene Expression Profiling , Humans , Lung Neoplasms/pathology , Regression Analysis , Reproducibility of Results