Results 1 - 20 of 28
1.
J Med Internet Res ; 18(3): e44, 2016 Mar 09.
Article in English | MEDLINE | ID: mdl-26960745

ABSTRACT

BACKGROUND: The development of effective health care and public health interventions requires a comprehensive understanding of the perceptions, concerns, and stated needs of health care consumers and the public at large. Big datasets from social media and question-and-answer services provide insight into the public's health concerns and priorities without the financial, temporal, and spatial encumbrances of more traditional community-engagement methods and may prove a useful starting point for public-engagement health research (infodemiology). OBJECTIVE: The objective of our study was to describe user characteristics and health-related queries of the ChaCha question-and-answer platform, and discuss how these data may be used to better understand the perceptions, concerns, and stated needs of health care consumers and the public at large. METHODS: We conducted a retrospective automated textual analysis of anonymous user-generated queries submitted to ChaCha between January 2009 and November 2012. A total of 2.004 billion queries were read, of which 3.50% (70,083,796/2,004,243,249) were missing 1 or more data fields, leaving 1.934 billion complete lines of data for these analyses. RESULTS: Males and females submitted roughly equal numbers of health queries, but content differed by sex. Questions from females predominantly focused on pregnancy, menstruation, and vaginal health. Questions from males predominantly focused on body image, drug use, and sexuality. Adolescents aged 12-19 years submitted more queries than any other age group. Their queries were largely centered on sexual and reproductive health, and pregnancy in particular. CONCLUSIONS: The private nature of the ChaCha service provided a perfect environment for maximum frankness among users, especially among adolescents posing sensitive health questions. Adolescents' sexual health queries reveal knowledge gaps with serious, lifelong consequences. 
The nature of questions to the service provides opportunities for rapid understanding of health concerns and may lead to development of more effective tailored interventions.
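The counts in the METHODS section can be cross-checked with a few lines of arithmetic. The sketch below uses only the two totals reported in the abstract; the constant and function names are illustrative and do not come from the study.

```python
# Counts as reported in the abstract.
TOTAL_QUERIES = 2_004_243_249
INCOMPLETE_QUERIES = 70_083_796


def percent_incomplete(incomplete: int, total: int) -> float:
    """Share of queries missing one or more data fields, in percent."""
    return 100.0 * incomplete / total


# Queries left for analysis once incomplete lines are dropped.
COMPLETE_QUERIES = TOTAL_QUERIES - INCOMPLETE_QUERIES

print(f"{percent_incomplete(INCOMPLETE_QUERIES, TOTAL_QUERIES):.2f}% incomplete")
print(f"{COMPLETE_QUERIES / 1e9:.3f} billion complete lines")
```

Both values agree with the reported 3.50% and 1.934 billion.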


Subjects
Consumer Health Information/statistics & numerical data; Information Seeking Behavior; Internet; Adolescent; Adult; Datasets as Topic; Female; Humans; Male; Pregnancy; Reproductive Health; Retrospective Studies; Sex Factors; Sexual Behavior; Social Media
2.
BMC Bioinformatics ; 16 Suppl 17: S5, 2015.
Article in English | MEDLINE | ID: mdl-26679199

ABSTRACT

MOTIVATION: The identification of new therapeutic uses of existing drugs, or drug repositioning, offers the possibility of faster drug development, reduced risk, lower cost and shorter paths to approval. The advent of high-throughput microarray technology has enabled comprehensive monitoring of the transcriptional response associated with various disease states and drug treatments. These data can be used to characterize disease and drug effects and thereby give a measure of the association between a given drug and a disease. Several computational methods have been proposed in the literature that make use of publicly available transcriptional data to reposition drugs against diseases. METHOD: In this work, we carry out a data mining process using publicly available gene expression data sets associated with a few diseases and drugs, to identify the existing drugs that can be used to target genes causing lung cancer and breast cancer. RESULTS: Three strong candidates for repurposing have been identified: Letrozole and GDC-0941 against lung cancer, and Ribavirin against breast cancer. Letrozole and GDC-0941 are drugs currently used in breast cancer treatment and Ribavirin is used in the treatment of Hepatitis C.


Subjects
Computational Biology/methods; Databases, Genetic; Drug Repositioning/methods; Gene Expression Regulation, Neoplastic; Antineoplastic Agents/therapeutic use; Breast Neoplasms/drug therapy; Breast Neoplasms/genetics; Female; Hepatitis C/genetics; Humans; Letrozole; Lung Neoplasms/genetics; Nitriles/therapeutic use; Triazoles/therapeutic use
3.
Int J Data Min Bioinform ; 11(1): 1-30, 2015.
Article in English | MEDLINE | ID: mdl-26255374

ABSTRACT

In this paper we present a systems biology approach to the understanding of the miRNA-regulatory network in colorectal cancer (CRC). An initial set of significant genes in CRC was obtained by mining relevant literature. An initial set of cancer-related miRNAs was obtained from three databases (miRBase, miRWalk, and TargetScan) and a GEO microarray experiment. First-principles methods were then used to generate the global miRNA-gene network. Significant miRNAs and associated transcription factors in the global miRNA-gene network were identified using topological and sub-graph analyses. Eleven novel miRNAs were identified and three of the novel miRNAs, hsa-miR-630, hsa-miR-100 and hsa-miR-99a, were further analysed to elucidate their role in CRC. The proposed methodology effectively made use of literature data and was able to show novel, significant miRNA-transcription associations in CRC.


Subjects
Colorectal Neoplasms/genetics; Gene Expression Regulation, Neoplastic/genetics; Gene Regulatory Networks/genetics; MicroRNAs/genetics; Neoplasm Proteins/genetics; Transcription Factors/genetics; Data Mining/methods; Databases, Genetic; Humans; Systems Biology/methods
4.
Stud Health Technol Inform ; 216: 604-8, 2015.
Article in English | MEDLINE | ID: mdl-26262122

ABSTRACT

In this study we have developed a rule-based natural language processing (NLP) system to identify patients with a family history of pancreatic cancer. The algorithm was developed in an Unstructured Information Management Architecture (UIMA) framework and consisted of section segmentation, relation discovery, and negation detection. The system was evaluated on data from two institutions. The family history identification precision was consistent across the institutions, shifting from 88.9% on the Indiana University (IU) dataset to 87.8% on the Mayo Clinic dataset. Customizing the algorithm on the Mayo Clinic data increased its precision to 88.1%. The family member relation discovery achieved precision, recall, and F-measure of 75.3%, 91.6% and 82.6%, respectively. Negation detection resulted in a precision of 99.1%. The results show that rule-based NLP approaches for specific information extraction tasks are portable across institutions; however, customization of the algorithm on the new dataset improves its performance.
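The relation-discovery figures can be related to one another through the usual definition of the F-measure. The sketch below assumes the abstract reports the balanced F-measure (F1, the harmonic mean of precision and recall); the abstract does not state which variant was used, and the function name is illustrative.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall for family member relation discovery, as reported.
f1 = f_measure(0.753, 0.916)
print(f"F1 = {f1:.3f}")
```

Under that assumption the result lies within a fraction of a percentage point of the reported 82.6%; any small gap may simply reflect rounding of the published precision and recall.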


Subjects
Electronic Health Records/classification; Information Storage and Retrieval/methods; Medical History Taking/methods; Natural Language Processing; Pancreatic Neoplasms/diagnosis; Pancreatic Neoplasms/genetics; Algorithms; Genetic Predisposition to Disease/epidemiology; Genetic Predisposition to Disease/genetics; Humans; Medical History Taking/statistics & numerical data; Medical Record Linkage; Pancreatic Neoplasms/epidemiology
5.
PLoS One ; 10(6): e0130819, 2015.
Article in English | MEDLINE | ID: mdl-26098852

ABSTRACT

We tested the ability of the axolotl (Ambystoma mexicanum) fibula to regenerate across segment defects of different size in the absence of intervention or after implant of a unique 8-braid pig small intestine submucosa (SIS) scaffold, with or without incorporated growth factor combinations or tissue protein extract. Fractures and defects of 10% and 20% of the total limb length regenerated well without any intervention, but 40% and 50% defects failed to regenerate after either simple removal of bone or implanting SIS scaffold alone. By contrast, scaffold soaked in the growth factor combination BMP-4/HGF or in protein extract of intact limb tissue promoted partial or extensive induction of cartilage and bone across 50% segment defects in 30%-33% of cases. These results show that BMP-4/HGF and intact tissue protein extract can promote the events required to induce cartilage and bone formation across a segment defect larger than critical size and that the long bones of axolotl limbs are an inexpensive model to screen soluble factors and natural and synthetic scaffolds for their efficacy in stimulating this process.


Subjects
Ambystoma mexicanum/physiology; Bone and Bones/physiology; Extremities/physiology; Fibula/physiology; Osteogenesis/physiology; Regeneration/physiology; Ambystoma mexicanum/metabolism; Animals; Bone Morphogenetic Protein 4/metabolism; Bone and Bones/metabolism; Cartilage/metabolism; Cartilage/physiology; Fibula/metabolism; Hepatocyte Growth Factor/metabolism; Intestinal Mucosa/metabolism; Intestinal Mucosa/physiology; Intestine, Small/metabolism; Intestine, Small/physiology; Swine; Tissue Scaffolds
6.
J Biomed Inform ; 54: 213-9, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25791500

ABSTRACT

In Electronic Health Records (EHRs), much valuable information regarding patients' conditions is embedded in free text format. Natural language processing (NLP) techniques have been developed to extract clinical information from free text. One challenge faced in clinical NLP is that the meaning of clinical entities is heavily affected by modifiers such as negation. A negation detection algorithm, NegEx, applies a simplistic approach that has been shown to be powerful in clinical NLP. However, because it does not consider the contextual relationship between words within a sentence, NegEx fails to correctly capture the negation status of concepts in complex sentences. Incorrect negation assignment could cause inaccurate diagnosis of patients' condition or contaminated study cohorts. We developed a negation algorithm called DEEPEN to decrease NegEx's false positives by taking into account the dependency relationship between negation words and concepts within a sentence using the Stanford dependency parser. The system was developed and tested using EHR data from Indiana University (IU) and it was further evaluated on a Mayo Clinic dataset to assess its generalizability. The evaluation results demonstrate that DEEPEN, which incorporates dependency parsing into NegEx, can reduce the number of incorrect negation assignments for patients with positive findings, and therefore improve the identification of patients with the target clinical findings in EHRs.
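To illustrate why a surface-level rule can misfire on complex sentences, here is a deliberately simplified trigger-window negation check. It is not the NegEx or DEEPEN implementation: the trigger list, the scope-breaking words, the window size and the function name are all assumptions made for this sketch.

```python
import re

# Hypothetical, tiny lexicons; a real negation lexicon is far larger.
NEGATION_TRIGGERS = {"no", "not", "denies", "without"}
SCOPE_BREAKERS = {"but", "however"}
WINDOW = 6  # assumed number of tokens searched before the concept


def is_negated(sentence: str, concept: str) -> bool:
    """Return True if a negation trigger precedes the concept within WINDOW tokens."""
    tokens = re.findall(r"[a-z0-9']+", sentence.lower())
    concept_tokens = concept.lower().split()
    n = len(concept_tokens)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] != concept_tokens:
            continue
        # Walk backwards from the concept; stop at a scope-breaking word.
        for token in reversed(tokens[max(0, i - WINDOW):i]):
            if token in SCOPE_BREAKERS:
                break
            if token in NEGATION_TRIGGERS:
                return True
    return False
```

With these assumed settings, "No evidence of pancreatic cyst." is flagged as negated, but so is "There is no doubt the patient has a pancreatic cyst.", where the trigger word does not actually govern the concept. That second case is the kind of false positive a dependency-based check is meant to remove.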


Subjects
Algorithms; Electronic Health Records; Natural Language Processing; Humans
7.
HPB (Oxford) ; 17(5): 447-53, 2015 May.
Article in English | MEDLINE | ID: mdl-25537257

ABSTRACT

INTRODUCTION: As many as 3% of computed tomography (CT) scans detect pancreatic cysts. Because pancreatic cysts are incidental, ubiquitous and poorly understood, follow-up is often not performed. Pancreatic cysts may have a significant malignant potential and their identification represents a 'window of opportunity' for the early detection of pancreatic cancer. The purpose of this study was to implement an automated Natural Language Processing (NLP)-based pancreatic cyst identification system. METHOD: A multidisciplinary team was assembled. NLP-based identification algorithms were developed based on key words commonly used by physicians to describe pancreatic cysts and programmed for automated search of electronic medical records. A pilot study was conducted prospectively in a single institution. RESULTS: From March to September 2013, 566,233 reports belonging to 50,669 patients were analysed. The mean number of patients reported with a pancreatic cyst was 88/month (range 78-98). The mean sensitivity and specificity were 99.9% and 98.8%, respectively. CONCLUSION: NLP is an effective tool to automatically identify patients with pancreatic cysts based on electronic medical records (EMR). This highly accurate system can help capture patients 'at-risk' of pancreatic cancer in a registry.


Subjects
Algorithms; Automation; Early Detection of Cancer/methods; Natural Language Processing; Pancreatic Cyst/diagnosis; Pancreatic Neoplasms/diagnosis; Follow-Up Studies; Humans; Pilot Projects; Reproducibility of Results; Retrospective Studies
8.
BMC Dev Biol ; 14: 32, 2014 Jul 25.
Article in English | MEDLINE | ID: mdl-25063185

ABSTRACT

BACKGROUND: To gain insight into what differences might restrict the capacity for limb regeneration in Xenopus froglets, we used High Performance Liquid Chromatography (HPLC)/double mass spectrometry to characterize protein expression during fibroblastema formation in the amputated froglet hindlimb, and compared the results to those obtained previously for blastema formation in the axolotl limb. RESULTS: Comparison of the Xenopus fibroblastema and axolotl blastema revealed several similarities and significant differences in proteomic profiles. The most significant similarity was the strong parallel down regulation of muscle proteins and enzymes involved in carbohydrate metabolism. Regenerating Xenopus limbs differed significantly from axolotl regenerating limbs in several ways: deficiency in the inositol phosphate/diacylglycerol signaling pathway, down regulation of Wnt signaling, up regulation of extracellular matrix (ECM) proteins and proteins involved in chondrocyte differentiation, lack of expression of a key cell cycle protein, ecotropic viral integration site 5 (EVI5), that blocks mitosis in the axolotl, and the expression of several patterning proteins not seen in the axolotl that may dorsalize the fibroblastema. CONCLUSIONS: We have characterized global protein expression during fibroblastema formation after amputation of the Xenopus froglet hindlimb and identified several differences that lead to signaling deficiency, failure to retard mitosis, premature chondrocyte differentiation, and failure of dorsoventral axial asymmetry. These differences point to possible interventions to improve blastema formation and pattern formation in the froglet limb.


Subjects
Ambystoma/metabolism; Hindlimb/metabolism; Xenopus Proteins/metabolism; Xenopus laevis/metabolism; Ambystoma/growth & development; Animals; Bone Regeneration/physiology; Chromatography, High Pressure Liquid; Gene Expression Regulation, Developmental; Mass Spectrometry; Proteomics; Signal Transduction; Xenopus Proteins/genetics; Xenopus laevis/growth & development
9.
BMC Syst Biol ; 7: 141, 2013 Dec 26.
Article in English | MEDLINE | ID: mdl-24369052

ABSTRACT

BACKGROUND: Epigenetics refers to the reversible functional modifications of the genome that do not correlate to changes in the DNA sequence. The aim of this study is to understand DNA methylation patterns across different stages of lung adenocarcinoma (LUAD). RESULTS: Our study identified 72, 93 and 170 significant DNA methylated genes in Stages I, II and III respectively. A set of common 34 significant DNA methylated genes located in the promoter section of the true CpG islands were found across stages, and these were: HOX genes, FOXG1, GRIK3, HAND2, PRKCB, etc. Of the total significant DNA methylated genes, 65 correlated with transcription function. The epigenetic analysis identified the following novel genes across all stages: PTGDR, TLX3, and POU4F2. The stage-wise analysis observed the appearance of NEUROG1 gene in Stage I and its re-appearance in Stage III. The analysis showed similar epigenetic pattern across Stage I and Stage III. Pathway analysis revealed important signaling and metabolic pathways of LUAD to correlate with epigenetics. Epigenetic subnetwork analysis identified a set of seven conserved genes across all stages: UBC, KRAS, PIK3CA, PIK3R3, RAF1, BRAF, and RAP1A. A detailed literature analysis elucidated epigenetic genes like FOXG1, HLA-G, and NKX6-2 to be known as prognostic targets. CONCLUSION: Integrating epigenetic information for genes with expression data can be useful for comprehending in-depth disease mechanism and for the ultimate goal of better target identification.


Subjects
Adenocarcinoma/genetics; Adenocarcinoma/pathology; Epigenesis, Genetic; Lung Neoplasms/genetics; Lung Neoplasms/pathology; Systems Biology; Adenocarcinoma of Lung; Chromosomes, Human/genetics; CpG Islands/genetics; DNA Methylation; Genes, Neoplasm/genetics; Humans; Neoplasm Staging; Promoter Regions, Genetic/genetics; Transcription Factors/metabolism
10.
Stud Health Technol Inform ; 192: 822-6, 2013.
Article in English | MEDLINE | ID: mdl-23920672

ABSTRACT

Pancreatic cancer is one of the deadliest cancers, mostly diagnosed at late stages. Patients with pancreatic cysts are at higher risk of developing cancer and their surveillance can help to diagnose the disease in earlier stages. In this retrospective study we collected a corpus of 1064 records from 44 patients at Indiana University Hospital from 1990 to 2012. A Natural Language Processing (NLP) system was developed and used to identify patients with pancreatic cysts. The NegEx algorithm was used initially to identify the negation status of concepts, which resulted in precision and recall of 98.9% and 89%, respectively. The Stanford Dependency Parser (SDP) was then used to improve the NegEx performance, resulting in precision of 98.9% and recall of 95.7%. Features related to pancreatic cysts were also extracted from patient medical records using regex and the NegEx algorithm with 98.5% precision and 97.43% recall. SDP improved the NegEx algorithm by increasing the recall to 98.12%.


Subjects
Electronic Health Records; Health Records, Personal; Natural Language Processing; Pancreatic Cyst/classification; Pancreatic Cyst/diagnosis; Vocabulary, Controlled; Algorithms; Artificial Intelligence; Data Mining/methods; Decision Support Systems, Clinical; Humans; Pattern Recognition, Automated/methods; Reproducibility of Results; Sensitivity and Specificity
11.
BMC Cancer ; 12: 331, 2012 Aug 01.
Article in English | MEDLINE | ID: mdl-22852817

ABSTRACT

BACKGROUND: Biological entities do not perform in isolation, and often, it is the nature and degree of interactions among numerous biological entities which ultimately determine any final outcome. Hence, experimental data on any single biological entity can be of limited value when considered only in isolation. To address this, we propose that augmenting individual entity data with the literature will not only better define the entity's own significance but also uncover relationships with novel biological entities. To test this notion, we developed a comprehensive text mining and computational methodology that focused on discovering new targets of one class of molecular entities, transcription factors (TFs), within one particular disease, colorectal cancer (CRC). METHODS: We used 39 molecular entities known to be associated with CRC along with six colorectal cancer terms as the bait list, or list of search terms, for mining the biomedical literature to identify CRC-specific genes and proteins. Using the literature-mined data, we constructed a global TF interaction network for CRC. We then developed a multi-level, multi-parametric methodology to identify TFs to CRC. RESULTS: The small bait list, when augmented with literature-mined data, identified a large number of biological entities associated with CRC. The relative importance of these TFs and their associated modules was identified using functional and topological features. Additional validation of these highly ranked TFs using the literature strengthened our findings. Some of the novel TFs that we identified were: SLUG, RUNX1, IRF1, HIF1A, ATF-2, ABL1, ELK-1 and GATA-1. Some of these TFs are associated with functional modules in known pathways of CRC, including the Beta-catenin/development, immune response, transcription, and DNA damage pathways.
CONCLUSIONS: Our methodology of using text mining data and a multi-level, multi-parameter scoring technique was able to identify both known and novel TF that have roles in CRC. Starting with just one TF (SMAD3) in the bait list, the literature mining process identified an additional 116 CRC-associated TFs. Our network-based analysis showed that these TFs all belonged to any of 13 major functional groups that are known to play important roles in CRC. Among these identified TFs, we obtained a novel six-node module consisting of ATF2-P53-JNK1-ELK1-EPHB2-HIF1A, from which the novel JNK1-ELK1 association could potentially be a significant marker for CRC.


Subjects
Colorectal Neoplasms/genetics; Colorectal Neoplasms/metabolism; Systems Biology/methods; Transcription Factors/genetics; Transcription Factors/metabolism; Data Mining; Gene Expression Profiling/methods; Humans
12.
BMC Syst Biol ; 6 Suppl 3: S17, 2012.
Article in English | MEDLINE | ID: mdl-23282040

ABSTRACT

BACKGROUND: Colorectal cancer (CRC) is one of the most commonly diagnosed cancers worldwide. Studies have correlated risk of CRC development with dietary habits and environmental conditions. Gene signatures for any disease can identify the key biological processes, which is especially useful in studying cancer development. Such processes can be used to evaluate potential drug targets. Though recognition of CRC gene-signatures across populations is crucial to better understanding potential novel treatment options for CRC, it remains a challenging task. RESULTS: We developed a topological and biological feature-based network approach for identifying the gene signatures across populations. In this work, we propose a novel approach of using cliques to understand the variability within population. Cliques are more conserved and co-expressed, therefore allowing identification and comparison of cliques across a population which can help researchers study gene variations. Our study was based on four publicly available expression datasets belonging to four different populations across the world. We identified cliques of various sizes (0 to 7) across the four population networks. Cliques of size seven were further analyzed across populations for their commonality and uniqueness. Forty-nine common cliques of size seven were identified. These cliques were further analyzed based on their connectivity profiles. We found associations between the cliques and their connectivity profiles across networks. With these clique connectivity profiles (CCPs), we were able to identify the divergence among the populations, important biological processes (cell cycle, signal transduction, and cell differentiation), and related gene pathways. Therefore the genes identified in these cliques and their connectivity profiles can be defined as the gene-signatures across populations. In this work we demonstrate the power and effectiveness of cliques to study CRC across populations. 
CONCLUSIONS: We developed a new approach where cliques and their connectivity profiles helped elucidate the variation and similarity in CRC gene profiles across four populations with unique dietary habits.
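The idea of comparing cliques across population networks can be sketched on toy data. The code below is a brute-force illustration, not the authors' pipeline: the gene labels, edge lists and function names are hypothetical, and it enumerates cliques of size three rather than seven to keep the example small.

```python
from itertools import combinations


def k_cliques(edges, k):
    """Return every k-node clique as a frozenset (brute force; toy graphs only)."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    found = set()
    for nodes in combinations(sorted(adjacency), k):
        # A clique requires an edge between every pair of its nodes.
        if all(b in adjacency[a] for a, b in combinations(nodes, 2)):
            found.add(frozenset(nodes))
    return found


def common_cliques(networks, k):
    """Cliques of size k that appear in every network."""
    return set.intersection(*(k_cliques(edges, k) for edges in networks))


# Hypothetical co-expression edges for two populations.
population_a = [("G1", "G2"), ("G2", "G3"), ("G1", "G3"), ("G3", "G4")]
population_b = [("G1", "G2"), ("G2", "G3"), ("G1", "G3"), ("G1", "G4"), ("G2", "G4")]

shared = common_cliques([population_a, population_b], k=3)
```

Brute-force enumeration grows combinatorially with network size, so a real analysis would rely on a dedicated clique-finding algorithm; the intersection step is what identifies cliques shared across populations.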


Subjects
Colorectal Neoplasms/genetics; Computational Biology/methods; Genetics, Population/methods; Transcriptome; China; Colorectal Neoplasms/pathology; Databases, Genetic; Feeding Behavior; Gene Expression Profiling; Gene Expression Regulation, Neoplastic; Gene Regulatory Networks; Germany; Humans; Microarray Analysis; Saudi Arabia; Signal Transduction; United States
13.
BMC Bioinformatics ; 12: 80, 2011 Mar 18.
Article in English | MEDLINE | ID: mdl-21418574

ABSTRACT

BACKGROUND: Studies on amphibian limb regeneration began in the early 1700s but we still do not completely understand the cellular and molecular events of this unique process. Understanding a complex biological process such as limb regeneration is more complicated than the knowledge of the individual genes or proteins involved. Here we followed a systems biology approach in an effort to construct the networks and pathways of protein interactions involved in formation of the accumulation blastema in regenerating axolotl limbs. RESULTS: We used the human orthologs of proteins previously identified by our research team as bait to identify the transcription factor (TF) pathways and networks that regulate blastema formation in amputated axolotl limbs. The five most connected factors, c-Myc, SP1, HNF4A, ESR1 and p53, regulate ~50% of the proteins in our data. Among these, c-Myc and SP1 regulate 36.2% of the proteins. c-Myc was the most highly connected TF (71 targets). Network analysis showed that TGF-β1 and fibronectin (FN) lead to the activation of these TFs. We found that other TFs known to be involved in epigenetic reprogramming, such as Klf4, Oct4, and Lin28, are also connected to c-Myc and SP1. CONCLUSIONS: Our study provides a systems biology approach to how different molecular entities inter-connect with each other during the formation of an accumulation blastema in regenerating axolotl limbs. This approach provides an in silico methodology to identify proteins that are not detected by experimental methods such as proteomics but are potentially important to blastema formation. We found that the TFs c-Myc and SP1, and their target genes, could potentially play a central role in limb regeneration.
Systems biology has the potential to map out numerous other pathways that are crucial to blastema formation in regeneration-competent limbs, to compare these to the pathways that characterize regeneration-deficient limbs and finally, to identify stem cell markers in regeneration.


Subjects
Extremities/physiology; Proteomics; Regeneration/genetics; Transcription Factors/genetics; Ambystoma mexicanum/genetics; Ambystoma mexicanum/physiology; Animals; DNA, Complementary/genetics; Gene Expression Regulation, Developmental; Humans; Kruppel-Like Factor 4; Transforming Growth Factor beta1/genetics
14.
J Biomed Inform ; 44(4): 536-44, 2011 Aug.
Article in English | MEDLINE | ID: mdl-21284958

ABSTRACT

Health social networking communities are emerging resources for translational research. We have designed and implemented a framework called HyGen, which combines Semantic Web technologies, graph algorithms and user profiling to discover and prioritize novel associations across disciplines. This manuscript focuses on the key strategies developed to overcome the challenges in handling patient-generated content in health social networking communities. Heuristic and quantitative evaluations were carried out in colorectal cancer. The results demonstrate the potential of our approach to bridge silos and to identify hidden links among clinical observations, drugs, genes and diseases. In Amyotrophic Lateral Sclerosis case studies, HyGen identified 15 of the 20 published disease genes. Additionally, HyGen highlighted new candidates for future investigations, as well as a scientifically meaningful connection between riluzole and alcohol abuse.


Subjects
Computational Biology/methods; Internet; Social Support; Translational Research, Biomedical/methods; Algorithms; Amyotrophic Lateral Sclerosis/genetics; Colorectal Neoplasms/genetics; Community Networks; Disease/genetics; Humans; Models, Theoretical; Semantics
15.
Artif Intell Med ; 49(3): 145-54, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20382004

ABSTRACT

OBJECTIVES: Biological research literature, as in many other domains of human endeavor, represents a rich, ever-growing source of knowledge. An important form of such biological knowledge consists of associations among biological entities such as genes, proteins, diseases, drugs and chemicals. There has been a considerable amount of recent research in extraction of various kinds of binary associations (e.g., gene-gene, gene-protein, protein-protein) using different text mining approaches. However, an important aspect of such associations (e.g., "gene A activates protein B") is identifying the context in which such associations occur (e.g., "gene A activates protein B in the context of disease C in organ D under the influence of chemical E"). Such contexts can be represented appropriately by a multi-way relationship involving more than two objects (e.g., objects A, B, C, D, E) rather than the usual binary relationship (objects A and B). METHODS: Such multi-way relations naturally lead to a hyper-graph representation of the knowledge rather than a binary graph. The hyper-graph based multi-way knowledge extraction from biological text literature represents a computationally difficult problem (due to its combinatorial nature) which has not received much attention from the Bioinformatics research community. In this paper, we describe and compare two different approaches to such multi-way hyper-graph extraction: one based on an exhaustive enumeration of all multi-way hyper-edges and the other based on an extension of the well-known A Priori algorithm for structured data to the case of unstructured textual data. We also present a representative graph based approach towards visualizing these genetic association hyper-graphs.
RESULTS: Two case studies are conducted for two biomedical problems (related to the diseases of lung cancer and colorectal cancer respectively), illustrating that the latter approach (using the text-based A Priori method) identifies the same hyper-edges as the former approach (the exhaustive method), but at a much lower computational cost. The extracted hyper-relations are presented in the paper as cognition-rich representative graphs, representing the corresponding hyper-graphs. CONCLUSIONS: The text-based A Priori algorithm is a practical, useful method to extract hyper-graphs representing multi-way associations among biological objects. These hyper-graphs and their visualization using representative graphs can provide important contextual information for understanding gene-gene associations relevant to specific diseases.
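The level-wise pruning that lets an A Priori style search avoid exhaustive enumeration can be sketched as follows. This is a generic frequent-itemset routine applied to sets of co-mentioned entities, offered as an assumption about the general technique rather than the paper's text-based variant; the entity names, support threshold and function name are hypothetical.

```python
from itertools import combinations


def apriori_hyperedges(documents, min_support):
    """Return {entity set: support} for sets of two or more entities
    that are co-mentioned in at least min_support text units."""

    def support(itemset):
        return sum(1 for doc in documents if itemset <= doc)

    entities = {entity for doc in documents for entity in doc}
    current = {frozenset([e]) for e in entities if support(frozenset([e])) >= min_support}
    frequent = {}
    size = 1
    while current:
        size += 1
        # Join step: merge frequent sets into candidates one entity larger.
        candidates = {a | b for a in current for b in current if len(a | b) == size}
        # Prune step: every (size - 1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, size - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
        for itemset in current:
            frequent[itemset] = support(itemset)
    return frequent


# Hypothetical entity sets, one per sentence or abstract.
documents = [
    {"geneA", "proteinB", "diseaseC"},
    {"geneA", "proteinB", "diseaseC", "organD"},
    {"geneA", "proteinB"},
    {"geneA", "organD"},
]
hyperedges = apriori_hyperedges(documents, min_support=2)
```

On this toy input the three-way hyper-edge (geneA, proteinB, diseaseC) survives, while candidates containing the infrequent pair (proteinB, organD) are discarded before their support is ever counted, which is where the savings over exhaustive enumeration come from.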


Subjects
Computer Graphics; Disease/genetics; Genome-Wide Association Study; Computational Biology; Humans
16.
BMC Biol ; 7: 83, 2009 Nov 30.
Article in English | MEDLINE | ID: mdl-19948009

ABSTRACT

BACKGROUND: Following amputation, urodele salamander limbs reprogram somatic cells to form a blastema that self-organizes into the missing limb parts to restore the structure and function of the limb. To help understand the molecular basis of blastema formation, we used quantitative label-free liquid chromatography-mass spectrometry/mass spectrometry (LC-MS/MS)-based methods to analyze changes in the proteome that occurred 1, 4 and 7 days post amputation (dpa) through the mid-tibia/fibula of axolotl hind limbs. RESULTS: We identified 309 unique proteins with significant fold change relative to controls (0 dpa), representing 10 biological process categories: (1) signaling, (2) Ca2+ binding and translocation, (3) transcription, (4) translation, (5) cytoskeleton, (6) extracellular matrix (ECM), (7) metabolism, (8) cell protection, (9) degradation, and (10) cell cycle. In all, 43 proteins exhibited exceptionally high fold changes. Of these, the ecotropic viral integrative factor 5 (EVI5), a cell cycle-related oncoprotein that prevents cells from entering the mitotic phase of the cell cycle prematurely, was of special interest because its fold change was exceptionally high throughout blastema formation. CONCLUSION: Our data were consistent with previous studies indicating the importance of inositol triphosphate and Ca2+ signaling in initiating the ECM and cytoskeletal remodeling characteristic of histolysis and cell dedifferentiation. In addition, the data suggested that blastema formation requires several mechanisms to avoid apoptosis, including reduced metabolism, differential regulation of proapoptotic and antiapoptotic proteins, and initiation of an unfolded protein response (UPR). Since there is virtually no mitosis during blastema formation, we propose that high levels of EVI5 function to arrest dedifferentiated cells somewhere in the G1/S/G2 phases of the cell cycle until they have accumulated under the wound epidermis and enter mitosis in response to neural and epidermal factors. Our findings indicate the general value of quantitative proteomic analysis in understanding the regeneration of complex structures.


Subjects
Ambystoma/physiology , Extremities/physiology , Proteomics , Regeneration/physiology , Surgical Amputation , Animals , Calcium Signaling/genetics , High-Pressure Liquid Chromatography , Extracellular Matrix/metabolism , Extremities/surgery , Inositol 1,4,5-Trisphosphate/metabolism , Peptide Mapping , Tandem Mass Spectrometry , Wound Healing
18.
Int J Data Min Bioinform ; 3(1): 40-54, 2009.
Article in English | MEDLINE | ID: mdl-19432375

ABSTRACT

This paper presents a user-centric biological query system for information integration and knowledge acquisition from distributed, semantically heterogeneous data sources. The proposed system, BioXBase, extracts user-requested query information over the internet from multiple biological sources and organises this information into a homogeneous, unified view for the user. The entire process is done in real time, on the fly. The BioXBase system improved the results retrieved by 30% compared to a system that has only a local database. The results improved by a further 20% when combined with those of a local database, making them more significant in the biological domain.


Subjects
Biology/methods , Database Management Systems , Factual Databases , Information Storage and Retrieval/methods , Internet , Natural Language Processing , User-Computer Interface , Computer Simulation , Theoretical Models , Systems Integration
19.
BMC Med Genomics ; 1: 39, 2008 Sep 11.
Article in English | MEDLINE | ID: mdl-18786252

ABSTRACT

BACKGROUND: Numerous studies have used microarrays to identify gene signatures for predicting cancer patient clinical outcome and responses to chemotherapy. However, the potential impact of gene expression profiling in cancer diagnosis, prognosis and development of personalized treatment may not be fully exploited due to the lack of consensus gene signatures and poor understanding of the underlying molecular mechanisms. METHODS: We developed a novel approach to derive gene signatures for breast cancer prognosis in the context of known biological pathways. Using unsupervised methods, cancer patients were separated into distinct groups based on gene expression patterns in one of the following pathways: apoptosis, cell cycle, angiogenesis, metastasis, p53, DNA repair, and several receptor-mediated signaling pathways including chemokines, EGF, FGF, HIF, MAP kinase, JAK and NF-kappaB. The survival probabilities were then compared between the patient groups to determine whether differential gene expression in a specific pathway is correlated with differential survival. RESULTS: Our results revealed that expression of cell cycle genes is strongly predictive of breast cancer outcomes. We further confirmed this observation by building a cell cycle gene signature model using supervised methods. Validated in multiple independent datasets, the cell cycle gene signature is a more accurate predictor of breast cancer clinical outcome than the previously identified Amsterdam 70-gene signature, which has been developed into an FDA-approved clinical test, MammaPrint. CONCLUSION: Taken together, the gene expression signature model we developed from well-defined pathways is not only a consistently powerful prognosticator but also mechanistically linked to cancer biology. Our approach provides an alternative to the current methodology of identifying gene expression markers for cancer prognosis and drug responses using whole-genome gene expression data.

20.
J Biomed Sci ; 15(3): 317-31, 2008 May.
Article in English | MEDLINE | ID: mdl-18204916

ABSTRACT

The cell division control protein (Cdc2) kinase is a catalytic subunit of a protein kinase complex, called the M phase promoting factor, which induces entry into mitosis and is universal among eukaryotes. This protein is believed to play a major role in cell division and control. The lives of biological cells are controlled by proteins interacting in metabolic and signaling pathways, in complexes that replicate genes and regulate gene activity, and in the assembly of the cytoskeletal infrastructure. Our knowledge of protein-protein (P-P) interactions has been accumulated from biochemical and genetic experiments, including the widely used yeast two-hybrid test. In this paper we examine whether P-P interactions in regenerating tissues and cells of the anuran Xenopus laevis can be discovered from biomedical literature using computational and literature mining techniques. Using literature mining techniques, we have identified a set of implicitly interacting proteins in regenerating tissues and cells of Xenopus laevis that may interact with Cdc2 to control cell division. Genome-sequence-based bioinformatics tools were then applied to validate a set of proteins that appear to interact with the Cdc2 protein. Pathway analysis of these proteins suggests that Myc proteins function as the regulator of M phase initiation by controlling expression of the Akt1 molecule that ultimately inhibits the Cdc2-cyclin B complex in cells. P-P interactions that appear implicitly in the literature can be effectively discovered using literature mining techniques. By applying evolutionary principles to the P-P interacting pairs, it is possible to quantitatively analyze the significance of the associations with biological relevance. The developed BioMap system allows discovering implicit P-P interactions from large quantities of biomedical literature data. The unique similarities and differences observed within the interacting proteins can lead to the development of new hypotheses that can be used to design further laboratory experiments.
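
The abstract above does not say how BioMap defines an "implicit" interaction. One common reading in literature mining, assumed here purely for illustration, is that a protein is an implicit partner of a target when the two are never mentioned together but each is mentioned with a shared intermediate protein. The sketch below implements only that assumption; the function names and the placeholder labels (`T`, `A`, `B`, ...) are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence(abstracts):
    """Map each protein to the proteins it shares at least one abstract with."""
    links = defaultdict(set)
    for mentions in abstracts:
        for a, b in combinations(sorted(set(mentions)), 2):
            links[a].add(b)
            links[b].add(a)
    return links


def implicit_partners(target, abstracts):
    """Proteins never co-mentioned with `target` but linked to it via a bridge protein."""
    links = cooccurrence(abstracts)
    direct = links.get(target, set())
    result = defaultdict(set)
    for bridge in direct:
        for other in links[bridge]:
            if other != target and other not in direct:
                result[other].add(bridge)
    # Report each implicit partner with the bridge proteins that connect it.
    return {p: sorted(bridges) for p, bridges in result.items()}


# Toy input: each set lists the proteins mentioned in one abstract (placeholders).
ABSTRACTS = [
    {"T", "A"},
    {"A", "B"},
    {"T", "C"},
    {"C", "B"},
    {"C", "D"},
    {"T", "D"},
]

print(implicit_partners("T", ABSTRACTS))  # {'B': ['A', 'C']}
```

Here `B` is reported because it shares abstracts with `A` and `C`, both of which are co-mentioned with `T`, while `D` is excluded because it already co-occurs with `T` directly.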


Subjects
Cell Cycle , Computational Biology , Proteins/metabolism , Animals , Protein Databases , Xenopus laevis
...