1.
bioRxiv ; 2024 Jun 29.
Article in English | MEDLINE | ID: mdl-38979215

ABSTRACT

Tight control over cell identity gene expression is necessary for proper adult form and function. The opposing activities of Polycomb and trithorax complexes determine the ON/OFF state of targets like the Hox genes. Trithorax encodes a methyltransferase specific to histone H3 lysine-4 (H3K4). However, there is no direct evidence that H3K4 regulates Polycomb group target genes in vivo. Here, we demonstrate two key roles for replication-dependent histone H3.2K4 in target control. We find that H3.2K4 antagonizes Polycomb group catalytic activity and that it is required for proper target gene activation. We conclude that H3.2K4 directly regulates expression of Polycomb targets.

2.
J Biol Chem ; : 107527, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38960040

ABSTRACT

In an unmodified state, positively charged histone N-terminal tails engage nucleosomal DNA in a manner which restricts access to not only the underlying DNA, but also key tail residues subject to binding and/or modification. Charge-neutralizing modifications, such as histone acetylation, serve to disrupt this DNA-tail interaction, facilitating access to such residues. We previously showed that a polyacetylation-mediated chromatin "switch" governs the read-write capability of H3K4me3 by the MLL1 methyltransferase complex. Here, we discern the relative contributions of site-specific acetylation states along the H3 tail and extend our interrogation to other chromatin modifiers. We show that the contributions of H3 tail acetylation to H3K4 methylation by MLL1 are highly variable, with H3K18 and H3K23 acetylation exhibiting robust stimulatory effects, and that this extends to the related H3K4 methyltransferase complex, MLL4. We show that H3K4me1 and H3K4me3 are found preferentially co-enriched with H3 N-terminal tail proteoforms bearing dual H3K18 and H3K23 acetylation (H3{K18acK23ac}). We further show that this effect is specific to H3K4 methylation, while methyltransferases targeting other H3 tail residues (H3K9, H3K27, & H3K36), a methyltransferase targeting the nucleosome core (H3K79), and a kinase targeting a residue directly adjacent to H3K4 (H3T3) are insensitive to tail acetylation. Together, these findings indicate a unique and robust stimulation of H3K4 methylation by H3K18 and H3K23 acetylation and provide key insight into why H3K4 methylation is often associated with histone acetylation in the context of active gene expression.

3.
bioRxiv ; 2024 May 14.
Article in English | MEDLINE | ID: mdl-38798640

ABSTRACT

In an unmodified state, positively charged histone N-terminal tails engage nucleosomal DNA in a manner which restricts access to not only the underlying DNA, but also key tail residues subject to binding and/or modification. Charge-neutralizing modifications, such as histone acetylation, serve to disrupt this DNA-tail interaction, facilitating access to such residues. We previously showed that a polyacetylation-mediated chromatin "switch" governs the read-write capability of H3K4me3 by the MLL1 methyltransferase complex. Here, we discern the relative contributions of site-specific acetylation states along the H3 tail and extend our interrogation to other chromatin modifiers. We show that the contributions of H3 tail acetylation to H3K4 methylation by MLL1 are highly variable, with H3K18 and H3K23 acetylation exhibiting robust stimulatory effects, and that this extends to the related H3K4 methyltransferase complex, MLL4. We show that H3K4me1 and H3K4me3 are found preferentially co-enriched with H3 N-terminal tail proteoforms bearing dual H3K18 and H3K23 acetylation (H3{K18acK23ac}). We further show that this effect is specific to H3K4 methylation, while methyltransferases targeting other H3 tail residues (H3K9, H3K27, & H3K36), a methyltransferase targeting the nucleosome core (H3K79), and a kinase targeting a residue directly adjacent to H3K4 (H3T3) are insensitive to tail acetylation. Together, these findings indicate a unique and robust stimulation of H3K4 methylation by H3K18 and H3K23 acetylation and provide key insight into why H3K4 methylation is often associated with histone acetylation in the context of active gene expression.

5.
Elife ; 12, 2023 05 19.
Article in English | MEDLINE | ID: mdl-37204295

ABSTRACT

In nucleosomes, histone N-terminal tails exist in dynamic equilibrium between free/accessible and collapsed/DNA-bound states. The latter state is expected to impact histone N-termini availability to the epigenetic machinery. Notably, H3 tail acetylation (e.g. K9ac, K14ac, K18ac) is linked to increased H3K4me3 engagement by the BPTF PHD finger, but it is unknown if this mechanism has a broader extension. Here, we show that H3 tail acetylation promotes nucleosomal accessibility to other H3K4 methyl readers, and importantly, extends to H3K4 writers, notably methyltransferase MLL1. This regulation is not observed on peptide substrates yet occurs on the cis H3 tail, as determined with fully-defined heterotypic nucleosomes. In vivo, H3 tail acetylation is directly and dynamically coupled with cis H3K4 methylation levels. Together, these observations reveal an acetylation 'chromatin switch' on the H3 tail that modulates read-write accessibility in nucleosomes and resolves the long-standing question of why H3K4me3 levels are coupled with H3 acetylation.


Subjects
Chromatin, Histones, Histones/metabolism, Nucleosomes, Methylation, Acetylation
6.
Earth Space Sci ; 9(11): e2022EA002343, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36583191

ABSTRACT

Nowcasting is a term originating from economics, finance, and meteorology. It refers to the process of determining the uncertain state of the economy, markets or the weather at the current time by indirect means. In this paper, we describe a simple two-parameter data analysis that reveals hidden order in otherwise seemingly chaotic earthquake seismicity. One of these parameters relates to a mechanism of seismic quiescence arising from the physics of strain-hardening of the crust prior to major events. We observe an earthquake cycle associated with major earthquakes in California, similar to what has long been postulated. An estimate of the earthquake hazard revealed by this state variable time series can be optimized by the use of machine learning in the form of the Receiver Operating Characteristic skill score. The ROC skill is used here as a loss function in a supervised learning mode. Our analysis is conducted in the region of 5° × 5° in latitude-longitude centered on Los Angeles, a region which we used in previous papers to build similar time series using more involved methods (Rundle & Donnellan, 2020, https://doi.org/10.1029/2020EA001097; Rundle, Donnellan et al., 2021, https://doi.org/10.1029/2021EA001757; Rundle, Stein et al., 2021, https://doi.org/10.1088/1361-6633/abf893). Here we show that not only does the state variable time series have forecast skill, but the associated spatial probability densities have skill as well. In addition, use of the standard ROC and Precision (PPV) metrics allows probabilities of current earthquake hazard to be defined in a simple, straightforward, and rigorous way.
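
The ROC and precision (PPV) metrics named in this abstract can be illustrated with a minimal sketch. The abstract does not give the authors' exact skill-score definition, so the code below assumes the usual area under the ROC curve, where 0.5 corresponds to no skill. The function names and any data passed to them are hypothetical and are not taken from the paper.

```python
def roc_points(scores, labels):
    """Sweep a decision threshold over the scores and collect (FPR, TPR) points.

    Assumes labels are 0/1 and that both classes are present.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing is flagged
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a list of (x, y) points ordered by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def precision_at(scores, labels, threshold):
    """Precision (PPV): fraction of flagged samples that are true events."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    return tp / (tp + fp) if (tp + fp) else 0.0
```

In this reading, `scores` would be values of the state variable and `labels` would mark whether a large event followed within some chosen window; both choices are assumptions made for illustration.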

7.
Front Physiol ; 12: 667828, 2021.
Article in English | MEDLINE | ID: mdl-34248661

ABSTRACT

In many mechanistic medical, biological, physical, and engineered spatiotemporal dynamic models the numerical solution of partial differential equations (PDEs), especially for diffusion, fluid flow and mechanical relaxation, can make simulations impractically slow. Biological models of tissues and organs often require the simultaneous calculation of the spatial variation of concentration of dozens of diffusing chemical species. One clinical example where rapid calculation of a diffusing field is of use is the estimation of oxygen gradients in the retina, based on imaging of the retinal vasculature, to guide surgical interventions in diabetic retinopathy. Furthermore, the ability to predict blood perfusion and oxygenation may one day guide clinical interventions in diverse settings, i.e., from stent placement in treating heart disease to BOLD fMRI interpretation in evaluating cognitive function (Xie et al., 2019; Lee et al., 2020). Since the quasi-steady-state solutions required for fast-diffusing chemical species like oxygen are particularly computationally costly, we consider the use of a neural network to provide an approximate solution to the steady-state diffusion equation. Machine learning surrogates, neural networks trained to provide approximate solutions to such complicated numerical problems, can often provide speed-ups of several orders of magnitude compared to direct calculation. Surrogates of PDEs could enable use of larger and more detailed models than are possible with direct calculation and can make including such simulations in real-time or near-real time workflows practical. Creating a surrogate requires running the direct calculation tens of thousands of times to generate training data and then training the neural network, both of which are computationally expensive. 
Often the practical applications of such models require thousands to millions of replica simulations, for example for parameter identification and uncertainty quantification, each of which gains speed from surrogate use and rapidly recovers the up-front costs of surrogate generation. We use a Convolutional Neural Network to approximate the stationary solution to the diffusion equation in the case of two equal-diameter, circular, constant-value sources located at random positions in a two-dimensional square domain with absorbing boundary conditions. Such a configuration caricatures the chemical concentration field of a fast-diffusing species like oxygen in a tissue with two parallel blood vessels in a cross section perpendicular to the two blood vessels. To improve convergence during training, we apply a training approach that uses roll-back to reject stochastic changes to the network that increase the loss function. The trained neural network approximation is about 1000 times faster than the direct calculation for individual replicas. Because different applications will have different criteria for acceptable approximation accuracy, we discuss a variety of loss functions and accuracy estimators that can help select the best network for a particular application. We briefly discuss some of the issues we encountered with overfitting, mismapping of the field values and the geometrical conditions that lead to large absolute and relative errors in the approximate solution.
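
The "direct calculation" that the surrogate in this abstract replaces can be sketched for the configuration described: two equal-diameter circular constant-value sources in a square domain with absorbing boundaries. The abstract does not say which solver the authors used, so the Jacobi relaxation below is an assumption, as are the grid size, tolerance, and function name.

```python
def solve_steady_state(n, sources, radius, source_value=1.0,
                       tol=1e-6, max_iter=20000):
    """Relax an n x n grid towards the steady-state diffusion solution.

    sources: list of (row, col) centres of circular constant-value sources.
    The outer ring of cells is held at 0 (absorbing boundary).
    """
    field = [[0.0] * n for _ in range(n)]
    fixed = [[False] * n for _ in range(n)]
    # Pin interior cells that fall inside any source disc.
    for ci, cj in sources:
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                if (i - ci) ** 2 + (j - cj) ** 2 <= radius ** 2:
                    fixed[i][j] = True
                    field[i][j] = source_value
    for _ in range(max_iter):
        new = [row[:] for row in field]
        max_change = 0.0
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                if fixed[i][j]:
                    continue
                # Jacobi update: each free cell becomes the mean of its neighbours.
                new[i][j] = 0.25 * (field[i - 1][j] + field[i + 1][j]
                                    + field[i][j - 1] + field[i][j + 1])
                max_change = max(max_change, abs(new[i][j] - field[i][j]))
        field = new
        if max_change < tol:
            break  # the field has stopped changing to within the tolerance
    return field
```

Running a solver like this many times with random source positions is one way the training pairs for such a surrogate could be produced; the repeated sweeps over the grid are what make the direct route slow.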

8.
BMC Med Inform Decis Mak ; 21(Suppl 3): 51, 2021 02 24.
Article in English | MEDLINE | ID: mdl-33627109

ABSTRACT

BACKGROUND: In this work, we aimed to demonstrate how to utilize lab test results and other clinical information to support precision medicine research and clinical decisions on complex diseases, with the support of electronic medical record facilities. We defined "clinotypes" as clinical information that could be observed and measured objectively using biomedical instruments. From well-known 'omic' problem definitions, we defined problems using clinotype information, including stratifying patients (identifying sub-cohorts of interest for future studies), mining significant associations between clinotypes and specific phenotypes (diseases), and discovering potential linkages between clinotype and genomic information. We solved these problems by integrating public omic databases and applying advanced machine learning and visual analytic techniques on two-year health exam records from a large population of healthy southern Chinese individuals (size n = 91,354). When developing the solution, we carefully addressed the missing information, imbalance and non-uniform data annotation issues. RESULTS: We organized the techniques and solutions to address the problems and issues above into the CPA framework (Clinotype Prediction and Association-finding). At the data preprocessing step, we handled the missing value issue with a prediction accuracy of 0.760. We curated 12,635 clinotype-gene associations. We found 147 associations between 147 chronic disease phenotypes and clinotypes, which improved the disease predictive performance to an average AUC of 0.967. We mined 182 significant clinotype-clinotype associations among 69 clinotypes. CONCLUSIONS: Our results showed strong potential connectivity between the omics information and the clinical lab test information. The results further emphasized the need to utilize and integrate clinical information, especially lab test results, in future PheWAS and omic studies. Furthermore, they showed that the clinotype information could initiate an alternative research direction and serve as an independent field of data to support the well-known 'phenome' and 'genome' research.


Subjects
Electronic Health Records, Genotype, Humans, Phenotype, Physical Examination
9.
Front Big Data ; 4: 756041, 2021.
Article in English | MEDLINE | ID: mdl-35198971

ABSTRACT

Data-intensive applications are becoming commonplace in all science disciplines. They comprise a rich set of sub-domains such as data engineering, deep learning, and machine learning. These applications are built around efficient data abstractions and operators that suit the applications of different domains. Often, the lack of a clear definition of data structures and operators in the field has led to other implementations that do not work well together. The HPTMT architecture that we proposed recently identifies a set of data structures, operators, and an execution model for creating rich data applications that link all aspects of data engineering and data science together efficiently. This paper elaborates and illustrates this architecture using an end-to-end application with deep learning and data engineering parts working together. Our analysis shows that the proposed system architecture is better suited for high performance computing environments compared to the current big data processing systems. Furthermore, our proposed system emphasizes the importance of efficient compact data structures such as the Apache Arrow tabular data representation defined for high performance. Thus, the system integration we proposed scales a sequential computation to a distributed computation, retaining optimum performance along with a highly usable application programming interface.

10.
Appl Environ Microbiol ; 82(16): 4921-30, 2016 08 15.
Article in English | MEDLINE | ID: mdl-27260357

ABSTRACT

Arbuscular mycorrhizal (AM) fungi form mutualisms with plant roots that increase plant growth and shape plant communities. Each AM fungal cell contains a large amount of genetic diversity, but it is unclear if this diversity varies across evolutionary lineages. We found that sequence variation in the nuclear large-subunit (LSU) rRNA gene from 29 isolates representing 21 AM fungal species generally assorted into genus- and species-level clades, with the exception of species of the genera Claroideoglomus and Entrophospora. However, there were significant differences in the levels of sequence variation across the phylogeny and between genera, indicating that it is an evolutionarily constrained trait in AM fungi. These consistent patterns of sequence variation across both phylogenetic and taxonomic groups pose challenges to interpreting operational taxonomic units (OTUs) as approximations of species-level groups of AM fungi. We demonstrate that the OTUs produced by five sequence clustering methods using 97% or equivalent sequence similarity thresholds failed to match the expected species of AM fungi, although OTUs from AbundantOTU, CD-HIT-OTU, and CROP corresponded better to species than did OTUs from mothur or UPARSE. This lack of OTU-to-species correspondence resulted both from sequences of one species being split into multiple OTUs and from sequences of multiple species being lumped into the same OTU. The OTU richness therefore will not reliably correspond to the AM fungal species richness in environmental samples. Conservatively, this error can overestimate species richness by 4-fold or underestimate richness by one-half, and the direction of this error will depend on the genera represented in the sample. IMPORTANCE: Arbuscular mycorrhizal (AM) fungi form important mutualisms with the roots of most plant species. Individual AM fungi are genetically diverse, but it is unclear whether the level of this diversity differs among evolutionary lineages.
We found that the amount of sequence variation in an rRNA gene that is commonly used to identify AM fungal species varied significantly between evolutionary groups that correspond to different genera, with the exception of two genera that are genetically indistinguishable from each other. When we clustered groups of similar sequences into operational taxonomic units (OTUs) using five different clustering methods, these patterns of sequence variation caused the number of OTUs to either over- or underestimate the actual number of AM fungal species, depending on the genus. Our results indicate that OTU-based inferences about AM fungal species composition from environmental sequences can be improved if they take these taxonomically structured patterns of sequence variation into account.


Subjects
Fungal Genes, rRNA Genes, Mycorrhizae/genetics, Phylogeny, Cluster Analysis, Genetic Variation, Mycorrhizae/classification, DNA Sequence Analysis
11.
Bioinformatics ; 32(16): 2502-4, 2016 08 15.
Article in English | MEDLINE | ID: mdl-27153595

ABSTRACT

MGEScan-long terminal repeat (LTR) and MGEScan-non-LTR are successfully used programs for identifying LTRs and non-LTR retrotransposons in eukaryotic genome sequences. However, these programs are not supported by easy-to-use interfaces, nor are they well suited for data visualization in general data formats. Here, we present MGEScan, a user-friendly system that combines these two programs with a Galaxy workflow system accelerated with MPI and Python threading on compute clusters. MGEScan and Galaxy empower researchers to identify transposable elements in a graphical user interface with ready-to-use workflows. MGEScan also visualizes the custom annotation tracks for mobile genetic elements in public genome browsers. A maximum speed-up of 3.26× is attained for execution time using concurrent processing and MPI on four virtual cores. MGEScan provides four operational modes: as a command line tool, as a Galaxy Toolshed, on a Galaxy-based web server, and on a virtual cluster on the Amazon cloud. AVAILABILITY AND IMPLEMENTATION: MGEScan tutorials and source code are available at http://mgescan.readthedocs.org/ CONTACT: hatang@indiana.edu or syoh@ajou.ac.kr SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms, Programming Languages, Retroelements, Computational Biology/methods, Genome, Software, Systems Integration
12.
OMICS ; 18(1): 10-4, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24456465

ABSTRACT

Biological processes are fundamentally driven by complex interactions between biomolecules. Integrated high-throughput omics studies enable multifaceted views of cells, organisms, or their communities. With the advent of new post-genomics technologies, omics studies are becoming increasingly prevalent; yet the full impact of these studies can only be realized through data harmonization, sharing, meta-analysis, and integrated research. These essential steps require consistent generation, capture, and distribution of metadata. To ensure transparency, facilitate data harmonization, and maximize reproducibility and usability of life sciences studies, we propose a simple common omics metadata checklist. The proposed checklist is built on the rich ontologies and standards already in use by the life sciences community. The checklist will serve as a common denominator to guide experimental design, capture important parameters, and be used as a standard format for stand-alone data publications. The omics metadata checklist and data publications will create efficient linkages between omics data and knowledge-based life sciences innovation and, importantly, allow for appropriate attribution to data generators and infrastructure science builders in the post-genomics era. We ask that the life sciences community test the proposed omics metadata checklist and data publications and provide feedback for their use and improvement.


Subjects
Information Dissemination/ethics, Metagenomics/statistics & numerical data, Research Design/standards, Data Mining, Humans, Metagenomics/economics, Metagenomics/trends, Publishing, Reproducibility of Results
14.
Big Data ; 1(4): 196-201, 2013 Dec.
Article in English | MEDLINE | ID: mdl-27447251

ABSTRACT

Biological processes are fundamentally driven by complex interactions between biomolecules. Integrated high-throughput omics studies enable multifaceted views of cells, organisms, or their communities. With the advent of new post-genomics technologies, omics studies are becoming increasingly prevalent; yet the full impact of these studies can only be realized through data harmonization, sharing, meta-analysis, and integrated research. These essential steps require consistent generation, capture, and distribution of metadata. To ensure transparency, facilitate data harmonization, and maximize reproducibility and usability of life sciences studies, we propose a simple common omics metadata checklist. The proposed checklist is built on the rich ontologies and standards already in use by the life sciences community. The checklist will serve as a common denominator to guide experimental design, capture important parameters, and be used as a standard format for stand-alone data publications. The omics metadata checklist and data publications will create efficient linkages between omics data and knowledge-based life sciences innovation and, importantly, allow for appropriate attribution to data generators and infrastructure science builders in the post-genomics era. We ask that the life sciences community test the proposed omics metadata checklist and data publications and provide feedback for their use and improvement.

15.
BMC Bioinformatics ; 13 Suppl 2: S9, 2012 Mar 13.
Article in English | MEDLINE | ID: mdl-22536872

ABSTRACT

BACKGROUND: Modern pyrosequencing techniques make it possible to study complex bacterial populations, such as 16S rRNA, directly from environmental or clinical samples without the need for laboratory purification. Alignment of sequences across the resultant large data sets (100,000+ sequences) is of particular interest for the purpose of identifying potential gene clusters and families, but such analysis represents a daunting computational task. The aim of this work is the development of an efficient pipeline for the clustering of large sequence read sets. METHODS: Pairwise alignment techniques are used here to calculate genetic distances between sequence pairs. These methods are pleasingly parallel and have been shown to reflect genetic distances in highly variable regions of rRNA genes more accurately than do traditional multiple sequence alignment (MSA) approaches. By utilizing Needleman-Wunsch (NW) pairwise alignment in conjunction with novel implementations of interpolative multidimensional scaling (MDS), we have developed an effective method for visualizing massive biosequence data sets and quickly identifying potential gene clusters. RESULTS: This study demonstrates the use of interpolative MDS to obtain clustering results that are qualitatively similar to those obtained through full MDS, but with substantial cost savings. In particular, the wall clock time required to cluster a set of 100,000 sequences has been reduced from seven hours to less than one hour through the use of interpolative MDS. CONCLUSIONS: Although work remains to be done in selecting the optimal training set size for interpolative MDS, substantial computational cost savings will allow us to cluster much larger sequence sets in the future.


Subjects
Metagenomics/methods, DNA Sequence Analysis/methods, Algorithms, Cluster Analysis, 16S Ribosomal RNA/genetics, Sequence Alignment
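
The Needleman-Wunsch pairwise alignment named in the abstract above can be sketched as a short dynamic program. The scoring values (match +1, mismatch -1, gap -1) and the helper `pairwise_scores` are assumptions chosen for illustration; the abstract does not state the authors' scoring scheme or how scores are converted to distances.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty string costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Best of: align the two symbols, gap in b, gap in a.
            score[i][j] = max(diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap)
    return score[-1][-1]


def pairwise_scores(seqs, **scoring):
    """Symmetric matrix of alignment scores; each pair is computed independently."""
    n = len(seqs)
    scores = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            scores[i][j] = scores[j][i] = needleman_wunsch(seqs[i], seqs[j], **scoring)
    return scores
```

Because no pair depends on any other pair, the loop in `pairwise_scores` could be split across workers without coordination, which is the sense in which the abstract calls these methods pleasingly parallel.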
16.
In Silico Biol ; 11(1-2): 41-60, 2011.
Article in English | MEDLINE | ID: mdl-22475751

ABSTRACT

Some of the latest trends in cheminformatics, computation, and the world wide web are reviewed with predictions of how these are likely to impact the field of cheminformatics in the next five years. The vision and some of the work of the Chemical Informatics and Cyberinfrastructure Collaboratory at Indiana University are described, which we base around the core concepts of e-Science and cyberinfrastructure that have proven successful in other fields. Our chemical informatics cyberinfrastructure is realized by building a flexible, generic infrastructure for cheminformatics tools and databases, exporting "best of breed" methods as easily-accessible web APIs for cheminformaticians, scientists, and researchers in other disciplines, and hosting a unique chemical informatics education program aimed at scientists and cheminformatics practitioners in academia and industry.


Subjects
Chemistry/education, Factual Databases, Informatics/education, Internet, Cooperative Behavior, Software, Universities
17.
Curr Comput Aided Drug Des ; 6(1): 50-67, 2010 Mar.
Article in English | MEDLINE | ID: mdl-20370695

ABSTRACT

In recent years, there has been an explosion in the availability of publicly accessible chemical information, including chemical structures of small molecules, structure-derived properties and associated biological activities in a variety of assays. These data sources present us with a significant opportunity to develop and apply computational tools to extract and understand the underlying structure-activity relationships. Furthermore, by integrating chemical data sources with biological information (protein structure, gene expression and so on), we can attempt to build up a holistic view of the effects of small molecules in biological systems. Equally important is the ability for non-experts to access and utilize state-of-the-art cheminformatics methods and models. In this review we present recent developments in cheminformatics methodologies and infrastructure that provide a robust, distributed approach to mining large and complex chemical datasets. In the area of methodology development, we highlight recent work on characterizing structure-activity landscapes, Quantitative Structure Activity Relationship (QSAR) model domain applicability and the use of chemical similarity in text mining. In the area of infrastructure, we discuss a distributed web services framework that allows easy deployment and uniform access to computational (statistics, cheminformatics and computational chemistry) methods, data and models. We also discuss the development of PubChem derived databases and highlight techniques that allow us to scale the infrastructure to extremely large compound collections, by use of distributed processing on Grids. Given that the above work is applicable to arbitrary types of cheminformatics problems, we also present some case studies related to virtual screening for anti-malarials and predictions of anti-cancer activity.


Subjects
Data Mining/trends, Factual Databases/trends, Informatics/trends, Chemical Models, Animals, Data Mining/methods, Humans, Informatics/methods, Quantitative Structure-Activity Relationship
18.
BMC Bioinformatics ; 11 Suppl 12: S3, 2010 Dec 21.
Article in English | MEDLINE | ID: mdl-21210982

ABSTRACT

BACKGROUND: Clouds and MapReduce have shown themselves to be a broadly useful approach to scientific computing especially for parallel data intensive applications. However they have limited applicability to some areas such as data mining because MapReduce has poor performance on problems with an iterative structure present in the linear algebra that underlies much data analysis. Such problems can be run efficiently on clusters using MPI leading to a hybrid cloud and cluster environment. This motivates the design and implementation of an open source Iterative MapReduce system Twister. RESULTS: Comparisons of Amazon, Azure, and traditional Linux and Windows environments on common applications have shown encouraging performance and usability comparisons in several important non iterative cases. These are linked to MPI applications for final stages of the data analysis. Further we have released the open source Twister Iterative MapReduce and benchmarked it against basic MapReduce (Hadoop) and MPI in information retrieval and life sciences applications. CONCLUSIONS: The hybrid cloud (MapReduce) and cluster (MPI) approach offers an attractive production environment while Twister promises a uniform programming environment for many Life Sciences applications. METHODS: We used commercial clouds Amazon and Azure and the NSF resource FutureGrid to perform detailed comparisons and evaluations of different approaches to data intensive computing. Several applications were developed in MPI, MapReduce and Twister in these different environments.


Subjects
Computational Biology/methods, Software, Biological Science Disciplines, Cluster Analysis, Data Mining, Metagenomics
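
The iterative structure that the abstract above says basic MapReduce handles poorly can be sketched as a map step and a reduce step repeated until convergence. One-dimensional k-means is used here purely as an illustrative iterative problem; that choice is an assumption, not something the abstract names, and the functions are plain Python rather than the Twister or Hadoop API.

```python
def map_phase(points, centroids):
    """Map: emit (index of nearest centroid, point) for every point."""
    return [
        (min(range(len(centroids)), key=lambda k: abs(p - centroids[k])), p)
        for p in points
    ]


def reduce_phase(pairs, centroids):
    """Reduce: average the points assigned to each centroid index."""
    totals = {}
    for k, p in pairs:
        total, count = totals.get(k, (0.0, 0))
        totals[k] = (total + p, count + 1)
    # A centroid that received no points keeps its previous position.
    return [
        totals[k][0] / totals[k][1] if k in totals else centroids[k]
        for k in range(len(centroids))
    ]


def iterative_kmeans(points, centroids, max_iter=100, tol=1e-9):
    """Repeat map + reduce until the centroids stop moving."""
    for _ in range(max_iter):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        if max(abs(u - c) for u, c in zip(updated, centroids)) < tol:
            return updated
        centroids = updated
    return centroids
```

Each pass needs the centroids produced by the previous pass, so a framework that treats every pass as a fresh job tends to pay its start-up and data-loading cost on every iteration.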
19.
J Chem Inf Model ; 47(4): 1303-7, 2007.
Article in English | MEDLINE | ID: mdl-17602467

ABSTRACT

The vast increase of pertinent information available to drug discovery scientists means that there is a strong demand for tools and techniques for organizing and intelligently mining this information for manageable human consumption. At Indiana University, we have developed an infrastructure of chemoinformatics Web services that simplifies the access to this information and the computational techniques that can be applied to it. In this paper, we describe this infrastructure, give some examples of its use, and then discuss our plans to use it as a platform for chemoinformatics application development in the future.


Subjects
Informatics, Internet, Database Management Systems, Programming Languages
20.
Philos Trans A Math Phys Eng Sci ; 363(1833): 1757-73, 2005 Aug 15.
Article in English | MEDLINE | ID: mdl-16099746

ABSTRACT

Grid application frameworks have increasingly aligned themselves with the developments in Web services. Web services are currently the most popular infrastructure based on service-oriented architecture (SOA) paradigm. There are three core areas within the SOA framework: (i) a set of capabilities that are remotely accessible, (ii) communications using messages and (iii) metadata pertaining to the aforementioned capabilities. In this paper, we focus on issues related to the messaging substrate hosting these services; we base these discussions on the NARADABROKERING system. We outline strategies to leverage capabilities available within the substrate without the need to make any changes to the service implementations themselves. We also identify the set of services needed to build Grids of Grids. Finally, we discuss another technology, HPSEARCH, which facilitates the administration of the substrate and the deployment of applications via a scripting interface. These issues have direct relevance to scientific Grid applications, which need to go beyond remote procedure calls in client-server interactions to support integrated distributed applications that couple databases, high performance computing codes and visualization codes.


Subjects
Computer Simulation, Informatics/methods, Information Storage and Retrieval/methods, Internet, Mathematical Computing, Theoretical Models, Research Design, Software, Software Design, Systems Integration, United States, User-Computer Interface