1.
IEEE Trans Knowl Data Eng ; 35(2): 1402-1420, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36798878

ABSTRACT

Shortening the time to knowledge discovery and adapting prior domain knowledge are challenges for computational and data-intensive communities such as bioinformatics and neuroscience. The challenge for a domain scientist lies in obtaining guidance by querying massive amounts of information from diverse text corpora comprising a wide-ranging set of topics when investigating new methods, developing new tools, or integrating datasets. In this paper, we propose a novel "domain-specific topic model" (DSTM) to discover latent knowledge patterns about relationships among research topics, tools and datasets from exemplary scientific domains. Our DSTM is a generative model that extends the Latent Dirichlet Allocation (LDA) model and uses the Markov chain Monte Carlo (MCMC) algorithm to infer latent patterns within a specific domain in an unsupervised manner. We apply our DSTM to large collections of data from the bioinformatics and neuroscience domains that include more than 25,000 papers from the last ten years, featuring hundreds of tools and datasets that are commonly used in relevant studies. Evaluation experiments based on generalization and information retrieval metrics show that our model performs better than state-of-the-art baseline models at discovering highly specific latent topics within a domain. Lastly, we demonstrate applications that benefit from our DSTM to discover intra-domain, cross-domain and trend knowledge patterns.

2.
Peer Peer Netw Appl ; 14(5): 3012-3028, 2021.
Article in English | MEDLINE | ID: mdl-33968293

ABSTRACT

Healthcare innovations increasingly rely on high-variety, standards-compliant (e.g., HIPAA, common data model) distributed data sets that enable predictive analytics. Consequently, health information systems need to be developed using cooperation and distributed trust principles to allow protected data sharing between multiple domains or entities (e.g., health data service providers, hospitals and research labs). In this paper, we present a novel health information sharing system, HonestChain, that uses Blockchain technology to allow organizations to have incentive-based and trustworthy cooperation to either access or provide protected healthcare records. More specifically, we use a consortium Blockchain approach coupled with chatbot-guided interfaces that allow data requesters to: (a) comply with data access standards, and (b) gain reputation in a consortium. We also propose a reputation scheme for creating and sustaining the consortium with peers using Requester Reputation and Provider Reputation metrics. We evaluate HonestChain using Hyperledger Composer in a realistic simulation testbed on a public cloud infrastructure. Our results show that HonestChain performs better than state-of-the-art requester reputation schemes for data request handling, while choosing the most appropriate provider peers. In particular, we show that HonestChain achieves a better tradeoff in metrics such as service time and request resubmission rate. We also demonstrate the scalability of our consortium platform in terms of Blockchain transaction times.

3.
Concurr Comput ; 33(19), 2021 Oct 10.
Article in English | MEDLINE | ID: mdl-35495546

ABSTRACT

Scientists in disciplines such as neuroscience and bioinformatics increasingly rely on science gateways for experimentation on voluminous data, as well as for analysis and visualization from multiple perspectives. Though current science gateways provide easy access to computing resources, datasets and tools specific to these disciplines, scientists often resort to slow and tedious manual knowledge discovery to accomplish their research and education tasks. Recommender systems can provide expert guidance and help them navigate and discover relevant publications, tools and data sets, or even automate cloud resource configurations suitable for a given scientific task. To realize the potential of integrating recommenders in science gateways and thereby spur research productivity, we present a novel recommender system, "OnTimeRecommend". OnTimeRecommend comprises several integrated recommender modules implemented as microservices that can be added to a science gateway in the form of a recommender-as-a-service. A chatbot plug-in, Vidura Advisor, guides users in the use of the recommender modules in a science gateway. To validate OnTimeRecommend, we integrate it into two exemplar science gateways, one in neuroscience (CyNeuro) and the other in bioinformatics (KBCommons), and show benefits for both novice and expert users in domain-specific knowledge discovery.

4.
Am J Disaster Med ; 14(2): 89-95, 2019.
Article in English | MEDLINE | ID: mdl-31637689

ABSTRACT

OBJECTIVE: Search and rescue after mass casualty incidents relies on robust data infrastructure. The Federal Emergency Management Agency's (FEMA) Task Force 1 (TF1) trains its volunteers to locate and virtually tag scene incidents using a Global Positioning System (GPS) device programmed with markers for each incident (Iron Sights™). The authors performed a pilot study comparing Iron Sights to a Wi-Fi-based real-time incident geolocation and virtual tagging dashboard (Panacea™) in creating a dynamic common operating picture (COP). DESIGN: Twenty-nine stations were placed at a predefined scene incident, each featuring a set of varying waypoint markers using standard FEMA/TF1 nomenclature. Two volunteers performed the experiment for both the Iron Sights and Panacea systems, digitally tagging all station waypoints. SETTING: TF1 simulation training field. MAIN OUTCOME MEASURE(S): Metrics compared included GPS location precision, marker accuracy, and delay between scene sweep and COP generation. RESULTS: Two hundred and sixty-one waypoints were digitally tagged after excluding three stations for missing data. The average GPS location difference for all waypoints between Iron Sights and Panacea was 3.65 m. Marker tagging accuracy did not differ significantly between Iron Sights and Panacea (78.8 percent vs 66.2 percent, respectively, p = 0.11). Waypoints were tagged in 26.59 minutes and 10.55 minutes on average, respectively. Time from scene sweep to virtual COP generation was 7.97 minutes for Iron Sights after complete scene sweep and 37 seconds for Panacea for each waypoint posting in real time. CONCLUSIONS: Panacea generated the COP in real time, compared to a delay with Iron Sights, while maintaining the same location precision and marker accuracy. This pilot trial successfully demonstrated the ability to provide real-time actionable intelligence to incident commanders during mass casualty search and rescue missions. Larger field trials are recommended to refine the system and broaden its capabilities.


Subjects
Computer Simulation; Disaster Planning/methods; Emergency Medical Services; Mass Casualty Incidents; Cataloging; Humans; Pilot Projects
5.
BMC Bioinformatics ; 17(Suppl 13): 337, 2016 Oct 06.
Article in English | MEDLINE | ID: mdl-27766951

ABSTRACT

BACKGROUND: With the advances in next-generation sequencing (NGS) technology and significant reductions in sequencing costs, it is now possible to sequence large collections of germplasm in crops to detect genome-scale genetic variations and to apply the knowledge towards improvements in traits. To efficiently facilitate large-scale NGS resequencing data analysis of genomic variations, we have developed "PGen", an integrated and optimized workflow using the Extreme Science and Engineering Discovery Environment (XSEDE) high-performance computing (HPC) virtual system, iPlant cloud data storage resources and the Pegasus workflow management system (Pegasus-WMS). The workflow allows users to identify single nucleotide polymorphisms (SNPs) and insertion-deletions (indels), perform SNP annotations and conduct copy number variation analyses on multiple resequencing datasets in a user-friendly and seamless way. RESULTS: We have developed both a Linux version on GitHub (https://github.com/pegasus-isi/PGen-GenomicVariations-Workflow) and a web-based implementation of the PGen workflow integrated within the Soybean Knowledge Base (SoyKB) (http://soykb.org/Pegasus/index.php). Using PGen, we identified 10,218,140 SNPs and 1,398,982 indels from analysis of 106 soybean lines sequenced at 15X coverage. From this analysis, 297,245 non-synonymous SNPs and 3,330 copy number variation (CNV) regions were identified. SNPs identified using PGen from additional soybean resequencing projects, totaling more than 500 soybean germplasm lines, have been integrated. These SNPs are being utilized for trait improvement using genotype-to-phenotype prediction approaches developed in-house. To make NGS data easy to browse and access, we have also developed an NGS resequencing data browser (http://soykb.org/NGS_Resequence/NGS_index.php) within SoyKB that gives soybean researchers access to SNP and downstream analysis results. CONCLUSION: The PGen workflow has been optimized for efficient analysis of soybean data through thorough testing and validation. This research serves as an example of best practices for developing genomics data analysis workflows by integrating remote HPC resources and efficient data management with ease of use for biological users. The PGen workflow can also be easily customized for analysis of data in other species.


Subjects
Genome, Plant; Glycine max/genetics; Polymorphism, Genetic; Sequence Analysis, DNA/methods; Software; Genomics/methods; High-Throughput Nucleotide Sequencing/methods; Workflow
...