1.
Front Res Metr Anal ; 7: 982435, 2022.
Article in English | MEDLINE | ID: mdl-36419537

ABSTRACT

With the Square Kilometre Array (SKA) project and the new Multi-Purpose Reactor (MPR) soon coming online, South Africa and other collaborating countries in Africa will need to make the management, analysis, publication, and curation of "Big Scientific Data" a priority. In addition, the recent draft Open Science policy from the South African Department of Science and Innovation (DSI) requires both Open Access to scholarly publications and research outputs, and an Open Data policy that facilitates equal opportunity of access to research data. The policy also endorses the deposit, discovery, and dissemination of data and metadata in a manner consistent with the FAIR principles - making data Findable, Accessible, Interoperable, and Re-usable. The challenge of achieving Open Science in Africa starts with open access for research publications and the provision of persistent links to the supporting data. With the deluge of research data expected from the new experimental facilities in South Africa, the problem of how to make such data FAIR takes center stage. One promising approach to making such scientific datasets more "Findable" and "Interoperable" is to rely on the Dataset representation of the Schema.org vocabulary, which has been endorsed by all the major search engines. The approach adds some semantic markup to Web pages and makes scientific datasets more "Findable" by search engines. This paper does not address all aspects of the Open Science agenda but instead focuses on the management and analysis challenges of the "Big Scientific Data" that will be produced by the SKA project. The paper summarizes the role of the SKA Regional Centres (SRCs) and then discusses the goal of ensuring reproducibility for the SKA data products. Experiments at the new MPR neutron source will also have to conform to the DSI's Open Science policy.
The Open Science and FAIR data practices used at the ISIS Neutron source at the Rutherford Appleton Laboratory in the UK are then briefly described. The paper concludes with some remarks about the important role of interdisciplinary teams of research software engineers, data engineers and research librarians in research data management.
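The Schema.org approach mentioned in the abstract above works by embedding a JSON-LD record of type `Dataset` in a dataset's landing page, which search engines can then crawl and index. A minimal sketch is shown below; the dataset name, description, URL, and keywords are illustrative placeholders, not real SKA data products.

```python
import json

def dataset_jsonld(name, description, url, keywords):
    """Build a minimal Schema.org Dataset record as a JSON-LD dictionary.

    Serialized and embedded in a <script type="application/ld+json"> tag
    on a dataset landing page, a record like this is what lets general
    search engines discover and index the dataset.
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "keywords": keywords,
    }

# Hypothetical example record; all field values are placeholders.
record = dataset_jsonld(
    name="Example SKA continuum image cube",
    description="Illustrative radio-astronomy data product for demonstration.",
    url="https://example.org/datasets/ska-demo",
    keywords=["radio astronomy", "SKA", "FAIR data"],
)
markup = json.dumps(record, indent=2)
print(markup)
```

Schema.org defines many further optional Dataset properties (license, creator, distribution, and so on) that richer records would normally include.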

2.
Patterns (N Y) ; 2(1): 100156, 2021 Jan 08.
Article in English | MEDLINE | ID: mdl-33511362

ABSTRACT

Digital technology is having a major impact on many areas of society, and there is equal opportunity for impact on science. This is particularly true in the environmental sciences as we seek to understand the complexities of the natural environment under climate change. This perspective presents the outcomes of a summit in this area, a unique cross-disciplinary gathering bringing together environmental scientists, data scientists, computer scientists, social scientists, and representatives of the creative arts. The key output of this workshop is an agreed vision in the form of a framework and associated roadmap, captured in the Windermere Accord. This accord envisions a new kind of environmental science underpinned by unprecedented amounts of data, with technological advances leading to breakthroughs in taming uncertainty and complexity, and also supporting openness, transparency, and reproducibility in science. The perspective also includes a call to build an international community working in this important area.

3.
Philos Trans A Math Phys Eng Sci ; 378(2166): 20190054, 2020 Mar 06.
Article in English | MEDLINE | ID: mdl-31955675

ABSTRACT

This paper reviews some of the challenges posed by the huge growth of experimental data generated by the new generation of large-scale experiments at UK national facilities at the Rutherford Appleton Laboratory (RAL) site at Harwell near Oxford. Such 'Big Scientific Data' comes from the Diamond Light Source and Electron Microscopy Facilities, the ISIS Neutron and Muon Facility, and the UK's Central Laser Facility. Increasingly, scientists are now required to use advanced machine learning and other AI technologies both to automate parts of the data pipeline and to help make new scientific discoveries in the analysis of their data. For commercially important applications, such as object recognition, natural language processing, and automatic translation, deep learning has made dramatic breakthroughs. Google's DeepMind has now used deep learning technology to develop its AlphaFold tool to make predictions for protein folding. Remarkably, it has been able to achieve some spectacular results for this specific scientific problem. Can deep learning be similarly transformative for other scientific problems? After a brief review of some initial applications of machine learning at RAL, we focus on challenges and opportunities for AI in advancing materials science. Finally, we discuss the importance of developing some realistic machine learning benchmarks using Big Scientific Data coming from several different scientific domains. We conclude with some initial examples of our 'scientific machine learning' benchmark suite and of the research challenges these benchmarks will enable. This article is part of a discussion meeting issue 'Numerical algorithms for high-performance computational science'.

4.
Harv Bus Rev ; 88(11): 56-63, 150, 2010 Nov.
Article in English | MEDLINE | ID: mdl-21049680

ABSTRACT

For decades, computer scientists have tried to teach computers to think like human experts. Until recently, most of those efforts failed to come close to generating the creative insights and solutions that seem to come naturally to the best researchers, doctors, and engineers. But now, Tony Hey, a VP of Microsoft Research, says we're witnessing the dawn of a new generation of powerful computer tools that can "mash up" vast quantities of data from many sources, analyze them, and help produce revolutionary scientific discoveries. Hey and his colleagues call this new method of scientific exploration "machine learning." At Microsoft, a team has already used it to develop a method of predicting with impressive accuracy whether a patient with congestive heart failure who is released from the hospital will be readmitted within 30 days. It was developed by directing a computer program to pore through hundreds of thousands of data points on 300,000 patients and "learn" the profiles of patients most likely to be rehospitalized. The economic impact of this prediction tool could be huge: if a hospital understands the likelihood that a patient will "bounce back," it can design programs to keep the patient stable and save thousands of dollars in health care costs. Similar efforts to uncover important correlations that could lead to scientific breakthroughs are under way in oceanography, conservation, and AIDS research. And in business, deep data exploration has the potential to unearth critical insights about customers, supply chains, advertising effectiveness, and more.


Subject(s)
Data Mining , Diffusion of Innovation , Science , Statistics as Topic , United States
5.
Science ; 323(5919): 1297-8, 2009 Mar 06.
Article in English | MEDLINE | ID: mdl-19265007
6.
Science ; 308(5723): 817-21, 2005 May 06.
Article in English | MEDLINE | ID: mdl-15879209

ABSTRACT

Here we describe the requirements of an e-Infrastructure to enable faster, better, and different scientific research capabilities. We use two application exemplars taken from the United Kingdom's e-Science Programme to illustrate these requirements and make the case for a service-oriented infrastructure. We provide a brief overview of the UK "plug-and-play composable services" vision and the role of semantics in such an e-Infrastructure.


Subject(s)
Computational Biology , Computer Communication Networks , Computing Methodologies , Internet , Research , Software , Chemical Phenomena , Chemistry , Combinatorial Chemistry Techniques , Databases as Topic , Graves Disease/genetics , Williams Syndrome/genetics
7.
Philos Trans A Math Phys Eng Sci ; 361(1809): 1809-25, 2003 Aug 15.
Article in English | MEDLINE | ID: mdl-12952686

ABSTRACT

After a definition of e-science and the Grid, the paper begins with an overview of the technological context of Grid developments. NASA's Information Power Grid is described as an early example of a 'prototype production Grid'. The discussion of e-science and the Grid is then set in the context of the UK e-Science Programme and is illustrated with reference to some UK e-science projects in science, engineering and medicine. The Open Standards approach to Grid middleware adopted by the community in the Global Grid Forum is described and compared with community-based standardization processes used for the Internet, MPI, Linux and the Web. Some implications of the imminent data deluge that will arise from the new generation of e-science experiments in terms of archiving and curation are then considered. The paper concludes with remarks about social and technological issues posed by Grid-enabled 'collaboratories' in both scientific and commercial contexts.


Subject(s)
Archives , Database Management Systems , Databases, Factual , Information Dissemination/methods , Information Storage and Retrieval/methods , International Cooperation , Science/methods , Information Storage and Retrieval/standards , Research , Science/standards , Science/trends , United States , United States National Aeronautics and Space Administration