Search | VHL Regional Portal

Correction: Seek and you may (not) find: A multi-institutional analysis of where research data are shared.

Johnston, Lisa R; Hofelich Mohr, Alicia; Herndon, Joel; Taylor, Shawna; Carlson, Jake R; Ge, Lizhao; Moore, Jennifer; Petters, Jonathan; Kozlowski, Wendy; Hudson Vitale, Cynthia.

PLoS One ; 19(6): e0306199, 2024.

Article in English | MEDLINE | ID: mdl-38905250

ABSTRACT

[This corrects the article DOI: 10.1371/journal.pone.0302426.].

Understanding the value of curation: A survey of US data repository curation practices and perceptions.

Johnston, Lisa R; Curty, Renata; Braxton, Susan M; Carlson, Jake; Hadley, Hannah; Lafferty-Hess, Sophia; Luong, Hoa; Petters, Jonathan L; Kozlowski, Wendy A.

PLoS One ; 19(6): e0301171, 2024.

Article in English | MEDLINE | ID: mdl-38875230

ABSTRACT

Data curators play an important role in assessing data quality and take actions that may ultimately lead to better, more valuable data products. This study explores the curation practices of data curators working within US-based data repositories. We performed a survey in January 2021 to benchmark the levels of curation performed by repositories and assess the perceived value and impact of curation on the data sharing process. Our analysis included 95 responses from 59 unique data repositories. Respondents were primarily professionals working within repositories and examined curation performed within a repository setting. A majority (72.6%) of respondents reported that "data-level" curation was performed by their repository, and around half reported that their repository took steps to ensure interoperability and reproducibility of its datasets. The most frequently reported curation actions include checking for duplicate files, reviewing documentation, reviewing metadata, minting persistent identifiers, and checking for corrupt/broken files. The most "value-add" curation action across generalist, institutional, and disciplinary repository respondents was related to reviewing and enhancing documentation. Respondents reported high perceived impact of curation by their repositories on specific data sharing outcomes including usability, findability, understandability, and accessibility of deposited datasets; respondents associated with disciplinary repositories tended to perceive higher impact on most outcomes. Most survey participants strongly agreed that data curation by the repository adds value to the data sharing process and that it outweighs the effort and cost. We found some differences between institutional and disciplinary repositories, both in the reported frequency of specific curation actions and in the perceived impact of data curation. Interestingly, we also found variation in the perceptions of those working within the same repository regarding the level and frequency of curation actions performed, which exemplifies the complexity of repository curation work. Our results suggest data curation may be better understood in terms of specific curation actions and outcomes than broadly defined curation levels and that more research is needed to understand the resource implications of performing these activities. We share these results to provide a more nuanced view of curation, and how curation impacts the broader data lifecycle and data sharing behaviors.

Subject(s)

Data Curation , Humans , Surveys and Questionnaires , United States , Information Dissemination , Data Accuracy , Databases, Factual , Reproducibility of Results

Seek and you may (not) find: A multi-institutional analysis of where research data are shared.

Johnston, Lisa R; Hofelich Mohr, Alicia; Herndon, Joel; Taylor, Shawna; Carlson, Jake R; Ge, Lizhao; Moore, Jennifer; Petters, Jonathan; Kozlowski, Wendy; Hudson Vitale, Cynthia.

PLoS One ; 19(4): e0302426, 2024.

Article in English | MEDLINE | ID: mdl-38662676

ABSTRACT

Research data sharing has become an expected component of scientific research and scholarly publishing practice over the last few decades, due in part to requirements for federally funded research. As part of a larger effort to better understand the workflows and costs of public access to research data, this project conducted a high-level analysis of where academic research data is most frequently shared. To do this, we leveraged the DataCite and Crossref application programming interfaces (APIs) in search of Publisher field elements demonstrating which data repositories were utilized by researchers from six academic research institutions between 2012 and 2022. We also ran a preliminary analysis of the quality of the metadata associated with these published datasets, comparing the extent to which information was missing from metadata fields deemed important for public access to research data. Results show that the top 10 publishers accounted for 89.0% to 99.8% of the datasets connected with the institutions in our study. Known data repositories, including institutional data repositories hosted by those institutions, were initially lacking from our sample due to varying metadata standards and practices. We conclude that the metadata quality landscape for published research datasets is uneven; key information, such as author affiliation, is often incomplete or missing from source data repositories and aggregators. To enhance the findability, interoperability, accessibility, and reusability (FAIRness) of research data, we provide a set of concrete recommendations that repositories and data authors can take to improve scholarly metadata associated with shared datasets.
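
The two measures mentioned in this abstract (the share of datasets held by the top publishers, and how often a metadata field is missing) can be illustrated with a minimal sketch. This is not the authors' code: the record layout, the field names `publisher` and `affiliation`, and the helper names `top_n_share` and `missing_rate` are all assumptions made for illustration, and the sample records are invented placeholders rather than data from the study.

```python
from collections import Counter

def top_n_share(records, n=10):
    """Fraction of records (with a non-empty publisher) held by the n most frequent publishers."""
    publishers = [r.get("publisher") for r in records if r.get("publisher")]
    if not publishers:
        return 0.0
    counts = Counter(publishers)
    top_total = sum(count for _, count in counts.most_common(n))
    return top_total / len(publishers)

def missing_rate(records, field):
    """Fraction of records in which `field` is absent, None, or an empty string."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if not r.get(field))
    return missing / len(records)

# Hypothetical records; real aggregator metadata would be shaped differently.
records = [
    {"publisher": "Repo A", "affiliation": "Univ X"},
    {"publisher": "Repo A", "affiliation": ""},
    {"publisher": "Repo B", "affiliation": None},
    {"publisher": "Repo C", "affiliation": "Univ X"},
]

print(top_n_share(records, n=2))
print(missing_rate(records, "affiliation"))
```

Treating empty strings and `None` as missing is a design choice of this sketch; a real analysis would need its own definition of an incomplete field.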

Subject(s)

Information Dissemination , Metadata , Information Dissemination/methods , Humans , Biomedical Research

The TRUST Principles for digital repositories.

Lin, Dawei; Crabtree, Jonathan; Dillo, Ingrid; Downs, Robert R; Edmunds, Rorie; Giaretta, David; De Giusti, Marisa; L'Hours, Hervé; Hugo, Wim; Jenkyns, Reyna; Khodiyar, Varsha; Martone, Maryann E; Mokrane, Mustapha; Navale, Vivek; Petters, Jonathan; Sierman, Barbara; Sokolova, Dina V; Stockhause, Martina; Westbrook, John.

Sci Data ; 7(1): 144, 2020 05 14.

Article in English | MEDLINE | ID: mdl-32409645
