Search | VHL Regional Portal

1.

Reflections on the Intermediate Data Structure (IDS).

Alter, George.

Hist Life Course Stud ; 10: 71-75, 2021 Mar 31.

Article in English | MEDLINE | ID: mdl-38410260

ABSTRACT

The Intermediate Data Structure (IDS) encourages sharing historical life course data by storing data in a common format. To encompass the complexity of life histories, IDS relies on data structures that are unfamiliar to most social scientists. This article examines four features of IDS that make it flexible and expandable: the Entity-Attribute-Value model, the relational database model, embedded metadata, and the Chronicle file. I also consider IDS from the perspective of current discussions about sharing data across scientific domains. We can find parallels to IDS in other fields that may lead to future innovations.

2.

The Data Tags Suite (DATS) model for discovering data access and use requirements.

Alter, George; Gonzalez-Beltran, Alejandra; Ohno-Machado, Lucila; Rocca-Serra, Philippe.

Gigascience ; 9(2), 2020 02 01.

Article in English | MEDLINE | ID: mdl-32031623

ABSTRACT

BACKGROUND: Data reuse is often controlled to protect the privacy of subjects and patients. Data discovery tools need ways to inform researchers about restrictions on data access and re-use. RESULTS: We present elements in the Data Tags Suite (DATS) metadata schema describing data access, data use conditions, and consent information. DATS metadata are explained in terms of the administrative, legal, and technical systems used to protect confidential data. CONCLUSIONS: The access and use metadata items in DATS are designed from the perspective of a researcher who wants to find and re-use existing data. We call for standard ways of describing informed consent and data use agreements that will enable automated systems for managing research data.

Subject(s)

Data Management/methods , Computer Security/standards , Data Management/standards , Data Mining/methods , Data Mining/standards , Metadata

3.

The Evolution of Models in Historical Demography.

Alter, George.

J Interdiscip Hist ; 50(3): 325-362, 2020.

Article in English | MEDLINE | ID: mdl-37667772

ABSTRACT

This reflection on the evolution of methods and data in historical demography argues that we can still find inspiration and guidance in the work of the founders of our discipline. Historical demography is in the midst of a transition from a data-poor to a data-rich environment. Previous generations relied on demographic models to squeeze as much information as possible from the small amounts of data available. Today we live in a new era of large data sets and regression models. Researchers are creating both regional and international historical data sets of unprecedented size and depth. When examined closely, however, the methods that we use now make the same simplifying assumptions that generated the key advances of earlier generations. As we transition to new methods, demographic insight must inform our analyses and enrich our conclusions.

4.

Re-introducing the Cambridge Group Family Reconstitutions.

Alter, George; Newton, Gill; Oeppen, Jim.

Hist Life Course Stud ; 9: 24-48, 2020.

Article in English | MEDLINE | ID: mdl-38464868

ABSTRACT

English Population History from Family Reconstitution 1580-1837 was important both for its scope and its methodology. The volume was based on data from family reconstitutions of 26 parishes carefully selected to represent 250 years of English demographic history. These data remain relevant for new research questions, such as studying the intergenerational inheritance of fertility and mortality. To expand their availability the family reconstitutions have been translated into new formats: a relational database, the Intermediate Data Structure (IDS) and an episode file for fertility analysis. This paper describes that process and examines the impact of methodological decisions on analysis of the data. Wrigley, Davies, Oeppen, and Schofield were sensitive to changes in the quality of the parish registers and cautiously applied the principles of family reconstitution developed by Louis Henry. We examine how these choices affect the measurement of fertility and biases that are introduced when important principles are ignored.

5.

Responsible practices for data sharing.

Alter, George; Gonzalez, Richard.

Am Psychol ; 73(2): 146-156, 2018.

Article in English | MEDLINE | ID: mdl-29481108

ABSTRACT

Research transparency, reproducibility, and data sharing uphold core principles of science at a time when the integrity of scientific research is being questioned. This article discusses how research data in psychology can be made accessible for reproducibility and reanalysis by describing practical ways to overcome barriers to data sharing. We examine key issues surrounding the sharing of data such as who owns research data, how to protect the confidentiality of the research participant, how to give appropriate credit to the data creator, how to deal with metadata and codebooks, how to address provenance, and other specifics such as versioning and file formats. The protection of research subjects is a fundamental obligation, and we explain frameworks and procedures designed to protect against the harms that may result from disclosure of confidential information. We also advocate greater recognition for data creators and the authors of program code used in the management and analysis of data. We argue that research data and program code are important scientific contributions that should be cited in the same way as publications.

Subject(s)

Confidentiality/ethics , Ethics, Research , Information Dissemination/ethics , Psychology/ethics , Research Design , Research Subjects , Humans , Reproducibility of Results

6.

DataMed - an open source discovery index for finding biomedical datasets.

Chen, Xiaoling; Gururaj, Anupama E; Ozyurt, Burak; Liu, Ruiling; Soysal, Ergin; Cohen, Trevor; Tiryaki, Firat; Li, Yueling; Zong, Nansu; Jiang, Min; Rogith, Deevakar; Salimi, Mandana; Kim, Hyeon-Eui; Rocca-Serra, Philippe; Gonzalez-Beltran, Alejandra; Farcas, Claudiu; Johnson, Todd; Margolis, Ron; Alter, George; Sansone, Susanna-Assunta; Fore, Ian M; Ohno-Machado, Lucila; Grethe, Jeffrey S; Xu, Hua.

J Am Med Inform Assoc ; 25(3): 300-308, 2018 Mar 01.

Article in English | MEDLINE | ID: mdl-29346583

ABSTRACT

OBJECTIVE: Finding relevant datasets is important for promoting data reuse in the biomedical domain, but it is challenging given the volume and complexity of biomedical data. Here we describe the development of an open source biomedical data discovery system called DataMed, with the goal of promoting the building of additional data indexes in the biomedical domain. MATERIALS AND METHODS: DataMed, which can efficiently index and search diverse types of biomedical datasets across repositories, is developed through the National Institutes of Health-funded biomedical and healthCAre Data Discovery Index Ecosystem (bioCADDIE) consortium. It consists of 2 main components: (1) a data ingestion pipeline that collects and transforms original metadata information to a unified metadata model, called DatA Tag Suite (DATS), and (2) a search engine that finds relevant datasets based on user-entered queries. In addition to describing its architecture and techniques, we evaluated individual components within DataMed, including the accuracy of the ingestion pipeline, the prevalence of the DATS model across repositories, and the overall performance of the dataset retrieval engine. RESULTS AND CONCLUSION: Our manual review shows that the ingestion pipeline could achieve an accuracy of 90% and core elements of DATS had varied frequency across repositories. On a manually curated benchmark dataset, the DataMed search engine achieved an inferred average precision of 0.2033 and a precision at 10 (P@10, the number of relevant results in the top 10 search results) of 0.6022, by implementing advanced natural language processing and terminology services. Currently, we have made the DataMed system publicly available as an open source package for the biomedical community.

7.

A publicly available benchmark for biomedical dataset retrieval: the reference standard for the 2016 bioCADDIE dataset retrieval challenge.

Cohen, Trevor; Roberts, Kirk; Gururaj, Anupama E; Chen, Xiaoling; Pournejati, Saeid; Alter, George; Hersh, William R; Demner-Fushman, Dina; Ohno-Machado, Lucila; Xu, Hua.

Database (Oxford) ; 2017, 2017 01 01.

Article in English | MEDLINE | ID: mdl-29220453

ABSTRACT

Database URL: https://biocaddie.org/benchmark-data.

Subject(s)

Biomedical Research , Databases, Factual , Information Storage and Retrieval/methods

8.

DATS, the data tag suite to enable discoverability of datasets.

Sansone, Susanna-Assunta; Gonzalez-Beltran, Alejandra; Rocca-Serra, Philippe; Alter, George; Grethe, Jeffrey S; Xu, Hua; Fore, Ian M; Lyle, Jared; Gururaj, Anupama E; Chen, Xiaoling; Kim, Hyeon-Eui; Zong, Nansu; Li, Yueling; Liu, Ruiling; Ozyurt, I Burak; Ohno-Machado, Lucila.

Sci Data ; 4: 170059, 2017 06 06.

Article in English | MEDLINE | ID: mdl-28585923

ABSTRACT

Today's science increasingly requires effective ways to find and access existing datasets that are distributed across a range of repositories. For researchers in the life sciences, discoverability of datasets may soon become as essential as identifying the latest publications via PubMed. Through an international collaborative effort funded by the National Institutes of Health (NIH)'s Big Data to Knowledge (BD2K) initiative, we have designed and implemented the DAta Tag Suite (DATS) model to support the DataMed data discovery index. DataMed's goal is to be for data what PubMed has been for the scientific literature. Akin to the Journal Article Tag Suite (JATS) used in PubMed, the DATS model enables submission of metadata on datasets to DataMed. DATS has a core set of elements, which are generic and applicable to any type of dataset, and an extended set that can accommodate more specialized data types. DATS is a platform-independent model also available as an annotated serialization in schema.org, which in turn is widely used by major search engines like Google, Microsoft, Yahoo and Yandex.

9.

Finding useful data across multiple biomedical data repositories using DataMed.

Ohno-Machado, Lucila; Sansone, Susanna-Assunta; Alter, George; Fore, Ian; Grethe, Jeffrey; Xu, Hua; Gonzalez-Beltran, Alejandra; Rocca-Serra, Philippe; Gururaj, Anupama E; Bell, Elizabeth; Soysal, Ergin; Zong, Nansu; Kim, Hyeon-Eui.

Nat Genet ; 49(6): 816-819, 2017 05 26.

Article in English | MEDLINE | ID: mdl-28546571

Subject(s)

Biological Ontologies , Biomedical Research , Computational Biology/methods , Databases, Factual , Metadata , Humans , Software , Systems Integration

10.

IDS Transposer: A Users Guide.

Merchant, Emily Klancher; Alter, George; Wang, Jane; Bhargav, Ashok.

Hist Life Course Stud ; 4: 59-96, 2017.

Article in English | MEDLINE | ID: mdl-30505920

ABSTRACT

The Intermediate Data Structure (IDS) provides a standard format for storing and sharing individual-level longitudinal life-course data (Alter and Mandemakers 2014; Alter, Mandemakers and Gutmann 2009). Once the data are in the IDS format, a standard set of programs can be used to extract data for analysis, facilitating the analysis of data across multiple databases. Currently, life-course databases store information in a variety of formats, and the process of translating data into IDS can be long and tedious. The IDS Transposer is a software tool that automates this process for source data in any format, allowing database administrators to specify how their datasets are to be represented in IDS. This article describes how the IDS Transposer works, first by going through an example step-by-step, and then by discussing each part of the process and potential options and exceptions in detail.

11.

Addressing Global Data Sharing Challenges.

Alter, George C; Vardigan, Mary.

J Empir Res Hum Res Ethics ; 10(3): 317-23, 2015 Jul.

Article in English | MEDLINE | ID: mdl-26297753

ABSTRACT

This issue of the Journal of Empirical Research on Human Research Ethics highlights the ethical issues that arise when researchers conducting projects in low- and middle-income countries seek to share the data they produce. Although sharing data is considered a best practice, the barriers to doing so are considerable and there is a need for guidance and examples. To that end, the authors of this article reviewed the articles in this special issue to identify challenges common to the five countries and to offer some practical advice to assist researchers in navigating this "uncharted territory," as some termed it. Concerns around informed consent, data management, data dissemination, and validation of research contributions were cited frequently as particularly challenging areas, so the authors focused on these four topics with the goal of providing specific resources to consult as well as examples of successful projects attempting to solve many of the problems raised.

Subject(s)

Biomedical Research , Cooperative Behavior , Developing Countries , Information Dissemination/ethics , Public Health , Data Collection , Data Curation , Global Health , Humans , Income , Informed Consent , Privacy , Research Personnel

12.

The Intermediate Data Structure (IDS) for Longitudinal Historical Microdata, version 4.

Alter, George; Mandemakers, Kees.

Hist Life Course Stud ; 1: 1-26, 2014.

Article in English | MEDLINE | ID: mdl-30505919

ABSTRACT

The Intermediate Data Structure (IDS) is a standard data format that has been adopted by several large longitudinal databases on historical populations. Since the publication of the first version in Historical Social Research in 2009, two improved and extended versions have been published in the Collaboratory Historical Life Courses. In this publication we present version 4 which is the latest 'official' standard of the IDS. Discussions with users over the last four years resulted in important changes, like the inclusion of a new table defining the hierarchical relationships among 'contexts,' decision schemes for recording relationships, additional fields in the metadata table, rules for handling stillbirths, a reciprocal model for relationships, guidance for linking IDS data with geospatial information, and the introduction of an extended IDS for computed variables.

13.

Effects of Inheritance and Environment on the Heights of Brothers in Nineteenth-Century Belgium.

Alter, George; Oris, Michel.

Hum Nat ; 19(1): 44-55, 2008 Mar.

Article in English | MEDLINE | ID: mdl-26181377

ABSTRACT

Shared genetic inheritance results in a high correlation in the heights of brothers, but experiences in childhood and adolescence can intervene. Poor diet, disease, and heavy labor can prevent the achievement of height potentials. If families cannot control variations in these conditions, the heights of brothers will be less strongly correlated. We use heights measured at military conscription examinations from three communities in nineteenth-century Belgium. The Generalized Estimating Equation procedure allows us to estimate effects of covariates on mean heights as well as the correlations within families. Both average height and the correlation of brothers' heights differed by socioeconomic status. Members of the local elite were taller and the heights of brothers in those families were more strongly correlated. This suggests that elite families were much better able to control the environmental challenges faced by their offspring.

14.

Widowhood, family size, and post-reproductive mortality: a comparative analysis of three populations in nineteenth-century Europe.

Alter, George; Dribe, Martin; Van Poppel, Frans.

Demography ; 44(4): 785-806, 2007 Nov.

Article in English | MEDLINE | ID: mdl-18232211

ABSTRACT

Researchers from a number of disciplines have offered competing theories about the effects of childbearing on parents' postreproductive longevity. The "disposable soma theory" argues that investments in somatic maintenance increase longevity but reduce childbearing. "Maternal depletion" models suggest that the rigors of childrearing increase mortality in later years. Other researchers consider continued childbearing a sign of healthy aging and a predictor of future longevity. Empirical studies have produced inconsistent and contradictory results. Our focus is on the experience of widowhood, which has been ignored in previous studies. We hypothesize that the death of a spouse is a stressful event with long-term consequences for health, especially for women with small children. Data are drawn from historical sources in Sweden, Belgium, and the Netherlands from 1766 to 1980. Postreproductive mortality was highest among young widows with larger families in all three samples. Age at last birth had little or no effect. We conclude that raising children under adverse circumstances can have long-lasting, harmful effects on a mother's health.

Subject(s)

Family Characteristics , Longevity , Mothers/psychology , Stress, Psychological/mortality , Widowhood/psychology , Adult , Age Factors , Aged , Bereavement , Europe/epidemiology , Female , Health Status , Humans , Male , Maternal Age , Middle Aged , Mortality/trends , Parity , Pregnancy , Proportional Hazards Models , Registries

15.

Childhood conditions, migration, and mortality: migrants and natives in 19th-century cities.

Alter, George; Oris, Michel.

Soc Biol ; 52(3-4): 178-91, 2005.

Article in English | MEDLINE | ID: mdl-17619610

ABSTRACT

Migrants often have lower mortality than natives in spite of relatively unfavorable social and economic characteristics. Although migrants have a short-run advantage due to the selective migration of healthy workers, persistent health and mortality differences between migrants and natives may be long-run effects of different experiences in childhood. We made use of a natural experiment resulting from rural-to-urban migration in the mid-19th century. Mortality was much higher in urban areas, especially in rapidly growing industrial cities. Migrants usually came from healthier rural origins as young adults. Data used in this study is available from 19th-century Belgian population registers describing two sites: a rapidly growing industrial city and a small town that became an industrial suburb. We found evidence of three processes that lead to differences between the mortality of migrants and natives. First, recent migrants had lower mortality than natives, because they were self-selected for good health when they arrived. This advantage decreased with time spent in the destination. Second, migrants from rural backgrounds had a disadvantage in epidemic years, because they had less experience with these diseases. Third, migrants from rural areas had lower mortality at older (but not younger) ages, even if they had migrated more than 10 years earlier. We interpret this as a long-run consequence of less exposure to disease in childhood.

Subject(s)

Mortality , Transients and Migrants/history , Urban Health/history , Adult , Belgium/epidemiology , Child , Female , History, 19th Century , Humans , Male , Middle Aged , Proportional Hazards Models , Risk Factors , Socioeconomic Factors

16.

Height, frailty, and the standard of living: modelling the effects of diet and disease on declining mortality and increasing height.

Alter, George.

Popul Stud (Camb) ; 58(3): 265-79, 2004.

Article in English | MEDLINE | ID: mdl-15513283

ABSTRACT

Explanations of historical trends in both mortality and human height differ over the relative contributions of better nutrition and reduced exposure to disease. This paper explores theoretical models in which interactions between diet and disease determine both mortality and height. One model assumes that adult height is directly related to frailty, the relative risk of dying. The second model links frailty to differences between attained and potential height. Diet plays a small role in the transition to low mortality in the first model. The second model assigns a large role to diet in historical mortality trends, but implies that mortality will be unrelated to height in the future.

Subject(s)

Body Height , Diet , Mortality/trends , Anthropometry , Disease Susceptibility , Humans , Models, Theoretical , Socioeconomic Factors
