Results 1 - 20 of 41
1.
Int J Eat Disord ; 55(2): 282-284, 2022 02.
Article in English | MEDLINE | ID: mdl-34984704

ABSTRACT

Burnette et al. aimed to validate two eating disorder symptom measures among transgender adults recruited from Mechanical Turk (MTurk). After identifying several data quality issues, Burnette et al. abandoned this aim and instead documented the issues they faced (e.g., demographic misrepresentation, repeat submissions, inconsistent responses across similar questions, failed attention checks). Consequently, Burnette et al. raised concerns about the use of MTurk for psychological research, particularly in an eating disorder context. However, we believe these claims are overstated because they arise from a single study not designed to test MTurk data quality. Further, despite claiming to go "above and beyond" current recommendations, Burnette et al. missed key screening procedures. In particular, they missed procedures known to screen out participants who use commercial data centers (i.e., server farms) to hide their true IP addresses and complete multiple surveys for financial gain. In this commentary, we outline key screening procedures that allow researchers to obtain quality MTurk data. We also highlight the importance of balancing efforts to increase data quality with efforts to maintain sample diversity. With appropriate screening procedures, which should be preregistered, MTurk remains a viable participant source that requires further validation in an eating disorder context.
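To make the kind of screening the commentary recommends concrete, here is a minimal Python/pandas sketch of post-hoc quality checks. The column names (`ip`, `attention_check_passed`) and the `datacenter_ips` blocklist are hypothetical placeholders rather than the authors' actual protocol; in practice the blocklist would come from an IP-lookup service, and the exact screens should be preregistered.

```python
import pandas as pd

def screen_responses(df: pd.DataFrame, datacenter_ips: set) -> pd.DataFrame:
    """Flag MTurk submissions that fail common quality screens (illustrative only)."""
    out = df.copy()
    out["duplicate_ip"] = out["ip"].duplicated(keep=False)        # repeat submissions from one address
    out["datacenter_ip"] = out["ip"].isin(datacenter_ips)         # server-farm / VPS addresses
    out["failed_attention"] = ~out["attention_check_passed"]      # explicit attention-check items
    out["exclude"] = out[["duplicate_ip", "datacenter_ip", "failed_attention"]].any(axis=1)
    return out
```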


Subject(s)
Crowdsourcing , Feeding and Eating Disorders , Adult , Attention , Crowdsourcing/methods , Crowdsourcing/standards , Feeding and Eating Disorders/diagnosis , Humans , Surveys and Questionnaires
2.
Int J Eat Disord ; 55(2): 276-277, 2022 02.
Article in English | MEDLINE | ID: mdl-34931338

ABSTRACT

In this commentary, we respond to Burnette et al.'s (2021) paper, which offers substantial practical recommendations to improve data quality and validity when gathering data via Amazon's Mechanical Turk (MTurk). We argue that it is also important to acknowledge and review the specific ethical issues that can arise when recruiting MTurk workers as participants. In particular, we raise three main ethical concerns that need to be addressed when recruiting research participants from recruitment platforms: participants' economic vulnerability, participants' sensitivity, and the power dynamics between participants and researchers. We elaborate on these issues by discussing the ways in which they may appear and how they may be addressed. We conclude that considering the ethical aspects of data collection, and its potential impacts on those involved, would complement Burnette et al.'s recommendations. Consequently, data collection processes, and not only data screening processes, should be transparent.


Subject(s)
Crowdsourcing , Crowdsourcing/standards , Data Collection/standards , Humans
3.
J Glob Health ; 11: 09001, 2021 Mar 01.
Article in English | MEDLINE | ID: mdl-33791099

ABSTRACT

BACKGROUND: Crowdsourcing has been recognized as having the potential to collect information rapidly, inexpensively, and accurately. U-Report is a mobile empowerment platform that connects young people all over the world to information that will change their lives and influence decisions. Previous studies of U-Report's effectiveness highlight strengths in timeliness, low cost, and high credibility for collecting and sending information; however, they also highlight areas for improvement concerning data representation. EquityTool has developed a simpler approach to assessing the wealth quintiles of respondents based on fewer questions derived from large household surveys such as the Multiple Indicator Cluster Surveys (MICS) and Demographic and Health Surveys (DHS). METHODS: The EquityTool methodology was adopted to assess the socio-economic profile of U-Reporters (ie, enrolled participants of U-Report) in Bangladesh. The RapidPro flow collected the survey responses and scored them against the DHS national wealth index using the EquityTool methodology. This helped place each U-Reporter who completed all questions into the appropriate wealth quintile. RESULTS: With 19% of respondents completing all questions, respondents fell into all five wealth quintiles, with 79% in the top two quintiles and only 21% in the lower three, resulting in an Equity Index of 53/100, where 100 is completely in line with the Bangladesh equity distribution and 1 is the least in line. An equitable random sample of 1828 U-Reporters from among the regular and frequent respondents was subsequently created for future surveys; this sample has an Equity Index of 98/100. CONCLUSIONS: U-Report in Bangladesh does reach the poorest quintiles, although the initial recruitment skews toward respondents from better-off families. It is possible to create an equitable random sub-sample of respondents from all five wealth quintiles and thus process information and data for future surveys. Moving forward, U-Reporters from the poorly represented quintiles may be incentivized to recruit peers to increase equity and representation. In times of COVID-19, U-Report in combination with the EquityTool has the potential to enhance the quality of crowdsourced data for statistical analysis.
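As a rough illustration of the quintile-assignment step, the sketch below scores a respondent's asset answers against index weights and quintile cut points. The weights and cut points shown are placeholders; the real values are derived by the EquityTool from the DHS national wealth index.

```python
def wealth_quintile(asset_answers: dict, weights: dict, cutpoints: list) -> int:
    """Assign a respondent to a wealth quintile (1 = poorest, 5 = richest)."""
    score = sum(weights[item] for item, owned in asset_answers.items() if owned)
    for quintile, upper_bound in enumerate(cutpoints, start=1):   # cutpoints: 4 upper boundaries
        if score <= upper_bound:
            return quintile
    return 5

# Hypothetical example: three asset questions with placeholder weights and cut points.
q = wealth_quintile({"electricity": True, "fridge": False, "motorcycle": True},
                    weights={"electricity": 0.4, "fridge": 0.9, "motorcycle": 0.6},
                    cutpoints=[0.3, 0.7, 1.1, 1.5])   # -> quintile 3
```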


Subject(s)
Crowdsourcing/standards , Surveys and Questionnaires/standards , Bangladesh , Female , Forecasting , Humans , Male , Socioeconomic Factors
4.
PLoS One ; 16(4): e0249580, 2021.
Article in English | MEDLINE | ID: mdl-33886587

ABSTRACT

Measuring airways in chest computed tomography (CT) scans is important for characterizing diseases such as cystic fibrosis, yet very time-consuming to perform manually. Machine learning algorithms offer an alternative, but need large sets of annotated scans for good performance. We investigate whether crowdsourcing can be used to gather airway annotations. We generate image slices at known locations of airways in 24 subjects and request the crowd workers to outline the airway lumen and airway wall. After combining multiple crowd workers, we compare the measurements to those made by the experts in the original scans. Similar to our preliminary study, a large portion of the annotations were excluded, possibly due to workers misunderstanding the instructions. After excluding such annotations, moderate to strong correlations with the expert can be observed, although these correlations are slightly lower than inter-expert correlations. Furthermore, the results across subjects in this study are quite variable. Although the crowd has potential in annotating airways, further development is needed for it to be robust enough for gathering annotations in practice. For reproducibility, data and code are available online: http://github.com/adriapr/crowdairway.git.
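A minimal sketch of the comparison step, assuming per-airway measurements from several crowd workers and a single expert; the median combination rule and the Spearman correlation are illustrative choices, not necessarily the paper's exact pipeline (the actual code is linked above).

```python
import numpy as np
from scipy.stats import spearmanr

def crowd_vs_expert(worker_measurements: np.ndarray, expert: np.ndarray) -> float:
    """worker_measurements: shape (n_airways, n_workers); expert: shape (n_airways,)."""
    combined = np.nanmedian(worker_measurements, axis=1)          # combine workers per airway
    rho, _ = spearmanr(combined, expert, nan_policy="omit")       # correlation with expert values
    return float(rho)
```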


Subject(s)
Algorithms , Crowdsourcing/statistics & numerical data , Crowdsourcing/standards , Lung/diagnostic imaging , Machine Learning , Radiography, Thoracic/methods , Tomography, X-Ray Computed/methods , Humans
5.
J Clin Epidemiol ; 133: 130-139, 2021 05.
Article in English | MEDLINE | ID: mdl-33476769

ABSTRACT

BACKGROUND AND OBJECTIVES: Filtering the deluge of new research to facilitate evidence synthesis has proven to be unmanageable using current paradigms of search and retrieval. Crowdsourcing, a way of harnessing the collective effort of a "crowd" of people, has the potential to support evidence synthesis by addressing this information overload created by the exponential growth in primary research outputs. Cochrane Crowd, Cochrane's citizen science platform, offers a range of tasks aimed at identifying studies related to health care. Accompanying each task are brief, interactive training modules and agreement algorithms that help ensure accurate collective decision-making. The aims of the study were to evaluate the performance of Cochrane Crowd in terms of its accuracy, capacity, and autonomy and to examine contributor engagement across three tasks aimed at identifying randomized trials. STUDY DESIGN AND SETTING: Crowd accuracy was evaluated by measuring the sensitivity and specificity of crowd screening decisions on a sample of titles and abstracts, compared with "quasi gold-standard" decisions about the same records made using the conventional method of dual screening. Crowd capacity, in the form of output volume, was evaluated by measuring the number of records processed by the crowd, compared with baseline. Crowd autonomy, the capability of the crowd to produce accurate collectively derived decisions without the need for expert resolution, was measured by the proportion of records that needed resolving by an expert. RESULTS: The Cochrane Crowd community currently has 18,897 contributors from 163 countries. Collectively, the Crowd has processed 1,021,227 records, helping to identify 178,437 reports of randomized controlled trials (RCTs) for Cochrane's Central Register of Controlled Trials. The sensitivity was 99.1% for the RCT identification task (RCT ID), 99.7% for the identification of RCTs from ClinicalTrials.gov (CT ID), and 97.7% for the identification of RCTs from the International Clinical Trials Registry Platform (ICTRP ID). The specificity was 99% for RCT ID, 98.6% for CT ID, and 99.1% for ICTRP ID. The capacity of the combined Crowd and machine learning workflow has increased fivefold in 6 years, compared with baseline. The proportion of records requiring expert resolution across the tasks ranged from 16.6% to 19.7%. CONCLUSION: Cochrane Crowd is sufficiently accurate and scalable to keep pace with the current rate of publication (and registration) of new primary studies. It has also proved to be a popular, efficient, and accurate way for a large number of people to play an important voluntary role in health evidence production. Cochrane Crowd is now an established part of Cochrane's effort to manage the deluge of primary research being produced.
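The crowd-accuracy metrics reduce to standard sensitivity and specificity against the quasi gold standard; a small sketch, assuming binary screening decisions (1 = "report is an RCT"), with made-up labels rather than Cochrane Crowd data:

```python
def sensitivity_specificity(crowd, gold):
    """crowd, gold: sequences of 0/1 screening decisions per record."""
    tp = sum(1 for c, g in zip(crowd, gold) if c == 1 and g == 1)
    fn = sum(1 for c, g in zip(crowd, gold) if c == 0 and g == 1)
    tn = sum(1 for c, g in zip(crowd, gold) if c == 0 and g == 0)
    fp = sum(1 for c, g in zip(crowd, gold) if c == 1 and g == 0)
    return tp / (tp + fn), tn / (tn + fp)   # (sensitivity, specificity)

sens, spec = sensitivity_specificity([1, 0, 1, 1, 0, 0], [1, 0, 1, 0, 0, 0])
```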


Subject(s)
Biomedical Research/methods , Biomedical Research/standards , Crowdsourcing/methods , Crowdsourcing/standards , Patient Selection , Randomized Controlled Trials as Topic/methods , Randomized Controlled Trials as Topic/standards , Adult , Aged , Aged, 80 and over , Algorithms , Biomedical Research/statistics & numerical data , Crowdsourcing/statistics & numerical data , Female , Humans , Male , Middle Aged , Randomized Controlled Trials as Topic/statistics & numerical data , Sensitivity and Specificity
6.
Int J Aging Hum Dev ; 93(2): 700-721, 2021 09.
Article in English | MEDLINE | ID: mdl-32683886

ABSTRACT

A growing number of studies within the field of gerontology have included samples recruited from Amazon's Mechanical Turk (MTurk), an online crowdsourcing portal. While some research has examined how younger adult participants recruited through other means may differ from those recruited using MTurk, little work has addressed this question with older adults specifically. In the present study, we examined how older adults recruited via MTurk might differ from those recruited via a national probability sample, the Health and Retirement Study (HRS), on a battery of outcomes related to health and cognition. Using a Latin-square design, we examined the relationship between recruitment time, remuneration amount, and measures of cognitive functioning. We found substantial differences between our MTurk sample and the participants within the HRS, most notably within measures of verbal fluency and analogical reasoning. Additionally, remuneration amount was related to differences in time to complete recruitment, particularly at the lowest remuneration level, where recruitment completion required between 138 and 485 additional hours. While the general consensus has been that MTurk samples are a reasonable proxy for the larger population, this work suggests that researchers should be wary of overgeneralizing research conducted with older adults recruited through this portal.


Subject(s)
Crowdsourcing/statistics & numerical data , Research Subjects/statistics & numerical data , Aged , Aged, 80 and over , Crowdsourcing/standards , Female , Humans , Male , Middle Aged , Patient Selection , Research Subjects/psychology , United States
7.
JAMA Netw Open ; 3(10): e2021684, 2020 10 01.
Article in English | MEDLINE | ID: mdl-33104206

ABSTRACT

Importance: Despite major differences in their health care systems, medical crowdfunding is increasingly used to finance personal health care costs in Canada, the UK, and the US. However, little is known about the campaigns designed to raise monetary donations for medical expenses, the individuals who turn to crowdfunding, and their fundraising intent. Objective: To examine the demographic characteristics of medical crowdfunding beneficiaries, campaign characteristics, and their association with funding success in Canada, the UK, and the US. Design, Setting, and Participants: This cross-sectional study extracted and manually reviewed data from GoFundMe campaigns discoverable between February 2018 and March 2019. All available campaigns on each country domain's GoFundMe medical discovery webpage that benefitted a unique patient(s) were included from Canada, the UK, and the US. Data analysis was performed from March to December 2019. Exposures: Campaign and beneficiary characteristics. Main Outcomes and Measures: Log-transformed amount raised in US dollars. Results: This study examined 3396 campaigns including 1091 in Canada, 1082 in the UK, and 1223 in the US. Campaigns in the US (median [IQR], $38,204 [$31,200 to $52,123]) raised more funds than campaigns in Canada ($12,662 [$9,377 to $19,251]) and the UK ($6,285 [$4,028 to $12,348]). In the overall cohort per campaign, Black individuals raised 11.5% less (95% CI, -19.0% to -3.2%; P = .006) than non-Black individuals, and male individuals raised 5.9% more (95% CI, 2.2% to 9.7%; P = .002) than female individuals. Female (39.4% of campaigns vs 50.8% of US population; difference, 11.3%; 95% CI, 8.6% to 14.1%; P < .001) and Black (5.3% of campaigns vs 13.4% of US population; difference, 8.1%; 95% CI, 6.8% to 9.3%; P < .001) beneficiaries were underrepresented among US campaigns. Campaigns primarily for routine treatment expenses were approximately 3 times more common in the US (77.9% [272 of 349 campaigns]) than in Canada (21.9% [55 of 251 campaigns]; difference, 56.0%; 95% CI, 49.3% to 62.7%; P < .001) or the UK (26.6% [127 of 478 campaigns]; difference, 51.4%; 95% CI, 45.5% to 57.3%; P < .001). However, campaigns for routine care were less successful overall. Approved, inaccessible care and experimental care raised 35.7% (95% CI, 25.6% to 46.7%; P < .001) and 20.9% (95% CI, 13.3% to 29.1%; P < .001), respectively, more per campaign than routine care. Campaigns primarily for alternative treatment expenses (16.1% [174 of 1079 campaigns]) were nearly 4-fold more common for cancer (23.5% [144 of 614 campaigns]) vs noncancer (6.5% [30 of 465 campaigns]) diagnoses. Conclusions and Relevance: Important differences were observed in the reasons individuals turn to medical crowdfunding across the 3 countries examined, and the findings suggest racial and gender disparities in fundraising success. More work is needed to understand the underpinnings of these findings and their implications for health care provision in the countries examined.
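The "X% less/more raised" figures come from modeling log-transformed amounts; the sketch below shows the general idea with a toy dataset and hypothetical variable names (it is not the authors' analysis code). Exponentiating a coefficient and subtracting one converts it to a percent difference relative to the reference group.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical toy data standing in for the GoFundMe campaign dataset.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "amount_raised": rng.lognormal(mean=9, sigma=1, size=300),
    "black_beneficiary": rng.integers(0, 2, size=300),
    "male_beneficiary": rng.integers(0, 2, size=300),
    "country": rng.choice(["CA", "UK", "US"], size=300),
})

model = smf.ols("np.log(amount_raised) ~ black_beneficiary + male_beneficiary + C(country)",
                data=df).fit()
pct_diff = (np.exp(model.params["black_beneficiary"]) - 1) * 100   # % difference vs. reference group
```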


Subject(s)
Crowdsourcing/methods , Health Care Costs/trends , Adolescent , Adult , Aged , Canada , Child , Child, Preschool , Cross-Sectional Studies , Crowdsourcing/standards , Crowdsourcing/trends , Delivery of Health Care/economics , Female , Fund Raising/methods , Fund Raising/standards , Fund Raising/trends , Health Care Costs/standards , Humans , Infant , Male , Middle Aged , United Kingdom , United States
8.
Ann Rheum Dis ; 79(9): 1139-1140, 2020 09.
Article in English | MEDLINE | ID: mdl-32527863

ABSTRACT

The COVID-19 pandemic forces the whole rheumatic and musculoskeletal diseases community to reassess established treatment and research standards. Digital crowdsourcing is a key tool in this pandemic to create and distil desperately needed clinical evidence and to exchange knowledge among patients and physicians alike. This viewpoint explains the concept of digital crowdsourcing and discusses examples and opportunities in rheumatology. Early experiences of digital crowdsourcing in rheumatology show transparent, accessible, and accelerated research results that empower patients and rheumatologists.


Subject(s)
Biomedical Research/methods , Coronavirus Infections/therapy , Crowdsourcing/methods , Pneumonia, Viral/therapy , Rheumatology/methods , Betacoronavirus , Biomedical Research/standards , COVID-19 , Coronavirus Infections/virology , Crowdsourcing/standards , Humans , Pandemics , Pneumonia, Viral/virology , Rheumatology/standards , SARS-CoV-2
10.
Addiction ; 115(10): 1960-1968, 2020 10.
Article in English | MEDLINE | ID: mdl-32135574

ABSTRACT

AIMS: Amazon Mechanical Turk (MTurk) provides a crowdsourcing platform for the engagement of potential research participants with data collection instruments. This review (1) provides an introduction to the mechanics and validity of MTurk research; (2) gives examples of MTurk research; and (3) discusses current limitations and best practices in MTurk research. METHODS: We review four use cases of MTurk for research relevant to addictions: (1) the development of novel measures, (2) testing interventions, (3) the collection of longitudinal use data to determine the feasibility of longer-term studies of substance use and (4) the completion of large batteries of assessments to characterize the relationships between measured constructs. We review concerns with the platform, ways of mitigating these and important information to include when presenting findings. RESULTS: MTurk has proved to be a useful source of data for behavioral science more broadly, with specific applications to addiction science. However, it is still not appropriate for all use cases, such as population-level inference. To live up to the potential of highly transparent, reproducible science from MTurk, researchers should clearly report inclusion/exclusion criteria, data quality checks and reasons for excluding collected data, how and when data were collected and both targeted and actual participant compensation. CONCLUSIONS: Although on-line survey research is not a substitute for random sampling or clinical recruitment, the Mechanical Turk community of both participants and researchers has developed multiple tools to promote data quality, fairness and rigor. Overall, Mechanical Turk has provided a useful source of convenience samples despite its limitations and has demonstrated utility in the engagement of relevant groups for addiction science.


Subject(s)
Behavioral Research/standards , Crowdsourcing/standards , Data Collection/standards , Behavior, Addictive , Data Accuracy , Humans , Patient Selection
11.
PLoS One ; 14(12): e0226394, 2019.
Article in English | MEDLINE | ID: mdl-31841534

ABSTRACT

Mechanical Turk (MTurk) is a common source of research participants within the academic community. Despite MTurk's utility and benefits over traditional subject pools, some researchers have questioned whether it is sustainable. Specifically, some have asked whether MTurk workers are too familiar with manipulations and measures common in the social sciences, the result of many researchers relying on the same small participant pool. Here, we show that concerns about non-naivete on MTurk are due less to the MTurk platform itself and more to the way researchers use the platform. Specifically, we find that there are at least 250,000 MTurk workers worldwide and that a large majority of US workers are new to the platform each year and therefore relatively inexperienced as research participants. We describe how inexperienced workers are excluded from studies, in part, because of the worker reputation qualifications researchers commonly use. Then, we propose and evaluate an alternative approach to sampling on MTurk that allows researchers to access inexperienced participants without sacrificing data quality. We recommend that in some cases researchers should limit the number of highly experienced workers allowed in their study by excluding these workers or by stratifying sample recruitment based on worker experience levels. We discuss the trade-offs of different sampling practices on MTurk and describe how the above sampling strategies can help researchers harness the vast and largely untapped potential of the Mechanical Turk participant pool.
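One way to operationalize the proposed stratification is simply to split the target sample across worker-experience brackets and recruit each stratum separately via platform qualifications. The brackets and shares below are hypothetical, not the authors' recommended cut-offs.

```python
def allocate_sample(n_total: int, strata_shares: dict) -> dict:
    """Split a target sample size across worker-experience strata."""
    return {bracket: round(n_total * share) for bracket, share in strata_shares.items()}

# Hypothetical experience brackets based on prior HITs completed.
plan = allocate_sample(900, {"< 100 prior HITs": 1 / 3,
                             "100-10,000 prior HITs": 1 / 3,
                             "> 10,000 prior HITs": 1 / 3})
```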


Subject(s)
Behavioral Research/standards , Crowdsourcing , Patient Selection , Practice Guidelines as Topic , Adult , Behavioral Research/methods , Bias , Crowdsourcing/methods , Crowdsourcing/standards , Data Accuracy , Data Collection/methods , Data Collection/standards , Datasets as Topic/standards , Female , Humans , Male , Middle Aged , Sample Size , Sampling Studies , Selection Bias , Work , Young Adult
12.
BMC Infect Dis ; 19(1): 112, 2019 Feb 04.
Article in English | MEDLINE | ID: mdl-30717678

ABSTRACT

BACKGROUND: Crowdsourcing is an excellent tool for developing tailored interventions to improve sexual health. We evaluated the implementation of an innovation contest for sexual health promotion in China. METHODS: We organized an innovation contest over three months in 2014 for Chinese individuals < 30 years old to submit images for a sexual health promotion campaign. We solicited entries via social media and in-person events. The winning entry was adapted into a poster and distributed to STD clinics across Guangdong Province. In this study, we evaluated factors associated with images that received higher scores, described the themes of the top five finalists, and evaluated the acceptability of the winning entry using an online survey tool. RESULTS: We received 96 image submissions from 76 participants in 10 Chinese provinces. Most participants were youth (< 25 years, 85%) and non-professionals (without expertise in medicine, public health, or media, 88%). Youth were more likely to submit high-scoring entries. Images from professionals in medicine, public health, or media did not have higher scores compared to images from non-professionals. Participants were twice as likely to have learned about the contest through in-person events compared to social media. We adapted and distributed the winning entry to 300 STD clinics in 22 cities over 2 weeks. A total of 8338 people responded to an acceptability survey of the finalist entry. Among them, 79.8% endorsed or strongly endorsed being more willing to undergo STD testing after seeing the poster. CONCLUSIONS: Innovation contests may be useful for soliciting images as a part of comprehensive sexual health campaigns in low- and middle-income countries.


Subject(s)
Health Education/organization & administration , Health Promotion , Organizational Innovation , Quality Improvement , Sexual Health/education , Adolescent , Adult , Aged , Aged, 80 and over , China , Crowdsourcing/methods , Crowdsourcing/standards , Evaluation Studies as Topic , Female , Health Education/methods , Health Education/standards , Health Promotion/methods , Health Promotion/organization & administration , Health Promotion/standards , Humans , Male , Middle Aged , Public Health/methods , Public Health/standards , Quality Improvement/organization & administration , Quality Improvement/standards , Sexual Behavior/physiology , Young Adult
13.
J Clin Epidemiol ; 107: 77-88, 2019 03.
Article in English | MEDLINE | ID: mdl-30500405

ABSTRACT

OBJECTIVES: The Consolidated Standards of Reporting Trials extension for the stepped-wedge cluster randomized trial (SW-CRT) is a recently published reporting guideline for SW-CRTs. We assess the quality of reporting of a recent sample of SW-CRTs. STUDY DESIGN AND SETTING: Quality of reporting was assessed according to the 26 items in the new guideline using a novel crowdsourcing methodology conducted independently and in duplicate, with random assignment, by 50 reviewers. We assessed reliability of the quality assessments, proposing this as a novel way to assess robustness of items in reporting guidelines. RESULTS: Several items were well reported. Some items were very poorly reported, including several items that have unique requirements for the SW-CRT, such as the rationale for use of the design, description of the design, identification and recruitment of participants within clusters, and concealment of cluster allocation (not reported in more than 50% of the reports). Agreement across items was moderate (median percentage agreement was 76% [IQR 64 to 86]). Agreement was low for several items, including, for example, the description of the trial design and why the trial ended or stopped. CONCLUSIONS: When reporting SW-CRTs, authors should pay particular attention to ensure clear reporting on the exact format of the design with justification, as well as how clusters and individuals were identified for inclusion in the study, and whether this was done before or after randomization of the clusters, which are crucial for risk of bias assessments. Some items, including why the trial ended, might either not be relevant to SW-CRTs or might be unclearly described in the statement.
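Percentage agreement between the two independent reviewers of each report can be computed item by item; a minimal sketch with hypothetical yes/no judgments (not the study's data):

```python
def percent_agreement(reviewer_a, reviewer_b):
    """Item-level agreement between the two independent assessments of the same reports."""
    agree = sum(a == b for a, b in zip(reviewer_a, reviewer_b))
    return 100 * agree / len(reviewer_a)

# Hypothetical judgments for one checklist item across five reports.
percent_agreement(["yes", "no", "yes", "yes", "no"],
                  ["yes", "no", "no", "yes", "no"])   # -> 80.0
```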


Subject(s)
Crowdsourcing , Systematic Reviews as Topic , Humans , Cluster Analysis , Crowdsourcing/methods , Crowdsourcing/standards , Randomized Controlled Trials as Topic , Research Design/standards
14.
J Behav Addict ; 7(4): 1122-1131, 2018 Dec 01.
Article in English | MEDLINE | ID: mdl-30522339

ABSTRACT

BACKGROUND AND AIMS: To date, no research has examined the viability of using behavioral tasks typical of cognitive psychology and neuropsychology in addiction populations recruited through online methods. Therefore, we examined the reliability and validity of three behavioral tasks of impulsivity common in addiction research in a sample of individuals with a current or past history of problem gambling recruited online. METHODS: Using a two-stage recruitment process, a final sample of 110 participants with a history of problem or disordered gambling was recruited through MTurk and completed self-report questionnaires of gambling involvement and symptomology, a Delay Discounting Task (DDT), the Balloon Analogue Risk Task (BART), a Cued Go/No-Go Task, and the UPPS-P. RESULTS: Participants demonstrated logically consistent responding on the DDT. The area under the empirical discounting curve (AUC) ranged from 0.02 to 0.88 (M = 0.23). The BART demonstrated good split-third reliability (ρs = 0.67 to 0.78). The tasks generally showed small correlations with each other (ρs = ±0.06 to 0.19) and with UPPS-P subscales (ρs = ±0.01 to 0.20). DISCUSSION AND CONCLUSIONS: The behavioral tasks demonstrated good divergent validity. Correlation magnitudes between behavioral tasks and UPPS-P scales and mean scores on these measures were generally consistent with the existing literature. Behavioral tasks of impulsivity appear to have utility for use with problem and disordered gambling samples collected online, allowing researchers a cost-efficient and rapid avenue for conducting behavioral research with gamblers. We conclude with best-practice recommendations for using behavioral tasks with crowdsourced samples.
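The DDT outcome reported here is the area under the empirical discounting curve; a brief sketch of the usual normalize-and-trapezoid computation, with hypothetical indifference points for a $100 delayed reward (a sketch of the general method, not the study's scoring script):

```python
import numpy as np

def discounting_auc(delays, indifference_points, amount):
    """Area under the empirical discounting curve (0-1); lower = steeper discounting."""
    x = np.asarray(delays, dtype=float) / max(delays)                 # normalize delays to [0, 1]
    y = np.asarray(indifference_points, dtype=float) / amount         # normalize subjective values
    return float(np.sum((x[1:] - x[:-1]) * (y[1:] + y[:-1]) / 2.0))   # trapezoidal rule

# Hypothetical indifference points (in dollars) at delays of 1-365 days.
auc = discounting_auc([1, 7, 30, 180, 365], [95, 80, 60, 35, 20], amount=100)
```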


Subject(s)
Behavior Rating Scale/standards , Behavior, Addictive/diagnosis , Crowdsourcing/standards , Gambling/diagnosis , Impulsive Behavior/physiology , Internet , Neuropsychological Tests/standards , Adult , Female , Humans , Male , Middle Aged , Reproducibility of Results , Self Report , Young Adult
15.
Int J Cardiovasc Imaging ; 34(11): 1725-1730, 2018 Nov.
Article in English | MEDLINE | ID: mdl-30128849

ABSTRACT

Quality in stress echocardiography interpretation is often gauged against coronary angiography (CA) data, but anatomic obstructive coronary disease on CA is an imperfect gold standard for a stress-induced wall motion abnormality. We examined the utility of crowd-sourcing a "majority-vote" consensus as an alternative 'gold standard' against which to evaluate the accuracy of an individual echocardiographer's interpretation of stress echocardiography studies. In two surveys, 2 years apart, participants independently interpreted baseline and post-exercise stress echocardiographic images of cases that had undergone follow-up CA within 3 months of the stress echo. We examined the agreement of the survey consensus (defined as >60% of survey participants choosing the same interpretation) with the stress echocardiography clinical read and with CA results. In the first survey, 29 participants reviewed and independently interpreted 14 stress echo cases. Consensus was reached in all 14 cases. There was good agreement between the clinical read and consensus (kappa = 0.57), between individual survey responses and consensus (kappa = 0.68), and between consensus and CA results (kappa = 0.40). In the validation survey, the agreement between clinical reads and consensus (kappa = 0.75) and between individual survey responses and consensus (kappa = 0.81) remained excellent. Independent consensus is achievable and offers a fair comparison for stress echocardiographic interpretation. Future validation work, in other laboratories and against hard outcomes, is necessary to test the feasibility and effectiveness of this approach.
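Agreement between the majority-vote consensus and the clinical read is quantified with Cohen's kappa; a minimal sketch with hypothetical categorical reads, using scikit-learn (not the study's analysis code):

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical reads: majority-vote consensus vs. the original clinical interpretation.
consensus = ["abnormal", "normal", "abnormal", "normal", "normal"]
clinical  = ["abnormal", "normal", "normal",   "normal", "abnormal"]
kappa = cohen_kappa_score(consensus, clinical)
```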


Subject(s)
Coronary Artery Disease/diagnostic imaging , Crowdsourcing/methods , Echocardiography, Stress/methods , Consensus , Coronary Angiography , Crowdsourcing/standards , Echocardiography, Stress/standards , Feasibility Studies , Humans , Observer Variation , Pilot Projects , Predictive Value of Tests , Quality Assurance, Health Care , Quality Improvement , Quality Indicators, Health Care , Reproducibility of Results
16.
Eval Program Plann ; 71: 68-82, 2018 12.
Article in English | MEDLINE | ID: mdl-30165260

ABSTRACT

At its core, evaluation involves the generation of value judgments. These evaluative judgments are based on comparing an evaluand's performance to what the evaluand is supposed to do (criteria) and how well it is supposed to do it (standards). The aim of this four-phase study was to test whether criteria and standards can be set via crowdsourcing, a potentially cost- and time-effective approach to collecting public opinion data. In the first three phases, participants were presented with a program description, then asked to complete a task to either identify criteria (phase one), weigh criteria (phase two), or set standards (phase three). Phase four found that the crowd-generated criteria were high quality; more specifically, that they were clear and concise, complete, non-overlapping, and realistic. Overall, the study concludes that crowdsourcing has the potential to be used in evaluation for setting stable, high-quality criteria and standards.


Subject(s)
Crowdsourcing/methods , Crowdsourcing/standards , Program Evaluation/methods , Program Evaluation/standards , Public Opinion , Data Accuracy , Humans , Research Design
17.
Annu Rev Public Health ; 39: 335-350, 2018 04 01.
Article in English | MEDLINE | ID: mdl-29608871

ABSTRACT

Environmental health issues are becoming more challenging, and addressing them requires new approaches to research design and decision-making processes. Participatory research approaches, in which researchers and communities are involved in all aspects of a research study, can improve study outcomes and foster greater data accessibility and utility as well as increase public transparency. Here we review varied concepts of participatory research, describe how it complements and overlaps with community engagement and environmental justice, examine its intersection with emerging environmental sensor technologies, and discuss the strengths and limitations of participatory research. Although participatory research includes methodological challenges, such as biases in data collection and data quality, it has been found to increase the relevance of research questions, result in better knowledge production, and impact health policies. Improved research partnerships among government agencies, academia, and communities can increase scientific rigor, build community capacity, and produce sustainable outcomes.


Subject(s)
Community-Based Participatory Research/methods , Community-Based Participatory Research/organization & administration , Environmental Health , Community-Based Participatory Research/standards , Crowdsourcing/methods , Crowdsourcing/standards , Decision Making , Health Policy , Humans
18.
Syst Biol ; 67(1): 49-60, 2018 Jan 01.
Article in English | MEDLINE | ID: mdl-29253296

ABSTRACT

Scientists building the Tree of Life face an overwhelming challenge to categorize phenotypes (e.g., anatomy, physiology) from millions of living and fossil species. This biodiversity challenge far outstrips the capacities of trained scientific experts. Here we explore whether crowdsourcing can be used to collect matrix data on a large scale with the participation of nonexpert students, or "citizen scientists." Crowdsourcing, or data collection by nonexperts, frequently via the internet, has enabled scientists to tackle some large-scale data collection challenges too massive for individuals or scientific teams alone. The quality of work by nonexpert crowds is, however, often questioned and little data have been collected on how such crowds perform on complex tasks such as phylogenetic character coding. We studied a crowd of over 600 nonexperts and found that they could use images to identify anatomical similarity (hypotheses of homology) with an average accuracy of 82% compared with scores provided by experts in the field. This performance pattern held across the Tree of Life, from protists to vertebrates. We introduce a procedure that predicts the difficulty of each character and that can be used to assign harder characters to experts and easier characters to a nonexpert crowd for scoring. We test this procedure in a controlled experiment comparing crowd scores to those of experts and show that crowds can produce matrices with over 90% of cells scored correctly while reducing the number of cells to be scored by experts by 50%. Preparation time, including image collection and processing, for a crowdsourcing experiment is significant, and does not currently save time of scientific experts overall. However, if innovations in automation or robotics can reduce such effort, then large-scale implementation of our method could greatly increase the collective scientific knowledge of species phenotypes for phylogenetic tree building. For the field of crowdsourcing, we provide a rare study with ground truth, or an experimental control that many studies lack, and contribute new methods on how to coordinate the work of experts and nonexperts. We show that there are important instances in which crowd consensus is not a good proxy for correctness.
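Scoring the crowd against the experts amounts to comparing a per-character majority vote with the expert coding; a small sketch with hypothetical presence/absence codings (the paper's matrices and difficulty-prediction procedure are not reproduced here):

```python
from collections import Counter

def consensus_accuracy(crowd_scores_per_character, expert_scores):
    """Fraction of characters where the crowd's majority vote matches the expert coding."""
    correct = 0
    for crowd, expert in zip(crowd_scores_per_character, expert_scores):
        consensus, _ = Counter(crowd).most_common(1)[0]   # majority vote among nonexperts
        correct += int(consensus == expert)
    return correct / len(expert_scores)

# Hypothetical codings (0 = character absent, 1 = present) for three characters.
consensus_accuracy([[1, 1, 0], [0, 0, 0], [1, 0, 0]], [1, 0, 1])   # -> 2/3
```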


Subject(s)
Classification/methods , Crowdsourcing/standards , Phylogeny , Animals , Phenotype , Professional Competence , Reproducibility of Results
19.
Eval Program Plann ; 66: 183-194, 2018 02.
Article in English | MEDLINE | ID: mdl-28919291

ABSTRACT

This exploratory study examines a novel tool for validating program theory through crowdsourced qualitative analysis. It combines a quantitative pattern matching framework traditionally used in theory-driven evaluation with crowdsourcing to analyze qualitative interview data. A sample of crowdsourced participants are asked to read an interview transcript and identify whether program theory components (Activities and Outcomes) are discussed and to highlight the most relevant passage about that component. The findings indicate that using crowdsourcing to analyze qualitative data can differentiate between program theory components that are supported by a participant's experience and those that are not. This approach expands the range of tools available to validate program theory using qualitative data, thus strengthening the theory-driven approach.


Subject(s)
Crowdsourcing/methods , Crowdsourcing/standards , Data Accuracy , Internet , Research Design/standards , Humans , Minority Groups/psychology , Qualitative Research , Reproducibility of Results , Student Dropouts/psychology
20.
Brain ; 140(6): 1680-1691, 2017 Jun 01.
Article in English | MEDLINE | ID: mdl-28459961

ABSTRACT

There exist significant clinical and basic research needs for accurate, automated seizure detection algorithms. These algorithms have translational potential in responsive neurostimulation devices and in automatic parsing of continuous intracranial electroencephalography data. An important barrier to developing accurate, validated algorithms for seizure detection is limited access to high-quality, expertly annotated seizure data from prolonged recordings. To overcome this, we hosted a kaggle.com competition to crowdsource the development of seizure detection algorithms using intracranial electroencephalography from canines and humans with epilepsy. The top three performing algorithms from the contest were then validated on out-of-sample patient data including standard clinical data and continuous ambulatory human data obtained over several years using the implantable NeuroVista seizure advisory system. Two hundred teams of data scientists from all over the world participated in the kaggle.com competition. The top performing teams submitted highly accurate algorithms with consistent performance in the out-of-sample validation study. The performance of these seizure detection algorithms, achieved using freely available code and data, sets a new reproducible benchmark for personalized seizure detection. We have also shared a 'plug and play' pipeline to allow other researchers to easily use these algorithms on their own datasets. The success of this competition demonstrates how sharing code and high quality data results in the creation of powerful translational tools with significant potential to impact patient care.
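Out-of-sample validation of a detection algorithm can be summarized with a ranking metric such as ROC AUC; the sketch below uses hypothetical per-clip labels and predicted probabilities and is not the contest's actual scoring code.

```python
from sklearn.metrics import roc_auc_score

# Hypothetical out-of-sample clips: 1 = seizure (ictal), 0 = non-seizure.
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.10, 0.35, 0.80, 0.65, 0.20, 0.90, 0.40, 0.55]   # algorithm's predicted probabilities
auc = roc_auc_score(y_true, y_score)
```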


Subject(s)
Algorithms , Crowdsourcing/methods , Electrocorticography/methods , Equipment Design/methods , Seizures/diagnosis , Adult , Animals , Crowdsourcing/standards , Disease Models, Animal , Electrocorticography/standards , Equipment Design/standards , Humans , Prostheses and Implants , Reproducibility of Results