Asymptotic Confidence Interval, Sample Size Formulas and Comparison Test for the Agreement Intra-Class Correlation Coefficient in Inter-Rater Reliability Studies.

Bourredjem, Abderrahmane; Cardot, Hervé; Devilliers, Hervé

Affiliation

Bourredjem A; Inserm CIC1432, Clinical Epidemiology Unit, Dijon, France.
Cardot H; Centre d'investigation Clinique, Module Epidémiologie Clinique/Essais Cliniques, Dijon-Bourgogne University Hospital, Dijon, France.
Devilliers H; Institut de Mathématiques de Bourgogne, UMR 5584, CNRS, Université de Bourgogne, Dijon, France.

Stat Med ; 2024 Sep 16.

Article in En | MEDLINE | ID: mdl-39285135

ABSTRACT

The agreement intra-class correlation coefficient (ICCa) is a suitable statistical index for inter-rater reliability studies. With balanced Gaussian data, we prove the explicit form of the asymptotic normality (ASN) of ICCa, valid with analysis of variance (ANOVA), maximum likelihood (ML), or restricted ML (REML) estimates. An asymptotic confidence interval is then derived, and its performance is compared by simulation with the most commonly used methods under small, moderate, and large sample size designs. We then deduce sample size formulas for the number of subjects and observers needed to achieve a desired confidence interval width, or a desired power for testing an acceptable ICCa value, and give concrete examples of their use. Finally, we propose a likelihood ratio test (LRT) to compare two ICCa's from two distinct subpopulations of patients (or raters) and study its type I error and power by simulation. These methods are illustrated using data from two inter-rater reliability studies, one in physiotherapy with 42 patients and 10 raters and the other in neonatology with 80 subjects and 14 raters. In conclusion, we recommend using the proposed confidence interval for medium to large samples, combined with calculation of the minimal required sample size at the planning step, or of the posterior power at the analysis step, using simple dedicated formulas. Furthermore, with sufficiently large samples, the proposed LRT seems suitable for comparing inter-rater reliability between two patient subpopulations. Used wisely, this toolbox of proposed methods can remedy common current issues in inter-rater reliability studies.
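
As a rough illustration of the quantity studied above, the sketch below computes the ANOVA point estimate of ICCa from a balanced subjects-by-raters table under the two-way crossed random effects model. It uses the textbook method-of-moments estimator, not the authors' code; the function name `icc_agreement` and the array layout (rows are subjects, columns are raters) are assumptions made for this example. The asymptotic confidence interval, sample size formulas, and LRT are not reproduced here because the abstract does not state them.

```python
import numpy as np


def icc_agreement(ratings):
    """ANOVA estimate of the agreement ICC for a balanced n x k table.

    Hypothetical helper for illustration: rows are subjects, columns are
    raters, one rating per cell, no missing values.
    """
    y = np.asarray(ratings, dtype=float)
    n, k = y.shape  # n subjects, k raters
    grand = y.mean()
    row_means = y.mean(axis=1)  # per-subject means
    col_means = y.mean(axis=0)  # per-rater means

    # Mean squares of the two-way crossed layout without replication
    ms_subject = k * np.sum((row_means - grand) ** 2) / (n - 1)
    ms_rater = n * np.sum((col_means - grand) ** 2) / (k - 1)
    resid = y - row_means[:, None] - col_means[None, :] + grand
    ms_error = np.sum(resid ** 2) / ((n - 1) * (k - 1))

    # Method-of-moments variance components
    var_subject = (ms_subject - ms_error) / k
    var_rater = (ms_rater - ms_error) / n
    var_error = ms_error

    # Agreement ICC: subject variance over total variance
    return var_subject / (var_subject + var_rater + var_error)
```

Calling `icc_agreement(table)` on an n-by-k array returns a single number; because the variance components are moment estimates, the result can be negative in small samples, which is one reason interval estimates matter in practice.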

Key words

agreement intraclass correlation coefficient; asymptotic normality (ASN); balanced two-way crossed random effect model; confidence interval (CI); likelihood ratio test (LRT); sample size

Full text: 1 | Collection: 01-internacional | Database: MEDLINE | Language: En | Journal: Stat Med / Stat. med / Statistics in medicine | Year: 2024 | Document type: Article | Affiliation country: France | Country of publication: United Kingdom
