Avoid significance tests for background variables in randomised controlled trials

Stian Lydersen

doi:10.4045/tidsskr.19.0684

Medicine and numbers

E-mail: stian.lydersen@ntnu.no

Stian Lydersen, dr.ing. and professor of medical statistics at the Regional Centre for Child and Youth Mental Health and Child Welfare (RKBU Central Norway) at the Department of Mental Health, Norwegian University of Science and Technology.

The author has completed the ICMJE form and declares no conflicts of interest.

In a randomised controlled trial there should be no systematic differences between the groups before treatment. Still, some researchers choose to significance test for possible differences in background variables. This is superfluous, and can be misleading.

An important strength of randomised controlled trials is that background variables, for example age and sex, are randomly distributed between the treatment groups. According to the CONSORT guidelines, baseline demographic and clinical characteristics should be reported separately for each group (1). Table 1 shows an example, an extract from a larger table with 23 variables, from a study comparing two treatment pathways for hip fractures. The authors write 'Baseline characteristics did not differ between the groups' ((2), p. 1629). This is based on clinical judgement. The authors did not carry out significance tests for possible differences, which is in line with the CONSORT guidelines (1).

Table 1

Extract from a table with background variables in a randomised controlled trial (2). Reproduced with permission from Elsevier. Number (%) unless otherwise specified.

	Comprehensive geriatric care (N=198)	Orthopaedic care (N=199)
Age in years, mean (standard deviation)	83.4 (5.4)	83.2 (6.4)
Female	145 (73)	148 (74)
Living alone	115 (58)	124 (62)
Intracapsular fracture	119 (60)	127 (64)

There will always be some differences in background variables in a randomised controlled trial. These differences are usually small, and because of the randomisation they are known to be due to chance. If significance tests at the 5 % level were carried out for the background variables, one would expect to find statistically significant differences in about 5 % of the cases, that is, for about one variable out of twenty. In the study mentioned above, such a 'significant' finding could thus be expected in about 1 of the 23 background variables.
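The arithmetic behind this expectation can be sketched as follows. The sketch assumes that each background variable is tested at the 5 % level and, as a simplification that is not part of the original argument, that the 23 tests are independent. Background variables are often correlated in practice, so the second figure is only a rough guide. The function names are illustrative.

```python
# Expected number of chance 'significant' findings among baseline tests.
# Assumptions: each test uses a 5 % significance level, and the tests are
# treated as independent (a simplification; real covariates are correlated).

ALPHA = 0.05        # significance level per test
N_VARIABLES = 23    # background variables in the full table of the example


def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of tests with p < alpha when only chance is at work."""
    return n_tests * alpha


def prob_at_least_one(n_tests: int, alpha: float) -> float:
    """Probability of at least one p < alpha among independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


expected = expected_false_positives(N_VARIABLES, ALPHA)
at_least_one = prob_at_least_one(N_VARIABLES, ALPHA)

print(f"Expected 'significant' variables: {expected:.2f}")
print(f"Probability of at least one: {at_least_one:.2f}")
```

Under these assumptions the expected count is 1.15, and the chance of seeing at least one 'significant' background variable is roughly 69 %, even though randomisation guarantees that any difference is due to chance.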

However, such significance testing is still reported in some randomised controlled trials. What could the motivation be? We can think of two reasons: to test whether randomisation was performed properly, and to identify unbalanced background variables.

Was randomisation performed properly?

If there are reasons to suspect that randomisation was not performed properly, this can be tested. However, a significance level of considerably less than 5 % should be used. Fayers and King describe a trial with an overrepresentation of younger participants in one of the groups. The difference was highly significant, with p<0.0005. A closer look revealed a breach of adherence to the randomisation protocol (3).

Unbalanced background variables?

A more common motivation is probably to identify background variables that are unbalanced between the groups, so that one can then adjust for these in the analyses. But statistical significance depends on both sample size and the degree of imbalance, so this approach is discouraged (4, 5). In a small study, a variable can be very unbalanced without the imbalance being statistically significant. Conversely, in a large study, even a negligible imbalance can be statistically significant. It could be more sensible to judge the degree of observed imbalance, and adjust for variables that are unbalanced and judged to be clinically important, than to use p-value driven selection. But this is also controversial, since it would again be a data-driven selection of variables for the analysis ((6), p. 417–8). If such adjustments are carried out, they should be sensitivity analyses performed after the primary analysis.
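The dependence of the p-value on sample size can be illustrated with a minimal sketch. The counts below are hypothetical and are not taken from any of the cited studies, and the pooled two-proportion z-test is chosen only for illustration. Its normal approximation is rough for a group size of 20.

```python
import math


def two_proportion_p_value(x1: int, n1: int, x2: int, n2: int) -> float:
    """Two-sided p-value from a pooled two-proportion z-test (normal approximation)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical small trial: 40 % versus 60 % with a given characteristic.
p_small = two_proportion_p_value(8, 20, 12, 20)

# Hypothetical large trial: 50 % versus 53 % with the same characteristic.
p_large = two_proportion_p_value(2500, 5000, 2650, 5000)

print(f"Small trial, 20-point imbalance: p = {p_small:.3f}")
print(f"Large trial, 3-point imbalance:  p = {p_large:.4f}")
```

In this sketch the 20-percentage-point imbalance in the small trial gives a p-value of about 0.21, whereas the 3-point imbalance in the large trial gives a p-value of about 0.003. A p-value threshold would therefore flag the minor imbalance and miss the substantial one.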

No reason to significance test

There is no reason to significance test for differences in background variables in a randomised controlled trial, unless there is reason to believe that randomisation was not performed properly. De Boer and colleagues call this an unhealthy research behaviour that is hard to eradicate (5). In some randomised controlled trials it may be sensible to adjust for certain background variables, but these must be prespecified before the data are examined. This will be discussed in the next article in this series.

Literature

1. Moher D, Hopewell S, Schulz KF et al. CONSORT 2010 explanation and elaboration: updated guidelines for reporting parallel group randomised trials. BMJ 2010; 340: c869.
2. Prestmo A, Hagen G, Sletvold O et al. Comprehensive geriatric care for patients with hip fractures: a prospective, randomised, controlled trial. Lancet 2015; 385: 1623–33.
3. Fayers PM, King M. A highly significant difference in baseline characteristics: the play of chance or evidence of a more selective game? Qual Life Res 2008; 17: 1121–3.
4. Pocock SJ, Assmann SE, Enos LE et al. Subgroup analysis, covariate adjustment and baseline comparisons in clinical trial reporting: current practice and problems. Stat Med 2002; 21: 2917–30.
5. de Boer MR, Waterlander WE, Kuijper LD et al. Testing for baseline differences in randomized controlled trials: an unhealthy research behavior that is hard to eradicate. Int J Behav Nutr Phys Act 2015; 12: 4.
6. Vittinghoff E, Glidden DV, Shiboski SC et al. Regression methods in biostatistics: linear, logistic, survival, and repeated measures models. 2nd ed. New York: Springer, 2012.

Comments (1)

11.03.2020:

Several journals have recommended testing of background variables in randomised controlled trials in their author guidelines. As of January 2019, the New England Journal of Medicine had this in its author guidelines: 'For tables comparing treatment groups at baseline in a randomized trial (usually the first table in the manuscript), significant differences between or among groups (i.e., P<0.05) should be identified in a table footnote and the P value should be provided in the format specified above.'

I made the editors aware that this recommendation conflicts with the CONSORT guidelines, which state: 'Unfortunately significance tests of baseline differences are still common; they were reported in half of 50 RCTs trials published in leading general journals in 1997. Such significance tests assess the probability that observed baseline differences could have occurred by chance; however, we already know that any differences are caused by chance. Tests of baseline differences are not necessarily wrong, just illogical. Such hypothesis testing is superfluous and can mislead investigators and their readers. Rather, comparisons at baseline should be based on consideration of the prognostic strength of the variables measured and the size of any chance imbalances that have occurred.'

On this basis I got the idea of writing about this topic in Medicine and numbers in Tidsskriftet (1). In the meantime, I see that the above recommendation has been removed from the revised author guidelines of the New England Journal of Medicine. Unfortunately, we must expect that such recommendations still exist in some other journals.

Literature:

1. Lydersen S. Unngå signifikanstesting av bakgrunnsvariable i randomiserte kontrollerte studier. Tidsskrift for Den norske legeforening 2020.

Published: 9 March 2020

Tidsskr Nor Legeforen 9 March 2020 Vol. 140.
