In a longitudinal study, data may be missing at one or more time points for some participants. ‘Last observation carried forward’ is a simple method for imputing missing values, but it has serious weaknesses.
Missing data may arise when participants drop out of the study or temporarily fail to appear at one or more time points. One method for handling missing data in longitudinal studies is called ‘Last observation carried forward’ (LOCF). In studies with one measurement at baseline and one afterwards, the method can be termed ‘Baseline observation carried forward’ (BOCF). The methods are applied as follows: if an observation is missing, the participant's most recent observed value is imputed in its place, and it is carried forward to every later time point at which the value is also missing. This is illustrated in Figure 1. In this example, the outcome variable increases with time for all participants, so the imputed values systematically underestimate the true values. But what is the case in general?
Figure 1 ‘Last observation carried forward’: Panel A shows a complete data set for four participants at three time points. If the values marked X are missing, the last observed value is imputed (marked with an arrow in panel B).
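The imputation rule can be sketched in a few lines of Python. This is a minimal illustration, not code from the study: the function name `locf` and the data values are assumptions made up for the example, and they are not the values shown in Figure 1.

```python
from typing import List, Optional, Sequence


def locf(values: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Replace each missing value (None) with the most recent observed value.

    A value that is missing before any observation exists stays None,
    because there is nothing to carry forward.
    """
    filled: List[Optional[float]] = []
    last_observed: Optional[float] = None
    for value in values:
        if value is not None:
            last_observed = value
        filled.append(last_observed)
    return filled


# Hypothetical complete data: four participants, three time points,
# with an outcome that rises over time.
complete = {
    "P1": [10.0, 12.0, 14.0],
    "P2": [11.0, 13.0, 15.0],
    "P3": [9.0, 11.0, 13.0],
    "P4": [12.0, 14.0, 16.0],
}
# The same data with some later measurements missing (None).
observed = {
    "P1": [10.0, 12.0, 14.0],
    "P2": [11.0, 13.0, None],
    "P3": [9.0, None, None],
    "P4": [12.0, 14.0, 16.0],
}

imputed = {pid: locf(series) for pid, series in observed.items()}

# Imputation error (imputed minus true value) at the final time point.
final_errors = {pid: imputed[pid][-1] - complete[pid][-1] for pid in complete}

for pid in complete:
    print(pid, imputed[pid], "error at final time point:", final_errors[pid])
```

With these made-up numbers, every non-zero error is negative, which matches the underestimation described above for an outcome that increases with time.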
Is the method conservative?
The method has been used in many studies, including randomised controlled trials. The US Food and Drug Administration (FDA) used to recommend LOCF, considering it to be conservative (1, pp. 16–17). Here, conservative means that any bias leads to the treatment effect being underestimated. However, it turns out that the method can yield bias in either direction, and it may even yield bias if data are missing completely at random (2, pp. 47–50). In 2010, the National Research Council’s Expert Panel on the prevention and treatment of missing data in clinical trials, which had been convened at the request of the FDA, concluded that neither LOCF nor BOCF should be used to handle missing data unless the assumptions that underlie them are scientifically justified (3, p. 77).
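A small calculation may help show why the bias can go in either direction even when data are missing completely at random. The sketch below is an illustration under stated assumptions, not an example taken from the cited sources: two arms, three visits, participants who leave do not return, the same dropout pattern in both arms, and hypothetical mean values. Under these assumptions, the expected LOCF mean at the final visit is a weighted average of the true visit means, weighted by the share of participants whose last observation falls at each visit.

```python
from typing import Sequence, Tuple


def expected_locf_mean(visit_means: Sequence[float],
                       last_visit_share: Sequence[float]) -> float:
    """Expected mean of the LOCF-completed outcome at the final visit.

    visit_means[t]       true mean outcome in the arm at visit t
    last_visit_share[t]  share of participants whose last observed visit is t

    If dropout is completely at random, those who leave after visit t have,
    on average, the arm's mean at visit t, so the LOCF mean is a weighted
    average of the visit means.
    """
    if abs(sum(last_visit_share) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(m * s for m, s in zip(visit_means, last_visit_share))


def effects(treatment: Sequence[float], control: Sequence[float],
            share: Sequence[float]) -> Tuple[float, float]:
    """Return (true effect, expected LOCF effect) at the final visit."""
    true_effect = treatment[-1] - control[-1]
    locf_effect = (expected_locf_mean(treatment, share)
                   - expected_locf_mean(control, share))
    return true_effect, locf_effect


# Hypothetical numbers: 20 % last seen at visit 1, 20 % at visit 2,
# 60 % complete the study, identically in both arms.
share = [0.2, 0.2, 0.6]
control = [50.0, 50.0, 50.0]

# Treatment effect that grows over time.
growing = effects([50.0, 55.0, 60.0], control, share)
# Treatment effect that peaks early and then fades.
fading = effects([50.0, 58.0, 52.0], control, share)

print("growing effect (true, LOCF):", growing)
print("fading effect (true, LOCF):", fading)
```

In this sketch, the growing effect is underestimated and the fading effect is overestimated, although dropout is unrelated to the outcome in both scenarios.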
The Handbook of Missing Data Methodology (2015) states that ‘As LOCF is neither valid under general assumptions nor based on statistical principles, it is not a sensible method, and should not be used’. Other recent books on missing data also recommend against using LOCF (1, p. 16) (5, p. 11) (6, p. 59). Finally, I quote Vickers & Altman (7): ‘This method (LOCF) is attractive because it is simple, but it has little else to recommend it.’
Better alternatives for handling missing data are available. In a longitudinal study, for example, a linear mixed model is often well suited. With this method, there is no need to impute missing data. All observed data are included in the analysis, including data from participants with missing values at one or more time points, and the results are unbiased if data are missing at random (5, p. 130) (8).
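As a sketch of what such an analysis might look like, one common specification for a two-group study is a random-intercept model. The notation is an assumption made for illustration and is not taken from the cited sources: y_ij is the outcome for participant i at time point j, t_j is the time, g_i is a group indicator, b_i is a participant-specific random intercept, and ε_ij is the residual.

```latex
y_{ij} = \beta_0 + \beta_1 t_j + \beta_2 g_i + \beta_3 g_i t_j + b_i + \varepsilon_{ij},
\qquad b_i \sim N(0, \sigma_b^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2)
```

The likelihood is built only from the pairs (i, j) that were actually observed, so a participant with a missing time point still contributes their remaining measurements. In this specification, β_3 describes how the change over time differs between the groups.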