Measurements below the detection limit

Medicine and numbers

    Many methods for measuring levels of a substance in a sample have a lower detection limit. Data from these measurements must be handled in a way that avoids systematic errors.

    Let's start with an example: Figure 1 shows a fictional dataset of measurements of serum levels of a substance in two patient groups. The crosses indicate the actual values, but only those above the lower detection limit can be measured. We know how many values are below the detection limit, but not what those values are. These data are missing not at random (MNAR), because the probability that a value is missing depends on the unobserved value itself: it is missing precisely when it lies below the detection limit (1).

    Bias

    Can the missing values simply be ignored and just the measurable values be analysed? That would be bad practice, because discarding only the lowest values biases the results: both the median and the mean of the measurements are overestimated. In Figure 1, this would apply mostly to group 1, which has the largest proportion of values below the detection limit.
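
    The overestimation can be illustrated with a small simulation. This is a minimal sketch using only the Python standard library; the log-normal distribution, the detection limit of 0.5 and the sample size are assumptions chosen for illustration and are not taken from Figure 1.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

LOD = 0.5  # hypothetical lower detection limit

# Hypothetical "actual" serum levels, drawn from a log-normal distribution
actual = [random.lognormvariate(0.0, 1.0) for _ in range(1000)]

# Only values at or above the detection limit can be measured
measured = [v for v in actual if v >= LOD]
share_below = 1 - len(measured) / len(actual)

actual_mean = statistics.mean(actual)
actual_median = statistics.median(actual)
measured_mean = statistics.mean(measured)
measured_median = statistics.median(measured)

print(f"Share below detection limit: {share_below:.1%}")
print(f"Mean:   actual {actual_mean:.2f}, measurable values only {measured_mean:.2f}")
print(f"Median: actual {actual_median:.2f}, measurable values only {measured_median:.2f}")
```

    Because only the lowest values are dropped, both summary statistics computed from the measurable values lie above those of the full dataset.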

    Measuring several substances

    In many studies, not just one substance is measured in a sample, but a number of related substances, for example proteins, hormones or metabolites. In these studies, the substances that have a large number of values below the detection limit will contribute little information and are often left out of further analyses. The threshold for exclusion is typically set somewhere between 30 % and 50 % missing values. Before excluding a substance, however, it is important to investigate whether the missing values are evenly distributed between the groups being studied, or whether they mainly occur in one of the groups. If the values for a substance are below the detection limit in samples in group 1, but are detectable in group 2, this substance may well be an excellent biomarker to differentiate between group 1 and 2. An example of this is PSA measurements in patients who have undergone surgery for prostate cancer, where PSA will generally only be detectable in patients who have had disease recurrence.
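
    The check described above can be sketched as follows. The data, the substance names, the 40 % cut-off and the 0.5 difference used to flag an uneven distribution are all hypothetical values chosen for illustration; `None` stands for a measurement below the detection limit.

```python
# None marks a measurement below the detection limit (hypothetical data)
samples = {
    "substance_A": {
        "group 1": [None, None, None, None, None, 0.3],
        "group 2": [1.2, 0.9, 1.5, None, 2.0, 1.1],
    },
    "substance_B": {
        "group 1": [None, 0.8, None, 1.1, None, 0.7],
        "group 2": [0.9, None, None, 1.3, None, 0.6],
    },
}

EXCLUSION_THRESHOLD = 0.4  # assumed cut-off; the text cites 30 % to 50 %


def share_below_limit(values):
    """Proportion of values recorded as below the detection limit."""
    return sum(v is None for v in values) / len(values)


def summarise(groups):
    """Return the overall and per-group share of values below the limit."""
    pooled = [v for values in groups.values() for v in values]
    per_group = {name: share_below_limit(values) for name, values in groups.items()}
    return share_below_limit(pooled), per_group


for substance, groups in samples.items():
    overall, per_group = summarise(groups)
    spread = max(per_group.values()) - min(per_group.values())
    over = overall > EXCLUSION_THRESHOLD
    print(substance, f"overall {overall:.0%}", {g: f"{p:.0%}" for g, p in per_group.items()})
    if over and spread > 0.5:  # 0.5 is an arbitrary illustration, not a rule
        print("  -> above the cut-off, but concentrated in one group: inspect before excluding")
    elif over:
        print("  -> above the cut-off and evenly spread: candidate for exclusion")
```

    In this made-up example both substances have half of their values below the limit, but only the first one has them concentrated in a single group, which is the pattern that may point to a useful biomarker.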

    Single imputation of values

    In many cases, it will be appropriate to impute the missing values. A simple approach is to replace the missing values with a particular value, which will usually be the detection limit, half the limit or zero. However, simulation studies have demonstrated that imputation with zero is generally not advisable, and that imputation with half the detection limit is the best of these alternatives because it introduces the least bias into the estimates (2).
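
    The three single-imputation choices can be compared in a small simulation where the actual values are known. This is only a sketch under assumed conditions (log-normal data, a hypothetical detection limit of 0.5); it illustrates one scenario and does not reproduce the simulation studies in reference (2).

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is reproducible

LOD = 0.5  # hypothetical lower detection limit

# Hypothetical actual values; in a real study those below LOD are unknown
actual = [random.lognormvariate(0.0, 1.0) for _ in range(1000)]


def impute(values, limit, fill):
    """Replace every value below the detection limit with one fixed value."""
    return [v if v >= limit else fill for v in values]


actual_mean = statistics.mean(actual)
bias = {}
for label, fill in [("zero", 0.0), ("LOD/2", LOD / 2), ("LOD", LOD)]:
    imputed = impute(actual, LOD, fill)
    # Bias in the estimated mean relative to the mean of the actual values
    bias[label] = statistics.mean(imputed) - actual_mean
    print(f"Fill with {label:>5}: bias in mean {bias[label]:+.3f}")
```

    Imputing zero necessarily pulls the mean down and imputing the detection limit necessarily pushes it up; how close half the limit comes to the truth depends on the shape of the distribution below the limit.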

    Advanced methods

    In cases where a large proportion of values are below the detection limit, it may be necessary to use more advanced methods than imputing the same value for all observations below the detection limit. Examples of these are methods based on multiple imputation, which take account of variance structures in the dataset (3). Other widely used methods use maximum likelihood estimation, based on the assumed multivariate probability distribution (4). It is also possible to use regression models for censored data, e.g. tobit models, to analyse the data without needing to impute values.
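
    The idea behind censored-data models can be sketched for the simplest case: a single group, no covariates, and values assumed to be normally distributed on the log scale. Observed values contribute their density to the likelihood, while each value below the limit contributes the probability of lying below it. The code is a hand-written illustration with a crude grid search, not the methods of references (3) or (4) and not a replacement for an established statistical package; the data, the limit and the function names are hypothetical.

```python
import math
import random
import statistics

random.seed(3)  # fixed seed so the sketch is reproducible

LOG_LOD = math.log(0.5)  # hypothetical detection limit on the log scale

# Hypothetical log-concentrations with true mean 0 and true SD 1
log_values = [random.gauss(0.0, 1.0) for _ in range(1000)]
observed = [y for y in log_values if y >= LOG_LOD]
n_censored = len(log_values) - len(observed)  # known count, unknown values


def normal_cdf(z):
    """Standard normal CDF, written with erfc to stay accurate in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


def log_likelihood(mu, sigma, obs, n_cens, limit):
    """Normal log-likelihood with n_cens values known only to lie below limit."""
    p_below = normal_cdf((limit - mu) / sigma)
    if n_cens > 0 and p_below <= 0.0:
        return -math.inf
    ss = sum((y - mu) ** 2 for y in obs)
    ll = -len(obs) * math.log(sigma * math.sqrt(2 * math.pi)) - ss / (2 * sigma ** 2)
    return ll + (n_cens * math.log(p_below) if n_cens > 0 else 0.0)


def fit_censored_normal(obs, n_cens, limit, rounds=8, steps=20):
    """Crude maximum likelihood fit by repeatedly zooming a grid search."""
    mean, sd = statistics.mean(obs), statistics.stdev(obs)
    mu_lo, mu_hi = mean - 3 * sd, mean + sd
    sg_lo, sg_hi = 0.2 * sd, 3 * sd
    for _ in range(rounds):
        mu_step = (mu_hi - mu_lo) / steps
        sg_step = (sg_hi - sg_lo) / steps
        _, mu, sg = max(
            (log_likelihood(mu_lo + i * mu_step, sg_lo + j * sg_step, obs, n_cens, limit),
             mu_lo + i * mu_step, sg_lo + j * sg_step)
            for i in range(steps + 1) for j in range(steps + 1)
        )
        # Zoom in to two grid steps either side of the best point
        mu_lo, mu_hi = mu - 2 * mu_step, mu + 2 * mu_step
        sg_lo, sg_hi = max(sg - 2 * sg_step, 1e-9), sg + 2 * sg_step
    return mu, sg


mu_hat, sigma_hat = fit_censored_normal(observed, n_censored, LOG_LOD)
naive_mean = statistics.mean(observed)  # ignores the censored values entirely

print(f"Censored ML estimate: mean {mu_hat:.2f}, SD {sigma_hat:.2f}")
print(f"Mean of observed values only: {naive_mean:.2f}")
```

    No value is imputed at any point: the number of observations below the limit enters the likelihood directly. The estimates are only as good as the distributional assumption, which cannot be checked below the detection limit.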

    Choice of method

    There is no general and universal method for handling data below the detection limit. Imputation with half the detection limit may work well in many cases. Many will argue that advanced methods are needed when more than 10 % or 20 % of observations fall below the detection limit. However, this depends on which statistical analysis methods are planned. If non-parametric analysis methods are to be used, e.g. a Wilcoxon-Mann-Whitney test, the result will not be much affected by how data below the limit are handled, even with higher proportions. Single imputation can also work well if parametric analysis methods are to be used, e.g. a t-test, although the standard deviation may be downward biased (5).
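
    The robustness of rank-based tests can be illustrated with a small made-up dataset. The sketch below computes the Mann-Whitney U statistic by hand (pairs are counted directly, ties count one half) and assumes a single detection limit shared by both groups; the values and the limit are hypothetical.

```python
import statistics

LOD = 1.0  # hypothetical lower detection limit

# Hypothetical measurements; None marks a value below the detection limit
group1 = [None, None, None, 1.4, 2.1, None, 1.2, 3.0]
group2 = [2.5, None, 3.1, 1.8, 4.2, 2.9, None, 3.6]


def impute(values, fill):
    """Replace values below the detection limit with one fixed value."""
    return [fill if v is None else v for v in values]


def mann_whitney_u(x, y):
    """U statistic of x versus y: pairs with x > y count 1, ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


results = {}
for label, fill in [("zero", 0.0), ("LOD/2", LOD / 2), ("LOD", LOD)]:
    x, y = impute(group1, fill), impute(group2, fill)
    results[label] = (mann_whitney_u(x, y), statistics.stdev(x))
    print(f"Fill with {label:>5}: U = {results[label][0]}, SD in group 1 = {results[label][1]:.2f}")
```

    As long as the imputed value lies below every measured value, the ranks and therefore the U statistic do not change. The standard deviation does depend on the imputed value, which is why parametric analyses are more sensitive to this choice.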
