Test validity
Validity concerns whether the test results say something about what we want to know. Does a mammogram say anything about the occurrence of breast cancer? Does a tuberculosis test say anything about whether a person is infected with Mycobacterium tuberculosis?
Validity is expressed by sensitivity and specificity. Sensitivity is the ability of the test to yield a positive result when the disease in question is present or doping has taken place. A sensitivity of 0.9 means that one in ten persons with the disease is not detected or identified; such a result is called a «false negative». Specificity is the ability of the test to yield a negative result for persons who do not have the disease or who are not doped. A specificity of 0.9 means that one in ten persons who are not sick or doped will nonetheless test positive; such results are called «false positives».
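To make these definitions concrete, the following is a minimal sketch in Python (not part of the original article); it computes sensitivity and specificity from the four cells of a 2 × 2 table, using the same counts as Table 1 below.

```python
# Illustrative counts, identical to those in Table 1 below.
true_positives = 90     # sick persons with a positive test
false_negatives = 10    # sick persons with a negative test
true_negatives = 8910   # healthy persons with a negative test
false_positives = 990   # healthy persons with a positive test

# Sensitivity: the probability that a sick person tests positive.
sensitivity = true_positives / (true_positives + false_negatives)

# Specificity: the probability that a healthy person tests negative.
specificity = true_negatives / (true_negatives + false_positives)

print(sensitivity)  # 0.9
print(specificity)  # 0.9
```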
It is important that both sensitivity and specificity are as close to 100 per cent as possible. But it is important to recognise that attempts to increase the sensitivity of a test will generally reduce its specificity, and vice versa. There is no a priori optimal balance between the two: the optimal balance depends on the nature of the disease or condition and on the relative costs of false negatives and false positives. Very high sensitivity is important if there is an effective treatment for a serious disease. In doping work, it is specificity that matters most. That some doped athletes slip through the net is a less serious matter than the trauma occasioned by the conviction of an athlete who is not doped. A very high specificity, and the possibility of documenting it with statistically satisfactory empirical evidence, is therefore a crucial legal safeguard. Without knowing the specificity, it is fundamentally impossible to conclude whether disease (or in this case a forbidden substance) is present.
If, for example, 90 per cent of the sick persons test positive, can we then conclude that 90 per cent of those testing positive are sick? It is not uncommon to hear a conclusion of this kind. But it is not correct, and we demonstrate this in Table 1. Let us assume we have the following values: sensitivity 0.9, specificity 0.9, pre-test probability (prevalence) 0.01.
Table 1
Relationship between sensitivity (0.9), specificity (0.9), prevalence (0.01) and positive predictive value in a population of 10 000
| | Test + | Test – | Total |
|---|---|---|---|
| Condition + | 90 | 10 | 100 |
| Condition – | 990 | 8 910 | 9 900 |
| Total | 1 080 | 8 920 | 10 000 |
| Positive predictive value | 0.083 | | |
The pre-test probability of a disease is the assumed prevalence of the disease in the population being tested, a probability based on previous experience and knowledge. Given these figures, 100 persons out of a population of 10 000 would be sick. Of those who are sick, 90 test positive. Of those who are healthy, 990 test positive. Of the 1 080 who test positive, there are 90 who are sick – i.e., only 8.3 per cent of those who test positive are sick.
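The arithmetic behind Table 1 can be expressed compactly. The sketch below (Python, not part of the original article; the function name predictive_table is ours) derives the table cells and the positive predictive value from sensitivity, specificity, prevalence and population size.

```python
def predictive_table(sensitivity, specificity, prevalence, population):
    """Build the 2 x 2 table and the positive predictive value."""
    sick = prevalence * population
    healthy = population - sick

    true_positives = sensitivity * sick          # sick, test positive
    false_negatives = sick - true_positives      # sick, test negative
    true_negatives = specificity * healthy       # healthy, test negative
    false_positives = healthy - true_negatives   # healthy, test positive

    total_positive = true_positives + false_positives
    ppv = true_positives / total_positive        # positive predictive value
    return true_positives, false_positives, total_positive, ppv

tp, fp, total_pos, ppv = predictive_table(0.9, 0.9, 0.01, 10_000)
print(tp, fp, total_pos)   # 90.0 990.0 1080.0
print(round(ppv, 3))       # 0.083
```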
What happens if we assume a pre-test probability that is lower, for example 0.005? Table 2 shows that the positive predictive value of the test is then reduced from 8.3 per cent to 4.3 per cent.
Table 2
Relationship between sensitivity (0.9), specificity (0.9), prevalence (0.005) and positive predictive value in a population of 10 000
| | Test + | Test – | Total |
|---|---|---|---|
| Condition + | 45 | 5 | 50 |
| Condition – | 995 | 8 955 | 9 950 |
| Total | 1 040 | 8 960 | 10 000 |
| Positive predictive value | 0.043 | | |
The probability of a positive test indicating disease is thus dependent on the prevalence of the disease in the population being tested. This means, for example, that the probability of a given mammogram shadow indicating breast cancer for a randomly selected group of women in their 50s is different from that for a group of women who have a family history of breast cancer.
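Using the same hypothetical function as in the sketch above, varying only the prevalence reproduces the contrast between Table 1 and Table 2:

```python
# Same test characteristics, different pre-test probabilities.
for prevalence in (0.01, 0.005):
    *_, ppv = predictive_table(0.9, 0.9, prevalence, 10_000)
    print(prevalence, round(ppv, 3))
# 0.01 0.083
# 0.005 0.043
```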