An explanatory variable must be explicable
The turning point for diabetes came in the early 1900s, when the roles of insulin and the pancreas were discovered. Armed with new knowledge of the importance of insulin regulation, doctors could now intervene and thereby influence survival. "Good or poor insulin regulation" is more than a mere predictor of death, the way "sweet urine" is. It is an explanatory variable for death, because it holds information about the physiological mechanisms involved in diabetes.
In 1954, the Danish doctor Jørgen Pedersen (1914–78) suggested that diabetic mothers give birth to large babies due to increased transfer of glucose from the pregnant mother to the foetus (1). In 2008, the Pedersen hypothesis was extended to include non-diabetic mothers, when a regression analysis found an almost linear association between the mother's blood glucose level and the child's birth weight (2). Thanks to meticulous research, we now know that the associations between maternal blood glucose levels and pregnancy outcomes like macrosomia and neonatal hypoglycaemia are not merely correlations; they are causal relations.
When investigating causal relations between the mother's blood glucose levels and adverse birth outcomes, we must draw on all our physiological and clinical expertise to decide which variables to include in the accompanying regression models.
In contrast to prediction models, we cannot choose freely which variables to include in regression models used for the estimation of mechanistic effects and causal relations. Whereas a predictor may be any quantitative trait, explanatory variables must be part of the presumed causal chain, and we must specify the main exposure, confounders, mediators and colliders. Also, unlike a prediction model, the mathematical equation should be fairly simple. There is little use in an inexplicable explanatory model.
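The point about variable selection can be illustrated with a small simulation. The sketch below is not based on real data: the variable names (maternal BMI as a confounder, a "neonatal care score" as a collider), the effect sizes and the noise levels are all invented for illustration. It fits three linear regressions of birth weight on maternal blood glucose and compares the glucose coefficient with the effect built into the simulation.

```python
import numpy as np


def ols_slopes(y, *columns):
    """Fit ordinary least squares with an intercept; return the slopes only."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1:]


rng = np.random.default_rng(0)
n = 20_000

# Hypothetical true causal effect: grams of birth weight per mmol/L of glucose.
TRUE_EFFECT = 200.0

# Confounder (assumed): maternal BMI raises both glucose and birth weight.
bmi = rng.normal(25.0, 4.0, n)
glucose = 4.5 + 0.05 * (bmi - 25.0) + rng.normal(0.0, 0.4, n)
birth_weight = (
    3500.0
    + TRUE_EFFECT * (glucose - 4.5)
    + 30.0 * (bmi - 25.0)
    + rng.normal(0.0, 400.0, n)
)

# Collider (assumed): a score that is caused by both exposure and outcome.
care_score = (
    1.0 * (glucose - 4.5)
    + 0.002 * (birth_weight - 3500.0)
    + rng.normal(0.0, 1.0, n)
)

# Glucose coefficient under three different choices of covariates.
crude = ols_slopes(birth_weight, glucose)[0]
adjusted = ols_slopes(birth_weight, glucose, bmi)[0]
collider_adjusted = ols_slopes(birth_weight, glucose, bmi, care_score)[0]

print(f"true effect:                 {TRUE_EFFECT:8.1f}")
print(f"crude (no adjustment):       {crude:8.1f}")
print(f"adjusted for confounder:     {adjusted:8.1f}")
print(f"also adjusted for collider:  {collider_adjusted:8.1f}")
```

In this simulated setting, only the model that adjusts for the confounder and leaves out the collider recovers a glucose coefficient close to the built-in effect. Omitting the confounder inflates the estimate, and adding the collider distorts it, even though a model with more covariates may well predict birth weight better.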