Paper: Causality and Causal Inference in Epidemiology: the Need for a Pluralistic Approach

Delighted to announce the online publication of this paper in the International Journal of Epidemiology, with Jan Vandenbroucke and Neil Pearce: ‘Causality and Causal Inference in Epidemiology: the Need for a Pluralistic Approach’.

This paper has already generated some controversy and I’m really looking forward to talking about it with my co-authors at the London School of Hygiene and Tropical Medicine on 7 March. (I’ll also be giving some solo talks while in the UK, at Cambridge, UCL, and Oxford, as well as one in Bergen, Norway.)

The paper is on the same topic as a single-authored paper of mine published in late 2015, ‘Causation and Prediction in Epidemiology: a Guide to the Methodological Revolution.’ But it is much shorter, and nonetheless manages to add a lot that was not present in my sole-authored paper – notably a methodological dimension of which, as a philosopher by training, I was ignorant. The co-authoring process was thus really rich and interesting for me.

It also makes me think that philosophy papers should be shorter… Do we really need the first 2500 words summarising the current debate etc? I wonder if a more compressed style might actually stimulate more thinking, even if the resulting papers are less argumentatively airtight. One might wonder how often the airtight ideal is achieved even with traditional-length papers… Who was it who said that in philosophy, it’s all over by the end of the first page?

Is consistency trivial in randomized controlled trials?

Here are some more thoughts on Hernán and Taubman’s famous 2008 paper, from a chapter I am finalising for the epidemiology entry in a collection on the philosophy of medicine. I realise I have made a similar point in an earlier post on this blog, but I think I am getting closer to a crisp expression. The point concerns the claimed advantage of RCTs for ensuring consistency. Thoughts welcome!

Hernán and Taubman are surely right to warn against too-easy claims about “the effect of obesity on mortality”, when there are multiple ways to reduce obesity, each with different effects on mortality, and perhaps no ethically acceptable way to bring about a sudden change in body mass index from say 30 to 22 (Hernán and Taubman 2008, 22). To this extent, their insistence on assessing causal claims as contrasts to well-defined interventions is useful.

On the other hand, they imply some conclusions that are harder to accept. They suggest, for example, that observational studies are inherently more likely to suffer from this sort of difficulty, and that experimental studies (randomized controlled trials) will ensure that interventions are well-specified. They express their point using the technical term “consistency”:

consistency… can be thought of as the condition that the causal contrast involves two or more well-defined interventions. (Hernán and Taubman 2008, S10)

They go on:

…consistency is a trivial condition in randomized experiments. For example, consider a subject who was assigned to the intervention group … in your randomized trial. By definition, it is true that, had he been assigned to the intervention, his counterfactual outcome would have been equal to his observed outcome. But the condition is not so obvious in observational studies. (Hernán and Taubman 2008, S11)

This is a non-sequitur, however, unless we appeal to a background assumption that an intervention—something that an actual human investigator actually does—is necessarily well-defined. Without this assumption, there is nothing to underwrite the claim that “by definition”, if a subject actually assigned to the intervention had been assigned to the intervention, he would have had the outcome that he actually did have.
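
One way to make the background assumption explicit is to write it out in standard potential-outcomes notation. This is a sketch of my own, not Hernán and Taubman's formalism: A_i is the arm to which subject i is assigned, Y_i is the observed outcome, Y_i^a is the outcome subject i would have under arm a, and the version index k is an assumption introduced here purely for illustration.

```latex
% Consistency as usually stated: the observed outcome equals the
% potential outcome under the arm actually received.
A_i = a \;\Longrightarrow\; Y_i = Y_i^{a}

% If arm a can be delivered in several versions k (weightlifting, karate,
% swimming, ...), there is one potential outcome per version, Y_i^{a,k}.
% Y_i^{a} is a single well-defined quantity only if the versions do not matter:
Y_i^{a,k} = Y_i^{a,k'} \quad \text{for all versions } k, k'
```

On this reading, the first line holds "by definition" only once the second condition is granted, and the second condition is a substantive assumption about the intervention rather than something that follows from the act of assigning it.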

Consider the intervention in their paper, one hour of strenuous exercise per day. “Strenuous exercise” is not a well-defined intervention. Weightlifting? Karate? Swimming? The assumption behind their paper seems to be that if an investigator “does” an intervention, it is necessarily well-defined; but on reflection this is obviously not true. An investigator needs to have some knowledge of which features of the intervention might affect the outcome (such as what kind of exercise one performs), and thus need to be controlled, and which don’t (such as how far west of Beijing one lives). Even randomization will not protect against confounding arising from preference for a certain type of exercise (perhaps because people with healthy hearts are predisposed both to choose running and to live longer, for example), unless one knows to randomize the assignment of exercise-types and not to leave it to the subjects’ choice.

This is exactly the same kind of difficulty that Hernán and Taubman press against observational studies. So the contrast they wish to draw, between “trivial” consistency in randomized trials and a much more problematic situation in observational studies, is a mirage. Both can suffer from failure to define interventions.

Potential Outcomes: Separating Insight from Ideology

I’m in Anchorage, preparing for the World Congress of Epidemiology. One of the sessions I’m speaking at is a consultation for the next edition of the Dictionary of Epidemiology. It’s a strange and delightful document, this Dictionary, since it sets out to define not only individual words but also the discipline of epidemiology as a whole. Thus it contains both mundane and metaphysical entries, from “death certificate” to “causality”. I’m billed to talk about “Defining Measures of Causal Strength”. There’s a lot to say: the current entries under causal-related terms could use some disciplining. But I’m particularly interested in orienting myself with regard to the “potential outcomes” view of causation, which seems to be the current big thing among epidemiologists.

The potential outcomes view is associated in particular with Miguel Hernán, a very smart epidemiologist at Harvard, and he has a number of nice papers on it. (I hope I don’t need to say that what follows is not a personal attack: I have great respect for Hernán, and am stimulated by his work. I’m just taking his view as exemplary of the potential-outcomes approach, in the way that philosophers typically do.)

In particular I’ve been engaged in a close reading of a paper on obesity by Hernán and Taubman (2008). Their view, as expressed in that paper, is an interesting mix of pragmatism and idealism. On the one (pragmatic) hand, they argue that causal questions are often ill-formed, and thus unanswerable. There is no answer to the question “What is the effect of body-mass index (BMI) on all-cause mortality?” because the different ways to intervene on BMI may result in different effects on mortality. Diet, exercise, a combination of diet and exercise, smoking, chopping off a limb – these are all ways to reduce BMI. Until we have specified which intervention we have in mind, we cannot meaningfully quantify the contribution of BMI to mortality.

This much is highly reminiscent of contrastivist theories of causation in philosophy. Contrastivist theories take causation to consist in counterfactual dependence, but differ from counterfactual theories in taking the form of causal statements to be implicitly contrastive: not “c causes e” but “c rather than C* causes e rather than E*”, where C* and E* are classes of events that could occur in the absence of c and e respectively. Against this background, Hernán and Taubman’s point is simply that, for an epidemiological investigator, it matters what contrast class we have in mind when we seek to estimate the size of an effect. This is a good point, especially in a context where one hopes to act on a causal finding. One had better be sure that one knows, not only that there is a causal connection between a given exposure and outcome, but also what will happen if a given intervention replaces the factor under investigation. I have called the failure to appreciate this point The Causal Fallacy and linked it to easy errors in prediction (see this previous post and Broadbent 2013, 82).

But there is another more troubling side to the view as it is expressed in this paper: that randomized controlled trials offer a protection against this error, and somehow force us to specify our interventions precisely. The argument for this claim is striking, but on reflection I fear it is specious.

Hernán and Taubman observe that an observational study might appear to be able to answer the question “What is the effect of BMI on all-cause mortality?” via a statistical analysis of data on BMI and mortality, while randomized controlled trials would not be able to answer this question directly. They would only be able to answer questions like: “What is the effect of reducing BMI via dietary interventions? / via exercise? / via both?” This apparent shortcoming of RCTs is, of course, a strength in disguise: the observational study is in fact not so informative, since it does not distinguish the effects of different ways of reducing BMI, while the RCTs do give us this information.

This argument is fallacious, however, for the following reasons.

  1. An observational study that includes the same information as the RCTs on the methods of reducing BMI would also be able to distinguish between the effects of these interventions.
  2. It is true that one could conduct an observational study which ignored the possibility that different methods of reducing BMI might themselves affect mortality. But that would be a bad study, since it would ignore the effects of known confounders. A good study would take these things into account.
  3. Conversely, it is a mistake to suppose that RCTs offer protection against this sort of error. The BMI case is a special one, precisely because there are so many ways to intervene to reduce BMI and we know that these could affect mortality. In truth, there are many ways to make any intervention. One may take a pill or a capsule or a suppository, on the equator or in the tropics, before or after a meal, and so on. Even in an RCT, the intervention is not fully specified. Rather, we simply assume that the differences don’t matter, or that if they do, they are “cancelled out” by the randomisation process.
  4. Randomized controlled trials are not controlled in the manner of true controlled experiments; rather, randomization is a surrogate for controlling. We hope that all the many differences between the circumstances of each intervention in the treatment group will either have no effect or, if they do, will have effects that are randomly distributed so as not to obscure the effect of the treatment. But in principle, it is still possible that this hope is not fulfilled. At a significance threshold of 0.05, chance alone will produce a spurious “significant” result in roughly one in 20 RCTs of an ineffective treatment; and perhaps more often among published RCTs, given publication bias (i.e. the fact that null results are harder to publish).
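
The "one in 20" figure in the last point can be checked with a small simulation. What follows is only a sketch under stated assumptions: the treatment has no effect at all, outcomes are drawn from a normal distribution standing in for all the uncontrolled differences between subjects, and significance is judged with a large-sample z-test. The function names, the sample size and the choice of test are my own illustrative choices, not anything taken from the papers discussed.

```python
import random
from statistics import NormalDist, mean, variance


def simulate_null_trial(rng, n_per_arm=100):
    """Run one two-arm trial in which the treatment truly does nothing.

    Outcomes reflect background variation only. Returns a two-sided p-value
    from a large-sample z-test on the difference in arm means.
    """
    outcomes = [rng.gauss(0.0, 1.0) for _ in range(2 * n_per_arm)]
    rng.shuffle(outcomes)  # random assignment of subjects to arms
    treated, control = outcomes[:n_per_arm], outcomes[n_per_arm:]
    # Standard error of the difference in means (unpooled variances).
    se = (variance(treated) / n_per_arm + variance(control) / n_per_arm) ** 0.5
    z = (mean(treated) - mean(control)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(n_trials=2000, alpha=0.05, seed=0):
    """Fraction of null trials that nonetheless come out 'significant'."""
    rng = random.Random(seed)
    hits = sum(simulate_null_trial(rng) < alpha for _ in range(n_trials))
    return hits / n_trials


print(f"Share of null trials with p < 0.05: {false_positive_rate():.3f}")
```

Under these assumptions the printed share should land close to 0.05. Note that this is the rate among trials of a treatment that does nothing; it says nothing by itself about what fraction of all published significant results are spurious, which is where publication bias comes in.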

These are familiar points in the philosophical literature on randomised controlled trials (see esp. Worrall 2002). The point I wish to pull out is this. On the one hand, Hernán’s emphasis on getting a well-defined contrastive question is insightful and important. But on the other hand, it is wrong to think that RCTs solve the problem. True, in an RCT you must make an intervention. But it does not follow that one’s intervention is well-specified. There might be all sorts of features of the particular way that you intervene that could skew the results. And conversely, plug the corresponding “how it happened” info into a cohort study, and you will be able to obtain the same sorts of discrimination between these methods.

On top of all this, the focus on the methods of individual studies obscures the most important point of all: that convincing evidence comes from a multitude of studies. Just as an RCT allows us to assume that differences between individuals are evenly distributed and thus ignorable, so a multitude of methodologically inferior studies can provide very strong evidence if their methodological shortcomings are different. This is the kind of situation Hill responded to with his guidelines (NOT criteria!) for inferring causality (Hill 1965). Similarly, ad hoc arguments against each possible alternative explanation can add up to a compelling case, as in the classic paper by Cornfield and colleagues on smoking and lung cancer (Cornfield et al 1959). The recent insights of the potential outcomes approach are valuable and important, but they augment rather than replace these familiar, older insights.


Broadbent, A. 2013. Philosophy of Epidemiology. Basingstoke and New York: Palgrave Macmillan.

Cornfield, J, Haenszel, W, Hammond, EC, Lilienfeld, AM, Shimkin, MB and Wynder, EL. 1959. Smoking and lung cancer: recent evidence and a discussion of some questions. Journal of the National Cancer Institute 22: 173-203.

Hernán, MA and Taubman, SL. 2008. Does obesity shorten life? The importance of well-defined interventions to answer causal questions. International Journal of Obesity 32: S8-S14.

Hill, Austin Bradford. 1965. The environment and disease: association or causation? Proceedings of the Royal Society of Medicine 58: 295-300.

Worrall, J. 2002. What Evidence in Evidence-Based Medicine? Philosophy of Science 69: S316-S330.