Pipeline Designs and Equipoise: How Can They Go Together?

I am writing about phase-in / pipeline designs. Again. I’ve already done it here, and more here. But.

The premise of a pipeline or phase-in design is that groups will be randomized or otherwise experimentally allocated to receive a given intervention earlier or later. The ‘later’ group can then serve as the comparison for the ‘early’ group, allowing for a causal claim about impact to be made. I am specifically talking about phase-in designs premised on the idea that the ‘later’ group is planned (and has perhaps been promised) to receive the intervention later. I take this to be a ‘standard’ approach to phase-in designs.
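For readers who think better in code, here is a minimal sketch of that premise. It assumes a simple 50/50 random split and a difference-in-means comparison taken before the ‘later’ group is treated. The village names, the outcome values and the helper functions `assign_phases` and `difference_in_means` are all made up for illustration and do not come from any real study.

```python
import random
from statistics import mean


def assign_phases(units, seed=0):
    """Randomly split units into an 'early' and a 'later' group of (near-)equal size."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"early": shuffled[:half], "later": shuffled[half:]}


def difference_in_means(outcomes, groups):
    """Early-minus-later difference in mean outcomes.

    Only meaningful if outcomes are measured while the 'later'
    group is still untreated and so can act as the comparison.
    """
    early = mean(outcomes[u] for u in groups["early"])
    later = mean(outcomes[u] for u in groups["later"])
    return early - later


# Hypothetical units: 20 villages, half phased in early.
villages = [f"village_{i}" for i in range(20)]
groups = assign_phases(villages, seed=42)

# Made-up outcomes: a baseline of 10.0 plus an assumed effect of 2.0
# for villages that already received the intervention.
outcomes = {u: 10.0 + (2.0 if u in groups["early"] else 0.0) for u in villages}

estimate = difference_in_means(outcomes, groups)
print(f"Estimated impact (early minus later): {estimate:.1f}")
```

Note what the sketch leaves out: nothing in it changes what happens to the ‘later’ group once the estimate is in hand. That gap is the subject of the rest of this post.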

I’d like to revisit the issue of phase-in designs from the angle of equipoise, which implies some sense of uncertainty about the causal impact of a given intervention. This uncertainty provides the justification for making use of an ex ante impact evaluation. Equipoise literally translates to equal weight / force / interest. Here, the force in question is the force of argument about the impact of an intervention and which direction it will go (or whether there will be one at all).

There have already been some great conversations, if not decisive answers, as to whether, in social science research, the justification for using experimental allocation of an intervention needs to meet the standards of clinical equipoise or policy equipoise.* The key difference is the contrast between ‘a good impact’ (clinical equipoise) and ‘the best impact achievable given the resources’ (policy equipoise). In either case, it is clear that some variant of equipoise is considered a necessary justification. For theoretical and/or empirical reasons, it just isn’t clear whether an intervention is (a) good (investment).

Whichever definition of equipoise you pursue, the underlying premise is one of a genuine uncertainty and an operational knowledge gap about how well a certain intervention will work in a certain setting at a certain point in time and at what degree of relative resource efficiency. This uncertainty is what lends credibility to an ex ante impact evaluation (IE) and the ethical justification for a leave-out (‘business as usual’ or perhaps ‘minimal/basic package’) comparison group. Hence, no RCTs on parachutes.

Uncertainty implies that the impact results could plausibly, if not with fully equal likelihood, come back positive, negative, null or mixed. At least some of those outcomes imply that a program is not a good use of resources, if not actually generating adverse effects. Such a program, we might assume, should be stopped or swapped for some alternative intervention (see Berk’s comments here).

To move forward from the idea of uncertainty, the following two statements simply do not go together despite often being implicitly paired:

  1. We are uncertain about the impact our intervention will bring about / cause, so we are doing an (any type of ex ante) IE.
  2. We plan to scale this intervention for everyone (implicitly, at least, because we believe it works – that is, the impacts are largely in the desired direction). Because of resource constraints, we will have to phase it in over time to the population.

Yes, the second point could be and is carried on to say, ‘this offers a good opportunity to have a clean identification strategy and therefore to do IE.’ But this doesn’t actually square the circle between the two statements. It still requires the type of sleight of hand around the issue of uncertainty that I raised here about policy champions.

Unless there are some built-in plans to modify (or even cancel) the program along the phase-in process, the ethics of statement 2 rests solely on the resource constraint (relative to actual or planned demand), not on any variant of equipoise. This is an important point when justifying the ethics of ex ante IE. And it is worth noting how few development programs have been halted because of IE results. It would be a helpful global public good if someone would start compiling a list of interventions that have been stopped, plausibly, because of IE outcomes, perhaps making note of the specific research design used. Please and thank you.

Moreover, unless there is some built-in planning about improving, tweaking or even scrapping the program along the way, it is not clear that the ex ante IE based on a phase-in design can fully claim to be policy relevant. This is a point I plan to elaborate in a future post but, for now, suffice it to say that I am increasingly skeptical that being about a policy (being ‘policy adjacent’ by situating a study in a policy) is the same as informing decisions about that policy (being ‘decision relevant’).

To me, the latter has stronger claims on being truly policy relevant and helping make wise and informed decisions about the use of scarce resources – which I think is the crux of this whole IE game anyway. IEs of phase-in designs without clear potential for mid-course corrections (i.e. genuine decision points) seem destined for policy adjacency, at best. Again, the underlying premise of a phase-in design is that it is a resource constraint, not an evidence constraint, which is dictating the roll-out of the program. But the intention to make a decision at least partly based on the evidence generated by an IE again rests on the premise of ex ante uncertainty about the potential for (the most cost-efficient) impact.

To come back to the issue of equipoise and phase-in designs: if the ethics of much of the work we do rests on a commitment to equipoise, then more needs to be done to clarify how we assess it and whether IRB/ethics review committees take it seriously when considering research designs. What information does a review board need to make that assessment?

Moreover, it requires giving a good think on what types of research designs align with the agreed concept of equipoise (whichever that may be). My sense is that phase-in designs can only be commensurate with the idea of equipoise if they are well-conceived, with well-conceived indicating that uncertainty about impact is indeed recognized and contingencies planned for in a meaningful way – that is, that the intervention can be stopped or altered during the phase-in process.

* I don’t propose to settle this debate between clinical and policy equipoise here, though I am sympathetic to the policy equipoise argument (and would be more so if more ex ante IEs tended towards explicitly testing two variants of an intervention against one another to see which proves the better use of resources moving forward – because forward is the general direction people intend to move in development).


How NOT to Interpret p-values

Pinned to the bulletin board above my desk 🙂

Berkeley Initiative for Transparency in the Social Sciences

Your dose of BITSS humor, via xkcd.


Source: xkcd.com
