not long back, i put down my thoughts (here) about pipeline or phase-in designs. my basic premise is that while they may allow for causal inference, it is not clear that they are usually designed to allow generated evidence to be used where it is most relevant — to that program itself. that seems bad from an evidence-informed decision-making point of view and potentially questionable from an ethical point of view.
i raised this issue during a recent conversation on the development impact blog about the ethics of randomization. i reproduce my comment and berk ozler’s kind reply, below.
usually, the appealing premise of a phased-in design is that there is some resource constraint that would prevent simultaneous scale-up in any case. in this scenario, no matter how heavy the burden of waiting, there will have to be some rationing. in which case, why not randomization rather than something else, like patronage?
then things get odd. the suggestion seems to be that we may know, ex ante, that at least some types of people (elderly, immune-compromised) will benefit greatly from immediate receipt of the treatment. in which case, we are not in equipoise, and it is unclear whether an RCT (or at least unconditional randomization) is appropriate in any case. things, of course, get trickier when a resource constraint is not binding on simultaneous scale-up.
second, I feel we should reflect on the purpose and ethics of a phased-in design, especially one with full information. again, a resource constraint may make it politically acceptable for a governor to say that she will roll-in health insurance randomly across the state, which can allow an opportunity to learn something about the impact of health insurance. so, she stands up and says everyone will get (this) health insurance at some point and here’s the roll-out schedule.
but the reason for making use of this randomization is to learn if something works (because we genuinely aren’t sure if it will, hence needing the experiment) and maybe to have ‘policy impact’. so what if what is learnt from comparing the Phase I and Phase II groups is that there is no impact, the program is rubbish or even harmful? or, at a minimum, it doesn’t meet some pre-defined criterion of success. is the governor in a position to renege on rolling out the treatment/policy because of these findings? does the fine print for everyone other than those in Phase I say “you’ll either get health insurance, or, if the findings are null, a subscription to a jelly-of-the-month club”? in some ways, a full-disclosure phased roll-in seems to pre-empt and prevent policy learning and impact *in the case under study* because of the pre-commitment of the governor.
i find phased roll-in designs without a plan to pause, analyse, reassess and at least tweak the design between Phases I and II to be ethically troubling. i’d be interested in your thoughts.
in economics, unlike in medicine, many times the programs we have involve transferring something to individuals, households, or communities (assets, information, money, etc.). without negative spillovers, we don’t think of these as ever not increasing individual welfare, at least temporarily: if i give you a cow, this is great for you. if you don’t like it, sell it: your individual welfare will increase (would have been even higher if i just gave you the cash).
but, what if my program’s goal is not a temporary jump in your welfare, but you escaping poverty as close to permanently as possible? the program could be deemed unsuccessful even though it raised welfare of its beneficiaries for a short period.
the point is, it does seem wrong to break your promise to give something (something people would like to have) to people who drew Phase II in the lottery because you deemed your program unsuccessful in reaching its goals. you promised people that you’d give them the treatment at the outset, so i’d argue that if you’ll break your promise you have to give them something at least as good if not better. if you can come up with this (and the phase II group is happy with your decision), perhaps they can even become your phase I group in a new experiment, in a process where you experiment, tweak, experiment again, … kind of like what Pritchett et al. argue we should do: a lot more experiments, not fewer…
thinking of your examples: with the Oregon healthcare reform, it would be hard to push a stop or pause button with legislation. government action takes time and the credibility of your policymakers is at stake. i don’t think you could really argue for a stop/pause on the grounds that those impacts (even if unequivocal) are considered too small to justify treating the lottery losers.
in the case of a project that is giving cows, i am more optimistic: it might be possible for the project to find an alternative treatment that is of equal or higher value, that is acceptable to the phase II group, and that is feasible to roll out quickly. in such cases, i could see a tweak of the intervention between the two phases.