Aside

Gem From the Anti-Politics Machine: They Only Seek the Kind of Advice They Can Take

I am starting to re-read The Anti-Politics Machine after some time… and, of course, started with the epilogue — the closest Ferguson comes to giving advice from his vivisection. Here’s a gem that remains relevant ten-plus years later, in spite of major political changes in southern Africa:

Certainly, national and international ‘development’ agencies do constitute a large and ready market for advice and prescriptions, and it is the promise of real ‘input’ that makes the ‘development’ form of engagement such a tempting one for many intellectuals. These agencies seem hungry for good advice, and ready to act on it. Why not give it?

But as I have tried to show, they only seek the kind of advice they can take. One ‘developer’ asked my advice on what his country could do to ‘help these people.’ When I suggested that his government might contemplate sanctions against apartheid, he replied, with predictable irritation, ‘No, no! I mean development!’

The only ‘advice’ that is in question here is advice about how to ‘do development’ better. There is a ready ear for criticisms of ‘bad development projects,’ so long as these are followed up with calls for ‘good development projects.’

Thinking About Stakeholder Risk and Accountability in Pilot Experiments

This post is also cross-posted here in slightly modified form.

Since I keep circling around issues related to my dissertation in this blog, I decided it was time to start writing about some of that work. As anyone who has stood or sat near me for more than 5 minutes over the past 4.25 years will know, in my thesis I examine the political economy of adopting and implementing a large global health program (the Affordable Medicines Facility – malaria, or “AMFm”). This program was designed at the global level (meaning largely in D.C. and Geneva, with tweaking workshops in assorted African capitals). Global actors invited select Sub-Saharan African countries to apply to pilot the AMFm for two years before any decision was made to continue, modify, scale up, or terminate it. It should also be noted from the outset that it was not fully clear what role the evidence would play in the board’s decision or how the evidence would be interpreted. As I highlight below, this lack of clarity helped to foster feelings of risk, as well as resistance among some national-level stakeholders to participating in the pilot. . .

To push the semantics a bit, several critics have noted that the scale and scope and the requisite new systems and relationships involved in the AMFm disqualify it from being considered a ‘pilot,’ though I use that term for continuity with most other AMFm-related writing. . .

In my research, I focus on the national and sub-national processes of deciding to participate in the initial pilot (‘phase I’) stage, specifically in Ghana. Besides the project’s scale and the resources mobilized, one thing that stood out about this project is that there was a reasonable amount of resistance to piloting the program among stakeholders in several of the invited countries. I have been very fortunate that my wonderful committee and outside supporters like Owen Barder have continued to push me over the years (and years) to try to explain this resistance to an ostensibly ‘good’ program. Moreover, I have been lucky and grateful that a set of key informants in Ghana have been willing to converse openly with me over several years as I have tried to untangle the reasons behind the support and resistance and to get the story ‘right’. . .

From the global perspective, the set-up of this global health pilot experiment was a paragon of planning for evidence-informed decision-making: pilot first, develop benchmarks for success, commission an independent evaluation (a well-monitored before-and-after comparison) — and make decisions later. . .

In my work, through a grounded qualitative analysis, I distil the variety of reasons for supporting and resisting Ghana’s participation in the AMFm pilot into three main types: those related to direct policy goals (in this case, increasing access to malaria medication and lowering malaria mortality), those related to indirect policy goals (indirect insofar as they are not the explicit goals of the policy in question, such as employment and economic growth), and, finally, those related to risk and reputation (individual, organizational, and national). I take the latter as my main focus for the rest of this post. . .

A key question on which I have been pushed is the extent to which resistance to participation (which meant resisting an unprecedented volume of highly subsidized, high-quality anti-malarial treatments entering both the public and the private sector) emerged from the idea of the AMFm itself versus the idea of piloting the AMFm with uncertain follow-up plans. . .

Some issues, such as threats to both direct and indirect policy goals, often related to the AMFm mechanism itself, including questions about the focus on malaria prevention rather than treatment, as well as broader goals related to national pride and the support of local businesses. The idea of the AMFm itself, as well as its role as a harbinger of (for example, market-based) approaches to global health, provoked both support and resistance. . .

But some sources of resistance stemmed more directly from the piloting process itself. By evidence-informed design, the Global Fund gave “no assurance to continue [AMFm] in the long-term,” so that the evaluation of the pilot would shape its decision. This presented limited risks at the global level. At the national level, this uncertainty proved troubling, as many local stakeholders felt it posed national, organizational, and personal risks to policy goals and reputations. Words like ‘vilification‘ and ‘chastisement‘ and ‘bitter‘ came up during key informant interviews. In a point of opposing objectives (if not a full catch-22, a phrase stricken from my thesis), some stakeholders might have supported the pilot had they known the program would not be terminated (even if modified), whereas global actors wanted the pilot precisely to see whether the evidence suggested the program should (or should not) be terminated. Pilot-specific concerns related to uncertainties around the sunk investments of time in setting up the needed systems and relationships, which had an uncertain life expectancy. Also, for a stakeholder trying to decide whether to support or resist a pilot, it does not help when the reputational and other pay-offs from supporting are uncertain and may only materialize should the pilot prove successful and be carried to the next stage. . .

A final but absolutely key set of concerns for anyone considering working with policy champions is what, precisely, the decision to continue would hinge upon. Would failure to meet benchmarks be taken as a failure of the mechanism and concept? A failure of national implementation capacity and managerial efforts in Ghana (in the face of a key donor)? A failure of individual efforts and initiatives in Ghana?

Without clarity on these questions about how accountability and blame would be distributed, national stakeholders were understandably nervous and sometimes resistant (passively or actively) to Ghana’s applying to be a phase I pilot country. To paraphrase one key informant’s articulation of a common view, phase I of the AMFm should have been an experiment on how to continue, not whether to continue, the initiative. . .

How does this fit in with our ideas of ideal evidence-informed decision-making about programs and policies? The experience recorded here raises some important questions for when we talk about wanting policy champions and wanting to generate rigorous evidence about the policies they champion. Assuming that the policies and programs under study adhere to one of the definitions of equipoise, the results from a rigorous evaluation could go either way:

What risks do the local champions of a policy face in visibly supporting it?

Is clear accountability established for evaluation outcomes?

Are there built-in buffers for the personal and political reputation of champions and supporters in the evaluation design?

The more we talk about early stakeholder buy-in to evaluation and the desire for research uptake on the basis of evaluation results, the more we need to think about the political economy of pilots and of those stepping up to support policies and the (impact) evaluation of them. Do they exist in a learning environment where glitches and null results are considered part of the process? Can evaluations help to elucidate design and implementation failures in a way that has clear lines of accountability among the ‘ideas’ people, the champions, the managers, and the implementers? These questions need to be taken seriously if we expect government officials to engage in pilot research to help decide the best way to move a program or policy forward (including not moving it forward at all).

have evidence, will… um, erm (6 of 6, enforcing accountability in decision-making)

this is a joint post with suvojit, continuing from 5 of 6 in the series. it is also cross-posted here.

 

a recent episode reminded us of why we began this series of posts, of which this is the last. we recently saw our guiding scenario for this series play out: a donor was funding a pilot project accompanied by a rigorous evaluation, which was intended to inform further funding decisions.

in this specific episode, a group of donors discussed an on-going pilot program in Country X, part of which was evaluated using a randomized-control trial. the full results and analyses were not yet in; the preliminary results, which were marginally significant, suggested that there ought to be a larger pilot taking into account lessons learnt.

along with X’s government, the donors decided to scale up. the donors secured a significant funding contribution from the Government of X — before the evaluation yielded results. indeed, securing government funding for the scale-up and a few innovations in the operational model had already given this project a sort of superstar status in the eyes of both the donors and the government. it appeared the donors in question had committed to the government that the pilot would be scaled up before the results were in. moreover, a little inquiry revealed that the donors did not have clear benchmarks or decision criteria going into the pilot about key impacts and magnitudes — that is, the types of evidence and results — that would inform whether to take the project forward.

there was evidence (at least it was on the way) and there was a decision, but it was not clear how they were linked or how one informed the other.

 

reminder: scenario

we started this series of posts by admitting the limited role evidence plays in decision-making — even when an agency commissions evidence specifically to inform a decision. the above episode illustrates this, as well as the complex and, sometimes, messy way that (some) agencies, like (some) donors, approach decision-making. we have suggested that, given that resources to improve welfare are scarcer than needs, this approach to decision-making is troubling at best and irresponsible at worst. note that the lack of expectations and of a plan for decision-making is as troublesome as the limited use of outcome and impact evidence.

in response to this type of decision-making, we have had two guiding goals in this series of posts. first, are there ways to design evaluations that will make the resultant outcomes more useable and useful (addressed here and here)? second, given all the factors that influence decisions, including evidence, can the decision-making process be made more fair and consistent across time and space?

to address the second question, we have drawn primarily on the work of Norm Daniels, to consider whether and how decisions can be made through a fair, deliberative process that, under certain conditions, can generate outcomes that a wide range of stakeholders can accept as ‘fair’.

Daniels suggests that these “certain conditions” for fair deliberation can be met by satisfying four key criteria, including in deliberations about which programs to scale after receiving rigorous evidence and other forms of politically relevant feedback.

 

closing the loop: enforceability

so far, we have reviewed three of these conditions: relevant reasons, publicity, and revisibility. in this post, we examine the final condition, enforceability (regulation or persuasive pressure).

meeting the enforceability criterion means providing mechanisms to ensure that the processes set by the other criteria are adhered to. this is, of course, easier said than done. in particular, it is unclear who should do the enforcing.*

we identify two key questions about enforcement:

  • first, should enforcement be external to or strictly internal to the funding and decision-making agency?
  • second, should enforcement rely on top-down or bottom-up mechanisms?

 

underlying these questions is a more basic, normative question: in which country should these mechanisms reside — the donor or the recipient? the difficulty of answering this question is compounded by the fact that many donors are not nation-states.

we don’t have clear answers to these questions, which themselves likely need to be subjected to a fair, deliberative process. here, we lay out some of our own internal debates on two key questions, in the hope that they point to topics for productive conversation.

 

  1. should enforcement of agency decision making be internal or external to the agency?

this is a normative question but it links with a positive one: can we rely on donors to self-regulate when it comes to adopted decision-making criteria and transparency commitments?

internal self-regulation is the most common model we see around us, in the form of internal commitments such as multi-year strategies, requests for funds made to the treasury, etc. in addition, most agencies have an internal-but-independent ‘results’ or ‘evaluation’ cell, intended to make sure that M&E is carried out. in the case of DFID, for instance, the Independent Commission for Aid Impact (ICAI), though formally independent of the department, seems to have a significant impact on DFID’s policies and programming. it also empowers the British parliament to hold DFID to account over a variety of funding decisions, as well as future strategy.

outside the agency, oversight and enforcement of relevancy, transparency, and revisibility could come from multiple sources. from above, it could be a multi-lateral agency/agreement or a global INGO, similar to a Publish What You Pay(?). laterally, the government of the country in which a program is being piloted could play an enforcing role. finally, oversight and enforcement could come from below, through citizens or civil society organizations, in both donor and recipient countries. this brings us to our next question.

 

  2. should enforcement flow top-down or bottom-up?

while this question could also be asked of internal agency functioning and hierarchy, we focus on the potential for external enforcement from one direction or the other. and, again, the question is a normative one, but there are positive aspects related to the capacity to monitor and the capacity to enforce.

enforcement from ‘above’ could come through multilateral agencies or through multi- or bi-lateral agreements. one possible external mechanism is where more than one donor comes together to make a conditional funding pledge to a program – contingent on achieving pre-determined targets. however, as we infer from the opening example, it is important that such commitments be based on a clear vision of success, not just on political imperatives or project visibility.

enforcement from below can come from citizens in donor and/or recipient countries, including through CSOs and the media. one way to introduce bottom-up pressure is for donors to adhere to the steps we have covered in our previous posts – agreement on relevant reasons, transparency and revisibility – and thereby involve a variety of external stakeholders, including media, citizens and CSOs. these can contribute to a mechanism whereby there is pressure from the ground on donors to live up to their own commitments.

media are obviously important players in these times. extensive media reporting of donor commitments is a strong mechanism for informing and involving citizens – in both donor and recipient countries; media are also relevant to helping citizens understand limits and how decisions are made in the face of resource constraints.

 

our combined gut feeling, though, is that in the current system of global aid and development, the most workable approach will probably include a mixture of formal top-down and informal bottom-up pressure. from a country-ownership point of view, we feel that recipient-country decision-makers should have a (strong) role to play here (more than they seem to have currently), as should citizens in those countries.

however, bilateral donors will probably continue to be more accountable to their own citizens (directly and via representative legislatures) and, therefore, a key task is to consider how to bolster their capacity to ensure ‘accountability for reasonableness’ in the use of evidence and in decision-making more generally. at the same time, multilateral donors may have more flexibility to consider other means of enforcement, since they don’t have a narrow constituency of citizens and politicians to answer to. however, we worry that the prominent multilateral agencies we know are also bloated bureaucracies with unclear chains of accountability (as well as a typical sense of self-perpetuation).

while there is no clear blueprint for moving forward, we hope the above debate has gone a small step towards asking the right questions.

 

in sum

in this final post, we have considered how to enforce decision-making and priority-setting processes that are ideally informed by rigorous and relevant evidence but also, more importantly, in line with principles of fairness and accountability for reasonableness. these are not fully evident in the episode that opened this post.

through this series of posts, we have considered how planning for decision-making can help in the production of more useful evidence and can set up processes to make fairer decisions. for the latter, we have relied on Norm Daniels’s framework for ensuring ‘accountability for reasonableness’ in decision-making. this is, of course, only one guide to decision-making, but one that we have found useful in broaching questions of not only how decisions are made but how they should be made.

in it, Daniels proposes that deliberative processes should be based on relevant reasons and commitments to transparency and revisibility that are set ex ante to the decision-point. we have focused specifically on decision-making relating to continuing, scaling, altering, or scrapping pilot programs, particularly those for which putatively informative evidence has been commissioned.

we hope that through these posts, we have been able to make a case for designing evaluations to generate evidence useful for decision-making, as well as for facilitating fair, deliberative decision-making processes that can take account of the evidence generated.

at the very least, we hope that evaluators will recognize the importance of a fair process and will not stymie such processes in the pursuit of the perfect research design.

*in Daniels’s work, which primarily focuses on national or large private health insurance plans, the regulative role of the state is clear. in cases of global development, involving several states and agencies, governance and regulation become less clear. noting this lack of clarity in global governance is hardly a new point; however, the idea of needing to enforce the conditions of fair processes and accountability for reasonableness provides a concrete example of the problem.

have evidence, will… um, erm (5 of 6, revisibility)

this is part of a series of joint posts with suvojit. it is also cross-posted at people, spaces, deliberation.

throughout this series of posts (1, 2, 3, 4), we have considered two main issues. first, how can evidence and evaluation be shaped to be more useful – that is, directly useable – in guiding decision-makers to initiate, modify, scale up or drop a program? or, as recently pointed out by Jeff Hammer, how can we better evaluate opportunity costs between programs, to aid in making decisions? second, given that evidence will always be only part of a policy/programmatic decision, how can we ensure that decisions are made (and perceived to be made) fairly?

for such assurance, we primarily rely on Daniels’ framework for promoting “accountability for reasonableness” (A4R) among decision-makers. if the four included criteria are met, Daniels argues, it brings legitimacy to deliberative processes and, he further argues, consequent fairness to the decision and coherence to decisions over time.

the first two criteria set us up for the third: first, decision-makers agree ex ante to constrain themselves to relevant reasons (determined by stakeholders) in deliberation and, second, make public the grounds for a decision after the deliberation. these first two, we argue, can aid organizational learning and coherence in decision-making over time by setting and using precedent – an issue that has been bopping around the blogosphere this week.

these criteria, and an approach ensuring A4R more generally, are also a partial response to increasing calls for donor transparency, made loudly in Mexico City this week via the Global Partnership for Effective Development Co-operation. these calls focus on the importance of public availability of data as the key ingredient of donor (and decision-maker) transparency. we concur on their importance. but we argue that it is incomplete without an inclusive process of setting relevant reasons on how those data are used (recognizing that they will always only be part of the process) and making the decision criteria as well public.

the publicity and transparency around decision-making open the door for A4R’s third criterion (and the subject of this post): the possibility of appealing and revising decisions. as Daniels notes, this condition “closes the loop between decision-makers and those who are affected by their policies.”

as a quick reminder of our guiding scenario: we specifically focus on the scenario of an agency deciding whether to sustain, scale, or shut-down a given program after piloting it with an accompanying evaluation — commissioned explicitly to inform that decision.

in most decision-making of this kind, some stakeholders — often would-be beneficiaries — will not agree with the decision and may even feel or be adversely affected. while we suggest that stakeholders be involved in the earlier process of setting relevant reasons, a grievance-redressal or dispute-resolution mechanism, as provided by the revisibility criterion, gives these stakeholders an opportunity to voice their perspectives, based on the original grounds of the decision.

they can do this because the decision criteria are made public, via criterion 2. this “visible and public” space for further deliberation provides stakeholders with a route “back into the policy formulation process.” stakeholders can use evidence available to them to advocate a certain way forward; it also allows stakeholders to revisit the decision-making criteria and the decisions they fostered. stakeholders therefore have the opportunity to make a case for a change in the decision.

why might past decisions be questioned? since the appeals process is largely based on the original decision criteria, appeals arise if circumstances around those reasons have changed. for example, in considering relevant reasons, feasibility was one category of criteria we proposed, such as a government’s capacity to scale a program or its interest in the program. one can imagine that over time, over changes in regime, and over changes in politics and policy, the original answers to these criteria could change, opening space for appeals. an additional set of proposed relevant reasons related to cost, effectiveness, and cost-effectiveness. the costs of technologies and materials may change over time, or fresh evidence could come out about the long-term benefits of programs. this alters the original cost-benefit ratio, again opening a space for appeals against the original decision.

such appeals may come from members of civil society (or government) that would like to see the program brought back to life (or to see it go away). these may also come from donors themselves wanting to look at their decision-making over time and implement changes in line with the changing context.

Daniels is careful to note, and we emphasize, that the power and purpose of this criterion do not lie in citizens always overturning prior decisions.* decisions on limits are requisite, as needs generally outstrip resources. rather, the revisibility criterion allows for reconsideration of and reflection on those decisions by those knowledgeable about the topic and empowered to alter decisions, if seen fit and feasible. this can, Daniels notes, bring further legitimacy to decision-making processes and, again, improved decision-making over time.

we want to stress that these deliberations over decision-making and their ‘revisibility’ have to be situated in a rational and ethical decision-making framework, predicated on meeting needs fairly when not all can be met (distinct from, say, a legal framework). appeals will have to be judged on the original merits of the arguments as well as with recognition that aid resources have limits (although, obviously, a different argument can be made that aid budgets should simply be bigger). moreover, appeals need to be judged by people who understand the original decision and have the power to change it, if that is the decision taken. when decision-making criteria are set, they lay out the roadmap for a possible appeals process and should be discussed and agreed upon accordingly.

we started this series of posts by admitting the limited role evidence plays in decision-making — even when those commissioning the evidence intend it specifically to inform a decision. we have considered how planning for decision-making can help in the production of more useful evidence and also how decisions can be made fairly, through the delineation of relevant reasons, the publicity of the decision criteria ultimately used, and now, the possibility of revisiting those criteria and revising decisions.

our thoughts in this series of posts should not make fair decision-making seem like an impossible task. not all aspects of each of these considerations can be taken into account – the constraints of the real world are not lost on us and A4R remains an ideal, though we think one that can be approached. in our final post of this series, we therefore attempt to close the loop by looking at enforcement – asking how these ideas can be enforced and decision-makers held accountable.

*see, e.g., Richard Horton’s recent slide about limit-breaking decisions by courts and their effects on health care systems, as in cases like Colombia. experiments with health courts may be instructive. picture via @fanvictoria, citing @richardhorton1.

have evidence, will… um, erm? (4 of 6, going public)

this is a joint post with suvojit. it is also posted on people, spaces, deliberation.

in our last post, we discussed how establishing “relevant reasons” for decision-making ex ante may enhance the legitimacy and fairness of deliberations on resource allocation. we also noted that setting relevant decision-making criteria can inform evaluation design by clarifying what evidence needs to be collected.

we specifically focus on the scenario of an agency deciding whether to sustain, scale or shut down a given program after piloting it with an accompanying evaluation — commissioned explicitly to inform that decision. our key foci are both how to make evidence useful for informing decisions and how, recognizing that evidence plays a minor role in decision-making, to ensure decision-making is done fairly.

for such assurance, we primarily rely on Daniels’ framework for promoting “accountability for reasonableness” (A4R) among decision-makers. if the four included criteria are met, Daniels argues, it will bring legitimacy to deliberations and, he further argues, consequent fairness to the decision.

in this post, we continue with the second criterion to ensure A4R: the publicity of decisions taken, drawing on the first criterion (relevant reasons). we consider why transparency – that is, making decision criteria public – enhances the fairness and coherence of those decisions. we also consider what ‘going public’ means for learning.

disclaimer: logistical uncertainties / room for conversation and experimentation

from the outset, we acknowledge the many unanswered questions about how much publicity or transparency suffices for fairness and how to carry it out.

  • should all deliberations be opened to the public? made available ex post via transcripts or recordings? or, is semi-transparency — explicitly and publicly announcing ex post the criteria deemed necessary and sufficient to take the final decision — acceptable, while the deliberation remains behind closed doors?
  • who is the relevant public?
  • can transparency be passive – making the information available to those who seek it out – or does fairness require a more active approach?
  • what does ‘available’ or ‘public’ mean in contexts of low-literacy and limited media access?

we do not address these questions — which are logistical and empirical as well as moral — here. as the first-order concern, we consider why this criterion matters.

 

fairness in specific decisions

any decision about resource allocation and limit-setting will be contrary to the preferences of some stakeholders – both those at and not at the decision table. in our scenario, for example, some implementers will have invested some quantity of blood, sweat and tears into piloting a program and may, as a result, have opinions on whether the program should continue; or, there are those who were comfortable in their inaction (as a result of a lack of directives or funds, or just plain neglect) and who will now have to participate in a scale-up. there will be participants who benefited during the pilot – and those who would have done so if the program were scaled – who may prefer to see the program maintained.

these types of unmet preferences shape Daniels’s central concern: what can an agency* say to those people whose preferences are not met by a decision to convince them that, indeed, the decision “seems reasonable and based on considerations that take… [their] welfare into account?”** being able to give acceptable explanations to stakeholders about a decision is central to fairness.

 

coherence across decisions

the acceptability of the criteria for a given decision contributes to the fairness of that decision. but the long-run legitimacy of decision-makers benefits from consistency and coherence in organizational policy. transparency, and the explicitness it requires, can foster this.

once reasons for a decision are made public, it is more difficult not to deal with similar cases similarly – the use of ‘precedent’ in judicial cases aptly illustrates this phenomenon. treating like cases alike is an important requirement of fairness. Daniels envisions that a series of explicated decisions can function as an organizational counterpart of ‘case law’. future decision-makers can draw on past deliberations to establish relevant reasons. deviations from past decisions would need to be justified by relevant reasons.

 

 

implications for learning, decision-making and evaluations

if all decision-makers acknowledge that, at least, the final reasons for their decisions will be publicly accessible, how might that change the way they commission an evaluation and set about using the evidence from it?

first, it should encourage a review of past deliberations to help determine currently relevant reasons. second, it might encourage decision-makers and evaluators to treat as relevant those reasons and measures that will be explainable and understandable to the public(s) when justifying their decisions.

  • in planning evaluations, decision-makers and researchers will have to consider the clarity of the methods of data collection and analysis — effectively, will it pass a ‘grandmother test’? moreover, does it pass such a test when that granny is someone affected by your allocative decision? remember the central question that makes this criterion necessary: what can an agency say to those whose preferences are not met by a decision to convince them that, indeed, the decision “seems reasonable and based on considerations that take… [their] welfare into account?”
  • there are reasons that decision-makers might shy away from transparency. in his work on health plans, Daniels notes that such organizations feared (speculatively) media attacks and litigation. in our pilot-and-evaluate scenario, some implementers may not be comfortable with publicizing pilots that may fail, or with raising the expectations of beneficiaries who are part of pilots.
  • the fear of failure may influence implementers; this may lead to low-risk/low-innovation pilots. again, this is an important consideration raised above, in the questions we did not answer: when and how much transparency suffices for fairness?

 

in our last blog, we stressed the importance of engaging stakeholders in setting ‘relevant reasons’ before a project begins, as a key step towards fair deliberative processes as well as a way of shaping evaluations to be useful for decision-making. ensuring publicity and transparency of the decision-making criteria strengthens the perception of a fair and reasonable process in individual cases and over time.

this also sets the stage for an appeals process, where stakeholders can use evidence available to them to advocate a certain way forward; it also allows for stakeholders to revisit the decision-making criteria and the decisions they fostered – the subject of our next post in this series.

***

*we note that donors don’t often actually have to answer directly to implementers and participants for their decisions. we do not, however, dismiss this as a terrible idea.

**we are explicitly not saying ‘broader’ welfare because we are not endorsing a strictly utilitarian view that the needs of some can be sacrificed if the greater good is enhanced, no matter where or how  that good is concentrated.

have evidence, will… um, erm? (1 of 2)

this is a joint post with suvojit chattopadhyay, also cross-posted here.

commissioning evidence

among those who talk about development & welfare policy/programs/projects, it is très chic to talk about evidence-informed decision-making (including the evidence on evidence-informed decision-making and the evidence on the evidence on… [insert infinite recursion]).

this concept — formerly best-known as evidence-based policy-making — is contrasted with faith-based or we-thought-really-really-hard-about-this-and-mean-well-based decision-making. it is also contrasted with the (sneaky) strategy of policy-based evidence-making. using these approaches may lead to not-optimal decision-making, adoption of not-optimal policies and subsequent not-optimal outcomes.

in contrast, proponents of the evidence-informed decision-making approach believe that, through this approach, decision-makers are able to make sounder judgments about which policies will provide the best way forward, which may not, and/or which should perhaps be repealed or revised. this may lead them to make decisions on policies according to these judgments, which, if properly implemented or rolled back, may, in turn, improve development and welfare outcomes. it is also important to bear in mind, however, that it is not evidence alone that drives policymaking. we discuss this idea in more detail in our next post.

in this post, we work with a scenario where evidence is accepted as an important determinant of decision-making and this is acknowledged at least broadly by stakeholders who make explicit (or implicit) commitments to ‘use’ the evidence generated to drive their decisions. as good as this may sound, there are barriers to making decisions informed by evidence. one is the stock of accessible, well-considered data and rigorous analyses, including the stock in readable-yet-appropriately-nuanced, relevant, timely forms. several organizations’ raison d’être is to increase this supply of ‘much needed’ evidence. another barrier is a lack of demand among decision-makers for (certain types of rigorous) evidence (not just for the per diems that come with listening about evidence) – including evidence that could have positive or negative outcomes.

we don’t disagree that both supply and demand for high-quality evidence are important issues. but these two posts are not about those scenarios. rather, we focus on a scenario in which there is, at least, the demand for commissioning evidence.

key examples are donor agencies, big (I)NGOs (BINGOs, if we must) or even government ministries that engage in evidence-generating activities, particularly when the stated goal is to make decisions about piloted programs (continue funding, scale-up, scrap, etc) or make significant tweaks to on-going programs. this should be the ‘easiest’ case of using evidence to inform a decision, where demand for evidence leads to the generation of a supply of by-definition-relevant evidence.

and yet, from what we have seen and experienced, even agencies that have made it to this seemingly enlightened precipice of evidence-informed decision-making don’t know, at a practical level, what to do with that evidence once they’ve got it. we are not suggesting that those inside such agencies are not skilled at reading and interpreting evidence. rather, we suggest that so much attention has been given to supplying and demanding evidence that use has been overlooked.

absent attention to use, how generated evidence informs decision-making, if it does at all, is something of a mystery. absent a plan for use, it can also be mysterious (or, at least, not transparent) why the agency bothered to commission the evidence-generation at all. we suspect that better-considered evidence and better plans for use can improve the use of evidence. our hunches drive these two blog posts.

in this post, we make two main points.

one, we hold that a careful formative stage, during which stakeholders are engaged to help develop research questions while remaining mindful of the policy process, can help generate evidence that those stakeholders will know how to use. there is overlap and complementarity between our suggestions and the recent ideas of Monitoring, Structured experiential Learning & Evaluation (MeE; Pritchett, Samji & Hammer) and Problem-Driven Iterative Adaptation (PDIA; Andrews, Pritchett & Woolcock). however, here, we remain focused on planning for evaluation and setting the questions.

two, and relatedly, we advocate for more careful planning of how the generated evidence will be used in decision-making, regardless of the outcomes. in our next post, we take seriously that evidence is far from the only decision-making criterion. we discuss how evidence might be fit into a fair, deliberative process of decision-making by agencies and what such a process might entail.

at the outset, we recognize that there is a poor one-to-one mapping of the results of a single rigorous study or paper with policy changes (e.g. and also fun). in these two posts, however, we stay focused on studies that are set up specifically to guide future decisions and thus *should*, by definition, be immediately relevant to policy/programmatic funding/scaling decisions.

formative work: assessing needs and interests of decision-makers and other stakeholders

an early and wise step, we think, in planning an evaluation that is not only policy-associated (we looked at a real, live policy!) but explicitly policy-relevant in terms of decision-making is to identify what kinds of decisions may be made at the end of the evaluation (i.e. what will be informed) and who may be involved. ‘involved’ includes elite decision-makers and possible policy champions and heroes; it also includes middle- and street-level bureaucrats who will implement the policy/program if that is the decision taken (see, e.g., here and here on getting buy-in beyond visible leaders).

among those who talk about demand-generation for evidence, there’s increasing recognition that stakeholder buy-in for the process of using evidence (not just for the policy under investigation) is required early on. but there seems to be less talk about actually asking stakeholders what they want to know in order to make decisions. we don’t suggest that what stakeholders deem most interesting should define the limits of what will be collected, analyzed and presented. many decision-makers won’t spontaneously crave rigorous impact evaluation.

there is plenty of evidence that decision-makers are heavily influenced by stories, images, even immersive experiences. this is not categorically bad and it certainly should not be ignored or discounted. rather, in addition to the types of data and analyses readily labelled as rigorous in the impact evaluation arena, we can be creative about collecting and analyzing additional types of data in more rigorous ways, positioned within a counterfactual framework. because, in the end, incorporating stakeholder preferences for the kinds of evidence they need to drive policy change would enhance the quality of the evidence-generation process.

another consideration relates to asking what magnitude of impacts decision-makers feel they need to see to be confident in making their decisions. we don’t suggest this is an easy question to ask — nor to answer. we only suggest that it could be a useful exercise to undertake (as with all our suggestions, empirical evidence from process data about decision-making would be very helpful).
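to make this concrete, here is a minimal sketch (not from the original post) of one way an answer to that question could feed into evaluation design: translating a stakeholder’s stated minimum meaningful effect into a required sample size via a standard power calculation. the 0.2 standard-deviation threshold and the conventional alpha and power values below are illustrative assumptions, not recommendations.

```python
# illustrative sketch: converting a decision-maker's minimum meaningful effect
# into a required sample size using a standard two-sample power calculation.
# all numbers below are hypothetical placeholders.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# suppose stakeholders say they would only change course for at least a
# 0.2 standard-deviation improvement in the primary outcome (assumed value)
minimum_meaningful_effect = 0.2

n_per_arm = analysis.solve_power(
    effect_size=minimum_meaningful_effect,
    alpha=0.05,   # conventional significance level
    power=0.8,    # conventional power target
    ratio=1.0,    # equal-sized treatment and comparison arms
)
print(f"required sample size per arm: {n_per_arm:.0f}")  # roughly 394
```

the point of such an exercise is less the specific number than the conversation it forces: if the effect size decision-makers say they need to see implies a sample (or evaluation budget) the pilot cannot support, that is worth knowing before the evaluation is commissioned.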

a related exercise is to honestly assess reasoned expectations for the elapsed time between introducing an intervention and the potential expression of relevant impacts. the evaluation should be planned accordingly, as too short an evaluation period may not allow the outcomes of interest to materialize.

planning to use evidence

it often seems that commissioners of evidence (and even those who generate the evidence) don’t actively consider how the evidence will actually be used in design or funding or whatever decisions will be made. there seems to be even less consideration of how the evidence will be used regardless of what the outcome is – positive, negative, mixed, null (a point made by, among others in other fora, Jeannie Annan, here). this may be one reason null and negative results seem to go unaddressed.

if there is a (potentially imposed) desire to commission rigorous evidence, one might assume there is genuine equipoise (or uncertainty, also here) about the efficacy, effectiveness or cost-effectiveness of a policy/program. yet much talk about early buy-in is actually about the program and the potential to validate a flagship program and justify related spending through evaluation — not about the value of the evaluation process itself for learning. we don’t think this represents the best use of evaluation resources.

an exercise early in the formative phase during which decision-makers consider how the evidence will help them make a decision may be useful – especially if they are asked to consider scenarios in which the evidence is clearly positive, clearly negative or null, mixed, fuzzy or indeterminate. this might also help to clarify research questions that should be asked as part of an evaluation.

in a recent blog post, dr. ian goldman suggests getting decision-maker buy-in by asking “departments to submit proposals for evaluations so that they will want to use the findings.” this is an important step. but it does not mean that proposal-submitters have considered how they will use the evidence if it comes back anything but unequivocally positive for the policy/program/project in question.

dr. goldman also proposes asking departments to design “improvement plans” after their evaluations are complete. we’d like to hear more about this process. but we suspect that drafting such a plan early in the formative stage might actually inform some of the research questions, thus better linking the evaluation to action plans for improvement. for example, sophie at oxfam has written about IE results that left them with an “evidence puzzle” rather than a clear idea of how to improve the program. we don’t know if an early exercise in drafting an “improvement plan” would have yielded less puzzling outcomes — but that is an empirical question.

we hope that agencies doing such formative work will document and share the processes and their experiences.

be honest about the full theory of change for using evidence

in a good evaluation, positive validation is not the only possible outcome. therefore, the commissioning agency should honestly consider whether, if the results come back null or negative, the agency would actually be willing to pull or roll back the policy. in many cases, programs have political cachet and entitlement value regardless of the objective welfare benefits delivered. rolling back will not be a politically viable option in such cases. while it is important to build the general evidence base about policy/program cost/effectiveness, when an agency asks for evidence towards a particular decision that it isn’t actually willing to make, we are not sure the eval should go forward.

or, at least, we are uncertain if it should go forward as a yes/no question, where a negative result implies stopping the program. we suspect that evaluation will start to be more appreciated by decision-makers if it is designed to compare the effectiveness of option A versus option B in delivering the favored program, rather than only examining whether option A works (and why). the former set-up provides ways forward regardless of the outcome; the latter may, in the political sense, not.

moving forward

in sum, we think that careful formative and needs-assessment work on what decision-makers (and potential implementers) want to see to be convinced and what types of evidence will inform decision-making may lead to the generation of evidence that is not only policy-related but genuinely policy-relevant.  when an agency or ministry specifically commissions an evaluation with the stated goal of using it in decision-making, this seems particularly important. doing this work well will require collaboration between commissioners, implementers and evaluators.

in the next post, we (humbly) consider the overall role evidence plays in decision-making and consider how it might fit into an overall fair and deliberative process.