i feel like an #oddeven party pooper (reducing and working are not the same)

there are two nice, evidence-informed op-ed pieces out today on delhi’s odd-even scheme to try to reduce air pollution (here and here). the results are heartening because i didn’t have a good sense of whether a two-week window of implementing a policy (to which there were many exceptions) was long enough to detect a statistically significant change in meaningful measures of pollution. nor, admittedly, did i feel that i was breathing cleaner air the past two weeks. as one of the articles points out, much of the anecdotal chatter has been about clearer roads, not clearer skies.
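
as a rough illustration of my skepticism, here is a back-of-the-envelope power calculation. every number in it is my own assumption, not a figure from the study; it is only a sketch of why detecting anything from two weeks of noisy daily averages seemed like a long shot.

    # back-of-the-envelope sketch: how large a reduction in mean pm2.5
    # could a two-week window plausibly detect from daily averages alone?
    # all numbers below are assumptions for illustration, not study figures.
    from statsmodels.stats.power import TTestIndPower

    baseline_mean = 250.0  # assumed mean pm2.5 (ug/m3) in delhi winter
    daily_sd = 100.0       # assumed day-to-day standard deviation
    days = 14              # length of the odd-even window

    # minimum standardized effect detectable with 80% power at alpha = 0.05,
    # treating each day as one observation in a two-sample comparison
    mde = TTestIndPower().solve_power(nobs1=days, alpha=0.05, power=0.8, ratio=1.0)
    print(f"minimum detectable effect: {mde:.2f} sd, "
          f"about {mde * daily_sd / baseline_mean:.0%} of the baseline mean")

under these made-up numbers, only a drop of roughly 40% would be reliably detectable from daily city-wide averages, which is part of why a statistically significant 18% finding (presumably leaning on much richer data than daily averages) surprised me.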


since i live in delhi, am certainly affected by the air quality, and worry about my health accordingly (plume tells me every day that the situation is dire), i was pretty pleased to wake up to the headline “yes delhi, it worked.” and good evidence has indeed been generated (rigorously obtained, as laid out by suvojit) of a statistically significant 18% reduction in nasty particulate matter (pm 2.5) during the hours the intervention was in effect.


this was a policy that i wanted to see work, so i am pleased that the evidence shows a reduction in the particulate matter that is driving many of my good friends out of the city (alongside many other woes). but we must be careful: whether something “worked” is more subjective than the evidence of a reduction, which greenstone and colleagues have nicely and rapidly documented.


if models had predicted a 50% reduction, we wouldn’t have been so thrilled about 18%. if the government had said that every little bit counts and that even a 5% reduction would count as a success and a reason to commit to continuing the program, then 18% is indeed quite impressive.


moving forward, as delhi tries to clean up its act and hopefully become a model for the rest of the country, clarifying decision-points and definitions of success up front will be important. for the next pilots (and delhi desperately needs such measures), how will we declare, in a rigorous and defensible way, that a policy effort ‘worked’ well enough to be scaled and continued? those of us interested in promoting the use of rigorous evidence and evaluation to inform decision-making need to be slightly cautious in our interpretations and celebrations of victory when we haven’t said up front what we’ll count as a triumph.
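
to make the ‘define success up front’ point concrete, here is a toy sketch of what a pre-registered decision rule might look like. the thresholds and the confidence interval are invented for illustration, not proposed policy:

    # a toy pre-registered success rule for a pilot (illustrative only):
    # "worked" means the entire 95% confidence interval for the estimated
    # reduction clears a minimum bar agreed before the pilot starts.

    def pilot_succeeded(ci_lower: float, min_meaningful: float) -> bool:
        """declare success only if even the lower bound of the ci exceeds
        the reduction we agreed in advance to count as meaningful."""
        return ci_lower > min_meaningful

    # suppose an 18% point estimate came with a ci of (10%, 26%):
    print(pilot_succeeded(ci_lower=0.10, min_meaningful=0.05))  # True under a 5% bar
    print(pilot_succeeded(ci_lower=0.10, min_meaningful=0.50))  # False under a 50% bar

the numbers don’t matter; the point is that the bar is written down before the results arrive.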


*as an addendum (31 jan 2016), it is not clear that the researchers themselves penned the title ‘yes delhi, it worked.’ to give the benefit of the doubt, i am hoping that the researchers submitted something more along the lines of ‘yes delhi, odd-even reduced pollution’ and that the newspaper opted to change it. but the point holds: success is subjective and therefore requires a definition, preferably ex ante.


thoughts from #evalcon on evidence uptake, capacity building

i attended a great panel today, hosted by the think tank initiative and idrc and featuring representatives from three of tti’s cohort of think tanks. this is part of the broader global evaluation week (#evalcon) happening in kathmandu, focused on ‘building bridges: use of evaluation for decision making and policy influence.’ the notes on evidence uptake largely come from the session, while the notes on capacity building are my own musings inspired by the event.


one point made early on contrasted evidence-informed decision-making with opinion-informed decision-making. i’ve usually heard the contrast drawn with faith-based decision-making, and i think the opinion framing is useful. it also comes in handy for one of the key takeaways from the session: maybe the point (and the feasible goal) isn’t to do away with opinion-based decision-making but rather to make sure that opinions are increasingly shaped by rigorous evaluative evidence. or, to be more bayesian about it, we want decision-makers to continuously update their priors about different issues, drawing on evidence.
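
to make that bayesian framing concrete, here is a minimal sketch (every number in it is hypothetical): the decision-maker’s opinion is the prior, and each credible evaluation shifts it in proportion to the evaluation’s precision, rather than replacing it wholesale.

    # conjugate normal updating of a decision-maker's belief about a
    # program's effect; all numbers are hypothetical.

    def update_belief(prior_mean, prior_var, estimate, estimate_var):
        """combine the prior with a new study estimate, weighting each
        by its precision (1 / variance)."""
        post_var = 1.0 / (1.0 / prior_var + 1.0 / estimate_var)
        post_mean = post_var * (prior_mean / prior_var + estimate / estimate_var)
        return post_mean, post_var

    # opinion-informed starting point: "this program does very little"
    belief = (0.02, 0.10 ** 2)  # mean effect 2%, sd 10%

    # two hypothetical evaluations estimate effects of 15% and 10%
    for estimate, se in [(0.15, 0.05), (0.10, 0.04)]:
        belief = update_belief(*belief, estimate, se ** 2)
        print(f"updated belief: {belief[0]:.1%} (sd {belief[1] ** 0.5:.1%})")

opinion never disappears from this picture; it just gets steadily re-weighted by evidence.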


this leads to a second point. in concentrating on policy influence, we may become too focused on influencing very specific decision-makers for very specific decisions. this may lead us to lose sight of the broader goal of (re-)shaping the opinions of a wide variety of stakeholders and decision-makers, even when they are not linked to the immediate policy or program under evaluation. so, again, the frame of shaping opinions and aiming for decision-maker/power-center rather than policy-specific influence may lead to altered approaches, goals, and benchmarks.


a third point that echoed throughout the panel is that policy influence takes time. new ideas need time to sink in and percolate before opinions are re-shaped. secretary suman prasad sharma of nepal noted that, from a decision-maker’s point of view, evaluations are better and more digestible when they aim to build bit by bit. participants invoked a building-blocks metaphor several times and contrasted it with “big bang” results. a related and familiar point about the time and timing required for evaluation to change opinions and shape decisions: planning for the next phase of the program cycle generally begins midway through current programming. if evaluation is to inform this next stage of planning, it requires the communication of interim results, or a more thoughtful alignment of the program planning cycle with monitoring and evaluation funding cycles in general.


a general point that came up repeatedly was what constitutes a good versus a bad evaluation. this leads to a key capacity-building point: we need more “capacity-building” that helps decision-makers recognize credible, rigorous evidence and mediate between conflicting findings. way too often, in my view, capacity-building ends up being about how particular methods are carried out, rather than about the central task of identifying credible methodologies and weighting findings accordingly (or about broader principles of causal inference). that is, capacity-building among decision-makers needs to (a) start from an understanding of how they currently assess credibility (on the radical premise that capacity-building exercises might generate capacity on both sides) and (b) help them become better consumers, not producers, of evidence.


a point that surfaced continuously about how decision-makers assess evidence concerned objectivity and neutrality. ‘bad evaluations’ are biased and opinionated; ‘good evaluations’ are objective. there is probably a much larger conversation to be had about parsing objectivity from independence and engagement, as well as further assessment of how decision-makers judge neutrality and how evaluators might establish and signal their objectivity. as a musing: a particular method doesn’t guarantee neutrality, which can also be violated in shaping the questions, selecting the site and sample, and so on.


other characteristics of ‘good evaluation’ came out as well. ‘good evaluation’ doesn’t confuse being critical with only being negative; findings about what is working are also appreciated. ‘bad evaluation’ assigns blame and accountability to particular stakeholders without taking a nuanced view of the context and events (internal and external) during the evaluation. ‘good evaluation’ involves setting evaluation objectives up front. ‘good evaluation’ also places the findings in the context of other evidence on the same topic; this literature/evidence-review work, especially when it does not focus on a single methodology or discipline (and, yes, i am particularly alluding to RCT authors who tend to cite only other RCTs, at the expense of sectoral evidence and other methodologies), is very helpful to a decision-making audience, as is helping to make sense of conflicting findings.


a final set of issues related to timing and transaction costs. a clear refrain throughout the panel was the importance of timing in sharing findings. this means paying attention to the budget-making cycle and sharing results at just the right moment. it means spotting windows of receptivity to evidence on particular topics, reframing the evidence accordingly, and sharing it with decision-makers and the media. it probably means learning a lot more from effective lobbyists. staying in tune with policy and media cycles in a given evaluation context is hugely time-consuming, and the point was made, and is well-taken, that the transaction costs of this kind of staying-in-tune are quite high for researchers. perhaps goals for influence by the immediate researchers and evaluators should be more modest, at least when shaping a specific decision was not the explicit purpose of the evaluation.


this suggests at least two more indirect routes. one is to communicate findings clearly to, and do the necessary capacity-building with, naturally sympathetic decision-makers (say, parliamentarians or bureaucrats with an expressed interest in x issue) so they become champions who keep the discussion going within decision-making bodies. to reiterate, my view is that capacity-building efforts should prioritize helping decision-makers become evidence champions and good communicators of specific evaluation and research findings. this is an indirect road to influence but an important one, leveraging the credibility decision-makers have with one another. the second, also indirect, is to communicate findings clearly to, and do the necessary capacity-building with, the types of (advocacy? think tank?) organizations whose job it is to track the timing of budget meetings and the shifting political priorities and local events to which the evidence can be brought to bear.


the happy closing point was that a little bit of passion in evaluation, even while trying to remain neutral and objective, does not hurt.

Buffet of Champions: What Kind Do We Need for Impact Evaluations and Policy?

This post is also cross-posted here and here.

I realize that the thesis of “we may need a new kind of champion” sounds like a rather anemic pitch for Guardians of the Galaxy. Moreover, it may lead to inflated hopes that I am going to propose that dance-offs be used more often to decide policy questions. While I don’t necessarily deny that this is a fantastic idea (and it would certainly boost C-SPAN viewership), I want to quickly dash hopes that this is the main premise of this post.

Rather, I am curious why “we” believe that policy champions will be keen on promoting and using impact evaluations (and subsequent syntheses of their evidence), and I want to suggest that another range of actors, whom I call “evidence” and “issue” champions, may be more natural allies. There has been a recurring storyline in recent literature and musings on (impact) evaluation and policy- or decision-making:

  • First, the aspiration: the general desire of researchers (and others) to see more evidence used in decision-making (let’s say both judgment and learning) related to aid and development, so that scarce resources are allocated more wisely and/or more resources are brought to bear on the problem.
  • Second, the dashed hopes: the realization that data and evidence currently play a limited role in decision-making (see, for example, the report on the evidence on evidence-informed policy-making, as well as here).
  • Third, the new hope: the recognition that “policy champions” (also “policy entrepreneurs” and “policy opportunists”) may be a bridge between the two.
  • Fourth, the new plan of attack: bring “policy champions” and other stakeholders into the research process much earlier in order to get uptake of evaluation results into debates and decisions. This even includes bringing policy champions (say, bureaucrats) on as research PIs.

There seems to be a sleight of hand at work in the above formulation, and it is somewhat worrying in terms of equipoise and the possible use of the range of results that can emerge from an impact evaluation study. Said another way, it seems potentially at odds with the idea that the answer to an evaluation is unknown at the start of the evaluation.

While I am not sure that “policy champion” has been precisely defined (and, indeed, this may be part of the problem), the policy entrepreneur concept has been. So far as I can tell, the first articulation of the entrepreneurial (brokering, middle-man, risk-taking) role in policy-making comes from David E. Price in 1971. The idea was repeated and refined in the 1980s and then became more commonplace in 1990s discussions of public policy, in part through the work of John Kingdon. (There is also a formative and informative 1991 piece by Nancy Roberts and Paula King.)

Much of the initial discussion, it seems, came out of studying US national and state-level congressional politics, but the ideas have been repeatedly shown to have merit in other deliberative settings. Much of the initial work also focused on agenda-setting (which problems and solutions gain attention), but similar functions are also important in the adoption and implementation of policy solutions. Kingdon is fairly precise about the qualities of a policy entrepreneur: someone who has, as he calls it, a pet policy that they nurture over years, waiting for moments of opportunity to suggest their policy as the solution to a pressing problem.

  • First, such a person must have a “claim to a hearing”; that is, at least behind the scenes, people must respect and be willing to listen to this person on this topic (especially if this person is not directly in a position with decision-making power).
  • Second, such a person must have networks and connections, as well as an ability to bargain and negotiate within them. This is a person who can broker ideas across diverse groups of people, “soften up” people to the entrepreneur’s preferred policy solution, etc.
  • Third, such a person must have tenacity, persistence, and a willingness to risk personal reputation and resources for a policy idea.

In Kingdon’s and others’ conception, a policy entrepreneur has to work at selling their idea over a long period of time (which is presumably why Weissert (1991) also introduced the idea of policy opportunists, who only start to champion ideas once they make it to the deliberating table and seem likely to move forward). In short, policy entrepreneurs (and, through the sloppy use of near-synonyms, policy champions) believe strongly in a policy solution for some reason and have put time, effort, and reputation into moving the idea forward. Note the nebulous use of “some reason”: I have not found a definition that specifies that policy entrepreneurs must come to promote a policy through a particular impetus. Glory, gold, God, goodness, and (g’)evidence all seem to be viable motivators under the definition.

My question is: is this what we need to support the use of research (and, specifically, impact evaluations and syntheses thereof) in decision-making? It is not clear to me that it is. Policy entrepreneurs are people already sold on a particular policy solution, whereas the question behind much evaluation work is ‘is this the best policy solution for this context?’ (recognizing the importance of contextual and policy, if not clinical, uncertainty about the answer in order for an evaluation to be worthwhile). It seems to me that what we (researchers and evaluators) actually need are people deeply committed to one of two things:

(1) the use of data and evidence in general (“evidence champions” or, at least loosely, technocrats) as an important tool in sound decision-making, and/or

(2) a particular issue or problem (“issue champions”; no doubt a sexier phrase is available). I’ll spend more time on the second.

An “issue champion,” for example, may be someone who has qualities similar to a policy entrepreneur’s but, rather than using a claim to a hearing, a network, and tenacity to bring forward a policy solution, s/he uses these tools to bring attention to a problem (say, malaria mortality). This person feels that malaria is a problem that must be solved and is open to finding the most (cost-)effective solution to the problem (or the means to do a good job of implementing that solution).

S/he is not, by contrast, someone committed to believing that prevention, diagnostics, or treatment in any particular form or at any particular price is the best way forward before having seen evidence of this in a relevant context. This is different from a “policy champion” who has, for example, been pushing for universal bednet coverage for the past 20 years. This is not to say that you don’t want the bednet champion to be well aware of your study and even to have input into defining the research questions and approving the research design (in fact, this seems vital in lending credibility and usefulness to the results). But the way the study is structured will matter to whether the bednet champion is open to taking up the full range of possible results from your study.

If your question is: does approach A or approach B result in more efficient distribution of bednets, then yes, both sets of results will be interesting to the bednet champion.

But if the question is more of the type ‘are bednets the most cost-effective approach to addressing malaria mortality in our country?’, then the bednet champion is likely to be interested in trumpeting only one set of results: those significantly in favor of bednets as a solution to the malaria problem.

The malaria/issue champion (or general evidence enthusiast), on the other hand, may be more open to thinking about how to interpret and use the full range of possible results from the study, which may be mixed, inconclusive, or even negative. (Throughout this discussion, I recognize that malaria, like all problems in human and economic development, has no silver-bullet answer and that, therefore, “A or not-A”-type evaluation questions will only get us so far in getting the right mix of tools to the right place at the right time; i.e., the answer is likely neither that bednets do no good nor that they are the only thing needed to tackle malaria.)

The worry, then, with the policy champion is that they are already committed to a policy solution. Will they change their mind on the basis of one study? Probably not (nor, necessarily, should they; but a meta-analysis may not sway them either). Insofar as “we” want decision-makers to learn about our evidence and consider it in deliberations, it may be issue, rather than policy, champions who are particularly important. They may make use of the results regardless of what they are; we cannot necessarily expect the same of the policy champion. Of course, a small army of evidence champions is also helpful. I do want to stress that it is critical to have policy champions and other stakeholders involved early in the research-design process, so that the right questions are asked and the politically and contextually salient outcomes and magnitudes considered. But as an ally in the evaluation process and, say, a potential PI on an evaluation, it seems that the issue champions are the folks most likely to stick with it.

And, yes, issue champions should probably have some moves ready, in case of a dance-off (as there will always be factors beyond evidence and data influencing decisions).