some posts… no shit

some posts maybe you are never ready to write. nevertheless, the news of marc roberts’s death over the weekend seems to warrant both an immediate response and the response that is just right. he usually seemed able to manage these simultaneously but, given the sense of time that follows the completion of a life well-lived and well-said, i’ll err on the side of the former.

i won’t claim to have known marc as well as many did but i knew him well enough to respect him, which means, perhaps, seeing past rough first impressions. he pronounced himself a reformed economist at some point early enough to influence me: politics and the realities of implementation and the curves of ethics-in-real-life became the subjects of his writing and his teaching, and we are better for it.

marc had a standard line — a bit of a trap — that he would lead you into (funnier to watch others walk into than to realize you had followed him in yourself). you might make a comment; maybe even one you thought useful. then he would start. he grew up in jersey. [fill in a few lines about the roughness of growing up in a steel town in jersey.] they had a saying back then, he’d say, that would apply to the point you’d just raised.

no shit.

familiar and biting each time (after the first, which was less pleasant). what always made it ok was the sense that he was, and wanted you to be, in pursuit of the right questions. he raised questions of distribution when everyone else was looking at average treatment effects. he was a reformed economist when the economics profession was booming. he wanted to know about implementation when everyone was looking at theoretical equations. and he wanted to know about theory when everyone was looking at the sexy result of the moment.

we were through with “pinning butterflies,” a line that reached me indirectly from marc. categorizing treatments or results wasn’t what we needed — we needed to explain things and try to make sense of them.

and then to do better.

some posts you are never ready to write. but some are scratched in before you even sit down to them and some give you a sense that you shouldn’t wait. with marc, the gist sank in early, so one doesn’t have to do much work to imagine he’s still around. which is quite a good thing.

we need his voice. it’ll be missed but, as with all good teachers, it, with its gruff accent, is hardly gone.

thank you, marc.


data systems strengthening

i have been saying for some time that my next moves will be into monitoring and vital registration (more specifically, a “poor richard” start-up to help countries measure the certainties of life: (birth), death, and taxes). (if village pastors could get it done with ink and scroll in the 16th century across northern Europe, why aren’t we progressing with today’s technology?! surely this is a potentially solid application of the capacity of mobile phones as data collection and transmission devices?)
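to make the mobile-phone idea slightly more concrete, here is a minimal sketch of what a reported vital-event record could contain — the fields and names are entirely my own invention for illustration, not any existing registration standard:

```python
# purely illustrative sketch: a minimal vital-event record that a mobile
# phone could capture and transmit; fields are hypothetical, not a standard
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class VitalEvent:
    event_type: str                    # "birth" or "death"
    event_date: date
    place_code: str                    # village or facility where the event occurred
    reporter_id: str                   # e.g., a community health worker's ID
    sex: str
    age_years: Optional[float] = None  # mainly relevant for deaths

# a death report, as a village reporter might submit it by phone
report = VitalEvent("death", date(2014, 4, 1), "VLG-042", "chw-17", "f", 63.0)
print(report)
```

even a record this simple, transmitted reliably and aggregated centrally, would be a start.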

i stumbled onto a slightly different idea today: building backwards from well-financed evaluation set-ups for specific projects to more generalized monitoring systems. this would be in contrast to the more typical approach of either skipping monitoring altogether or working first to build monitoring systems (including of comparison groups), followed at some point — once monitoring is adequately done — by an (impact) evaluation.

why don’t more evaluations have mandates to leave behind data collection and monitoring systems ‘of lasting value,’ following on from an impact or other extensive, academic (or outsider)-led evaluation? in this way, we might also build from evaluation to learning to monitoring. several (impact) evaluation organisations are being asked to help set up m&e systems for organizations and, in some cases, governments. moreover, many donors talk about mandates for evaluators to leave behind built-up capacity for research as part of the conditions of their grants. but maybe it is time to start talking about mandates to leave behind m&e (and MeE) systems — infrastructure, plans, etc.

a potentially instructive lesson (in principle if not always in practice) comes from ‘diagonal’ health interventions, in which funded vertical health programs (e.g. disease-specific programs, such as an HIV-treatment initiative) are required to also engage in overall health systems strengthening.

still a nascent idea, but i think it is one worth more than just me thinking about: how organisations that have developed reputations (rightly or not) for collecting and entering high-quality data for impact evaluations could build monitoring systems backwards, as part of what is left behind after an experiment.

(also, expanding out from DSS sites is an idea worth exploring.)

have evidence, will… um, erm (6 of 6, enforcing accountability in decision-making)

this is a joint post with suvojit, continuing from 5 of 6 in the series. it is also cross-posted here.

 

a recent episode reminded us of why we began this series of posts, of which this is the last. we recently saw our guiding scenario for this series play out: a donor was funding a pilot project accompanied by a rigorous evaluation, which was intended to inform further funding decisions.

in this specific episode, a group of donors discussed an ongoing pilot program in Country X, part of which was evaluated using a randomized controlled trial. the full results and analyses were not yet in; the preliminary results, marginally significant, suggested that there ought to be a larger pilot taking into account lessons learnt.

along with X’s government, the donors decided to scale up. the donors secured a significant funding contribution from the Government of X — before the evaluation yielded results. indeed, securing government funding for the scale-up and a few innovations in the operational model had already given this project a sort of superstar status in the eyes of both the donors and the government. it appeared the donors in question had committed to the government that the pilot would be scaled up before the results were in. moreover, a little inquiry revealed that the donors did not have clear benchmarks or decision criteria going into the pilot about key impacts and magnitudes — that is, the types of evidence and results — that would inform whether to take the project forward.

there was evidence (at least it was on the way) and there was a decision but it is not clear how they were linked or how one informed the other.
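to make the missing piece concrete, here is a purely hypothetical sketch of what ex ante decision criteria could look like if written down before a pilot begins — the thresholds, names, and decision rule below are invented for illustration, not drawn from the actual episode:

```python
# hypothetical ex ante decision rule, agreed before the pilot starts;
# all thresholds and names are invented for illustration
SCALE_UP_CRITERIA = {
    "min_effect_size": 0.10,           # smallest impact worth scaling (in SD)
    "max_p_value": 0.05,               # required statistical significance
    "max_cost_per_beneficiary": 25.0,  # in USD
}

def decide(effect_size, p_value, cost_per_beneficiary, criteria=SCALE_UP_CRITERIA):
    """link the scale-up decision explicitly to the pre-agreed criteria."""
    met = (
        effect_size >= criteria["min_effect_size"]
        and p_value <= criteria["max_p_value"]
        and cost_per_beneficiary <= criteria["max_cost_per_beneficiary"]
    )
    return "scale up" if met else "revise or stop; return to deliberation"

# marginally significant preliminary results, as in the episode above
print(decide(effect_size=0.08, p_value=0.09, cost_per_beneficiary=22.0))
```

the point is not these particular numbers; it is that the link between evidence and decision is written down before the results arrive.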

 

reminder: scenario

we started this series of posts by admitting the limited role evidence plays in decision-making — even when an agency commissions evidence specifically to inform a decision. the above episode illustrates this, as well as the complex and, sometimes, messy way that (some) agencies, like (some) donors, approach decision-making. we have suggested that, given that resources to improve welfare are scarcer than needs, this approach to decision-making is troubling at best and irresponsible at worst. note that the lack of expectations and of a plan for decision-making is as troublesome as the limited use of outcome and impact evidence.

in response to this type of decision-making, we have had two guiding goals in this series of posts. first, are there ways to design evaluations that will make the resultant outcomes more usable and useful (addressed here and here)? second, given all the factors that influence decisions, including evidence, can the decision-making process be made fairer and more consistent across time and space?

to address the second question, we have drawn primarily on the work of Norm Daniels, to consider whether and how decisions can be made through a fair, deliberative process that, under certain conditions, can generate outcomes that a wide range of stakeholders can accept as ‘fair’.

Daniels suggests that by achieving four key criteria, these “certain conditions” for fair deliberation can be met, including in deliberations about which programs to scale after receiving rigorous evidence and other forms of politically relevant feedback.

 

closing the loop: enforceability

so far, we have reviewed three of these conditions: relevant reasons, publicity, and revisibility. in this post, we examine the final condition, enforceability (regulation or persuasive pressure).

meeting the enforceability criterion means providing mechanisms to ensure that the processes set by the other criteria are adhered to. this is, of course, easier said than done. in particular, it is unclear who should do the enforcing.*

we identify two key questions about enforcement:

  • first, should enforcement be external to or strictly internal to the funding and decision-making agency?
  • second, should enforcement rely on top-down or bottom-up mechanisms?

 

underlying these questions is a more basic, normative question: in which country should these mechanisms reside — the donor’s or the recipient’s? the difficulty of answering this question is compounded by the fact that many donors are not nation-states.

we don’t have clear answers to these questions, which themselves likely need to be subjected to a fair, deliberative process. here, we lay out some of our own internal debates on two key questions, in hopes that they point to topics for productive conversation.

 

  1. should enforcement of agency decision-making be internal or external to the agency?

this is a normative question but it links with a positive one: can we rely on donors to self-regulate when it comes to adopted decision-making criteria and transparency commitments?

internal self-regulation is the most common model we see around us, in the form of internal commitments such as multi-year strategies, requests for funds made to the treasury, etc. in addition, most agencies have an internal-but-independent ‘results’ or ‘evaluation’ cell, intended to make sure that M&E is carried out. in the case of DFID, for instance, the Independent Commission for Aid Impact (ICAI) seems to have a significant impact on DFID’s policies and programming. it also empowers the British parliament to hold DFID to account over a variety of funding decisions, as well as future strategy.

outside the agency, oversight and enforcement of relevance, transparency, and revisibility could come from multiple sources. from above, it could be a multilateral agency/agreement or a global INGO, similar to a Publish What You Pay(?). laterally, the government of the country in which a program is being piloted could play an enforcing role. finally, oversight and enforcement could come from below, through citizens or civil society organizations, in both donor and recipient countries. this brings us to our next question.

 

  2. should enforcement flow top-down or bottom-up?

while this question could be answered about internal agency functioning and hierarchy, we focus on the potential for external enforcement from one direction or the other. and, again, the question is a normative one but there are positive aspects related to capacity to monitor and capacity to enforce.

enforcement from ‘above’ could come through multilateral agencies or through multi- or bi-lateral agreements. one possible external mechanism is for more than one donor to come together to make a conditional funding pledge to a program – contingent on achieving pre-determined targets. however, as we infer from the opening example, it is important that such commitments be based on a clear vision of success, not just on political imperatives or project visibility.

enforcement from below can come from citizens in donor and/or recipient countries, including through CSOs and the media. one way to introduce bottom-up pressure is for donors to adhere to the steps we have covered in our previous posts – agreement on relevant reasons, transparency, and revisibility – and thereby involve a variety of external stakeholders, including the media, citizens, and CSOs. these can contribute to a mechanism whereby there is pressure from the ground for donors to live up to their own commitments.

media are obviously important players in these times. extensive media reporting of donor commitments is a strong mechanism for informing and involving citizens – in both donor and recipient countries; media are also relevant to helping citizens understand limits and how decisions are made in the face of resource constraints.

 

our combined gut feeling, though, is that in the current system of global aid and development, the most workable approach will probably include a mixture of formal top-down and informal bottom-up pressure. from a country-ownership point of view, we feel that recipient country decision-makers should have a (strong) role to play here (more than they seem to have currently), as well as citizens in those countries.

however, bilateral donors will probably continue to be more accountable to their own citizens (directly and via representative legislatures) and, therefore, a key task is to consider how to bolster their capacity to ensure ‘accountability for reasonableness’ in the use of evidence and in decision-making more generally. at the same time, multilateral donors may have more flexibility to consider other means of enforcement, since they don’t have a narrow constituency of citizens and politicians to be answerable to. however, we worry that the prominent multilateral agencies we know are also bloated bureaucracies with unclear chains of accountability (as well as a typical sense of self-perpetuation).

while there is no clear blueprint for moving forward, we hope the above debate has gone a small step towards asking the right questions.

 

in sum

in this final post, we have considered how to enforce decision-making and priority-setting processes that are ideally informed by rigorous and relevant evidence but also, more importantly, in line with principles of fairness and accountability for reasonableness. these principles were not fully evident in the episode that opened this post.

through this series of posts, we have considered how planning for decision-making can help in the production of more useful evidence and can set up processes to make fairer decisions. for the latter, we have relied on Norm Daniels’s framework for ensuring ‘accountability for reasonableness’ in decision-making. this is, of course, only one guide to decision-making, but one that we have found useful in broaching questions of not only how decisions are made but how they should be made.

in it, Daniels proposes that deliberative processes should be based on relevant reasons and commitments to transparency and revisibility that are set ex ante to the decision-point. we have focused specifically on decision-making relating to continuing, scaling, altering, or scrapping pilot programs, particularly those for which putatively informative evidence has been commissioned.

we hope that through these posts, we have been able to make a case for designing evaluations to generate evidence useful for decision-making as well as for facilitating fair, deliberative processes for decision-making that can take account of the evidence generated.

at the very least, we hope that evaluators will recognize the importance of a fair process and will not stymie it in pursuit of the perfect research design.

*in Daniels’s work, which primarily focuses on national or large private health insurance plans, the regulative role of the state is clear. in cases of global development, involving several states and agencies, governance and regulation become less clear. noting this lack of clarity in global governance is hardly a new point; however, the idea of needing to enforce the conditions of fair processes and accountability for reasonableness provides a concrete example of the problem.

have evidence, will… um, erm (5 of 6, revisibility)

this is part of a series of joint posts with suvojit. it is also cross-posted at people, spaces, deliberation.

throughout this series of posts (1, 2, 3, 4), we have considered two main issues. first, how can evidence and evaluation be shaped to be made more useful – that is, directly usable – in guiding decision-makers to initiate, modify, scale-up, or drop a program? or, as recently pointed out by Jeff Hammer, how can we better evaluate opportunity costs between programs, to aid in making decisions? second, given that evidence will always be only part of a policy/programmatic decision, how can we ensure that decisions are made (and perceived to be made) fairly?

for such assurance, we primarily rely on Daniels’ framework for promoting “accountability for reasonableness” (A4R) among decision-makers. if the four included criteria are met, Daniels argues, it brings legitimacy to deliberative processes and, he further argues, consequent fairness to the decision and coherence to decisions over time.

the first two criteria set us up for the third: first, decision-makers agree ex ante to constrain themselves to relevant reasons (determined by stakeholders) in deliberation and, second, to make public the grounds for a decision after the deliberation. these first two, we argue, can aid organizational learning and coherence in decision-making by setting and using precedent over time – an issue that has been bopping around the blogosphere this week.

these criteria, and an approach ensuring A4R more generally, are also a partial response to increasing calls for donor transparency, made loudly in Mexico City this week via the Global Partnership for Effective Development Co-operation. these calls focus on the importance of public availability of data as the key ingredient of donor (and decision-maker) transparency. we concur on its importance. but we argue that it is incomplete without an inclusive process of setting relevant reasons for how those data are used (recognizing that they will always only be part of the process) and making the decision criteria public as well.

the publicity and transparency around decision-making opens the door for A4R’s third criterion (and the subject of this post): the possibility to appeal and revise decisions. as Daniels notes, this condition “closes the loop between decision-makers and those who are affected by their policies.”

as a quick reminder of our guiding scenario: we specifically focus on an agency deciding whether to sustain, scale, or shut down a given program after piloting it with an accompanying evaluation — commissioned explicitly to inform that decision.

in most decision-making of this kind, some stakeholders — often would-be beneficiaries — will not agree with the decision and may even feel, or be, adversely affected. while we suggest that stakeholders be involved in the earlier process of setting relevant reasons, a grievance-redressal or dispute-resolution mechanism, as provided by the revisibility criterion, gives these stakeholders an opportunity to voice their perspectives, based on the original grounds of the decision.

they can do this because the decision criteria are made public, via criterion 2. this “visible and public” space for further deliberation ensures stakeholders have a route “back into the policy formulation process.” stakeholders can use evidence available to them to advocate a certain way forward; it also allows stakeholders to revisit the decision-making criteria and the decisions they fostered. stakeholders therefore have the opportunity to make a case for a change in the decision.

why might past decisions be questioned? since the appeals process is largely based on the original decision criteria, appeals arise if circumstances around those reasons change. for example, in considering relevant reasons, feasibility was one category of criteria we proposed, such as a government’s capacity to scale a program or its interest in the program. one can imagine that over time, over changes in regime, and over changes in politics and policy, the original answers to these criteria could change, opening space for appeals. an additional set of proposed relevant reasons related to cost, effectiveness, and cost-effectiveness. the costs of technologies and materials may change over time, or fresh evidence could come out about the long-term benefits of programs. this alters the original cost-benefit ratio, again opening a space for appeals against the original decision.
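as a toy illustration of that last point, with invented numbers:

```python
# toy illustration with invented numbers: a falling input cost changes the
# cost-effectiveness calculation on which the original decision rested
effect_per_child = 0.10   # impact estimate from the original evaluation (in SD)
original_cost = 12.0      # USD per child at the time of the decision
new_cost = 6.0            # USD per child after, say, a technology price drop

print(original_cost / effect_per_child)  # 120.0 USD per unit of effect
print(new_cost / effect_per_child)       # 60.0 USD per unit of effect
# if the original decision rested on a threshold of, say, 100 USD per unit
# of effect, the program now clears it -- concrete grounds for an appeal
```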

such appeals may come from members of civil society (or government) that would like to see the program brought back to life (or to see it go away). these may also come from donors themselves wanting to look at their decision-making over time and implement changes in line with the changing context.

Daniels is careful to note, and we emphasize, that the power and purpose of this criterion is not that citizens will always overturn prior decisions.* decisions on limits are requisite, as needs generally outstrip resources. rather, the revisibility criterion allows for reconsideration of and reflection on those decisions by those knowledgeable about the topic and empowered to alter decisions, if seen fit and feasible. this can, Daniels notes, bring further legitimacy to decision-making processes and, again, improved decision-making over time.

we want to stress that these deliberations over decision-making and their ‘revisibility’ have to be situated in a rational and ethical decision-making framework, predicated on meeting needs fairly when not all can be met (distinct from, say, a legal framework). appeals will have to be judged on the original merits of the arguments as well as with recognition that aid resources have limits (although, obviously, a different argument can be made that aid budgets should simply be bigger). moreover, appeals need to be judged by people who understand the original decision and have the power to change it, if that is the decision taken. when decision-making criteria are set, they set the roadmap for a possible appeals process and should be discussed and agreed upon accordingly.

we started this series of posts by admitting the limited role evidence plays in decision-making — even when those commissioning evidence intend specifically to inform that decision. we considered how planning for decision-making can help in the production of more useful evidence and also how decisions can be made fairly, through the delineation of relevant reasons, the publicity of the decision criteria ultimately used, and now, the possibility of revisiting through criteria and revising decisions.

our thoughts in this series of posts should not make fair decision-making seem like an impossible task. not all aspects of each of these considerations can be taken into account – the constraints of the real world are not lost on us and A4R remains an ideal, though we think one that can be approached. in our final post of this series, we therefore attempt to close the loop by looking at enforcement – asking how these ideas can be enforced and decision-makers held accountable.

*see, e.g., Richard Horton’s recent slide about the limit-breaking decisions by courts and the effects on health care systems, as in cases like Colombia. experiments with health courts may be instructive. picture via @fanvictoria, citing @richardhorton1.


i’m not sure that means what you think it means (gold standard)

some thoughts, from peter byass, here, for the next time you want to refer to a technique as the ‘gold standard’ and what may be behind such a guarantee:

The verbal autopsy literature has extensively used and abused the concept of “gold standards” for validating cause of death determination. Metallurgists would say that 100% pure gold is an impossibility; the highest possible quality is normally certified as being 99.9% gold, while most of the quality-assured gold we encounter on an everyday basis ranges from 37% to 75% purity. It is perhaps also worth reflecting that 99% pure gold is an extremely soft and somewhat impractical material. Cause of death, on the spectrum of measurable biomedical phenomena, is also a somewhat soft commodity. For that reason, any approach to assessing cause of death involves alloying professional expertise with the best evidence in order to generate robust outcomes.

h/t jq

have evidence, will… um, erm? (4 of 6, going public)

this is a joint post with suvojit. it is also posted on people, spaces, deliberation.

in our last post, we discussed how establishing “relevant reasons” for decision-making ex ante may enhance the legitimacy and fairness of deliberations on resource allocation. we also highlighted that setting relevant decision-making criteria can inform evaluation design by clarifying what evidence needs to be collected.

we specifically focus on the scenario of an agency deciding whether to sustain, scale, or shut down a given program after piloting it with an accompanying evaluation — commissioned explicitly to inform that decision. our key foci are both how to make evidence useful in informing decisions and how, recognizing that evidence plays a minor role in decision-making, to ensure decision-making is done fairly.

for such assurance, we primarily rely on Daniels’ framework for promoting “accountability for reasonableness” (A4R) among decision-makers. if the four included criteria are met, Daniels argues, it will bring legitimacy to deliberations and, he further argues, consequent fairness to the decision.

in this post, we continue with the second criterion to ensure A4R: the publicity of decisions taken drawing on the first criterion, relevant reasons. we consider why transparency – that is, making decision criteria public – enhances the fairness and coherence of those decisions. we also consider what ‘going public’ means for learning.

disclaimer: logistical uncertainties / room for conversation and experimentation

from the outset, we acknowledge the many unanswered questions about how much publicity or transparency suffices for fairness and how to carry it out.

  • should all deliberations be opened to the public? made available ex post via transcripts or recordings? or is semi-transparency — explicitly and publicly announcing ex post the criteria deemed necessary and sufficient to take the final decision — acceptable, while the deliberation remains behind closed doors?
  • who is the relevant public?
  • can transparency be passive – making the information available to those who seek it out – or does fairness require a more active approach?
  • what does ‘available’ or ‘public’ mean in contexts of low-literacy and limited media access?

we do not address these questions — which are logistical and empirical as well as moral — here. as the first-order concern, we consider why this criterion matters.

 

fairness in specific decisions

any decision about resource allocation and limit-setting will be contrary to the preferences of some stakeholders – both those at and not at the decision table. in our scenario, for example, some implementers will have invested some quantity of blood, sweat, and tears into piloting a program and may, as a result, have opinions on whether the program should continue; or there will be those who were comfortable in their inaction (as a result of a lack of directives or funds or just plain neglect) and who will now have to participate in a scale-up. there will be participants who benefited during the pilot – and those who would have done so if the program were scaled – who may prefer to see the program maintained.

these types of unmet preferences shape Daniels’s central concern: what can an agency* say to those people whose preferences are not met by a decision to convince them that, indeed, the decision “seems reasonable and based on considerations that take… [their] welfare into account?”** being able to give acceptable explanations to stakeholders about a decision is central to fairness.

 

coherence across decisions

the acceptability of the criteria for a given decision contributes to the fairness of that decision. but the long-run legitimacy of decision-makers benefits from consistency and coherency in organizational policy. transparency, and the explicitness it requires, can foster this.

once reasons for a decision are made public, it is more difficult not to deal with similar cases similarly – the use of ‘precedent’ in judicial cases aptly illustrates this phenomenon. treating like as like is an important requirement of fairness. Daniels envisions that a series of explicated decisions can function as an organizational counterpart of ‘case law’. future decision-makers can draw on past deliberations to establish relevant reasons. deviations from past decisions would need to be justified by relevant reasons.

 

implications for learning, decision-making and evaluations

if all decision-makers acknowledge that, at least, the final reasons for their decisions will be publicly accessible, how might that change the way they commission an evaluation and set about using the evidence from it?

first, it should encourage a review of past deliberations to help determine currently relevant reasons. second, it might encourage decision-makers and evaluators to consider as relevant those reasons and measures that will be explainable and understandable to the public(s) when justifying their decisions.

  • in planning evaluations, decision-makers and researchers will have to consider the clarity of their methods of data collection and analysis — effectively, will it pass a ‘grandmother test’? moreover, does it pass such a test when that granny is someone affected by your allocative decision? remember the central question that makes this criterion necessary: what can an agency say to those whose preferences are not met by a decision to convince them that, indeed, the decision “seems reasonable and based on considerations that take… [their] welfare into account?”
  • there are reasons that decision-makers might shy away from transparency. in his work on health plans, Daniels notes that such organizations speculatively feared media and litigious attacks. in our pilot-and-evaluate scenario, some implementers may not be comfortable with publicizing pilots that may fail, or with raising the expectations of beneficiaries who are part of pilots.
  • the fear of failure may influence implementers; this may lead to low-risk/low-innovation pilots. again, this is an important consideration raised above, in the questions we did not answer: when and how much transparency suffices for fairness?

 

in our last blog, we stressed the importance of engaging stakeholders in setting ‘relevant reasons’ before a project begins, as a key step towards fair deliberative processes as well as a way of shaping evaluations to be useful for decision-making. ensuring publicity and transparency of the decision-making criteria strengthens the perception of a fair and reasonable process in individual cases and over time.

this also sets the stage for an appeals process, where stakeholders can use evidence available to them to advocate a certain way forward; it also allows for stakeholders to revisit the decision-making criteria and the decisions they fostered – the subject of our next post in this series.

***

*we note that donors don’t actually often have to answer directly to implementers and participants for their decisions. we do not, however, dismiss this as a terrible idea.

**we are explicitly not saying ‘broader’ welfare because we are not endorsing a strictly utilitarian view that the needs of some can be sacrificed if the greater good is enhanced, no matter where or how that good is concentrated.