gratitude.

though it feels far less monumental than perhaps it should, i have electronically submitted my thesis, which is a big milestone in calling the thing done or, more accurately, me degree-ed, regardless of how much more work there is to do.

.

i am sure i have forgotten many people, but just in case some people don’t actually get around to checking out the thesis itself: a profound but simple ‘thank you’. here are my acknowledgments:

.

a thesis seems like a lone and lonely process, with only data and tea (or stronger) to keep you company, right up until you realize how many people you have to thank. no matter how i’ve tried to keep tabs, i am sure i have forgotten people – if you know you played a role, please give yourself a pat on the back.

 .

this is an empirical dissertation based almost entirely on primary data, which would not exist without willing respondents. in tamale, this includes many private-sector retailers who gave their time to answer a lot of tiresome questions. these answers, in turn, would not have materialized without the long-standing support of a core survey team, with special thanks to abass adam yidana, damba mohammed majeed, and alidu osman tuunteya. n tuma. in accra, many people not only consented to be interviewed but have been patient guides and kept in touch and helped this thesis over its long trajectory. these include: george amofah, kwabena asante, dennis sena awitty, frank boateng, samuel boateng, alex dodoo, keziah malm, yuniwo nfor, louis nortey, daniel norgbedzie, elianne oei, ellen sam, sylvester segbaya. alex dodoo and daniel norgbedzie have gone above and beyond. there would literally be no words (or numbers) without you.

.

i would not have been in ghana without the trust and support of günther fink and julia raifman, and i would not have survived ghana without the moral, emotional, intellectual, and nutritional support and levity of becky antwi, slawa rokicki, mollie barnathan, liz schultz venable, pace phillips, suvojit chattopadhyay, usamatu salifu, salifu amidu, abubakari bukari, lindsey o’shaughnessy, lolo dessein, aqil esmail, michael polansky, sam polley, emmanuel okyere, and rachel strohm. innovations for poverty action-ghana provided much needed infrastructural support and connections; jeff mosenkis has egged me on from headquarters. nathan blanchet has been a guide on ghana and to this whole process.

 .

this thesis as a completed product would not exist without michael reich. from inspiring the ideas that went in, to being a (mostly) patient guide, to forcing me to articulate my own ideas beyond, ahem, “a fucking mess” into something that is hopefully readable and possibly even, with time, enjoyable: thank you. you’ve pulled me back from the brink more than once and words don’t suffice. i know sometimes your papers take up to thirty drafts; this has taken many more and you’ve been there throughout.

.

günther fink, jessica cohen, and barbara heil: thank you for keeping me in line and inspired. günther, your enthusiasm, and barbara (mom #2), your persistence, have made a huge difference.

 .

to the swapportive team of shahira ahmed, corrina moucheraud, pamela scorza, and elif yavuz: thanks for keeping me going on so many levels. corrina moucheraud, in particular, has listened to and read many ideas and drafts that constitute what follows, though with far less brevity than her counsel. elif, you’ve been there, reminding me that they don’t teach kingdon in europe and that anything i do with it had better be good.

 .

to an assortment of men in cambridge — thank you. john quattrochi, who helped with everything from surviving a wide variety of the perils of working abroad, to early engagement with ideas (“is that what you’re trying to say?”), to getting my defense in place, to making sure the final touches were set. peter rockers, for your early skepticism and patience. jeremy barofsky, for encouragement, even sometimes by example. guy harling, for answering every stupid question i could think of while only occasionally reminding me that there are no stupid questions, only stupid people. zubin shroff, for listening and read-throughs.

 .

victoria fan, livia montana, rifat hasan, and jen manne-goehler have been sounding boards of one sort or another at various times.

 .

to the team at the center for geographic analysis, in particular jeff blossom (near and far!) and sumeeta srinivasan: i would have been lost without you.

 .

jesse bump and ashley fox have constituted a political economy crisis unit and have pulled me together and pushed me forward on more than one occasion. thanks for being key stakeholders.

 .

thank you to an intellectually and emotionally supportive community in delhi, with particular thanks to payal hathi, james pickett, and suvojit chattopadhyay for suffering through chapter drafts. bhuvana anand, shreya ray, sangita vyas, urmy shukla, jessica pickett, diane coffey, dean spears, shagun sabarwal, and markus olapade have all engaged with these ideas and the ideas are better for it. subha ganguly shahi and avi kishore have come in with key moral support.

 .

michael schulman, ian reiley, and liz richardson contributed to this being readable. nikolaos zahariadis and owen barder strengthened ideas. catherine goodman, sarah tougher, melisse murray, prashant yadav, and nora petty have been stand-by and stand-up amfm resources. marcia inhorn and norm daniels have been important mentors and models.

 .

several coffeeshops and restaurants have provided clean, well-lighted places over the years: trident and render in boston; andala and voltage in cambridge; mike’s and swad in tamale; loulou’s beignets in the woodlands; and maison des desserts, coast café, and latitude in delhi. thank you for the tea refills and unhurried surface area. and seventh heaven in rishikesh for an extended stay and support.

.

for my family, thanks for understanding this whole ‘abroad’ thing as best as possible and, in particular, to aunt janet for patient engagement with early drafts of the manuscript.

 .

finally, a huge thank you to my parents for absolutely everything from the mundane to the massive, from the decision to travel to details to debates to disasters (real and imagined) to deadlines to drafts-upon-drafts to the defense — even though you almost certainly never wanted to know a thing about malaria policy in ghana. tusen takk.

.

chapter 1, for those curious about this thing we’ve built (all mistakes my own).

strategy testing: a start

thanks to craig valters, i was recently pointed towards a new case study in the asia foundation’s working politically in practice series, focused on a ‘new’ approach called strategy testing. overall, i am sympathetic to much of the approach, though since i believe it has much in common with prototyping, product design and refinement, reasonable service delivery, etc., i am not sure it is a wildly innovative take on what i think many people would already see as good practice (as also acknowledged on p. 14 of the paper). it is, nevertheless, on its way to being practical.

.

the approach and what i like

as i understand it, the approach has three key features.

  1. a commitment to a theory of change as truly a set of hypotheses or best-guesses at a strategy, and therefore a living product. embedded in this is a greater commitment to humility.
  2. better individual tracking (daily? weekly?) of external events, challenges faced, information received, and decisions taken.
  3. regular meetings (quarterly) of ‘program staff’ to review the theory of change and program approach and to refine as needed.

.

my sense is that the authors feel that the third point is the most radical of the suggestions they put forward. i disagree. i think it is point 2: having people take time out of their daily (“good”) work to document and reflect would represent a much bigger and more helpful change in the way development is practiced, and will probably require more intensive skill development. future work that documents this more subtle but fundamental shift and makes suggestions to improve practice would be very useful. it shouldn’t be ignored just because it is more mundane than the quarterly meetings at which an overhaul might happen.

.

overall, the approach represents an important commitment to continual learning as well as accountability in doing work that gets better and closer to success over time. it also moves a theory of change approach much closer to the center of practice, taking it down off the dusty shelf. the approach also raises important questions about funding cycles and the power of the program team to make adjustments (see p. 14, but this should be explored more). one of the most difficult things about adaptive programming, which i do not take up in this post, will be how to make adaptive budgeting available.

.

what needs refinement

  • no matter how flexible-iterative-adaptive-dynamic-intractable-complex-unpredictable-otherbuzzwords the problem, the program, and the management approach are, there seems to be nothing in this paper to suggest that, say, these strategy testing meetings could not happen on a regular, (gasp) planned basis. let’s push the anti-planning reaction only as far as it needs to go (more on this below).

.

  • be clear about what is flexible; not everything is or should be. with an approach like strategy testing, it will be important to not make it too easy to redefine successful results (talked about as ‘ultimate outcomes’ in the paper). this matters not just from an accountability perspective (achieving what you said you were going to achieve, even if by a different route or on a different timeline) but also because, presumably, there was some real conviction and merit behind the goals in the first place vis-a-vis development and world-a-better-place-ness (if there wasn’t, then it is an entirely different type of problem with which we are dealing).

.

this is a key concern i have with the ‘adaptation’ movement in general: indicators, pathways, strategies, understandings of the problems, and the goals are often problematized and discounted in one breath, which glosses over too much. if all goalposts are movable, it will be quite difficult to deem any program or strategy as simply unworthy of large resource outlay and let it go extinct.

.

in different parts of the paper, the authors say that “it is not possible to identify the outcomes and indicators at the outset of the program,” that “programs start with a broad articulation of the ultimate outcome,” and that “a precise plan of activities that will achieve results cannot be defined from the beginning.” i am more sympathetic to the framing of the second and third of these statements. the first statement seems to confuse humility with tabula rasa ignorance, which i don’t think helps move the conversation forward about how to do program planning better while also putting (structured) adaptation into practice.

.

  • define “program teams.” this term is used throughout the paper but it is hard to figure out who it includes, which has implications for how i feel about the approach, as it has implications for whose evidence and insight is deemed important. does it include front-line workers? office-based staff in the capital? if only the latter, the approach currently does not suggest how roadblocks and experiences and suggestions and feedback will be collected from the street level, yet surely this is critical to a holistic picture of events, roadblocks, and accomplishments — and therefore to choosing the path forward. the absence of semi-systematic feedback from front-line implementers, from intended beneficiaries, and from other stakeholders is problematic (which is distinct from saying all these people need to be physically in the room during strategy testing meetings).

.

  • the timeline and the ‘new information,’ ‘external changes,’ and ‘accomplishments and roadblocks’ seem out of sync. if the timeline is to be the key tool for daily or weekly reflection, it needs to move far beyond the sample provided in table 2 (acknowledging the potential for burdening program staff), which focuses on big-P political and high-level events. one question is who will be put in charge of documenting such changes, and how, whether through more regular interaction with stakeholders or more careful monitoring of the news as part of a monitoring strategy. a second and possibly more important question is how a timeline-type tool can be better aligned with the theory of change and require staff to engage with the assumptions therein on a more regular basis. can some of the burden on program staff be relieved if m&e (or mel or merl or whatever) teams do regular debriefing interviews with staff? drilling in on these practical, small details of how one might put strategy testing into practice would be hugely useful.

.

  • at times, ‘traditional monitoring’ (which itself could be better defined so it is even clearer what strategy testing is being contrasted with or being appended onto) is painted as anachronistic; yet it must still be used in a strategy testing approach. for example, on page 11, the authors note that “by taking multiple small bets and continuously monitoring results, program teams are able to adjust and refine” (emphasis added). this suggests to me that a core set of indicators that measure progress/results towards some ultimate outcome (traditional monitoring?) is likely in place for much of the project, a reality that sometimes gets lost in the thrust to position strategy testing as an alternative approach to monitoring. it seems like response-to-monitoring rather than monitoring itself is the bigger contribution of strategy testing and, again, sometimes this gets lost in the paper and buzzword barrage.

 

  • a key challenge raised on page 11 is not adequately addressed; the authors note: “whether a program strategy is worthy of continued investment may not be easy to decide.” more in-depth, ex ante discussion of just such decision points (see my series of blogs with suvojit, starting here), and of what information will be needed to take such decisions, is needed. these would need to be built into any monitoring plan, as part of the information needs for successful strategy testing. as is acknowledged in the paper, “it may be difficult for a team to accept that their strategy is not working and move on to something new, especially when they have invested heavily in that strategy.” this will make it all the more important to have up-front discussions about how to determine when something is not working (which relates to having clear, somewhat steady definitions of success).

.

i take away from this paper that being flexible requires planning and commitments, even though at times these are painted in a negative and out-of-sync tone. it requires more managerial planning and commitment: to finding time and tools and skills for reflection, and to agreeing early on about how strategic decisions will be made on the basis of the evidence gathered, who will weigh in on them, and how success will be defined even if different strategic approaches to achieving it are adopted. this is acknowledged at the end of the paper, in discussing the need for structure and discipline within (and to promote) flexibility. but it should be made much more central to marketing, refining, and disseminating the approach.

.

more generally, in the movement towards adaptive and flexible development work, we need to be careful about noting where the changes really need to happen (e.g. on monitoring itself, or on better tailoring monitoring to fit with decision-making needs, or on allowing time and scope to respond to monitoring findings) and where structure and planning are needed, making flexibility and structure/planning complementary rather than contrasting ideas.

more from #evalcon: program planning

disclaimer: i always get quite frustrated when people seem to be reinventing the wheel, especially when at least the contours of the wheel could be found with a reasonable literature review that was somewhat cross-disciplinary (i am pretty sure this is still a reasonable expectation… perhaps part of the problem is that literature is insufficiently open-access?)

.

i’ll be blunt: everyone should read just a little more before they speak, realize that they are not necessarily entering uncharted territory (including the realms of program planning, product design, & evaluation), hold off on declaring themselves great pioneers until they have checked that claim against the existing literature, and cite their sources.

.

program planning

from a lot of different corners, it seems that people involved in evaluation are suddenly ‘discovering’ that they may have a role to play in program planning and design, whether facilitating it or doing it more directly. this ranges from frequent topics of conversation at the recent #evalcon in kathmandu to smart policy design by ben olken and others.

.

it is a natural enough ‘discovery’ when involved in an evaluation that it may have been helpful if the evaluation team had been involved earlier — say, before the program was designed. that makes sense: folks doing an evaluation tend to get hung up on details that turn out to matter, like operationalizing key concepts and goalposts, clarifying who will do what, what that will look like and how long it will take, and so on. a lot of these details would show up in a well-done theory of change.

.

not only do people planning an evaluation ask these types of questions, they also fill a useful role as outsiders, clarifying language and ideas that insiders may take for granted (which raises interesting questions about the promises and pitfalls of internal evaluators, even well-trained ones, especially those taking on a learning as well as an accountability function).

.

it’s just that this link and role to planning is not a new discovery. i’ll give the example of the precede-proceed model, because i am familiar with it, but there are assuredly lots of models linking planning and evaluation in useful ways. i admittedly like some of the older illustrations of the precede-proceed model but respect that larry green has updated his figures and that i should move on (but if you’re curious, you can see the old ones if you search images for ‘green precede proceed’).

.

precede-proceed starts as too few programs and evaluations do: with a needs assessment, based on objective indicators (wealth, disease, etc.) as well as subjective indicators and interests. this helps to form both a statement of the problem and a set of targets for the evaluation to assess. this is an excellent time for those interested in participatory methods to employ them (rather than just employing the term ‘participatory’ whenever it makes you feel good), because this is when it really counts, for the evaluation and for program design itself: getting the focus right.

.

from here, a series of diagnostics can be carried out to look for the factors (facilitating and blocking) that perpetuate the current, unsatisfactory state of the world but also allow for positive deviance. this can be a process of asking ‘why’ five times, or of using other tools to look for the points at which a program or policy might intervene.

.

this can then be followed by a process of assessing the landscape of extant programs and policies and designing a new one, taking cues from product design, including the use of personae.

.

the evaluation may be broader than tracing these points backwards — the elements of the program or policy, the points of intervention, the different types of need identified — but these are effectively the building blocks for a well-aligned monitoring and evaluation strategy.

.

two points before moving on from the basic point that merging planning, design, and evaluation is charted territory:

  1. all of this suggests that people wanting to do good evaluation need to be better trained in the kinds of facilitating, mediating, needs assessing, and creative tasks implicated above.
  2. recognizing that design, implementation, & evaluation can all be part of the same process is not somehow the same as saying that it is magically/conveniently unimportant to report on implementation details in an evaluation. if anyone outside the core implementation team of a project (a government agency, say, or an NGO) assists in planning, training, facilitating, framing, or any component of implementation, this needs to be reported for the sake of transparency, proper interpretation, and potential reproducibility.

.

questions about independence

one of the major points echoed in the #evalcon session that i covered in my last post is that the independence and unbiasedness of evaluations are hugely important in enhancing the credibility of evaluative efforts among policy makers. a key challenge for anyone involved in the shifts considered in the first bit of this blog — evaluative folks thinking about getting involved early on in program design — is going to be how to instill and project the integrity and trustworthiness of evaluation while letting go a bit on strict independence, in the sense of remaining at arm’s length from the evaluation subject. to the extent that decision-makers and other stakeholders are a key audience, evaluators will be well served by taking the time to understand what they see as credible and convincing evidence.

.

thoughts from #evalcon on evidence uptake, capacity building

i attended a great panel today, hosted by the think tank initiative and idrc and featuring representatives from three of tti’s cohort of think tanks. this is part of the broader global evaluation week (#evalcon) happening in kathmandu, focused on building bridges: use of evaluation for decision making and policy influence. the notes on evidence uptake largely come from the session, while the notes on capacity building are my own musings inspired by the event.

.

one point made early on was to contrast evidence-informed decision-making with opinion-informed decision-making. i’ve usually heard the contrast painted as faith-based decision-making, and i think the opinion framing is useful. it also comes in handy for one of the key takeaways from the session, which is that maybe the point (and feasible goal) isn’t to do away with opinion-based decision-making but rather to make sure that opinions are increasingly shaped by rigorous evaluative evidence. or, to be more bayesian about it, we want decision-makers to continuously update their priors about different issues, drawing on evidence.
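
.

to make that metaphor slightly more concrete (a minimal sketch on my part, not something presented in the session): bayes’ rule describes exactly this kind of updating, where a decision-maker’s prior belief that a program works is revised in light of new evaluation evidence:

$$ p(\text{program works} \mid \text{new evidence}) \;=\; \frac{p(\text{new evidence} \mid \text{program works}) \cdot p(\text{program works})}{p(\text{new evidence})} $$

the stronger and more credible the evidence, the further the posterior moves from the prior; weak or ambiguous evidence leaves opinion roughly where it started.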

.

this leads to a second point. in focusing on policy influence, we may become too focused on influencing very specific decision-makers for very specific decisions. this may lead us to lose sight of the broader goal of (re-)shaping the opinions of a wide variety of stakeholders and decision-makers, even if not linked to the immediate policy or program under evaluation. so, again, the frame of shaping opinions and aiming for decision-maker/power-center rather than policy-specific influence may lead to altered approaches, goals, and benchmarks.

.

a third point that echoed throughout the panel is that policy influence takes time. new ideas need time to sink in and percolate before opinions are re-shaped. secretary suman prasad sharma of nepal noted that from a decision-maker point of view, evaluations are better and more digestible when they aim to build bit by bit. participants invoked a building blocks metaphor several times and contrasted it with “big bang” results. a related and familiar point about the time and timing required for evaluation to change opinions and shape decisions is that planning for the next phase of the program cycle generally begins midway through current programming. if evaluation is to inform this next stage of planning, it requires the communication of interim results — or a more thoughtful shift of the program planning cycle relative to monitoring and evaluation funding cycles in general.

.

a general point that came up repeatedly was what constitutes a good versus a bad evaluation. this leads to a key capacity-building point: we need more “capacity-building” to help decision-makers recognize credible, rigorous evidence and to mediate between conflicting findings. way too often, in my view, capacity-building ends up being about how particular methods are carried out, rather than about the central task of identifying credible methodologies and weighting the findings accordingly (or about broader principles of causal inference). that is, capacity-building among decision-makers needs to (a) start by understanding how they currently assess credibility (on the radical premise that capacity-building exercises might generate capacity on both sides) and (b) help them become better consumers, not producers, of evidence.

.

a point that surfaced continuously about how decision-makers assess evidence was about objectivity and neutrality. ‘bad evaluations’ are biased and opinionated; ‘good evaluations’ are objective. there is probably a much larger conversation to be had about parsing objectivity from independence and engagement as well as further assessment of how decision-makers assess neutrality and how evaluators might establish and signal their objectivity. as a musing: a particular method doesn’t guarantee neutrality, which can also be violated in shaping the questions, selecting the site and sample, and so on.

.

other characteristics of ‘good evaluation’ that came out included not confusing being critical with only being negative: findings about what is working are also appreciated. ‘bad evaluation’ assigns blame and accountability to particular stakeholders without taking a nuanced view of the context and events (internal and external) during the evaluation. ‘good evaluation’ involves setting evaluation objectives up front. ‘good evaluation’ also places the findings in the context of other evidence on the same topic; this literature/evidence review work, especially when it does not focus on a single methodology or discipline (and, yes, i am particularly alluding to RCT authors who tend to cite only other RCTs, at the expense of sectoral evidence and of other methodologies), is very helpful to a decision-making audience, as is helping to make sense of conflicting findings.

.

a final set of issues related to timing and transaction costs. a clear refrain throughout the panel was the importance of the timing of sharing the findings. this means paying attention to the budget-making cycle and sharing results at just the right moment. it means seeing windows of receptivity to evidence on particular topics, reframing the evidence accordingly, and sharing it with decision-makers and the media. it probably means learning a lot more from effective lobbyists. staying in tune with policy and media cycles in a given evaluation context is hugely time consuming. a point was made, and is well taken, that the transaction costs of this kind of staying-in-tune for policy influence are quite high for researchers. perhaps goals for influence by the immediate researchers and evaluators should be more modest, at least when shaping a specific decision was not the explicit purpose of the evaluation.

.

two indirect routes stand out. one is to communicate the findings clearly to, and do the necessary capacity-building with, naturally sympathetic decision-makers (say, parliamentarians or bureaucrats with an expressed interest in x issue) so that they become champions who keep the discussion going within decision-making bodies. to reiterate, my view is that capacity-building efforts should prioritize helping decision-makers become evidence champions and good communicators of specific evaluation and research findings. this is an indirect road to influence but an important one, leveraging the credibility decision-makers have with one another. the second, also indirect, is to communicate the findings clearly to, and do the necessary capacity-building with, the types of (advocacy? think tank?) organizations whose job is to track the timing of budget meetings, shifting political priorities, and local events to which the evidence can be brought to bear.

.

the happy closing point was that a little bit of passion in evaluation, even while trying to remain neutral and objective, does not hurt.


what i lost / terror

here’s a post that i’ve been half-meaning to write for a while. for some time, i thought i had said all i needed to say in writing some words for her memorial.

.

if the enormity of our – my – loss truly ever hits me, it will be through the small shared moments that can no longer be accumulated, even though skype tells me elif is only offline for now and gmail suggests that i may have meant to include her on my emails. with every absurd statement or mannerism over which we can’t exchange glances and snarky giggles or looks of outright disgust; with every annoyance or potentiality that can no longer be re-enacted and analysed over tea or wine; for every internet chat that no longer comes through filled with “hey lady”s, “:-)”s and “;-)”s and exclamation marks at precisely the needed moment and in precisely the needed amount; and with every glass of wine i order at grafton’s knowing that she won’t be pedaling up soon in 4-inch heels to join me. maybe in this succession of elif-shaped voids i will begin to grasp what has been stolen from me and from the world — through intolerance, the antithesis of all that elif believed.

.

she was fearless in her approach to life, fiercely loyal in her friendships, focused in her work and infectious when she laughed. she *is* a fiercely loyal friend, appreciative and incisively honest, a yogi with a sharp tongue but a sharper wit, short-tempered but with a heart big enough to always make it OK, a perfectionist wrapped up in layers of clashing-but-considered clothes and scarves and flowers and hats. she is one of the finest partners-in-crime anyone could ask for.

.

it’s been two years. but from time to time i still find myself etching a sentence or two in my mind. two 21st septembers have passed since the westgate mall shooting and i only managed to take a few sentences from my head and put them in a draft blog. i spilled a little red wine out in remembrance on the appropriate dates and at a recent wedding that i know would have pleased her. a few weeks ago an (academic) article made me cry, resulting in some of the writing below.

.

but i didn’t press ‘publish’ until watching the horrors of beirut and paris unfold across social media while sitting alone in a hotel in abuja, too connected and too separated and possibly with one too many heinekens. more dates. 9-11 and 9-21 and 12/11 and 13/11 and 26/11. too many dates. ‘a calendar’ as the noun of accumulation for ‘terror’.

.

which i guess is when it hit me. it isn’t about the dates or the symbols or even really the cities. it’s about what is lost every day, for all of us, because of acts of terror. i don’t walk into a mall anywhere in the world without thinking about elif and wanting to walk out immediately. every time i hear someone use one of the words or phrases elif and i deemed terrible, like “leaf peeping” (which people in new england insist on saying when they are going on a perfectly good outing to admire the autumn foliage) and “nibble” and “sequelae” (which particularly alarmed elif, and which she once sketched as a fearsome and carnivorous caterpillar-being), i want to write her immediately. i cannot, for the very specific reason of someone else’s hate and retribution. or statement.

.

wikipedia tells me there is no agreed upon definition of terrorism but that pre-french revolution usages relate to a spreading mind-set of terror or dread, before questions about being state-sponsored or not cluttered up contemporary efforts at pinning down the idea. i’m actually not sure whether a visceral, sensory definition lies in the subtle sense of dread and suspicion of people that results from such acts or the small dead space in your brain, like an amputation, that still tries to light up when you think of someone you can no longer write. a hyper-sensitivity and a numbness.

.

the best i can do now, or ever, is to remind the world what has been taken from them.

.

elif and i bonded over the sort of humor that does not amuse everyone and downright offends some people. on our first day of class together, in foundations of global health, the professor announced that some percentage of the world’s children would not enjoy their 5th birthdays. this is a euphemistic way of describing inequitable and horrifying under-5 mortality rates around the world, mostly from infectious disease, unhygienic surroundings, and poorly attended births. elif and i would not have been in a school of public health if we had thought the underlying subject matter humorous. but the phrasing still tickled us. the birthdays wouldn’t be enjoyed because of insufficiently grand party hats? not enough party guests? somehow the subject of the joke became timmy, and timmy and his failed birthday party were a recurring touchstone that got us through the two years till qualifying exams and three more years of school after that. and, hell, through elif (and ross) being on the verge of having their own child, traveling to nairobi from dar for just that purpose.

.

and so it was a few weeks ago, on a random day, that i found myself sobbing when reading lant pritchett’s blog on the end of kinky development, in which he declares that “no one has ever held an ‘i am over $1.25 a day’ party.” which seems like just the sort of party i would want to plan for timmy with elif.

.

which, again, i guess, is the point, if there is one, which i am never sure there is. my grief isn’t eiffel-tower shaped or cedar-tree shaped or red or white or green or blue. it’s pink and teal and elif-shaped. it doesn’t come on a particular date. it comes any time of the day or night when i want to write “elif, you won’t believe…” and can only think ‘fuck you’ to people i have never met.

teaching qualitative analysis: an intro

teaching qualitative analysis is not easy, for several reasons. first, an awful lot of material on doing qualitative research focuses on data collection. relatedly, then, a lot of academic papers that draw on qualitative data and analytic methods focus on data collection and organization. too often the use of analytic software stands in for an explanation of how the analysis was done.

.

second, lecturing on qualitative analysis is much like a powerpoint lecture on riding a bicycle. it sounds very easy (right foot down, then left foot down). it only gets hard when you try to do it.

.

nevertheless, a lecture must begin somewhere. i hope my notes, below, may prove useful to someone else.

.

despite my impulse to start with lincoln and guba’s paper, since this wasn’t an audience that spends all their time reading academic papers or thinking about theory, i started with a published qualitative piece as an example. i found one that focused on a similar data source and level (interviews with high-level stakeholders as opposed to, say, focus groups in a village or historical document review) as well as a similar stated analytic approach (in this case, this paper by smit et al. was a good match).

.

the intuition was that — even though this was not an audience entirely used to reading research outputs — it would be helpful to get a handle on the type of research product toward which we wanted to build before getting lost in the nitty gritty of analysis. with a slightly different audience, i probably would have made the lincoln & guba piece mandatory to provide a touchstone for considering and critiquing the paper and then for storyboarding our own paper.

.

after reviewing a few key terms central to doing qualitative work (sources of qualitative data (talk, text, observations, images), positionality, inductive reasoning, deductive reasoning, codes and coding), we spent much of the first day discussing and critiquing the paper.

.

first individually, then in pairs, and finally as a big group, we explored these questions:

  • what are the goals set out by the researchers for this project and paper?
  • why did a qualitative approach make sense to answer these questions?
  • what are the key conclusions the researchers draw from their analysis?
  • what types of data do the researchers use to support their conclusions?
  • what types of analyses do the researchers use to support their conclusions?
  • what is convincing about the link between the researchers’ results and their conclusions? could anything have been done to make this more convincing?
  • do the researchers achieve the goals they set out for themselves? why or why not? what could have been done differently?

.

then, as a larger group, we explored these additional questions (which are mostly notes to myself of topics to cover rather than a handout distributed to participants; the first set of questions i did distribute):

  • methods: data collection, organization, analysis
    • who were the data collectors? can we tell from the paper? how?
    • what was the positionality of the interviewers vis-à-vis the informants? what difference does this make?
    • what data collection strategy was used?
    • why do key informant interviews make sense as a data source given the research questions and goals?
    • were any other types of data used? how?
    • how were key informants chosen?
      • how many interviews were completed?
      • what does purposive sampling mean? snowball sampling?
      • how do the authors signal that the sample is representative of the relevant interests (i.e., what is thematic saturation or redundancy? what does this imply about the relationship between data collection, entry, and analysis in qualitative research?)? is this convincing? could it have been more convincing and if so, how?
    • how do the authors display their sample? is it helpful? what characteristics do they highlight and why? could it have been done better or differently?
    • what is a semi-structured interview guide? how does it differ from a completely structured or unstructured questionnaire or guide?
      • what types of questions did the researchers ask? how do we know? what else might we have liked to have known?
    • is there anything else we would have liked to have known about how the data were collected?
    • how were the data converted into transcripts? do the authors provide all the information we want on this?
    • what does it mean in this case that an inductive approach was used?
      • how did the researchers set about their induction? is this convincing?
        • what does it mean that key themes “emerged”?
      • what would have been different if the researchers had used a deductive approach? how would the analysis have changed? what would have been the trade-offs?
      • what did the authors actually do in analysis? do they provide us enough information to know?

.

  • results
    • how do the authors reassure us that the information from different stakeholders is used and presented in a balanced way? could this have been done differently or better? if so, how?
    • figure 1 is the main display of the (descriptive) results.
      • did you look at it carefully when reading the paper or did you skip over it?
      • where did the figure come from? what do the bullet points in each box represent?
      • is this figure meant to be descriptive or analytic?
      • what is helpful about this display? what could have been done differently?
    • how are the results in this paper organized?
      • how does the presentation of results relate to the research questions?
      • how are quotations used to communicate the results? is this effective? convincing?
      • how were the quotes selected? are they meant to be representative or exceptional? how do you know?
      • were conflicting or diverging viewpoints represented? how do you know?
      • do you feel the researchers have drawn reasonable inferences from the data?
      • do the conclusions follow from the data?
      • do you feel that the researchers already had the conclusions in mind before they analyzed the data? does this affect the convincing-ness of the analysis?

.

  • interpretations
    • what did the researchers do to make the present paper credible?
    • what did the researchers do to make the present paper balanced?
    • what did the researchers do to enhance transparency?
    • is the paper ultimately convincing? why/not?


revisiting maximum city amid delhi’s air pollution

earlier this week, a friend responded to this article on delhi’s pollution levels by reporting to facebook:

in the last week, 2 of my friends have moved back (one permanently & the other temporarily) to the states because of peak pollution levels. others are booking flights to leave the city for portions of the winter

it seems that most of the adaptations we strive towards are restricted to creating healthy spaces for ourselves amongst the pollution that most of the city’s residents cannot escape. 

.

what she is saying, and she is right, is that those of us who can and are staying in delhi are partially creating an air-istocracy. some of us are able to refine the very most public of goods — the air — for ourselves. a public good is by definition non-rivalrous and non-excludable. and yet we are working to make breathable air exclusive: in our flats, in our enclosed vehicles, in our office spaces, behind our masks.

.

this needs to change, lest we become confined to these bubbles and delhi becomes even less friendly to pedestrians and cyclists and generally to taking a stroll or letting in a bit of fresh air through the windows.

.

what i wrote on facebook, and stand by at the risk of being offensive, is this: in a considered and intentional, if provocative, turn of phrase meant to indicate violation or abuse without consent, delhi rapes my lungs on a daily basis.

.

this was intended to play on one of the major threats the outside world sees in living in delhi. the point is not to belittle violence against women experienced in delhi — which i have mercifully not experienced but which is a reality — or other forms of structural violence coped with on a daily basis. worrying about and living with these forms of violence wears people down to the point that they feel they can’t deal with something like the air. and so it goes undiscussed. but the air is not ignorable. it too is a form of violence and it needs to be addressed.

.

to tackle a problem of the common or public good is a challenge anywhere; it is deeply bound up with ideas of citizenship and the social contract, of paying taxes and the role of government and the space for activism. it requires a government that can impose regulations to protect public goods and it requires citizens to expect and demand this of their government, though it is not a commodity that can be handed out. it is about far more than putting up ‘clean city, green city’ signs (as, incidentally, tackling violence against women is about more than hanging up coasters in taxis and autos declaring (in english) that the vehicle respects women).

.

delhi and india, perhaps in particular, have a lot of work to do. i hope the world stays tuned and that india rises to the challenge. to close, i’ll allow someone else — mehta in maximum city — to muse on public goods in india (a passage that, ironically, i first read in the much cleaner air of rishikesh):

.

the flats in my building are spotlessly clean inside; they are swept and mopped every day, or twice every day. the public spaces – hallways, stairs, lobby, the building compound – are stained with betel spit; the ground is littered with congealed wet garbage, plastic bags, and dirt of human and animal origin. it is the same all over bombay, in rich and poor areas alike. this absence of a civic sense is something that everyone from the british to the hindu nationalists have drawn attention to, the national defect in the indian character (p. 138).

.

avoiding perversions of evidence-informed decision-making

*this is a joint post with suvojit, here.

.

avoiding “we saw the evidence and we made a decision…”

“…and that decision was: given that the evidence didn’t confirm our priors or show a program to be a success, to try to downplay and hide the evidence.”

.

before we dig into that statement (based-on-a-true-story-involving-people-like-us), we start with a simpler, obvious one: many people are involved in evaluations. we use the word ‘involved’ rather broadly. our central focus for this post is people who may block the honest presentation of evaluation results.

.

in any given evaluation of a program or policy, there are several groups of organizations and people with a stake in the results. most obviously, there are researchers and implementers. there are also participants. and, for much of the global development ecosystem, there are funders of the program, who may be separate from the funders of the evaluation. both sets of funders may work through sub-contractors and consultants, bringing yet others on board.

.

our contention is that not all of these actors are explicitly acknowledged in the current transparency movement in social science evaluation, with implications for the later acceptance and use of the results. the focus is often on a contract between researchers and evidence consumers as a sign that, in ben olken’s terms, researchers are not nefarious and power (statistically speaking)-hungry (2015). to achieve its objectives, the transparency movement requires more than committing to a core set of analyses ex ante (through pre-analysis or commitment-to-analysis plans) and study registration.

.

to make sure that research is conducted openly at all phases, transparency must include engaging all stakeholders — perhaps particularly those that can block the honest sharing of results. this is in line with, for example, EGAP’s third research principle on rights to review and publish results. we return to some ideas of how to encourage this at the end of the blog.

.

now, back to the opening statement, a subversion of the goal of evidence-informed decision-making. there are many interesting ways that stakeholders may try to dodge an honest sharing of results once they know what the results are. one is to claim that the public — whether officials or the general public — will not be able to make sense of the results, so anything confusing or, really, unexpected needs to be pruned from the public report. instead, all the not-as-hoped results can be relegated to internal, rather than public, learning.

.

decision-makers may indeed need brief synopses (written or otherwise) rather than being presented with a long report. different combinations and permutations of the evidence may be presented to different stakeholders using different modes of communication, in line with what is salient to them.

.

however, this is not a suitable excuse to fail to make the full set of findings public. moreover, an assessment of what stakeholders can or cannot interpret that fails to account for how they say they want to receive evidence misses a key point of participation and partnership. it may also reveal our (mis-)estimation of policymakers’ intelligence and of the complex policy challenges decision-makers encounter as part of their daily work.

.

we’ve talked elsewhere about committing to a decision process informed by evidence. in this post, we are after something even simpler: for key stakeholders to commit ex ante to making the results of a commissioned study public, irrespective of their priors regarding the intervention being studied. of course, the piece of research should be deemed technically sound. assuming that it is, the goal is to encourage the honest sharing of results regardless of the direction of the results.

.

in theory, everyone party to a good ex ante evaluation (and ex post, though there may be slightly less stakeholder engagement; or the degree of engagement could vary depending on the emerging results from the study) is aware that the results for the effect of an intervention on an outcome of interest can be as hoped, opposite, null, or otherwise mixed and confusing. in practice, everyone has a prior, which may involve not just an educated hypothesis but an emotional commitment to a particular outcome.

.

so what can help reduce the impulse and potential to cover up unexpected results?

1. better explanation of research processes and norms. in some cases, key actors within commissioning agencies may be initially enthusiastic about the idea of evaluation without fully understanding what it — and a measurement and results focus more generally — really entails. here, one often makes the mistake of focusing on agency capacity rather than the capacity of individuals within these agencies. by capacity, we refer not only to technical know-how of evaluation methods but also to familiarity with research processes and norms. disparity in capacity can lead to serious contradictions within the same agency in the way research findings are treated.

.

too often, though, efforts at “capacity-building” and other modes of education for individuals within agencies about evaluation focus on evaluation designs and analysis. this comes at the expense of explaining the research process, the variety of possible evaluation outcomes, and norms around transparent reporting of results. patrick dunleavy recently outlined the process of storyboarding research from the get-go to improve working in teams and to help visualize the end-product. such a process may be useful for a broader array of stakeholders than the research team, so that the whole process (the whole magic of “analysis and writing up”) can be made transparent. this represents a potentially softer, friendlier, and more feasible version than drafting the entire report in advance, as humphreys et al. attempted in their paper on fishing. it also may allow more of the process to be visible, rather than just the final reporting structure.

.

2. invest time in bringing all stakeholders to understand and agree with the research objectives and processes. several research studies (especially evaluations) have a committee of advisers to steer the process. these are critical stakeholders in addition to those that commission and carry out the research. ideally, all of those involved — including this committee of advisers — would reach a common understanding of the research objectives and methods to be followed. this would also include identifying policy messages from the study and engagement strategies.

.

however, common ground is sometimes elusive, as these wider groups do not always arrive early on at a fruitful working arrangement or a basic understanding of the research process. setting clearly understood objectives and a shared understanding of research processes may be time consuming but is invaluable when seen in the context of decision-making and transparency over research findings that may not match everyone’s priors.

.

3. formal commitment to results reporting across stakeholders. right now, commitments to analyses and results reporting exist between researchers and the public or, really, other researchers. but researchers are not the only ones determining the content of results reporting — and thus reporting requires additional sets of (public? formal? registered?) commitments. these could, like pre-analysis plans or commitment-to-analysis plans, take the form of committing to a core set of analyses and reporting on those results. they may also take the form of MOUs that are less technical than ex ante analysis plans but still represent a commitment to reporting a certain set of results regardless of the direction of those results.

.

in any case, the goal is to move the commitment from being between researchers (and perhaps mostly intelligible to researchers) to also involving study commissioners, other stakeholders with the power to block the publication of findings, and the public (such as the taxpayers funding the program).

.

4. early engagement with decision-makers. if decision-makers are a primary audience for the evaluation and if communicating to decision-makers is seen as a barrier to a complete, nuanced presentation of evaluation findings, then engaging with decision-makers early on may help. we recognise the time constraints of decision-makers and the importance of clarity in messaging. but the clarity of the presentation and the complexity of the results need not be zero-sum.

.

one way to reduce this tension and to better communicate complex or complicated findings to decision-makers is to engage them in the evaluation from the very beginning, so that the potential for nuanced findings can be gradually introduced. if faced with a passive policy audience at the end of an evaluation, whose only role has been to turn up to listen to research findings in a workshop, the space for taking in complexity, nuance, and caveats in messaging will be limited. but assuming that evaluation findings need to assert only simple findings and straightforward recommendations is hugely problematic, since we are talking about evaluations in social systems. as such, getting early buy-in and opening channels to gradually introduce results are important.

.

with these steps in place, chances are better that our based-on-a-true-story colleagues could have avoided the scenario that we referred to at the beginning of this post. an early commitment to the research processes and an agreement on the way forward would have helped prime key stakeholders to the possibility that research findings might be a mixed bag — which necessitates a nuanced dissemination strategy but not the burying of unfavorable results.


brief thoughts on tolerance from george washington via sarah vowell

there’s a lot of talk recently in various quarters about “tolerance” and who is and who is not — individually, nationally, etc. not all of it makes sense or rings true.

.

it was nice, then, to come across this snippet from a letter from george washington written after the war for the independence of the united states, in vowell‘s new lafayette in the somewhat united states. as with all things of the era, it reflects ideals rather than being a perfect mirror of reality, but it is in keeping with a whole ethos of working to become less imperfect. vowell writes:

after the war, in 1790, newport’s synagogue would go on to inspire one of washington’s finer moments as a president and a person. responding to a letter from touro’s moses seixas, who asked the president if ratifying the bill of rights was, to paraphrase, good for jews.

.

washington would send a letter addressed to the hebrew congregation at newport. the first amendment, he explained, exposed tolerance as a sham, because tolerance implies one superior group of people deigning to put up with their inferiors.

.

‘it is now no more that toleration is spoken of as if it were by the indulgence of one class of people that another enjoyed the exercise of their inherent natural rights,’ washington wrote. ‘for, happily, the government of the united states… gives to bigotry no sanction, to persecution no assistance.’

.

emphasis added. the book (30 pages shy of the end) is recommended.

draft thoughts on showing the work over time in a theory of change (comments welcome)

in draft work with vegard iversen (see here), we have been developing some ideas around using (and showing) both ex ante and ex post theories of change. this is partly in line with a learning agenda for theories of change (as outlined in valters’ recent work here and in my follow-up here, among other places). a learning agenda includes internal learning but also, and equally importantly in my view, a commitment to making learning public.

.

the overall argument we advance in the paper relates to bringing theory — both formal and programmatic — to the center of making claims about the external validity of evaluation findings. a small piece of this argument relates to reporting. i’d certainly be interested in views on the below, which is a slightly modified version of the text as it currently stands. i feel that much of the received wisdom is that higher and higher levels of abstraction are what make research findings portable across settings. and yet, in many instances of discussing external validity (such as here), the conversation inevitably veers towards mixed methods.

.

Thicker (Geertz 1972) and richer descriptions of settings and implementation processes — linked with the ToC and its critical assumptions — will facilitate learning. The more deeply researchers can probe the local — and the more detail authors use to describe it — the more the reader can try to assess the potential for generalization. Thickness, not thinness, helps users of evidence make this assessment in light of their own setting.

.

Indeed, Lincoln and Guba denote “thick description” as the main way of allowing results to be transferred, calling for “narrative developed about the [setting so that] judgments about the degree of fit or similarity may be made by others who wish to apply all or part of the findings elsewhere” (Lincoln and Guba 1986).

.

We believe that similar concerns are what motivated Woolcock to close his paper on external validity with a call for more case study work (Woolcock 2013). Honesty about problems encountered and tweaks to implementation processes and quality made as a result of experiential learning and adaptation will facilitate lesson learning both ‘there’ and ‘here.’

.

The Theory of Change, a central premise of our arguments about taking external validity seriously, can provide a useful tool for structuring reporting and lesson learning. Said more plainly, we want to have ToCs play equally important roles at the beginning of a study and at the end. In his review of how Theories of Change are used, Valters writes: “as far as I know, there is no particular tool to go back and see if a ToC is right or not.” This lack of critical reflection clearly does not support lesson learning, either ‘here’ or ‘there.’

.

To this end, we put forward the idea that careful ex ante assembly and ex post refinement of a Theory of Change will assist studies in rigorously increasing their potential for external validity. Such reporting, perhaps particularly when structured by the ToC, would reflect the learning that happens between the design phase and the end of data collection and analysis (Pritchett, Samji, and Hammer 2012).