thoughts from #evalcon on evidence uptake, capacity building

i attended a great panel today, hosted by the think tank initiative and idrc and featuring representatives from three of tti’s cohort of think tanks. this is part of the broader global evaluation week (#evalcon) happening in kathmandu and focused on building bridges: use of evaluation for decision making and policy influence. the notes on evidence-uptake largely come from the session while the notes on capacity building are my own musings inspired by the event.

.

one point early on was to contrast evidence-informed decision-making with opinion-informed decision-making. i’ve usually heard the contrast painted as faith-based decision-making and think the opinion framing was useful. it also comes in handy for one of the key takeaways from the session, which is that maybe the point (and feasible goal) isn’t to do away with opinion-based decision-making but rather to make sure that opinions are increasingly shaped by rigorous evaluative evidence. or to be more bayesian about it, we want decision-makers to continuously update their priors about different issues, drawing on evidence.

.

this leads to a second point. in focusing on policy influence, we may become too focused on influencing very specific decision-makers for very specific decisions. this may lead us to lose sight of the broader goal of (re-)shaping the opinions of a wide variety of stakeholders and decision-makers, even if not linked to the immediate policy or program under evaluation. so, again, the frame of shaping opinions and aiming for decision-maker/power-center rather than policy-specific influence may lead to altered approaches, goals, and benchmarks.

.

a third point that echoed throughout the panel is that policy influence takes time. new ideas need time to sink in and percolate before opinions are re-shaped. secretary suman prasad sharma of nepal noted that from a decision-maker point of view, evaluations are better and more digestible when they aim to build bit by bit. participants invoked a building blocks metaphor several times and contrasted it with “big bang” results. a related and familiar point about the time and timing required for evaluation to change opinions and shape decisions is that planning for the next phase of the program cycle generally begins midway through current programming. if evaluation is to inform this next stage of planning, it requires the communication of interim results — or a more thoughtful shift of the program planning cycle relative to monitoring and evaluation funding cycles in general.

.

a general point that came up repeatedly was what constitutes a good versus a bad evaluation. this leads to a key capacity-building point: we need more “capacity-building” to help decision-makers recognize credible, rigorous evidence and to mediate between conflicting findings. way too often, in my view, capacity-building ends up being about how particular methods are carried out, rather than on the central task of identifying credible methodologies and weighting the findings accordingly (or on broader principles of causal inference). that is, capacity-building among decision-makers needs to (a) understand how they currently assess credibility (on a radical premise that capacity-building exercises might generate capacity on both sides) and (b) help them become better consumers, not producers, of evidence.

.

a point that surfaced continuously about how decision-makers assess evidence was about objectivity and neutrality. ‘bad evaluations’ are biased and opinionated; ‘good evaluations’ are objective. there is probably a much larger conversation to be had about parsing objectivity from independence and engagement as well as further assessment of how decision-makers assess neutrality and how evaluators might establish and signal their objectivity. as a musing: a particular method doesn’t guarantee neutrality, which can also be violated in shaping the questions, selecting the site and sample, and so on.

.

other characteristics of ‘good evaluation’ that came out included not confusing being critical with only being negative. findings about what is working are also appreciated. ‘bad evaluation’ assigns blame and accountability to particular stakeholders without taking a nuanced view of the context and events (internal and external) during the evaluation. ‘good evaluation’ involves setting eval objectives up front. ‘good evaluation’ also places the findings in the context of other evidence on the same topic; this literature/evidence review work, especially when it does not focus on a single methodology or discipline (and, yes, i am particularly alluding to RCT authors who tend to only cite other RCTs, at the expense of sectoral evidence and simply other methodologies), is very helpful to a decision-making audience, as is helping to make sense of conflicting findings.

.

a final set of issues related to timing and transaction costs. a clear refrain throughout the panel was the importance of the timing of sharing the findings. this means paying attention to the budget-making cycle and sharing results at just the right moment. it means seeing windows of receptivity to evidence on particular topics, reframing the evidence accordingly, and sharing it with decision-makers and the media. it probably means learning a lot more from effective lobbyists. staying in tune with policy and media cycles in a given evaluation context is hugely time-consuming. a point was made, and is well-taken, that the transaction costs of this kind of staying-in-tune for policy influence are quite high for researchers. perhaps goals for influence by the immediate researchers and evaluators should be more modest, at least when shaping a specific decision was not the explicit purpose of the evaluation.

.

two indirect routes seem more feasible. one is to communicate the findings clearly to, and to do necessary capacity-building with, naturally sympathetic decision-makers (say, parliamentarians or bureaucrats with an expressed interest in x issue) so that they become champions who keep the discussion going within decision-making bodies. to reiterate, my view is that capacity-building efforts should prioritize helping decision-makers become evidence champions and good communicators of specific evaluation and research findings. this is an indirect road to influence but an important one, leveraging the credibility of decision-makers with one another. two, also indirect, is to communicate the findings clearly to, and to do necessary capacity-building with, the types of (advocacy? think tank?) organizations whose job is to focus on the timing of budget meetings and shifting political priorities and local events on which the evidence can be brought to bear.

.

the happy closing point was that a little bit of passion in evaluation, even while trying to remain neutral and objective, does not hurt.


what i lost / terror

here’s a post that i’ve been half-meaning to write for a while. for some time, i thought i had said all i needed to say in writing some words for her memorial.

.

if the enormity of our – my – loss truly ever hits me, it will be through the small shared moments that can no longer be accumulated, even though skype tells me elif is only offline for now and gmail suggests that i may have meant to include her on my emails. with every absurd statement or mannerism over which we can’t exchange glances and snarky giggles or looks of outright disgust; with every annoyance or potentiality that can no longer be re-enacted and analysed over tea or wine; for every internet chat that no longer comes through filled with “hey lady”s, “:-)”s, and “;-)”s and exclamation marks at precisely the needed moment and in precisely the needed amount; and with every glass of wine i order at grafton’s knowing that she won’t be pedaling up soon in 4-inch heels to join me. maybe in this succession of elif-shaped voids i will begin to grasp what has been stolen from me and from the world, through intolerance, the antithesis of all that elif believed.

.

she was fearless in her approach to life, fiercely loyal in her friendships, focused in her work and infectious when she laughed. she *is* a fiercely loyal friend, appreciative and incisively honest, a yogi with a sharp tongue but a sharper wit, short-tempered but with a heart big enough to always make it OK, a perfectionist wrapped up in layers of clashing-but-considered clothes and scarves and flowers and hats. she is one of the finest partners-in-crime anyone could ask for.

.

it’s been two years. but from time to time i still find myself etching a sentence or two in my mind. two 21st septembers have passed since the westgate mall shooting and i only managed to take a few sentences from my head and put them in a draft blog. i spilled a little red wine out in remembrance on the appropriate dates and at a recent wedding that i know would have pleased her. a few weeks ago an (academic) article made me cry, resulting in some of the writing below.

.

but i didn’t press ‘publish’ until watching the horrors of beirut and paris unfold across social media while sitting alone in a hotel in abuja, too connected and too separated and possibly with one too many heinekens. more dates. 9-11 and 9-21 and 12/11 and 13/11 and 26/11. too many dates. ‘a calendar’ as the noun of accumulation of ‘terror’.

.

which i guess is when it hit me. it isn’t about the dates or the symbols or even really the cities. it’s about what is lost every day, for all of us, because of acts of terror. i don’t walk into a mall anywhere in the world without thinking about elif and wanting to walk out immediately. every time i hear someone use one of the words or phrases elif and i deemed terrible, like “leaf peeping” (which people in new england insist on saying when they are going on a perfectly good outing to admire the autumn foliage) and “nibble” and “sequelae” (which particularly alarmed elif and which she sketched once as a fearsome and carnivorous caterpillar-being), i want to write her immediately. i cannot, for the very specific reason of someone else’s hate and retribution. or statement.

.

wikipedia tells me there is no agreed upon definition of terrorism but that pre-french revolution usages relate to a spreading mind-set of terror or dread, before questions about being state-sponsored or not cluttered up contemporary efforts at pinning down the idea. i’m actually not sure whether a visceral, sensory definition lies in the subtle sense of dread and suspicion of people that results from such acts or the small dead space in your brain, like an amputation, that still tries to light up when you think of someone you can no longer write. a hyper-sensitivity and a numbness.

.

the best i can do now, or ever, is to remind the world what has been taken from them.

.

elif and i bonded over the sort of humor that does not amuse everyone and downright offends some people. our first day of class together, in foundations of global health, the professor announced that some percentage of the world’s children would not enjoy their 5th birthdays. this is a euphemistic way of describing inequitable and horrifying under-5 mortality rates around the world, mostly from infectious disease, unhygienic surroundings, and poorly attended births. elif and i would not have been in a school of public health if we thought the underlying subject matter humorous. but the phrasing still tickled us. the birthdays wouldn’t be enjoyed because of insufficiently grand party hats? not enough party guests? somehow the subject of the joke became timmy, and timmy and his failed birthday party were a recurring touchstone that got us through the two years till qualifying exams and three more years of school after that. and, hell, through elif (and ross) being on the verge of having their own child, traveling to nairobi from dar for just that purpose.

.

and so it was a few weeks ago, on a random day, that i found myself sobbing when reading lant pritchett’s blog on the end of kinky development, in which he declares that “no one has ever held an ‘i am over $1.25 a day’ party.” which seems like just the sort of party i would want to plan for timmy with elif.

.

which, again, i guess, is the point, if there is one, which i am never sure there is. my grief isn’t eiffel-tower shaped or cedar-tree shaped or red or white or green or blue. it’s pink and teal and elif-shaped. it doesn’t come on a particular date. it comes any time of the day or night when i want to write “elif, you won’t believe…” and can only think ‘fuck you’ to people i have never met.

teaching qualitative analysis: an intro

teaching qualitative analysis is not easy for several reasons. first, an awful lot of material on doing qualitative research focuses on data collection. relatedly, a lot of academic papers that draw on qualitative data and analytic methods focus on data collection and organization. too often the use of analytic software stands in for an explanation of how the analysis was done.

.

second, lecturing on qualitative analysis is much like a powerpoint lecture on riding a bicycle. it sounds very easy (right foot down, then left foot down). it only gets hard when you try to do it.

.

nevertheless, a lecture must begin somewhere. i hope my notes, below, may prove useful to someone else.

.

despite my impulse to start with lincoln and guba’s paper, since this wasn’t an audience that spends all their time reading academic papers or thinking about theory, i started with an example published qualitative piece. i found one that focused on a similar data source and level (interviews with high-level stakeholders as opposed to, say, focus groups in a village or historical document review) as well as a similar stated analytic approach (in this case, this paper by smit et al. was a good match).

.

the intuition was that — even though this was not an audience entirely used to reading research outputs — it would be helpful to get a handle on the type of research product toward which we wanted to build before getting lost in the nitty gritty of analysis. with a slightly different audience, i probably would have made the lincoln & guba piece mandatory to provide a touchstone for considering and critiquing the paper and then for storyboarding our own paper.

.

after reviewing a few key terms central to doing qualitative work (sources of qualitative data (talk, text, observations, images), positionality, inductive reasoning, deductive reasoning, codes and coding), we spent much of the first day discussing and critiquing the paper.

.

first, individually and then in pairs, and then finally as a big group, we explored these questions:

  • what are the goals set out by the researchers for this project and paper?
  • why did a qualitative approach make sense to answer these questions?
  • what are the key conclusions the researchers draw from their analysis?
  • what types of data do the researchers use to support their conclusions?
  • what types of analyses do the researchers use to support their conclusions?
  • what is convincing about the link between the researchers’ results and their conclusions? could anything have been done to make this more convincing?
  • do the researchers achieve the goals they set out for themselves? why or why not? what could have been done differently?

.

then, as a larger group, we explored these additional questions (which are mostly notes to myself of topics to cover rather than a handout distributed to participants; the first set of questions i did distribute):

  • methods: data collection, organization, analysis
    • who were the data collectors? can we tell from the paper? how?
    • what was the positionality of the interviewers vis-à-vis the informants? what difference does this make?
    • what data collection strategy was used?
    • why do key informant interviews make sense as a data source given the research questions and goals?
    • were any other types of data used? how?
    • how were key informants chosen?
      • how many interviews were completed?
      • what does purposive sampling mean? snowball sampling?
      • how do the authors signal that the sample is representative of the relevant interests (i.e., what is thematic saturation or redundancy? what does this imply about the relationship between data collection, entry, and analysis in qualitative research?)? is this convincing? could it have been more convincing and if so, how?
    • how do the authors display their sample? is it helpful? what characteristics do they highlight and why? could it have been done better or differently?
    • what is a semi-structured interview guide? how does it differ from a completely structured or unstructured questionnaire or guide?
      • what types of questions did the researchers ask? how do we know? what else might we have liked to have known?
    • is there anything else we would have liked to have known about how the data were collected?
    • how were the data converted into transcripts? do the authors provide all the information we want on this?
    • what does it mean in this case that an inductive approach was used?
      • how did the researchers set about their induction? is this convincing?
        • what does it mean that key themes “emerged”?
      • what would have been different if the researchers had used a deductive approach? how would the analysis have changed? what would have been the trade-offs?
      • what did the authors actually do in analysis? do they provide us enough information to know?

.

  • results
    • how do the authors reassure us that the information from different stakeholders is used and presented in a balanced way? could this have been done differently or better? if so, how?
    • figure 1 is the main display of the (descriptive) results.
      • did you look at it carefully when reading the paper or did you skip over it?
      • where did the figure come from? what do the bullet points in each box represent?
      • is this figure meant to be descriptive or analytic?
      • what is helpful about this display? what could have been done differently?
    • how are the results in this paper organized?
      • how does the presentation of results relate to the research questions?
      • how are quotations used to communicate the results? is this effective? convincing?
      • how were the quotes selected? are they meant to be representative or exceptional? how do you know?
      • were conflicting or diverging viewpoints represented? how do you know?
      • do you feel the researchers have drawn reasonable inferences from the data?
      • do the conclusions follow from the data?
      • do you feel that the researchers already had the conclusions in mind before they analyzed the data? does this affect the convincing-ness of the analysis?

.

  • interpretations
    • what did the researchers do to make the present paper credible?
    • what did the researchers do to make the present paper balanced?
    • what did the researchers do to enhance transparency?
    • is the paper ultimately convincing? why or why not?


revisiting maximum city amid delhi’s air pollution

earlier this week, a friend responded to this article on delhi’s pollution levels by reporting to facebook:

in the last week, 2 of my friends have moved back (one permanently & the other temporarily) to the states because of peak pollution levels. others are booking flights to leave the city for portions of the winter

it seems that most of the adaptations we strive towards are restricted to creating healthy spaces for ourselves amongst the pollution that most of the city’s residents cannot escape. 

.

what she is saying, and she is right, is that those of us who can stay and are staying in delhi are partially creating an air-istocracy. some of us are able to refine the very most public of goods, the air, for ourselves. a public good is by definition non-rivalrous and non-excludable. and yet we are working to make breathable air exclusive: in our flats, in our enclosed vehicles, in our office spaces, behind our masks.

.

this needs to change, lest we become confined to these bubbles and delhi becomes even less friendly to pedestrians and cyclists and generally to taking a stroll or letting in a bit of fresh air through the windows.

.

what i wrote on facebook, and i stand by at risk of being offensive, is this: in a considered and intentional, if provocative, turn of phrase to indicate violation or abuse without consent, delhi rapes my lungs on a daily basis.

.

this was intended to play on one of the major threats the outside world sees about living in delhi. the point is not to belittle violence against women experienced in delhi (which i have mercifully not experienced but which is a reality) or other forms of structural violence coped with on a daily basis. worrying about and living with these forms of violence wears people down to the point that they feel they can’t deal with something like the air. and so it goes undiscussed. but the lack of clean air is not ignorable. it is a form of violence and it needs to be addressed.

.

to tackle a problem of the common or public good is a challenge anywhere; it is deeply bound up with ideas of citizenship and the social contract, of paying taxes and the role of government and the space for activism. it requires a government that can impose regulations to protect public goods and it requires citizens to expect and demand this of their government, though it is not a commodity that can be handed out. it is about far more than putting up ‘clean city, green city’ signs (as, incidentally, tackling violence against women is about more than hanging up coasters in taxis and autos declaring (in english) that the vehicle respects women).

.

delhi and india, perhaps in particular, have a lot of work to do. i hope the world stays tuned and that india rises to the challenge. to close, i’ll allow someone else (mehta, in maximum city) to muse on public goods in india (a passage that, ironically, i first read in the much cleaner air of rishikesh):

.

the flats in my building are spotlessly clean inside; they are swept and mopped every day, or twice every day. the public spaces – hallways, stairs, lobby, the building compound – are stained with betel spit; the ground is littered with congealed wet garbage, plastic bags, and dirt of human and animal origin. it is the same all over bombay, in rich and poor areas alike. this absence of a civic sense is something that everyone from the british to the hindu nationalists have drawn attention to, the national defect in the indian character (p. 138).

.

.

avoiding perversions of evidence-informed decision-making

*this is a joint post with suvojit, here.

.

avoiding “we saw the evidence and we made a decision…”

“…and that decision was: given that the evidence didn’t confirm our priors or show a program to be a success, to try to downplay and hide the evidence.”

.

before we dig into that statement (based-on-a-true-story-involving-people-like-us), we start with a simpler, obvious one: many people are involved in evaluations. we use the word ‘involved’ rather broadly. our central focus for this post is people who may block the honest presentation of evaluation results.

.

in any given evaluation, there are several groups of organizations and people with a stake in the evaluation of a program or policy. most obviously, there are researchers and implementers. there are also participants. and, for much of the global development ecosystem, there are funders of the program, who may be separate from the funders of the evaluation. both of these may work through sub-contractors and consultants, bringing yet others on board.

.

our contention is that not all of these actors are explicitly acknowledged in the current transparency movement in social science evaluation, with implications for the later acceptance and use of the results. the current focus is often on a contract between researchers and evidence consumers as a sign that, in ben olken’s terms, researchers are not nefarious and power (statistically speaking)-hungry (2015). to achieve its objectives, the transparency movement requires more than committing to a core set of analyses ex ante (through pre-analysis or commitment to analysis plans) and study registration.

.

to make sure that research is conducted openly at all phases, transparency must include engaging all stakeholders — perhaps particularly those that can block the honest sharing of results. this is in line with, for example, EGAP’s third research principle on rights to review and publish results. we return to some ideas of how to encourage this at the end of the blog.

.

now, back to the opening statement, a subversion of the goal of evidence-informed decision-making. there are many interesting ways that stakeholders may try to dodge an honest sharing of results once they know what the results are. one is to claim that the public (whether those in office or the general public) will not be able to make sense of the results, so anything confusing, or, really, unexpected, needs to be pruned from the public report. instead, all the not-as-hoped results can be relegated to internal, rather than public, learning.

.

decision-makers may indeed need brief synopses (written or otherwise) rather than being presented with a long report. different combinations and permutations of the evidence may be presented to different stakeholders using different modes of communication, in line with what is salient to them.

.

however, this is not a suitable excuse to fail to make the full set of findings public. moreover, an assessment of what stakeholders can/not interpret that fails to account for how they say they want to receive evidence misses a key point of participation and partnership. it might reveal our (mis-)estimation of the policymaker’s intelligence and of the complex policy challenges decision-makers encounter as part of their daily work.

.

we’ve talked elsewhere about committing to a decision process informed by evidence. in this post, we are after something even more simple: for key stakeholders to commit ex ante to making the results of a commissioned study public, irrespective of their respective priors regarding the intervention being studied. of course, the piece of research should be deemed as technically sound. assuming that it is, the goal is to encourage the honest sharing of results regardless of the direction of the results.

.

in theory, everyone party to a good ex ante evaluation (and ex post, though there may be slightly less stakeholder engagement; or the degree of engagement could vary depending on the emerging results from the study) is aware that the results for the effect of an intervention on an outcome of interest can be as hoped, opposite, null, or otherwise mixed and confusing. in practice, everyone has a prior, which may involve not just an educated hypothesis but an emotional commitment to a particular outcome.

.

so what can help reduce the impulse and potential to cover up unexpected results?

1. better explanation of research processes and norms. in some cases, key actors within commissioning agencies may be initially enthusiastic about the idea of evaluation without fully understanding what it — and a measurement and results focus more generally — really entails. here, one often makes the mistake of focusing on agency-capacity, rather than the capacity of individuals within these agencies. by capacity, we refer not only to technical know-how of evaluation methods but also familiarity with research processes and norms. disparity in capacity can lead to serious contradictions within the same agency on the way research findings are treated.

.

too often, though, efforts at “capacity-building” and other modes of education for individuals within agencies about evaluation focus on evaluation designs and analysis. this comes at the expense of explaining the research process, the variety of possible evaluation outcomes, and norms around transparent reporting of results. patrick dunleavy recently outlined the process of storyboarding research from the get-go to improve working in teams and helping to visualize the end-product. such a process may be useful for a broader array of stakeholders than the research team, so that the whole process (the whole magic of “analysis and writing up”) can be made transparent. this represents a potentially softer, friendlier and more feasible version than drafting the entire report in advance, as humphreys et al. attempted in their paper on fishing. it also may allow more of the process to be visible, rather than just the final reporting structure.

.

2. invest time in bringing all stakeholders to understand and agree with the research objectives and processes. several research studies (especially evaluations) have a committee of advisers to steer the process. these are critical stakeholders in addition to those that commission and carry out the research. ideally, all of those involved — including this committee of advisers — would reach a common understanding of the research objectives and methods to be followed. this would also include identifying policy messages from the study and engagement strategies.

.

however, common ground is sometimes elusive, as these wider groups do not arrive early on at a fruitful working arrangement or basic understanding of the research process. setting clearly understood objectives and a shared understanding of research processes may be time consuming but is invaluable when seen in the context of decision-making and transparency over research findings that may not match everyone’s priors.

.

3. formal commitment to results reporting across stakeholders. right now commitments to analyses and results reporting exist between researchers and the public or, really, other researchers. but researchers are not the only ones determining the content of results reporting, and thus reporting requires additional sets of (public? formal? registered?) commitments. these could, like pre-analysis plans or commitment-to-analysis plans, take the form of committing to a core set of analyses and reporting on these results. they may also take the form of MOUs that are less technical than ex ante analysis plans but still represent a commitment to reporting a certain set of results regardless of the direction of those results.

.

in any case, the goal is to move the commitment from being between researchers (and perhaps mostly intelligible to researchers) to also involving study commissioners, other stakeholders with the power to block the publication of findings, and the public (such as the public paying the taxes to fund the program).

.

4. early engagement with decision-makers. if decision-makers are a primary audience for the evaluation and if communicating to decision-makers is seen as a barrier to a complete, nuanced presentation of evaluation findings, then engaging with decision-makers early on may help. we recognise the time constraints of decision-makers and the importance of clarity in messaging. but the clarity of presentation and the complicatedness of the results need not be zero-sum.

.

one way to reduce this tension and to better communicate complex or complicated findings to decision-makers is to engage them in the evaluation from the very beginning, so that the potential for nuanced findings can be gradually introduced. if faced with a passive policy audience at the end of an evaluation, whose only role has been to turn up to listen to research findings in a workshop, the space for taking in complexity, nuance, and caveats in messaging will be limited. but assuming that evaluation findings need to assert only non-complex findings and straightforward recommendations is hugely problematic since we are talking about evaluations in social systems. as such, getting early buy-in and opening channels to gradually introduce results are important.

.

with these steps in place, chances are better that our based-on-a-true-story colleagues could have avoided the scenario that we referred to at the beginning of this post. an early commitment to the research processes and an agreement on the way forward would have helped prime key stakeholders to the possibility that research findings might be a mixed bag — which necessitates a nuanced dissemination strategy but not the burying of unfavorable results.

brief thoughts on tolerance from george washington via sarah vowell

there’s been a lot of talk recently in various quarters about “tolerance” and who is and who is not tolerant (individually, nationally, etc.). not all of it makes sense or rings true.

.

it was nice, then, to come across this snippet from a letter from george washington written after the war for the independence of the united states, in vowell’s new lafayette in the somewhat united states. as with all things of the era, it reflects ideals rather than being a perfect mirror of reality, but it is in keeping with a whole ethos of working to become less imperfect. vowell writes:

after the war, in 1790, newport’s synagogue would go on to inspire one of washington’s finer moments as a president and a person. responding to a letter from touro’s moses seixas, who asked the president if ratifying the bill of rights was, to paraphrase, good for jews.

.

washington would send a letter addressed to the hebrew congregation at newport. the first amendment, he explained, exposed tolerance as a sham, because tolerance implies one superior group of people deigning to put up with their inferiors.

.

‘it is now no more that toleration is spoken of, as if it was by the indulgence of one class of people, that another enjoyed the exercise of their inherent natural rights,’ washington wrote. ‘for, happily, the government of the united states… gives to bigotry no sanction, to persecution no assistance.’

.

emphasis added. the book (30 pages shy of the end) is recommended.

draft thoughts on showing the work over time in a theory of change (comments welcome)

in draft work with vegard iversen (see here), we have been developing some ideas around using (and showing) both ex ante and ex post theories of change. this is partly in line with a learning agenda for theories of change (as outlined in valters’ recent work here and in my follow-up here, among other places). a learning agenda includes not only internal learning but also, and equally importantly in my view, a commitment to making learning public.

.

the overall argument we advance in the paper relates to bringing theory (both formal and programmatic) to the center of making claims about the external validity of evaluation findings. a small piece of this argument relates to reporting. i’d certainly be interested in views on the below, which is a slightly modified version of the text as it currently stands. i feel that much of the received wisdom is that higher and higher levels of abstraction are what make research findings portable across settings. and, yet, in many instances of discussing external validity (such as here), the conversation inevitably veers towards mixed methods.

.

Thicker (Geertz 1972) and richer descriptions about settings and implementation processes, linked with the ToC and its critical assumptions, will facilitate learning. The more deeply researchers can probe the local, and the more detail authors use to describe it, the more the reader can try to assess the potential for generalization. Thickness, not thinness, helps users of evidence make this assessment in light of their own setting.

.

Indeed, Lincoln and Guba identify “thick description” as the main way of allowing results to be transferred, calling for “narrative developed about the [setting so that] judgments about the degree of fit or similarity may be made by others who wish to apply all or part of the findings elsewhere” (Lincoln and Guba 1986).

.

We believe that similar concerns are what motivated Woolcock to close his paper on external validity with a call for more case study work (Woolcock 2013). Honesty about problems encountered and tweaks to implementation processes and quality made as a result of experiential learning and adaptation will facilitate lesson learning both ‘there’ and ‘here.’

.

The Theory of Change, a central premise of our arguments about taking external validity seriously, can provide a useful tool for structuring reporting and lesson learning. Said more plainly, we want ToCs to play equally important roles at the beginning of a study and at the end. In his review of how Theories of Change are used, Valters writes: “as far as I know, there is no particular tool to go back and see if a ToC is right or not.” This lack of critical reflection clearly does not support lesson learning either here or there.

.

To this end, we put forward the idea that careful ex ante assembly and ex post refinement of a Theory of Change will help studies rigorously increase their potential for external validity. Such reporting, perhaps particularly when structured by the ToC, would reflect the learning that happens between the design phase and data collection and analysis (Pritchett, Samji, and Hammer 2012).