two nice posts came out yesterday that relate directly or tangentially to building theories of change. david evans wrote about, inter alia, what lies behind a null finding, here. marcus jenal wrote here about how complexity shouldn’t stop us from building theories of change, up front, so long as we stand ready to adapt (parts*) of them. these two posts sort of collided in my head as ways of thinking about how tocs link to study design (as well as program planning) — thanks for inspiring me to write on a saturday morning!
.
plenty has been written on the relevance of tocs and a good place to start, if you’re catching up, is with craig’s work, such as here. he, marcus, and others highlight the centrality of assumptions about why things may not work to a good theory of change. one reason to spend time carefully hashing out the assumptions is that they help strengthen both research design and program planning, hence the blog title (the received wisdom is that when you assume, you do indeed make an ass out of u + me).
.
what excites me about building theories of change (or conceptual models, as i originally learned about them) is getting to draw simultaneously on more formal theory, on practical lessons from the empirical literature (which is most useful when empirical papers share basic information and lessons about mundane (read: fascinating and useful) implementation details), and on a healthy dose of common sense, where ‘common’ includes drawing on a variety of perspectives.
.
getting a good start on a toc (even if planning to adapt and iterate throughout) is important if you, like me (with vegard, as we try to lay out in our work-in-progress here), see almost every other aspect of program planning and evaluation design as flowing from it: the content of questionnaires, the timing of data collection, which methods are most appropriate for answering which types of questions (links in the toc), what monitoring plans are needed, the enabling factors that program design can draw on and enhance, the contextual constraints a program can try to loosen, and even the way that the final report may look and the story it will try to tell about what met expectations, what didn’t, and why.
.
marcus’s post has some useful ideas about building tocs, including the (new-to-me) cynefin framework and the idea of accommodating competing hypotheses about how change might occur in early toc iterations. i have also written a bit about ways to try to enhance the toc-building process, here (and also some lessons from putting this into practice, here). these and other exercises will (perhaps implicitly) shed light on the ways that a program may not work as expected.
.
another set of useful ideas, especially in light of david’s post (and the paper that inspired it), can be drawn from cartwright and hardie. the book is worth checking out, as i skip over many of their ideas related to toc-building (not what they call it) to focus on one they call the ‘pre-mortem’.
.
less morbidly and more relevantly, we might call it a pre-null exercise. this type of thought experiment is useful because well-considered theories of change incorporate, through assumptions, hypotheses/theories of no change — that is, all the reasons why the expected pathways to change may get blocked or washed out or never be walked at all, culminating in a null result. the existence and tractability of such roadblocks and breakdowns are important lessons to learn from any research project. this is why thinking, early on, through the assumptions component of a toc is so critical, so that research designs can build in ways to catch potential reasons for no change.
.
the basic pre-null thought exercise is, at the beginning of project and study design, to imagine yourself at the end of an analysis, with a lack of significant results. think through and list the possible reasons for this (beyond a true null) and then incorporate them into both program and study design (especially as these two seem to be (again?) moving closer together, see here and also samii on deep engagement).
.
david’s post gives some nice broad categories to consider: (1) lack of implementation fidelity or other implementation snafus (making it particularly important for a toc to include implementer incentives as well as the beneficiary viewpoint, as advocated by pritchett et al, among others), (2) altered behavior among ‘beneficiaries’ over time, (3) ge effects, or (4) that the intervention may work differently for various sub-groups (yes, sub-group assumptions should appear in a toc!).
.
trying to anticipate these different ways we might end up with null results means they can be better represented as toc assumptions and, accordingly, incorporated into study and questionnaire design — and we all end up wiser as a result.
.
i think it is fair to say broadly that this and many other thought exercises go un-done during the study design phase of rigorous evaluation and research (i like, for example, this posner et al paper for its effort to do some of this work ex post but of course wish it had — and think much of it could have — happened before the study). these efforts certainly go unreported and perhaps even untracked by researchers themselves, not just in the academic literature but, perhaps more upsettingly, in study reports that have fewer restrictions on words.
.
i am hoping that exercises like a pre-null thought experiment will be useful to researchers planning studies. what i am struggling to figure out is why they aren’t happening much now.
.
here are some of my working hypotheses:
- lack of time during program and/or study planning stages.
- lack of clarity about toc-building (or conceptual modelling or whatever term you fancy) as being a key goal of formative work and deep stakeholder engagement (or more general lack of formative work and meaningful stakeholder engagement).
- lack of funding for this kind of toc-building work and engagement.
- lack of clarity about what constitutes a good theory of change and how it links to broader study and program design.
- lack of (sociological) imagination or a sense of not needing to employ this during study design.
- limited discussion of implementation lessons-learned (including during the pilot phase) in the empirical literature and little value (or actual disincentives) placed on sharing implementation details — good, bad, and ugly — that can inform future tocs.
- under-valuing of theory-of-change-building (along with needs assessment and diagnostics?) as part of formal research education (these are things that can be taught, you don’t need to only learn them during your first research project, though certainly some of the lessons may only hit home then).
.
the follow-up question is, of course, how we can start to try to do better, such that inexplicable nulls become a bit more endangered.
.
*i note ‘parts’ because, while from a learning perspective we might want to be able to scrap all our initial conceptions, from an accountability (and, actually, learning) perspective we probably want to hold some things, such as goals, as fixed.
Hey Heather,
To add to your list of hypotheses:
– Programme staff already have a strong sense of a. what programme they will implement (inc. specifics) and b. what impact that will have.
Often this is because programmes are actually rather constrained in what they might do (and how they might change), for example, because they have to work in tandem with state or private interest structures. One problem with the way theories of change are approached is that these incentive/interest structures are not really factored in…Theories of Change are conducted more like pie-in-the-sky thought experiments.
This then promotes a rather one-dimensional way of thinking about impact…because if there is no realistic way for the programme to change, then over-thinking the impact of the programme can create serious problems. What if change is required? The incentives around decision-making shape whether people feel investing in theories of change is worthwhile.
As we discussed over 140 characters recently, researchers/evaluators haven’t come to grips (enough) with how programmes are actually shaped…
you’re right, their scope for change may inform whether program staff see a long toc exercise as useful. i don’t think this lets researchers off the hook. also, if most of the changes any given program can make are going to be small tweaks because they are unable to make big pivots, isn’t this better served by a more fine-grained toc that can help pick up on details that can be altered within constraints?
Interesting ideas. Mulling them over, I wondered whether it is a bit technocratic – that with enough thought up front we would get a clear conceptual model. Does this represent the real world of international development (could we even apply the same principles to domestic issues?)
I think a key problem is ‘when approval happens’. The challenge is that resources (from a donor) or permissions (from governments) often arrive at the point when we know the very least. So while we talk about iterative delivery, we probably also need to talk about iterative approval. When design and approval are seen as a technical, one-off exercise, it can take a long time (approvers want to see everything clearly spelled out) and then – by the time things start – things need to happen fast (to meet spending profiles, declining political capital given delays etc), squeezing out time for iterative start-up. Not easy – which donor wants to be involved in a more iterative approval process?
Another aspect is whether we are clear on the problem we are trying to fix – do we have a clear diagnosis of the problem? Is it relatively ‘simple’ or is it an ‘adaptive’ challenge where solutions will be more ’emergent’? If the latter, then we may not want (or even be able) to specify too much up front and will want to develop the Theory of Change and delivery models over time. Chris Pycroft (DFID Head of Sudan) talks of programme design in these settings being ‘short on the what’ but ‘long on the how’, which I find a helpful heuristic…
hi pete: thanks for your thoughtful comments.
it is possible that some of the ideas presented in my post veer to the technocratic but my instinct is to stand by them. this is to a large extent from an evidence-generation, monitoring, and evaluation point of view — the more we can think through what might happen (on- and off-plan) up front, the more likely we are to be able to build it into our evaluation designs and therefore into the kind of learning that can happen over the course of an evaluation, not only as an ex post exercise to figure out why something didn’t work. this should help with reducing the un-interpretable nulls that were part of the focus of the paper reviewed in david’s post — which i think represent a real but undesirable feature of much current evaluation in international development (and domestic programming).
that said, i don’t see this kind of up-front time investment and specifying a best, detailed guess about how change will happen as incompatible with an emergent stance, which may be a more accurate reflection of some aspects of international development. things might change — but the fact that things might change as context changes and as learning happens (similar to marcus’s point that complexity shouldn’t necessarily dissuade us from using a theory of change) does not prevent us from taking a good stab up front.
i do think you are spot-on about the donor perspective and starting to think about iterative funding and approval — including funding and approval that allows more time for up-front diagnostics and toc development (again, not as a way of saying that later change can’t happen). i’d be delighted to think through ideas around this more with you or to hear any additional thoughts that you’ve already put into it.
i’d also be interested in hearing a bit more about how the ‘pycroft heuristic’ works in practice. this still sounds to me like something that can be considered carefully up front — we have a sense of how we think things may change over time and we can even make a valiant effort of laying some of this out over time (i would take this to be ‘long on the how’ but perhaps i am not understanding). this can help develop a solid measurement plan, even if the contents of the intervention (the ‘what’) to facilitate this change evolves over time. am i understanding correctly?
Whether it comes in the form of country experience, social science research or local relationships, I really don’t think we can overstate the importance of knowledge at the outset of (and throughout) a programme. How can we even know whether a problem/issue/context is ‘complex’ unless we’ve given it some serious thought? I think we should be careful to distinguish between a. knowing a lot about the context/issue and b. outlining a detailed plan upfront. I would argue that a. tends to be very important (although not always) whereas b. is typically constricting. That’s not to say plans are bad per se. Plans can be constraining, but planning is essential.
Increasingly (and I know this is simple) I just feel like the industry needs to get to grips with the fact that stuff changes all the time and we need to be able to react. Or in more wonky terms, change is complex and unpredictable and programme interventions need to reflect that.
Agree the issue of approval is major. For me it brings a focus on programme relationships: with the state, donors, partners and end-users. For example a security sector reform programme in a given country may have got buy-in specifically through keeping on the good side of the police Inspector General, who has specific desires for what an SSR programme may look like. His/her ongoing approval can be the issue upon which a programme stands or falls but this is not the kind of stuff you see in up-front analysis (say risk columns/ToCs) as much as we should, given its importance.
i am pretty sure i agree… but think i want to push back a little (maybe i am missing something here and i am not actually pushing back at all). but: isn’t part of getting to know the context and sector really well about talking to people and diagnosing the problems, asking for ideas around possible solutions, and even pitching possible solutions and pressure-testing where things might go wrong (and whether any built-in troubleshooting could be put in place as part of early program design)? and isn’t the output of some of these conversations and relationship-building precisely some of the initial inputs into a good theory of change and helpful in clarifying assumptions — even allowing that the context and circumstances may shift?
again, i am coming at this in part from an evaluation angle and being frustrated with unexplained null results, as in david’s blog (karthik & paul’s paper). i do think that these unexplained results partly stem from a limited effort to at least make some good guesses about ways things might go wrong up front and to monitor, measure, and document appropriately. of course, monitoring plans can (and should) also evolve as does a toc but the desire to be capturing some of these angles is what motivates me to try to do as much up-front planning and imagining as possible, even with the expectation that things will change.
if part of what you’re saying is that early engagement ought to seek to clarify whose buy-in and support (in what forms) is needed for the program to continue and to make the continuation of that support an explicit assumption with regards to the successful operation of the program, then yes, i agree. assumptions are certainly not only operational in nature… hopefully i haven’t implied that (i would hope the kind of pre-null exercise i suggested would help to bring these issues to light but, again, maybe i am missing something?)!
I think it is a fair assumption to make that evaluators often stick to a narrowly defined causal model, and do not account for all the possibilities that could arise from each stage and component in their ToC. While this should be the goal, I would be okay if the unexpected developments along the causal chain at least do not go unnoticed. They could be documented and investigated in more depth in a subsequent study.
I do wonder though if evaluators are best placed to think through all the possible variations in the ToC. Evaluators are limited by their knowledge of implementation. Many evaluators we know are spread too thin across sectors to really know enough about every one of them. This calls for implementers to be more integral participants in evaluations – beyond their consent to participate by implementing the interventions, towards their knowledge being actively sought and used to analyse and interpret findings. This goes back partly to the issue of integrity in evaluations we once discussed – how integrity in conducting an evaluation is more important than pure independence.
Finally, a thought – David and G&M try to explain unexplained null results. I think your call for more effort to explain should extend to positive results? Positive results are as likely to go unexplained if evaluators find that their narrow causal model is proven – but we may not know ‘how’, if we don’t really look into it.
thanks much! on the last point, my sense is that they give categories of what could stand behind a null result but do not necessarily say that each study looked at each of those and, therefore, could explain them. but your point is equally useful (and i think the same thought exercise would lead to collecting data that could answer both questions): do we necessarily know how something worked if we find a positive impact?
on the other point, capturing unintended outcomes would mean, at a bare minimum, incorporating a ‘most sig change’ (http://mande.co.uk/special-issues/most-significant-change-msc/) kind of question in almost all evaluations?