*an updated version of this post, in which i try to answer some of my musings below, can be found here.*
recently, i had the privilege of speaking about external validity at the CLEAR South Asia M&E Roundtable (thank you!), drawing on joint work i am doing with vegard iversen on the question of when and how to generalize lessons across settings.
.
my main musing for today is how the conversation at the roundtable, as well as so many conversations i have had on external validity, always bends back to issues of monitoring and mixed-methods work (and reporting on the same) throughout the course of an evaluation.
.
my sense is that this points to a feeling that taking external validity seriously in study design is about more than site and implementing-partner selection (choosing both with an eye towards representativeness and generalization, especially if the evaluation has the explicit purpose of informing scale-up).
.
it is also about more than measuring hawthorne effects or trying to predict the wearing-off of novelty effects and the playing-out of general equilibrium effects should the program be scaled up (though all of these are clearly important).
.
the frequency with which calls for better monitoring come up as an external validity concern suggests to me that we need to take a hard look at what we mean by internal validity. in a strict sense, internal validity relates to the certainty of the causal claim of program ‘p’ on the outcome of interest ‘y.’ but surely this also includes a clear understanding of what program ‘p’ itself is, that is, what is packed into the treatment variable behind that beta in the regression, which is unlikely to have been static over the course of an evaluation or uniform across all implementation sites.
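.
to make that concrete, here is the canonical regression i have in mind. this is a minimal sketch in my own notation, not drawn from any particular paper:

```latex
% a minimal sketch of the canonical impact-evaluation regression
% (notation is mine, not taken from any cited paper):
y_i = \alpha + \beta \, p_i + \varepsilon_i
% here p_i indicates whether unit i received program 'p' and beta is the
% estimated 'impact of the program.' the worry above is that p_i gets coded
% as a single, stable construct, when what was actually delivered may have
% varied across sites (s) and over time (t), something closer to p_{ist},
% so the estimated beta averages over whatever mix of implementations
% actually occurred.
```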
.
this is what makes monitoring with a variety of data collection tools and types so important: so that we know what a causal claim is actually about (as cartwright, among others, has discussed). this matters both for understanding what happened at the study site itself and for trying to learn from a study site ‘there’ for any work another implementer or researcher may want to do ‘here.’ some calls for taking external validity seriously seem to me to be veiled calls for reconsidering the requirements of internal validity (and for attending to construct validity).
.
as a side musing, here’s a question for the blogosphere: we usually use ‘hawthorne effects’/observer effects (named, please note, for the factory where the effect was first documented, not for some elusive dr. hawthorne) to refer to changes in participant/subject/beneficiary behavior that occur strictly because they are being observed (beyond the behavior changes intended by the intervention itself).
.
but in much social science and development research, (potential) beneficiaries are not the only ones being observed. so too are implementers, who may feel more pressure to implement with fidelity to protocol, even if the intervention doesn’t explicitly alter their incentives to do so. can we also consider this a hawthorne effect? is there an existing term for observer effects on implementers? surely the potential for such an effect must be one of the lessons from the recent paper on how impact evaluations help deliver development projects?