In the US, the Nurse-Family Partnership is considered a proven program. It sends nurses to visit poor first-time mothers, and in three randomised controlled trials (RCTs) it’s been shown to have a big impact on the health and wellbeing of mother and child. The model was imported to the UK on that basis and rolled out at a much bigger scale. But it was only once the program was running at 132 sites across the country that it was tested in an RCT there, which found it had no effect.
The Nurse-Family Partnership is one of the clearest recent cases of a puzzle that bedevils any attempt to replicate a program that’s working somewhere else: generalisability. It’s the question mark at the centre of any scale up: if we replicate an approach in a new context, can we expect the impact to be similar?
That new context may be another city in the same country, or it could be another country on the other side of the world. Just because it has the same problem does not mean the same intervention will work, because every intervention works in tandem with its context.
The more different the context, the less appropriate the intervention may be. And the more the intervention is adapted to suit the new context, the less relevant any existing evidence becomes. In the end, it may effectively be a new, untested intervention.
Ultimately, replication is always a gamble. But how can we shorten the odds?
What’s the evidence
Before trying to replicate an intervention, it pays to do two things. First, assess the existing evidence: how many and what kind of evaluations have been done, and in what contexts. And second, understand what that evidence actually means out of context.
More and better executed trials are clearly an advantage. Replicating something on the strength of a single evaluation is risky; replicating something like a conditional cash transfer program, which has been trialled dozens of times in different contexts, is less so. But even then, there’s considerable debate about what that evidence really tells you.
Take the randomised controlled trial (RCT). With evidence-based policymaking and the professionalisation of the non-profit sector, RCTs are the flavour of the moment. Describing them as a gold standard has become a kind of policy cliché. But they have their limitations — and those limitations are particularly relevant when it comes to replication elsewhere.
A well-executed RCT says something about the intervention then and there and on that population. In that regard, RCTs are the gold standard. But beyond that specific context, they have no special privilege over other kinds of evidence. On its own, an RCT done over there cannot tell you how that program would work here.
But that’s not to say that RCTs, and evidence in general, from another place are useless. In fact, they can provide vital insight into why something works — and that’s the key for knowing whether it can be replicated in different contexts.
What’s the mechanism
Once you know that something works, you can dissect why it works: the underlying mechanisms, and the conditions they require. The Poverty Action Lab’s Sugar Daddies HIV prevention program is a good example of this approach.
In Kenya, this program was remarkably effective in reducing a key mode of HIV transmission: sexual relationships between teenage girls and older men. An RCT found that showing eighth-grade girls and boys statistics and a short video on the higher rates of HIV among older men dramatically changed their behaviour. The number of teen girls who became pregnant by an older man over the next 12 months fell by 60%.
Clearly the intervention works — but why does it work? What does its effectiveness rely on?
The researchers identified three things. Relationships between older men and teenage girls must be common. Older men need to have higher rates of HIV than younger men, which means a greater relative risk of transmission. And, crucially, the girls need to be unaware of that.
By analysing the mechanism, they derived the conditions it requires. Then they were in a position to predict whether it would work elsewhere: they simply needed to check for those conditions.
What’s the new context
When Rwanda wanted to replicate the Sugar Daddies program, the Poverty Action Lab asked whether the local conditions met the requirements for that mechanism. They started to parse the data.
They saw that HIV infection rates were higher among older men than younger men, and that many sexually active teenage girls had partners more than five years older than them. So far, so promising.
A team from the Abdul Latif Jameel Poverty Action Lab (J-PAL) decided to do some on-the-ground research to see how teenage girls perceived the risk. Two things emerged. The girls knew that older men were more likely to be infected than younger men, but they also massively overestimated the percentage of infected men in all age groups.
This muddied the picture, and presented an alarming possibility. The Sugar Daddies program would not affect relative risk — the girls already knew older men were more likely to be infected. But it could actually lower their perception of HIV risk from unprotected sex in general. That means the program could well have increased HIV transmission.
That highlights how important understanding the interaction between intervention and context can be.
This kind of check can be applied systematically, as demonstrated by the international NGO Evidence Action. One of their programs, No Lean Season, reduces seasonal poverty by giving travel subsidies that help men temporarily move to a city, find work there, and send money home.
“First we look for the existence of seasonality: are there large groups of people that experience seasonal privation? It’s a common feature of agricultural livelihoods, but it’s not universal,” said Karen Levy, Director of Global Innovation at Evidence Action.
“That gives you a set of places. Then you look within that for places where there are growing and thriving cities nearby. Two examples of places we looked at that passed the first test but failed the second were Zambia and Malawi. They have an acute hunger season, but there aren’t surplus wage labour opportunities in towns and cities — so they have that nail, but our hammer isn’t going to work there.”
“And within that set of places, you’re looking for those where the gap between the impoverished and the nearby towns and cities is far, but not too far. If you have to buy a plane ticket, that’s not something that No Lean Season can solve. But equally if the fare costs just three dollars then perhaps a transport subsidy is not going to solve that problem. We’re looking for that bit of resistance that our subsidy can overcome.”
First, identify the need, then systematically confirm the presence of the conditions required for the intervention. With this approach, you avoid taking a program somewhere it will never work.
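As a rough illustration, the screening Levy describes can be sketched as a sequence of checks in which a site drops out at the first condition it fails. This is only a sketch in Python: the `Site` fields, the fare thresholds and the example sites are all hypothetical placeholders, not Evidence Action's actual criteria or data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Site:
    name: str
    has_seasonal_privation: bool   # do large groups face a lean season?
    has_nearby_urban_jobs: bool    # are there surplus wage jobs in nearby cities?
    fare_usd: float                # cost of travel to the nearest such city


# Hypothetical band for "far, but not too far"; illustrative values only.
MIN_FARE_USD = 5.0
MAX_FARE_USD = 50.0


def screen(site: Site) -> Optional[str]:
    """Return the first condition the site fails, or None if it passes all."""
    if not site.has_seasonal_privation:
        return "no seasonal privation"
    if not site.has_nearby_urban_jobs:
        return "no surplus wage labour in nearby cities"
    if site.fare_usd < MIN_FARE_USD:
        return "fare too low for a subsidy to matter"
    if site.fare_usd > MAX_FARE_USD:
        return "fare too high for a subsidy to cover"
    return None


# Made-up candidate sites, purely to show the filter running.
candidates = [
    Site("Site A", True, True, 20.0),
    Site("Site B", True, False, 20.0),
    Site("Site C", True, True, 3.0),
]

for site in candidates:
    reason = screen(site)
    print(site.name, "passes" if reason is None else f"fails: {reason}")
```

The order of the checks mirrors the order in the quotes above: need first, then opportunity, then whether the cost barrier is the kind a subsidy can overcome.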
Adapt the intervention to the new context
Assessing the evidence for the intervention, understanding its mechanism, and checking the new context you want to transplant it to are the preliminary tests. If a program passes them, then it’s time to adapt it.
Again, understanding why something works, and thus what is essential, is crucial. “The more you understand of the underlying mechanisms, the more you know where very high fidelity to the original design is needed, and what things are more flexible,” said Levy.
However, exactly what is flexible is rarely clear or agreed upon. And that gives rise to the tension between fidelity and adaptation that troubles so many scale ups.
The Nurse-Family Partnership is a good example. “The delivery of the program in England was high quality,” said Professor Mike Robling, who carried out the RCT. “But those fidelity targets were based upon the parameters that were observed in the original US trials. How relevant or essential are those when applied in a new setting?”
After the disappointing RCT, the program’s licensors loosened the fidelity requirements. Now the UK team are experimenting by changing things like the content of the program, or the number of visits people receive. They are also allowing each unit that delivers the program to customise it to their needs.
Given that the UK has a public health system, whereas the US does not, it may be that the program simply won’t work. Or they may strike upon a version that does deliver results.
It helps to remember that replication is not about slavishly copying every feature: it’s about replicating impact. It’s about the underlying mechanisms, rather than the superficial aspects of delivery.
In the end, a program should have a fixed core, but any number of delivery models that are context dependent. Those delivery models may look very different.
Take Reach Up, the early childhood intervention analysed in one of our case studies. It originated in Jamaica, where it had a remarkable impact on children’s educational attainment through its weekly home visits by community health workers. But replications in Peru and Colombia have struggled to achieve anything like the same impact. However, just recently in Bangladesh they experimented with a very different delivery model: group sessions in health clinics, with half the number of sessions. And it achieved the same impact as in Jamaica, not in spite of those adaptations but because of them.
A four-step framework
There’s no formula for replication, no algorithm to dispel all doubt. Even the simplest intervention is context dependent in countless, subtle ways — it’s impossible to say with certainty how it will fare in another place.
However, the four-step framework presented here can head off many mistakes before they happen, and improve the odds of the replications you do pursue.
First, assess the evidence. Is it good, recent and relevant?
Second, understand the mechanism: why the intervention works.
Third, check the new site fulfils the conditions for that mechanism.
And fourth, adapt as necessary. Don’t be beholden to superficial features.
Sugar Daddies was disqualified at step three. Perhaps Nurse-Family Partnership should’ve been. Clearly, a more systematic approach to replication can save resources — and could save a lot more.
(Picture credit: Flickr/Aikawa Ke)