I’m a data nerd and a data cheerleader, but still I fear Bill English’s datatopia
NZ’s new prime minister is a champion of evidence-based policy, the “social investment approach” and open government. So why is fellow data-evangelist Keith Ng warning of a data-democracy cargo cult?
Bill English is the most data nerdy prime minister we have ever had. An ex-Treasury wonk and a champion of open data across government, he’s likely to continue his push for evidence-based, data-driven policy-making from the 9th Floor.
As a data nerd, I should love it. I used to dream about this datatopia too. But here’s why it could go terribly wrong.
There are two parts to his vision: Evaluation and democratisation. The evaluation part is called the Social Investment approach, which is about “applying rigorous and evidence-based investment practices to social services”… which basically means treating social spending as an investment, and measuring it like you’d measure an investment (ie on a spreadsheet).
The democratisation part is about opening up data and making it available to everyone, in order to democratise policy-making. As English has put it:
“Public policy people have this view that everything they do is highly complex and very special. We run university systems just to train people in public policy. But they’re wrong. Policy is now a commodity – you can print world best practice off the internet. You don’t need a department to know it, a 12-year-old can do it.”
How could anyone have a problem with rigorous and evidence-based evaluations, or with the democratisation of policy-making?
Tyranny of the null hypothesis
This came up recently in a podcast with political scientist Grant Gordon, on whether an evidence-based approach ought to be used to evaluate conflict interventions. He offered a hypothetical (around 43:20 in): What if the US civil rights movement was assessed like this?
The changes in wellbeing for African-Americans don’t show up for years, but racial violence spiked as a direct response to the reforms. From the evidence, it looked like the civil rights reforms weren’t making people’s lives better and were actually causing violence! So they should be abandoned – right?
The civil rights hypothetical is great because we have intuitions about what is just. We can sense (and know with the benefit of hindsight) why it was important, beyond what could’ve been measured at the time. It tells us why we should push through with something despite the lack of evidence; through that, we can understand the limitations of what was measured.
But most policies are going to be much lamer: there is insufficient evidence to prove that a thing which should work in theory is, in fact, working. Should we push through with those, too?
There are several reasons why there might not be evidence. We could be measuring the wrong things, we could be measuring the right things in the wrong way, or maybe it actually just doesn’t work.
But a cornerstone of the scientific method is the null hypothesis: The assumption that nope, that drug doesn’t cure cancer; nope, the Higgs boson isn’t there; and nope, the policy doesn’t work. The null hypothesis is the default – like “innocent until proven guilty”, it’s presumed that things don’t cure cancer until it’s proven that they do. It’s a great standard for science, but it’s a catastrophic principle to apply to government.
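To make that “default of no” concrete, here’s a toy sketch in Python (the numbers are invented for illustration, not drawn from any real trial): the very same real effect “fails” a conventional significance test in a small pilot study, then “passes” in a larger one. Nothing about the world changed – only the amount of evidence did.

```python
import math

def two_proportion_z(success_a, n_a, success_b, n_b):
    """z-statistic for the difference between two observed proportions."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# The same true effect (60% vs 50% success) judged at two sample sizes.
z_small = two_proportion_z(24, 40, 20, 40)        # small pilot study
z_large = two_proportion_z(600, 1000, 500, 1000)  # large trial

# At the conventional 5% level, |z| must exceed ~1.96 to reject the null.
print(round(z_small, 2))  # ~0.9: "no evidence" the intervention works
print(round(z_large, 2))  # ~4.49: same effect size, now "proven"
```

Under a null-hypothesis regime, the pilot study above is read as “it doesn’t work” – when the honest reading is “we haven’t measured enough yet”.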
It’s no coincidence that Big Tobacco are great fans of “evidence-based policy”. Take this 2012 statement from former Philip Morris spokesperson, now National MP, Chris Bishop:
“We support evidence-based regulation of all tobacco products. In particular, we support measures that are effective in preventing young people from smoking. Plain packaging fails this standard because it is not based on sound evidence and will not reduce youth smoking.” (emphasis added)
When Bishop said that it wasn’t based on sound evidence, he meant that it was only based on experimental evidence, rather than real world empirical evidence – which didn’t exist because plain packaging wasn’t in place, which shouldn’t go ahead because there was no evidence, which didn’t exist because… and so on.
The idea is that nothing ought to be done until evidence shows it really works. Like “it’s better that ten guilty people go free than let one innocent suffer”… except “it’s better that 5000 people per year die of smoking than one innocent intellectual property right suffer”.
The tobacco industry’s position (mainly in Australia) kept shifting as the evidence started to come in: “nothing happened in the first year so we’re right”, “sure you have evidence but we commissioned this other piece of evidence which says we’re right”, through to, as always, “evidence schmevidence, it’s always been about the principle of intellectual property rights”.
So, surprise! They weren’t really interested in evidence-based policy after all.
The problem isn’t that the tobacco industry is self-interested. It’s that when we pretend governance is science, we’re creating a bias towards accepting the null hypothesis, towards doing nothing.
But governance isn’t science.
Science is about challenging, acquiring, and testing knowledge. It loses nothing from being uncertain, but false knowledge can lead science astray for years, even centuries. That’s why the scientific method is skeptical and conservative by design.
Governance is about making decisions with imperfect knowledge. Those decisions have to be best guesses, because inaction can be as catastrophic as incorrect action, and because sometimes solid knowledge is hard to come by.
When we use the language of science for governance, we’re setting up a hair-trigger (“but there’s no evidence for that”) that favours doing nothing. The best case scenario is that it’s a sincerely inquisitive process that could be abused by bad actors to keep us locked into inaction. The worst case scenario is that this is inaction by design, that it is ideologically driven Conservatism trying to hide behind the language of scientific conservatism.
The cargo cult of data democracy
The alternative isn’t to ignore evidence, but to consider the evidence in the context of its limitations, to weigh up conflicting information, and to look beyond what’s immediately apparent. That’s the part of the job that English’s “12-year-old” can’t do. That is why we have university systems to train people in public policy. That is why what they do is complex and special.
You can open up numbers to people, and a 12-year-old can figure out which number is bigger, and whether the trend is going up or down. But suggesting that’s how decisions get made – or even that a modern society could make real decisions like this – is just plain wrong. It’s a cargo cult to believe that because policymakers have “data”, having “data” yourself means you are meaningfully engaging with policy.
Don’t get me wrong, open data is really important. I make a living off open data. Some of my best friends are open data. And data has many uses beyond policy. But in policy, data is just one important link in a very long chain; in democracy 2.0 (or whatever version we’re on now), open data is also just a link in a very long chain.
The links on either side of that chain are academics and politicians, journalists and spin-doctors, all vying to interpret that data. Opening data makes it more available to these people, and potentially makes them more effective, but it doesn’t empower anyone who wasn’t empowered before.
If we’re to democratise policymaking, we need to democratise expertise and time. And we aren’t doing that. In fact, as we advance into the brave new world of data, data expertise becomes ever more inaccessible.
How many people understand how the census works? The difference between that and a survey? Means and medians? Well, that’s the old, obsolete baseline for working knowledge of stats.
Compare that with “how many people know how the IDI works?”, or even know what the IDI is? How many people understand the consequences and potential of linking vast sets of data, of the depth and breadth of the administrative data that the IDI draws from?
Or what predictive models are and how they work? When a good prediction model is defined as one that is right more than 25% of the time, what are its benefits and limitations, and where should it fit into how decisions are made? Are we, as a democracy, equipped to have a conversation about what to do with “predictors” that are wrong most of the time, but considerably better than nothing?
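The “wrong most of the time, but better than nothing” point comes down to base rates, and a few lines of arithmetic show it. The numbers below are hypothetical, chosen purely for illustration:

```python
# Hypothetical numbers: suppose 5% of a population will experience some
# bad outcome (the base rate), and a risk model flags a group of people,
# of whom 25% actually go on to experience it (the model's precision).
base_rate = 0.05
model_precision = 0.25

# The model is "wrong" about 75% of the people it flags...
wrong_rate = 1 - model_precision

# ...yet a flagged person is still 5x more likely to have the outcome
# than someone picked at random. That's the "lift" over chance.
lift = model_precision / base_rate

print(wrong_rate, lift)  # 0.75 5.0
```

A predictor like that is simultaneously “wrong most of the time” and five times better than guessing – and deciding what a government should do with such a tool is exactly the conversation a data-literate democracy would need to have.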
As data-driven policymaking becomes more sophisticated, the circle of experts is getting smaller, and everyone on the outside is getting left behind.
Decision makers have rarely been technical experts, and it’s always been assumed that they don’t need to be – that’s what briefings are for. But data isn’t just another subject area; it’s the tool that we use to understand the world. A fuzzy understanding of data is a fuzzy understanding of everything that the data describes, everything that data touches.
It’s coming. Bill English’s datatopia was under way before; it will take centre stage now that he’s prime minister, and it’ll probably continue after he leaves. We won’t be able to roll back the data revolution even if we wanted to.
And the opportunities really are enormous. We will know more, know better, know faster. But I’m taking a glass half-full view because we need to remember what goes in the other half. Data needs to be backed up by knowledge, expertise and analysis (shout-out to all the faceless bureaucrats!). Rigour and discipline need to be backed up by vision and imagination about what could be.
Above all, we need to not treat data as a magic lamp that just gives us what we ask for. It’s a powerful tool. We need to understand what it is and how to use it, or we’ll be at the mercy of data nerds and dumb machines.