The dangers and rewards of Data Journalism

Data journalism at work

Allison Schrager has a problem with data journalism:

But I worry that data give commentary a false sense of authority since data analysis is inherently prone to bias. The author's priors, what he believes or wants to be true before looking at the data, often taint results that might appear pure and scientific. Even data-backed journalism is opinion journalism. So as we embark on this new wave of journalism, we should be aware of what we are getting and what we should trust.

How, though, is this different from traditional journalism? Inevitably, a journalist brings her bias to any story - and however hard we strive to eliminate that, it will find expression, whether we like it or not.

There's some very useful advice towards the end of her post about how to avoid distorting the data too much, and how to make sure that you're looking at the whole story, but she doesn't address the central question hinted at by her use of "a false sense of authority".

My gut instinct is that people are more wary of bias in data journalism, because there's a tendency to believe that the "numbers don't lie". And they might not. But how you choose to present them has a very big impact on the message a particular truth tells...

That's art, my friend, not science

Tim Harford wrote for the FT a little while back about how misinformation can be beautiful:

Data visualisation creates powerful, elegant images from complex data. It's like good prose: a pleasure to experience and a force for good in the right hands, but also seductive and potentially deceptive. Because we have less experience of data visualisation than of rhetoric, we are naive, and allow ourselves to be dazzled. Too much data visualisation is the statistical equivalent of dazzle camouflage: striking looks grab our attention but either fail to convey useful information or actively misdirect us.

Visualisation, in particular, is in danger of dressing art up as science. You need to be very careful that the data actually tells the story you claim it does, and that you don't distort things for the sake of a more compelling "angle" or aesthetic representation.

And, y'know, check your own assumptions at the door when you dive into this. But this is all good, standard journalism practice anyway - not something new or unique to data journalism.

The fox knows his pivot tables, that's what...

Of course, this whole discussion has been triggered by various interviews that Nate Silver has given around the launch of FiveThirtyEight. Mathew Ingram sums up the debate pretty well:

When it comes to using data of any kind in the creation of journalism, Silver says that traditional journalists are quite good at the first two steps of the process -- namely, the collection of data and the organization of it into a news story or other format. However, they often fail to do as good a job at the next two steps, he says, which include the explanation or analysis of the data and some kind of generalization about its future implications.

Lurking in this is an explanation for the current obsession with data journalism: traditionally, we've only had very limited access to significant datasets. The arrival of digital technology has made collecting, sharing and analysing datasets significantly simpler, and so has opened up a whole new field of journalism, one whose surface we've only scratched before. It's difficult to complain about this (although some people try...), because more sources of stories is pretty much always a good thing.

But for an industry which tends to lean liberal in its politics, journalism can be very small "c" conservative in its outlook. There's an inherent suspicion of the new that anyone who has been working in online digital development will be familiar with. Coupled with the mistrust of journalists that the last few years of revelations have engendered, it's not a surprise that people are slightly suspicious of what we're doing with the numbers.

An oasis of fact in a desert of opinion

However, I think that the Economist's robust defence of Silver and his approach to journalism actually gets to the heart of the matter. Right now, we have far, far too much opinion-based journalism and not nearly enough fact-based reporting.

As the piece concludes:

There is, and always will be, a place for bullshit--or if you prefer a more dignified construction, a place for arguments driven by ideas, belief and feeling rather than data. Positivism is in no danger of sweeping such journalism away in toto; American newspapers and airwaves are full, far too full, of shouters, table-bangers, aspersion-casters and heartstring-tuggers. They drive ratings and traffic (and inspire blogposts). But to the extent that Mr Silver's mission is to shrink bullshit's share of our national conversation, I can only wish him Godspeed.

Who can argue with that? I meet more journalism students interested in becoming opinion columnists than I do those interested in data journalism. I've talked to managing editors at our national papers who despair of finding graduates keen to get on and do reporting, rather than writing leader-type columns.

Data journalism is actually a form of back-to-the-roots movement, of focusing our journalism back on finding facts and the stories within them. We're just using different tools to do that.