Google’s Top Stories algorithm is failing to detect authoritative sources

This is a pretty significant Google algorithm fail:

That’s utterly incorrect news ranking right at the top of a Google search for a breaking news event.

Over the last few years, the “news box” that appeared at the top of Google search results for news-related queries has been replaced by the carousel. It looks like this:

Ruddy hell

On mobile it features sites with Accelerated Mobile Pages (AMP) support:

Rudd dud

My guess is that this was a desktop phenomenon, and probably didn’t show up on mobile. However, that’s still serious, and Gizmodo’s Melanie Ehrenkranz managed to get a detailed statement as to why this happened:

A Google spokesperson sent an unprompted email offering additional information about how Top Stories works and what lead to the 4chan links being included in the module. According to Google, inclusion in Top Stories is determined by a number of factors, including both the “authoritativeness” of a site and how “fresh” the link is. The company said the Top Stories module for searches of “Geary Danley” was triggered by an increase in queries of that name. Since 4chan is arguably not an authoritative website, the freshness of the story seemingly pulled the link into Google’s module.

The authoritativeness/freshness aspect of the carousel is well known – and there are no surprises there. And no excuses, either. As The Guardian’s Martin Belam puts it:

Since when was 4chan news?

And Martin’s right. What’s interesting is that 4chan is in a news slot. Traditionally, we believed that slot was populated from Google News members – and you have to apply and be accepted into that section – rather than from the general index. The idea that Google is now just pulling any old site into the news slot is troubling, especially when we see such clear patterns of sentiment manipulation across the web. This suggests that intentionally dishonest news sources can actually get themselves into the carousel just by publishing fast and often.
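To make that concrete, here is a toy sketch of how a ranking that blends authoritativeness and freshness can tip over. To be clear, this is not Google's algorithm: the weighted-sum formula, the 30-minute half-life, the `freshness_weight` parameter, the source names and all the numbers are assumptions invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Story:
    source: str
    authority: float    # hypothetical trust score, 0.0 (unknown) to 1.0 (trusted)
    age_minutes: float  # time since the story was published


def freshness(age_minutes: float, half_life: float = 30.0) -> float:
    """Assumed decay: freshness halves every `half_life` minutes."""
    return 0.5 ** (age_minutes / half_life)


def score(story: Story, freshness_weight: float) -> float:
    """Assumed blend: a simple weighted sum of authority and freshness."""
    return ((1 - freshness_weight) * story.authority
            + freshness_weight * freshness(story.age_minutes))


def rank(stories: list[Story], freshness_weight: float) -> list[str]:
    """Return source names ordered from highest to lowest score."""
    ordered = sorted(stories, key=lambda s: score(s, freshness_weight), reverse=True)
    return [s.source for s in ordered]


# Made-up example: an established outlet with a two-hour-old story
# versus an anonymous forum thread posted two minutes ago.
stories = [
    Story("established-news-site", authority=0.9, age_minutes=120),
    Story("anonymous-forum-thread", authority=0.1, age_minutes=2),
]

print(rank(stories, freshness_weight=0.3))  # authority dominates
print(rank(stories, freshness_weight=0.8))  # freshness dominates
```

With the lower weight the established site stays on top; with the higher one the two-minute-old forum thread overtakes it, despite its far lower authority score. If the real balance sits anywhere near that second setting during a breaking news spike, publishing fast and often is all it takes.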

And it’s not like we haven’t seen that pattern before. Ehrenkranz again:

In February, a LinkedIn blogger wrote over 150 articles about how to stream the Super Bowl consisting of nonsensical strings of keywords aimed at fooling Google’s search algorithm.

And it worked really well.

I’d say something is seriously amiss in the balance between the authoritativeness and freshness aspects of the Top Stories algorithm right now.