On February 14, 2018, 14 students and 3 staff members were killed by Nikolas Cruz at the Marjory Stoneman Douglas High School in Parkland, Florida.
As news spread across the web, conspiracy theorists quickly chimed in, claiming that the events were pre-planned, based on the dates that Google Search displayed next to many of the articles related to the incident. Some even went as far as to say that many news outlets pre-wrote their stories as part of a campaign to ban guns in the United States.
If you search “Florida school shooting” and limit the dates to before February 14, 2018, many articles about the incident come up with dates pre-dating the actual event.
For those who love a good conspiracy story, this just isn’t your time, as there is a really simple explanation for why these dates are inaccurate, not just for this incident but for any major news story that gets indexed by Google Search.
It comes down to bad website programming
When Google indexes an article, it looks for a number of key identifying details from the website’s HTML meta tags. It can pull information such as the story title, the date and time it was published, who wrote the story, the relevant thumbnail, and much more. However, it can only do as good a job as the metadata provided by the website allows.
If a web page does not have this information readily available in the background, Google can make a right mess of things depending on how badly coded the website is.
Let’s take a look at one of those problem websites. As with many other major publications, the article on the Japanese news site Kyodo News was listed with a publication date of February 5, 2018, even though the tragedy took place on the 14th (9 days after the listed date).
For starters, Google Search’s robot crawlers would look for specific date meta tags in a website’s header like the one below:
<meta property="article:published_time" content="2018-02-14T23:10:32+00:00">
There are a few variants of this, but they basically tell Google when the article was published.
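To make this concrete, here is a minimal sketch in Python of how a crawler might read that tag. It only uses the standard library, the names `PublishedTimeParser` and `find_published_time` are made up for illustration, and it is certainly not how Google’s crawler is actually written.

```python
from datetime import datetime
from html.parser import HTMLParser


class PublishedTimeParser(HTMLParser):
    """Collects the content of the first article:published_time meta tag."""

    def __init__(self):
        super().__init__()
        self.published = None

    def handle_starttag(self, tag, attrs):
        # Only meta tags matter here, and only the first match is kept.
        if tag != "meta" or self.published is not None:
            return
        attributes = dict(attrs)
        if attributes.get("property") == "article:published_time":
            self.published = attributes.get("content")


def find_published_time(html):
    """Return the publication time as a datetime, or None if the tag is absent."""
    parser = PublishedTimeParser()
    parser.feed(html)
    if parser.published is None:
        return None
    return datetime.fromisoformat(parser.published)


page = '<head><meta property="article:published_time" content="2018-02-14T23:10:32+00:00"></head>'
print(find_published_time(page))  # prints 2018-02-14 23:10:32+00:00
```

When the tag is missing, the function returns `None`, which is the situation where a crawler has to start guessing.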
Unfortunately, many news websites that use custom content management systems do not include this crucial bit of information.
Going back to our Kyodo News example, a quick run through the site’s HTML source code shows that there is no dedicated tag for the publication date.
But what about the date below the article title?
Google was unable to find any date meta tags in the site’s header section, which leaves the date shown under the article title.
If we dig through the code, we get this as the raw output.
Feb 15, 2018 – 10:02
So how did Google still think the story was published on the 5th?
If you look to the right-hand side, under their ‘Popular’ sidebar widget, there is an unrelated post dated “Feb 5, 2018”.
If we dig through the HTML source code once again, it’s wrapped like this.
<p class="time">Feb 5, 2018 | <a href="/kyodo_news">KYODO NEWS</a></p>
Notice how the paragraph (p) tag has the word ‘time’ describing the class?
In short, Google’s crawler treated this element as the first indication of the article date because its class is labeled ‘time’. The element outputting the 15th (which is the correct publication date) had no identifying information, so the crawler simply bypassed it.
Google then mislabeled the post as being published on the 5th instead of the 15th.
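The mix-up can be reproduced with a toy heuristic. The Python sketch below is an assumption about the general kind of fallback a crawler could use (take the first element whose class is ‘time’), not Google’s real logic. The `page` string is a simplified stand-in for the Kyodo News markup, and `FirstTimeClassParser` and `guess_date` are hypothetical names.

```python
from datetime import datetime
from html.parser import HTMLParser


class FirstTimeClassParser(HTMLParser):
    """Grabs the leading text of the first element whose class is 'time'."""

    def __init__(self):
        super().__init__()
        self.inside = False
        self.text = None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.text is None and "time" in classes:
            self.inside = True

    def handle_data(self, data):
        # Keep only the first non-empty text found after the matching tag.
        if self.inside and self.text is None and data.strip():
            self.text = data.strip()
            self.inside = False


def guess_date(html):
    """Naive guess: parse the date in the first class="time" element, if any."""
    parser = FirstTimeClassParser()
    parser.feed(html)
    if parser.text is None:
        return None
    label = parser.text.split("|")[0].strip()  # "Feb 5, 2018 |" -> "Feb 5, 2018"
    # %b expects English month abbreviations under the default locale.
    return datetime.strptime(label, "%b %d, %Y").date()


# Simplified stand-in for the page: the real date has no identifying markup,
# while the unrelated sidebar date sits in an element with class "time".
page = """
<h1>Article title</h1>
<p>Feb 15, 2018 – 10:02</p>
<p class="time">Feb 5, 2018 | <a href="/kyodo_news">KYODO NEWS</a></p>
"""
print(guess_date(page))  # prints 2018-02-05
```

The unlabeled paragraph with the correct date is never considered, so the sidebar date wins.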
If you happen to use a content management system (CMS) like WordPress, you can install a third-party plugin that easily handles meta tags for search engines and social media previews. Many major news outlets have their own in-house CMS, however, and things like this get overlooked due to cost constraints. A simple addition like this could run the news outlet hundreds or even thousands of dollars, depending on how much the developer wants to rip them off.
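For a sense of how small that addition is, here is a rough Python sketch of a template helper that outputs the tag. The function name `published_time_meta` is hypothetical, and the UTC timezone in the example is an assumption, since the raw date on the page does not state one.

```python
from datetime import datetime, timezone
from html import escape


def published_time_meta(published):
    """Render an article:published_time meta tag for a timezone-aware datetime."""
    if published.tzinfo is None:
        raise ValueError("publication time must include a timezone")
    stamp = published.isoformat(timespec="seconds")
    return f'<meta property="article:published_time" content="{escape(stamp, quote=True)}">'


# Assumption: the timestamp is treated as UTC purely for illustration.
print(published_time_meta(datetime(2018, 2, 15, 10, 2, tzinfo=timezone.utc)))
# prints <meta property="article:published_time" content="2018-02-15T10:02:00+00:00">
```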
Now that we’ve debunked this myth, it’s time to take off those tin foil hats.