The few clues to what caused it point to a problem from within
DAVID TUFFLEY
Suddenly and inexplicably, Facebook, Instagram, WhatsApp, Messenger and Oculus services were gone. And it was no local disturbance. In a blog post, Downdetector.com, a major monitoring service for online outages, called it the largest global outage it had ever recorded, with 10.6 million reports from around the world.
The outage had an especially massive knock-on effect on individuals and businesses around the world that rely on WhatsApp to communicate with friends, family, colleagues and customers.
It took Facebook nearly six hours to get services back online, albeit slowly at first. Ironically, the outage was so pervasive Facebook had to resort to using Twitter, its rival platform, to get updates out into the world.
The internet, along with its outwardly visible face (the World Wide Web), is a remarkably fault-tolerant machine. It was designed to be resilient, and the web has never gone down completely. As such, global outages like this one are quite rare.
But they do happen. To Google’s embarrassment, several of its services including Gmail, YouTube, Hangouts, Google Calendar and Google Maps went offline for about an hour in December last year.
And in June this year, a cloud-computing company that services clients such as the Guardian, the New York Times, Reddit and The Conversation went offline too.
What caused it?
While Facebook’s management was apologetic, they gave no hint as to what caused the outage.
With hacking issues becoming all too common in today’s cyber-security threat environment, the question arises whether Facebook’s outage might have been the result of a successful hack. But this seems unlikely.
According to a report from The Verge referencing Facebook’s Chief Technology Officer and Vice President of Infrastructure, it seems the problem was probably Facebook’s internal infrastructure.
Facebook engineers were sent to one of the company’s data centres in California to work on the problem, which implies they were unable to log in remotely to the data centre.
Experts have said the outage could only have come from inside the company. It’s likely Facebook engineers inadvertently made changes to how the network is set up, creating a cascading set of problems.
Such events have happened before, albeit not with such a catastrophic effect.
However, given the highly confidential way Facebook operates its network, it’s not possible to know exactly what happened with the network configuration. We will probably never be told.
A Domain Name System problem
Supporting the network configuration explanation is the fact that the error messages that appeared when people tried to contact facebook.com and whatsapp.com indicated it was a DNS problem. So the websites still existed, but couldn’t be reached.
DNS stands for Domain Name System and is often described as the “phonebook of the internet”. It translates the domain names we read into numeric internet addresses (IP addresses) that computers use.
When you enter a domain name such as “facebook.com” or “whatsapp.com” into your browser, a DNS server is consulted and the corresponding numeric internet address, the IP address, is returned.
When everything is working as it should, the user is then connected to the requested domain. On the strength of evidence gleaned from expert sources close to Facebook, it seems most unlikely the outage was caused by an external attack.
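For readers who want to see the lookup step described above, here is a minimal Python sketch. It is an illustration only, not a description of how Facebook's systems work: the function name `lookup` and its `resolver` parameter are hypothetical, and the only real library call used is `socket.getaddrinfo` from the Python standard library.

```python
import socket


def lookup(hostname, resolver=socket.getaddrinfo):
    """Return the sorted unique IP addresses for hostname, or [] if the lookup fails.

    `resolver` is a hypothetical hook added for this sketch so the function
    can be tried without a network connection; by default it is the
    standard-library socket.getaddrinfo.
    """
    try:
        results = resolver(hostname, None)
    except socket.gaierror:
        # Name resolution failed: the site may still exist, but its
        # address could not be found.
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # the first element of sockaddr is the IP address.
    return sorted({entry[4][0] for entry in results})
```

On a working connection, calling `lookup("facebook.com")` should return one or more IP addresses. When the name cannot be resolved, the function returns an empty list, which matches the situation described here: the websites still existed but could not be reached.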
A whistleblower speaks up
The Facebook outage occurred only hours after the US-based 60 Minutes program aired an incendiary interview with former Facebook employee and whistleblower, 37-year-old Harvard graduate Frances Haugen.
In a complaint to federal law enforcement, and in the interview, Haugen alleges Facebook’s Instagram app is harming teenage girls, and that Facebook’s own research indicates the company “amplifies hate, misinformation and political unrest, but the company hides what it knows”.
To support the allegations, Haugen shared more than 10,000 pages of internal documentation with the U.S. Securities and Exchange Commission — all pretty damning stuff.
She said: “The thing I saw at Facebook over and over again was there were conflicts of interest between what was good for the public and what was good for Facebook, and Facebook over and over again chose to optimise for its own interests, like making more money”.
Given the timing of the interview and Facebook’s global outage, it’s natural to wonder whether the two events are connected. However, in the absence of any definitive evidence to support this theory, no causal link between the two events has been established.
But considering the seriousness of Haugen’s allegations, and the weight of objective evidence in the form of thousands of insider documents, it’s clear further investigation is warranted.
Facebook has around 2.89 billion monthly active users and a market capitalisation of US$1.21 trillion. By any standard, it’s a big and powerful company with a great deal of influence. Now is the time to shine a light on its ethics, or lack thereof. Hopefully there won’t be any more outages to slow down this process.
David Tuffley is Senior Lecturer in Applied Ethics & CyberSecurity, Griffith University