— Twitter VP of Engineering Mazen Rawashdeh, in a blog post published yesterday, apologizing for the outage and explaining what went wrong:
The cause of today’s outage came from within our data centers. Data centers are designed to be redundant: when one system fails (as everything does at one time or another), a parallel system takes over. What was noteworthy about today’s outage was the coincidental failure of two parallel systems at nearly the same time.
I wish I could say that today’s outage could be explained by the Olympics or even a cascading bug. Instead, it was due to this infrastructural double-whammy.
Rawashdeh goes on to say, “We are investing aggressively in our systems to avoid this situation in the future.” During the outage, some had speculated that a surge in Olympics-related tweets, or a bug similar to the one that caused Twitter to crash in late June, was to blame.