Yesterday’s hours-long Amazon Web Services (AWS) outage provided a vivid illustration of how heavily large parts of the Internet depend on the cloud service. It also puzzled many users: because the AWS health dashboard itself depends on the affected cloud service, its status messages showed no sign of trouble throughout the outage.
Now resolved, yesterday’s outage of Amazon’s S3 (Simple Storage Service) cloud-based object storage service left many Web sites inaccessible or slow to load for several hours. Affected sites and services included Adobe, Coursera, Cracked, Imgur, Mailchimp, Medium, Quora, Slack and Trello, as well as Internet health-tracking sites such as Downdetector and Is It Down Right Now.
Amazon describes S3 as an “object storage with a simple Web service interface to store and retrieve any amount of data from anywhere on the Web.” Used by more than 150,000 Web sites, S3 is designed for up to 99.99 percent availability, which leaves room for the service to be unavailable 0.01 percent of the time, or one part in ten thousand. Yesterday’s outage showed what that small fraction of non-availability can look like in practice.
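As a rough illustration of what the 99.99 percent figure quoted above implies, the short Python sketch below converts an availability target into the downtime it would permit. The function name allowed_downtime_minutes is hypothetical and not part of any AWS API, and treating the design target as a budget spread over one non-leap year is an assumption made only for this example.

```python
# Illustrative arithmetic only: converts an availability target into the
# downtime it would permit. The 99.99 percent figure is the one quoted in
# the article; the one-year period is an assumption for this sketch.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def allowed_downtime_minutes(availability_percent: float,
                             period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Return the minutes of downtime permitted by an availability target."""
    unavailable_fraction = (100 - availability_percent) / 100
    return unavailable_fraction * period_minutes


yearly = allowed_downtime_minutes(99.99)
print(f"{yearly:.1f} minutes per year")  # prints "52.6 minutes per year"
```

Under those assumptions, a 99.99 percent target allows a little under an hour of downtime per year, so an outage lasting several hours would exceed that yearly figure on its own.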
Problem at Virginia Data Center
While Amazon’s cloud service health dashboard gave no indication of trouble, AWS noted on its Twitter account yesterday morning that S3 was “experiencing high error rates” and that the company was working to resolve them. Because the S3 issue kept the dashboard from changing its status colors to show an alert, Amazon also posted updates in a banner at the top of the Web page.
By 1:49 p.m. PST, all S3 services for object retrieval, listing, deletion and addition had recovered and were working normally, Amazon said. The company said that the outage was traced to its US-EAST-1 gateway location, which is its data center in northern Virginia.
During the outage, Twitter became the place for various AWS customers and others to share information as well as to vent and post humorous items about…