Amazon’s AWS Cloud Restored After Global Internet Blackout Disrupted Thousands

Amazon Web Services returned to normal operations after a massive cloud failure took down major apps and websites globally for nearly 15 hours, affecting millions.

Amazon Web Services returned to normal operations late Monday evening after a massive cloud computing failure took down huge chunks of the internet worldwide. The disruption affected millions of users and businesses across continents for nearly 15 hours, making it one of the largest technology breakdowns since last year’s CrowdStrike incident.

What Happened During the Outage

Around 3 a.m. Eastern Time on Monday morning, people across America and Europe suddenly couldn’t access their favorite apps and websites. Snapchat wouldn’t load, Fortnite players got kicked out of games, and even Amazon’s own shopping site struggled. The problem started at an AWS data center in northern Virginia, which powers a big portion of the internet’s behind-the-scenes operations.

AWS first admitted something went wrong at 12:11 a.m. Pacific Time. Its health dashboard showed “increased error rates and latencies” for services in the US-EAST-1 region. What started as a technical hiccup quickly spiraled into a full-blown crisis affecting thousands of companies globally.

The root cause turned out to be DNS resolution problems with DynamoDB, Amazon’s database system. DNS works like the internet’s phone book: it translates the names computers ask for into the addresses they actually connect to. When that lookup stops working properly, everything that depends on those services breaks down, even if the servers themselves are still running.
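To make the phone-book idea concrete, here is a minimal Python sketch of what a client goes through before it can talk to any service. It is an illustration only, not AWS's code: the function names `resolve` and `connect_target` are made up for this example, and the `lookup` parameter exists only so the DNS step can be swapped out for a stand-in.

```python
import socket


def resolve(hostname, lookup=socket.getaddrinfo):
    """Return the IP addresses DNS gives for hostname, or [] if the lookup fails.

    `lookup` defaults to the standard library resolver; it is a parameter
    here (an assumption of this sketch) so a failure can be simulated.
    """
    try:
        results = lookup(hostname, 443, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # The name could not be resolved: the "phone book" has no usable entry.
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address.
    return sorted({info[4][0] for info in results})


def connect_target(hostname, lookup=socket.getaddrinfo):
    """Pick an address to connect to, or fail if the name cannot be resolved."""
    addresses = resolve(hostname, lookup)
    if not addresses:
        # The server may be perfectly healthy, but without an address
        # the client has no way to reach it.
        raise ConnectionError(f"cannot resolve {hostname}")
    return addresses[0]
```

The point of the sketch is the second function: once name resolution fails, the client gives up before it ever contacts the service, which is why a DNS fault can look like a total outage from the outside.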

Which Services Got Hit

The outage didn’t discriminate. Gaming platforms like Fortnite, Roblox, Clash of Clans and Rainbow Six Siege all went dark. Social media users found Snapchat frozen, with over 22,000 outage reports flooding Downdetector at peak. Popular apps like Duolingo, Canva, and Wordle stopped responding entirely.

Financial services took major hits too. Robinhood, the stock trading app, left investors unable to make trades during market hours. Cryptocurrency exchange Coinbase confirmed AWS issues prevented users from accessing their accounts, though it assured everyone their funds remained safe. Payment service Venmo experienced disruptions that stopped people from sending money.

Even Amazon’s own products weren’t spared. Ring doorbell cameras stopped recording, Alexa voice assistants went silent, and Prime Video subscribers couldn’t stream their shows. The irony wasn’t lost on frustrated customers – AWS couldn’t even keep Amazon’s own services running.

Airlines Delta and United faced system problems. UK banks including Lloyds and Bank of Scotland reported difficulties. The Signal messaging app confirmed outages linked directly to AWS. Perplexity AI’s CEO Aravind Srinivas posted on X that his entire platform went down because of AWS problems.

The Long Road to Recovery

Amazon’s engineers worked through the night trying to fix things. By 5:22 a.m. Eastern Time, AWS reported that it had applied “initial mitigations” and saw “early signs of recovery”. But that didn’t mean everything magically worked again. The company warned customers about continued failures and delays.

Three hours after the outage began, AWS said most services had started recovering. However, lingering problems persisted throughout Monday. Some platforms like Reddit and Roblox stabilized quicker than others. Snapchat and Duolingo kept experiencing intermittent disruptions even after AWS claimed the main issue had been resolved.

By 6:35 a.m. ET, Amazon announced the “underlying DNS issue has been fully mitigated”. Still, they cautioned that some services faced backlogs requiring additional time to process. It wasn’t until 6 p.m. ET – nearly 15 hours after problems started – that AWS finally declared “services returned to normal operations” on their health dashboard.

The Financial Toll

Experts estimate this outage could cost businesses hundreds of billions of dollars. Mehdi Daoudi, CEO of internet monitoring company Catchpoint, told reporters the economic impact likely reached that staggering figure due to lost productivity and halted operations.

Analysis from Tenscope calculated major websites lose roughly $75 million per hour when down. Amazon alone potentially lost $72.8 million hourly. Snapchat faced losses around $612,000 per hour, while Zoom lost approximately $532,580 hourly. For smaller platforms like Reddit and Canva, costs still ran into hundreds of thousands per hour.
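For a rough sense of scale, the hourly figures above can be multiplied out over the length of the outage. The short Python sketch below does that under one loud assumption that is not in the Tenscope analysis: that each service was fully down for the entire 15-hour window. Since many platforms recovered well before AWS declared normal operations, the results are upper bounds, not estimates of actual losses.

```python
# Hourly loss figures quoted above (Tenscope), in US dollars.
HOURLY_LOSS_USD = {
    "Amazon": 72_800_000,
    "Snapchat": 612_000,
    "Zoom": 532_580,
}

# Assumption of this sketch: every service was down for the full window.
OUTAGE_HOURS = 15


def upper_bound_loss(hourly_loss, hours=OUTAGE_HOURS):
    """Hourly loss times outage length: a ceiling, not a measured loss."""
    return hourly_loss * hours


for name, hourly in HOURLY_LOSS_USD.items():
    print(f"{name}: ${upper_bound_loss(hourly):,}")
# Amazon: $1,092,000,000
# Snapchat: $9,180,000
# Zoom: $7,988,700
```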

Over 1,000 companies reported disruptions affecting their operations. Downdetector received more than 11 million user reports during the incident. Internet speed testing firm Ookla confirmed at least 4 million users experienced problems globally.

Why This Keeps Happening

The US-EAST-1 region in northern Virginia has a history of causing widespread internet failures. Similar major outages hit that same region in 2017, 2020, 2021, and 2023. The December 2021 incident remains AWS’s worst outage ever, lasting over seven hours.

Cybersecurity experts say the problem stems from too many critical services depending on a single region. When one part fails, it creates a cascading effect across the entire internet. AWS provides infrastructure for approximately 30% of the global cloud market, making any disruption feel massive.
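The usual answer to single-region dependence is to let an application fall back to a second region when the first one fails. The Python sketch below shows the idea in its simplest form. It is a hypothetical illustration, not an AWS API: `call_with_failover` and the `request` callable are invented for this example, and a real deployment also needs its data replicated to the backup region ahead of time.

```python
def call_with_failover(regions, request):
    """Try `request(region)` in each region in order; return the first success.

    `request` is a hypothetical callable standing in for any network call.
    It is expected to raise ConnectionError when a region is unreachable.
    """
    failed = []
    for region in regions:
        try:
            return region, request(region)
        except ConnectionError:
            # This region is down or unreachable; move on to the next one.
            failed.append(region)
    raise ConnectionError(f"all regions failed: {failed}")


# Example with a stand-in request that simulates US-EAST-1 being down.
def fake_request(region):
    if region == "us-east-1":
        raise ConnectionError("simulated regional outage")
    return "ok"


print(call_with_failover(["us-east-1", "us-west-2"], fake_request))
# ('us-west-2', 'ok')
```

A service written this way keeps answering when its primary region goes dark. A service hard-wired to one region does not, which is the pattern the experts describe.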

Professor Oli Buckley from Loughborough University explained the situation bluntly: “A wide range of platforms have realised that Amazon is more than just shopping and streaming movies”. The concentration of internet infrastructure among a few providers like AWS, Google Cloud and Microsoft Azure creates inherent fragility.

What Amazon Says Now

Amazon hasn’t issued a detailed public statement beyond technical updates on its health dashboard. The company attributed the problems to a malfunctioning subsystem that monitors the network load balancers, which distribute traffic between servers. An internal problem ticket reviewed by reporters mentioned that “tons of internal services” needed resolution and repair.

AWS recommended that customers flush their DNS caches if they were still experiencing issues. It assured users there was no evidence that a cyberattack or malicious activity caused the failure. Experts agreed this looked like a regular operational outage rather than a security breach.

The incident serves as a harsh reminder of the modern internet’s vulnerability. When a single company controls such a massive portion of cloud infrastructure, even a minor technical glitch can disrupt daily life for millions worldwide.

(Source: aboutamazon)
