Eblogtip.com
Major Microsoft Azure outage was caused by a simple typo

  • June 5, 2023

A Microsoft Azure DevOps outage in the South Brazil Region, which lasted over 10 hours, was caused by a typo in the code that led to 17 production databases being deleted.

Having apologized to impacted customers for the outage, Microsoft has now issued a full post-mortem, sharing details of the investigation from when the outage was first noticed at 12:10 UTC on May 24 until it was resolved at 22:31 UTC the same day.

Microsoft principal software engineering manager Eric Mattingly shared details of the code base upgrade that formed part of Sprint 222. Hidden inside the pull request was a typo in the snapshot deletion job, which ended up deleting the Azure SQL Server rather than the individual Azure SQL Database.

Coding error

Mattingly explained: “when the job deleted the Azure SQL Server, it also deleted all seventeen production databases for the scale unit,” confirming that no data had been lost during the accidental process.
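To illustrate the class of mistake described above, here is a minimal sketch of a guard that refuses to run a delete unless the target is an individual database rather than its parent server. It is not Microsoft's code or fix: the function names `is_database_id` and `delete_snapshot_database` and the `delete_fn` callback are hypothetical, and the check relies only on the usual Azure resource ID layout, in which a database ID ends in `servers/<server>/databases/<database>`.

```python
def is_database_id(resource_id: str) -> bool:
    """Return True only if the ID points at a single SQL database,
    not at the parent SQL server (hypothetical helper)."""
    parts = [p for p in resource_id.strip("/").split("/") if p]
    if len(parts) < 6:
        return False
    # Expected tail: providers/Microsoft.Sql/servers/<server>/databases/<db>
    tail = [p.lower() for p in parts[-6:]]
    return (
        tail[0] == "providers"
        and tail[1] == "microsoft.sql"
        and tail[2] == "servers"
        and tail[4] == "databases"
    )


def delete_snapshot_database(resource_id: str, delete_fn) -> None:
    """Call delete_fn only when resource_id names a database.

    delete_fn is a stand-in for whatever performs the real deletion.
    """
    if not is_database_id(resource_id):
        raise ValueError(
            f"refusing to delete non-database resource: {resource_id}"
        )
    delete_fn(resource_id)
```

With this guard in place, passing a server-level ID raises an error before any deletion is attempted, so a mix-up between the two resource types fails loudly instead of cascading to every database on the server.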

The outage was detected within 20 minutes, at which point the company’s on-call engineers got to work. However, according to the event log, the root cause was not identified until 16:04, almost four hours after the outage had begun.

Microsoft blamed the fix time of more than 10 hours on the fact that customers themselves are unable to restore Azure SQL Servers, as well as backup redundancy complications and a “complex set of issues with [its] web servers.”

Having learned from its mistake, Microsoft has now promised to roll out Azure Resource Manager Locks to its key resources, in an effort to prevent future accidental deletion.
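The idea behind a delete lock can be sketched with a toy in-memory model. This is only an illustration of the concept, not the Azure Resource Manager API: the `Server` class, the `delete_locked` flag, and the `delete_server` function are all hypothetical names, and real locks are configured on the Azure resources themselves.

```python
from dataclasses import dataclass, field


class ResourceLockedError(Exception):
    """Raised when a delete is attempted on a locked resource."""


@dataclass
class Server:
    # Toy stand-in for a SQL server that owns a set of databases.
    name: str
    databases: list = field(default_factory=list)
    delete_locked: bool = False


def delete_server(fleet: dict, name: str) -> list:
    """Remove a server from the fleet and return the databases lost with it.

    Deleting the server takes every database it owns along with it,
    unless the server carries a delete lock.
    """
    server = fleet[name]
    if server.delete_locked:
        raise ResourceLockedError(f"{name} is locked against deletion")
    del fleet[name]
    return list(server.databases)
```

In this model, a job that mistakenly targets a locked server fails with `ResourceLockedError` and the databases stay in place. The lock has to be removed deliberately before the delete can go through.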

Despite a same-day fix, customers in the region were left without access to some services for several hours, emphasizing how easy it is for things to go wrong and the importance of having backup plans to reduce reliance on single service providers, including cloud storage and other off-prem infrastructure.

