Azure stumbles in Western Europe, Microsoft blames 'thermal event'

Microsoft has warned of a “thermal event” impacting Azure users in its West Europe region, and perhaps elsewhere.

A status update time-stamped 2249 UTC on November 5th advises that as of 1700 UTC on the same day, “a subset of customers in the West Europe region may experience service disruptions or degraded performance across multiple services, including Virtual Machines, Azure Database for PostgreSQL Flexible Servers, MySQL Flexible Servers, Azure Kubernetes Service, Storage, Service Bus, and Virtual Machine Scale Sets, among others.”

Microsoft also warns that users of Azure Databricks in West Europe “may see degraded performance when launching or scaling all-purpose and jobs compute workloads, impacting Unity Catalog and Databricks SQL operations.”

The West Europe region is in the Netherlands.

The Windows giant blamed the problems on “a thermal event affecting datacenter cooling systems, which led to a subset of storage scale units going offline in a single availability zone.”

Microsoft said the incident occurred after automated monitoring systems detected a spike in hardware temperatures and related service incidents across multiple storage scale units.

The company said one impacted storage scale unit has recovered, recovery efforts are in progress on others, and it expects to see signs of recovery on those units “in approximately 90 minutes.”

Microsoft has also warned that “Resources in other availability zones that depend on these storage units may also be impacted.”

That’s perhaps the least welcome aspect of this incident, because hyperscale cloud operators try to enhance resilience by building multiple datacenters in close proximity. Cloud operators call those clusters of datacenters “regions” and each discrete facility is an “availability zone.” Hyperscalers advise customers to rent resources in multiple availability zones, so that if one experiences trouble, the result is not catastrophic.

This incident shows that spreading resources across availability zones is no guarantee of resilience. ®

Source: The Register
