Channel: Microsoft Dynamics 365 Community

Post Incident Report | NAM | CRM Online | A small subset of organizations in North America inaccessible on November 24, 2014

Summary

On November 24, 2014, some CRM Online organizations hosted in one of our North American data centers went offline. Monitoring detected the issue, and Dynamics Service Engineering followed established troubleshooting processes to investigate and fix it. Less than 1% of customers in North America were affected.

Customer Impact

For part of the incident, affected customers would have experienced very slow load times or timeouts when trying to access their CRM Online organization.

Incident Start Date and Time

November 24, 2014 9:52 PM PST

Date and Time Service was Restored

November 24, 2014 10:12 PM PST

Root Cause

One of the SQL servers in a single cluster began to experience higher-than-normal CPU utilization, which caused slow performance. The Service Engineering team received an alert for this condition and began failing over some availability groups to alternate database servers to relieve the high CPU load. Unfortunately, some of the availability groups did not fail over cleanly, and the customers in those groups experienced a brief outage until the groups were brought back online.
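
The mitigation described above can be illustrated with a small sketch. This is not the Service Engineering team's tooling. It is a hypothetical in-memory model, written in Python, of moving availability groups off an overloaded server and recording which ones did not come back online. The names AvailabilityGroup, relieve_server, and attempt_failover, and the server names in the example, are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class AvailabilityGroup:
    """Hypothetical stand-in for one availability group."""
    name: str
    primary: str          # server currently hosting the group
    online: bool = True


def relieve_server(groups, hot_server, target, attempt_failover):
    """Try to move every group hosted on hot_server to target.

    attempt_failover(group, target) is a caller-supplied callable
    (an assumption of this sketch) that returns True when the group
    comes online on the target server.
    Returns (moved, failed) lists of group names.
    """
    moved, failed = [], []
    for group in groups:
        if group.primary != hot_server:
            continue  # group is not on the overloaded server
        group.online = attempt_failover(group, target)
        if group.online:
            group.primary = target
            moved.append(group.name)
        else:
            # Did not fail over cleanly: must be brought back online.
            failed.append(group.name)
    return moved, failed


# Example: three groups, one of which fails over uncleanly.
groups = [
    AvailabilityGroup("ag1", "sql-a"),
    AvailabilityGroup("ag2", "sql-a"),
    AvailabilityGroup("ag3", "sql-b"),
]
moved, failed = relieve_server(
    groups, "sql-a", "sql-c",
    attempt_failover=lambda group, target: group.name != "ag2",
)
print(moved, failed)
```

In this model, the groups listed in failed correspond to the customers who saw the brief outage: they stay offline until someone brings them back online, while the groups in moved continue to be served from the alternate server.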

Next Step(s)

Issue: Not all databases failed over cleanly
Next Step: Investigation through logs and other data of why databases did not fail over.
Team Owner: Microsoft Dynamics CRM Online Service Engineering
Timeline: Underway

Issue: High CPU utilization
Next Step: Investigate the cause of the high CPU utilization on the single SQL server and what can be done to protect the service should it happen again.
Team Owner: Microsoft Dynamics CRM Online Service Engineering
Timeline: Underway

