(Resolved) 2018-08-07 03:00 (UTC) Service degradation affecting delivery of Notifications for Alert Rules

Last updated: Tue Aug 07 09:16:09 GMT 2018

Between 2018-08-06 21:30 UTC and 2018-08-07 03:00 UTC, the ThousandEyes platform experienced delays in delivering all forms of Notifications (email, Webhooks and Integrations) for Alert Rules.

Affected scope

All Notifications for Alert Rules were delayed during the period of the issue.

Root Cause

Notifications were created normally, but were held in a queue instead of being dispatched immediately. The queueing service had been restarted outside of the configuration management system, so it started without the correct configuration.

Our internal monitoring for notifications in the queue did trigger an alert, but the alert was sent to a lower-priority channel because it had been noisy in the past. We will address the alert sensitivity issue to ensure this situation is not repeated.
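
As an illustration only, one common way to keep a queue alert from being noisy without demoting it entirely is to escalate only when the backlog is sustained. The sketch below is not a description of the ThousandEyes monitoring system: the names QueueSample and choose_channel, the channel labels, and the threshold values are all hypothetical and chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QueueSample:
    """One monitoring sample of the notification queue (hypothetical shape)."""
    timestamp: float           # seconds since epoch when the sample was taken
    oldest_age_seconds: float  # age of the oldest undelivered notification


def choose_channel(
    samples: List[QueueSample],
    age_threshold: float = 300.0,  # assumed threshold, for illustration
    sustained_samples: int = 3,    # assumed number of consecutive breaches
) -> Optional[str]:
    """Pick an alert channel from recent queue samples.

    Returns "page" when the last `sustained_samples` samples all exceed the
    threshold, "low-priority" when only the latest sample does, and None
    when the queue looks healthy.
    """
    if not samples:
        return None

    recent = samples[-sustained_samples:]
    sustained = len(recent) == sustained_samples and all(
        s.oldest_age_seconds > age_threshold for s in recent
    )
    if sustained:
        return "page"

    # A single breach may be a transient spike, so keep it off the pager.
    if samples[-1].oldest_age_seconds > age_threshold:
        return "low-priority"
    return None
```

With this approach a brief spike in queue age stays on the lower-priority channel, while a backlog that persists across several consecutive samples is escalated.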

Status

The issue has been resolved.

Event Timeline

2018-08-07 03:00 UTC: Issue resolved. Notifications dispatch services were restored and queued notifications were delivered.
2018-08-06 21:30 UTC: Notifications dispatch services were delayed and notifications were held in queue.