Summary:
Endpoint Agent data services were partially unavailable. The issue is now resolved.
Impact:
Endpoint Agent data reads were impacted by this issue. As a result, Endpoint Agent data was partially unavailable in dashboards, views, scheduled tests, and the API.
The Endpoint Agent controller (c1.eb.thousandeyes.com) was also impacted. During this outage, Endpoint Agents were not able to receive new scheduled test, label, or monitored domain assignments. The platform reported Agents as unseen during the outage, which affected the reported "last seen" time. New Endpoint Agents were also not able to complete the registration process during this time.
Endpoint Agent data submission was not affected by this outage.
The issue is now resolved.
Timeline:
2020-10-12 13:53 UTC: Issue first observed
2020-10-12 15:45 UTC: Issue has been identified. Operations applied a fix and we are currently monitoring performance.
2020-10-12 16:24 UTC: Endpoint Agent controller service has been restored. Endpoint Agents may experience delays when receiving new scheduled test, label, or monitored domain assignments.
2020-10-12 17:02 UTC: Endpoint Agent services have been restored, but users may still observe a delay in Endpoint Agent data availability.
2020-10-12 17:32 UTC: Endpoint Agent services have been restored but we are still observing a delay in Endpoint Agent data availability of up to 90 minutes.
2020-10-12 19:48 UTC: We are still observing a delay in Endpoint Agent data availability of up to 120 minutes.
2020-10-12 22:29 UTC: We are still observing a delay in Endpoint Agent data availability of up to 150 minutes.
2020-10-12 22:29 UTC: Endpoint Agent services have been restored but we are still observing a delay in Endpoint Agent data availability.
2020-10-12 23:40 UTC: Since the changes have been applied, we are observing the delay decreasing. Data availability is anticipated to return to normal by 01:30 UTC 2020-10-13.
2020-10-13 01:27 UTC: Instant test services have been restored, while scheduled tests are still experiencing a delay in Endpoint Agent data availability.
2020-10-13 02:36 UTC: This incident has been resolved.
Please visit status.thousandeyes.com for real-time updates.
(Resolved) 2020-10-12 13:53 UTC: Service degradation affecting Endpoint Agent data availability
Last updated: Tue Oct 13 12:30:30 GMT 2020