On May 9–10, Synoptic experienced elevated realtime data latency caused by an unexpected interaction between database maintenance activities and production data processing systems. During this period, some (not all) realtime observations were delayed by up to approximately 20 minutes.
The issue originated during planned database optimization and migration work that produced higher-than-expected load on portions of the production environment, degrading write performance and downstream data processing workflows. Although automated monitoring alarms were triggered, the response was not prioritized for the following combination of reasons:
Once the issue was identified, Synoptic engineering teams immediately implemented mitigation measures, including halting the affected replication processes, restarting impacted services, and scaling database resources. Realtime processing performance returned to normal after these steps were completed.
We are conducting a full post-incident review and are already implementing several improvements, including:
We apologize for the disruption and appreciate our customers’ patience while the issue was resolved.