Degraded

Partial Outage on the Calculation / Pricing Endpoints

Apr 12 at 02:00am CEST
Affected services
Production
Sandbox

Resolved
Apr 12 at 08:45am CEST

The issue has been resolved.

Created
Apr 12 at 02:00am CEST

This is a regression of https://status.squake.earth/incident/354138.

Summary

Requesting a price quote from the API failed for some products. Two endpoints were affected: the pricing endpoint, and the calculation endpoint when the request included a price quote.

The root cause was faulty metadata in the products table, combined with the API not handling such faulty data gracefully when it was synced from another source.

An initial data fix, applied within ten minutes of the incident, resolved the problem; however, the problem returned overnight when a cron job synced the faulty data back.

A permanent solution that makes the API more robust against such failures has been rolled out. At a high level, it included:

  • better isolation of the most critical endpoints for calculation and pricing at the code level, to minimize side effects
  • improved robustness at all touch points where we synchronize data from external sources, so that potentially faulty data is handled more gracefully
  • an internal post-mortem meeting to share learnings across the entire engineering team
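
As an illustration of the second point, the following is a minimal sketch of a sync step that validates each incoming record and keeps the last known good row when a record is faulty. It is not the actual implementation; the names Product, parse_product and sync_products, as well as the fields id, price_per_unit and currency, are assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Product:
    # Hypothetical product row; the real table layout is not described here.
    product_id: str
    price_per_unit: float
    currency: str


def parse_product(raw: Any) -> Product:
    """Validate one external record; raise ValueError if its metadata is unusable."""
    if not isinstance(raw, dict):
        raise ValueError("record is not an object")
    product_id = raw.get("id")
    if not isinstance(product_id, str) or not product_id:
        raise ValueError("missing product id")
    price = raw.get("price_per_unit")
    # bool is a subclass of int, so it is excluded explicitly.
    # The range check also rejects NaN and infinity.
    if (
        isinstance(price, bool)
        or not isinstance(price, (int, float))
        or not (0 <= price < float("inf"))
    ):
        raise ValueError(f"invalid price for {product_id}")
    currency = raw.get("currency")
    if not isinstance(currency, str) or len(currency) != 3:
        raise ValueError(f"invalid currency for {product_id}")
    return Product(product_id, float(price), currency.upper())


def sync_products(
    existing: dict[str, Product], incoming: list[Any]
) -> tuple[dict[str, Product], list[str]]:
    """Merge valid incoming records; faulty ones never overwrite existing rows."""
    merged = dict(existing)
    rejected: list[str] = []
    for raw in incoming:
        try:
            product = parse_product(raw)
        except (ValueError, OverflowError) as exc:
            # Record the problem and move on instead of failing the whole sync.
            rejected.append(str(exc))
            continue
        merged[product.product_id] = product
    return merged, rejected
```

With this shape, a scheduled sync that receives a faulty record leaves the previously stored product untouched and returns the rejection reasons, which could then be logged or alerted on.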