Partial Outage on the Calculation / Pricing Endpoints
Resolved
Apr 12 at 08:45am CEST
The issue has been resolved.
Affected services
Production
Sandbox
Created
Apr 12 at 02:00am CEST
This is a regression of https://status.squake.earth/incident/354138.
Summary
Requests for a price quote from the API failed for some products. Two endpoints were affected: the pricing endpoint, and the calculation endpoint when the request included pricing.
The root cause was faulty metadata in the products table, combined with the API not handling such faulty data gracefully when syncing it from another source.
An initial data fix, applied within ten minutes of the incident, resolved the problem; however, the issue regressed overnight when a cron job synced the faulty data back.
A permanent fix that makes the API more robust against such failures has been rolled out. At a high level, it included:
- better isolation of the most critical endpoints for calculation and pricing at the code level, to minimize side effects
- improved robustness at all touch points where we synchronize data from external sources, so that potentially faulty data is handled more gracefully
- an internal post-mortem meeting to share learnings across the entire engineering team
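
For readers who want a concrete picture of the second point, the following is a minimal sketch of what handling faulty data gracefully during a sync can look like. It is not the actual implementation: the `Product` type, the `price_per_tonne` field, and the `parse_product` and `sync_products` functions are hypothetical names chosen for illustration. The idea is to validate each incoming record on its own and to skip records that fail validation, so the last known good value stays in place and one bad row cannot break the whole sync.

```python
import math
from dataclasses import dataclass
from typing import Any, Iterable


@dataclass(frozen=True)
class Product:
    # Hypothetical shape of a product record; field names are assumptions.
    id: str
    price_per_tonne: float


def parse_product(raw: dict[str, Any]) -> Product:
    """Validate one raw record; raise ValueError if it is unusable."""
    product_id = raw.get("id")
    if not isinstance(product_id, str) or not product_id:
        raise ValueError("missing or invalid id")
    try:
        price = float(raw.get("price_per_tonne"))
    except (TypeError, ValueError):
        raise ValueError(f"product {product_id}: price_per_tonne is not a number")
    # Reject NaN, infinities, zero, and negative values.
    if not (math.isfinite(price) and price > 0):
        raise ValueError(f"product {product_id}: price_per_tonne out of range")
    return Product(id=product_id, price_per_tonne=price)


def sync_products(
    raw_records: Iterable[dict[str, Any]],
    store: dict[str, Product],
) -> list[str]:
    """Upsert valid records into `store`; skip faulty ones.

    A skipped record leaves any previously stored value untouched.
    Returns the list of rejection reasons so they can be logged or alerted on.
    """
    rejected: list[str] = []
    for raw in raw_records:
        try:
            product = parse_product(raw)
        except ValueError as exc:
            rejected.append(str(exc))
            continue
        store[product.id] = product
    return rejected
```

In this sketch, a scheduled sync that receives a faulty record for an existing product would report the rejection and keep serving the previous, valid data instead of overwriting it.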