Incidents | SQUAKE.earth GmbH Incidents reported on status page for SQUAKE.earth GmbH https://status.squake.earth/ https://d1lppblt9t2x15.cloudfront.net/logos/933609b44c8df1e31bf3b738ffc7f3cc.svg Incidents | SQUAKE.earth GmbH https://status.squake.earth/ en Production recovered https://status.squake.earth/ Mon, 30 Jun 2025 09:27:29 +0000 https://status.squake.earth/#3aa05ddee42cb205ed9db39f7f24fd794b328606e22c16184f43270ea9172954 Production recovered Production went down https://status.squake.earth/ Mon, 30 Jun 2025 09:26:39 +0000 https://status.squake.earth/#3aa05ddee42cb205ed9db39f7f24fd794b328606e22c16184f43270ea9172954 Production went down Production recovered https://status.squake.earth/ Mon, 30 Jun 2025 09:01:39 +0000 https://status.squake.earth/#b85d6c1b6d33ca53a311ccb7b20410bfb6bb28b308a3f22309887d4bca4f3515 Production recovered Production went down https://status.squake.earth/ Mon, 30 Jun 2025 08:59:46 +0000 https://status.squake.earth/#b85d6c1b6d33ca53a311ccb7b20410bfb6bb28b308a3f22309887d4bca4f3515 Production went down Production recovered https://status.squake.earth/ Wed, 11 Jun 2025 11:34:23 +0000 https://status.squake.earth/#8d4fcb4c69e7fe04f464760a81945440fd17bd911d5c75e602205cea2d6b2a49 Production recovered Sandbox recovered https://status.squake.earth/ Wed, 11 Jun 2025 11:34:15 +0000 https://status.squake.earth/#d735ec6d20b16be5cb60e15f6dc58a0a93958266ff2f722b11500ba1f6b803fc Sandbox recovered Production went down https://status.squake.earth/ Wed, 11 Jun 2025 11:33:24 +0000 https://status.squake.earth/#8d4fcb4c69e7fe04f464760a81945440fd17bd911d5c75e602205cea2d6b2a49 Production went down Sandbox went down https://status.squake.earth/ Wed, 11 Jun 2025 11:33:15 +0000 https://status.squake.earth/#d735ec6d20b16be5cb60e15f6dc58a0a93958266ff2f722b11500ba1f6b803fc Sandbox went down Production recovered https://status.squake.earth/ Fri, 30 May 2025 12:46:29 +0000 https://status.squake.earth/#ef3bc235bb421c14e435ee730b3387aabc3a71228b93e8ffce7494e2076a5db7 
Production recovered Production went down https://status.squake.earth/ Fri, 30 May 2025 12:42:58 +0000 https://status.squake.earth/#ef3bc235bb421c14e435ee730b3387aabc3a71228b93e8ffce7494e2076a5db7 Production went down Production recovered https://status.squake.earth/ Fri, 16 May 2025 08:52:22 +0000 https://status.squake.earth/#e01d0796d4304d492af3276c9a24c57fb34b042e2c149f91b0d1f699740e5450 Production recovered Production went down https://status.squake.earth/ Fri, 16 May 2025 08:51:59 +0000 https://status.squake.earth/#e01d0796d4304d492af3276c9a24c57fb34b042e2c149f91b0d1f699740e5450 Production went down Production recovered https://status.squake.earth/ Fri, 16 May 2025 08:44:08 +0000 https://status.squake.earth/#45125b8afeeafeefc1e8712399a3c100ae512931d3cfaa04f28ba05f1d73cc60 Production recovered Production went down https://status.squake.earth/ Fri, 16 May 2025 08:40:59 +0000 https://status.squake.earth/#45125b8afeeafeefc1e8712399a3c100ae512931d3cfaa04f28ba05f1d73cc60 Production went down Production recovered https://status.squake.earth/ Thu, 15 May 2025 16:14:36 +0000 https://status.squake.earth/#13e6cb9fdb8d75eebf508b7b19f3da48695cf5910299c144ba88356159b8ed7d Production recovered Production went down https://status.squake.earth/ Thu, 15 May 2025 16:10:25 +0000 https://status.squake.earth/#13e6cb9fdb8d75eebf508b7b19f3da48695cf5910299c144ba88356159b8ed7d Production went down API Degraded performance https://status.squake.earth/incident/419516 Wed, 19 Mar 2025 16:15:00 -0000 https://status.squake.earth/incident/419516#250718c059da7fa1471827df51de9a626aaa25f6a97c657fd358eaab2c8c8cc2 ## Post-Mortem: EU Region Request Timeout Incident ### Incident Overview **Incident Start:** 19th March 2025, 15:15 UTC **Incident End:** 19th March 2025, 16:15 UTC **Impact:** The incident caused timeouts for 4.5% of incoming requests, primarily affecting the EU region. The majority of affected requests were to the `/v2/calculations` endpoint. No security concerns were identified. 
### Timeline

- 19th March 2025, 15:15 UTC: The incident began immediately following a deployment, with a series of requests timing out in the EU region.
- 19th March 2025, 15:17 UTC: The issue was acknowledged, and the tech lead was engaged to investigate.
- 19th March 2025, 15:42 UTC: A potential root cause was identified tied to high CPU usage and cache reloading post-release.
- 19th March 2025, 16:02 UTC: An immediate fix was deployed by reducing containers and disabling the cache reload worker.
- 19th March 2025, 16:15 UTC: The issue was fully resolved after containers were shuffled and the request backlog cleared.

### Root Cause Analysis

The incident was triggered by a spike in traffic and high CPU usage immediately following a release. The root cause was traced to the cache reload process across containers, which overwhelmed the Puma queue. This led to timeouts as the system struggled to handle the simultaneous cache reload and incoming requests. Metrics showed no suspicious activity beyond the CPU and traffic spikes, confirming the issue stemmed from the cache reload mechanism during the deployment.

### Resolution and Recovery

To resolve the incident:

- The number of running containers was reduced to alleviate database strain.
- The cache reload worker was disabled to prevent further queuing.
- All pending requests were allowed to process fully.
- The cache sync job was re-enabled once the spike subsided.

By 16:15 UTC, the system stabilized, and normal operation resumed in the EU region.

### Impact Assessment

Approximately 4.5% of requests timed out during the incident, primarily affecting the `/v2/calculations` endpoint in the EU region. The US region remained unaffected. While not a security threat, we acknowledge the disruption this caused for affected users. For assistance with impacted requests, such as re-running calculations, please contact the SQUAKE team.
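The root cause above, several cache reloads running at the same time as incoming requests, is the kind of contention a "one reload at a time" guard is meant to avoid. The following is a minimal Python sketch of that idea, not SQUAKE's implementation: the service itself runs on Puma and Sidekiq, and the names `CacheReloader`, `slow_load`, and `demo` are hypothetical. It also assumes a single process; coordinating reloads across containers would need a shared lock (for example in a database or key-value store), which is outside this sketch.

```python
import threading


class CacheReloader:
    """Hypothetical guard: at most one cache reload runs at a time."""

    def __init__(self, load_fn):
        self._load_fn = load_fn          # function that builds the cache
        self._lock = threading.Lock()
        self.cache = {}
        self.reload_count = 0

    def reload(self) -> bool:
        # Non-blocking acquire: if a reload is already in flight,
        # skip instead of queuing a second one behind it.
        if not self._lock.acquire(blocking=False):
            return False
        try:
            self.cache = self._load_fn()
            self.reload_count += 1
            return True
        finally:
            self._lock.release()


def demo():
    """Start one slow reload, then try a second one while it is running."""
    started = threading.Event()
    finish = threading.Event()

    def slow_load():
        started.set()              # signal that the reload has begun
        finish.wait(timeout=5)     # simulate an expensive reload
        return {"loaded": True}

    reloader = CacheReloader(slow_load)
    first = threading.Thread(target=reloader.reload)
    first.start()
    started.wait(timeout=5)

    skipped = reloader.reload()    # lock is held, so this call is skipped
    finish.set()
    first.join()
    return skipped, reloader.reload_count, reloader.cache
```

Calling `demo()` shows the second reload being skipped while the first is still in progress, so only one reload competes with request handling at any moment.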
### Preventive Measures To prevent recurrence, we are implementing the following: - Enhanced Observability: Improving tooling to monitor application performance and resource usage more effectively, especially during deployments. - Cache Reload Optimization: Revising the cache reload process to lock it during boot, preventing parallel job enqueuing by Sidekiq. - Load Testing: Conducting pre-release load tests to ensure consistent performance under varying traffic conditions, even if unrelated to the specific release. - Release Timing Strategy: Exercising greater caution in scheduling releases to avoid overlapping with peak traffic periods. We are committed to learning from this incident and enhancing the reliability of our services moving forward. Production recovered https://status.squake.earth/ Wed, 19 Mar 2025 14:54:42 +0000 https://status.squake.earth/#8e82d4fd3fdf7cea755702126bb366d335a86d6e6086c525d933e2955af20c11 Production recovered Production went down https://status.squake.earth/ Wed, 19 Mar 2025 14:49:13 +0000 https://status.squake.earth/#8e82d4fd3fdf7cea755702126bb366d335a86d6e6086c525d933e2955af20c11 Production went down Production recovered https://status.squake.earth/ Wed, 19 Mar 2025 14:42:44 +0000 https://status.squake.earth/#f8da8f5109aac3a7a325bd816271349ddab9b0bbfbc90eac82b3e2454c4728a1 Production recovered Production went down https://status.squake.earth/ Wed, 19 Mar 2025 14:37:26 +0000 https://status.squake.earth/#f8da8f5109aac3a7a325bd816271349ddab9b0bbfbc90eac82b3e2454c4728a1 Production went down Production recovered https://status.squake.earth/ Wed, 19 Mar 2025 14:34:47 +0000 https://status.squake.earth/#37fc9405d60269103bec929bf70c1e10eae0ca50b288ee19ce8baddf1a7e9acd Production recovered Production went down https://status.squake.earth/ Wed, 19 Mar 2025 14:30:41 +0000 https://status.squake.earth/#37fc9405d60269103bec929bf70c1e10eae0ca50b288ee19ce8baddf1a7e9acd Production went down Sandbox recovered https://status.squake.earth/ Wed, 12 Feb 
2025 02:57:27 +0000 https://status.squake.earth/#b240f2acd660011cfbc2b325ea1bdd4d9596dbc91d5dc826a6ee16e93ce29e05 Sandbox recovered Sandbox went down https://status.squake.earth/ Wed, 12 Feb 2025 02:56:05 +0000 https://status.squake.earth/#b240f2acd660011cfbc2b325ea1bdd4d9596dbc91d5dc826a6ee16e93ce29e05 Sandbox went down Homepage recovered https://status.squake.earth/ Fri, 11 Oct 2024 23:20:49 +0000 https://status.squake.earth/#15bb5d395c225638a92d30d04d0df21c6366689c30fa18b80deaf80224e813ea Homepage recovered Homepage went down https://status.squake.earth/ Fri, 11 Oct 2024 22:50:46 +0000 https://status.squake.earth/#15bb5d395c225638a92d30d04d0df21c6366689c30fa18b80deaf80224e813ea Homepage went down Production recovered https://status.squake.earth/ Mon, 26 Aug 2024 10:44:58 +0000 https://status.squake.earth/#4f9349b550f5bac99c33d5350b9d5415206464853b96d272755248d1796d31d0 Production recovered Production went down https://status.squake.earth/ Mon, 26 Aug 2024 10:43:09 +0000 https://status.squake.earth/#4f9349b550f5bac99c33d5350b9d5415206464853b96d272755248d1796d31d0 Production went down Production recovered https://status.squake.earth/ Mon, 26 Aug 2024 10:14:17 +0000 https://status.squake.earth/#3921671782dada62637cdefb06e5c0c97d24e29cde60d931402c4217404cb36a Production recovered Production went down https://status.squake.earth/ Mon, 26 Aug 2024 10:13:57 +0000 https://status.squake.earth/#3921671782dada62637cdefb06e5c0c97d24e29cde60d931402c4217404cb36a Production went down Production recovered https://status.squake.earth/ Mon, 26 Aug 2024 08:17:05 +0000 https://status.squake.earth/#1a15336665019ea5da60145f0baf64fc4be778764e3ae894fb8232036f28d823 Production recovered Production went down https://status.squake.earth/ Mon, 26 Aug 2024 08:16:14 +0000 https://status.squake.earth/#1a15336665019ea5da60145f0baf64fc4be778764e3ae894fb8232036f28d823 Production went down Production recovered https://status.squake.earth/ Mon, 26 Aug 2024 07:24:39 +0000 
https://status.squake.earth/#a98141970b914fe38f9389f4faa4ed5538aed0fc043f36e3a885c37469272577 Production recovered Production went down https://status.squake.earth/ Mon, 26 Aug 2024 07:23:42 +0000 https://status.squake.earth/#a98141970b914fe38f9389f4faa4ed5538aed0fc043f36e3a885c37469272577 Production went down Production recovered https://status.squake.earth/ Fri, 23 Aug 2024 09:26:27 +0000 https://status.squake.earth/#d7d9057e6383b536743409f19661d1303b46d6d799731f6161f66a8ac8beb581 Production recovered Production went down https://status.squake.earth/ Fri, 23 Aug 2024 09:23:28 +0000 https://status.squake.earth/#d7d9057e6383b536743409f19661d1303b46d6d799731f6161f66a8ac8beb581 Production went down Production recovered https://status.squake.earth/ Fri, 23 Aug 2024 09:12:28 +0000 https://status.squake.earth/#b182839364332137948e76b6f12cb9ce44d2105a390743f9f0dbd9b261564758 Production recovered Production went down https://status.squake.earth/ Fri, 23 Aug 2024 09:04:28 +0000 https://status.squake.earth/#b182839364332137948e76b6f12cb9ce44d2105a390743f9f0dbd9b261564758 Production went down Production recovered https://status.squake.earth/ Fri, 23 Aug 2024 08:30:21 +0000 https://status.squake.earth/#47bc71db1b64383625bb7ab89d4e8456c2a6c61861b2e324810eae92c1ab95ca Production recovered Production went down https://status.squake.earth/ Fri, 23 Aug 2024 08:20:39 +0000 https://status.squake.earth/#47bc71db1b64383625bb7ab89d4e8456c2a6c61861b2e324810eae92c1ab95ca Production went down API Degraded performance https://status.squake.earth/incident/419516 Fri, 23 Aug 2024 07:45:00 -0000 https://status.squake.earth/incident/419516#12ce6e525042d054dc23b4f795506d54b4fd4c55f3f48231ab44c04486b235b6 ## **Incident Overview** **Incident Start:** 22nd August 2024, 13:28 UTC **Incident End:** 23rd August 2024, 09:45 UTC **Impact:** API requests were met with 504 errors, primarily affecting traffic originating in the EU. 
The incident resulted in a failure to process 2.5% of incoming requests during the affected period.

## **Timeline**

- **22nd August 2024, 13:28 UTC:** The first signs of the issue were observed when response times drastically increased. In response, we scaled horizontally by increasing the number of containers, which temporarily mitigated the issue.
- **23rd August 2024, 07:45 UTC:** Despite the horizontal scaling, the issue resurfaced as requests were spread across the machines, eventually overwhelming the system again. At this point, we identified that the root cause was insufficient compute power in our containers.
- **23rd August 2024, 09:45 UTC:** To address the issue, we scaled vertically by adding additional compute power to the machines, which successfully resolved the incident.

## **Root Cause Analysis**

The outage was triggered by the launch of a new methodology (GATE4 methodology) that significantly increased memory consumption due to the large dataset it processed. Simultaneously, we experienced an unexpected traffic spike, which initially complicated the identification of the exact root cause. The combined effect of increased memory usage and traffic overwhelmed our system, leading to 504 errors.

## **Resolution and Recovery**

Once the root cause was identified as a lack of compute power in our containers, we scaled the machines vertically by adding additional compute resources. This action successfully mitigated the issue, and the API service was restored by 09:45 UTC on 23rd August 2024.

## **Impact Assessment**

During the incident, approximately 2.5% of requests failed to process. We recognise the inconvenience this may have caused and are committed to ensuring such issues are addressed promptly. If you require any assistance, such as re-running calculations or investigating specific requests, please reach out to us.

## **Preventive Measures**

To prevent similar incidents in the future, we are implementing the following actions: 1. 
**Improved Scaling Policies and Alarms:** We are refining our scaling policies and alarms to better handle unexpected traffic spikes and resource demands. 2. **Enhanced Notifications:** We are adding new notification systems to ensure we are promptly alerted to potential issues, enabling us to take immediate action. We take every incident seriously and are committed to learning and improving from each one. Our team is dedicated to preventing future occurrences and ensuring the reliability of our services. Production recovered https://status.squake.earth/ Fri, 23 Aug 2024 07:33:57 +0000 https://status.squake.earth/#bf9668b8aa5f42b5ad192846121db351c1bd212e1f858d86cc3bf0385fd53dcf Production recovered Production went down https://status.squake.earth/ Fri, 23 Aug 2024 07:32:35 +0000 https://status.squake.earth/#bf9668b8aa5f42b5ad192846121db351c1bd212e1f858d86cc3bf0385fd53dcf Production went down Production recovered https://status.squake.earth/ Thu, 22 Aug 2024 17:01:23 +0000 https://status.squake.earth/#e7d21abf74645ebf1642e28de173d62bc7863144e01e20092584fafa0b373348 Production recovered Production went down https://status.squake.earth/ Thu, 22 Aug 2024 17:00:24 +0000 https://status.squake.earth/#e7d21abf74645ebf1642e28de173d62bc7863144e01e20092584fafa0b373348 Production went down Production recovered https://status.squake.earth/ Thu, 22 Aug 2024 15:52:18 +0000 https://status.squake.earth/#00ec267787abc1c4f9bae68fc53ea8d41a9f4d90a9ccc9b71d1a674d94a001a6 Production recovered Production went down https://status.squake.earth/ Thu, 22 Aug 2024 15:50:16 +0000 https://status.squake.earth/#00ec267787abc1c4f9bae68fc53ea8d41a9f4d90a9ccc9b71d1a674d94a001a6 Production went down Production recovered https://status.squake.earth/ Thu, 22 Aug 2024 15:38:26 +0000 https://status.squake.earth/#6dab8b820d094668d9fef9a8fd4497ac03a72884b68fdcebc8c4efc80d629859 Production recovered Production went down https://status.squake.earth/ Thu, 22 Aug 2024 15:37:26 +0000 
https://status.squake.earth/#6dab8b820d094668d9fef9a8fd4497ac03a72884b68fdcebc8c4efc80d629859 Production went down Production recovered https://status.squake.earth/ Thu, 22 Aug 2024 15:25:42 +0000 https://status.squake.earth/#e771fc377c96f7f4d28a53f535be63758771deed55e2ae7326fbade7428d851b Production recovered Production went down https://status.squake.earth/ Thu, 22 Aug 2024 15:21:26 +0000 https://status.squake.earth/#e771fc377c96f7f4d28a53f535be63758771deed55e2ae7326fbade7428d851b Production went down Production recovered https://status.squake.earth/ Thu, 22 Aug 2024 15:04:42 +0000 https://status.squake.earth/#ad5ff094e49b99f9e27fb1dac1ee471a6f890368920afa0ba01101aa8c4670bc Production recovered Production went down https://status.squake.earth/ Thu, 22 Aug 2024 15:03:26 +0000 https://status.squake.earth/#ad5ff094e49b99f9e27fb1dac1ee471a6f890368920afa0ba01101aa8c4670bc Production went down Production recovered https://status.squake.earth/ Thu, 22 Aug 2024 14:40:47 +0000 https://status.squake.earth/#ad534ab31b780f3b520ca76d27244329da27414b561e34dbcbffb36d4f7f9e0d Production recovered Production went down https://status.squake.earth/ Thu, 22 Aug 2024 14:39:26 +0000 https://status.squake.earth/#ad534ab31b780f3b520ca76d27244329da27414b561e34dbcbffb36d4f7f9e0d Production went down Production recovered https://status.squake.earth/ Thu, 22 Aug 2024 14:28:31 +0000 https://status.squake.earth/#7f4e506160b952fa1373d0e27772bf21901b6d19cbccc1d31aac41a1ec0f6a95 Production recovered Production went down https://status.squake.earth/ Thu, 22 Aug 2024 14:20:25 +0000 https://status.squake.earth/#7f4e506160b952fa1373d0e27772bf21901b6d19cbccc1d31aac41a1ec0f6a95 Production went down Production recovered https://status.squake.earth/ Thu, 22 Aug 2024 14:15:23 +0000 https://status.squake.earth/#d38020aedde1d64205cfcf7024a239ddb78c0e1e0e24bc5796d5e3f983bcc75f Production recovered Production went down https://status.squake.earth/ Thu, 22 Aug 2024 14:14:24 +0000 
https://status.squake.earth/#d38020aedde1d64205cfcf7024a239ddb78c0e1e0e24bc5796d5e3f983bcc75f Production went down Production recovered https://status.squake.earth/ Thu, 22 Aug 2024 13:51:47 +0000 https://status.squake.earth/#1a5ab6c5a316bc72208787ba585974dba42ca0d010e14d1416a7e05be714a55a Production recovered Production went down https://status.squake.earth/ Thu, 22 Aug 2024 13:49:18 +0000 https://status.squake.earth/#1a5ab6c5a316bc72208787ba585974dba42ca0d010e14d1416a7e05be714a55a Production went down Production recovered https://status.squake.earth/ Thu, 22 Aug 2024 13:29:24 +0000 https://status.squake.earth/#fbd369fe7f04c2e935869646dd5ff42ed5a70924227228c925742cc9eff0acd5 Production recovered Production went down https://status.squake.earth/ Thu, 22 Aug 2024 13:26:21 +0000 https://status.squake.earth/#fbd369fe7f04c2e935869646dd5ff42ed5a70924227228c925742cc9eff0acd5 Production went down API Degraded performance https://status.squake.earth/incident/419516 Thu, 22 Aug 2024 11:28:00 -0000 https://status.squake.earth/incident/419516#910f965552f39238fa3713039db06f92c1ac367d8905e6ae09a8681615aad9f1 We're experiencing degraded performance on API /calculations endpoint Sandbox recovered https://status.squake.earth/ Tue, 20 Aug 2024 22:03:45 +0000 https://status.squake.earth/#bd1a13517c2f5355d86102bc27e6aea0c38fc65c9d80fc22a92c60c3c8ad02ee Sandbox recovered Sandbox went down https://status.squake.earth/ Tue, 20 Aug 2024 22:02:57 +0000 https://status.squake.earth/#bd1a13517c2f5355d86102bc27e6aea0c38fc65c9d80fc22a92c60c3c8ad02ee Sandbox went down Sandbox recovered https://status.squake.earth/ Wed, 14 Aug 2024 13:24:18 +0000 https://status.squake.earth/#f444d4f57c1ce4f77a6e9d683c04c868cd6501c32eeb5ab52e59aefaccdcdebf Sandbox recovered Sandbox went down https://status.squake.earth/ Wed, 14 Aug 2024 13:22:48 +0000 https://status.squake.earth/#f444d4f57c1ce4f77a6e9d683c04c868cd6501c32eeb5ab52e59aefaccdcdebf Sandbox went down Production recovered https://status.squake.earth/ 
Wed, 07 Aug 2024 13:46:47 +0000 https://status.squake.earth/#1702c9e20ad1d715a544bbf3f9cec0c5a511f7661d979ba79919229a955ac0f2 Production recovered Production went down https://status.squake.earth/ Wed, 07 Aug 2024 13:45:48 +0000 https://status.squake.earth/#1702c9e20ad1d715a544bbf3f9cec0c5a511f7661d979ba79919229a955ac0f2 Production went down Production recovered https://status.squake.earth/ Thu, 11 Jul 2024 17:44:02 +0000 https://status.squake.earth/#f6c38b5a5cc02132a875dba3e3bbb892894ca5f95db7a55acd3aefba0addb946 Production recovered Production went down https://status.squake.earth/ Thu, 11 Jul 2024 17:42:00 +0000 https://status.squake.earth/#f6c38b5a5cc02132a875dba3e3bbb892894ca5f95db7a55acd3aefba0addb946 Production went down Sandbox recovered https://status.squake.earth/ Thu, 04 Jul 2024 11:19:43 +0000 https://status.squake.earth/#6975f603098e2a1685c443eedd526c7561e43a72a8cfa167132925def9fa6376 Sandbox recovered Sandbox went down https://status.squake.earth/ Thu, 04 Jul 2024 11:01:35 +0000 https://status.squake.earth/#6975f603098e2a1685c443eedd526c7561e43a72a8cfa167132925def9fa6376 Sandbox went down Partial Outage on the Calculation / Pricing Endpoints https://status.squake.earth/incident/356285 Fri, 12 Apr 2024 06:45:00 -0000 https://status.squake.earth/incident/356285#9fa6dad9a6756f7dd2a248ee07f09ca37adb35a3b5b186f1bbeded667faccf79 The issue has been resolved. Partial Outage on the Calculation / Pricing Endpoints https://status.squake.earth/incident/356285 Fri, 12 Apr 2024 00:00:00 -0000 https://status.squake.earth/incident/356285#72fc6ccf2a7d4f91cc36c57ca2009d2b91052059bf09af445db90403ace9627f This is a regression of https://status.squake.earth/incident/354138. 
## Summary

Requesting a price quote from the API failed for some products. Two endpoints were affected: pricing, and calculation with pricing. The root cause was faulty metadata in the products table, combined with the API not handling such faulty data gracefully when syncing it from another source. An initial data fix within ten minutes of the incident fixed the problem; however, it regressed overnight when a cron job synced back the faulty data. A permanent solution to make the API more robust against such failures was rolled out. At a high level, it included:

- better isolation of the most critical endpoints for calculation and pricing on a code level to minimize side effects
- improved robustness of all touch points where we synchronize data from external sources, to handle potentially faulty data more gracefully
- an internal post-mortem meeting to share learnings across the entire engineering team

Partial outage of some endpoints https://status.squake.earth/incident/354138 Thu, 11 Apr 2024 16:29:00 -0000 https://status.squake.earth/incident/354138#22725947505f5cb86b167429a4f3da67435e203052c1f4accfb35e3cd1cfad7a We have rolled out multiple improvements across the app. During a rollout, a few records failed to migrate correctly, causing about 8% of pricing / calculation-with-pricing requests to fail. All other endpoints operated normally. Partial outage of some endpoints https://status.squake.earth/incident/354138 Thu, 11 Apr 2024 15:15:00 -0000 https://status.squake.earth/incident/354138#9569d52d92f3609c396d002ecde3abcdf04e5ee6461252385604c06961b5ec96 Under investigation. API - Sandbox is down https://status.squake.earth/incident/284876 Thu, 09 Nov 2023 18:06:38 -0000 https://status.squake.earth/incident/284876#c5e52a5042ae5bd235ae7a9d0ea449b32d7b4b23395614baebded66641980778 API - Sandbox recovered. API - Sandbox is down https://status.squake.earth/incident/284876 Thu, 09 Nov 2023 18:06:07 -0000 https://status.squake.earth/incident/284876#0c8b798a4fce3d6a47676cb614bd2bf6e9cf9cd27294b8aa7a1dbade78caa112 API - Sandbox went down. API - Sandbox is down https://status.squake.earth/incident/284876 Thu, 09 Nov 2023 18:01:00 -0000 https://status.squake.earth/incident/284876#e1b0e6da58d71e62065a4ca885d0b03f8433e58b2a6999dfbabfc292cd62d51d API - Sandbox recovered. 
API - Sandbox is down https://status.squake.earth/incident/284876 Thu, 09 Nov 2023 18:00:10 -0000 https://status.squake.earth/incident/284876#9ff4a2fbf178ef6a60aafb40309dce4e06ddb2dbd30e685ffc486a87f951fc02 API - Sandbox went down. API - Sandbox is down https://status.squake.earth/incident/284876 Thu, 09 Nov 2023 17:54:59 -0000 https://status.squake.earth/incident/284876#0b0b142d586c94c08796f4a064978620f4edc252450dbd765f99ab7d9c03e1d4 API - Sandbox recovered. API - Sandbox is down https://status.squake.earth/incident/284876 Thu, 09 Nov 2023 17:54:39 -0000 https://status.squake.earth/incident/284876#4276b4fe3f3065a6885bc1275ed01699b1db51c026b9c9191221c8f35c729c7f API - Sandbox went down. API - Sandbox is down https://status.squake.earth/incident/284876 Thu, 09 Nov 2023 17:49:42 -0000 https://status.squake.earth/incident/284876#3a42bccbb91d5472e50eeb008a56fb971d81ade0849d8eda6d68dc25e1db179a API - Sandbox recovered. API - Sandbox is down https://status.squake.earth/incident/284876 Thu, 09 Nov 2023 17:48:33 -0000 https://status.squake.earth/incident/284876#69f7a7b9ca497e42c01a383398b106b886d42495460dd113855991aa762a92f2 API - Sandbox went down. API - Sandbox is down https://status.squake.earth/incident/284876 Thu, 09 Nov 2023 17:43:43 -0000 https://status.squake.earth/incident/284876#2cab5e418f652b38894ec1381f2b9f664457686742df75c2a9b854eec0cd109b API - Sandbox recovered. API - Sandbox is down https://status.squake.earth/incident/284876 Thu, 09 Nov 2023 17:43:07 -0000 https://status.squake.earth/incident/284876#7bd64116326af4d3f09216d81715732a7ed9f7c661e2df8d14d8de891cde3d3b API - Sandbox went down. API - Sandbox is down https://status.squake.earth/incident/284876 Thu, 09 Nov 2023 17:37:59 -0000 https://status.squake.earth/incident/284876#233982f6977b4e130bc57af8ac3070ac4dcab8aa1b65ceef215141ba48d47b70 API - Sandbox recovered. 
API - Sandbox is down https://status.squake.earth/incident/284876 Thu, 09 Nov 2023 17:16:32 -0000 https://status.squake.earth/incident/284876#49354ff9dbb6a3202a04c75d88c12e6819f319e08beb817d30f69f8e6daac5da API - Sandbox went down. API - Sandbox is down https://status.squake.earth/incident/278556 Fri, 27 Oct 2023 14:05:44 -0000 https://status.squake.earth/incident/278556#45ca5c16a6efd28292cbe355565b642f6758a480315317e4b8b7b3d627d3b729 API - Sandbox recovered. API - Sandbox is down https://status.squake.earth/incident/278556 Fri, 27 Oct 2023 14:03:19 -0000 https://status.squake.earth/incident/278556#65bacef3260b6c73f3113de4c90be9801d1712e1e177ca2fa80c857721ad3586 API - Sandbox went down. API - Production (v1) and API - Sandbox are down https://status.squake.earth/incident/277830 Thu, 26 Oct 2023 13:37:35 -0000 https://status.squake.earth/incident/277830#47ade6344acdeebeed183a02b88f94cb1b77d7099d48b156aa750511f3aa7626 API - Sandbox recovered. API - Production (v1) and API - Sandbox are down https://status.squake.earth/incident/277830 Thu, 26 Oct 2023 13:33:11 -0000 https://status.squake.earth/incident/277830#fb92cde51c91e2354fec4d8993e7c1d34c298b04b0ba79e2e59e1b901ac9d9e9 API - Sandbox went down. API - Production (v1) and API - Sandbox are down https://status.squake.earth/incident/277830 Thu, 26 Oct 2023 11:51:51 -0000 https://status.squake.earth/incident/277830#f7b09e7f2ed4c54c7c3ae595c408d7e45675583682b5af7b8625a2ae73cecd59 API - Sandbox recovered. API - Production (v1) and API - Sandbox are down https://status.squake.earth/incident/277830 Thu, 26 Oct 2023 11:47:53 -0000 https://status.squake.earth/incident/277830#31eb758007d20050a43b046d75b886021b47885739625be37c34541450969988 API - Sandbox went down. API - Production (v1) and API - Sandbox are down https://status.squake.earth/incident/277830 Thu, 26 Oct 2023 10:42:07 -0000 https://status.squake.earth/incident/277830#e8adbb7b960dd0a166d78e677f43c91e82ccb331fc19caf2962620a72b34ad83 API - Sandbox recovered. 
API - Production (v1) and API - Sandbox are down https://status.squake.earth/incident/277830 Thu, 26 Oct 2023 10:36:58 -0000 https://status.squake.earth/incident/277830#0fada90525b750bf56b768235cc5a238e2ef464b77a574da99017d748234f32e API - Sandbox went down. API - Production (v1) and API - Sandbox are down https://status.squake.earth/incident/277830 Thu, 26 Oct 2023 10:30:00 -0000 https://status.squake.earth/incident/277830#4995fb5c9aec36fe6f654ddb13b389aeedbc126f1910fa2bb9f7f0301167e318 API - Sandbox recovered. API - Production (v1) and API - Sandbox are down https://status.squake.earth/incident/277830 Thu, 26 Oct 2023 10:24:04 -0000 https://status.squake.earth/incident/277830#def7ddb3551159c2020e45a007572db576b01d26449b2c17c8bf5d3dc070e149 API - Sandbox went down. API - Production (v1) and API - Sandbox are down https://status.squake.earth/incident/277830 Thu, 26 Oct 2023 09:45:28 -0000 https://status.squake.earth/incident/277830#b800d6b4865a3372d1cb0416f5841fb9d0c82c880291b85651c7c34cb510aa29 API - Sandbox recovered. API - Production (v1) and API - Sandbox are down https://status.squake.earth/incident/277830 Thu, 26 Oct 2023 09:39:25 -0000 https://status.squake.earth/incident/277830#4e5bd2f50499f8355f612391a6235cad8e4c2c69e2657099e2bded12617ba6c4 API - Sandbox went down. API - Production (v1) and API - Sandbox are down https://status.squake.earth/incident/277830 Thu, 26 Oct 2023 09:37:29 -0000 https://status.squake.earth/incident/277830#085a6812aae65e4a3742a78f3f126cfbdf3922408a20ce32246997fbb1e2f693 API - Sandbox recovered. API - Production (v1) and API - Sandbox are down https://status.squake.earth/incident/277830 Thu, 26 Oct 2023 09:31:45 -0000 https://status.squake.earth/incident/277830#1375e7231cd043932149a869cadd8d78c6df865bb838d3675cf6651359234174 API - Production (v1) and API - Sandbox went down. 
API - Production (v1) and API - Sandbox are down https://status.squake.earth/incident/277830 Thu, 26 Oct 2023 08:56:51 -0000 https://status.squake.earth/incident/277830#58d7edfb7016146a8cfa6bda056f28649bac061981efb63cef23119517e32ec2 API - Sandbox recovered. API - Production (v1) and API - Sandbox are down https://status.squake.earth/incident/277830 Thu, 26 Oct 2023 08:52:25 -0000 https://status.squake.earth/incident/277830#a0797fc5faa3afecee91ba42652cce0ce034eacb1748a7a85b513e5da6057a19 API - Sandbox went down. API - Production (v1) and API - Sandbox are down https://status.squake.earth/incident/277830 Thu, 26 Oct 2023 07:58:51 -0000 https://status.squake.earth/incident/277830#1583187c10697bd5cc82ca17fad9b0ecabb8b8a6fe8981db659d72ace5b5e4cc API - Sandbox recovered. API - Production (v1) and API - Sandbox are down https://status.squake.earth/incident/277830 Thu, 26 Oct 2023 07:54:33 -0000 https://status.squake.earth/incident/277830#3da2d3361dfaa7d667082d466fe488a694fd6a4d981421bc42b2ef77da33e001 API - Sandbox went down. API - Production (v1) and API - Sandbox are down https://status.squake.earth/incident/277830 Thu, 26 Oct 2023 07:43:49 -0000 https://status.squake.earth/incident/277830#71799c1084f23ecd59e7128092b28d7f5f0e3d66bb72556a41398ffe3f698a4c API - Sandbox recovered. API - Production (v1) and API - Sandbox are down https://status.squake.earth/incident/277830 Thu, 26 Oct 2023 07:39:16 -0000 https://status.squake.earth/incident/277830#3d3fe29cfe4b62118ae453aebc015b98f355aad4e4a34000975eb3f87f6a561e API - Sandbox went down. API - Production (v1) and API - Sandbox are down https://status.squake.earth/incident/277830 Thu, 26 Oct 2023 05:50:59 -0000 https://status.squake.earth/incident/277830#ef2ab0de75b8f8c8a017989a376438295b8fb763e622198fd35eea21e63d7e19 API - Sandbox recovered. 
API - Production (v1) and API - Sandbox are down https://status.squake.earth/incident/277830 Thu, 26 Oct 2023 05:46:27 -0000 https://status.squake.earth/incident/277830#29499e09b70a1299dc5eab0e6459d9336c867000caa16c21271ba26ef75aa1b6 API - Sandbox went down. API - Production (v1) and API - Sandbox are down https://status.squake.earth/incident/277830 Thu, 26 Oct 2023 05:41:19 -0000 https://status.squake.earth/incident/277830#c09859d51b3d64d54bd4e0e462bf6b4c488143d194467a0b9b91b8ee555e06ce API - Sandbox recovered. API - Production (v1) and API - Sandbox are down https://status.squake.earth/incident/277830 Thu, 26 Oct 2023 05:36:55 -0000 https://status.squake.earth/incident/277830#052cb4da15df17b304663d517af94e59a05312cc75ba9df45649d56c11776ffc API - Sandbox went down. API - Production (v1) and API - Sandbox are down https://status.squake.earth/incident/277478 Wed, 25 Oct 2023 12:07:42 -0000 https://status.squake.earth/incident/277478#d67178476c76c99d3e6fe85c78399d0605c81f85e18e493ca879cef360907522 API - Sandbox recovered. API - Production (v1) and API - Sandbox are down https://status.squake.earth/incident/277478 Wed, 25 Oct 2023 12:06:01 -0000 https://status.squake.earth/incident/277478#114b816facadf9e89a78bd9616f3a7ef0798b0815458ad917e6351719d7ea535 API - Sandbox went down. API - Sandbox is down https://status.squake.earth/incident/274934 Thu, 19 Oct 2023 14:08:30 -0000 https://status.squake.earth/incident/274934#f09f88a2fd0d29981d41384d4432f78a8482fb7807082a4caa87ea5e1ea5b541 API - Sandbox recovered. API - Sandbox is down https://status.squake.earth/incident/274934 Thu, 19 Oct 2023 14:04:09 -0000 https://status.squake.earth/incident/274934#abf83e6308fb87c8523bae91186f9ed31df09b124fe0afb86979b5aa85e0f773 API - Sandbox went down. API - Sandbox is down https://status.squake.earth/incident/258351 Wed, 13 Sep 2023 03:52:31 -0000 https://status.squake.earth/incident/258351#cd5ef5cd412702ec79b76b486b3876443bd4cd48caae596794a91633cc3b0093 API - Sandbox recovered. 
API - Sandbox is down https://status.squake.earth/incident/258351 Wed, 13 Sep 2023 03:42:17 -0000 https://status.squake.earth/incident/258351#0a27a909cdfc49ae520cce8b1fb32a576d2cdb4cee76210629423ed822abcfd1 API - Sandbox went down. API - Sandbox is down https://status.squake.earth/incident/258351 Wed, 13 Sep 2023 03:41:28 -0000 https://status.squake.earth/incident/258351#9541bcc02ceb62d77e6aeece8a81491d4dd3f6061292b53a0eb5311dee189335 API - Sandbox recovered. API - Sandbox is down https://status.squake.earth/incident/258351 Wed, 13 Sep 2023 03:37:04 -0000 https://status.squake.earth/incident/258351#e7a90e1db052b8a0f8ff69081b7d3ed95bf54042431154f457987d47cba8d268 API - Sandbox went down. API - Sandbox is down https://status.squake.earth/incident/258351 Wed, 13 Sep 2023 03:35:47 -0000 https://status.squake.earth/incident/258351#8f57895cefd96001ac52fd9f5b3b886d2c3ce8bebc3d044e6535e664e67b87ee API - Sandbox recovered. API - Sandbox is down https://status.squake.earth/incident/258351 Tue, 12 Sep 2023 21:03:22 -0000 https://status.squake.earth/incident/258351#b99884e5fe558e51bf4f09b22922d1be68468899a7691b563a5c4f6c0021278a API - Sandbox went down. API - Sandbox is down https://status.squake.earth/incident/258351 Tue, 12 Sep 2023 20:41:50 -0000 https://status.squake.earth/incident/258351#5881895d42dab61efa869ac82b15ceead34b9057a1c4af254fb014aa0a52fed9 API - Sandbox recovered. API - Sandbox is down https://status.squake.earth/incident/258351 Tue, 12 Sep 2023 20:38:59 -0000 https://status.squake.earth/incident/258351#0bb0d83571c48e12378e30c7cb02a10c8c3e7bbb4bf8f85a9eb06dfba1800739 API - Sandbox went down. API - Sandbox is down https://status.squake.earth/incident/258351 Tue, 12 Sep 2023 14:06:19 -0000 https://status.squake.earth/incident/258351#df09bc491aad62699f9fa5f31b328fd1c2b48cb5bc974a4c89ecd090dc2c2a32 API - Sandbox recovered. 
API - Sandbox is down https://status.squake.earth/incident/258351 Tue, 12 Sep 2023 14:02:26 -0000 https://status.squake.earth/incident/258351#df6264f76b11534a9ff42751e689b624fc9815da9ad4b46af477d8efe57c094a API - Sandbox went down. API - Sandbox is down https://status.squake.earth/incident/257236 Sat, 09 Sep 2023 20:23:28 -0000 https://status.squake.earth/incident/257236#8b05787bfb0bb14c65b683011a3df4b183fafe293f1d0eb93113a3d68fd1bba0 API - Sandbox recovered. API - Sandbox is down https://status.squake.earth/incident/257236 Sat, 09 Sep 2023 20:19:17 -0000 https://status.squake.earth/incident/257236#f2c1b392e7cb7cfce4b903fbffcfddfe53b4a32937b28484b92e8939b306d877 API - Sandbox went down. API - Production (v2) is down https://status.squake.earth/incident/256307 Thu, 07 Sep 2023 14:39:40 -0000 https://status.squake.earth/incident/256307#dfbbbc2d9cc61141b8a45ef09e42658686a56cd6f718fedd5e95f7dfa2f98aca API - Production (v2) recovered. API - Production (v2) is down https://status.squake.earth/incident/256307 Thu, 07 Sep 2023 14:28:29 -0000 https://status.squake.earth/incident/256307#ca283e3fad035588ac7b7349556ad43aed2d94ce2163a47786da6d1626a0953e API - Production (v2) went down. API - Production (v2) is down https://status.squake.earth/incident/255235 Tue, 05 Sep 2023 10:21:07 -0000 https://status.squake.earth/incident/255235#96d9ef762e9e53e19fd0f81708173718e080f31acbdeb70bc008bf09bd4462a7 API - Production (v2) recovered. API - Production (v2) is down https://status.squake.earth/incident/255235 Tue, 05 Sep 2023 10:18:59 -0000 https://status.squake.earth/incident/255235#4d1b7dadf824c04d9c5c673308f266366b3890cbc0af868e5e34b48a754737a8 API - Production (v2) went down. API - Production is down https://status.squake.earth/incident/252320 Tue, 29 Aug 2023 12:37:50 -0000 https://status.squake.earth/incident/252320#a13be1b970e2fc2d7af2a4f9d6f42fe2bc7ac0938018308032a4e0ce70523b77 API - Production recovered. 
API - Production is down https://status.squake.earth/incident/252320 Tue, 29 Aug 2023 12:36:50 -0000 https://status.squake.earth/incident/252320#092f39f738f69fddb928e8e14bc5e20bdf4392256488cbdeedafd34697df7277 API - Production went down.

Partial outage of the API https://status.squake.earth/incident/241321 Thu, 03 Aug 2023 16:45:00 -0000 https://status.squake.earth/incident/241321#59230f40d60742dcd6dfa4e65f43178c9c4ef8513c18cc32619342ed316ed38b Issue resolved

Partial outage of the API https://status.squake.earth/incident/241321 Thu, 03 Aug 2023 15:33:00 -0000 https://status.squake.earth/incident/241321#e822d2dd3d5836036f218ffd3b22a74bbf32fe5aa0e12f9d9bc59ed021fe786b

## What happened

The API was unavailable: requests ended in a 500 error. The incident started on Thursday, 3rd of August, at 3:30 pm UTC, and by 3:55 pm UTC, most traffic from affected servers had been redirected to a healthy cluster. The error was fully resolved by 4:45 pm UTC. Within this ~1h interval, about 35% of all requests failed. The main impact was on traffic originating in the US.

## Why this happened

We launched a new feature that lets users see which API key was used for purchases. This required updating account configurations. Right after the release, the authentication layer failed due to a stale cache of the updated configuration.

## Estimated costs

For some customers, we were down for 25 to 75 minutes. Please reach out to us if there is anything we can do to help you, e.g., re-run some calculations or look into specific requests.

## How to prevent this in the future

We take every incident seriously: we analyze it thoroughly and discuss with the team the steps we can take to improve and learn from it. The resulting action: we are improving our gradual rollouts to avoid such issues in the future.

API - Production is down https://status.squake.earth/incident/231437 Tue, 11 Jul 2023 09:10:50 -0000 https://status.squake.earth/incident/231437#5fde50a55ab61f895e10a2aabd2749ef35effa737a6cb6b1cf1a63864d476121 API - Production recovered.

API - Production is down https://status.squake.earth/incident/231437 Tue, 11 Jul 2023 09:09:50 -0000 https://status.squake.earth/incident/231437#964578e45b2613dd85ba86a7dccb58eca91196d29959f7dbc0206e44c525a045 API - Production went down.
API - Production is down https://status.squake.earth/incident/231437 Tue, 11 Jul 2023 09:03:46 -0000 https://status.squake.earth/incident/231437#d4c706b5251e38dc6e8c911d2ea852b5b821ee71a47874b420d13aa8ed65e7ba API - Production recovered.

API - Production is down https://status.squake.earth/incident/231437 Tue, 11 Jul 2023 09:00:47 -0000 https://status.squake.earth/incident/231437#2aa7d5191081ffb67a80722f9d075506f36cab2804942826297a54bceb4b0f19 API - Production went down.

API Reference unavailable https://status.squake.earth/incident/185691 Thu, 16 Mar 2023 12:35:00 -0000 https://status.squake.earth/incident/185691#087ab6c6c22124707f7566c13c5eb44c5d612d295c847347f5a98475ccabc205 We restructured the API reference page for the release of version 2 of the SQUAKE API. This resulted in an invalid SSL certificate for the docs subdomain due to a human error on our side. We had tested the same process on our test system before without issues. All other systems were unaffected. The recovery took about 15-20 minutes. We apologize for any inconvenience caused.

Production API unreachable https://status.squake.earth/incident/149549 Sat, 31 Dec 2022 07:41:00 -0000 https://status.squake.earth/incident/149549#9b91b98cd2211aa7550bf668190366b3188724412577163533acfe0fd7b5c377 Upgrading all our SSL certificates succeeded and the API is reachable again.

Production API unreachable https://status.squake.earth/incident/149549 Sat, 31 Dec 2022 07:38:00 -0000 https://status.squake.earth/incident/149549#d6853294b85256f67e34e50bf9f4e6371350d7ebd708115e487d2f251bf0e6c7

### What happened

The API was unreachable.

### Why this happened

We upgraded SSL certificates, issuing one new certificate for each DNS record. During the upgrade, an error occurred.

### Estimated costs

We were down for about 2-3 minutes.

### How to prevent this in the future

We are testing several solutions to this issue, for example not validating the SSL certificate during the upgrade period.
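Both SSL incidents above came down to a certificate that clients could not accept for the host they were contacting. As an illustration only, and not a description of SQUAKE's actual tooling, the sketch below shows one way a pre-cutover check might look in Python. The function name `cert_is_usable` and the hostname `api.example.com` are hypothetical; the `cert` dictionary is assumed to have the shape returned by the standard library's `ssl.SSLSocket.getpeercert()`.

```python
import ssl
import time
from typing import Optional


def cert_is_usable(cert: dict, hostname: str, now: Optional[float] = None) -> bool:
    """Return True if `cert` is inside its validity window and lists `hostname`.

    `cert` is assumed to look like the dict returned by
    ssl.SSLSocket.getpeercert(). Only exact DNS names are compared;
    wildcard entries are deliberately not handled in this sketch.
    """
    now = time.time() if now is None else now
    not_before = ssl.cert_time_to_seconds(cert["notBefore"])
    not_after = ssl.cert_time_to_seconds(cert["notAfter"])
    dns_names = {
        value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"
    }
    return not_before <= now <= not_after and hostname in dns_names


# Hypothetical certificate data for illustration.
example_cert = {
    "notBefore": "Jan  1 00:00:00 2023 GMT",
    "notAfter": "Jan  1 00:00:00 2024 GMT",
    "subjectAltName": (("DNS", "api.example.com"),),
}
check_time = ssl.cert_time_to_seconds("Jun  1 00:00:00 2023 GMT")
print(cert_is_usable(example_cert, "api.example.com", now=check_time))
```

Running a check like this for every DNS record before switching traffic would catch an expired or mismatched certificate, but not every failure mode (for example, an incomplete certificate chain).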
Production unreachable https://status.squake.earth/incident/96605 Tue, 21 Jun 2022 08:17:00 -0000 https://status.squake.earth/incident/96605#4ccfb33a31b83ae92a828d948111d839848a7e8158eedcc2650b439fbea597ad Cloudflare resolved their issue; all systems are back online.

Production unreachable https://status.squake.earth/incident/96605 Tue, 21 Jun 2022 08:15:00 -0000 https://status.squake.earth/incident/96605#bf6de9ceb234946648828ad9ad8cffb10097635e97c9796942c45cb19b8f77e3

## What happened

- Production API, API Documentation, and Dashboard stopped responding on Jun 21, 2022 at 06:43 UTC
- All services started responding again at 07:20 UTC
- All application servers operated normally
- Database servers operated normally
- API was reachable when contacting the server directly from within our VPN

## Why this happened

We protect our API using Cloudflare's services. Cloudflare had an outage, so requests never reached our servers. Cloudflare posted a post-mortem [here](https://www.cloudflarestatus.com/incidents/xvs51y9qs9dj).

## Estimated costs

- We were down for 1h 16 minutes

## How to prevent this in the future

For longer-lasting outages of Cloudflare, we can fail over the routing to a backup provider. For short outages, a failover will take too long (updating DNS records takes between a few minutes and several hours). In general, Cloudflare is a highly reliable provider.

Homepage not reachable https://status.squake.earth/incident/80844 Thu, 07 Apr 2022 21:58:00 -0000 https://status.squake.earth/incident/80844#c6446ef4cb93c5e436730cfe69fd0f61291ae77fb4922fc3ee564e53ee114f98 The homepage is reachable again.

Homepage not reachable https://status.squake.earth/incident/80844 Thu, 07 Apr 2022 21:29:00 -0000 https://status.squake.earth/incident/80844#c66b39b9da7ecba1ef1b519ea3cf3507170bb1e6feebeb038839d5f973772c46

What happened

The homepage was unreachable for 28 minutes. The outage did not affect any of our services.

Sandbox not reachable https://status.squake.earth/incident/80560 Wed, 06 Apr 2022 17:05:00 -0000 https://status.squake.earth/incident/80560#4c89241fa9870594454b72ed4be509c7387c4cdc7f69fcc953c86c7f830d6c19 Sandbox is reachable again.

Sandbox not reachable https://status.squake.earth/incident/80560 Wed, 06 Apr 2022 16:33:00 -0000 https://status.squake.earth/incident/80560#3a67cf87741112379955f0fe3a68fd24da97e7cfbd05ee8b1826354ad57e8037

What happened

- Sandbox stopped responding at 18:32:36 UTC+00
- Sandbox started responding again at 19:05:30 UTC+00
- All application servers operated normally
- Database servers operated normally
- API was reachable when contacting the server directly from within our VPN

Why this happened

We protect our API using Cloudflare's services. Cloudflare had an outage, so requests never reached our servers. Cloudflare posted a post-mortem: https://www.cloudflarestatus.com/incidents/g7sp3s3pdx4m.

Estimated costs

- We were down for 19 minutes
- We had increased response times for another 42 minutes after the downtime was resolved
- Production was unaffected (Cloudflare only had a partial outage, and Production and Sandbox use separate routing and accounts everywhere, including Cloudflare)
- Since Sandbox is for testing only, no production integration was affected

How to prevent this in the future

For longer-lasting outages of Cloudflare, we can fail over the routing to a backup provider. For short outages, a failover will take too long (updating DNS records takes between a few minutes and several hours). In general, Cloudflare is a highly reliable provider.
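
The failover trade-off described above can be sketched as a simple comparison: a DNS-based failover only helps if the outage is expected to outlast the time it takes for the new record to reach clients. The Python sketch below is a minimal illustration under stated assumptions, not SQUAKE's actual procedure. The function names and all durations are hypothetical, and the model assumes resolvers honor the record's TTL, which is not guaranteed in practice.

```python
from datetime import timedelta


def failover_effective_after(record_ttl: timedelta, update_latency: timedelta) -> timedelta:
    """Worst-case delay until clients resolve the backup provider.

    Assumes resolvers may keep the old record cached for a full TTL
    after the update has been published.
    """
    return update_latency + record_ttl


def should_fail_over(
    expected_remaining_outage: timedelta,
    record_ttl: timedelta,
    update_latency: timedelta,
) -> bool:
    """Fail over only if the outage is expected to outlast the switch itself."""
    return expected_remaining_outage > failover_effective_after(record_ttl, update_latency)


# Hypothetical numbers: a short outage does not justify a failover
# when the DNS record has a one-hour TTL.
print(should_fail_over(
    expected_remaining_outage=timedelta(minutes=10),
    record_ttl=timedelta(hours=1),
    update_latency=timedelta(minutes=2),
))
```

The expected remaining outage is the hard input to estimate; in practice it would come from the provider's status updates rather than from a fixed value.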