App not responding

Incident Report for Front

Postmortem

On Friday, Dec 19, at 14:50 UTC (6:50am PST), customers based in our US-West-2 data center experienced dramatically increased API latency, which caused the website to fail to load and messages to queue in the backend. This continued until 18:10 UTC (10:10am PST). No messages were lost during this time, though messages may have been significantly delayed in appearing in customer inboxes. All queued messages were delivered by 21:00 UTC (1:00pm PST).

Customers based in Front’s EU-West-1 and US-West-1 data centers may have experienced some delays during this time because some systems are interdependent, but this impact was intermittent and uncommon.

The root cause of this issue was the failure of a caching system. The Front application is backed by several database systems, with a caching layer in front of them to improve performance. A recent change increased the size of some objects in the cache layer. This is not inherently wrong, and it had no immediate impact. On Friday, Dec 19, the caching layer in US-West-2 crossed a new threshold of data volume, which triggered a large number of evictions, particularly of other data that is necessary for most application activity. This put additional load on the databases, and there was simply not enough room in the cache for all the data we needed to store there. The resulting thrashing significantly increased latency for all systems.
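The failure mode described above can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not a description of Front's actual cache: it assumes a least-recently-used (LRU) eviction policy and a cache bounded by total bytes, and the names `SizeBoundedLRU` and `simulate`, along with all sizes and counts, are hypothetical. It shows how a modest increase in object size can push a working set past capacity, after which nearly every read misses and falls through to the database.

```python
from collections import OrderedDict


class SizeBoundedLRU:
    """Toy LRU cache bounded by total bytes rather than entry count (hypothetical)."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.entries = OrderedDict()  # key -> size in bytes, least recently used first
        self.hits = 0
        self.misses = 0
        self.evictions = 0

    def get_or_load(self, key, size_bytes):
        """Return True on a cache hit; on a miss, simulate a database load and cache it."""
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        # Evict least recently used entries until the new object fits.
        while self.entries and self.used_bytes + size_bytes > self.capacity_bytes:
            _, evicted_size = self.entries.popitem(last=False)
            self.used_bytes -= evicted_size
            self.evictions += 1
        if size_bytes <= self.capacity_bytes:
            self.entries[key] = size_bytes
            self.used_bytes += size_bytes
        return False


def simulate(object_size, capacity_bytes=1000, keys=10, rounds=50):
    """Repeatedly read the same working set and report (hit rate, eviction count)."""
    cache = SizeBoundedLRU(capacity_bytes)
    for _ in range(rounds):
        for key in range(keys):
            cache.get_or_load(key, object_size)
    total = cache.hits + cache.misses
    return cache.hits / total, cache.evictions


if __name__ == "__main__":
    print("100-byte objects:", simulate(100))  # working set fits exactly
    print("120-byte objects:", simulate(120))  # working set exceeds capacity
```

In this sketch, ten 100-byte objects fit in a 1000-byte cache, so after the first pass every read is a hit and nothing is evicted. Growing each object to 120 bytes means only eight fit at once, and with a cyclic access pattern LRU evicts each object just before it is needed again, so the hit rate drops to zero. Real workloads are less adversarial than this cyclic pattern, but the cliff-like behavior once the working set exceeds capacity is the same in kind.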

Posted Dec 19, 2025 - 23:22 UTC

Resolved

All backfills complete

Posted Dec 19, 2025 - 21:23 UTC

Update

Front is operational for all customers. We are continuing to backfill any missed messages and application webhooks.

Posted Dec 19, 2025 - 18:23 UTC

Update

We are continuing to monitor and close out our remaining recovery items for [us-west-2] customers. Other regions are recovered.

Posted Dec 19, 2025 - 17:57 UTC

Monitoring

A fix has been implemented and we are monitoring the results. We are seeing recovery and are continuing to close out remaining recovery items.

Posted Dec 19, 2025 - 17:29 UTC

Identified

The issue has been identified and a fix is being implemented.

Posted Dec 19, 2025 - 16:47 UTC

Update

We are continuing to investigate this issue.

Posted Dec 19, 2025 - 16:06 UTC

Investigating

We are currently investigating the issue. [us-west-1], [us-west-2]

Posted Dec 19, 2025 - 15:13 UTC

This incident affected: App.