On November 20, 2020, Platform.sh monitoring detected stuck cron jobs on a number of projects, which may have caused site issues. As we investigated, we discovered that this event had the potential to affect projects across multiple regions, most notably EU-2 and EU-4.
At 16:05 UTC, an incident was created and escalated to our internal teams for further troubleshooting. The teams immediately identified that the issues stemmed from a previously deployed feature update to the handling of cron tasks.
At 16:06 UTC, a statuspage update was posted, and our team began to work on a multi-phased approach that included, but was not limited to, the following:
Around 16:49 UTC, our team began executing the previously outlined approach to restore services. Not all projects were affected during this event, and those that were may have experienced varying impact. Services may have been restored for individual projects at different times throughout the evening as progress was made.
This incident was resolved for the EU-2 and EU-4 regions on November 21, 2020, around 02:30 UTC, and was monitored until it was officially closed at 03:15 UTC.