Reactivating environments hanging
Incident Report for Platform.sh
Postmortem

On January 16th we released an upgrade to our orchestration layer that added important new capabilities. These improvements allow us to handle higher deployment volumes more efficiently, scaling our platform to a new level. Unfortunately, this release contained a regression.

Between January 18th and January 24th, customers across regions CA-1, US-2, AU, DE-1 and EU-2 reported issues when reactivating previously inactive development environments or, in some cases, when populating database relationships. Other customers also reported problems sending email. We identified and mitigated these issues as they occurred while working on the definitive bug fixes, which were deployed in all affected regions on January 28th, restoring all functionality.

The impact experienced by most of our customers was low, though a couple of customers had longer downtime due to misdiagnosed issues. Misled by the timing, we incorrectly attributed their downtime to the maintenance and advised them incorrectly. We apologize to these customers. We strongly encourage all of our customers to open an urgent ticket if their environment is down for more than 20 minutes, even within a maintenance window, as any downtime lasting longer than the build and deploy hook times needs to be investigated by our Support team. Investigation into sites with a Premium support SLA will be conducted automatically based on monitoring alerts.

Posted Feb 14, 2019 - 15:19 UTC

Resolved
The patch has been applied. If you continue to experience issues, please contact us via support: https://support.platform.sh/
Posted Jan 24, 2019 - 10:43 UTC
Monitoring
We have deployed a fix to the affected regions for the environment reactivation issue. We are currently verifying its status.
Posted Jan 23, 2019 - 23:47 UTC
Update
Our operations team is deploying the fix to all impacted regions.
Please continue to avoid reactivating previously deactivated branches on projects until we have confirmed that the fix has been fully deployed.
Posted Jan 23, 2019 - 18:11 UTC
Update
The fix has not yet completed testing, so the problem still exists.
In the meantime, please avoid reactivating previously deactivated branches on your projects.
Posted Jan 23, 2019 - 02:26 UTC
Update
We have identified a bug in our software concerning environment reactivation. This bug manifests as environment creation hanging when reactivating an environment. New environment creation is not impacted; however, if you activate an environment that has been previously deleted, the activation operation will continue indefinitely, blocking deployments to any other environments on the project.

To avoid hitting the bug while we are working toward a resolution, please avoid reactivating environments.

Our engineering team has identified the problem and is working toward implementing a fix. We expect to implement, test and release a fix in the next 24 hours. We'll let you know if this timeline changes.

Our operations team is working to resolve the issue for projects that currently have stuck activation operations; however, this workaround will not prevent future activation operations from becoming stuck, so please continue to avoid reactivating environments.
Posted Jan 22, 2019 - 22:50 UTC
Update
We are continuing to work on a fix for this issue.
Posted Jan 22, 2019 - 17:41 UTC
Identified
The issue has been identified and our team is working on implementing a fix.
Posted Jan 22, 2019 - 16:44 UTC
Investigating
Reactivating an environment that has been deactivated is hanging at environment creation. We are currently investigating the issue.
Posted Jan 22, 2019 - 16:11 UTC
This incident affected: Australia (au.platform.sh), Canada (ca-1.platform.sh), Europe (France) (fr-1.platform.sh), Europe (Germany) (de-2.platform.sh), Europe (West 2) (eu-2.platform.sh), and USA-2 (East 2) (us-2.platform.sh).