Intermittent Failures due to networking

Dec 26 at 03:33pm UTC

Dec 26 at 06:58pm UTC

The error rate has subsided, and models have returned to their previous startup and runtime behavior. We are working with our providers to mitigate the impact of future incidents like this.

Dec 26 at 05:05pm UTC

We are seeing a reduction in error rate. The root cause is still under investigation.

Dec 26 at 03:39pm UTC

The issue has been narrowed down: the observed errors occur only in a specific subset of infrastructure in a single region. We are continuing to investigate and will provide further updates as information becomes available.

Dec 26 at 03:33pm UTC

We are seeing elevated errors in one of our regions related to networking issues. This is under active investigation. The errors are primarily presenting as failed model setup and failures when downloading weights.

These errors occur in clusters over short windows and then cease. We will provide updates as more information becomes available.