Degraded

Errors downloading weights on model startup

Feb 10 at 09:52pm UTC
Affected services
Prediction serving

Resolved
Feb 10 at 11:16pm UTC

We have not seen any failures since 22:50 UTC, so we are marking this incident as resolved.

Our investigation found that internal DNS lookup failures put a storage cache subsystem into a broken state. Next week we will look into how to make our systems more robust to failures like this one.

Thank you for your patience.

Updated
Feb 10 at 10:44pm UTC

As far as we can tell, things are looking much better. We are continuing to monitor the situation.

Updated
Feb 10 at 10:08pm UTC

We have identified the cause of this issue and are rolling out a fix.

Created
Feb 10 at 09:52pm UTC

We are seeing an elevated rate of weights failing to download on model startup.