Downtime

Errors completing predictions

Jan 22 at 06:22pm UTC
Affected services
Prediction serving

Resolved
Jan 22 at 07:34pm UTC

A fix has been rolled out to the majority of models, and the error rate has returned to normal levels. We will continue to monitor and will address any further occurrences of the errors.

Predictions affected by this incident (many on T4 GPUs, CPU, and a subset of A100s) will appear to be stuck in the starting phase for an extended period of time. These predictions can safely be cancelled and reattempted.
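
For anyone cleaning up affected predictions in bulk, the cancel-and-reattempt step might look something like the sketch below. It is only an illustration, not an official client: the `Prediction` record, the `cancel` and `create` callables, and the 10-minute threshold are all hypothetical stand-ins to be replaced with whatever your own API client provides.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Iterable, List


@dataclass
class Prediction:
    """Hypothetical record of a prediction; field names are assumptions."""
    id: str
    status: str            # e.g. "starting", "processing", "succeeded"
    created_at: datetime
    input: dict


def find_stuck(
    predictions: Iterable[Prediction],
    now: datetime,
    threshold: timedelta = timedelta(minutes=10),  # assumed cutoff, tune as needed
) -> List[Prediction]:
    """Return predictions that have sat in the "starting" phase longer than threshold."""
    return [
        p for p in predictions
        if p.status == "starting" and now - p.created_at > threshold
    ]


def cancel_and_retry(
    predictions: Iterable[Prediction],
    cancel: Callable[[str], None],   # hypothetical: cancels a prediction by id
    create: Callable[[dict], str],   # hypothetical: submits input, returns the new id
    now: datetime,
    threshold: timedelta = timedelta(minutes=10),
) -> List[str]:
    """Cancel each stuck prediction and resubmit it with the same input."""
    new_ids = []
    for p in find_stuck(predictions, now, threshold):
        cancel(p.id)
        new_ids.append(create(p.input))
    return new_ids
```

Passing `cancel` and `create` in as callables keeps the sketch independent of any particular client library; predictions that are not in the starting phase, or that started only recently, are left untouched.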

Updated
Jan 22 at 07:12pm UTC

We have identified an error caused by a failed deployment. A fix is being rolled out, and models are recovering the ability to complete predictions as the rollout progresses.

Updated
Jan 22 at 06:31pm UTC

We believe we have identified the source of the failures and are working on an update so that a fix can be rolled out.

Created
Jan 22 at 06:22pm UTC

We are seeing errors in one of our regions. We are investigating and will provide updates as they become available.