Errors completing predictions
Resolved
Jan 22 at 07:34pm UTC
A fix has been rolled out to the majority of models, and the error rate has returned to normal levels. We will continue to monitor and address any further occurrences of these errors.
Predictions affected by this incident (many on T4 GPUs, CPU, and a subset of A100s) will appear to be stuck in the starting phase for an extended period of time. These predictions can safely be cancelled and reattempted.
Affected services
Prediction serving
Updated
Jan 22 at 07:12pm UTC
We have identified that the errors were caused by a failed deployment. A fix is being rolled out, and models are regaining the ability to complete predictions as the rollout progresses.
Affected services
Prediction serving
Updated
Jan 22 at 06:31pm UTC
We believe we have identified the source of the failures and are working on an update so that a fix can be rolled out.
Affected services
Prediction serving
Created
Jan 22 at 06:22pm UTC
We are seeing errors in one of our regions. We are investigating and will provide updates as they become available.
Affected services
Prediction serving