Degraded

Llama3-70b-chat Delays

Jul 25 at 09:48pm UTC
Affected services
Prediction serving

Resolved
Jul 25 at 11:44pm UTC

This incident has been resolved, and predictions should now be processed normally.

Updated
Jul 25 at 10:06pm UTC

We have significantly increased capacity for the Llama3-70b-chat model. All new predictions should be served within expected time frames. We will continue to work through the backlog of predictions from the load spike.

We will keep monitoring to ensure there are no further spikes in prediction processing times.

Created
Jul 25 at 09:48pm UTC

We have identified a delay in processing predictions for llama3-70b-chat. We are working on expanding capacity to handle the increased load.

This only impacts the official llama3-70b-chat model.