Llama3-70b-chat Delays
Resolved
Jul 25 at 11:44pm UTC
This incident has been resolved, and predictions should now be processed normally.
Affected services
Prediction serving
Updated
Jul 25 at 10:06pm UTC
We have significantly increased capacity for the Llama3-70b-chat model. All new predictions should be served within expected time frames. We will continue to work through the backlog of predictions from the load spike.
We will keep monitoring prediction handling to ensure there are no further spikes in processing time.
Affected services
Prediction serving
Created
Jul 25 at 09:48pm UTC
We have identified a delay in processing predictions for llama3-70b-chat. We are working on expanding capacity to handle the increased load.
This only impacts the official llama3-70b-chat model.
Affected services
Prediction serving