Degraded

A40 models scaling slowly

Mar 14 at 02:52am UTC
Affected services
Prediction serving

Resolved
Mar 14 at 09:19am UTC

All but a very small slice of our A40 hardware is back online, and Replicate workloads are processing normally. We again thank you for your patience.

Updated
Mar 14 at 07:33am UTC

We're still working with our provider to get the remaining A40 hardware back online. Meanwhile, almost all A40 workloads are running correctly on Replicate. We'll provide an update when we're back to 100% service levels.

Thank you for your patience.

Updated
Mar 14 at 06:30am UTC

While most A40 hardware is scheduling, we are continuing to see scaling delays for some models. We're working with the upstream provider to resolve the residual problems.

Thank you for your patience.

Updated
Mar 14 at 05:12am UTC

Models running on A40 hardware are starting to recover. We are monitoring the situation. Replicate systems will automatically process any backlog of work.

All other hardware types remain fully functional.

Updated
Mar 14 at 03:11am UTC

Our engineers have confirmed the issue is isolated to the A40 hardware type.

We are working with an upstream hardware provider to restore service.

Created
Mar 14 at 02:52am UTC

Models running on A40 hardware are currently scaling slowly, leading to delays in handling predictions.

We are working to identify what's happening here, and will give an update as soon as we know more.