Previous incidents

February 2024
No incidents reported
January 2024
No incidents reported
December 2023
Dec 06, 2023
1 incident

Slow Model Startup

Degraded

Resolved Dec 06 at 10:14pm UTC

We have cleared the backlog of models experiencing slow starts.


Dec 02, 2023
1 incident

NVIDIA Driver Issues

Resolved Dec 02 at 03:15pm UTC

We have identified a few nodes in one of our regions where NVIDIA drivers were not installed. We have isolated these nodes from further workload scheduling (both inference and training) and will recycle them.

Dec 01, 2023
1 incident

Container Image Pull Delays

Degraded

Resolved Dec 01 at 10:34pm UTC

Thank you for your patience. We have cleared the remaining backlog of pending workloads. Inference and training workloads are now running as expected on all hardware types.
