Previous incidents
February 2024
No incidents reported
January 2024
No incidents reported
December 2023
Dec 06, 2023
1 incident
Slow Model Startup
Degraded
Resolved Dec 06 at 10:14pm UTC
We have cleared the backlog of models experiencing slow starts.
Dec 02, 2023
1 incident
NVIDIA Driver Issues
Resolved Dec 02 at 03:15pm UTC
We have identified a few nodes in one of our regions on which NVIDIA drivers are not installed. We have isolated these nodes from further workload scheduling (both inference and training) and will recycle them.
Dec 01, 2023
1 incident
Container Image Pull Delays
Degraded
Resolved Dec 01 at 10:34pm UTC
Thank you for your patience. We have cleared the remaining backlog of pending workloads. Inference and training workloads are now running as expected on all hardware types.