Previous incidents
Slow Model Startup
Resolved Dec 06 at 10:14pm UTC
We have cleared the backlog of models experiencing slow starts.
NVIDIA Driver Issues
Resolved Dec 02 at 03:15pm UTC
We have identified a few nodes in one of our regions on which the NVIDIA drivers were not installed. We have isolated these nodes from further workload scheduling (both inference and training) and will recycle them.
Container Images pull delays
Resolved Dec 01 at 10:34pm UTC
Thank you for your patience. We have cleared the remaining backlog of pending workloads. Inference and trainings are now running as expected on all hardware types.
A100 GPU maintenance
Resolved Nov 14 at 06:59pm UTC
The maintenance event has passed. We believe impact to Replicate customers was minimal.
Problems running some A40 models
Resolved Nov 10 at 10:09pm UTC
We have confirmed and corrected any model versions erroneously disabled during this issue.
Use of A40s for predictions and trainings is now working as expected.
Replicate website unavailable
Resolved Nov 08 at 03:53pm UTC
It looks to us like one of our providers had a brief outage and things are now coming back. We're continuing to monitor the situation.
(Technical details: it looks like an upstream provider had a brief DNSSEC zone signing outage.)
Slow model startup in some cases
Resolved Nov 06 at 12:53am UTC
The slow model startup issue has been resolved. We will continue to work internally and with our provider to remediate the root cause.
Slower predictions and webhook delivery
Resolved Nov 05 at 08:03am UTC
The prediction and webhook delivery issues are now resolved. Webhook delivery for older predictions might still be delayed.
Investigating predictions creation issues
Resolved Nov 02 at 06:53pm UTC
The issue has been resolved and predictions are now functioning normally.
Replicate Web Internal Service Error
Resolved Nov 01 at 09:43pm UTC
The rollback of the problematic change has completed, and the Replicate website is now functioning normally again.
SDXL Finetune errors
Resolved Nov 01 at 06:37pm UTC
We have rolled out a fix and confirmed finetunes are working as expected.
Predictions and trainings degraded
Resolved Oct 19 at 02:30pm UTC
Predictions and trainings are back to normal.
replicate.com database maintenance
Resolved Oct 10 at 12:24pm UTC
All done! Thanks for your patience.
Webhook delivery interrupted
Resolved Oct 08 at 08:13pm UTC
We identified a problem affecting a small portion of customers -- slow responses to webhooks caused a backlog in processing outbound webhooks -- and have deployed a change to increase available webhook processing capacity. Webhook delivery is back to normal as of a few minutes ago.
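For customers who receive webhooks, one way to avoid slow responses is to acknowledge each request immediately and do the real work afterwards. The following is only a minimal sketch using the Python standard library, not a description of how Replicate processes webhooks. The names `pending`, `handle_event`, and `start_server`, the in-memory queue, and the assumption that the payload is a JSON object with an `id` field are all hypothetical choices made for illustration.

```python
import json
import queue
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory queue. A production receiver would more likely
# use a durable queue so that events survive a restart.
pending = queue.Queue()
processed = []


def handle_event(event):
    # Placeholder for slow work (database writes, downstream calls, ...).
    # Assumes the payload is a JSON object with an "id" field.
    processed.append(event.get("id"))


def worker():
    # Drain the queue in the background so the HTTP handler never blocks
    # on slow processing.
    while True:
        event = pending.get()
        try:
            handle_event(event)
        finally:
            pending.task_done()


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            event = json.loads(body)
        except ValueError:
            event = None
        if not isinstance(event, dict):
            # Reject payloads that are not a JSON object.
            self.send_response(400)
            self.end_headers()
            return
        # Enqueue first, then acknowledge right away.
        pending.put(event)
        self.send_response(200)
        self.end_headers()

    def log_message(self, format, *args):
        # Silence per-request logging in this sketch.
        pass


def start_server(port=0):
    # Port 0 asks the OS for any free port.
    threading.Thread(target=worker, daemon=True).start()
    server = HTTPServer(("127.0.0.1", port), WebhookHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The handler only parses the body and enqueues it before replying, so its response time does not depend on how long `handle_event` takes.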
Pushing of new versions is broken
Resolved Oct 06 at 11:20am UTC
We've fixed the issue and you should be able to push new versions again.
Slow responses from replicate.com
Resolved Oct 05 at 07:29am UTC
Database load is back to normal, and performance should be back to usual levels.