Degraded

API errors and request delays

Nov 26 at 12:28pm UTC
Affected services
API

Resolved
Nov 26 at 07:16pm UTC

The API has been behaving normally since our upstream provider applied further fixes within the last hour. We will share details of how this happened once they are available.

Updated
Nov 26 at 05:16pm UTC

The partner we're working with on this issue has told us that they are struggling to manage extremely high bandwidth to some of their systems. This is the cause of the impact on Replicate and our customers.

This issue only impacts A100 GPUs. If you're affected and can change your models or deployments to run on other hardware (such as our newly added L40S GPUs), doing so will mitigate the impact you're seeing.
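
For deployments, the hardware change described above could be scripted. The following is a minimal sketch, not an official procedure: it assumes the HTTP API accepts a PATCH on a deployment with a `hardware` field, and the owner, deployment name, token, and the `gpu-l40s` SKU string are all hypothetical placeholders. Check the API reference and the list of available hardware for the real values before sending anything.

```python
import json
import urllib.request

# Assumed API base URL; confirm against the API reference.
API_BASE = "https://api.replicate.com/v1"


def build_hardware_update(owner, name, hardware_sku, token):
    """Build (but do not send) a request that changes a deployment's hardware."""
    body = json.dumps({"hardware": hardware_sku}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/deployments/{owner}/{name}",
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical values for illustration only.
request = build_hardware_update("acme", "my-deployment", "gpu-l40s", "<token>")
# To apply the change, send it with: urllib.request.urlopen(request)
```

The request is built without being sent so that it can be inspected first. Switching back to A100 hardware later would be the same call with a different SKU.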

Updated
Nov 26 at 04:30pm UTC

The fix applied earlier appears to have regressed. We've escalated this issue and will provide an update as soon as we have one.

Updated
Nov 26 at 03:57pm UTC

As of a few minutes ago we believe the underlying issues here have been resolved. We don't fully understand the nature of the problem yet but will be following up with our partners to make sure we (and they) do.

Updated
Nov 26 at 02:32pm UTC

We're continuing to investigate this issue, and are aware of the inconvenience this may be causing. We ask for your patience as we work with our infrastructure providers to identify the source of the disruption.

Created
Nov 26 at 12:28pm UTC

We're aware of an issue affecting A100 hardware that is causing delays and error responses from our API. We are investigating and will provide an update when we have more information.
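
While delays and error responses like these persist, clients may want to retry failed calls rather than fail immediately. The following is a generic sketch of retrying with exponential backoff and jitter. It is not specific to our API: `call_with_backoff` and its parameters are hypothetical names chosen for illustration, and real code should catch only the errors it knows to be transient.

```python
import random
import time


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call fn(), retrying on any exception with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            # Out of attempts: surface the last error to the caller.
            if attempt == max_attempts - 1:
                raise
            # Double the wait each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads out retries from many clients.
            sleep(delay * random.uniform(0.5, 1.0))
```

A call would be wrapped as `call_with_backoff(lambda: make_request())`, where `make_request` stands in for whatever function performs the API call.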