Rate Limiting
Embedd.to uses per-API-key rate limiting to ensure fair usage.
Default Limits
| Tier | Rate Limit |
|---|---|
| Default | 60 requests per minute |
Rate limits are applied per API key using a sliding window algorithm backed by Redis.
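To make the sliding window idea concrete, here is a minimal in-memory sketch of one common variant (a sliding window log). It is not the service's Redis-backed implementation, whose details are not documented here. The class name `SlidingWindowLimiter` and its parameters are hypothetical, chosen for illustration only.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Illustrative in-memory sliding-window log limiter (hypothetical names)."""

    def __init__(self, limit=60, window_seconds=60.0):
        self.limit = limit
        self.window_seconds = window_seconds
        self._hits = {}  # API key -> deque of accepted request timestamps

    def allow(self, api_key, now=None):
        """Return True if a request for api_key fits in the current window."""
        now = time.monotonic() if now is None else now
        hits = self._hits.setdefault(api_key, deque())
        # Drop timestamps that have slid out of the window.
        while hits and hits[0] <= now - self.window_seconds:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # over the limit: a real server would answer 429
        hits.append(now)
        return True
```

Unlike a fixed window that resets on the minute, this approach counts only the requests made in the last `window_seconds`, so capacity frees up gradually as older requests age out.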
Rate Limit Headers
Every response includes rate limit headers:
| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests per minute |
| X-RateLimit-Remaining | Requests remaining in current window |
| X-RateLimit-Reset | Unix timestamp when the window resets |
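A client can turn these headers into a small status record. The sketch below assumes the headers arrive as a plain string-to-string mapping and that all three are present. The function name `parse_rate_limit` is hypothetical.

```python
import time


def parse_rate_limit(headers, now=None):
    """Summarize the rate limit headers of one response (illustrative helper)."""
    now = time.time() if now is None else now
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    reset_at = int(lowered["x-ratelimit-reset"])  # Unix timestamp
    return {
        "limit": int(lowered["x-ratelimit-limit"]),
        "remaining": int(lowered["x-ratelimit-remaining"]),
        # Clamp at zero in case the window has already reset.
        "seconds_until_reset": max(0, reset_at - now),
    }
```

A caller might pause for `seconds_until_reset` once `remaining` reaches zero, rather than sending a request that is likely to be rejected.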
Handling Rate Limits
When you exceed the rate limit, the API returns a 429 Too Many Requests response:
{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Too many requests",
    "resolution": "Retry after 45 seconds"
  }
}
The response includes a Retry-After header with the number of seconds to wait.
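One way to act on a 429 is to read the wait time from `Retry-After` before retrying. The helper below is a sketch: the name `retry_delay` and the fallback of one second when the header is missing are assumptions, not documented API behavior.

```python
def retry_delay(status, headers, default=1.0):
    """Return how many seconds to wait before retrying, or None if no retry is needed."""
    if status != 429:
        return None
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get("retry-after")
    if value is None:
        return default  # assumed fallback when the header is absent
    return float(value)  # the header carries a number of seconds
```

For the example response above, `retry_delay(429, {"Retry-After": "45"})` gives a 45 second wait.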
Best Practices
- Implement exponential backoff: Wait longer between retries on repeated 429s
- Monitor rate limit headers: Track X-RateLimit-Remaining to avoid hitting limits
- Batch operations: Use bulk operations where available instead of individual requests
- Cache responses: Cache read-only responses to reduce API calls
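The exponential backoff practice can be sketched as follows. The names `backoff_delays` and `call_with_backoff` are hypothetical, and `send` stands for any caller-supplied function that performs one request and returns a `(status, body)` pair. The base delay, factor, and cap are illustrative defaults rather than recommended values.

```python
import time


def backoff_delays(base=1.0, factor=2.0, cap=60.0, retries=5):
    """Yield the wait before each retry: base, base*factor, ... capped at cap."""
    for attempt in range(retries):
        yield min(cap, base * factor ** attempt)


def call_with_backoff(send, retries=5, sleep=time.sleep):
    """Call send() until it returns a non-429 status or retries run out."""
    status, body = send()
    for delay in backoff_delays(retries=retries):
        if status != 429:
            break
        sleep(delay)  # wait longer after each consecutive 429
        status, body = send()
    return status, body
```

The `sleep` parameter exists so the waiting can be replaced in tests. In production code, adding a small random jitter to each delay helps keep many clients from retrying at the same moment.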
Exempt Endpoints
The /health endpoint is not rate limited.