Retries
Configure retry behavior per Step with work_config. Covers what triggers a retry (5xx and transport errors), what is permanent (4xx), the three backoff strategies, and the global defaults that apply when fields are omitted.
When Retries Happen
Retries are triggered when a Work Item is marked work_not_completed:
- Network or transport failures (connection errors, timeouts)
- HTTP 5xx responses from the Step endpoint
HTTP 4xx responses are permanent failures and do not retry. If business logic failed permanently, return a 4xx with application/problem+json.
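The rules above can be sketched as a small classifier. This is only an illustration, not the engine's implementation: the `Outcome` names other than `work_not_completed` are hypothetical, and treating 2xx as success is an assumption, since the text above only specifies the 4xx and 5xx cases.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    COMPLETED = "completed"          # hypothetical name
    RETRY = "work_not_completed"     # state named in the text above
    PERMANENT_FAILURE = "failed"     # hypothetical name


def classify(status: Optional[int], transport_error: bool = False) -> Outcome:
    """Map the result of one Step call to an outcome."""
    # Connection errors and timeouts never produce a status code.
    if transport_error or status is None:
        return Outcome.RETRY
    if 500 <= status <= 599:
        return Outcome.RETRY
    if 400 <= status <= 499:
        return Outcome.PERMANENT_FAILURE
    if 200 <= status <= 299:
        return Outcome.COMPLETED  # assumption: 2xx means success
    # 1xx and 3xx are not covered by the rules above.
    raise ValueError(f"unspecified status code: {status}")
```

For example, `classify(503)` yields `Outcome.RETRY`, while `classify(422)` yields `Outcome.PERMANENT_FAILURE`.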
Configuration
Retry behavior is configured in work_config on the Step:
{
  "id": "charge-card",
  "type": "async",
  "http": { "endpoint": "https://api.example.com/payments", "timeout": 5000 },
  "work_config": {
    "max_retries": 3,
    "init_backoff": 1000,
    "max_backoff": 30000,
    "backoff_type": "exponential"
  }
}
| Field | Meaning |
|---|---|
| max_retries | Max retry attempts. -1 = unlimited. 0 = use global default |
| init_backoff | Initial delay in milliseconds. 0 = use global default |
| max_backoff | Maximum delay in milliseconds. 0 = use global default |
| backoff_type | fixed, linear, or exponential. Empty = use global default |
Steps with max_retries = 0 and empty backoff fields inherit the engine’s global defaults set at startup.
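One way to picture the inheritance is a per-field merge. The sketch below assumes each field falls back to the global default independently. The values in `GLOBAL_DEFAULTS` are placeholders chosen for illustration, not the engine's real defaults, and `WorkConfig` and `resolve` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkConfig:
    max_retries: int = 0      # 0 = use global default, -1 = unlimited
    init_backoff: int = 0     # milliseconds, 0 = use global default
    max_backoff: int = 0      # milliseconds, 0 = use global default
    backoff_type: str = ""    # empty = use global default


# Placeholder values; the real defaults are set when the engine starts.
GLOBAL_DEFAULTS = WorkConfig(
    max_retries=5, init_backoff=500, max_backoff=60000, backoff_type="exponential"
)


def resolve(step: WorkConfig, defaults: WorkConfig = GLOBAL_DEFAULTS) -> WorkConfig:
    """Fill every omitted (zero or empty) field from the defaults."""
    return WorkConfig(
        # -1 is truthy, so "unlimited" is preserved rather than replaced.
        max_retries=step.max_retries or defaults.max_retries,
        init_backoff=step.init_backoff or defaults.init_backoff,
        max_backoff=step.max_backoff or defaults.max_backoff,
        backoff_type=step.backoff_type or defaults.backoff_type,
    )
```

Under these assumptions, a Step that sets only `max_retries` to 3 keeps that value and inherits the three backoff fields.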
Backoff Strategies
- Fixed: constant delay between every attempt
- Linear: delay grows by a fixed increment each attempt
- Exponential: delay doubles each attempt, up to max_backoff
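The three strategies can be compared with a small delay calculator. This is a sketch under stated assumptions: the linear increment is taken to be `init_backoff`, the `max_backoff` cap is applied to every strategy (the text above only states it for exponential), and `backoff_delay` is a hypothetical helper rather than an engine function.

```python
def backoff_delay(attempt: int, init_backoff: int, max_backoff: int,
                  backoff_type: str) -> int:
    """Delay in milliseconds before retry number `attempt` (1-based)."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    if backoff_type == "fixed":
        delay = init_backoff
    elif backoff_type == "linear":
        # Assumption: the fixed increment equals init_backoff.
        delay = init_backoff * attempt
    elif backoff_type == "exponential":
        delay = init_backoff * 2 ** (attempt - 1)
    else:
        raise ValueError(f"unknown backoff_type: {backoff_type!r}")
    # Assumption: max_backoff caps all strategies, not just exponential.
    return min(delay, max_backoff)
```

With `init_backoff` 1000 and `max_backoff` 30000, the exponential delays for retries 1 through 6 come out as 1000, 2000, 4000, 8000, 16000, and 30000 ms; the sixth would be 32000 ms without the cap.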
Retry Lifecycle
Each Work Item retries independently:
- Start: the Step executes for the Work Item
- Success: the Work Item is complete
- 4xx failure: the Work Item fails permanently
- 5xx failure: the Work Item retries (with backoff), then executes again
- max_retries hit: max retries reached, and the Work Item fails permanently
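The lifecycle can also be read as a loop. The following is a minimal sketch, not the engine's scheduler: `run_work_item`, `execute`, `delay_for`, and `sleep` are hypothetical names, transport failures are modelled as `OSError`, and any non-2xx status below 500 is treated as a permanent failure (the text only specifies this for 4xx).

```python
from typing import Callable, Optional


def run_work_item(execute: Callable[[], int],
                  max_retries: int,
                  delay_for: Callable[[int], float],
                  sleep: Callable[[float], None]) -> str:
    """Run one Work Item until it completes or fails permanently."""
    retries = 0
    while True:
        status: Optional[int]
        try:
            status = execute()
        except OSError:
            # ConnectionError and TimeoutError are subclasses of OSError.
            status = None
        retryable = status is None or status >= 500
        if not retryable:
            # Assumption: 2xx completes, everything else here is permanent.
            return "complete" if 200 <= status < 300 else "failed"
        # -1 means unlimited retries.
        if max_retries != -1 and retries >= max_retries:
            return "failed"
        retries += 1
        sleep(delay_for(retries))
```

With `max_retries` set to 3 and an endpoint that always returns 503, this loop calls `execute` four times (the first attempt plus three retries) before giving up.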
Compensation Retries
When a Step has a compensate endpoint configured, compensation attempts use the same work_config retry settings as normal work. The engine treats 5xx compensation responses as transient and retries them with the configured backoff. Once max_retries is exhausted, the compensation is marked comp_failed.
See Compensation for full details.
Design Tips
- Use fixed backoff for quick retry of flaky dependencies
- Use exponential backoff for rate-limited or unstable services
- Keep max_retries low unless your Step is idempotent (honor Argyll-Receipt-Token)
- Return HTTP 4xx when the failure is permanent and retrying would be wrong
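The last two tips can be combined in a Step handler. The sketch below is illustrative only. It assumes the engine sends the same `Argyll-Receipt-Token` header value on every retry of a Work Item and that header names arrive in exactly that case. `handle_charge`, the payload shape, the in-memory stores, and the choice of 422 are all hypothetical.

```python
import json
from typing import Dict, List, Tuple

Response = Tuple[int, Dict[str, str], str]

_results: Dict[str, Response] = {}  # a real Step would use durable storage
charges: List[int] = []             # stands in for the real side effect


def handle_charge(headers: Dict[str, str], payload: dict) -> Response:
    """Process a charge at most once per receipt token."""
    token = headers.get("Argyll-Receipt-Token")
    if token is not None and token in _results:
        # A retry of work already done: replay the stored response.
        return _results[token]

    amount = payload.get("amount")
    if not isinstance(amount, int) or amount <= 0:
        # Permanent failure: retrying the same input cannot succeed.
        problem = {
            "type": "about:blank",
            "title": "Invalid amount",
            "status": 422,
            "detail": "amount must be a positive integer",
        }
        response: Response = (
            422, {"Content-Type": "application/problem+json"}, json.dumps(problem)
        )
    else:
        charges.append(amount)  # the side effect happens exactly once
        response = (
            200, {"Content-Type": "application/json"}, json.dumps({"charged": amount})
        )

    if token is not None:
        _results[token] = response
    return response
```

Calling the handler twice with the same token records a single charge, which is what makes a higher `max_retries` safe for such a Step.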