Job lifecycle
Statuses
Every run goes through a series of statuses:
| Status | Meaning |
|---|---|
| Pending | Waiting to be picked up by a node |
| Running | Currently executing |
| Completed | Finished successfully |
| Retrying | The latest attempt failed and another attempt is scheduled |
| Cancelled | Cancelled by a user, during shutdown, or because it expired |
| Dead letter | Failed after all retry attempts were exhausted |
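
As a rough illustration, the statuses in the table can be modelled as an enum with a helper that tells terminal statuses apart from the rest. `RunStatus` and `IsTerminal` are hypothetical names chosen for this sketch, not the library's own types; the set of terminal statuses (Completed, Cancelled, Dead letter) is the one named in the Deduplication section.

```csharp
using System;

// Hypothetical mirror of the status table; the library's own type may differ.
public enum RunStatus
{
    Pending,
    Running,
    Completed,
    Retrying,
    Cancelled,
    DeadLetter
}

public static class RunStatusExtensions
{
    // A terminal status is one the run never leaves.
    public static bool IsTerminal(this RunStatus status) =>
        status is RunStatus.Completed or RunStatus.Cancelled or RunStatus.DeadLetter;
}
```

Under this model, `RunStatus.Retrying.IsTerminal()` is false, because a retrying run goes back to Pending once its backoff delay has elapsed.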
Retries
When a job fails and retries are configured:
- The run transitions to Retrying.
- After the backoff delay, the run transitions back to Pending.
- If retries are exhausted, the run is marked Dead letter.
Configure retries on a per-job basis:
```csharp
app.AddJob("Flaky", async () => { /* ... */ })
    .WithRetry(policy =>
    {
        policy.MaxRetries = 5;
        policy.BackoffType = BackoffType.Exponential;
        policy.InitialDelay = TimeSpan.FromSeconds(2);
        policy.MaxDelay = TimeSpan.FromMinutes(5);
        policy.Jitter = true; // adds randomness to prevent thundering herd
    });
```

The default backoff is fixed at 5 seconds with jitter enabled.
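
To get a feel for how an exponential policy with a cap and jitter plays out, here is a minimal sketch. It assumes the delay doubles on every attempt starting from `InitialDelay`, and that jitter scales the delay by a random factor between 0.5 and 1.0. Both are assumptions made for illustration; the library's actual formula and jitter strategy may differ, and `BackoffSketch` is a hypothetical helper.

```csharp
using System;

public static class BackoffSketch
{
    // Assumed formula: initialDelay * 2^(attempt - 1), capped at maxDelay.
    // `attempt` is 1-based: 1 is the first retry.
    public static TimeSpan ExponentialDelay(int attempt, TimeSpan initialDelay, TimeSpan maxDelay)
    {
        if (attempt < 1) throw new ArgumentOutOfRangeException(nameof(attempt));

        // Work in double so large attempt numbers saturate at the cap
        // instead of overflowing TimeSpan.
        double ms = initialDelay.TotalMilliseconds * Math.Pow(2, attempt - 1);
        return TimeSpan.FromMilliseconds(Math.Min(ms, maxDelay.TotalMilliseconds));
    }

    // Assumed jitter strategy: keep between 50% and 100% of the computed delay.
    public static TimeSpan WithJitter(TimeSpan delay, Random rng) =>
        TimeSpan.FromMilliseconds(delay.TotalMilliseconds * (0.5 + rng.NextDouble() * 0.5));
}
```

Under these assumptions, the configuration above (2-second initial delay, 5-minute cap, 5 retries) would wait about 2, 4, 8, 16, and 32 seconds before jitter, so the cap only comes into play for policies with more retries.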
Reruns
Reruns are different from retries. A retry is automatic and part of the same logical execution chain. A rerun is a manual action, typically triggered from the dashboard, that creates a completely independent run with the same job name and arguments.
Reruns don’t inherit the parent’s trace tree. They’re linked via RerunOfRunId for reference, but they’re otherwise standalone.
Deduplication
You can prevent duplicate runs by setting a deduplication ID:
```csharp
await client.TriggerAsync("Import", args, new RunOptions
{
    DeduplicationId = $"import-{date:yyyy-MM-dd}"
});
```

If a non-terminal run with the same deduplication ID already exists, the trigger is rejected. Deduplication IDs are released when the existing run reaches a terminal status (Completed, Cancelled, or Dead letter).
Cron jobs use automatic deduplication based on the job name and scheduled time, so you don’t need to worry about double-firing after a node restart.
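
The acquire-and-release behaviour of deduplication IDs can be pictured with a small in-memory model. This is only a sketch of the rule described above, not how the library stores IDs; `DeduplicationRegistry`, `TryAcquire`, and `Release` are hypothetical names, and the class is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

public sealed class DeduplicationRegistry
{
    // IDs currently held by non-terminal runs.
    private readonly HashSet<string> _active = new HashSet<string>(StringComparer.Ordinal);

    // Returns false when a non-terminal run already holds the ID,
    // which corresponds to the trigger being rejected.
    public bool TryAcquire(string deduplicationId) => _active.Add(deduplicationId);

    // Called when the owning run reaches Completed, Cancelled, or Dead letter.
    public void Release(string deduplicationId) => _active.Remove(deduplicationId);
}
```

In this model, a second trigger with the ID `import-2024-01-01` fails while the first import is still pending or running, and succeeds again once that run has finished.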
Run expiration
Set NotAfter on a run to give it a deadline. If the run has not started by that time, it will eventually be cancelled automatically.
```csharp
await client.TriggerAsync("TimelyReport", args, new RunOptions
{
    NotBefore = DateTimeOffset.UtcNow,
    NotAfter = DateTimeOffset.UtcNow.AddMinutes(30)
});
```

This is useful for time-sensitive work where a late execution is worse than no execution.
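
The window formed by `NotBefore` and `NotAfter` can be expressed as two small predicates. This sketch assumes the start bound is inclusive and the deadline is exclusive, which the text does not specify; `RunWindowSketch`, `CanStart`, and `IsExpired` are hypothetical helpers, and the actual cancellation happens whenever the scheduler next checks, not at the exact deadline.

```csharp
using System;

public static class RunWindowSketch
{
    // A run may start once NotBefore has passed and NotAfter has not.
    // A null bound means "no limit on that side".
    public static bool CanStart(DateTimeOffset now, DateTimeOffset? notBefore, DateTimeOffset? notAfter) =>
        (notBefore is null || now >= notBefore.Value) &&
        (notAfter is null || now < notAfter.Value);

    // A run is eligible for automatic cancellation if it never started
    // and its deadline has passed.
    public static bool IsExpired(DateTimeOffset now, bool hasStarted, DateTimeOffset? notAfter) =>
        !hasStarted && notAfter is not null && now >= notAfter.Value;
}
```

Note that a run which has already started is not treated as expired here: per the text, the deadline only applies to runs that have not started.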
Node health
Nodes send heartbeats every 30 seconds (configurable). If a node goes silent for longer than InactiveThreshold (default 2 minutes), another node recovers its runs by transitioning them through the retry flow. Inactive nodes are pruned from the store.
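
The inactivity check can be sketched as a comparison between each node's last heartbeat and the threshold. The data shape (a dictionary of node ID to last heartbeat time) and the `NodeHealthSketch` helper are assumptions made for this example; the real store and recovery logic are not shown in this page.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NodeHealthSketch
{
    // Returns the IDs of nodes whose last heartbeat is older than the threshold,
    // sorted for stable output.
    public static IReadOnlyList<string> FindInactiveNodes(
        IReadOnlyDictionary<string, DateTimeOffset> lastHeartbeats,
        DateTimeOffset now,
        TimeSpan inactiveThreshold) =>
        lastHeartbeats
            .Where(entry => now - entry.Value > inactiveThreshold)
            .Select(entry => entry.Key)
            .OrderBy(id => id, StringComparer.Ordinal)
            .ToList();
}
```

With the defaults, the 2-minute threshold spans four 30-second heartbeat intervals, so a single delayed heartbeat does not cause a node's runs to be recovered.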