Idempotency Keys: Designing APIs That Survive Retries

Photo by Karola G (Kaboompics)

Every double-charged customer in history has the same origin story: a client sent a payment request, the network timed out before the response arrived, and someone or something retried. The server processed both. Nobody wrote a bug in the usual sense; everyone behaved reasonably. The system as a whole still took money twice, and now support is issuing a refund while engineering writes a postmortem.
The cure has been industry-standard for a decade: idempotency keys. The client attaches a unique key to each logical operation, and the server guarantees that a given key executes at most once, replaying the original response for any retry. Stripe popularized the pattern, the IETF HTTP API working group drafted a standard header for it, and every payment-adjacent system I have built since uses it. This post covers the full design: HTTP semantics, server storage, concurrency, and the client contract.
The fundamental problem is that a network failure is ambiguous. When your HTTP call times out, you cannot know whether the request never arrived, arrived and failed, or arrived and succeeded while the response got lost. Stripe's engineering blog calls this the classic two generals problem in API form: the client must choose between never retrying, which risks lost operations, and retrying, which risks duplicates.
In practice you do not even get to choose. Mobile networks in Indonesia drop connections mid-flight constantly; load balancers retry on idle timeouts; queue consumers redeliver on at-least-once semantics; impatient users press the button again. Any POST endpoint that moves money, creates orders, or sends messages will be called twice with the same intent eventually. The only question is whether your API is designed to notice.
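Idempotency is built into most HTTP methods by definition; the gap that matters is POST (PATCH is not idempotent by spec either, but POST is where new operations get created). That gap is exactly where idempotency keys live: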
| Method | Idempotent by spec? | What that means in practice |
|---|---|---|
| GET / HEAD | Yes (also safe) | Pure reads. Retry freely. Stripe's documentation says idempotency keys have no effect here and should not be sent, because they would add nothing. |
| PUT | Yes | Full replacement of a resource. Sending the same PUT twice converges to the same state. Design updates as PUTs where possible and you get retry-safety for free. |
| DELETE | Yes | A second delete typically returns 404 instead of 200 or 204; the resource state is identical. Treat 404-after-delete-retry as success in clients. |
| POST | No | Creates a new resource or triggers an action each time. This is where double charges live, and where the Idempotency-Key header earns its keep. |
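
The table can be encoded in a few lines of client code. The sketch below is illustrative rather than taken from any library: `isIdempotentBySpec` and `deleteSucceeded` are hypothetical helper names, and the method list follows the HTTP semantics specification (RFC 9110), which also counts OPTIONS and TRACE as idempotent.

```typescript
// Methods that HTTP semantics (RFC 9110) define as idempotent.
const IDEMPOTENT_METHODS: ReadonlySet<string> = new Set([
  "GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE",
]);

// True when a request with this method can be retried without an idempotency key.
function isIdempotentBySpec(method: string): boolean {
  return IDEMPOTENT_METHODS.has(method.toUpperCase());
}

// Decide whether a DELETE attempt should count as success.
// A 404 is only treated as success on a retry, where it most likely means
// an earlier attempt already removed the resource.
function deleteSucceeded(status: number, isRetry: boolean): boolean {
  if (status >= 200 && status < 300) return true;
  return isRetry && status === 404;
}
```

A retry wrapper could consult `isIdempotentBySpec` to decide whether a request is safe to resend as-is or needs an `Idempotency-Key` header first.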
A robust implementation is a small state machine keyed by the client's idempotency key. The schema and the four steps below are the shape I deploy in NestJS on PostgreSQL:
-- The storage that makes it work: one row per idempotency key
CREATE TABLE idempotency_keys (
  key           text PRIMARY KEY,
  request_hash  text NOT NULL,  -- sha256 of method+path+body
  response_code int,
  response_body jsonb,
  status        text NOT NULL DEFAULT 'in_progress',
  created_at    timestamptz NOT NULL DEFAULT now()
);
-- NestJS guard sketch (the real one is an interceptor):
-- 1. INSERT ... ON CONFLICT (key) DO NOTHING RETURNING *
-- 2. inserted? → run the handler, store response, status='done'
-- 3. conflict + done? → compare request_hash:
--      same    → replay stored response (original status code, same body)
--      differs → 422: key reused for a different request
-- 4. conflict + in_progress? → 409: original still running, retry later

The race between two simultaneous retries is the part most homemade implementations get wrong. Checking existence with a select and then inserting is a time-of-check to time-of-use bug: both requests pass the check and both execute. The claim must be a single atomic statement, either an insert with on-conflict handling or an advisory lock. This is also why a Redis-only implementation needs care: you want the key claim and the business write to commit or fail together, which a relational transaction gives you for free.
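
The four steps can be exercised without a database. What follows is a minimal in-memory sketch, not the production interceptor: `InMemoryIdempotencyStore`, `claim`, and `complete` are hypothetical names, and a `Map` stands in for the `idempotency_keys` table. The claim is atomic here only because JavaScript runs the lookup and the write in one synchronous call; across several processes, that guarantee has to come from the single `INSERT ... ON CONFLICT` statement.

```typescript
// One stored row per idempotency key, mirroring the status column of the table.
type Row =
  | { status: "in_progress"; requestHash: string }
  | { status: "done"; requestHash: string; code: number; body: unknown };

// What the caller should do after attempting the claim.
type Claim =
  | { action: "execute" }                              // step 2: run the handler
  | { action: "replay"; code: number; body: unknown }  // step 3: same request
  | { action: "reject"; code: 409 | 422 };             // step 3 mismatch or step 4

class InMemoryIdempotencyStore {
  private readonly rows = new Map<string, Row>();

  // Step 1: claim the key. Lookup and write happen in one synchronous call,
  // so no other request can interleave between them in a single process.
  claim(key: string, requestHash: string): Claim {
    const existing = this.rows.get(key);
    if (existing === undefined) {
      this.rows.set(key, { status: "in_progress", requestHash });
      return { action: "execute" };
    }
    if (existing.status === "in_progress") {
      return { action: "reject", code: 409 }; // original still running
    }
    if (existing.requestHash !== requestHash) {
      return { action: "reject", code: 422 }; // key reused for a different request
    }
    return { action: "replay", code: existing.code, body: existing.body };
  }

  // End of step 2: store the handler's response and mark the key done.
  complete(key: string, code: number, body: unknown): void {
    const row = this.rows.get(key);
    if (row === undefined || row.status !== "in_progress") {
      throw new Error(`no in-progress claim for key ${key}`);
    }
    this.rows.set(key, { status: "done", requestHash: row.requestHash, code, body });
  }
}
```

The sketch leaves out what happens when a handler crashes and its row stays `in_progress`; a real system needs its own timeout or cleanup policy for that case.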
Stripe's production behavior is a useful calibration target because it has survived more retry traffic than any system you or I will build. Keys are accepted on all POST requests, have no effect on GET and DELETE since those are already idempotent, and are stored for at least 24 hours, after which a reused key is treated as fresh. Replayed responses return the original result whether it was a success or an error, with an Idempotent-Replayed header so clients can tell.
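Two design choices stand out. First, errors are replayed too: if the original attempt started executing and failed, even with a 500, the retry gets the same error rather than a second execution. The documented exception is a request rejected before execution begins, such as one that fails parameter validation: nothing is saved for it, so it can simply be retried. Second, the key is just an opaque string up to 255 characters, with UUIDs recommended; the server attaches no meaning to its contents. Both choices keep the contract simple enough that every client library can implement it identically.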
Key scoped too broadly
Scope keys per endpoint and per authenticated account, not globally. Otherwise tenant A's retry can collide with tenant B's key and replay someone else's response. Build the storage key as a composite of account, route, and client key.
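
One way to build that composite is sketched below. `scopedStorageKey` is a hypothetical helper, and JSON-encoding the tuple is an assumption of this sketch; a multi-column primary key in the database would serve the same purpose.

```typescript
// Build the storage key from tenant, route, and the client-supplied key.
// JSON-encoding the tuple keeps the parts unambiguous even when one part
// contains a character that a plain delimiter join would use as the separator.
function scopedStorageKey(accountId: string, route: string, clientKey: string): string {
  return JSON.stringify([accountId, route, clientKey]);
}
```

With this scoping, two tenants that happen to send the same client key to the same route end up with different storage keys, so neither can see the other's stored response.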
No request hash stored
Without hashing the request body, a client bug that reuses keys silently returns stale responses for new operations. Store a hash of method, path, and body, and fail loudly on mismatch.
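
A minimal sketch of such a hash, assuming Node.js and its built-in `node:crypto` module. `requestFingerprint` and `matchesOriginal` are hypothetical names, and hashing the raw body string is an assumption: two JSON bodies that differ only in key order or whitespace will hash differently, which may or may not be what you want.

```typescript
import { createHash } from "node:crypto";

// Fingerprint a request as sha256 over method, path, and raw body.
// Newlines separate the parts; an HTTP method or path cannot contain one,
// so two different requests cannot produce the same hash input.
function requestFingerprint(method: string, path: string, rawBody: string): string {
  return createHash("sha256")
    .update(method.toUpperCase())
    .update("\n")
    .update(path)
    .update("\n")
    .update(rawBody)
    .digest("hex");
}

// True when a retry carries the same request as the original attempt.
function matchesOriginal(
  storedHash: string,
  method: string,
  path: string,
  rawBody: string,
): boolean {
  return storedHash === requestFingerprint(method, path, rawBody);
}
```

When `matchesOriginal` returns false for an existing key, the server should reject the request instead of replaying the stored response.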
Unbounded key table
Keys are operational data, not history. Without a TTL job your table grows forever and the unique index slows every claim. Follow the Stripe precedent: purge after 24 hours, or whatever matches your clients' longest realistic retry window.
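
The purge itself is small. The sketch below runs against an in-memory `Map` so it can be tested without a database; `purgeExpiredKeys` is a hypothetical name, and the 24-hour default simply follows the Stripe precedent mentioned above.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Remove every key created more than ttlMs before `now`; returns how many were purged.
// Against the idempotency_keys table from this post, the equivalent would be roughly:
//   DELETE FROM idempotency_keys WHERE created_at < now() - interval '24 hours';
function purgeExpiredKeys(
  rows: Map<string, { createdAt: Date }>,
  now: Date,
  ttlMs: number = DAY_MS,
): number {
  const cutoff = now.getTime() - ttlMs;
  let purged = 0;
  rows.forEach((row, key) => {
    if (row.createdAt.getTime() < cutoff) {
      rows.delete(key); // deleting during Map iteration is safe in JavaScript
      purged += 1;
    }
  });
  return purged;
}
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the job deterministic under test.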
Replaying side effects outside the transaction
If your handler sends an email and then crashes before marking the key done, the retry sends the email again. Keep external side effects out of the idempotent section: enqueue them transactionally and let the queue worker deduplicate.
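
The deduplication half of that advice can be sketched as follows. This is an illustration under stated assumptions, not a queue implementation: `emailMessageFor` and `DedupingWorker` are hypothetical names, the in-memory `Set` stands in for durable storage, and the transactional enqueue itself is not shown. The one idea it demonstrates is deriving the message id from the idempotency key, so a retried request produces a message the worker has already seen.

```typescript
type OutboxMessage = { id: string; kind: "email"; to: string };

// Derive the message id from the idempotency key, so a retried request
// enqueues a message with the same id instead of a brand-new one.
function emailMessageFor(idempotencyKey: string, to: string): OutboxMessage {
  return { id: `${idempotencyKey}:receipt-email`, kind: "email", to };
}

class DedupingWorker {
  private readonly delivered = new Set<string>();
  private readonly send: (msg: OutboxMessage) => void;

  constructor(send: (msg: OutboxMessage) => void) {
    this.send = send;
  }

  // Deliver a message unless one with the same id was already delivered.
  // Returns true when the side effect actually ran.
  process(msg: OutboxMessage): boolean {
    if (this.delivered.has(msg.id)) return false;
    this.send(msg);
    this.delivered.add(msg.id);
    return true;
  }
}
```

Because the id is recorded only after `send` returns, a crash between the two still allows one duplicate; closing that window depends on the guarantees of the real queue and mail provider.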
The server design only works if clients hold up their half: generate one key when the user initiates an operation, then reuse that exact key across every retry of that operation. A new key per HTTP attempt defeats the entire mechanism. In our ERP frontends the key is generated when the confirmation modal opens, so the approve button can be hammered, the network can flap, and the backend still books exactly one journal entry:
// Client side: generate ONE key per logical operation,
// reuse it across every retry of that operation.
// retryWithBackoff is the app's own retry helper (not shown here).
async function createPayment(payload: PaymentDto, idempotencyKey: string) {
  return retryWithBackoff(() =>
    fetch("/api/payments", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Idempotency-Key": idempotencyKey, // same key on every attempt
      },
      body: JSON.stringify(payload),
    })
  )
}

// Call site: create the key once, when the confirmation modal opens,
// and pass the same value to every submit attempt from that modal.
// Never create it at module load or per HTTP attempt.
const idempotencyKey = crypto.randomUUID()

Make idempotency keys mandatory, not optional, on every money-moving POST. Optional keys mean the one client that forgets is the one that double-charges. In NestJS this is a five-line guard: reject any POST to payment routes that lacks the header, and the contract becomes impossible to ignore in integration tests.
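
The check behind such a guard can be written without any framework. The sketch below is not the NestJS guard itself: `requireIdempotencyKey`, the `IncomingRequest` shape, and the `/api/payments` prefix list are hypothetical, the 400 status is this sketch's choice, and the 255-character limit borrows the Stripe figure quoted earlier.

```typescript
type IncomingRequest = {
  method: string;
  path: string;
  // Header names are assumed to be lower-cased already, as Node's HTTP server does.
  headers: Record<string, string | undefined>;
};

type GuardResult =
  | { ok: true; key: string | null } // null: this route does not require a key
  | { ok: false; status: 400; message: string };

// Hypothetical list of route prefixes that move money.
const MONEY_MOVING_PREFIXES = ["/api/payments"];

function requireIdempotencyKey(req: IncomingRequest): GuardResult {
  const isProtected =
    req.method.toUpperCase() === "POST" &&
    MONEY_MOVING_PREFIXES.some((prefix) => req.path.startsWith(prefix));
  if (!isProtected) return { ok: true, key: null };

  const key = req.headers["idempotency-key"]?.trim();
  if (!key) {
    return { ok: false, status: 400, message: "Idempotency-Key header is required" };
  }
  if (key.length > 255) {
    return { ok: false, status: 400, message: "Idempotency-Key must be at most 255 characters" };
  }
  return { ok: true, key };
}
```

In a framework, the guard or middleware would call this function and turn a failed result into the HTTP response before the handler runs.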
Idempotency keys are one of those patterns with an enormous ratio of value to code: a table, an interceptor, and a header turn the scariest class of distributed-systems bug into a non-event. The takeaway is to treat retries as a certainty of operating on real networks, especially on the mobile connections most Indonesian users live on, and to make at-most-once execution a property your API guarantees rather than a behavior you hope for. Build it once, wire it into every POST that matters, and the double-charge postmortem becomes someone else's story.