We charged three customers twice. Why did retries keep billing?

Table of Contents

You roll out an AI support system that can finish a billing step on its own. When a customer action requires a charge, it sends the charge call, waits for a reply, and, if the call times out, retries up to 5 times.

What the agent can do:

charge a customer
wait for a reply or retry
ask the billing provider what happened
send the charge through a separate route

The nasty part of payments is simple: once the charge call leaves your system, a timeout does not mean the provider did nothing. It can mean the charge went through, and the answer never came back.

The dashboard goes green

By lunch, the run view looks clean. The case closes because the last try comes back okay.

Charge step recovered after retry 3. Customer case can close.

{
  "action": "charge_customer",
  "attempts": 3,
  "final_status": "recovered",
  "error_family": "timeout_after_send",
  "status_check_run": false,
  "route": "primary"
}

That looks safe because the last line says recovered. But the log only says the third call worked. It does not say the first call failed.

What the agent didn’t do:

It did not ask the billing provider whether attempt 1 had already been committed.
It did not check whether the same timeout kept showing up on the same route.
It did not ask whether anything about the next try would be different: worker, route, timing window, or status read.
It did not stop when its own business record remained unchanged, and no new proof appeared.

That is the whole idea. This is a proof problem, not a stamina problem. A new try gets a turn only if you can say what new proof it might bring before it runs.

Three refunds later, the pattern is obvious

One charge can look like bad luck. Three customers on the same Monday is a category. Your support team sees refunds. Your dashboard sees green recoveries.

Customer	Retry rule	New proof before next try?	What happened
1	Up to 5 tries	No	Charged twice, then refunded
2	Up to 5 tries	No	Charged twice, then refunded
3	Up to 5 tries	No	Charged twice, then refunded

At that point, the cost is not just the duplicate charge. It is all the extra run history and wait time spent repeating the same customer mutation under the same bad conditions.

If the only thing that changes is the attempt number, you are not learning. You are just rolling the same die again.

What the agent should have done instead: after the same failure repeats unchanged, it stops same-charge retries. It asks the billing provider what happened, or it uses a separate route only if that route still shares the same duplicate check or has a clean way to reverse a double charge.

The next retry has to earn its turn

A retry counts only when you can name the new proof it will create.

That cuts against the usual mix of idempotency keys, backoff, and fixed retry counts.

Those are still good tools. They just do not answer the ugly case where the charge may already have been committed, and another retry might repeat it under the same bad conditions.

Even starting a fresh run is just less history unless it truly changes the worker, code version, timing window, or billing route, or gives you a newer status read.

In particle filtering, this is called particle weight degeneracy: more samples from the same thin slice stop updating your estimate. Each retry under the same bad conditions is the same slice.

After enough repeats, more samples from the same thin slice stop changing your view of what happened. In durable distributed AI tool orchestration, that means the hard part is not getting better at retrying. It is knowing when the next retry is still the same bet.

If your team needs engineers who can tell when a retry adds proof and when it only repeats the same charge path, that’s what we do at InTheValley.

Author
Recent Posts

InTheValley

InTheValley publishes research on AI in production — workflows, architecture, and the engineering consequences of building agent-native systems.

What are you looking for?

InTheValley.blog

When AI runs the company.

We charged three customers twice. Why did retries keep billing?

The dashboard goes green

Three refunds later, the pattern is obvious

The next retry has to earn its turn

Leave a Reply Cancel reply

The dashboard goes green

Three refunds later, the pattern is obvious

The next retry has to earn its turn

You may also like

Leave a Reply Cancel reply