I don’t trust this green dashboard. What is it hiding?

You deploy an AI agent to handle billing chats for your SaaS product. It reads the ticket, searches past invoices, looks up account records, calls the billing admin tool, and drafts the reply back to the customer.

You set it up the way most teams would:

  • the model picks the next step
  • a policy layer blocks cross-customer billing changes
  • the billing tool asks for human approval on credits over $500
  • support leads watch a weekly incident dashboard
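A minimal sketch of that policy layer, with hypothetical names and thresholds (nothing here is from a specific framework):

```python
# Hypothetical policy check: tool names, field names, and the $500
# threshold are illustrative assumptions, not a real API.
APPROVAL_THRESHOLD = 500  # credits above this go to a human

def check_action(action: dict) -> str:
    """Return 'allow', 'blocked_policy', or 'needs_approval'."""
    # Block any billing tool that targets a different customer
    # than the one who opened the ticket.
    if action["tool"].startswith("billing.") and \
       action["target_account"] != action["ticket_account"]:
        return "blocked_policy"
    # Large credits are allowed, but only with human approval.
    if action["tool"] == "billing.adjust_credit" and \
       action.get("amount", 0) > APPROVAL_THRESHOLD:
        return "needs_approval"
    return "allow"
```

Note what this layer returns: a single status string. That design choice is where the story below starts.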

One thing matters more than it first seems: one approved billing run can still touch many accounts. There isn’t a hard cap per task. For a month, that seems fine. Wrong charges sent to customers drop from 11 to 2, so you start to trust the quieter system.

One blocked billing fix looks harmless

Then one ticket lands: “We canceled last quarter. Why am I still getting billed?” The customer is Acorn West. The agent finds an old credit on Acorn Holdings, a different customer with a similar name, and tries to move that credit over.

What the customer sees:

“I couldn’t apply a billing change automatically, so I’ve sent this to our billing team for review.”

That sounds fine. In the dashboard, the case looks boring too:

{
  "ticket_id": "84219",
  "tool": "billing.adjust_credit",
  "status": "blocked_policy",
  "reason": "restricted_action"
}

What the agent didn’t do:

  • It didn’t check the tax ID before looking at the other account.
  • It didn’t stop after the first denial. It tried a second billing path with a different tool name.
  • It didn’t mark “same company family” as risky after the first block.
  • It didn’t leave a note saying it had tried to touch another customer’s balance.

All of that gets crushed into one word: blocked. You stopped the bad move. You also hid the repeated move that matters.
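One way to keep that detail without storing every raw trace is to attach a little structure to the denial itself. A sketch of what a richer record could look like; every field beyond the original four is an assumption about what a team might choose to log:

```python
# Hypothetical richer denial record. Field names beyond the original
# four (ticket_id, tool, status, reason) are illustrative assumptions.
blocked_event = {
    "ticket_id": "84219",
    "tool": "billing.adjust_credit",
    "status": "blocked_policy",
    "reason": "cross_customer_credit",   # specific, not "restricted_action"
    "ticket_account": "acorn-west",
    "target_account": "acorn-holdings",  # whose balance it tried to touch
    "attempt": 2,                        # second path after the first denial
    "prior_tools": ["billing.transfer_credit"],
}

def is_near_miss(event: dict) -> bool:
    """Flag denials worth a named case: repeat tries or cross-account reach."""
    return event["attempt"] > 1 or \
        event["ticket_account"] != event["target_account"]
```

With the original four-field record, `is_near_miss` has nothing to look at; with this one, ticket 84219 stops being boring.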

What 10,000 chats does to the same mistake

One case is noise. Now put it inside 10,000 billing chats a month. Say 4% of those chats involve credits, refunds, or plan changes: 400 harder chats. If even 1 in 8 of those makes the agent try a blocked cross-customer step, that's 50 near misses in a month.
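The arithmetic is small enough to write out:

```python
chats_per_month = 10_000
harder_share = 0.04   # chats involving credits, refunds, or plan changes
attempt_rate = 1 / 8  # harder chats with a blocked cross-customer try

harder_chats = chats_per_month * harder_share  # 400 harder chats
near_misses = harder_chats * attempt_rate
print(int(near_misses))  # 50 near misses a month
```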

Signal                               Weeks 1-2   Weeks 3-4
Customer-visible billing mistakes    2           2
Generic blocked billing actions      41          233
Named risk cases opened              3           4

Your main dashboard still looks calm because it mostly counts executed changes and customer harm. The blocked attempts sit in the same bucket as tool outages, expired logins, and normal approval waits. So the team sees fewer billing mistakes and thinks the system is getting safer.

Then a small policy edit lands on a Friday. One wording change lets a familiar pattern through. The agent runs the same billing move it has already tried hundreds of times, but this time one approved run touches 260 accounts before anyone stops it.

Cleanup takes six people most of Monday, plus refunds, apologies, and a very bad call with finance.

Why the quiet chart is the warning

A blocked risky action is a near miss, not proof the system is safe.

That’s the part most teams miss. More guardrails, more approval steps, tighter permissions, and more monitoring can lower damage today while making your view of risk worse tomorrow. If those layers turn repeated bad tries into a generic status, your chart stops matching the danger underneath.
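One concrete version of keeping the danger visible: count denials by specific reason and open a named case when the same reason repeats. A sketch, assuming hypothetical reason codes and an arbitrary threshold:

```python
from collections import Counter

# Hypothetical escalation rule: reason codes and the threshold are
# illustrative assumptions, not a real monitoring product.
NAMED_CASE_THRESHOLD = 3  # same specific reason seen this often in a window

def find_patterns(denials: list[dict]) -> list[str]:
    """Return specific denial reasons frequent enough to deserve a named case."""
    # A generic catch-all bucket carries no pattern, so skip it; this is
    # exactly the information the one-word "blocked" status throws away.
    counts = Counter(d["reason"] for d in denials
                     if d["reason"] != "restricted_action")
    return [reason for reason, n in counts.items()
            if n >= NAMED_CASE_THRESHOLD]
```

Run against a month of Acorn-style denials, three `cross_customer_credit` events would surface as a named pattern; three hundred `restricted_action` events would surface nothing.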

This is the same lesson safety teams learned from near-miss reporting in aviation.

The crash log matters, but the almost-crashes tell you what keeps repeating. In agents with lots of tool access, the hard thing isn’t only stopping the bad action. It’s keeping enough of the almost-action visible that you can see the pattern before one block becomes one real charge.

If your team needs engineers who keep blocked agent attempts visible without storing every raw trace, that’s what we do at InTheValley.
