The tuning loop
Do not rewrite the whole agent first. Preserve the working backend, isolate the calls that fail, and tune against a repeatable set of test cases.
Capture: Collect transcripts, tool calls, latency, interruption points, menu items, and caller intent.
Classify: Tag each failure as STT, menu alias, prompt ambiguity, function schema, POS state, or escalation.
Move Logic: Pull deterministic rules out of the prompt and into functions, RPCs, or n8n routing where possible.
Fallback: For stuck orders, capture the caller's number and order intent, then send the restaurant a callback task or SMS.
Regress: Replay the same call set after each change so accuracy improves without breaking earlier wins.
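The Classify and Regress steps can be sketched as a small replay harness. This is a minimal illustration under stated assumptions, not a definitive implementation: `CallCase`, `replay`, `compare`, and the two toy agents are hypothetical names invented here, and a real agent would be invoked through your own backend rather than a string match. Only the six failure tags come from the list above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List, Optional, Tuple


class FailureTag(Enum):
    # The six failure categories named in the Classify step.
    STT = "stt"
    MENU_ALIAS = "menu_alias"
    PROMPT_AMBIGUITY = "prompt_ambiguity"
    FUNCTION_SCHEMA = "function_schema"
    POS_STATE = "pos_state"
    ESCALATION = "escalation"


@dataclass
class CallCase:
    # Hypothetical shape for one captured call in the repeatable test set.
    call_id: str
    transcript: str
    expected_order: List[str]
    tag: Optional[FailureTag] = None  # set once a failure has been classified


Agent = Callable[[str], List[str]]  # assumption: transcript in, order items out


def replay(cases: List[CallCase], agent: Agent) -> Dict[str, bool]:
    """Run every case through the agent and record pass/fail per call."""
    return {c.call_id: agent(c.transcript) == c.expected_order for c in cases}


def compare(before: Dict[str, bool], after: Dict[str, bool]) -> Tuple[List[str], List[str]]:
    """Return (fixed, regressed) call ids between two replay runs."""
    fixed = sorted(cid for cid, ok in after.items() if ok and not before.get(cid, False))
    regressed = sorted(cid for cid, ok in before.items() if ok and not after.get(cid, False))
    return fixed, regressed


# Toy stand-ins for the agent before and after a menu-alias fix.
def old_agent(transcript: str) -> List[str]:
    return ["pepperoni pizza"] if "pepperoni" in transcript else []


def new_agent(transcript: str) -> List[str]:
    aliases = {"pepperoni": "pepperoni pizza", "marg": "margherita pizza"}
    return [item for word, item in aliases.items() if word in transcript]


cases = [
    CallCase("c1", "one large pepperoni please", ["pepperoni pizza"]),
    CallCase("c2", "a marg to go", ["margherita pizza"], FailureTag.MENU_ALIAS),
]

before = replay(cases, old_agent)
after = replay(cases, new_agent)
fixed, regressed = compare(before, after)
print("fixed:", fixed, "regressed:", regressed)
```

A change is safe to keep when `regressed` comes back empty; any id in that list is an earlier win the change broke. Counting open failures per `FailureTag` is one way to decide which category to tune next.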