Most of the AI automations we see fail do not fail because of the AI. They fail because of human decisions made before a single line of code is written. These are the three most expensive mistakes we have seen in projects over the last 18 months, with real numbers from each one.
💰 How much does an AI automation cost in CR? See the complete pricing guide — WhatsApp bots USD 1,500–3,000, pipelines USD 3,000–5,000, agents USD 5,000–8,000.
Mistake 1: Automating a process nobody has documented
The client says "automate my quoting" and the team jumps to integrate the OpenAI API. Three weeks later the bot quotes well for 60% of cases and badly or erratically for the other 40%. Why? Because that process was never fully written down: the exceptions lived in the head of the person quoting by hand.
The rule is brutal but clear: if your team can't write the process step by step, with all its exceptions, in a document of at most 5 pages, it's not ready to be automated. Before coding anything:
- Take the last hour of quotes (10–20 real cases).
- Ask the person who does them to write the step-by-step, including when they change their mind.
- Read the 5 pages. If you find 3+ cases with no clear rule, the process isn't automatable yet.
How much skipping this step costs: in one of our projects, the client lost USD 4,000 rewriting the bot three times, because each iteration meant going back to the human operator to find out which exception was still missing. Documenting first would have cost about two days of the client team's time.
Mistake 2: Using loose prompts where you need typed code
There are two worlds in AI automation:
- Text with no serious consequences: answering an FAQ, classifying an email, summarising a document. Here a direct, untyped prompt works well: if it fails, the user asks again.
- Actions with consequences: invoicing, moving money, sending a mass email, charging a card. Here the prompt is dangerous.
Why? Because an LLM can hallucinate an amount, a date, or a name and state it with complete confidence. If your system is "the LLM responds with JSON and we execute it," one day it will respond with an amount that has an extra digit and your system will charge ten times more. We've seen it.
The solution is typed code in between: the LLM proposes, a TypeScript/Python layer with validated schemas (Zod, Pydantic) checks that the proposal meets the rules, and only then does it execute. If the LLM proposes something out of range (an amount above 5000 when it should never exceed 1000), the code rejects it before executing.
For critical flows this turns into a simple architecture:
LLM → Schema validation → Business rules → Execute
↑ rejects if invalid — human reviews
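A minimal sketch of this flow, using only the Python standard library so it runs anywhere. In a real project a schema library such as Pydantic would replace the manual checks in `parse_proposal`. The field names (`customer_id`, `amount_usd`), the `ProposalRejected` exception, and the USD 1,000 limit are assumptions made for illustration; the limit is borrowed from the example in the text.

```python
import json
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable

# Assumed business limit, taken from the "should never exceed 1000" example.
MAX_AMOUNT_USD = Decimal("1000")


class ProposalRejected(Exception):
    """The LLM proposal must go to human review instead of being executed."""


@dataclass(frozen=True)
class ChargeProposal:
    customer_id: str
    amount_usd: Decimal


def parse_proposal(raw_json: str) -> ChargeProposal:
    """Schema validation: a JSON object with the expected fields and types."""
    try:
        data = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        raise ProposalRejected(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ProposalRejected("expected a JSON object")

    customer_id = data.get("customer_id")
    if not isinstance(customer_id, str) or not customer_id:
        raise ProposalRejected("customer_id must be a non-empty string")

    amount = data.get("amount_usd")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(amount, bool) or not isinstance(amount, (int, float)):
        raise ProposalRejected("amount_usd must be a JSON number")
    amount_usd = Decimal(str(amount))
    if not amount_usd.is_finite():
        raise ProposalRejected("amount_usd must be finite")

    return ChargeProposal(customer_id=customer_id, amount_usd=amount_usd)


def check_business_rules(proposal: ChargeProposal) -> None:
    """Business rules: well-typed values can still be unacceptable."""
    if proposal.amount_usd <= 0:
        raise ProposalRejected("amount must be positive")
    if proposal.amount_usd > MAX_AMOUNT_USD:
        raise ProposalRejected(
            f"amount {proposal.amount_usd} exceeds limit {MAX_AMOUNT_USD}"
        )


def handle_llm_reply(
    raw_json: str, execute: Callable[[ChargeProposal], None]
) -> ChargeProposal:
    """LLM reply -> schema validation -> business rules -> execute."""
    proposal = parse_proposal(raw_json)
    check_business_rules(proposal)
    execute(proposal)  # reached only when every check has passed
    return proposal


# Usage: the list stands in for the real charging function.
executed: list[ChargeProposal] = []
handle_llm_reply('{"customer_id": "c-42", "amount_usd": 280}', executed.append)
try:
    handle_llm_reply('{"customer_id": "c-42", "amount_usd": 2800}', executed.append)
except ProposalRejected as reason:
    print("sent to human review:", reason)
```

The point of the design is that `execute` is never called from the prompt side: the only path to it runs through both checks, and every rejection carries a reason a human reviewer can read.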
Cost of skipping this layer: one of our clients almost charged USD 2,800 instead of USD 280 on an invoice. An audit caught it 4 hours later, before book-close. Without the human eye it would have reached the customer.
Mistake 3: Not measuring real cost per interaction
It's the quietest mistake and, in the long run, the most expensive. The team launches the bot, in the first days the LLM token cost is negligible (USD 0.20/day), and nobody checks again. Three months later, someone sees the bill: USD 1,200/month in GPT-4o, because the bot is used far more than expected and each interaction consumes more tokens than expected.
Before going to production always calculate:
- Average tokens per interaction (input + output).
- Cost per interaction in dollars, with the chosen model.
- Estimated monthly volume (be conservative — multiply by 2).
- Estimated monthly cost = volume × cost per interaction.
If the estimated monthly cost doesn't fit your budget, change the model (at the prices listed below, moving from GPT-4o to GPT-4o-mini divides the cost by roughly 25 without losing much quality for medium tasks), or shorten the prompt (every word in the system prompt is paid for on every interaction).
| Model (May 2026) | Cost per 1M input tokens | Cost per 1M output tokens |
|---|---|---|
| GPT-4o | USD 5 | USD 15 |
| GPT-4o-mini | USD 0.15 | USD 0.60 |
| Claude Sonnet 4 | USD 3 | USD 15 |
| Claude Haiku 4 | USD 0.80 | USD 4 |
| Gemini 2.5 Flash | USD 0.30 | USD 2.50 |
For a typical conversational bot (about 1,000 input tokens and 1,000 output tokens per interaction on average), GPT-4o-mini costs ~USD 0.0008 per interaction, so 10,000 interactions/month come to about USD 8. GPT-4o at the same volume costs about USD 200.
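The calculation above can be sketched as a small script. The prices are copied from the table. The split of 1,000 input plus 1,000 output tokens per interaction is an assumption, chosen because it approximately reproduces the figures quoted here. The names `ModelPrice`, `cost_per_interaction`, and `monthly_cost` are illustrative, not part of any vendor SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    input_usd_per_million: float
    output_usd_per_million: float


# Prices copied from the table above (USD per 1M tokens).
PRICES = {
    "GPT-4o": ModelPrice(5.00, 15.00),
    "GPT-4o-mini": ModelPrice(0.15, 0.60),
    "Claude Sonnet 4": ModelPrice(3.00, 15.00),
    "Claude Haiku 4": ModelPrice(0.80, 4.00),
    "Gemini 2.5 Flash": ModelPrice(0.30, 2.50),
}


def cost_per_interaction(
    price: ModelPrice, input_tokens: int, output_tokens: int
) -> float:
    """Cost in USD of one interaction, with input and output priced separately."""
    return (
        input_tokens * price.input_usd_per_million
        + output_tokens * price.output_usd_per_million
    ) / 1_000_000


def monthly_cost(
    price: ModelPrice,
    input_tokens: int,
    output_tokens: int,
    interactions_per_month: int,
    volume_safety_factor: float = 1.0,
) -> float:
    """Estimated monthly cost = volume x cost per interaction.

    Pass volume_safety_factor=2.0 to apply the "multiply by 2" rule.
    """
    per_interaction = cost_per_interaction(price, input_tokens, output_tokens)
    return per_interaction * interactions_per_month * volume_safety_factor


# Assumed workload: 1,000 input + 1,000 output tokens, 10,000 interactions/month.
for name, price in PRICES.items():
    per_call = cost_per_interaction(price, 1_000, 1_000)
    month = monthly_cost(price, 1_000, 1_000, 10_000)
    print(f"{name:<18} USD {per_call:.5f}/interaction  USD {month:,.2f}/month")
```

Measuring real input and output token counts from a pilot, instead of assuming an even split, makes the estimate far more reliable, since output tokens cost several times more than input tokens on every model in the table.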
In summary
| Mistake | Symptom | Prevention |
|---|---|---|
| Automating without documenting | Bot fails in 40% of cases | 5-page document first |
| Loose prompts in critical flows | Wrong actions harming the customer | Schema + business rules in code |
| Not measuring cost per interaction | Surprise bill of USD 1,000+/month | Calculate tokens × volume before launch |
Well-done AI automation pays back its cost in weeks, not years. But only if it's designed with discipline, not with enthusiasm.
