v1.4: The Hallucination That Almost Made It to Production

Recently, I have been working on some new things that I always wanted to catch up on. In this AI space, it always feels like you’re way behind on concepts! But behind the constant flow of inventions (and sometimes disguised reinventions), there are very important core concepts that are interesting to know in order to build reliable systems.

One such concept that I wanted to work on for some time now is the multi-agent architecture.

I previously worked exclusively with single-agent systems, but recently started a project where multiple agents made more sense. The benefits were clear: separation of concerns meant each agent could focus on its specific task, specialization allowed optimization for particular domains, and the overall workflow became cleaner to maintain.
The architecture felt right on paper, and actually delivered good results.

I’d been curious about MCP servers for a while and saw this as the perfect opportunity to integrate them into my agents. To get my feet wet, I started by adding a few to my agentic coding assistant in my IDE. The AWS documentation MCP server was the main one—it gave my assistant access to current docs, which helped make more informed decisions about infrastructure choices. Worked perfectly on the first try.

Great.

Now on to the actual project.

Typically, when I need to connect my agents to external data sources or custom logic, I write Lambda functions and define them with OpenAPI schemas as action groups. I’ve done this successfully multiple times—it’s the standard Bedrock pattern and works well.
But this time, I wanted to try something different. Instead of the Lambda + action group approach, I’d write my function and expose it as an MCP server through AgentCore’s Gateway that lets Bedrock agents consume MCP servers as if they were native tools.

So far so good.

I wrote the code, started the implementation. But then, I had a thought:

Why not ask the IDE agentic assistant what it thinks of the approach?

And so I did.

The assistant worked dutifully. It read through the AgentCore documentation, examined my code, and confidently told me I was overengineering the solution. There was a direct database integration feature in the gateway that would let me point straight to my data source—no need for complex MCP server functions at all.
Curious, I asked to know more.
The coding assistant did its research, then delivered: Python scripts, beautiful IAC templates, documentation snippets, step-by-step implementation instructions. Everything from start to finish, done for me.

I just had to click “Approve”.

I was genuinely impressed and about to rip out my working implementation to use this cleaner approach.
Then a thought nagged at me:

How come I’d never heard of this database integration feature before?

The suspicion didn’t come from expertise—I’m new to AgentCore and MCP servers. But I’d spent an unbelievable amount of time in the official AgentCore GitHub repository, working through their tutorials, reading their code. I’d never seen this feature mentioned anywhere.
So I asked the assistant for a link so I could verify it myself.
The link took me to a page with no mention of the feature.
Incredulous, I opened a new script and tried importing the libraries the assistant had written. None of them existed.
All of it was fake.
Not even the IAC template was real. When I tried to deploy it, all I got were errors.

This was easily the greatest hallucination I’ve ever seen from an LLM up to this point. Everything looked pristine, well-written, in accordance with AWS documentation. But it was not. The code was so good, that for a moment, I genuinely wished the database integration did exist.

I felt like Jabez Wilson in “The Red-Headed League”—the Sherlock Holmes story where a man is hired for a seemingly legitimate but suspiciously easy job copying encyclopedia entries, only to discover the entire operation was an elaborate fabrication designed to keep him occupied while criminals robbed a bank. My AI assistant had sent me on an equally convincing wild goose chase, complete with official-looking documentation and plausible code.
It was a stark reminder: no matter how sophisticated these agentic systems are at present, their outputs need verification. The assistant had access to real documentation through the MCP server, confidently synthesized information, and presented it with authority. But confidence isn’t correctness.

Bamboozled, led astray, run amok and feeling a bit dumb, I kept my old, reliable MCP server implementation—the one that actually worked.

And I uninstalled my assistant, for good measure.

v1.4: The Hallucination That Almost Made It to Production

Share this:

Leave a comment Cancel reply