Tool calling has been around for a while now. In AI terms, that means about two years, which feels like a decade. Developers found it useful early on. You could give a model access to functions, let it decide when to call them, and suddenly the thing that could only talk could actually do things.
But here's the thing: even then, it wasn't always necessary. You could often get the same result by having the model output structured data and handling the logic yourself. Tool calling was elegant, but it was also a choice. One pattern among several.
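If that sounds abstract, here's the shape of it. A toy sketch, not anyone's production code: ask the model for a JSON decision, parse it, and do the dispatching yourself. `ask_model` and `web_search` are stand-ins for whatever you already have.

```python
# A toy sketch of the "structured output" pattern: no tool calling,
# just ask the model for a JSON decision and dispatch yourself.
# ask_model() stands in for any chat-completion call you already use.
import json

def ask_model(prompt: str) -> str:
    # Placeholder for a real model call; returns a canned decision here.
    return '{"action": "search", "argument": "MCP protocol spec"}'

def web_search(query: str) -> str:
    return f"(results for {query!r})"  # stub

def handle(request: str) -> str:
    raw = ask_model(
        'Reply with JSON only, shaped like '
        '{"action": "search" | "none", "argument": "..."}.\n'
        f"User request: {request}"
    )
    decision = json.loads(raw)
    if decision["action"] == "search":
        return web_search(decision["argument"])
    return decision.get("argument", "")

print(handle("Find the MCP protocol spec"))
```

Same outcome as tool calling for simple cases, just with the control flow in your hands instead of the model's.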
Then MCP arrived and everyone jumped on the bandwagon.
Model Context Protocol was Anthropic's answer to a real problem: how do you give AI access to the world in a standardised way? Instead of every integration being custom, you'd have a protocol. Servers that expose capabilities. Clients that consume them. A common language for AI to talk to tools.
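Concretely, a server can be tiny. Here's roughly what one looks like with the official Python SDK's FastMCP helper; this mirrors the SDK's quickstart at the time of writing, and the tool itself is a toy.

```python
# A toy MCP server using the official Python SDK (pip install mcp).
# The decorator registers the function as a capability any MCP client
# can discover and call; type hints and the docstring become its schema.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-server")

@mcp.tool()
def word_count(text: str) -> int:
    """Count the words in a piece of text."""
    return len(text.split())

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default
```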
The idea is good. The execution is genuinely useful. But the hype outran the understanding.
People heard "MCP" and thought it was magic. A new thing that would make AI suddenly capable of anything. They bolted MCP servers onto everything without asking whether they needed to. The protocol became a checkbox, something you had to have, rather than a tool you reached for when it solved a specific problem.
The truth is more mundane. MCP is infrastructure. It's plumbing. Good plumbing matters, but it's not the point. The point is what you build on top of it.
The interesting thing that happened wasn't MCP. It was a change in how people thought about AI altogether.
For years, the focus was AGI. The race to build one model that could do everything. General intelligence. The machine that thinks like a human, or better. Every capability gap was framed as a step on the road to that destination.
Then, somewhere around autumn last year, the conversation shifted. Not officially, not with an announcement, but you could feel it. The focus moved from "how do we build AGI" to "how do we make this useful now, with what we have."
The answer turned out to be: stop trying to make one thing that does everything. Start combining specific capabilities into systems that do specific jobs well.
Tool calling went from a party trick to the point.
Different models went in different directions. Some kept chasing benchmarks, raw capability, the AGI dream. Others leaned into the orchestration play. How do we make this model work well with tools, with other models, with external systems?
Claude went hard on the second path. The focus shifted to being genuinely useful in real workflows. Not just answering questions, but doing things. Reading files, searching the web, executing code, calling APIs. Not because any one of those capabilities is revolutionary, but because combining them thoughtfully is.
The model became a coordinator. Something that could look at a problem, figure out what tools it needed, use them in the right order, handle the results, and keep going until the job was done. That's not AGI. It's something more practical: a system that actually works.
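Strip away the branding and the coordinator is a loop. A hand-rolled sketch, with `call_model` and the tool table standing in for whatever model and SDK you actually use:

```python
# A hand-rolled coordinator loop: ask the model what to do, run the
# tool it picks, feed the result back, repeat until it says it's done.
# call_model() and TOOLS are stand-ins, not any particular vendor's API.

def call_model(history: list[dict]) -> dict:
    # Placeholder: a real call returns either a tool request
    # like {"type": "tool", "tool": "shout", "args": {...}}
    # or a final answer.
    return {"type": "final", "content": "done"}

TOOLS = {
    "read_file": lambda path: open(path).read(),
    "shout": lambda text: text.upper(),
}

def run(task: str, max_steps: int = 10) -> str:
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):  # hard cap: never loop forever
        step = call_model(history)
        if step["type"] == "final":
            return step["content"]
        result = TOOLS[step["tool"]](**step["args"])  # run the chosen tool
        history.append({"role": "tool", "content": str(result)})
    return "gave up after too many steps"
```

Everything interesting lives in how good the model is at choosing the next step, and in how well the loop around it is built.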
The possibilities now are genuinely mind-blowing, but not in the science fiction sense. Mind-blowing in the "I can't believe this actually works" sense. The "this would have taken me three hours and it took three minutes" sense.
You can have a conversation that triggers a search, that pulls data from an API, that runs some code to process it, that generates a document, that sends an email. Not as separate steps you orchestrate yourself, but as a fluid workflow where the AI figures out what needs to happen next.
The individual pieces aren't new. Search has existed forever. APIs have existed forever. Code execution, document generation, email, all old news. What's new is the orchestration layer. The thing that holds it together and makes it feel like one continuous action rather than a dozen separate tools.
This is where the craft comes in. Because "give AI access to tools" is easy to say and hard to do well.
Which tools? Too few and the system is limited. Too many and it gets confused, calling things it shouldn't, missing things it should.
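The main lever here is the tool description, because that's what the model reads when deciding. A sketch of a deliberately small registry; the names and wording are mine, not from any particular SDK:

```python
# Illustrative: a deliberately small tool registry. The description is
# what the model weighs when choosing, so scoping it tightly
# ("order status", not "database access") does real work.
TOOLS = [
    {
        "name": "order_status",
        "description": "Look up the shipping status of one order by its ID. "
                       "Use only when the user supplies an order ID.",
        "parameters": {"order_id": "string"},
    },
    {
        "name": "search_docs",
        "description": "Full-text search over product documentation. "
                       "Use for how-to questions, never for account data.",
        "parameters": {"query": "string"},
    },
]
```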
When to call them? Some tools should be used proactively. Some only when explicitly needed. Some require permission. Getting this wrong means either an AI that does too much without asking or one that asks for permission so often it becomes useless.
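One way to make that concrete is an explicit policy table. A sketch, with hypothetical tool names, of the three tiers just described:

```python
# Illustrative permission tiers: which tools run freely, which only on
# explicit request, and which need a human's approval first.
from enum import Enum

class Policy(Enum):
    PROACTIVE = "proactive"      # model may call this whenever useful
    ON_REQUEST = "on_request"    # only when the user explicitly asks
    NEEDS_APPROVAL = "approval"  # pause and ask a human before running

TOOL_POLICY = {
    "search_docs": Policy.PROACTIVE,
    "export_report": Policy.ON_REQUEST,
    "send_email": Policy.NEEDS_APPROVAL,
    "delete_record": Policy.NEEDS_APPROVAL,
}

def allowed(tool: str, user_asked: bool, human_approved: bool) -> bool:
    policy = TOOL_POLICY.get(tool, Policy.NEEDS_APPROVAL)  # default cautious
    if policy is Policy.PROACTIVE:
        return True
    if policy is Policy.ON_REQUEST:
        return user_asked
    return human_approved
```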
How to handle failure? Tools fail. APIs time out. Searches return nothing useful. The system needs to handle this gracefully, retry when appropriate, fall back when necessary, know when to give up.
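A sketch of what "gracefully" might mean in practice, with stand-in names: retry transient failures with backoff, fall back if you can, and fail loudly rather than silently:

```python
# Illustrative failure handling: retry with exponential backoff, then
# fall back, then give up loudly instead of returning garbage.
import time

def call_with_retries(tool, args, retries=3, fallback=None):
    for attempt in range(retries):
        try:
            return tool(**args)
        except TimeoutError:
            time.sleep(2 ** attempt)  # backoff: 1s, 2s, 4s
        except Exception:
            break  # non-transient error: retrying won't help
    if fallback is not None:
        return fallback(**args)  # e.g. a cached or cheaper data source
    raise RuntimeError("tool failed and no fallback available")
```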
How to chain them? Sometimes the output of one tool is the input to another. Sometimes you need to run things in parallel. Sometimes the right next step depends on what you learned from the last one. The orchestration logic matters as much as the tools themselves.
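A sketch of that last point, using stand-in functions: the two independent fetches run in parallel, and the third step waits on both before it starts.

```python
# Illustrative chaining: two independent fetches run in parallel, then
# their outputs feed a dependent third step. All three are stubs.
from concurrent.futures import ThreadPoolExecutor

def fetch_prices(ticker: str) -> list[float]:
    return [101.0, 102.5]  # stub for an API call

def fetch_news(ticker: str) -> list[str]:
    return ["earnings beat"]  # stub for a search

def summarise(prices: list[float], news: list[str]) -> str:
    return f"{len(news)} stories; last price {prices[-1]}"  # stub for a model call

with ThreadPoolExecutor() as pool:
    prices_f = pool.submit(fetch_prices, "ACME")  # independent, so parallel
    news_f = pool.submit(fetch_news, "ACME")
    report = summarise(prices_f.result(), news_f.result())  # depends on both

print(report)
```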
Tool calling isn't new. MCP isn't magic. What's new is the shift in focus from building one thing that does everything to building systems that combine specific capabilities well.
That shift is why AI feels different now than it did a year ago. Not smarter in some abstract sense. More useful in a practical one. Less "look what it can do" and more "look what I can do with it."
That's the real story. The tools were always there. The change was deciding to actually use them.