Demystifying Agents

How can we make the LLM more than a chatbot?

In today’s newsletter, we’ll unpack Agentic AI (sort of). But first…

🎯 tldr – Why this matters to you

When ChatGPT launched in late 2022, we went from “AI is boring” to “AI is taking over the world” almost overnight. But what many people still miss is this: the core of Generative AI (“GenAI”) is the LLM — and LLMs alone can’t do anything. They generate text. That’s it.

When people talk about Agentic AI, they’re not talking about GPT-4o, Claude, or Grok3 by themselves. They’re talking about what happens when you combine an LLM with tools, feedback loops, and orchestration layers that let the system act, reason, and improve.

In this issue, we break down those building blocks - what an “agent” actually is, how it works, and where it can fail. My hope is you leave with sharper intuition about what AI is, what it’s not, and where it’s going.

Table of Contents

🤖 What Is Agentic AI?

The biggest buzz in AI right now, unquestionably, is the rise of Agentic AI. We’ve moved beyond LLMs, chatbots, and GenAI. Agents are where it’s at.

When we say “Agents” or “Agentic AI” we mean the ability to tell the AI something like "book a business lunch with Sam near our meeting next Wednesday", and the AI will go off and:

  • Check your calendar for the date and time

  • Find the location of the meeting (hopefully it’s there!)

  • Pull in relevant context on Sam, if it’s available (are they a vegan?)

  • Look at Yelp reviews for restaurants within some distance

  • Make a reservation on OpenTable under your name

  • Ensure you received the email confirmation

And if any of these steps fails, or the AI doesn’t have the information it needs, it will ask you for more information. Whether the AI does a good job is one thing, but carrying out a series of tasks like the one above is what we refer to as Agentic AI.
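To make that concrete, here’s a minimal sketch of the lunch-booking task as an ordered plan of tool calls. Every tool name here (`check_calendar`, `lookup_contact`, etc.) is hypothetical — the point is the shape: run each step, and when a step fails, fall back to asking the user, just as described above.

```python
# A minimal sketch of an agent running a plan of tool calls.
# All tool names are hypothetical, for illustration only.

def run_plan(plan, tools, ask_user):
    """Run each step in order; if a step fails, ask the user instead of giving up."""
    context = {}
    for step in plan:
        tool = tools[step["tool"]]
        try:
            context[step["saves"]] = tool(context)
        except Exception as exc:
            # The agent doesn't crash -- it asks for the missing info.
            context[step["saves"]] = ask_user(f"Step '{step['tool']}' failed: {exc}. Help?")
    return context

plan = [
    {"tool": "check_calendar",     "saves": "meeting"},
    {"tool": "lookup_contact",     "saves": "sam_prefs"},
    {"tool": "search_restaurants", "saves": "candidates"},
    {"tool": "make_reservation",   "saves": "booking"},
    {"tool": "confirm_email",      "saves": "confirmation"},
]
```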

Personally, I find it hard to follow all the hype and buzz around this stuff, and I hear that many of you feel similarly. So in a multi-part series, we’re going to break it down into what you need to know, and why it matters.

🪜 Evolutionary ladder of Agentic AI

There are essentially 4 levels of sophistication in Generative AI applications:

  1. LLM → Generic chatbot

  2. RAG + LLM → Chatbot with added context

  3. LLM powered Tool usage → Tool calling agents (today’s post)

  4. Agentic AI → Autonomous AI with feedback loops (planning, routing, multi-step)

Today, we’re going straight to 3 - tool calling agents. In a future post we may cover RAG as well. Let’s go!

🧱 The Core Ingredients

Like most things in AI, there are lots of moving parts and a fair amount of hype. But at its core we’re talking about 3 basic components:

  • LLMs - the neural network that interprets and generates natural language

  • Tools - APIs or external services that actually do stuff in the world

  • Orchestration & Feedback Loops - logic that lets the system plan, act, and revise

🧠 What LLMs Can (and Can’t) Do

I won’t go deep into what an LLM is - that’s well covered. But here’s what you need to know for this context:

  • An LLM is a very smart autocomplete. It takes in a bunch of text - user input, instructions, data, schemas - and predicts the next token, or word, as its response. That’s it. And it’s magical. ✨

  • LLMs cannot send emails, click buttons, or call APIs on their own. They can’t actually do anything (i.e. execute commands). That’s why the first wave of LLMs (ChatGPT, Bard, Claude) felt more like chatbots than assistants or agents.

To go beyond chat - to actually take action - we need a system that can take the text output from the LLM and hand it off to something that can execute. That’s where orchestrators and tools come in.
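Here’s a hedged sketch of that handoff. Since the LLM can only emit text, we have it emit JSON that *describes* an action, and a small dispatcher actually executes it. The `send_email` tool and the sample LLM output are made up for illustration:

```python
import json

# The LLM only produces text; a dispatcher turns that text into action.
# The tool and the "LLM output" below are fabricated for this example.

def send_email(to, subject):
    return f"email sent to {to}: {subject}"

TOOLS = {"send_email": send_email}

def execute(llm_text):
    call = json.loads(llm_text)   # parse the LLM's text output
    fn = TOOLS[call["tool"]]      # look up the real function
    return fn(**call["args"])     # the executor acts, not the LLM

llm_output = '{"tool": "send_email", "args": {"to": "sam@example.com", "subject": "Lunch?"}}'
print(execute(llm_output))  # → email sent to sam@example.com: Lunch?
```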

🔧 What Are Tools?

By "tools," I mean APIs - internet-based software that apps use to talk to each other. Briefly, an API is a structured (in code) way to say:

  • “Hey Gmail, send this message.”

  • “Hey Yelp, show me top-rated sushi near Brooklyn.”

  • “Hey Strava, what’s my fastest ride this year?”

Each software app structures its API differently, and software developers spend a lot of time figuring out the right way to “talk to” different APIs, then updating their code whenever companies change their formats.
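At the HTTP level, “talking to an API” mostly means building a URL with parameters and attaching credentials in a header. Here’s a sketch of the Yelp-style request above — the endpoint and parameter names are illustrative, not Yelp’s real API:

```python
from urllib.parse import urlencode

# Sketch of constructing an API request. The endpoint and parameter
# names are hypothetical, not the real Yelp API.

def build_restaurant_search(term, location, api_key):
    params = urlencode({"term": term, "location": location, "sort_by": "rating"})
    url = f"https://api.example.com/v3/businesses/search?{params}"
    headers = {"Authorization": f"Bearer {api_key}"}  # how you authenticate
    return url, headers

url, headers = build_restaurant_search("sushi", "Brooklyn", api_key="SECRET")
```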

As an aside, to help manage the complexity of exposing many different APIs to LLMs, Anthropic released the Model Context Protocol (MCP), something I discussed a few weeks back.

🤝 What is an Orchestrator?

So if LLMs generate text, and “tools” take in API requests and output data, what connects the two? That would be an orchestrator. An orchestrator is responsible for interpreting the LLM’s structured output (usually in JSON), executing real-world actions like API calls, and passing the results back to the LLM.

In addition to connecting the two, it also handles a lot of the complexity around software - authentication, retries (when it fails), etc.
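Retries are a good example of the “software complexity” the orchestrator absorbs. Here’s a minimal sketch (names are mine, not any particular framework’s): call the tool, and on a transient failure, wait a bit and try again.

```python
import time

# Sketch of orchestrator-style retry logic: retry transient failures
# with a simple linear backoff, then give up and re-raise.

def call_with_retries(fn, attempts=3, backoff=0.01):
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of retries; surface the error
            time.sleep(backoff * attempt)

# Example: a flaky "API" that fails twice, then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("timeout")
    return "ok"

print(call_with_retries(flaky))  # → ok
```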

🔁 How It All Comes Together (Strava Example)

Let’s walk through the architecture diagram above. Say I wanted to know my fastest ride in the past 3 months, based on what I’ve uploaded to Strava. I would perform the following steps:

If you read nothing else, just read the 3 steps below, and you’re good.

  • Step 1: Use the LLM to structure the question

    • Type into an LLM (say Claude) the question “What is my fastest ride in the past 3 months?” This isn’t something that Strava can handle natively.

    • I would also feed Claude Strava’s API structure (which I can download) along with my credentials, so Strava doesn’t reject the request or return someone else’s stats.

    • The output of this step is a properly formatted API request to Strava.

  • Step 2: Use the output of the LLM to authenticate in and call the API

    • The orchestrator (which is another tool, say LangChain) is hooked up to Claude, pulls in the structured output along with my credentials, and calls Strava’s API (with what’s called a GET request).

    • Strava receives the request from the orchestrator, processes it, and, assuming it’s formatted correctly, returns a structured response that contains all my information.

  • Step 3: Use the output of the API to generate a text response

    • The orchestrator then receives this structured output and sends it to the LLM, exactly as you would send it a prompt. It takes the place of a user.

    • The LLM receives this structured response (typically a JSON object) and converts it to natural language.

    • Finally it’ll output “Your fastest ride was 12.1 miles around Prospect Park on March 2, 2025”.

That’s it! User → LLM → Orchestrator → Strava → Orchestrator → LLM → User
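Here’s that round trip as a runnable sketch, with the LLM and Strava both stubbed out. Nothing below is Claude’s or Strava’s real API — the stubs just make the three steps explicit:

```python
import json

# Stubbed sketch of User → LLM → Orchestrator → API → Orchestrator → LLM → User.
# Both the "LLM" and the "Strava API" here are fakes, for illustration only.

def llm(prompt):
    if prompt.startswith("User question:"):
        # Step 1: turn the question into a structured API request.
        return '{"endpoint": "/athlete/activities", "params": {"after": "2025-01-01"}}'
    # Step 3: turn the API's structured response into natural language.
    return "Your fastest ride was 12.1 miles around Prospect Park on March 2, 2025"

def strava_api(request):
    # Step 2: the orchestrator calls the API with the structured request.
    return {"fastest_ride": {"miles": 12.1, "where": "Prospect Park", "date": "2025-03-02"}}

def orchestrate(question):
    api_request = json.loads(llm(f"User question: {question}"))   # Step 1
    api_response = strava_api(api_request)                        # Step 2
    return llm(f"API response: {json.dumps(api_response)}")       # Step 3

print(orchestrate("What is my fastest ride in the past 3 months?"))
```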

This setup is what powers modern AI assistants. The LLM never acts alone - it relies on clear schema (Strava’s API), secure access (authentication), and orchestration logic to make real-world things happen.

⚠️ What can go wrong?

Now that we’ve broken down what’s happening, you might be wondering: “Where can this fail?” A number of ways:

  • Authentication can fail - expired tokens, wrong scopes, or missing credentials.

  • The LLM may generate an incorrect function call - missing arguments, wrong formatting, or choosing the wrong tool altogether.

  • Timing issues between the orchestrator and the API (e.g. timeouts, retries) can break the chain.

  • The API response may be incomplete or structured in a way the LLM misinterprets.

  • The LLM might hallucinate when interpreting the API output, especially if it’s ambiguous or underspecified.
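One mitigation for the “incorrect function call” failure above is to validate the LLM’s output before executing anything. A sketch, with a made-up tool schema:

```python
import json

# Sketch: validate an LLM-generated tool call before executing it.
# The schema and tool name are hypothetical.

SCHEMAS = {"get_activities": {"required": {"after", "before"}}}

def validate_call(llm_text):
    """Return (call, None) if the call is usable, else (None, error message)."""
    try:
        call = json.loads(llm_text)
    except json.JSONDecodeError:
        return None, "not valid JSON"
    if call.get("tool") not in SCHEMAS:
        return None, f"unknown tool: {call.get('tool')}"
    missing = SCHEMAS[call["tool"]]["required"] - set(call.get("args", {}))
    if missing:
        return None, f"missing arguments: {sorted(missing)}"
    return call, None
```

When validation fails, the orchestrator can feed the error message back to the LLM and ask it to try again — one of the simplest feedback loops in the agent stack.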

And as the system gets more complex - chaining tools, planning steps, using memory - these problems multiply. In a future post, we’ll go deeper into common failure modes, but suffice it to say: this Rube Goldberg–like setup is still evolving, and best practices are still being written.

👀 What’s Next: Tool Routing

In practice, agents are more powerful when they do multiple things, not just hit a single tool/API. Next time, we’ll go one level deeper into the agent stack - we’ll talk about tool routing: how the system decides which tool to use in the first place.

If you’re building, evaluating, or just trying to understand how agentic AI works, getting a clear mental model of these components is key.

Please let me know if this was helpful - always happy to chat AI!
