Creating a trading agent

And some lessons learned

I’ve been using AI in my day-to-day for a while now. Beyond using it to summarize articles, power search and my web browser, and act as a conversational assistant, I also build AI agents. Today I’ll share a little about my experience building a lightweight1 trading agent2.

❓️ What is it?

I’ve been trading algorithmically for some time now using a platform called Alpaca. Instead of trading on a platform like Fidelity or E-Trade, which rely on UI-based trades3, I can write my own trading algorithms and execute a buy or sell via the Alpaca API and some code I have running on a server.

And it worked just fine! But with GenAI, and more recently MCP servers, I decided to level it up and rewrite it4.

🎓️ Cool, what did you learn in the process?

A few things!

From the perspective of using AI to re-write code, one of the unexpected benefits I found is that I’m less tied to the code I (we?) wrote. In the past I’d hate to scrap some beautifully designed functions or bits of logic that took hours to pull together5. Now it’s trivial to replace entire scripts with better versions. Of course, it’s a slippery slope toward letting critical thinking atrophy, but at least from an emotional-detachment perspective, it’s great!

For my first pass, I used the setup I outlined here (adding the architecture diagram for simplicity):

Example of using an LLM and an API

This is the most basic setup for connecting an LLM to Alpaca - it’s quick to launch and gets the job done, but it comes with challenges. Mostly, I’m relying on the LLM to translate natural language into Alpaca’s required JSON format. There are two key steps here:

  1. Trust that the LLM interprets my intent correctly and

  2. Trust that the LLM converts that intent into the correct format that the Alpaca API can understand

Not only do I have to hope the LLM gets the syntax right (“buy 1 share of AAPL” vs. “buy 100 shares APPL”), but I’m also left managing a moving target as Alpaca updates their API. While I can add guardrails, I end up juggling a lot of complexity between a non-deterministic LLM and a frequently evolving API (a match made in complexity hell).
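To make the guardrail idea concrete, here’s a hedged sketch of one such check: validating the JSON the LLM produces before it ever reaches Alpaca. The field names mirror Alpaca’s order schema, but `KNOWN_SYMBOLS` and `validate_order` are my own inventions for illustration, not part of any official client:

```python
# Hypothetical guardrail: sanity-check an LLM-generated order payload
# before sending it to the broker. Field names follow Alpaca's order
# schema; the validator itself is invented for this sketch.

KNOWN_SYMBOLS = {"AAPL", "MSFT", "GOOGL"}  # in practice, fetch from the broker

def validate_order(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload looks sane."""
    errors = []
    required = {"symbol", "qty", "side", "type", "time_in_force"}
    missing = required - payload.keys()
    if missing:
        errors.append(f"missing fields: {sorted(missing)}")
    if payload.get("side") not in {"buy", "sell"}:
        errors.append(f"bad side: {payload.get('side')!r}")
    if payload.get("symbol") not in KNOWN_SYMBOLS:
        # This is what catches an LLM hallucinating "APPL" instead of "AAPL"
        errors.append(f"unknown symbol: {payload.get('symbol')!r}")
    return errors
```

A check like this catches the “APPL” class of mistake deterministically, but it’s still one more thing to keep in sync as the API evolves.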

🐲 Enter the MCP6

That’s where Alpaca’s new MCP server comes in. As I’ve written before, MCP abstracts away all that integration headache by acting as a smart middleman between the LLM and Alpaca’s API. I still rely on the LLM to understand my intent (1. from above) and route requests to the right tool, but I no longer need to trust it to generate the exact API call or keep up with API changes. MCP takes over the technical translation and execution, making the system more robust and easier to maintain7.
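A minimal sketch of the split this enables, with invented tool names and stub handlers: the LLM only has to pick a tool and fill in arguments, while the tool layer owns the API details:

```python
# Illustrative only: the handlers are stubs standing in for real MCP tools.
# The LLM's job shrinks to emitting {"name": ..., "arguments": {...}};
# the dispatch layer (here, a dict lookup) owns the actual API mechanics.

def get_stock_price(symbol: str) -> dict:
    return {"symbol": symbol, "price": 123.45}  # stub; real impl calls the API

def place_order(symbol: str, qty: int, side: str) -> dict:
    return {"status": "accepted", "symbol": symbol, "qty": qty, "side": side}

TOOLS = {"get_stock_price": get_stock_price, "place_order": place_order}

def dispatch(tool_call: dict) -> dict:
    """Route an LLM tool call to its handler."""
    handler = TOOLS[tool_call["name"]]
    return handler(**tool_call["arguments"])
```

When Alpaca changes its API, only the handler implementations change; the LLM-facing contract stays put.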

🧩 Custom additions

Beyond the evolution from deterministic, one-way software → LLM + API → LLM + MCP, I also added a few helper functions for myself.

Cost 

I wanted to ensure my costs don’t blow up - recall that costs are based on the input tokens sent to and the output tokens generated by the LLM you use (I’m using GPT-4o). I added some logging to output how many tokens each message uses8, plus the cumulative count.

Why cumulative? Because the LLM is going to include all historical messages in the context window. So the context window grows and grows over time; this is what we mean by the LLM’s “memory” - which is helpful in making it smarter for a given use case, but also means every LLM call gets successively more expensive (up to the context window limit9).
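A rough sketch of that kind of tracker. The per-million-token prices below are assumptions based on published GPT-4o rates; check current pricing before trusting the numbers:

```python
# Back-of-envelope cost tracker. Prices are ASSUMED (USD per 1M tokens,
# roughly GPT-4o's published rates at time of writing) - verify before use.

PRICE_PER_1M = {"input": 2.50, "output": 10.00}

class CostTracker:
    def __init__(self):
        self.input_tokens = 0
        self.output_tokens = 0

    def log_call(self, input_tokens: int, output_tokens: int) -> float:
        """Record one LLM call; return the cumulative cost so far."""
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens
        return self.total_cost()

    def total_cost(self) -> float:
        return (self.input_tokens * PRICE_PER_1M["input"]
                + self.output_tokens * PRICE_PER_1M["output"]) / 1_000_000
```

Because the growing context is re-sent on every call, `input_tokens` climbs fastest - which is exactly why the cumulative number is the one worth watching.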

Disposable message

Per the above, not every message needs to be remembered10. To optimize further, I created a simple function: if I prefix a message with -d, it signals to the system not to include that message in the context window for future LLM calls. This way, I can keep costs down by excluding non-essential exchanges from the running tally of tokens.
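A minimal sketch of how that -d convention could work; the message shape and helper names are my own, not from any framework:

```python
# Hypothetical sketch of the "-d" (disposable) message convention:
# a prefixed message is handled normally this turn, but excluded from
# the context sent on future LLM calls.

def add_message(history: list[dict], role: str, text: str) -> list[dict]:
    """Append a message; a '-d ' prefix marks it as disposable."""
    disposable = text.startswith("-d ")
    content = text[3:] if disposable else text
    history.append({"role": role, "content": content, "disposable": disposable})
    return history

def context_for_llm(history: list[dict]) -> list[dict]:
    """Build the context window, dropping disposable messages."""
    return [{"role": m["role"], "content": m["content"]}
            for m in history if not m["disposable"]]
```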

Transparent routing 

Lastly, the way these agents work is that either the LLM answers your question from its own knowledge, or it routes your question to the tool (MCP server) and the answer is derived from the MCP. For example, when I ask my trading agent “What’s the weather like?” I expect it to answer from its training history. But when I ask it “What’s the price of AAPL today?” I expect it to answer from the MCP server. I added logging to ensure that I can see who11 (LLM or MCP) is answering my question. And sometimes it’s both! This gives me transparency over decision making, lets me tweak the prompts if I see the wrong entity answering questions, and makes debugging much easier.
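A small sketch of that logging idea. The inputs here are invented (a list of tool names used plus the model’s text); adapt it to whatever your client actually returns:

```python
# Illustrative routing-transparency label: tag each answer with who
# produced it, for the log line. Inputs are assumed, not a real API shape.

def classify_answer(used_tools: list[str], llm_text: str) -> str:
    """Label the answer's origin: LLM, MCP tool(s), or both."""
    if used_tools and llm_text:
        return f"MCP ({', '.join(used_tools)}) + LLM"
    if used_tools:
        return f"MCP ({', '.join(used_tools)})"
    return "LLM only"
```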

🗺️ Next Steps

There are a lot more things to add to this. For starters, I’m going to leverage some popular AI frameworks1 to better manage token counts, orchestration, and all this custom stuff I built. I’ll also update the prompt to be more specific, including being more opinionated (especially when comparing backtesting algos).

Then I’ll add a UI and analytics on past trades, making it more user friendly. I’ll probably also include more backtesting strategies.

Then I want to get more up-to-date financial data by incorporating services like yfinance, and bring in current news so I can trade on sentiment, e.g. via NewsAPI. By layering in these additional data sources, my trading agent will become much more intelligent and context-aware, allowing for more informed and adaptive trading decisions.

Ultimately, beyond generating crazy high returns (if only…), this is a fun exercise to learn these tools and potentially build something useful. Thanks to Alpaca for continuing to develop their platform.

Curious to try this yourself? Fork my code, break things, and let me know what you build!

1 Some Caveats:

1. This is a simple example of building, almost from scratch, a tool-powered LLM. Not quite an agent, but we’ll get there.
2. I made liberal use of Cursor and GPT-4.1 in writing the code. Though I did 100% of the prompt writing, and frustration-induced huffing and puffing.
3. There are tons of frameworks out there (for orchestrating, integration, evaluating, and memory) to make much of what I built easier or unnecessary. That’s okay! I like to learn (and explain) by doing. All of those will come as we build incrementally.
4. From a software perspective, there’s lots of work to make this “production ready”, such as adding CI/CD, testing, better retry and timeout logic, dockerization, dependency management, etc. All with due time.

2 The returns have been slightly above market in the past few years. I’ve poured many hours to generate questionable alpha.

3 Most platforms give you some flexibility to execute trades based on rules, like “If AAPL exceeds $300, buy 10 shares”. But it’s very basic and doesn’t give you the option to use historical data to inform your strategy.

4 Me and my trusty friend, Cursor.

5 I know I need to “kill my darlings”, but sometimes after spending hours on StackOverflow and getting the code to finally work, hitting delete is easier said than done. With AI writing much of the code, it’s no longer my darling. Killing code has never been easier!

6 Part of the fun is coming up with emojis for these headers. The list is endless! And yes, I appreciate how this identifies me as a Millennial. So be it.

7 I’m lazy, so easier is better.

8 Technically, it’s every LLM call. Remember, when dealing with agents:

  • User asks LLM a question → input tokens $$

  • Orchestrator routes to a tool → LLM generates output tokens $$

  • Orchestrator feeds the LLM output to the tool → generates a response

  • Orchestrator feeds the response to the LLM → input tokens $$

  • LLM produces output (in natural language) for the user → output tokens $$

So there are at least 4 passes. 

9 The context window maximum, which for GPT-4o is 128K. That’s a lot of tokens! It’s also a big bill if you keep hitting the API many times at this amount. Once the context window limit has been hit, it’ll drop old messages and “move forward” in your message history, to keep the most recent 128K tokens.
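That “move forward” behavior can be sketched in a few lines. Token counts are stored per message here for simplicity; in practice you’d measure them with a tokenizer:

```python
# Sketch of context-window trimming: drop the oldest messages until the
# history fits under the limit. Per-message token counts are assumed to
# be pre-computed (e.g. with a tokenizer like tiktoken).

def trim_to_window(history: list[dict], max_tokens: int) -> list[dict]:
    """Keep the most recent messages that fit within max_tokens."""
    total = sum(m["tokens"] for m in history)
    trimmed = list(history)
    while trimmed and total > max_tokens:
        total -= trimmed.pop(0)["tokens"]  # drop the oldest message first
    return trimmed
```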

10 Because of the principle of it, but also because I’m cheap

11 The increasing anthropomorphizing as I work with these tools is bonkers wild.
