Artificial Intelligence
Agentic AI
Document Intelligence

Agentic AI: Using Tool Calling to Go Beyond RAG

Written by :  
Matt Altberg, Lead ML Engineer
November 19, 2025

This is Part 2 of a three-part series:

“Agentic AI” is one of today’s most popular AI terms. But what does it actually mean?

At its core, Agentic AI describes systems that can make decisions and perform tasks on their own. They don’t just respond—they take action. That autonomy comes from two capabilities:

  • Decision-making: choosing what to do next
  • Taking action: performing the task

For humans, we can easily act on our decisions. For large language models (LLMs), it doesn't have that same capability. An LLM can reason about what should happen, but it can’t take action unless we give it a way to interact with the world.

That bridge is tool calling—the mechanism that turns a passive LLM into an active, task-performing agent.

Standard LLM Workflow

Standard LLM Workflow

Agentic LLM Workflow With Tool Calling

Agentic LLM Workflow With Tool Calling

What Is Tool Calling?

If Agentic AI is about helping LLMs act, tool calling is how they do it. Tool calling allows an LLM to use external functions or APIs (tools) to perform real-world tasks based on its reasoning. Instead of returning only text, the LLM can decide which tool to use, call it with specific inputs, and then continue reasoning using the results.

How tool calling works

  • Give the LLM a list of tools it has access to, along with the required inputs for each tool.
  • Let the LLM decide which tool to use.
  • When the LLM responds, it “calls” one of these tools, which your application then executes.
  • The result from the tool call is sent back to the LLM so it can generate a final response.

Putting it into practice

Consider the question: “What’s the weather today?”

An LLM doesn’t know the current weather. But if you give it a "getWeather" tool:

  • The LLM responds with a call to "getWeather."
  • We return the result from our "getWeather" tool.
  • The LLM synthesizes a natural language response for the user.

Taking tool calling a step further

Agents build on this by using recursive tool calling to achieve their results:

  • The agent receives a list of tools.
  • The agent decides which tool to use.
  • We return the tool’s result.
  • The agent then decides to either (a) continue using another tool, or (b) stop and provide a final response.

Tips for Better Tool Calling

Tool calling isn’t just about giving an LLM access to functions. You can also shape the LLM's behavior:

  1. Add structured arguments : For example, a "reasoning" field will make the LLM explain why it wants to use a tool, even for non-reasoning models.
  2. Use parallel calls : If the LLM can’t do parallel execution natively, create a 'wrapper tool' that gathers inputs for multiple functions at once, and handles the parallelization as part of the function.

These techniques can improve both clarity and speed.

Very Cool Tool Calling Example

(Say that 3 times fast!)

Automating dependency mapping

One of our clients needed to extract many different values from documents. Each value had its own instructions, often like this:

Value A can be found in Section X. If Value A is found, set Value B to ‘yes.’

This instruction means B depends on A.

Mapping a few dependencies by hand is simple. Mapping hundreds—written in natural language and full of cross-references—is not.

Instead of doing it manually, we used tool calling to:

  • Feed each value’s description to the LLM
  • Let the LLM generate dependencies as structured outputs
  • Build a dependency graph
  • Run a topological sort on that graph
  • Return the sorted order to the LLM
  • Let the LLM determine which values could be extracted in parallel

The result: a fully automated process for untangling human-written rules, and outputting a dependency tree that helped parallelize extractions.

Tool Calling vs. Structured Output

These two ideas work together, but they solve different problems:

  • Structured output creates a predictable format for the LLM’s response.
  • Tool calling gives the LLM a way to gather information or take action before producing that response.

In practice, you often use tool calls to collect what you need for the agent—and structured output to produce the final answer in a format fit for use outside of the agent.

Using Agents vs. Hard-Coded Tool Calls

Tool calling can be thought of as an expanded form of RAG. Instead of retrieving only documents, you’re giving the LLM access to any external capability. There are two ways to design this:

Approach 1: Pre-defined tool calls

You decide exactly which tools to call before sending your prompt to the LLM. You execute these tools as needed, and enrich the prompt with context from the pre-determined calls.

Pros:

  • Predictable behavior
  • Fixed cost
  • Fast execution

Cons:

  • No flexibility
  • The LLM can’t explore better inputs or alternative approaches

An example of this approach is basic RAG: you use a specific query to gather relevant documents, enrich your context with this data, and send it to the LLM.

Approach 2: Let the LLM decide

You give the LLM a list of tools, and their specifications, and let it choose.

Pros:

  • Highly flexible
  • Can adapt to unexpected situations

Cons:

  • Harder to control
  • More expensive (more LLM calls, more tokens)
  • Slower

Agents shine in complex, open-ended workflows—but they must be managed carefully. An example of this approach is to give the LLM a 'RAG tool', and let the LLM decide when to use the tool, and with what inputs.

Planning Tool Calls

LLMs don’t naturally reveal their thought process unless you use a dedicated reasoning model. But tool calling gives us a workaround. You can:

  • Ask the LLM to draft a plan before calling any tools
  • Use a reflection step where the LLM reviews previous actions and adjusts
  • Create a second agent whose only job is to critique the first
  • Set hard limits on tool calls and tokens
  • Filter message history to keep size under control

Good agents aren’t just powerful—they’re monitored and constrained.

Keeping Costs Under Control

Left unchecked, an agent can call tools endlessly to verify its own logic. To prevent runaway behavior:

  • Cap tool calls and/or token usage
  • Filter the message history
  • Deduplicate repeated information
  • Add extra arguments that force the LLM to summarize previous steps or tool calls

The goal is to keep the agent smart, efficient, and affordable.

The Bottom Line

Tool calling unlocks the “agentic” in Agentic AI. It lets LLMs make decisions, take action, and operate beyond their training data. But with that power comes the need for clear controls, thoughtful design, and cost-aware engineering.

Used well, tool calling turns LLMs from passive responders into active problem-solvers—capable of navigating complex tasks, coordinating multiple steps, and producing reliable, actionable outcomes.

If you’re building anything more ambitious than a single-prompt chatbot, tool calling is the key to taking your system beyond RAG and into true Agentic AI.