# Implementing Web Search for Large Language Models from Scratch

One major limitation of large language models (LLMs) is that their knowledge is typically constrained by a fixed cutoff date, beyond which they’re unaware of recent events or developments. Fortunately, there’s an effective solution to overcome this limitation: integrating real-time web search capabilities. Although web search functionality is now a standard feature in many LLM web interfaces, it isn’t typically available by default when interacting with an LLM through an API. At first glance, implementing such a feature yourself might seem complicated, but approaching it from first principles simplifies things considerably.

In this post, I’ll walk you through how you can quickly and effectively integrate web search functionality into your own LLM projects, from conceptualizing the solution to a practical, step-by-step implementation.

## Conceptual Implementation

To break down the challenge, consider two key ideas:

- A web search is simply a tool available to the LLM.
- A web search operation is essentially just a straightforward API call.

Keeping these principles in mind, integrating web search fits neatly into the established “ReAct” [1] prompting framework, where an LLM iteratively reasons (thinks) and acts (uses tools) until it achieves its goal. The following diagram illustrates how this integration works in practice:

```mermaid
sequenceDiagram
    autonumber
    actor User
    participant WebSearch as Web Search
    participant Function as Tool
    participant ChatClient as Chat Client
    participant LLM
    User->>Function: Define the web search (API call)
    Function->>User: Retrieve JSON function definition
    User->>ChatClient: Create Chat Client including web search tool
    User->>ChatClient: Send prompt
    ChatClient->>LLM: Send prompt with web search tool
    loop Reasoning and Acting
        LLM->>LLM: Reasoning: Analyze prompt and tools
        LLM->>ChatClient: Acting: Generate tool call: web search
        ChatClient->>Function: Call web search function
        Function->>WebSearch: Call web search API
        WebSearch->>Function: Return web search result
        Function->>ChatClient: Return web search result
        ChatClient->>LLM: Acting: Pass on web search result
        LLM->>LLM: Reasoning: Incorporate result and continue reasoning
    end
    LLM->>ChatClient: Return final result
    ChatClient->>User: Output final result
```
Now that we have a clear conceptual understanding, let’s walk through the actual implementation step-by-step.
## Step 1: Choosing the Right Search Engine
When choosing a web search API for this project, simplicity was my top priority. My goal was straightforward: I wanted an API where I could simply input a search query and immediately receive a response that was easy for both me and the LLM to parse.
Initially, I considered the obvious choices: popular search engines like Google and Bing. However, I quickly realized that both come with a level of complexity that exceeded my needs.
Continuing my search, I came across Tavily (a service I have no affiliation with) and found it refreshingly straightforward. Tavily offers an API tailored specifically to LLM use cases, returning concise, well-structured results. It also provides a generous free tier of 1,000 requests per month, making it ideal for experimentation and quick prototyping.
Another potential option I considered was Brave Search, which also appears to offer an accessible API with minimal overhead. It may be worth exploring if you’re looking for alternatives with similar simplicity.
Ultimately, I chose Tavily because of its minimal setup, clean responses, and ease of integration—all of which aligned perfectly with the goals of this project.
## Step 2: Test Driving the Tavily API

Let’s get started by installing the Tavily Python library. (As usual, there’s a Jupyter notebook version of this blog post if you want to run the code yourself.)

```python
!pip install tavily-python
```
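You’ll also need a Tavily API key. The code below expects it in a `.env` file in the working directory; the file name and variable name match what the code reads later, and the value shown is of course a placeholder:

```
# .env
TAVILY_API_KEY=your-tavily-api-key
```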
Before integrating Tavily with our LLM, it’s good practice to test the service independently. While the free tier provides plenty of room for testing, I initially chose to use a cached mock response to minimize unnecessary API calls. This approach ensures our logic and parsing methods work correctly before we start consuming real API credits.
Here is an example of a mock response from Tavily for the query “Who is Leo Messi?”:
```python
mock_response = """{
    "answer": null,
    "follow_up_questions": null,
    "images": [],
    "query": "Who is Leo Messi?",
    "response_time": 1.75,
    "results": [
        {
            "content": "Lionel Messi is an Argentine-born football (soccer) player who has been named the world’s best men’s player of the year seven times (2009–12, 2015, 2019, and 2021). In 2022 he helped Argentina win the World Cup. Naturally left-footed, quick, and precise in control of the ball, Messi is known as a keen pass distributor and can readily thread his way through packed defenses. He led Argentina’s national team to win the 2021 Copa América and the 2022 World Cup, when he again won the Golden Ball award.",
            "raw_content": null,
            "score": 0.84027237,
            "title": "Lionel Messi | Biography, Trophies, Records, Ballon d'Or, Inter Miami ...",
            "url": "https://www.britannica.com/biography/Lionel-Messi"
        },
        {
            "content": "Widely regarded as one of the greatest players of all time, Messi set numerous records for individual accolades won throughout his professional footballing career such as eight Ballon d'Or awards and four the Best FIFA Men's Player awards. A prolific goalscorer and creative playmaker, Messi has scored over 850 senior career goals and has provided over 380 assists for club and country. [16] Born in Rosario, Argentina, Messi relocated to Spain to join Barcelona at age 13, and made his competitive debut at age 17 in October 2004. An Argentine international, Messi is the national team's all-time leading goalscorer and most-capped player. His style of play as a diminutive, left-footed dribbler drew career-long comparisons with compatriot Diego Maradona, who described Messi as his successor.",
            "raw_content": null,
            "score": 0.8091708,
            "title": "Lionel Messi - Wikipedia",
            "url": "https://en.wikipedia.org/wiki/Lionel_Messi"
        }
    ]
}"""
```
To work with these responses, let’s first define a small helper that pretty-prints JSON:

```python
from IPython.display import display, Code

def display_json(data):
    """
    Nicely displays JSON content: indented + syntax-highlighted.

    Args:
        data (str | dict | list): The JSON or string to display.
    """
    # Parse if input is a string
    if isinstance(data, str):
        try:
            data = ast.literal_eval(data)
        except Exception as e:
            print("Failed to parse string input as JSON-like structure.")
            print("Error:", e)
            return

    # Convert to pretty JSON string
    pretty_json = json.dumps(data, indent=4, sort_keys=True, ensure_ascii=False)

    # Display with syntax highlighting
    display(Code(pretty_json, language='json'))
```
We also need a parser that turns the mock response string back into a dictionary:

```python
import json
import ast

def parse_mock_response(response_str: str):
    try:
        return json.loads(response_str)
    except json.JSONDecodeError:
        try:
            return ast.literal_eval(response_str)
        except Exception as e:
            print("❌ Failed to parse mock response:", e)
            return {}
```
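A quick sanity check (not in the original notebook) confirms the parser returns a plain dictionary we can index into:

```python
parsed = parse_mock_response(mock_response)
print(parsed["query"])              # Who is Leo Messi?
print(parsed["results"][0]["url"])  # https://www.britannica.com/biography/Lionel-Messi
```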
Let’s run our first query.
```python
import os
from dotenv import load_dotenv

use_api = False

load_dotenv()
api_key = os.getenv("TAVILY_API_KEY")
if not api_key:
    raise ValueError("TAVILY_API_KEY not found in .env file.")

from tavily import TavilyClient

query = "Who is Leo Messi?"

if use_api:
    tavily_client = TavilyClient(api_key=api_key)
    response = tavily_client.search(
        query=query,
        search_depth="basic",
        include_answer=False,
        include_raw_content=False,
        max_results=2
    )
else:
    response = parse_mock_response(mock_response)

display_json(response)
```
```json
{
    "answer": null,
    "follow_up_questions": null,
    "images": [],
    "query": "Who is Leo Messi?",
    "response_time": 1.75,
    "results": [
        {
            "content": "Lionel Messi is an Argentine-born football (soccer) player who has been named the world’s best men’s player of the year seven times (2009–12, 2015, 2019, and 2021). In 2022 he helped Argentina win the World Cup. Naturally left-footed, quick, and precise in control of the ball, Messi is known as a keen pass distributor and can readily thread his way through packed defenses. He led Argentina’s national team to win the 2021 Copa América and the 2022 World Cup, when he again won the Golden Ball award.",
            "raw_content": null,
            "score": 0.84027237,
            "title": "Lionel Messi | Biography, Trophies, Records, Ballon d'Or, Inter Miami ...",
            "url": "https://www.britannica.com/biography/Lionel-Messi"
        },
        {
            "content": "Widely regarded as one of the greatest players of all time, Messi set numerous records for individual accolades won throughout his professional footballing career such as eight Ballon d'Or awards and four the Best FIFA Men's Player awards. A prolific goalscorer and creative playmaker, Messi has scored over 850 senior career goals and has provided over 380 assists for club and country. [16] Born in Rosario, Argentina, Messi relocated to Spain to join Barcelona at age 13, and made his competitive debut at age 17 in October 2004. An Argentine international, Messi is the national team's all-time leading goalscorer and most-capped player. His style of play as a diminutive, left-footed dribbler drew career-long comparisons with compatriot Diego Maradona, who described Messi as his successor.",
            "raw_content": null,
            "score": 0.8091708,
            "title": "Lionel Messi - Wikipedia",
            "url": "https://en.wikipedia.org/wiki/Lionel_Messi"
        }
    ]
}
```
## Step 3: Formatting Search Results for the LLM
Although large language models are capable of parsing raw JSON, this format isn’t ideal. It introduces unnecessary token overhead and lacks the readability and structure that both humans and LLMs benefit from. To make the results easier to consume, we’ll reformat the API response into clean, human-readable Markdown. This improves clarity, ensures more predictable behavior from the LLM, and also makes the output easier to debug and inspect during development.
```python
def format_tavily_results(response: dict, max_results: int = 5, snippet_length: int = 5000) -> str:
    """
    Formats the Tavily search API JSON response into a readable, LLM-friendly string.

    Args:
        response (dict): The Tavily API response.
        max_results (int): Maximum number of results to include.
        snippet_length (int): Max number of characters to show from each result's content.

    Returns:
        str: Formatted, readable string for LLM consumption.
    """
    results = response.get("results", [])
    if not results:
        return "No results found."

    formatted = "### Web Search Results:\n\n"
    for i, result in enumerate(results[:max_results], start=1):
        title = result.get("title", "Untitled")
        url = result.get("url", "")
        content = result.get("content", "") or ""
        snippet = content.strip().replace("\n", " ")[:snippet_length].rstrip()

        # Clean up unfinished sentences if needed
        if snippet and not snippet.endswith(('.', '!', '?')):
            snippet += "..."

        formatted += f"{i}. **[{title}]({url})**\n - {snippet}\n\n"

    return formatted.strip()

formatted = format_tavily_results(response, snippet_length=300)
print(formatted)
```
```markdown
### Web Search Results:

1. **[Lionel Messi | Biography, Trophies, Records, Ballon d'Or, Inter Miami ...](https://www.britannica.com/biography/Lionel-Messi)**
 - Lionel Messi is an Argentine-born football (soccer) player who has been named the world’s best men’s player of the year seven times (2009–12, 2015, 2019, and 2021). In 2022 he helped Argentina win the World Cup. Naturally left-footed, quick, and precise in control of the ball, Messi is known as a ke...

2. **[Lionel Messi - Wikipedia](https://en.wikipedia.org/wiki/Lionel_Messi)**
 - Widely regarded as one of the greatest players of all time, Messi set numerous records for individual accolades won throughout his professional footballing career such as eight Ballon d'Or awards and four the Best FIFA Men's Player awards. A prolific goalscorer and creative playmaker, Messi has scor...
```
The LLM can now easily consume the text summaries.
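As a rough check of the token-overhead claim above, you can compare token counts of the raw JSON and the Markdown version with `tiktoken` (an extra dependency not used elsewhere in this post; assumes a recent version that knows the gpt-4o encoding):

```python
import json
import tiktoken  # pip install tiktoken

enc = tiktoken.encoding_for_model("gpt-4o")

raw_json = json.dumps(response, indent=4)
markdown = format_tavily_results(response)

print(f"raw JSON: {len(enc.encode(raw_json))} tokens")
print(f"markdown: {len(enc.encode(markdown))} tokens")
```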
## Step 4: Defining a Web Search Tool for the LLM

Next, we’ll encapsulate our functionality into a single, reusable function that performs both the API call and the formatting. Additionally, we need to write proper documentation so that the LLM can understand how to use our tool:
```python
def search_web(query: str) -> str:
    """
    Searches the web using Tavily and returns a formatted result.

    Args:
        query (str): The search query string.

    Returns:
        str: Formatted search results for LLM input.
    """
    if use_api:
        tavily_client = TavilyClient(api_key=api_key)
        response = tavily_client.search(
            query=query,
            search_depth="basic",
            include_answer=False,
            include_raw_content=False,
            max_results=5
        )
    else:
        response = parse_mock_response(mock_response)

    return format_tavily_results(response)
```
Here is an example call, including the result the LLM would receive:
```python
result = search_web("Who is Leo Messi?")
print(result)
```
```markdown
### Web Search Results:

1. **[Lionel Messi | Biography, Trophies, Records, Ballon d'Or, Inter Miami ...](https://www.britannica.com/biography/Lionel-Messi)**
 - Lionel Messi is an Argentine-born football (soccer) player who has been named the world’s best men’s player of the year seven times (2009–12, 2015, 2019, and 2021). In 2022 he helped Argentina win the World Cup. Naturally left-footed, quick, and precise in control of the ball, Messi is known as a keen pass distributor and can readily thread his way through packed defenses. He led Argentina’s national team to win the 2021 Copa América and the 2022 World Cup, when he again won the Golden Ball award.

2. **[Lionel Messi - Wikipedia](https://en.wikipedia.org/wiki/Lionel_Messi)**
 - Widely regarded as one of the greatest players of all time, Messi set numerous records for individual accolades won throughout his professional footballing career such as eight Ballon d'Or awards and four the Best FIFA Men's Player awards. A prolific goalscorer and creative playmaker, Messi has scored over 850 senior career goals and has provided over 380 assists for club and country. [16] Born in Rosario, Argentina, Messi relocated to Spain to join Barcelona at age 13, and made his competitive debut at age 17 in October 2004. An Argentine international, Messi is the national team's all-time leading goalscorer and most-capped player. His style of play as a diminutive, left-footed dribbler drew career-long comparisons with compatriot Diego Maradona, who described Messi as his successor.
```
## Step 5: Exposing the Web Search Tool to the LLM

To expose our web search to an LLM, we need to provide the tool definition to the LLM. Jeremy Howard shared a practical and flexible approach for this in his Hackers’ Guide [2]. The core idea is to use Python’s introspection capabilities to extract the function signature and documentation and convert them into a schema the LLM can understand. The version used here builds on that idea, with minor updates to match recent changes in the OpenAI tools API.

The most important part of this process is clearly documenting the function’s interface (its name, parameters, and behavior) so that the LLM knows how and when to call it. This allows the model to use the tool automatically, without any additional prompting logic or manual wiring.
```python
from pydantic import create_model
import inspect, json
from inspect import Parameter

def get_schema(f):
    kw = {n: (o.annotation, ... if o.default == Parameter.empty else o.default)
          for n, o in inspect.signature(f).parameters.items()}
    # update: schema -> model_json_schema
    s = create_model(f'Input for `{f.__name__}`', **kw).model_json_schema()
    # update: added function level in tools json
    function_params = dict(name=f.__name__, description=f.__doc__, parameters=s)
    return dict(type="function", function=function_params)

funcs_ok = {'search_web'}

def get_tools():
    return [get_schema(search_web)]

get_tools()
```
```python
[{'type': 'function',
  'function': {'name': 'search_web',
   'description': '\n    Searches the web using Tavily and returns a formatted result.\n\n    Args:\n        query (str): The search query string.\n\n    Returns:\n        str: Formatted search results for LLM input.\n    ',
   'parameters': {'properties': {'query': {'title': 'Query', 'type': 'string'}},
    'required': ['query'],
    'title': 'Input for `search_web`',
    'type': 'object'}}}]
```
Now, any compatible LLM can automatically invoke `search_web` when needed.
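For reference, this is roughly what handing that schema directly to the OpenAI chat completions API looks like, before we wrap it in a chat client (a minimal sketch, not from the original post; it assumes `OPENAI_API_KEY` is set in your environment):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
completion = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Who is Leo Messi?"}],
    tools=get_tools(),
)

# If the model decides to search, the reply contains a tool call instead of text:
print(completion.choices[0].message.tool_calls)
```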
## Step 6: Reusing a Chat Client for LLM Communication

Let’s use a simple custom client (covered in more detail in a previous blog post) to try out our search tool.
The `ChatMessages` class below manages the conversation history:

```python
from IPython.display import display, Markdown

class ChatMessages:
    def __init__(self):
        """Initializes the Chat."""
        self._messages = []

    def _append_message(self, role, content):
        """Appends a message with specified role and content to messages list."""
        self._messages.append({"role": role, "content": content})

    def append_system_message(self, content):
        """Appends a system message with specified content to messages list."""
        self._append_message("system", content)

    def append_tool_message(self, content, tool_call_id):
        """Appends a tool message with specified content to messages list."""
        self._messages.append({"role": "tool", "content": content, "tool_call_id": tool_call_id})

    def append_user_message(self, content=None, base64_image=None):
        """Appends a user message with specified content to messages list."""
        if base64_image:
            image_content = [
                {"type": "text", "text": content},
                {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{base64_image}"}}
            ]
            self._messages.append({"role": "user", "content": image_content})
        else:
            self._append_message("user", content)

    def append_assistant_message(self, content=None, tool_calls=None):
        """Appends an assistant message with optional content and tool calls."""
        message = {"role": "assistant"}

        if content is not None:
            message["content"] = content

        if tool_calls is not None:
            message["tool_calls"] = tool_calls

        self._messages.append(message)

    def get_messages(self):
        """Returns a shallow copy of the messages list."""
        return self._messages[:]

    def get_last_assistant_message(self):
        """Returns the content of the last assistant message."""
        return self._messages[-1]['content']

    def get_debug_view(self):
        """Returns the debug view of the chat messages formatted as Markdown."""
        debug_view = []
        for message in self._messages:
            role = message.get('role')
            content = message.get('content', '')

            if role == 'system' or role == 'user':
                debug_view.append(f"**{role}**: {content}\n")
            elif role == 'assistant':
                if 'tool_calls' in message:
                    debug_view.append("**tool calls**\n")
                    for i, tool_call in enumerate(message['tool_calls'], start=1):
                        function_name = tool_call.function.name
                        arguments = tool_call.function.arguments
                        tool_call_id = tool_call.id
                        debug_view.append(f"{i}. tool: {function_name}: {arguments} (tool call id: {tool_call_id})\n")
                else:
                    debug_view.append(f"**assistant**: {content}\n")
            elif role == 'tool':
                tool_call_id = message.get('tool_call_id', '')
                debug_view.append(f"**tool result**: {content} (tool call id: {tool_call_id})\n")

        return Markdown('\n'.join(debug_view))
```
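As a quick smoke test (not part of the original post), the container produces exactly the message format the OpenAI API expects:

```python
msgs = ChatMessages()
msgs.append_system_message("Answer concisely.")
msgs.append_user_message("Hello!")
print(msgs.get_messages())
# [{'role': 'system', 'content': 'Answer concisely.'}, {'role': 'user', 'content': 'Hello!'}]
```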
The `ChatClient` itself sends the history to the OpenAI API and dispatches any tool calls the model requests:

```python
model_name = "gpt-4o"

from dotenv import load_dotenv
import os

load_dotenv(".env")

from openai import chat

class ChatClient:
    def __init__(self, system_message=None, tools=None):
        """Initializes the Chat with the system message."""
        self._chat_messages = ChatMessages()
        if system_message:
            self._chat_messages.append_system_message(system_message)
        self._tools = tools

    def call_tool(self, tool_call):
        """Returns the result of an LLM tool call."""
        fc = tool_call.function  # Updated
        if fc.name not in funcs_ok:
            return print(f'Not allowed: {fc.name}')
        f = globals()[fc.name]
        return f(**json.loads(fc.arguments))

    def call_tools(self, tool_calls):
        """Processes the tool calls of the LLM response and calls the LLM API again."""
        for tool_call in tool_calls:
            self._chat_messages.append_tool_message(
                content=str(self.call_tool(tool_call)),
                tool_call_id=tool_call.id)

        self.ask_gpt()

    def get_model_response(self):
        """Calls the LLM chat completion API."""
        return chat.completions.create(
            model=model_name,
            messages=self._chat_messages.get_messages(),
            tools=self._tools)

    def ask_gpt(self, prompt=None, base64_image=None):
        if base64_image:
            self._chat_messages.append_user_message(content=prompt, base64_image=base64_image)
        elif prompt:
            self._chat_messages.append_user_message(prompt)

        c = self.get_model_response()
        content = c.choices[0].message.content
        tool_calls = c.choices[0].message.tool_calls

        self._chat_messages.append_assistant_message(
            content=content,
            tool_calls=tool_calls)

        if tool_calls:
            self.call_tools(tool_calls)

        return Markdown(self._chat_messages.get_last_assistant_message())
```
Let’s quickly confirm that we can talk to the large language model:
= ChatClient("Answer in a very concise and accurate way")
chat_client "Name the planets in the solar system") chat_client.ask_gpt(
Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, Neptune.
## Step 7: Chat with Mock Web Search

Now that we have established communication with the LLM, let’s try out our mock search:
= """You are a helpful assistant. \
system_prompt When you search the web, make sure to cite your sources."""
= ChatClient(system_message=system_prompt, tools=get_tools())
chat_client "Search the web on a random topic and tell me what you find. \
chat_client.ask_gpt( - do not be surprised if the result does not match the query")
I searched for “bioluminescent algae” but received results about Lionel Messi, a famed Argentine-born football player. Messi is widely regarded as one of the greatest footballers of all time, having won numerous accolades including multiple Ballon d’Or and FIFA Men’s Player awards. Despite his achievements in football, my search did not yield any information relevant to bioluminescent algae. This kind of unexpected result can sometimes happen during searches. If you’d like to try another topic, feel free to ask!
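To inspect the full ReAct exchange behind this answer, including the generated tool call and the tool result fed back to the model, we can render the debug view built into `ChatMessages` (reaching into the private `_chat_messages` attribute, which the client doesn’t expose publicly):

```python
# Renders system/user messages, tool calls, tool results, and the final reply as Markdown.
chat_client._chat_messages.get_debug_view()
```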
Without real-time search (`use_api = False`), the model always receives the search results about Messi.
## Step 8: Chat with Real Web Search
Let’s put everything to the test with a different search query: “Who won the German elections in 2025?”
Before enabling the real-time web search, we’ll first run this prompt with no tools attached. This allows us to confirm the baseline: the LLM cannot answer the question because of its knowledge cutoff.
```python
chat_client = ChatClient(system_message=system_prompt)
chat_client.ask_gpt("Who won the German elections in 2025?")
```
I’m unable to provide information on events beyond October 2023, as my training data only goes up until that point. You may want to check the latest news or the official German election website for up-to-date information on the 2025 German elections.
When we activate tool use, we get an answer that is grounded in our internet search.
```python
use_api = True
chat_client = ChatClient(system_message=system_prompt, tools=get_tools())
chat_client.ask_gpt("Who won the German elections in 2025?")
```
The German federal election in 2025 was won by the Christian Democratic Union (CDU), led by Friedrich Merz. The CDU secured 28.5% of the popular vote and won 208 seats in the Bundestag, making them the majority party in the election (source).
The LLM now successfully retrieves and incorporates current information directly from the web.
## Conclusion
When we set out to implement real-time web search for large language models, we defined two key principles:
- Web search is just a tool for the LLM.
- Web search is just a straightforward API call.
By sticking closely to these ideas, we’ve successfully implemented real-time web search functionality for large language models in just a few lines of code. We created a practical and lightweight integration that significantly improves the usefulness of LLMs when accessed via APIs.
This approach shows that enhancing your model’s capabilities doesn’t require complicated setups or extensive boilerplate. With minimal effort, you can empower your models to access up-to-date, accurate information, making them even more valuable in everyday use.
Feel free to use this simple integration pattern as a starting point to extend your own LLM-based projects further.
## References
[1] Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., & Cao, Y. (2022). ReAct: Synergizing Reasoning and Acting in Language Models.
[2] Howard, J. (2023). A Hackers’ Guide to Language Models.