# Implementing Web Search for Large Language Models from Scratch

One major limitation of large language models (LLMs) is that their knowledge is typically constrained by a fixed cutoff date, beyond which they’re unaware of recent events or developments. Fortunately, there’s an effective solution to overcome this limitation: integrating real-time web search capabilities. Although web search functionality is now a standard feature in many LLM web interfaces, it isn’t typically available by default when interacting with an LLM through an API. At first glance, implementing such a feature yourself might seem complicated, but approaching it from first principles simplifies things considerably.

In this post, I’ll walk you through how you can quickly and effectively integrate web search functionality into your own LLM projects, from conceptualizing the solution to a practical, step-by-step implementation.

## Conceptual Implementation

To break down the challenge, consider two key ideas:

- A web search is simply a tool available to the LLM.
- A web search operation is essentially just a straightforward API call.

Keeping these principles in mind, integrating web search fits neatly into the established “ReAct” [1] prompting framework, where an LLM iteratively reasons (thinks) and acts (uses tools) until it achieves its goal. The following diagram illustrates how this integration works in practice:

```mermaid
sequenceDiagram
    autonumber
    actor User
    participant WebSearch as Web Search
    participant Function as Tool
    participant ChatClient as Chat Client
    participant LLM
    User->>Function: Define the web search (API call)
    Function->>User: Retrieve JSON function definition
    User->>ChatClient: Create Chat Client including web search tool
    User->>ChatClient: Send prompt
    ChatClient->>LLM: Send prompt with web search tool
    loop Reasoning and Acting
        LLM->>LLM: Reasoning: Analyze prompt and tools
        LLM->>ChatClient: Acting: Generate tool call: web search
        ChatClient->>Function: Call web search function
        Function->>WebSearch: Call web search API
        WebSearch->>Function: Return web search result
        Function->>ChatClient: Return web search result
        ChatClient->>LLM: Acting: Pass on web search result
        LLM->>LLM: Reasoning: Incorporate result and continue reasoning
    end
    LLM->>ChatClient: Return final result
    ChatClient->>User: Output final result
```
Now that we have a clear conceptual understanding, let’s walk through the actual implementation step-by-step.
## Step 1: Choosing the Right Search Engine
When choosing a web search API for this project, simplicity was my top priority. My goal was straightforward: I wanted an API where I could simply input a search query and immediately receive a response that was easy for both me and the LLM to parse.
Initially, I considered the obvious choices: popular search engines like Google and Bing. However, I quickly realized that both come with a level of complexity that exceeded my needs.
Continuing my search, I came across Tavily (a service I have no affiliation with) and found it refreshingly straightforward. Tavily offers an API tailored specifically to LLM use cases, returning concise, well-structured results. It also provides a generous free tier of 1,000 requests per month, making it ideal for experimentation and quick prototyping.
Another potential option I considered was Brave Search, which also appears to offer an accessible API with minimal overhead. It may be worth exploring if you’re looking for alternatives with similar simplicity.
Ultimately, I chose Tavily because of its minimal setup, clean responses, and ease of integration—all of which aligned perfectly with the goals of this project.
## Step 2: Test Driving the Tavily API

Let’s get started by installing the Tavily Python library. (As usual, there’s a Jupyter notebook version of this blog post if you want to run the code yourself.)

```python
!pip install tavily-python
```
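You’ll also need a Tavily API key. The code below expects it in a `.env` file in the working directory; the file name and variable name match what the code reads later, and the value shown is of course a placeholder:

```
# .env
TAVILY_API_KEY=your-tavily-api-key
```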
Before integrating Tavily with our LLM, it’s good practice to test the service independently. While the free tier provides plenty of room for testing, I initially chose to use a cached mock response to minimize unnecessary API calls. This approach ensures our logic and parsing methods work correctly before we start consuming real API credits.
Here is an example of a mock response from Tavily for the query “Who is Leo Messi?”:
```python
mock_response = """{
    "answer": null,
    "follow_up_questions": null,
    "images": [],
    "query": "Who is Leo Messi?",
    "response_time": 1.75,
    "results": [
        {
            "content": "Lionel Messi is an Argentine-born football (soccer) player who has been named the world’s best men’s player of the year seven times (2009–12, 2015, 2019, and 2021). In 2022 he helped Argentina win the World Cup. Naturally left-footed, quick, and precise in control of the ball, Messi is known as a keen pass distributor and can readily thread his way through packed defenses. He led Argentina’s national team to win the 2021 Copa América and the 2022 World Cup, when he again won the Golden Ball award.",
            "raw_content": null,
            "score": 0.84027237,
            "title": "Lionel Messi | Biography, Trophies, Records, Ballon d'Or, Inter Miami ...",
            "url": "https://www.britannica.com/biography/Lionel-Messi"
        },
        {
            "content": "Widely regarded as one of the greatest players of all time, Messi set numerous records for individual accolades won throughout his professional footballing career such as eight Ballon d'Or awards and four the Best FIFA Men's Player awards. A prolific goalscorer and creative playmaker, Messi has scored over 850 senior career goals and has provided over 380 assists for club and country. [16] Born in Rosario, Argentina, Messi relocated to Spain to join Barcelona at age 13, and made his competitive debut at age 17 in October 2004. An Argentine international, Messi is the national team's all-time leading goalscorer and most-capped player. His style of play as a diminutive, left-footed dribbler drew career-long comparisons with compatriot Diego Maradona, who described Messi as his successor.",
            "raw_content": null,
            "score": 0.8091708,
            "title": "Lionel Messi - Wikipedia",
            "url": "https://en.wikipedia.org/wiki/Lionel_Messi"
        }
    ]
}"""
```
To work with these responses, let’s first define a small helper that pretty-prints JSON:

```python
from IPython.display import display, Code

def display_json(data):
    """
    Nicely displays JSON content: indented + syntax-highlighted.

    Args:
        data (str | dict | list): The JSON or string to display.
    """
    # Parse if input is a string
    if isinstance(data, str):
        try:
            data = ast.literal_eval(data)
        except Exception as e:
            print("Failed to parse string input as JSON-like structure.")
            print("Error:", e)
            return

    # Convert to pretty JSON string
    pretty_json = json.dumps(data, indent=4, sort_keys=True, ensure_ascii=False)

    # Display with syntax highlighting
    display(Code(pretty_json, language='json'))
```
We also need a parser that turns the mock response string back into a dictionary:

```python
import json
import ast

def parse_mock_response(response_str: str):
    try:
        return json.loads(response_str)
    except json.JSONDecodeError:
        try:
            return ast.literal_eval(response_str)
        except Exception as e:
            print("❌ Failed to parse mock response:", e)
            return {}
```
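A quick sanity check (not in the original notebook) confirms the parser returns a plain dictionary we can index into:

```python
parsed = parse_mock_response(mock_response)
print(parsed["query"])              # Who is Leo Messi?
print(parsed["results"][0]["url"])  # https://www.britannica.com/biography/Lionel-Messi
```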
Let’s run our first query.
```python
import os
from dotenv import load_dotenv

use_api = False

load_dotenv()
api_key = os.getenv("TAVILY_API_KEY")
if not api_key:
    raise ValueError("TAVILY_API_KEY not found in .env file.")

from tavily import TavilyClient

query = "Who is Leo Messi?"

if use_api:
    tavily_client = TavilyClient(api_key=api_key)
    response = tavily_client.search(
        query=query,
        search_depth="basic",
        include_answer=False,
        include_raw_content=False,
        max_results=2
    )
else:
    response = parse_mock_response(mock_response)

display_json(response)
```
```json
{
    "answer": null,
    "follow_up_questions": null,
    "images": [],
    "query": "Who is Leo Messi?",
    "response_time": 1.75,
    "results": [
        {
            "content": "Lionel Messi is an Argentine-born football (soccer) player who has been named the world’s best men’s player of the year seven times (2009–12, 2015, 2019, and 2021). In 2022 he helped Argentina win the World Cup. Naturally left-footed, quick, and precise in control of the ball, Messi is known as a keen pass distributor and can readily thread his way through packed defenses. He led Argentina’s national team to win the 2021 Copa América and the 2022 World Cup, when he again won the Golden Ball award.",
            "raw_content": null,
            "score": 0.84027237,
            "title": "Lionel Messi | Biography, Trophies, Records, Ballon d'Or, Inter Miami ...",
            "url": "https://www.britannica.com/biography/Lionel-Messi"
        },
        {
            "content": "Widely regarded as one of the greatest players of all time, Messi set numerous records for individual accolades won throughout his professional footballing career such as eight Ballon d'Or awards and four the Best FIFA Men's Player awards. A prolific goalscorer and creative playmaker, Messi has scored over 850 senior career goals and has provided over 380 assists for club and country. [16] Born in Rosario, Argentina, Messi relocated to Spain to join Barcelona at age 13, and made his competitive debut at age 17 in October 2004. An Argentine international, Messi is the national team's all-time leading goalscorer and most-capped player. His style of play as a diminutive, left-footed dribbler drew career-long comparisons with compatriot Diego Maradona, who described Messi as his successor.",
            "raw_content": null,
            "score": 0.8091708,
            "title": "Lionel Messi - Wikipedia",
            "url": "https://en.wikipedia.org/wiki/Lionel_Messi"
        }
    ]
}
```
## Step 3: Formatting Search Results for the LLM
Although large language models are capable of parsing raw JSON, this format isn’t ideal. It introduces unnecessary token overhead and lacks the readability and structure that both humans and LLMs benefit from. To make the results easier to consume, we’ll reformat the API response into clean, human-readable Markdown. This improves clarity, ensures more predictable behavior from the LLM, and also makes the output easier to debug and inspect during development.
```python
def format_tavily_results(response: dict, max_results: int = 5, snippet_length: int = 5000) -> str:
    """
    Formats the Tavily search API JSON response into a readable, LLM-friendly string.

    Args:
        response (dict): The Tavily API response.
        max_results (int): Maximum number of results to include.
        snippet_length (int): Max number of characters to show from each result's content.

    Returns:
        str: Formatted, readable string for LLM consumption.
    """
    results = response.get("results", [])
    if not results:
        return "No results found."

    formatted = "### Web Search Results:\n\n"
    for i, result in enumerate(results[:max_results], start=1):
        title = result.get("title", "Untitled")
        url = result.get("url", "")
        content = result.get("content", "") or ""
        snippet = content.strip().replace("\n", " ")[:snippet_length].rstrip()

        # Clean up unfinished sentences if needed
        if snippet and not snippet.endswith(('.', '!', '?')):
            snippet += "..."

        formatted += f"{i}. **[{title}]({url})**\n - {snippet}\n\n"

    return formatted.strip()

formatted = format_tavily_results(response, snippet_length=300)
print(formatted)
```
```markdown
### Web Search Results:

1. **[Lionel Messi | Biography, Trophies, Records, Ballon d'Or, Inter Miami ...](https://www.britannica.com/biography/Lionel-Messi)**
 - Lionel Messi is an Argentine-born football (soccer) player who has been named the world’s best men’s player of the year seven times (2009–12, 2015, 2019, and 2021). In 2022 he helped Argentina win the World Cup. Naturally left-footed, quick, and precise in control of the ball, Messi is known as a ke...

2. **[Lionel Messi - Wikipedia](https://en.wikipedia.org/wiki/Lionel_Messi)**
 - Widely regarded as one of the greatest players of all time, Messi set numerous records for individual accolades won throughout his professional footballing career such as eight Ballon d'Or awards and four the Best FIFA Men's Player awards. A prolific goalscorer and creative playmaker, Messi has scor...
```
The LLM can now easily consume the text summaries.
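As a rough check of the token-overhead claim above, you can compare token counts of the raw JSON and the Markdown version with `tiktoken` (an extra dependency not used elsewhere in this post; assumes a recent version that knows the gpt-4o encoding):

```python
import json
import tiktoken  # pip install tiktoken

enc = tiktoken.encoding_for_model("gpt-4o")

raw_json = json.dumps(response, indent=4)
markdown = format_tavily_results(response)

print(f"raw JSON: {len(enc.encode(raw_json))} tokens")
print(f"markdown: {len(enc.encode(markdown))} tokens")
```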
## Step 4: Defining a Web Search Tool for the LLM

Next, we’ll encapsulate our functionality into a single, reusable function that performs both the API call and the formatting. Additionally, we need to write proper documentation so that the LLM can understand how to use our tool:
```python
def search_web(query: str) -> str:
    """
    Searches the web using Tavily and returns a formatted result.

    Args:
        query (str): The search query string.

    Returns:
        str: Formatted search results for LLM input.
    """
    if use_api:
        tavily_client = TavilyClient(api_key=api_key)
        response = tavily_client.search(
            query=query,
            search_depth="basic",
            include_answer=False,
            include_raw_content=False,
            max_results=5
        )
    else:
        response = parse_mock_response(mock_response)

    return format_tavily_results(response)
```
Here is an example call, including the result the LLM would receive:
```python
result = search_web("Who is Leo Messi?")
print(result)
```
```markdown
### Web Search Results:

1. **[Lionel Messi | Biography, Trophies, Records, Ballon d'Or, Inter Miami ...](https://www.britannica.com/biography/Lionel-Messi)**
 - Lionel Messi is an Argentine-born football (soccer) player who has been named the world’s best men’s player of the year seven times (2009–12, 2015, 2019, and 2021). In 2022 he helped Argentina win the World Cup. Naturally left-footed, quick, and precise in control of the ball, Messi is known as a keen pass distributor and can readily thread his way through packed defenses. He led Argentina’s national team to win the 2021 Copa América and the 2022 World Cup, when he again won the Golden Ball award.

2. **[Lionel Messi - Wikipedia](https://en.wikipedia.org/wiki/Lionel_Messi)**
 - Widely regarded as one of the greatest players of all time, Messi set numerous records for individual accolades won throughout his professional footballing career such as eight Ballon d'Or awards and four the Best FIFA Men's Player awards. A prolific goalscorer and creative playmaker, Messi has scored over 850 senior career goals and has provided over 380 assists for club and country. [16] Born in Rosario, Argentina, Messi relocated to Spain to join Barcelona at age 13, and made his competitive debut at age 17 in October 2004. An Argentine international, Messi is the national team's all-time leading goalscorer and most-capped player. His style of play as a diminutive, left-footed dribbler drew career-long comparisons with compatriot Diego Maradona, who described Messi as his successor.
```
## Step 5: Exposing the Web Search Tool to the LLM

To expose our web search to an LLM, we need to provide the tool definition to the LLM. Jeremy Howard shared a practical and flexible approach for this in his Hackers’ Guide [2]. The core idea is to use Python’s introspection capabilities to extract the function signature and documentation and convert them into a schema the LLM can understand. The version used here builds on that idea, with minor updates to match recent changes in the OpenAI tools API.

The most important part of this process is clearly documenting the function’s interface (its name, parameters, and behavior) so that the LLM knows how and when to call it. This allows the model to use the tool automatically, without any additional prompting logic or manual wiring.
```python
from pydantic import create_model
import inspect, json
from inspect import Parameter

def get_schema(f):
    kw = {n: (o.annotation, ... if o.default == Parameter.empty else o.default)
          for n, o in inspect.signature(f).parameters.items()}
    # update: schema -> model_json_schema
    s = create_model(f'Input for `{f.__name__}`', **kw).model_json_schema()
    # update: added function level in tools json
    function_params = dict(name=f.__name__, description=f.__doc__, parameters=s)
    return dict(type="function", function=function_params)

funcs_ok = {'search_web'}

def get_tools():
    return [get_schema(search_web)]

get_tools()
```
```python
[{'type': 'function',
  'function': {'name': 'search_web',
   'description': '\n    Searches the web using Tavily and returns a formatted result.\n\n    Args:\n        query (str): The search query string.\n\n    Returns:\n        str: Formatted search results for LLM input.\n    ',
   'parameters': {'properties': {'query': {'title': 'Query', 'type': 'string'}},
    'required': ['query'],
    'title': 'Input for `search_web`',
    'type': 'object'}}}]
```
Now, any compatible LLM can automatically invoke `search_web` when needed.
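For reference, this is roughly what handing that schema directly to the OpenAI chat completions API looks like, before we wrap it in a chat client (a minimal sketch, not from the original post; it assumes `OPENAI_API_KEY` is set in your environment):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
completion = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Who is Leo Messi?"}],
    tools=get_tools(),
)

# If the model decides to search, the reply contains a tool call instead of text:
print(completion.choices[0].message.tool_calls)
```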
## Step 6: Reusing a Chat Client for LLM Communication

Let’s use a simple custom client (covered in more detail in a previous blog post) to try out our search tool.
The `ChatMessages` class below manages the conversation history:

```python
from IPython.display import display, Markdown

class ChatMessages:
    def __init__(self):
        """Initializes the Chat."""
        self._messages = []

    def _append_message(self, role, content):
        """Appends a message with specified role and content to messages list."""
        self._messages.append({"role": role, "content": content})

    def append_system_message(self, content):
        """Appends a system message with specified content to messages list."""
        self._append_message("system", content)

    def append_tool_message(self, content, tool_call_id):
        """Appends a tool message with specified content to messages list."""
        self._messages.append({"role": "tool", "content": content, "tool_call_id": tool_call_id})

    def append_user_message(self, content=None, base64_image=None):
        """Appends a user message with specified content to messages list."""
        if base64_image:
            image_content = [
                {"type": "text", "text": content},
                {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{base64_image}"}}
            ]
            self._messages.append({"role": "user", "content": image_content})
        else:
            self._append_message("user", content)

    def append_assistant_message(self, content=None, tool_calls=None):
        """Appends an assistant message with optional content and tool calls."""
        message = {"role": "assistant"}

        if content is not None:
            message["content"] = content

        if tool_calls is not None:
            message["tool_calls"] = tool_calls

        self._messages.append(message)

    def get_messages(self):
        """Returns a shallow copy of the messages list."""
        return self._messages[:]

    def get_last_assistant_message(self):
        """Returns the content of the last assistant message."""
        return self._messages[-1]['content']

    def get_debug_view(self):
        """Returns the debug view of the chat messages formatted as Markdown."""
        debug_view = []
        for message in self._messages:
            role = message.get('role')
            content = message.get('content', '')

            if role == 'system' or role == 'user':
                debug_view.append(f"**{role}**: {content}\n")
            elif role == 'assistant':
                if 'tool_calls' in message:
                    debug_view.append("**tool calls**\n")
                    for i, tool_call in enumerate(message['tool_calls'], start=1):
                        function_name = tool_call.function.name
                        arguments = tool_call.function.arguments
                        tool_call_id = tool_call.id
                        debug_view.append(f"{i}. tool: {function_name}: {arguments} (tool call id: {tool_call_id})\n")
                else:
                    debug_view.append(f"**assistant**: {content}\n")
            elif role == 'tool':
                tool_call_id = message.get('tool_call_id', '')
                debug_view.append(f"**tool result**: {content} (tool call id: {tool_call_id})\n")

        return Markdown('\n'.join(debug_view))
```
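As a quick smoke test (not part of the original post), the container produces exactly the message format the OpenAI API expects:

```python
msgs = ChatMessages()
msgs.append_system_message("Answer concisely.")
msgs.append_user_message("Hello!")
print(msgs.get_messages())
# [{'role': 'system', 'content': 'Answer concisely.'}, {'role': 'user', 'content': 'Hello!'}]
```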
The `ChatClient` itself sends the history to the OpenAI API and dispatches any tool calls the model requests:

```python
model_name = "gpt-4o"

from dotenv import load_dotenv
import os

load_dotenv(".env")

from openai import chat

class ChatClient:
    def __init__(self, system_message=None, tools=None):
        """Initializes the Chat with the system message."""
        self._chat_messages = ChatMessages()
        if system_message:
            self._chat_messages.append_system_message(system_message)
        self._tools = tools

    def call_tool(self, tool_call):
        """Returns the result of an LLM tool call."""
        fc = tool_call.function  # Updated
        if fc.name not in funcs_ok:
            return print(f'Not allowed: {fc.name}')
        f = globals()[fc.name]
        return f(**json.loads(fc.arguments))

    def call_tools(self, tool_calls):
        """Processes the tool calls of the LLM response and calls the LLM API again."""
        for tool_call in tool_calls:
            self._chat_messages.append_tool_message(
                content=str(self.call_tool(tool_call)),
                tool_call_id=tool_call.id)

        self.ask_gpt()

    def get_model_response(self):
        """Calls the LLM chat completion API."""
        return chat.completions.create(
            model=model_name,
            messages=self._chat_messages.get_messages(),
            tools=self._tools)

    def ask_gpt(self, prompt=None, base64_image=None):
        if base64_image:
            self._chat_messages.append_user_message(content=prompt, base64_image=base64_image)
        elif prompt:
            self._chat_messages.append_user_message(prompt)

        c = self.get_model_response()
        content = c.choices[0].message.content
        tool_calls = c.choices[0].message.tool_calls

        self._chat_messages.append_assistant_message(
            content=content,
            tool_calls=tool_calls)

        if tool_calls:
            self.call_tools(tool_calls)

        return Markdown(self._chat_messages.get_last_assistant_message())
```
Let’s quickly confirm that we can talk to the large language model:
= ChatClient("Answer in a very concise and accurate way")
chat_client "Name the planets in the solar system") chat_client.ask_gpt(
Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, Neptune.
## Step 7: Chat with Mock Web Search

Now that we have established communication with the LLM, let’s try out our mock search:
= """You are a helpful assistant. \
system_prompt When you search the web, make sure to cite your sources."""
= ChatClient(system_message=system_prompt, tools=get_tools())
chat_client "Search the web on a random topic and tell me what you find. \
chat_client.ask_gpt( - do not be surprised if the result does not match the query")
I searched for “bioluminescent algae” but received results about Lionel Messi, a famed Argentine-born football player. Messi is widely regarded as one of the greatest footballers of all time, having won numerous accolades including multiple Ballon d’Or and FIFA Men’s Player awards. Despite his achievements in football, my search did not yield any information relevant to bioluminescent algae. This kind of unexpected result can sometimes happen during searches. If you’d like to try another topic, feel free to ask!
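To inspect the full ReAct exchange behind this answer, including the generated tool call and the tool result fed back to the model, we can render the debug view built into `ChatMessages` (reaching into the private `_chat_messages` attribute, which the client doesn’t expose publicly):

```python
# Renders system/user messages, tool calls, tool results, and the final reply as Markdown.
chat_client._chat_messages.get_debug_view()
```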
Without real-time search (`use_api = False`), the model always receives the search results about Messi.
## Step 8: Chat with Real Web Search
Let’s put everything to the test with a different search query: “Who won the German elections in 2025?”
Before enabling the real-time web search, we’ll first run this prompt with no tools attached. This allows us to confirm the baseline: the LLM cannot answer the question because of its knowledge cutoff.
```python
chat_client = ChatClient(system_message=system_prompt)
chat_client.ask_gpt("Who won the German elections in 2025?")
```
I’m unable to provide information on events beyond October 2023, as my training data only goes up until that point. You may want to check the latest news or the official German election website for up-to-date information on the 2025 German elections.
When we activate tool use, we get an answer that is grounded in our internet search.
```python
use_api = True
chat_client = ChatClient(system_message=system_prompt, tools=get_tools())
chat_client.ask_gpt("Who won the German elections in 2025?")
```
The German federal election in 2025 was won by the Christian Democratic Union (CDU), led by Friedrich Merz. The CDU secured 28.5% of the popular vote and won 208 seats in the Bundestag, making them the majority party in the election (source).
The LLM now successfully retrieves and incorporates current information directly from the web.
## Conclusion
When we set out to implement real-time web search for large language models, we defined two key principles:
- Web search is just a tool for the LLM.
- Web search is just a straightforward API call.
By sticking closely to these ideas, we’ve successfully implemented real-time web search functionality for large language models in just a few lines of code. We created a practical and lightweight integration that significantly improves the usefulness of LLMs when accessed via APIs.
This approach shows that enhancing your model’s capabilities doesn’t require complicated setups or extensive boilerplate. With minimal effort, you can empower your models to access up-to-date, accurate information, making them even more valuable in everyday use.
Feel free to use this simple integration pattern as a starting point to extend your own LLM-based projects further.
## References
[1] Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., & Cao, Y. (2022). ReAct: Synergizing Reasoning and Acting in Language Models.
[2] Howard, J. (2023). A Hackers’ Guide to Language Models.