Skip to main content

· 6 min read
Assaf Elovic

OpenAI has done it again with a groundbreaking DevDay showcasing some of the latest improvements to the OpenAI suite of tools, products and services. One major release was the new Assistants API that makes it easier for developers to build their own assistive AI apps that have goals and can call models and tools.

The new Assistants API currently supports three types of tools: Code Interpreter, Retrieval, and Function calling. Although you might expect the Retrieval tool to support online information retrieval (such as search APIs or as ChatGPT plugins), it only supports raw data for now such as text or CSV files.

This blog will demonstrate how to leverage the latest Assistants API with online information using the function calling tool.

To skip the tutorial below, feel free to check out the full Github Gist here.

At a high level, a typical integration of the Assistants API has the following steps:

  • Create an Assistant in the API by defining its custom instructions and picking a model. If helpful, enable tools like Code Interpreter, Retrieval, and Function calling.
  • Create a Thread when a user starts a conversation.
  • Add Messages to the Thread as the user ask questions.
  • Run the Assistant on the Thread to trigger responses. This automatically calls the relevant tools.

As you can see below, an Assistant object includes Threads for storing and handling conversation sessions between the assistant and users, and Run for invocation of an Assistant on a Thread.

OpenAI Assistant Object

Let’s go ahead and implement these steps one by one! For the example, we will build a finance GPT that can provide insights about financial questions. We will use the OpenAI Python SDK v1.2 and Tavily Search API.

First things first, let’s define the assistant’s instructions:

assistant_prompt_instruction = """You are a finance expert. 
Your goal is to provide answers based on information from the internet.
You must use the provided Tavily search API function to find relevant online information.
You should never use your own knowledge to answer questions.
Please include relevant url sources in the end of your answers.
"""

Next, let’s finalize step 1 and create an assistant using the latest GPT-4 Turbo model (128K context), and the call function using the Tavily web search API:

# Create an assistant
assistant = client.beta.assistants.create(
instructions=assistant_prompt_instruction,
model="gpt-4-1106-preview",
tools=[{
"type": "function",
"function": {
"name": "tavily_search",
"description": "Get information on recent events from the web.",
"parameters": {
"type": "object",
"properties": {
"query": {"type": "string", "description": "The search query to use. For example: 'Latest news on Nvidia stock performance'"},
},
"required": ["query"]
}
}
}]
)

Step 2+3 are quite straight forward, we’ll initiate a new thread and update it with a user message:

thread = client.beta.threads.create()
user_input = input("You: ")
message = client.beta.threads.messages.create(
thread_id=thread.id,
role="user",
content=user_input,
)

Finally, we’ll run the assistant on the thread to trigger the function call and get the response:

run = client.beta.threads.runs.create(
thread_id=thread.id,
assistant_id=assistant_id,
)

So far so good! But this is where it gets a bit messy. Unlike with the regular GPT APIs, the Assistants API doesn’t return a synchronous response, but returns a status. This allows for asynchronous operations across assistants, but requires more overhead for fetching statuses and dealing with each manually.

Status Diagram

To manage this status lifecycle, let’s build a function that can be reused and handles waiting for various statuses (such as ‘requires_action’):

# Function to wait for a run to complete
def wait_for_run_completion(thread_id, run_id):
while True:
time.sleep(1)
run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run_id)
print(f"Current run status: {run.status}")
if run.status in ['completed', 'failed', 'requires_action']:
return run

This function will sleep as long as the run has not been finalized such as in cases where it’s completed or requires an action from a function call.

We’re almost there! Lastly, let’s take care of when the assistant wants to call the web search API:

# Function to handle tool output submission
def submit_tool_outputs(thread_id, run_id, tools_to_call):
tool_output_array = []
for tool in tools_to_call:
output = None
tool_call_id = tool.id
function_name = tool.function.name
function_args = tool.function.arguments

if function_name == "tavily_search":
output = tavily_search(query=json.loads(function_args)["query"])

if output:
tool_output_array.append({"tool_call_id": tool_call_id, "output": output})

return client.beta.threads.runs.submit_tool_outputs(
thread_id=thread_id,
run_id=run_id,
tool_outputs=tool_output_array
)

As seen above, if the assistant has reasoned that a function call should trigger, we extract the given required function params and pass back to the runnable thread. We catch this status and call our functions as seen below:

if run.status == 'requires_action':
run = submit_tool_outputs(thread.id, run.id, run.required_action.submit_tool_outputs.tool_calls)
run = wait_for_run_completion(thread.id, run.id)

That’s it! We now have a working OpenAI Assistant that can be used to answer financial questions using real time online information. Below is the full runnable code:

import os
import json
import time
from openai import OpenAI
from tavily import TavilyClient

# Initialize clients with API keys
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
tavily_client = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])

assistant_prompt_instruction = """You are a finance expert.
Your goal is to provide answers based on information from the internet.
You must use the provided Tavily search API function to find relevant online information.
You should never use your own knowledge to answer questions.
Please include relevant url sources in the end of your answers.
"""

# Function to perform a Tavily search
def tavily_search(query):
search_result = tavily_client.get_search_context(query, search_depth="advanced", max_tokens=8000)
return search_result

# Function to wait for a run to complete
def wait_for_run_completion(thread_id, run_id):
while True:
time.sleep(1)
run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run_id)
print(f"Current run status: {run.status}")
if run.status in ['completed', 'failed', 'requires_action']:
return run

# Function to handle tool output submission
def submit_tool_outputs(thread_id, run_id, tools_to_call):
tool_output_array = []
for tool in tools_to_call:
output = None
tool_call_id = tool.id
function_name = tool.function.name
function_args = tool.function.arguments

if function_name == "tavily_search":
output = tavily_search(query=json.loads(function_args)["query"])

if output:
tool_output_array.append({"tool_call_id": tool_call_id, "output": output})

return client.beta.threads.runs.submit_tool_outputs(
thread_id=thread_id,
run_id=run_id,
tool_outputs=tool_output_array
)

# Function to print messages from a thread
def print_messages_from_thread(thread_id):
messages = client.beta.threads.messages.list(thread_id=thread_id)
for msg in messages:
print(f"{msg.role}: {msg.content[0].text.value}")

# Create an assistant
assistant = client.beta.assistants.create(
instructions=assistant_prompt_instruction,
model="gpt-4-1106-preview",
tools=[{
"type": "function",
"function": {
"name": "tavily_search",
"description": "Get information on recent events from the web.",
"parameters": {
"type": "object",
"properties": {
"query": {"type": "string", "description": "The search query to use. For example: 'Latest news on Nvidia stock performance'"},
},
"required": ["query"]
}
}
}]
)
assistant_id = assistant.id
print(f"Assistant ID: {assistant_id}")

# Create a thread
thread = client.beta.threads.create()
print(f"Thread: {thread}")

# Ongoing conversation loop
while True:
user_input = input("You: ")
if user_input.lower() == 'exit':
break

# Create a message
message = client.beta.threads.messages.create(
thread_id=thread.id,
role="user",
content=user_input,
)

# Create a run
run = client.beta.threads.runs.create(
thread_id=thread.id,
assistant_id=assistant_id,
)
print(f"Run ID: {run.id}")

# Wait for run to complete
run = wait_for_run_completion(thread.id, run.id)

if run.status == 'failed':
print(run.error)
continue
elif run.status == 'requires_action':
run = submit_tool_outputs(thread.id, run.id, run.required_action.submit_tool_outputs.tool_calls)
run = wait_for_run_completion(thread.id, run.id)

# Print messages from the thread
print_messages_from_thread(thread.id)

The assistant can be further customized and improved using additional retrieval information, OpenAI’s coding interpreter and more. Also, you can go ahead and add more function tools to make the assistant even smarter.

Feel free to drop a comment below if you have any further questions!

· 7 min read
Assaf Elovic

After AutoGPT was published, we immediately took it for a spin. The first use case that came to mind was autonomous online research. Forming objective conclusions for manual research tasks can take time, sometimes weeks, to find the right resources and information. Seeing how well AutoGPT created tasks and executed them got me thinking about the great potential of using AI to conduct comprehensive research and what it meant for the future of online research.

But the problem with AutoGPT was that it usually ran into never-ending loops, required human interference for almost every step, constantly lost track of its progress, and almost never actually completed the task.

Nonetheless, the information and context gathered during the research task were lost (such as keeping track of sources), and sometimes hallucinated.

The passion for leveraging AI for online research and the limitations I found put me on a mission to try and solve it while sharing my work with the world. This is when I created GPT Researcher — an open source autonomous agent for online comprehensive research.

In this article, we will share the steps that guided me toward the proposed solution.

Moving from infinite loops to deterministic results

The first step in solving these issues was to seek a more deterministic solution that could ultimately guarantee completing any research task within a fixed time frame, without human interference.

This is when we stumbled upon the recent paper Plan and Solve. The paper aims to provide a better solution for the challenges stated above. The idea is quite simple and consists of two components: first, devising a plan to divide the entire task into smaller subtasks and then carrying out the subtasks according to the plan.

Planner-Excutor-Model

As it relates to research, first create an outline of questions to research related to the task, and then deterministically execute an agent for every outline item. This approach eliminates the uncertainty in task completion by breaking the agent steps into a deterministic finite set of tasks. Once all tasks are completed, the agent concludes the research.

Following this strategy has improved the reliability of completing research tasks to 100%. Now the challenge is, how to improve quality and speed?

Aiming for objective and unbiased results

The biggest challenge with LLMs is the lack of factuality and unbiased responses caused by hallucinations and out-of-date training sets (GPT is currently trained on datasets from 2021). But the irony is that for research tasks, it is crucial to optimize for these exact two criteria: factuality and bias.

To tackle this challenges, we assumed the following:

  • Law of large numbers — More content will lead to less biased results. Especially if gathered properly.
  • Leveraging LLMs for the summarization of factual information can significantly improve the overall better factuality of results.

After experimenting with LLMs for quite some time, we can say that the areas where foundation models excel are in the summarization and rewriting of given content. So, in theory, if LLMs only review given content and summarize and rewrite it, potentially it would reduce hallucinations significantly.

In addition, assuming the given content is unbiased, or at least holds opinions and information from all sides of a topic, the rewritten result would also be unbiased. So how can content be unbiased? The law of large numbers. In other words, if enough sites that hold relevant information are scraped, the possibility of biased information reduces greatly. So the idea would be to scrape just enough sites together to form an objective opinion on any topic.

Great! Sounds like, for now, we have an idea for how to create both deterministic, factual, and unbiased results. But what about the speed problem?

Speeding up the research process

Another issue with AutoGPT is that it works synchronously. The main idea of it is to create a list of tasks and then execute them one by one. So if, let’s say, a research task requires visiting 20 sites, and each site takes around one minute to scrape and summarize, the overall research task would take a minimum of +20 minutes. That’s assuming it ever stops. But what if we could parallelize agent work?

By levering Python libraries such as asyncio, the agent tasks have been optimized to work in parallel, thus significantly reducing the time to research.

# Create a list to hold the coroutine agent tasks
tasks = [async_browse(url, query, self.websocket) for url in await new_search_urls]

# Gather the results as they become available
responses = await asyncio.gather(*tasks, return_exceptions=True)

In the example above, we trigger scraping for all URLs in parallel, and only once all is done, continue with the task. Based on many tests, an average research task takes around three minutes (!!). That’s 85% faster than AutoGPT.

Finalizing the research report

Finally, after aggregating as much information as possible about a given research task, the challenge is to write a comprehensive report about it.

After experimenting with several OpenAI models and even open source, I’ve concluded that the best results are currently achieved with GPT-4. The task is straightforward — provide GPT-4 as context with all the aggregated information, and ask it to write a detailed report about it given the original research task.

The prompt is as follows:

"{research_summary}" Using the above information, answer the following question or topic: "{question}" in a detailed report — The report should focus on the answer to the question, should be well structured, informative, in depth, with facts and numbers if available, a minimum of 1,200 words and with markdown syntax and apa format. Write all source urls at the end of the report in apa format. You should write your report only based on the given information and nothing else.

The results are quite impressive, with some minor hallucinations in very few samples, but it’s fair to assume that as GPT improves over time, results will only get better.

The final architecture

Now that we’ve reviewed the necessary steps of GPT Researcher, let’s break down the final architecture, as shown below:

More specifically:

  • Generate an outline of research questions that form an objective opinion on any given task.
  • For each research question, trigger a crawler agent that scrapes online resources for information relevant to the given task.
  • For each scraped resource, keep track, filter, and summarize only if it includes relevant information.
  • Finally, aggregate all summarized sources and generate a final research report.

Going forward

The future of online research automation is heading toward a major disruption. As AI continues to improve, it is only a matter of time before AI agents can perform comprehensive research tasks for any of our day-to-day needs. AI research can disrupt areas of finance, legal, academia, health, and retail, reducing our time for each research by 95% while optimizing for factual and unbiased reports within an influx and overload of ever-growing online information.

Imagine if an AI can eventually understand and analyze any form of online content — videos, images, graphs, tables, reviews, text, audio. And imagine if it could support and analyze hundreds of thousands of words of aggregated information within a single prompt. Even imagine that AI can eventually improve in reasoning and analysis, making it much more suitable for reaching new and innovative research conclusions. And that it can do all that in minutes, if not seconds.

It’s all a matter of time and what GPT Researcher is all about.