We introduce tulip agent, an architecture for autonomous LLM-based agents with Create, Read, Update, and Delete access to a tool library containing a potentially large number of tools. In contrast to state-of-the-art implementations, tulip agent neither encodes the descriptions of all available tools in the system prompt, which counts against the model's context window, nor embeds the entire prompt to retrieve suitable tools. Instead, the tulip agent can recursively search for suitable tools in its extensible tool library, implemented here as a vector store by way of example. The tulip agent architecture significantly reduces inference costs, allows using even large tool libraries, and enables the agent to adapt and extend its set of tools. We evaluate the architecture with several ablation studies in a mathematics context and demonstrate its generalizability with an application to robotics. A reference implementation and the benchmark are available at github.com/HRI-EU/tulip_agent.
The tulip agent architecture includes several key components: the language model, the tool library, the tools themselves, a search module, the function execution module, and a tool introspection module.
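To make the interplay of these components concrete, the following is a minimal sketch, not the reference implementation. It assumes a toy bag-of-words similarity in place of a real embedding model and vector store, and the names `ToolLibrary`, `embed`, and `cosine` as well as the three example tools are hypothetical. The language model itself is left out; the query string stands in for a request the model would issue.

```python
import inspect
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding'; an assumed stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToolLibrary:
    """Hypothetical in-memory tool library with CRUD access and top-k search."""

    def __init__(self):
        self._tools = {}  # name -> (function, description, embedding)

    def create(self, func):
        # Tool introspection: derive the description from signature and docstring.
        doc = inspect.getdoc(func) or ""
        description = f"{func.__name__}{inspect.signature(func)}: {doc}"
        self._tools[func.__name__] = (func, description, embed(description))

    def read(self, name):
        return self._tools[name][1]

    def update(self, func):
        # Re-registering a function under the same name replaces the old entry.
        self.create(func)

    def delete(self, name):
        del self._tools[name]

    def search(self, query, top_k=5):
        # Search module: rank tools by similarity to the query.
        q = embed(query)
        ranked = sorted(
            self._tools, key=lambda n: cosine(q, self._tools[n][2]), reverse=True
        )
        return ranked[:top_k]

    def execute(self, name, **kwargs):
        # Function execution module: call the selected tool with keyword arguments.
        return self._tools[name][0](**kwargs)


def add(a: float, b: float) -> float:
    """Add two numbers."""
    return a + b


def multiply(a: float, b: float) -> float:
    """Multiply two numbers."""
    return a * b


def square_root(x: float) -> float:
    """Compute the square root of a number."""
    return math.sqrt(x)


library = ToolLibrary()
for tool in (add, multiply, square_root):
    library.create(tool)

best = library.search("compute the square root", top_k=1)
result = library.execute(best[0], x=16.0)
print(best, result)
```

In this sketch only the retrieved tool descriptions would need to be passed to the language model, rather than the descriptions of every tool in the library.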
An evaluation on math tasks with several ablations yielded the following insights: 1) tools are essential for solving complex tasks, 2) using a tool library significantly reduces costs, 3) task decomposition improves tool use, 4) language model performance influences the suitability of agent designs, 5) embedding model performance (at the high level of current OpenAI models) has little influence on tool retrieval, 6) better planning allows a narrower search for tools, and 7) the tulip agent architecture is suited for continually creating tools and building a tool library on the fly.
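Insights 2, 3, and 6 can be illustrated with a small sketch under strong simplifying assumptions: task decomposition is faked by splitting on the word "then" instead of calling a language model, tool search uses word overlap instead of embeddings, and the word count of tool descriptions serves as a rough proxy for prompt size. The tool set and all function names are hypothetical.

```python
import re

# Hypothetical tool descriptions; a real library would hold many more.
TOOLS = {
    "add": "add two numbers",
    "subtract": "subtract one number from another",
    "multiply": "multiply two numbers",
    "divide": "divide one number by another",
    "square_root": "compute the square root of a number",
    "sine": "compute the sine of an angle",
}


def words(text):
    """Lowercased set of alphabetic words in a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def decompose(task):
    """Assumed stand-in for LLM-based task decomposition: split on 'then'."""
    return [part.strip() for part in task.split(" then ")]


def search(query, top_k):
    """Rank tools by word overlap with the query; a stand-in for embedding search."""
    q = words(query)
    ranked = sorted(
        TOOLS, key=lambda name: len(q & words(TOOLS[name])), reverse=True
    )
    return ranked[:top_k]


task = "multiply 3 and 4 then compute the square root of the result"
subtasks = decompose(task)

# Search per subtask with a narrow top_k and collect the distinct tools.
selected = []
for subtask in subtasks:
    for name in search(subtask, top_k=1):
        if name not in selected:
            selected.append(name)

# Rough proxy for prompt size: words of tool descriptions passed to the model.
all_words = sum(len(d.split()) for d in TOOLS.values())
selected_words = sum(len(TOOLS[n].split()) for n in selected)
print(selected, selected_words, all_words)
```

Because each subtask is more specific than the full task, a small `top_k` per subtask can suffice in this toy setting, and the descriptions handed to the model cover only the selected tools rather than the whole library.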
Excerpt of the results for several agent variants; run with gpt-3.5-turbo-0125, text-embedding-3-large, top_k = 5, and averaged across 5 runs.
In addition to the math evaluation, we applied the tulip agent architecture to control a robot in several simulated scenarios, with promising results:
The CotTulipAgent controlling a supportive robot in simulation.