AI Agents (Part 2): Building an Agent from Scratch

茶饮消息
03-27 07:06, from Beijing

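Editor's note: 2025 is year one of the AI agent. This series introduces the concepts, types, principles, architecture, and development of AI agents, as a primer for readers who want to explore further. This is the second article in the series; it is a translation.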

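In the previous article, we gave a full overview of AI agents: their characteristics, components, history, challenges, and future directions.

This article shows how to build an AI agent from scratch in Python. The agent makes decisions based on user input, chooses a tool, and carries out the task. Let's get started!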

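1. What Is an Agent?

An agent is an autonomous entity that perceives its environment, makes decisions, and takes actions to achieve a specific goal.

Agents vary in complexity, from simple reflex agents that only respond to stimuli to advanced agents that learn and adapt. Common types include:

Reflex agents: respond directly to changes in the environment and keep no internal memory.

Model-based agents: make decisions using an internal model of the world.

Goal-based agents: plan their actions around a goal.

Utility-based agents: score candidate actions with a utility function and pick the one with the best outcome.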
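To make the contrast concrete, here is a minimal sketch of a reflex agent and a utility-based agent. The thermostat scenario, the thresholds, and the cost values are all assumptions invented for illustration; they do not come from the tutorial's repository.

```python
from typing import Dict

# Hypothetical thermostat scenario, used only to contrast two agent types.

def reflex_agent(temperature_c: float) -> str:
    """Reflex agent: maps the current percept straight to an action."""
    if temperature_c < 19.0:
        return "heat"
    if temperature_c > 24.0:
        return "cool"
    return "idle"


def utility_agent(temperature_c: float, target_c: float = 21.0) -> str:
    """Utility-based agent: scores each action's predicted outcome and picks the best."""
    # Assumed effect of each action on the temperature over one step.
    effects: Dict[str, float] = {"heat": 1.0, "cool": -1.0, "idle": 0.0}
    # Assumed energy cost per action, so "idle" wins when already near the target.
    costs: Dict[str, float] = {"heat": 0.3, "cool": 0.3, "idle": 0.0}

    def utility(action: str) -> float:
        predicted = temperature_c + effects[action]
        return -abs(predicted - target_c) - costs[action]

    return max(effects, key=utility)


print(reflex_agent(23.0))   # idle: 23 is inside the fixed 19-24 band
print(utility_agent(23.0))  # cool: moving toward 21 outweighs the energy cost
```

The reflex agent only checks fixed thresholds, while the utility-based agent compares the predicted result of every action, which is why the two can disagree on the same input.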

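Chatbots, recommendation systems, and self-driving cars all rely on different types of agents to carry out their tasks efficiently.

The core components of an agent are:

Model: the agent's "brain", which processes input and generates responses.

Tools: predefined functions that the agent executes in response to user requests.

Toolbox: the collection of tools available to the agent.

System prompt: the set of instructions that tells the agent how to handle user input and choose a tool.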
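The sketch below shows how these four components fit together in a single request. It is a toy under stated assumptions: stub_model stands in for a real LLM and always picks the same tool, and the names shout, TOOLBOX, and run_once are hypothetical, not taken from the tutorial's code.

```python
import json
from typing import Callable, Dict

# Tool: a plain function the agent may call.
def shout(text: str) -> str:
    """Return the text in upper case."""
    return text.upper()

# Toolbox: the collection of available tools, keyed by name.
TOOLBOX: Dict[str, Callable[[str], str]] = {"shout": shout}

# System prompt: tells the model how to answer and which tools exist.
SYSTEM_PROMPT = (
    'Reply with JSON: {"tool_choice": <tool name>, "tool_input": <text>}. '
    "Available tools: " + ", ".join(TOOLBOX)
)

# Model: a stand-in for an LLM. It ignores the system prompt and always picks "shout".
def stub_model(system_prompt: str, user_input: str) -> str:
    return json.dumps({"tool_choice": "shout", "tool_input": user_input})

def run_once(user_input: str) -> str:
    """One pass: ask the model, look the tool up in the toolbox, execute it."""
    decision = json.loads(stub_model(SYSTEM_PROMPT, user_input))
    tool = TOOLBOX.get(decision["tool_choice"])
    # Fall back to the model's own text when no tool matches.
    return tool(decision["tool_input"]) if tool else decision["tool_input"]

print(run_once("hello agent"))  # HELLO AGENT
```

The rest of the article builds the same pipeline with a real local model in place of the stub.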

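2. Implementation

Steps for developing an agent

2.1 Prerequisites

The complete code for this tutorial is available in the "Build an Agent from Scratch" GitHub repository.

Code: Build an Agent from Scratch.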

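Before running the code, make sure the following prerequisites are met:

1. Set up a Python environment

Install Python from python.org (3.8 or later recommended).

Verify the installation:

python --version

Create a virtual environment (recommended):

python -m venv ai_agents_env
source ai_agents_env/bin/activate # Windows: ai_agents_env\Scripts\activate

Install the dependencies:

pip install -r requirements.txt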

2. Set up Ollama locally

Download and install Ollama from ollama.ai.

Verify the installation:

ollama --version

Pull a model (if needed):

ollama pull mistral # any other model name can be used instead

2.2 Implementation Steps

Step 1: Set up the environment

Install the required libraries:

pip install requests termcolor python-dotenv

Step 2: Define the model class

Create an OllamaModel class that connects to the local Ollama API:

import json
import os

import requests
from dotenv import load_dotenv
from termcolor import colored

load_dotenv()

class OllamaModel:
    def __init__(self, model, system_prompt, temperature=0, stop=None):
        self.model_endpoint = "http://localhost:11434/api/generate"
        self.temperature = temperature
        self.model = model
        self.system_prompt = system_prompt
        self.headers = {"Content-Type": "application/json"}
        self.stop = stop

    def generate_text(self, prompt):
        # Ollama reads sampling parameters from "options"; "stop" must be a list
        options = {"temperature": self.temperature}
        if self.stop:
            options["stop"] = [self.stop] if isinstance(self.stop, str) else list(self.stop)
        payload = {
            "model": self.model,
            "format": "json",
            "prompt": prompt,
            "system": self.system_prompt,
            "stream": False,
            "options": options
        }
        try:
            response = requests.post(self.model_endpoint, headers=self.headers, data=json.dumps(payload))
            response.raise_for_status()
            # The generated text sits in the "response" field as a JSON string
            return json.loads(response.json()["response"])
        except (requests.RequestException, KeyError, json.JSONDecodeError) as e:
            return {"error": str(e)}
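With "stream": False, Ollama's /api/generate endpoint replies with a single JSON envelope, and the model's text sits in its "response" field; because the request sets "format": "json", that text is itself a JSON string that still has to be decoded before the agent can read "tool_choice". The sketch below shows one way to unwrap it defensively without a running server. The helper name parse_generate_response and the sample envelope are assumptions written by hand for illustration.

```python
import json
from typing import Any, Dict

def parse_generate_response(envelope: Dict[str, Any]) -> Dict[str, Any]:
    """Extract the model's JSON answer from an /api/generate style envelope.

    Returns {"error": "..."} instead of raising, like generate_text does.
    """
    if "error" in envelope:
        return {"error": str(envelope["error"])}
    text = envelope.get("response")
    if not isinstance(text, str):
        return {"error": "envelope has no 'response' text"}
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError as exc:
        return {"error": f"model did not return valid JSON: {exc}"}
    if not isinstance(parsed, dict):
        return {"error": "model returned JSON that is not an object"}
    return parsed

# Hand-written sample envelope (not captured from a real server).
sample = {
    "model": "mistral",
    "response": '{"tool_choice": "reverse_string", "tool_input": "Python"}',
    "done": True,
}
print(parse_generate_response(sample))
print(parse_generate_response({"response": "not json"}))
```

Returning an error dictionary rather than raising keeps the calling code simple: it only has to check for an "error" key before looking for "tool_choice".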

Step 3: Create the agent's tools

Define a calculator tool and a string-reversal tool:

import json
import operator

def basic_calculator(input_str):
    """
    Perform a numeric operation on two numbers based on the input string or dictionary.

    Parameters:
    input_str (str or dict): Either a JSON string representing a dictionary with keys 'num1', 'num2', and 'operation',
    or a dictionary directly. Example: '{"num1": 5, "num2": 3, "operation": "add"}'
    or {"num1": 67869, "num2": 9030393, "operation": "divide"}

    Returns:
    str: The formatted result of the operation, or an error message if the
    input is invalid, the operation is unsupported, or the calculation fails
    (e.g., division by zero). Errors are reported in the string, not raised.
    """
    try:
        # Handle both dictionary and string inputs
        if isinstance(input_str, dict):
            input_dict = input_str
        else:
            # Clean and parse the input string
            input_str_clean = input_str.replace("'", "\"")
            input_str_clean = input_str_clean.strip().strip("\"")
            input_dict = json.loads(input_str_clean)
        # Validate required fields
        if not all(key in input_dict for key in ['num1', 'num2', 'operation']):
            return "Error: Input must contain 'num1', 'num2', and 'operation'"
        num1 = float(input_dict['num1'])  # Convert to float to handle decimal numbers
        num2 = float(input_dict['num2'])
        operation = input_dict['operation'].lower()  # Make case-insensitive
    except (json.JSONDecodeError, KeyError, TypeError, AttributeError):
        # Also covers non-string operations and inputs that are neither str nor dict
        return "Invalid input format. Please provide valid numbers and operation."
    except ValueError:
        return "Error: Please provide valid numerical values."

    # Define the supported operations
    operations = {
        'add': operator.add,
        'plus': operator.add,  # Alternative word for add
        'subtract': operator.sub,
        'minus': operator.sub,  # Alternative word for subtract
        'multiply': operator.mul,
        'times': operator.mul,  # Alternative word for multiply
        'divide': operator.truediv,
        'floor_divide': operator.floordiv,
        'modulus': operator.mod,
        'power': operator.pow,
        'lt': operator.lt,
        'le': operator.le,
        'eq': operator.eq,
        'ne': operator.ne,
        'ge': operator.ge,
        'gt': operator.gt
    }

    # Check if the operation is supported
    if operation not in operations:
        return f"Unsupported operation: '{operation}'. Supported operations are: {', '.join(operations.keys())}"

    try:
        # Special handling for division by zero
        if (operation in ['divide', 'floor_divide', 'modulus']) and num2 == 0:
            return "Error: Division by zero is not allowed"
        # Perform the operation
        result = operations[operation](num1, num2)
        # Format result based on type
        if isinstance(result, bool):
            result_str = "True" if result else "False"
        elif isinstance(result, float):
            # Handle floating point precision
            result_str = f"{result:.6f}".rstrip('0').rstrip('.')
        else:
            result_str = str(result)
        return f"The answer is: {result_str}"
    except Exception as e:
        return f"Error during calculation: {str(e)}"

def reverse_string(input_string):
    """
    Reverse the given string.

    Parameters:
    input_string (str): The string to be reversed.

    Returns:
    str: The reversed string.
    """
    # Check if input is a string
    if not isinstance(input_string, str):
        return "Error: Input must be a string"
    # Reverse the string using slicing
    reversed_string = input_string[::-1]
    # Format the output
    result = f"The reversed string is: {reversed_string}"
    return result
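The calculator above rests on two small techniques: a dispatch table that maps operation names to functions from the operator module, and a formatting step that trims trailing zeros. The reduced sketch below isolates both. OPERATIONS, format_number, and apply_operation are illustrative names, not part of the tutorial's code.

```python
import operator
from typing import Callable, Dict

# Dispatch table: operation name -> function of two numbers.
OPERATIONS: Dict[str, Callable[[float, float], float]] = {
    "add": operator.add,
    "subtract": operator.sub,
    "multiply": operator.mul,
    "divide": operator.truediv,
}

def format_number(value: float) -> str:
    """Render with up to 6 decimals, then drop trailing zeros and a dangling dot."""
    return f"{value:.6f}".rstrip("0").rstrip(".")

def apply_operation(name: str, a: float, b: float) -> str:
    """Look the operation up by name and return the formatted result."""
    key = name.lower()
    func = OPERATIONS.get(key)
    if func is None:
        return f"Unsupported operation: '{name}'"
    if key == "divide" and b == 0:
        return "Error: Division by zero is not allowed"
    return format_number(func(a, b))

print(apply_operation("add", 15.0, 7.0))     # 22
print(apply_operation("divide", 1.0, 3.0))   # 0.333333
print(apply_operation("divide", 1.0, 0.0))   # Error: Division by zero is not allowed
```

One consequence of the six-decimal format is that values smaller than about 0.0000005 in magnitude are displayed as 0, which is acceptable for a demo tool but worth knowing before reusing the pattern.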

Step 4: Build the toolbox

The ToolBox class stores every tool the agent can use and provides a description of each one:

class ToolBox:
    def __init__(self):
        self.tools_dict = {}

    def store(self, functions_list):
        """
        Stores the literal name and docstring of each function in the list.

        Parameters:
        functions_list (list): List of function objects to store.

        Returns:
        dict: Dictionary with function names as keys and their docstrings as values.
        """
        for func in functions_list:
            self.tools_dict[func.__name__] = func.__doc__
        return self.tools_dict

    def tools(self):
        """
        Returns the dictionary created in store as a text string.

        Returns:
        str: Dictionary of stored functions and their docstrings as a text string.
        """
        tools_str = ""
        for name, doc in self.tools_dict.items():
            # One tool per line, so consecutive descriptions do not run together
            tools_str += f"{name}: \"{doc}\"\n"
        return tools_str.strip()
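The toolbox works because every Python function carries its own name and docstring, so the docstrings double as the tool descriptions shown to the model. As a hedged variation on the class above, the sketch below uses inspect.getdoc, which strips the indentation that raw __doc__ strings keep, and supplies a fallback for functions without a docstring. describe_tools, render_descriptions, and the undocumented tool are hypothetical names for illustration.

```python
import inspect
from typing import Callable, Dict, List

def describe_tools(functions: List[Callable]) -> Dict[str, str]:
    """Map each function's name to its cleaned docstring."""
    return {
        func.__name__: inspect.getdoc(func) or "No description provided."
        for func in functions
    }

def render_descriptions(descriptions: Dict[str, str]) -> str:
    """Render one 'name: "doc"' entry per line."""
    return "\n".join(f'{name}: "{doc}"' for name, doc in descriptions.items())

def reverse_string(text: str) -> str:
    """
    Reverse the given string.
    """
    return text[::-1]

def undocumented(text: str) -> str:
    return text

print(render_descriptions(describe_tools([reverse_string, undocumented])))
```

Since the model only sees this text, a vague or missing docstring translates directly into poorer tool selection.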

Step 5: Create the agent class

The agent has to think, decide which tool to use, and execute it. Here are the system prompt template and the Agent class:

agent_system_prompt_template = """
You are an intelligent AI assistant with access to specific tools. Your responses must ALWAYS be in this JSON format:
{{
    "tool_choice": "name_of_the_tool",
    "tool_input": "inputs_to_the_tool"
}}

TOOLS AND WHEN TO USE THEM:

1. basic_calculator: Use for ANY mathematical calculations
   - Input format: {{"num1": number, "num2": number, "operation": "add/subtract/multiply/divide"}}
   - Supported operations: add/plus, subtract/minus, multiply/times, divide
   - Example inputs and outputs:
     Input: "Calculate 15 plus 7"
     Output: {{"tool_choice": "basic_calculator", "tool_input": {{"num1": 15, "num2": 7, "operation": "add"}}}}
     Input: "What is 100 divided by 5?"
     Output: {{"tool_choice": "basic_calculator", "tool_input": {{"num1": 100, "num2": 5, "operation": "divide"}}}}

2. reverse_string: Use for ANY request involving reversing text
   - Input format: Just the text to be reversed as a string
   - ALWAYS use this tool when user mentions "reverse", "backwards", or asks to reverse text
   - Example inputs and outputs:
     Input: "Reverse of 'Howwwww'?"
     Output: {{"tool_choice": "reverse_string", "tool_input": "Howwwww"}}
     Input: "What is the reverse of Python?"
     Output: {{"tool_choice": "reverse_string", "tool_input": "Python"}}

3. no tool: Use for general conversation and questions
   - Example inputs and outputs:
     Input: "Who are you?"
     Output: {{"tool_choice": "no tool", "tool_input": "I am an AI assistant that can help you with calculations, reverse text, and answer questions. I can perform mathematical operations and reverse strings. How can I help you today?"}}
     Input: "How are you?"
     Output: {{"tool_choice": "no tool", "tool_input": "I'm functioning well, thank you for asking! I'm here to help you with calculations, text reversal, or answer any questions you might have."}}

STRICT RULES:

1. For questions about identity, capabilities, or feelings:
   - ALWAYS use "no tool"
   - Provide a complete, friendly response
   - Mention your capabilities

2. For ANY text reversal request:
   - ALWAYS use "reverse_string"
   - Extract ONLY the text to be reversed
   - Remove quotes, "reverse of", and other extra text

3. For ANY math operations:
   - ALWAYS use "basic_calculator"
   - Extract the numbers and operation
   - Convert text numbers to digits

Here is a list of your tools along with their descriptions:
{tool_descriptions}

Remember: Your response must ALWAYS be valid JSON with "tool_choice" and "tool_input" fields.
"""

class Agent:
    def __init__(self, tools, model_service, model_name, stop=None):
        """
        Initializes the agent with a list of tools and a model.

        Parameters:
        tools (list): List of tool functions.
        model_service (class): The model service class with a generate_text method.
        model_name (str): The name of the model to use.
        stop (str, optional): Stop sequence passed through to the model service.
        """
        self.tools = tools
        self.model_service = model_service
        self.model_name = model_name
        self.stop = stop

    def prepare_tools(self):
        """
        Stores the tools in the toolbox and returns their descriptions.

        Returns:
        str: Descriptions of the tools stored in the toolbox.
        """
        toolbox = ToolBox()
        toolbox.store(self.tools)
        tool_descriptions = toolbox.tools()
        return tool_descriptions

    def think(self, prompt):
        """
        Runs the generate_text method on the model using the system prompt template and tool descriptions.

        Parameters:
        prompt (str): The user query to generate a response for.

        Returns:
        dict: The response from the model as a dictionary.
        """
        tool_descriptions = self.prepare_tools()
        agent_system_prompt = agent_system_prompt_template.format(tool_descriptions=tool_descriptions)

        # Create an instance of the model service with the system prompt
        if self.model_service == OllamaModel:
            model_instance = self.model_service(
                model=self.model_name,
                system_prompt=agent_system_prompt,
                temperature=0,
                stop=self.stop
            )
        else:
            model_instance = self.model_service(
                model=self.model_name,
                system_prompt=agent_system_prompt,
                temperature=0
            )

        # Generate and return the response dictionary
        agent_response_dict = model_instance.generate_text(prompt)
        return agent_response_dict

    def work(self, prompt):
        """
        Parses the dictionary returned from think and executes the appropriate tool.

        Parameters:
        prompt (str): The user query to generate a response for.

        Returns:
        The response from executing the appropriate tool or the tool_input if no matching tool is found.
        """
        agent_response_dict = self.think(prompt)

        # Surface model or connection errors instead of printing None
        if "error" in agent_response_dict:
            print(colored(f"Error: {agent_response_dict['error']}", 'red'))
            return None

        tool_choice = agent_response_dict.get("tool_choice")
        tool_input = agent_response_dict.get("tool_input")

        for tool in self.tools:
            if tool.__name__ == tool_choice:
                response = tool(tool_input)
                print(colored(response, 'cyan'))
                return response

        # No matching tool ("no tool" or an unknown name): show the model's own text
        print(colored(tool_input, 'cyan'))
        return tool_input
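One detail of the system prompt template deserves a note: it is filled in with str.format, so every literal brace in the JSON examples is doubled ({{ and }}), and {tool_descriptions} is the only real placeholder. The shortened template below is an assumption made for illustration, not the article's full prompt.

```python
import json

# Doubled braces are literal braces; {tool_descriptions} is the only placeholder.
TEMPLATE = """Reply in this JSON format:
{{
    "tool_choice": "name_of_the_tool",
    "tool_input": "inputs_to_the_tool"
}}

Tools:
{tool_descriptions}
"""

rendered = TEMPLATE.format(tool_descriptions='reverse_string: "Reverse the given string."')
print(rendered)

# After formatting, the doubled braces have collapsed into a valid JSON object.
json_part = rendered[rendered.index("{"): rendered.index("}") + 1]
print(json.loads(json_part)["tool_choice"])  # name_of_the_tool
```

Braces inside the substituted value are not re-parsed by str.format, which is why the calculator's docstring, itself full of braces, can be inserted without escaping.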

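The Agent class has three main methods:

prepare_tools: stores the tools and returns their descriptions.

think: decides which tool to use for the user's prompt.

work: executes the chosen tool and prints its result.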
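The tool-selection step can fail in several ways: the model call may return an error, the reply may lack a field, the model may name a tool that does not exist, or the tool itself may crash. The following is a minimal sketch of a more defensive dispatcher covering those cases. The dispatch function and its messages are hypothetical, and the reverse_string here is a simplified stand-in for the tool defined earlier.

```python
from typing import Any, Callable, Dict, List

def dispatch(decision: Dict[str, Any], tools: List[Callable[[Any], str]]) -> str:
    """Run the tool named in decision, or fall back to the model's own text."""
    if "error" in decision:
        return f"Model error: {decision['error']}"
    tool_choice = decision.get("tool_choice")
    tool_input = decision.get("tool_input")
    if tool_input is None:
        return "Model reply is missing 'tool_input'."
    registry = {tool.__name__: tool for tool in tools}
    tool = registry.get(tool_choice) if isinstance(tool_choice, str) else None
    if tool is None:
        # Covers "no tool" as well as tool names the model made up.
        return str(tool_input)
    try:
        return tool(tool_input)
    except Exception as exc:  # keep the chat loop alive if a tool crashes
        return f"Tool '{tool_choice}' failed: {exc}"

def reverse_string(text: str) -> str:
    return text[::-1]

print(dispatch({"tool_choice": "reverse_string", "tool_input": "Python"}, [reverse_string]))
print(dispatch({"tool_choice": "no tool", "tool_input": "Hello!"}, [reverse_string]))
print(dispatch({"error": "connection refused"}, [reverse_string]))
```

Building a name-to-function registry replaces the linear scan over the tool list with a dictionary lookup, though with only two tools the difference is purely stylistic.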

Step 6: Run the agent

The last step is to put everything together and run the agent. The script's main block initializes the agent and then reads user input in a loop:

# Example usage
if __name__ == "__main__":
    """
    Instructions for using this agent:

    Example queries you can try:
    1. Calculator operations:
       - "Calculate 15 plus 7"
       - "What is 100 divided by 5?"
       - "Multiply 23 and 4"
    2. String reversal:
       - "Reverse the word 'hello world'"
       - "Can you reverse 'Python Programming'?"
    3. General questions (will get direct responses):
       - "Who are you?"
       - "What can you help me with?"

    Ollama Commands (run these in terminal):
    - Check available models: 'ollama list'
    - Check running models: 'ps aux | grep ollama'
    - List model tags: 'curl http://localhost:11434/api/tags'
    - Pull a new model: 'ollama pull mistral'
    - Run model server: 'ollama serve'
    """
    tools = [basic_calculator, reverse_string]

    # Uncomment below to run with OpenAI
    # (OpenAIModel is not defined in this article; it must be provided separately)
    # model_service = OpenAIModel
    # model_name = 'gpt-3.5-turbo'
    # stop = None

    # Using Ollama with llama2 model (pull it first: 'ollama pull llama2')
    model_service = OllamaModel
    model_name = "llama2"  # Can be changed to other models like 'mistral', 'codellama', etc.
    # "<|eot_id|>" is the end-of-turn token of Llama 3 style models; adjust or set to None for other models
    stop = "<|eot_id|>"

    agent = Agent(tools=tools, model_service=model_service, model_name=model_name, stop=stop)

    print("Welcome to the AI Agent! Type 'exit' to quit.")
    print("You can ask me to:")
    print("1. Perform calculations (e.g., 'Calculate 15 plus 7')")
    print("2. Reverse strings (e.g., 'Reverse hello world')")
    print("3. Answer general questions")

    while True:
        prompt = input("Ask me anything: ")
        if prompt.lower() == "exit":
            break
        agent.work(prompt)
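Because the loop only works when the Ollama server is running and the chosen model has been pulled, it can help to check both before starting. The sketch below queries the /api/tags endpoint quoted in the notes above, using only the standard library. The helper names and the two-second timeout are assumptions, and the sketch relies on the endpoint returning a "models" list whose entries carry a "name" field.

```python
import json
import urllib.request
from typing import Any, Dict, List

OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"  # endpoint quoted in the notes above

def model_names(tags_payload: Dict[str, Any]) -> List[str]:
    """Pull model names out of an /api/tags style payload."""
    return [m["name"] for m in tags_payload.get("models", []) if "name" in m]

def is_model_available(wanted: str, names: List[str]) -> bool:
    """True if wanted matches a name exactly or ignoring the ':tag' suffix."""
    return any(n == wanted or n.split(":", 1)[0] == wanted for n in names)

def fetch_tags(url: str = OLLAMA_TAGS_URL, timeout: float = 2.0) -> Dict[str, Any]:
    """Return the server's tag list, or an empty dict if it cannot be reached."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.load(resp)
    except (OSError, ValueError):
        # OSError covers connection failures; ValueError covers malformed JSON.
        return {}

if __name__ == "__main__":
    names = model_names(fetch_tags())
    if not is_model_available("llama2", names):
        print("Model 'llama2' not found; run 'ollama pull llama2' or start 'ollama serve'.")
```

Running this check once before constructing the Agent turns a confusing failure inside the chat loop into a clear setup message.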

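3. Summary

Starting from the concept of an agent, this article walked through environment setup, the model definition, tool creation, and the toolbox, and finally put the pieces together to run the agent.

This structured approach lays the groundwork for building intelligent, interactive agents. AI agents will be applied across industries, driving efficiency and innovation.

Stay tuned for more in-depth analysis and advanced techniques to help you build more capable AI agents!

Translator: boxi.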
