Python Robot Agent Programming: The Underlying Logic of Multi-Agent Frameworks (Part 1)

1. Introduction

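Claims that scaling laws, the "first principle" of large language models, are breaking down are now heard everywhere, and it does feel as if large models are approaching a plateau. Still, the whole world is competing hard in AI, and amid the mix of hype and substance it is wise neither to believe such claims fully nor to dismiss them. One direction, however, enjoys broad agreement: the multi-agent route. Leading AI companies such as OpenAI, Google, Facebook, and Microsoft, as well as prominent figures such as Andrew Ng, Fei-Fei Li, and Michael Winikoff, have all endorsed the multi-agent approach. This article is a re-reading of, and a set of notes on, Ilan Bigio's "Orchestrating Agents: Routines and Handoffs". That article is plain and unadorned, and it lays out the basic algorithmic logic underlying multi-agent collaboration. Swarm, the educational framework released by OpenAI, grew out of this idea.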

2. Two Core Concepts

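The multi-agent collaboration idea introduces two concepts: routines and handoffs. A Python implementation built on these two concepts handles transfers between agents, their collaboration, and the complete interaction with the user.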

2.1 Routines

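As I understand it, this term can be read as a simple, mechanical task list. By giving the LLM a prompt that describes a clear, simple sequence of tasks, together with the functions or tools needed to complete them, a single agent gains the ability to perform a particular "skill". There are two main points here, the prompt and the tools; items (3) and (4) below build on them: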

(1) A clear prompt

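The LLM needs a system prompt that is explicit, unambiguous, and easy to act on. It is like hiring an ordinary person and training them for the job, so that they understand the responsibilities and operating steps of the position and become a clerk in a specific role at the company.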

system_message = (
    "You are a customer support agent for ACME Inc. "
    "Always answer in a sentence or less. "
    "Follow the following routine with the user:\n"
    "1. First, ask probing questions and understand the user's problem deeper.\n"
    " - unless the user has already provided a reason.\n"
    "2. Propose a fix (make one up).\n"
    "3. ONLY if not satisfied, offer a refund.\n"
    "4. If accepted, search for the ID and then execute refund."
)
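
A routine prompt like the one above can also be assembled from a list of steps, which sidesteps the missing-separator pitfalls of adjacent string literals. The following is only a minimal sketch: build_routine_prompt and its parameters are hypothetical names introduced for illustration and are not part of Swarm or of the original article.

```python
def build_routine_prompt(role: str, rules: list[str], steps: list[str]) -> str:
    """Join a role line, general rules, and numbered steps into one system prompt."""
    lines = [role, *rules, "Follow the following routine with the user:"]
    # Number the steps explicitly so the model sees a clear order.
    lines += [f"{i}. {step}" for i, step in enumerate(steps, start=1)]
    return "\n".join(lines)


system_message = build_routine_prompt(
    role="You are a customer support agent for ACME Inc.",
    rules=["Always answer in a sentence or less."],
    steps=[
        "First, ask probing questions and understand the user's problem deeper"
        " - unless the user has already provided a reason.",
        "Propose a fix (make one up).",
        "ONLY if not satisfied, offer a refund.",
        "If accepted, search for the ID and then execute refund.",
    ],
)
print(system_message)
```

Keeping the steps in a list makes it easy to reorder or extend a routine without touching the surrounding prompt text.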

(2) Automatic generation of the tool-call JSON schema

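Most LLMs now support calling external tools/functions, and many follow OpenAI's format, which is based on JSON Schema. It can be seen as a kind of structured-output communication protocol for large models.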

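JSON Schema is a specification for describing and validating the structure of JSON data. It defines the types, formats, constraints, and relationships of the elements in a JSON document, which helps keep data consistent and reliable. It plays an important role in software development, API design, and data exchange (a definition taken from the web, and not a very illuminating one).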
The format of this "protocol" is as follows. name is the tool name, description is the tool description, parameters describes the tool's arguments (here three required parameters and one optional one), and required lists the arguments that must be passed:

{
  "type": "function",
  "function": {
    "name": "sample_function",
    "description": "This is my docstring. Call this function when you want.",
    "parameters": {
      "type": "object",
      "properties": {
        "param_1": {
          "type": "string"
        },
        "param_2": {
          "type": "string"
        },
        "the_third_one": {
          "type": "integer"
        },
        "some_optional": {
          "type": "string"
        }
      },
      "required": [
        "param_1",
        "param_2",
        "the_third_one"
      ]
    }
  }
}
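
Because the schema lists the properties and the required names, it can also be used to sanity-check the argument string a model sends back. The sketch below is a hand-rolled check for illustration only: check_arguments and JSON_TYPES are hypothetical names, and it covers just the flat schemas shown here, not the full JSON Schema specification.

```python
import json

# JSON type name -> Python type(s) accepted for it.
JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "array": list,
    "object": dict,
    "null": type(None),
}

sample_schema = {
    "type": "function",
    "function": {
        "name": "sample_function",
        "description": "This is my docstring. Call this function when you want.",
        "parameters": {
            "type": "object",
            "properties": {
                "param_1": {"type": "string"},
                "param_2": {"type": "string"},
                "the_third_one": {"type": "integer"},
                "some_optional": {"type": "string"},
            },
            "required": ["param_1", "param_2", "the_third_one"],
        },
    },
}


def check_arguments(tool_schema: dict, raw_arguments: str) -> list[str]:
    """Return a list of problems found in a model-produced argument string."""
    params = tool_schema["function"]["parameters"]
    args = json.loads(raw_arguments)
    problems = []
    for name in params.get("required", []):
        if name not in args:
            problems.append(f"missing required argument: {name}")
    for name, value in args.items():
        spec = params["properties"].get(name)
        if spec is None:
            problems.append(f"unexpected argument: {name}")
            continue
        # bool is a subclass of int in Python, so reject it for non-boolean types.
        is_bool_mismatch = isinstance(value, bool) and spec["type"] != "boolean"
        if is_bool_mismatch or not isinstance(value, JSON_TYPES[spec["type"]]):
            problems.append(f"argument {name} should be of type {spec['type']}")
    return problems


print(check_arguments(sample_schema, '{"param_1": "a", "the_third_one": "3"}'))
```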

This simply corresponds to an ordinary Python function:

def sample_function(param_1, param_2, the_third_one: int, some_optional="John Doe"):
    """
    This is my docstring. Call this function when you want.
    """
    print("Hello, world")

Instead of defining this JSON Schema by hand, a Python function can generate it automatically. The Swarm framework uses the same approach:

import inspect
# Automatically generate a JSON Schema from a function
def function_to_schema(func) -> dict:
    type_map = {
        str: "string",
        int: "integer",
        float: "number",
        bool: "boolean",
        list: "array",
        dict: "object",
        type(None): "null",
    }

    try:
        signature = inspect.signature(func)
    except ValueError as e:
        raise ValueError(
            f"Failed to get signature for function {func.__name__}: {str(e)}"
        )

    parameters = {}
    for param in signature.parameters.values():
        # dict.get never raises KeyError, so a missing or unknown annotation
        # simply falls back to "string"
        param_type = type_map.get(param.annotation, "string")
        parameters[param.name] = {"type": param_type}

    # parameters without a default value are required
    required = [
        param.name
        for param in signature.parameters.values()
        if param.default is inspect.Parameter.empty
    ]

    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": (func.__doc__ or "").strip(),
            "parameters": {
                "type": "object",
                "properties": parameters,
                "required": required,
            },
        },
    }

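The generator above works on any ordinary Python function (parameters without a recognized type annotation fall back to "string"):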

import json

def add(a: int, b: int, isadd=True):
    """
    This function adds a and b when isadd is true, or subtracts b from a when isadd is false, and returns the result.
    """
    if isadd:
        return a + b
    else:
        return a - b

schema = function_to_schema(add)
print(json.dumps(schema, indent=2))

The printed result is as follows:
[Figure: the JSON schema printed for add]
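
One side effect of the type mapping is that an unannotated parameter such as isadd=True is reported as "string". A possible refinement, which is an assumption of this note and not something Swarm does, is to infer the type from the default value when no annotation is present. infer_json_type and TYPE_MAP are hypothetical names used only in this sketch.

```python
import inspect

TYPE_MAP = {
    str: "string",
    int: "integer",
    float: "number",
    bool: "boolean",
    list: "array",
    dict: "object",
    type(None): "null",
}


def infer_json_type(param: inspect.Parameter) -> str:
    """Pick a JSON type from the annotation, else from the default value."""
    if param.annotation is not inspect.Parameter.empty:
        return TYPE_MAP.get(param.annotation, "string")
    if param.default is not inspect.Parameter.empty:
        return TYPE_MAP.get(type(param.default), "string")
    return "string"


def add(a: int, b: int, isadd=True):
    """Add a and b when isadd is true, otherwise subtract b from a."""
    return a + b if isadd else a - b


properties = {
    name: {"type": infer_json_type(param)}
    for name, param in inspect.signature(add).parameters.items()
}
print(properties)
```

With this variant isadd is described as "boolean", which gives the model a more accurate hint about what to pass.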
With these two tools in hand, an agent can easily call external functions:

# -*- coding: utf-8 -*-
"""
Created on Fri Nov 15 16:47:17 2024

@author: 18268
"""

import inspect
import json

def function_to_schema(func) -> dict:
    type_map = {
        str: "string",
        int: "integer",
        float: "number",
        bool: "boolean",
        list: "array",
        dict: "object",
        type(None): "null",
    }

    try:
        signature = inspect.signature(func)
    except ValueError as e:
        raise ValueError(
            f"Failed to get signature for function {func.__name__}: {str(e)}"
        )

    parameters = {}
    for param in signature.parameters.values():
        # dict.get never raises KeyError, so a missing or unknown annotation
        # simply falls back to "string"
        param_type = type_map.get(param.annotation, "string")
        parameters[param.name] = {"type": param_type}

    # parameters without a default value are required
    required = [
        param.name
        for param in signature.parameters.values()
        if param.default is inspect.Parameter.empty
    ]

    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": (func.__doc__ or "").strip(),
            "parameters": {
                "type": "object",
                "properties": parameters,
                "required": required,
            },
        },
    }

def add(a: int, b: int, isadd=True):
    """
    This function adds a and b when isadd is true, or subtracts b from a when isadd is false, and returns the result.
    """
    if isadd:
        return a+b
    else:
        return a-b

schema =  function_to_schema(add)
print(json.dumps(schema, indent=2))

from openai import OpenAI
# Define the model
MODEL = "llama3.2:latest"  
ollama_client = OpenAI(
    base_url = 'http://localhost:11434/v1',
    api_key='None', # required, but unused
)
messages = []

tools = [add]
tool_schemas = [function_to_schema(tool) for tool in tools]

response = ollama_client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user", "content": "1加1等于几"}],
            tools=tool_schemas,
        )
message = response.choices[0].message

print(message.tool_calls[0].function)

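Given the user input "1加1等于几" ("What is 1 plus 1?"), the model consults tool_schemas and decides on its own to call the add tool. The output is as follows:
[Figure: the printed function object of the tool call]
This is a type defined by the openai library: openai.types.chat.chat_completion_message_tool_call.Function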

(3) Parsing the model's tool-call instruction

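When the model decides that a tool should be called, it emits the information for the function it wants to call:
[Figure: the tool call returned by the model]

It contains a function attribute with the function's name and arguments. The next step is to use it to call the actual function: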

tools=[add]
tools_map = {tool.__name__: tool for tool in tools}  # maps each function name to the function itself

def execute_tool_call(tool_call, tools_map):
    # Call the matching function, following the OpenAI-style response format
    name = tool_call.function.name
    args = json.loads(tool_call.function.arguments)

    print(f"Assistant: {name}({args})")

    # call corresponding function with provided arguments
    return tools_map[name](**args)
execute_tool_call(message.tool_calls[0], tools_map)

As shown below, the add function is called and executed, and the result is output.
[Figure: output of the add call]
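
The dispatch step can be tried without a running model by faking the tool-call object. The sketch below makes two assumptions of its own: a SimpleNamespace stands in for the openai tool-call type (only the .function.name and .function.arguments attributes are mimicked), and safe_execute_tool_call is a hypothetical variant that returns an error string when the model names an unknown tool or sends bad arguments, instead of raising.

```python
import json
from types import SimpleNamespace


def add(a: int, b: int, isadd=True):
    """Add a and b when isadd is true, otherwise subtract b from a."""
    return a + b if isadd else a - b


def safe_execute_tool_call(tool_call, tools_map):
    """Run the requested tool, returning an error string instead of raising."""
    name = tool_call.function.name
    if name not in tools_map:
        return f"Error: unknown tool {name!r}"
    try:
        args = json.loads(tool_call.function.arguments)
        return tools_map[name](**args)
    except (json.JSONDecodeError, TypeError) as exc:
        # Malformed JSON, or arguments the function cannot accept, end up here.
        return f"Error: bad arguments for {name}: {exc}"


def fake_tool_call(name: str, arguments: str):
    """Build a stand-in with the same .function.name / .function.arguments shape."""
    return SimpleNamespace(function=SimpleNamespace(name=name, arguments=arguments))


tools_map = {"add": add}
print(safe_execute_tool_call(fake_tool_call("add", '{"a": 1, "b": 1}'), tools_map))
print(safe_execute_tool_call(fake_tool_call("multiply", "{}"), tools_map))
```

Returning the error as a string means it can be sent back to the model as the tool result, giving the model a chance to correct its call.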

(4) Single-agent decision loop and output

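The above lets the LLM call a function from the tool library automatically. To support several tool calls, a while loop is also needed: the output of the previous tool call is first fed back to the LLM, which then checks it against the routine's task list and decides whether to call another tool, until it judges that it can return a result to the user: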


def run_full_turn(system_message, tools, messages):

    num_init_messages = len(messages)
    messages = messages.copy()

    while True:

        # turn python functions into tools and save a reverse map
        tool_schemas = [function_to_schema(tool) for tool in tools]
        tools_map = {tool.__name__: tool for tool in tools}

        # === 1. get openai completion ===
        
        
        response = ollama_client.chat.completions.create(
                    model=MODEL,  # or another local model such as qwen2.5
                    messages=[{"role": "system", "content": system_message}] + messages,
                    tools=tool_schemas or None,
                )
        
      
        message = response.choices[0].message
        messages.append(message)

        if message.content:  # print assistant response
            print("Assistant:", message.content)

        if not message.tool_calls:  # if finished handling tool calls, break
            break

        # === 2. handle tool calls ===

        for tool_call in message.tool_calls:
            result = execute_tool_call(tool_call, tools_map)

            result_message = {
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": str(result),  # tool message content must be a string; add returns an int
            }
            print("result_message:",result_message)
            messages.append(result_message)

    # ==== 3. return new messages =====
    return messages[num_init_messages:]
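
The shape of this loop can be exercised offline by swapping the model for a scripted stand-in. The sketch below is not the article's run_full_turn: scripted_model and run_turn_offline are hypothetical names, the "model" always calls add once and then answers, and a max_steps bound is added as an assumption so that a misbehaving model cannot loop forever.

```python
import json
from types import SimpleNamespace


def add(a: int, b: int, isadd=True):
    """Add a and b when isadd is true, otherwise subtract b from a."""
    return a + b if isadd else a - b


def scripted_model(messages):
    """Hypothetical stand-in for the LLM: call add once, then answer."""
    last = messages[-1]
    if isinstance(last, dict) and last.get("role") == "tool":
        # A tool result is available, so produce the final answer.
        return SimpleNamespace(content=f"The answer is {last['content']}.", tool_calls=None)
    call = SimpleNamespace(
        id="call_1",
        function=SimpleNamespace(name="add", arguments=json.dumps({"a": 1, "b": 1})),
    )
    return SimpleNamespace(content=None, tool_calls=[call])


def run_turn_offline(tools, messages, max_steps=5):
    """Same loop shape as run_full_turn, but driven by scripted_model."""
    tools_map = {tool.__name__: tool for tool in tools}
    num_init_messages = len(messages)
    messages = messages.copy()
    for _ in range(max_steps):  # bounded instead of `while True`
        message = scripted_model(messages)
        messages.append(message)
        if not message.tool_calls:  # no more tool calls: the turn is finished
            break
        for tool_call in message.tool_calls:
            args = json.loads(tool_call.function.arguments)
            result = tools_map[tool_call.function.name](**args)
            messages.append(
                {"role": "tool", "tool_call_id": tool_call.id, "content": str(result)}
            )
    return messages[num_init_messages:]


new_messages = run_turn_offline([add], [{"role": "user", "content": "What is 1 plus 1?"}])
print(new_messages[-1].content)
```

The returned list holds three new messages: the assistant's tool call, the tool result, and the final assistant answer, which mirrors what the real loop appends during one turn.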