Note

Go to the end to download the full example code.

Structured Output

AgentScope supports two structured output methods, as shown in the following figure:

Tool API: Construct a function with input parameters as the fields of the required structured data, and then ask the LLM to call the function to obtain the structured data.
Text Parsing: Call the LLM API to obtain plain text data, and then parse the structured data from the plain text locally.

The advantages and disadvantages of the two methods are as follows:

Method	Advantages	Disadvantages
Tool API	The model autonomously decides when to call the function/generate structured output, which can be well combined with the ReAct algorithm. Data parsing occurs at the LLM API provider, making local development easier. Supports complex constraints based on JSON Schema.	Requires an LLM API that supports tool invocation.
Text Parsing	Simple and easy to use. Can adjust the format and parsing method based on model capabilities and required structured data.	Relies on model capabilities, which may never produce structured data that meets the requirements. The model passively generates structured data, and the developer decides when to prompt the LLM to generate structured data.

Next we will introduce how AgentScope supports these two different parsing methods.

Tool API

The Tool API method combines tool invocation and structured output. For example, if we need the fields “thought”, “choice”, and “number”, we can construct a function as follows:

from typing import Literal
from pydantic import BaseModel, Field
import json


def generate_response(
    thought: str,
    choice: Literal["apple", "banana"],
    number: int,
) -> None:
    pass

The function signature serves as a constraint, and when the model correctly invokes the function, we can obtain the corresponding structured data.

Considering that some complex constraints cannot be expressed using Python’s type annotations, AgentScope supports defining complex constraints using Pydantic’s BaseModel Taking the following two models as an example:

class Model1(BaseModel):
    name: str = Field(min_length=0, max_length=20, description="The name")
    description: str = Field(
        min_length=0,
        max_length=200,
        description="The brief description",
    )
    age: int = Field(ge=0, le=100, description="The age")


class Model2(BaseModel):
    choice: Literal["apple", "banana"] = Field(description="Your choice")

The ReActAgentV2 class in AgentScope will combine the JSON Schema of the BaseModel subclass with the schema of a function named generate_response. This will generate a new schema that can be used to constrain the model’s output when calling the function.

For example, the following code demonstrates how to use ReActAgentV2 to combine the ReAct algorithm with structured output.

from agentscope.agents import ReActAgentV2
from agentscope.service import ServiceToolkit
from agentscope.message import Msg
import agentscope

agentscope.init(
    model_configs={
        "config_name": "my_config",
        "model_type": "dashscope_chat",
        "model_name": "qwen-max",
    },
)

toolkit = ServiceToolkit()

agent = ReActAgentV2(
    name="Friday",
    model_config_name="my_config",
    service_toolkit=toolkit,
)

msg1 = Msg("user", "Introduce Einstein", "user")
res_msg = agent(msg1, structured_model=Model1)

print("The structured output: ", res_msg.metadata)

Friday: [
    {
        "type": "tool_use",
        "id": "call_b4e14b67a9bf48268e951d",
        "name": "generate_response",
        "input": {
            "response": "Albert Einstein was a German-born theoretical physicist, widely acknowledged to be one of the greatest and most influential physicists of all time. He is best known for developing the theory of relativity, but he also made important contributions to the development of the theory of quantum mechanics. His mass–energy equivalence formula E=mc^2, which has been dubbed 'the world's most famous equation', has been replicated in popular culture countless times.",
            "name": "Albert Einstein",
            "description": "Theoretical Physicist, Developer of the Theory of Relativity",
            "age": 76
        }
    }
]
system: Execute function generate_response:
Success
Friday: Albert Einstein was a German-born theoretical physicist, widely acknowledged to be one of the greatest and most influential physicists of all time. He is best known for developing the theory of relativity, but he also made important contributions to the development of the theory of quantum mechanics. His mass–energy equivalence formula E=mc^2, which has been dubbed 'the world's most famous equation', has been replicated in popular culture countless times.
The structured output:  {'name': 'Albert Einstein', 'description': 'Theoretical Physicist, Developer of the Theory of Relativity', 'age': 76}

With different structured_model, we can achieve different structured output

msg2 = Msg("user", "Pick a fruit", "user")
res_msg = agent(msg2, structured_model=Model2)

print("The structured output: ", res_msg.metadata)

Friday: [
    {
        "type": "tool_use",
        "id": "call_b02a72a6b367444bb932d5",
        "name": "generate_response",
        "input": {
            "response": "I choose apple.",
            "choice": "apple"
        }
    }
]
system: Execute function generate_response:
Success
Friday: I choose apple.
The structured output:  {'choice': 'apple'}

To observe how ReActAgentV2 dynamically constructs the schema of the function, we remove a hook function that cleans up the structured output, allowing us to print the processed function schema.

# Clear the memory
agent.memory.clear()
# Remove the hook function that cleans up the structured output
agent.remove_hook("post_reply", "as_clear_structured_output")

# Observe the current schema of the target function
print(
    json.dumps(
        toolkit.json_schemas[agent._finish_function],
        indent=4,
        ensure_ascii=False,
    ),
)

{
    "type": "function",
    "function": {
        "name": "generate_response",
        "parameters": {
            "properties": {
                "response": {
                    "description": "The response to the user.",
                    "type": "string"
                }
            },
            "required": [
                "response"
            ],
            "type": "object"
        },
        "description": "Generate a response. You must call this function to interact with\n\nothers (e.g., users)."
    }
}

Now we call the agent once and observe the changes in the schema of the target function

res_msg = agent(msg1, structured_model=Model1)

print(
    json.dumps(
        toolkit.json_schemas[agent._finish_function],
        indent=4,
        ensure_ascii=False,
    ),
)

Friday: [
    {
        "type": "tool_use",
        "id": "call_1f2af68576c2413cb97442",
        "name": "generate_response",
        "input": {
            "response": "Albert Einstein was a German-born theoretical physicist, widely acknowledged to be one of the greatest and most influential physicists of all time. He is best known for developing the theory of relativity, but he also made important contributions to the development of the theory of quantum mechanics. His mass–energy equivalence formula E=mc^2, which has been dubbed 'the world's most famous equation', has been replicated in popular culture countless times.",
            "name": "Albert Einstein",
            "description": "Theoretical Physicist",
            "age": 76
        }
    }
]
system: Execute function generate_response:
Success
Friday: Albert Einstein was a German-born theoretical physicist, widely acknowledged to be one of the greatest and most influential physicists of all time. He is best known for developing the theory of relativity, but he also made important contributions to the development of the theory of quantum mechanics. His mass–energy equivalence formula E=mc^2, which has been dubbed 'the world's most famous equation', has been replicated in popular culture countless times.
{
    "type": "function",
    "function": {
        "name": "generate_response",
        "parameters": {
            "properties": {
                "response": {
                    "description": "The response to the user.",
                    "type": "string"
                },
                "name": {
                    "description": "The name",
                    "maxLength": 20,
                    "minLength": 0,
                    "type": "string"
                },
                "description": {
                    "description": "The brief description",
                    "maxLength": 200,
                    "minLength": 0,
                    "type": "string"
                },
                "age": {
                    "description": "The age",
                    "maximum": 100,
                    "minimum": 0,
                    "type": "integer"
                }
            },
            "required": [
                "response",
                "name",
                "description",
                "age"
            ],
            "type": "object"
        },
        "description": "Generate a response. You must call this function to interact with\n\nothers (e.g., users)."
    }
}

We can see that the schema of the generate_response function has been combined with the schema of the Model1 class. Therefore, when the model calls this function, it will generate the corresponding structured data.

Tip

More implementation details can be found in the ReActAgentV2 source code

Text Parsing

AgentScope’s parsers module provides various parser classes that developers can choose from based on the required structured data.

Here’s an example of using MarkdownJsonDictParser to parse structured data from Markdown-formatted text.

Defining the Parser

from agentscope.models import ModelResponse
from agentscope.parsers import MarkdownJsonDictParser


parser = MarkdownJsonDictParser(
    content_hint='{"thought": "What you thought", "speak": "What you speak to the user"}',
    required_keys=["thought", "speak"],
)

The parser will generate a format instruction according to your input. You can use the format_instruction property to in your prompt to guide LLM to generate the desired output.

print(parser.format_instruction)

Respond a JSON dictionary in a markdown's fenced code block as follows:
```json
{"thought": "What you thought", "speak": "What you speak to the user"}
```

Parsing the Output

When receiving output from LLM, use parse method to extract the structured data. It takes an object of agentscope.models.ModelResponse as input, parses the value of the text field, and returns a parsed dictionary in the parsed field.

dummy_response = ModelResponse(
    text="""```json
{
    "thought": "I should greet the user",
    "speak": "Hi! How can I help you?"
}
```""",
)

print(f"parsed field before parsing: {dummy_response.parsed}")

parsed_response = parser.parse(dummy_response)

print(f"parsed field after parsing: {parsed_response.parsed}")
print(type(parsed_response.parsed))

parsed field before parsing: None
parsed field after parsing: {'thought': 'I should greet the user', 'speak': 'Hi! How can I help you?'}
<class 'dict'>

Error Handling

If the LLM output does not match the expected format, the parser will raise an error with a detailed message. So developers can present the error message to LLM to guide it to correct the output.

error_response = ModelResponse(
    text="""```json
{
    "thought": "I should greet the user"
}
```""",
)

try:
    parsed_response = parser.parse(error_response)
except Exception as e:
    print(e)

RequiredFieldNotFoundError: Missing required field speak in the JSON dictionary object.

Advanced Usage

Auto Post-Processing

Within the parsed dictionary, different keys may require different post-processing steps. For example, in a werewolf game, the LLM is playing the role of a seer, and the output should contain the following keys:

thought: The seer’s thoughts
speak: The seer’s speech
use_ability: A boolean value indicating whether the seer should use its ability

In this case, the thought and speak contents should be stored in the agent’s memory to ensure the consistency of the agent’s behavior. The speak content should be spoken out to the user. The use_ability key should be accessed outside the agent easily to determine the game flow.

AgentScope supports automatic post-processing of the parsed dictionary by providing the following parameters when initializing the parser.

keys_to_memory: key(s) that should be stored in the agent’s memory
keys_to_content: key(s) that should be spoken out
keys_to_metadata: key(s) that should be stored in the metadata field of the agent’s response message

Note

If a string is provided, the parser will extract the value of the given key from the parsed dictionary. If a list of strings is provided, a sub-dictionary will be created with the given keys.

Here is an example of using the MarkdownJsonDictParser to automatically post-process the parsed dictionary.

parser = MarkdownJsonDictParser(
    content_hint='{"thought": "what you thought", "speak": "what you speak", "use_ability": "whether to use the ability"}',
    keys_to_memory=["thought", "speak"],
    keys_to_content="speak",
    keys_to_metadata="use_ability",
)

dummy_response = ModelResponse(
    text="""```json
{
    "thought": "I should ...",
    "speak": "I will not use my ability",
    "use_ability": false
}```
""",
)

parsed_response = parser.parse(dummy_response)

print("The parsed response: ", parsed_response.parsed)
print("To memory", parser.to_memory(parsed_response.parsed))
print("To message content: ", parser.to_content(parsed_response.parsed))
print("To message metadata: ", parser.to_metadata(parsed_response.parsed))

The parsed response:  {'thought': 'I should ...', 'speak': 'I will not use my ability', 'use_ability': False}
To memory {'thought': 'I should ...', 'speak': 'I will not use my ability'}
To message content:  I will not use my ability
To message metadata:  False

Here we show how to create an agent that can automatically post-process the parsed dictionary by the following core steps in the reply method.

Put the format instruction in prompt to guide LLM to generate the desired output
Parse the LLM response
Post-process the parsed dictionary using relevant methods

Tip

By changing different parsers, the agent can adapt to different scenarios and generate structured output in various formats.

from agentscope.models import DashScopeChatWrapper
from agentscope.agents import AgentBase
from agentscope.message import Msg


class Agent(AgentBase):
    def __init__(self):
        self.name = "Alice"
        super().__init__(name=self.name)

        self.sys_prompt = f"You're a helpful assistant named {self.name}."

        self.model = DashScopeChatWrapper(
            config_name="_",
            model_name="qwen-max",
        )

        self.parser = MarkdownJsonDictParser(
            content_hint='{"thought": "what you thought", "speak": "what you speak", "use_ability": "whether to use the ability"}',
            keys_to_memory=["thought", "speak"],
            keys_to_content="speak",
            keys_to_metadata="use_ability",
        )

        self.memory.add(Msg("system", self.sys_prompt, "system"))

    def reply(self, msg):
        self.memory.add(msg)

        prompt = self.model.format(
            self.memory.get_memory(),
            # Instruct the model to respond in the required format
            Msg("system", self.parser.format_instruction, "system"),
        )

        response = self.model(prompt)

        parsed_response = self.parser.parse(response)

        self.memory.add(
            Msg(
                name=self.name,
                content=self.parser.to_memory(parsed_response.parsed),
                role="assistant",
            ),
        )

        return Msg(
            name=self.name,
            content=self.parser.to_content(parsed_response.parsed),
            role="assistant",
            metadata=self.parser.to_metadata(parsed_response.parsed),
        )

Total running time of the script: (0 minutes 22.042 seconds)

Gallery generated by Sphinx-Gallery