Response Parser
Background
When building LLM-empowered applications, parsing the string generated by the LLM into a specific format and extracting the required information is an important step. It is also a complex one, for the following reasons:
Diversity: The target format of parsing is diverse, and the information to be extracted may be a specific text, a JSON object, or a complex data structure.
Complexity: Parsing involves more than converting the text generated by the LLM into the target format. It also raises issues such as prompt engineering (telling the LLM what output format to produce) and error handling.
Flexibility: Even in the same application, different stages may also require the agent to generate output in different formats.
To make this easier, AgentScope provides a parser module that parses LLM responses into a specific format. With the parser module, developers can obtain the target format through simple configuration and switch between target formats flexibly.
In AgentScope, the parser module features:
Flexibility: Developers can flexibly set the required format and switch parsers without modifying the code of the agent class. That is, the specific “target format” and the agent’s `reply` function are decoupled.
Freedom: Format instruction, result parsing, and prompt engineering are all done explicitly in the `reply` function. Developers and users are free to use the parser or to parse the LLM response with their own code.
Transparency: When using the parser, the process and results of prompt construction are fully visible to developers in the `reply` function, so they can precisely debug their applications.
Parser Module
Overview
The main functions of the parser module include:
Provide a “format instruction”, that is, tell the LLM where to generate what output, for example:
You should generate python code in a fenced code block as follows
```python
{your_python_code}
```
Provide a parse function, which directly parses the text generated by the LLM into the target data format.
Provide post-processing for the dictionary format: after the text is parsed into a dictionary, different fields may serve different purposes.
AgentScope provides multiple built-in parsers, and developers can choose according to their needs.
| Target Format | Parser Class | Description |
|---|---|---|
| String | `MarkdownCodeBlockParser` | Requires the LLM to generate specified text within a Markdown code block marked by ```. The result is a string. |
| Dictionary | `MarkdownJsonDictParser` | Requires the LLM to produce a specified dictionary within the code block marked by ```json and ```. The result is a Python dictionary. |
| | `MultiTaggedContentParser` | Requires the LLM to generate specified content within multiple tags. Contents from different tags are parsed into a single Python dictionary with different key-value pairs. |
| | `RegexTaggedContentParser` | For uncertain tag names and numbers of tags. Allows users to modify the regular expression; the result is a dictionary. |
| JSON / Python Object Type | `MarkdownJsonObjectParser` | Requires the LLM to produce specified content within the code block marked by ```json and ```. The result is converted into a Python object via `json.loads`. |
NOTE: Compared to `MarkdownJsonDictParser`, `MultiTaggedContentParser` is more suitable for weak LLMs and for cases where the required format is too complex. For example, when the LLM is required to generate Python code and the code is returned directly within a dictionary, the LLM must take care of escape characters (\t, \n, …) and of the difference between double and single quotes so that `json.loads` succeeds. In contrast, `MultiTaggedContentParser` guides the LLM to generate each key-value pair separately in individual tags and then combines them into a dictionary, which reduces the difficulty.
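The escaping problem described in this note can be reproduced with the standard library alone. The following is a minimal sketch, not AgentScope code: the `[CODE]` tag pair and the variable names are made up for illustration, and the "naive" JSON string imitates what a weak LLM might emit when it pastes code into a dictionary without escaping it.

```python
import json
import re

# Python source the LLM is asked to return; it contains a real newline,
# a real tab and double quotes.
code = 'if True:\n\tprint("hi")'

# A weak LLM may paste the code into the JSON string verbatim,
# without escaping the newline, the tab or the inner quotes.
naive_json = '{"code": "' + code + '"}'
try:
    json.loads(naive_json)
    naive_ok = True
except json.JSONDecodeError:
    naive_ok = False

# With a tag pair, the same raw text needs no escaping at all.
tagged = "[CODE]" + code + "[/CODE]"
match = re.search(r"\[CODE\](.*?)\[/CODE\]", tagged, flags=re.DOTALL)
extracted = match.group(1)

print(naive_ok)           # False
print(extracted == code)  # True
```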
NOTE: The built-in strategies for constructing format instructions are only examples. In AgentScope, developers have complete control over prompt construction, so they can choose not to use the format instruction provided by a parser. Writing a format instruction by hand and implementing a new parser class are both feasible.
In the following sections, we will introduce the usage of these parsers based on different target formats.
String Type
MarkdownCodeBlockParser
Initialization
`MarkdownCodeBlockParser` requires the LLM to generate specific text within a Markdown code block of a specified language. The language is set with the `language_name` parameter, which lets the model draw on its ability to produce output in that language. For example, to ask the model for Python code, initialize the parser as follows:

    from agentscope.parsers import MarkdownCodeBlockParser

    parser = MarkdownCodeBlockParser(language_name="python", content_hint="your python code")

Format Instruction Template
`MarkdownCodeBlockParser` provides the following format instruction template. When the user reads the `format_instruction` attribute, `{language_name}` and `{content_hint}` are replaced with the strings given at initialization:

    You should generate {language_name} code in a {language_name} fenced code block as follows:
    ```{language_name}
    {content_hint}
    ```

For the initialization above, with `language_name` set to `"python"`, reading the `format_instruction` attribute returns the following string:

    print(parser.format_instruction)

    You should generate python code in a python fenced code block as follows:
    ```python
    your python code
    ```

Parse Function
`MarkdownCodeBlockParser` provides a `parse` method to parse the text generated by the LLM. Its input and output are both `ModelResponse` objects, and the parsing result is stored in the `parsed` attribute of the output object.

    from agentscope.models import ModelResponse

    res = parser.parse(
        ModelResponse(
            text="""The following is generated python code
    ```python
    print("Hello world!")
    ```
    """
        )
    )

    print(res.parsed)

    print("Hello world!")

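To make the parsing step concrete, here is a minimal sketch of how the content of a fenced code block could be pulled out of a response with a regular expression. It is not the AgentScope implementation: `extract_code_block` is a hypothetical helper, and the fence marker is built programmatically only so that the example can itself be shown inside a fenced block.

```python
import re
from typing import Optional

FENCE = "`" * 3  # three backticks, built in code to avoid nesting fences


def extract_code_block(text: str, language_name: str) -> Optional[str]:
    """Return the content of the first block fenced with language_name, or None."""
    pattern = re.escape(FENCE + language_name) + r"\n(.*?)" + re.escape(FENCE)
    match = re.search(pattern, text, flags=re.DOTALL)
    return match.group(1).strip() if match else None


response_text = (
    "The following is generated python code\n"
    f"{FENCE}python\n"
    'print("Hello world!")\n'
    f"{FENCE}\n"
)

print(extract_code_block(response_text, "python"))  # print("Hello world!")
```

Returning `None` when no block is found is a design choice of this sketch; a real parser may prefer to raise an error so that the caller can ask the LLM to retry.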
Dictionary Type
Unlike strings and general JSON/Python objects, the dictionary is a particularly powerful format in LLM applications, so AgentScope provides additional post-processing functions for the dictionary type.
When initializing the parser, you can set the `keys_to_content`, `keys_to_memory`, and `keys_to_metadata` parameters to filter key-value pairs when calling the parser’s `to_content`, `to_memory`, and `to_metadata` methods.
`keys_to_content` specifies the key-value pairs placed in the `content` field of the returned `Msg` object. The content field is returned to other agents, takes part in their prompt construction, and is also displayed by the `self.speak` function.
`keys_to_memory` specifies the key-value pairs stored in the agent’s memory.
`keys_to_metadata` specifies the key-value pairs placed in the `metadata` field of the returned `Msg` object, which can be used for control-flow decisions in the application or to carry information that does not need to be returned to other agents.
Each of the three parameters accepts a bool, a string, or a list of strings, with the following meanings:
`False`: The corresponding filter function returns `None`.
`True`: The whole dictionary is returned.
`str`: The value of that key is returned directly.
`List[str]`: A dictionary filtered by the list of keys is returned.
By default, `keys_to_content` and `keys_to_memory` are `True`, that is, the whole dictionary is returned. `keys_to_metadata` defaults to `False`, that is, the corresponding filter function returns `None`.
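The four cases above can be summarized in a few lines of plain Python. This is a sketch of the described behavior only: `filter_by_keys` is a hypothetical helper written for illustration and is not part of AgentScope.

```python
from typing import Any, List, Union

KeysSpec = Union[bool, str, List[str]]


def filter_by_keys(parsed: dict, keys: KeysSpec) -> Any:
    """Hypothetical helper mirroring the bool / str / list rules described above."""
    if isinstance(keys, bool):
        # True returns the whole dictionary, False returns None.
        return parsed if keys else None
    if isinstance(keys, str):
        # A single key returns the bare value, not a dictionary.
        return parsed[keys]
    # A list of keys returns a filtered dictionary.
    return {k: v for k, v in parsed.items() if k in keys}


example = {"thought": "abc", "speak": "def", "finish_discussion": True}

print(filter_by_keys(example, True))                  # the whole dictionary
print(filter_by_keys(example, False))                 # None
print(filter_by_keys(example, "speak"))               # def
print(filter_by_keys(example, ["thought", "speak"]))  # {'thought': 'abc', 'speak': 'def'}
```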
For example, consider the dictionary generated by the werewolf during the daytime discussion in a werewolf game. In this example:
`"thought"` should not be returned to other agents, but should be stored in the agent’s memory to keep the werewolf’s strategy consistent.
`"speak"` should be returned to other agents and stored in the agent’s memory.
`"finish_discussion"` is used in the application’s control flow to determine whether the discussion has ended. To save tokens, this field should not be returned to other agents or stored in the agent’s memory.

    {
        "thought": "The others didn't realize I was a werewolf. I should end the discussion soon.",
        "speak": "I agree with you.",
        "finish_discussion": True
    }

In AgentScope, post-processing is done by calling the `to_content`, `to_memory`, and `to_metadata` methods, as shown in the following code:
In the application’s control flow, create the corresponding parser object and load it into the agent:

    from agentscope.parsers import MarkdownJsonDictParser

    # ...

    agent = DictDialogAgent(...)

    # Take MarkdownJsonDictParser as example
    parser = MarkdownJsonDictParser(
        content_hint={
            "thought": "what you thought",
            "speak": "what you speak",
            "finish_discussion": "whether the discussion is finished"
        },
        keys_to_content="speak",
        keys_to_memory=["thought", "speak"],
        keys_to_metadata=["finish_discussion"]
    )

    # Load parser, which is equivalent to specifying the required format
    agent.set_parser(parser)

    # The discussion process
    while True:
        # ...
        x = agent(x)
        # Break the loop according to the finish_discussion field in metadata
        if x.metadata["finish_discussion"]:
            break

Filter the dictionary in the agent’s `reply` function:

    # ...
        def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg:
            # ...
            res = self.model(prompt, parse_func=self.parser.parse)

            # Store the thought and speak fields into memory
            self.memory.add(
                Msg(
                    self.name,
                    content=self.parser.to_memory(res.parsed),
                    role="assistant",
                )
            )

            # Store in content and metadata fields in the returned Msg object
            msg = Msg(
                self.name,
                content=self.parser.to_content(res.parsed),
                role="assistant",
                metadata=self.parser.to_metadata(res.parsed),
            )
            self.speak(msg)

            return msg

Note: The `keys_to_content`, `keys_to_memory`, and `keys_to_metadata` parameters can each be a string, a list of strings, or a bool value.
For `True`, the `to_content`, `to_memory`, and `to_metadata` methods return the whole dictionary.
For `False`, the `to_content`, `to_memory`, and `to_metadata` methods return `None`.
For a string, the `to_content`, `to_memory`, and `to_metadata` methods extract the corresponding value directly. For example, if `keys_to_content="speak"`, the `to_content` method puts `res.parsed["speak"]` into the `content` field of the `Msg` object, and the `content` field is a string rather than a dictionary.
For a list of strings, the `to_content`, `to_memory`, and `to_metadata` methods filter the dictionary according to the list of keys.

    parser = MarkdownJsonDictParser(
        content_hint={
            "thought": "what you thought",
            "speak": "what you speak",
        },
        keys_to_content="speak",
        keys_to_memory=["thought", "speak"],
    )

    example_dict = {"thought": "abc", "speak": "def"}
    print(parser.to_content(example_dict))   # def
    print(parser.to_memory(example_dict))    # {'thought': 'abc', 'speak': 'def'}
    print(parser.to_metadata(example_dict))  # None

    def
    {'thought': 'abc', 'speak': 'def'}
    None

Parsers
For dictionary type return values, AgentScope provides multiple parsers for developers to choose from according to their needs.
RegexTaggedContentParser
Initialization
`RegexTaggedContentParser` is designed for scenarios where 1) the tag names are uncertain, and 2) the number of tags is uncertain.
In this case, the parser cannot provide a general format instruction, so developers need to provide the corresponding format instruction (`format_instruction`) when initializing the parser.
Alternatively, developers can handle the prompt engineering themselves.

    from agentscope.parsers import RegexTaggedContentParser

    parser = RegexTaggedContentParser(
        format_instruction="""Respond with specific tags as outlined below
    <thought>what you thought</thought>
    <speak>what you speak</speak>
    """,
        try_parse_json=True,  # Try to parse the content of the tag as JSON object
        required_keys=["thought", "speak"]  # Required keys in the returned dictionary
    )

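The idea of parsing an unknown number of tags with unknown names can be sketched with a single regular expression. The following is an illustration under stated assumptions, not the parser's actual implementation: the pattern, the `parse_tagged` function, and the `<vote>` tag are all hypothetical, while `try_parse_json` and `required_keys` only imitate the meaning of the parameters shown above.

```python
import json
import re
from typing import Any, Dict, Sequence

# Matches <name>content</name> for any tag name; the backreference
# (?P=name) forces the closing tag to match the opening one.
TAG_PATTERN = re.compile(
    r"<(?P<name>[^<>/]+)>(?P<content>.*?)</(?P=name)>", re.DOTALL
)


def parse_tagged(
    text: str,
    required_keys: Sequence[str] = (),
    try_parse_json: bool = True,
) -> Dict[str, Any]:
    result: Dict[str, Any] = {}
    for match in TAG_PATTERN.finditer(text):
        content: Any = match.group("content").strip()
        if try_parse_json:
            try:
                content = json.loads(content)
            except json.JSONDecodeError:
                pass  # not valid JSON: keep the raw string
        result[match.group("name")] = content
    missing = [key for key in required_keys if key not in result]
    if missing:
        raise ValueError(f"Missing required keys: {missing}")
    return result


text = "<thought>Stay quiet.</thought><speak>I agree with you.</speak><vote>3</vote>"
print(parse_tagged(text, required_keys=["thought", "speak"]))
# {'thought': 'Stay quiet.', 'speak': 'I agree with you.', 'vote': 3}
```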
MarkdownJsonDictParser
Initialization & Format Instruction Template
`MarkdownJsonDictParser` requires the LLM to generate a dictionary within a code block fenced by ```json and ```.
In addition to `keys_to_content`, `keys_to_memory`, and `keys_to_metadata`, the `content_hint` parameter can be provided to give an example and explanation of the expected response, that is, to tell the LLM where and what kind of dictionary to generate. This parameter can be a string or a dictionary. A dictionary is automatically converted to a string when the format instruction is constructed.

    from agentscope.parsers import MarkdownJsonDictParser

    # dictionary as content_hint
    MarkdownJsonDictParser(
        content_hint={
            "thought": "what you thought",
            "speak": "what you speak",
        }
    )

    # or string as content_hint
    MarkdownJsonDictParser(
        content_hint="""{
        "thought": "what you thought",
        "speak": "what you speak",
    }"""
    )

The corresponding `format_instruction` attribute:

    You should respond a json object in a json fenced code block as follows:
    ```json
    {content_hint}
    ```

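As an illustration of how a dictionary `content_hint` could end up in this template, here is a minimal sketch. The source does not say how the dictionary is converted to a string, so the use of `json.dumps` is an assumption, and `build_format_instruction` is a hypothetical helper rather than an AgentScope function.

```python
import json
from typing import Union

FENCE = "`" * 3  # three backticks, built in code to avoid nesting fences


def build_format_instruction(content_hint: Union[str, dict]) -> str:
    """Fill the template above; dictionaries are serialized first (assumed: json.dumps)."""
    if isinstance(content_hint, dict):
        content_hint = json.dumps(content_hint, indent=4)
    return (
        "You should respond a json object in a json fenced code block as follows:\n"
        f"{FENCE}json\n{content_hint}\n{FENCE}"
    )


instruction = build_format_instruction(
    {"thought": "what you thought", "speak": "what you speak"}
)
print(instruction)
```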
Validation
The `content_hint` parameter of `MarkdownJsonDictParser` also supports type validation based on Pydantic. When initializing, you can set `content_hint` to a Pydantic model class, and AgentScope will adapt the `format_instruction` attribute to this class. In addition, Pydantic is used to validate the dictionary returned by the LLM during parsing.
A simple example follows, where `...` can be replaced with specific validation rules; see the Pydantic documentation for details.

    from pydantic import BaseModel, Field

    from agentscope.parsers import MarkdownJsonDictParser


    class Schema(BaseModel):
        thought: str = Field(..., description="what you thought")
        speak: str = Field(..., description="what you speak")
        end_discussion: bool = Field(..., description="whether the discussion is finished")


    parser = MarkdownJsonDictParser(content_hint=Schema)

The corresponding `format_instruction` attribute:
Respond a JSON dictionary in a markdown's fenced code block as follows:
```json
{a_JSON_dictionary}
```
The generated JSON dictionary MUST follow this schema:
{'properties': {'thought': {'description': 'what you thought', 'title': 'Thought', 'type': 'string'}, 'speak': {'description': 'what you speak', 'title': 'Speak', 'type': 'string'}, 'end_discussion': {'description': 'whether the discussion is finished', 'title': 'End Discussion', 'type': 'boolean'}}, 'required': ['thought', 'speak', 'end_discussion'], 'title': 'Schema', 'type': 'object'}
During parsing, Pydantic is used for type validation, and an exception is raised if validation fails. Pydantic also provides some fault tolerance, such as converting the string `"true"` to Python’s `True`:

    parser.parse(
        ModelResponse(
            text="""
    ```json
    {
        "thought": "The others didn't realize I was a werewolf. I should end the discussion soon.",
        "speak": "I agree with you.",
        "end_discussion": "true"
    }
    ```
    """
        )
    )

MultiTaggedContentParser
`MultiTaggedContentParser` asks the LLM to generate specific content within multiple tag pairs. The content from different tag pairs is parsed into a single Python dictionary. Its usage is similar to that of `MarkdownJsonDictParser`, but it is initialized differently and is more suitable for weak LLMs or complex return content.
Initialization & Format Instruction Template
Within `MultiTaggedContentParser`, each tag pair is specified by a `TaggedContent` object, which contains:
Tag name (`name`): the key in the returned dictionary
Start tag (`tag_begin`)
Hint for content (`content_hint`)
End tag (`tag_end`)
Content parsing indication (`parse_json`): defaults to `False`. When set to `True`, the parser automatically adds a hint to the format instruction requiring a JSON object between the tags, and the extracted content is parsed into a Python object via `json.loads`.

    from agentscope.parsers import MultiTaggedContentParser, TaggedContent

    parser = MultiTaggedContentParser(
        TaggedContent(
            name="thought",
            tag_begin="[THOUGHT]",
            content_hint="what you thought",
            tag_end="[/THOUGHT]"
        ),
        TaggedContent(
            name="speak",
            tag_begin="[SPEAK]",
            content_hint="what you speak",
            tag_end="[/SPEAK]"
        ),
        TaggedContent(
            name="finish_discussion",
            tag_begin="[FINISH_DISCUSSION]",
            content_hint="true/false, whether the discussion is finished",
            tag_end="[/FINISH_DISCUSSION]",
            parse_json=True,  # we expect the content of this field to be parsed directly into a Python boolean value
        )
    )

    print(parser.format_instruction)

Respond with specific tags as outlined below, and the content between [FINISH_DISCUSSION] and [/FINISH_DISCUSSION] MUST be a JSON object:
[THOUGHT]what you thought[/THOUGHT]
[SPEAK]what you speak[/SPEAK]
[FINISH_DISCUSSION]true/false, whether the discussion is finished[/FINISH_DISCUSSION]
Parse Function
The parsing result of `MultiTaggedContentParser` is a dictionary whose keys are the values of `name` in the `TaggedContent` objects. The following is an example of parsing the LLM response in the werewolf game:

    res = parser.parse(
        ModelResponse(
            text="""As a werewolf, I should keep pretending to be a villager
    [THOUGHT]The others didn't realize I was a werewolf. I should end the discussion soon.[/THOUGHT]
    [SPEAK]I agree with you.[/SPEAK]
    [FINISH_DISCUSSION]true[/FINISH_DISCUSSION]"""
        )
    )

    print(res.parsed)

    {
        "thought": "The others didn't realize I was a werewolf. I should end the discussion soon.",
        "speak": "I agree with you.",
        "finish_discussion": True
    }

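The mapping from tag pairs to dictionary keys can be sketched in plain Python as follows. This is an illustration only, not the AgentScope implementation: the `Tag` dataclass is a reduced, hypothetical stand-in for `TaggedContent`, and `parse_multi_tagged` is a made-up helper.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Tag:
    """Hypothetical stand-in for TaggedContent, reduced to what parsing needs."""
    name: str
    tag_begin: str
    tag_end: str
    parse_json: bool = False


def parse_multi_tagged(text: str, tags: List[Tag]) -> Dict[str, Any]:
    result: Dict[str, Any] = {}
    for tag in tags:
        start = text.find(tag.tag_begin)
        end = text.find(tag.tag_end, start + len(tag.tag_begin))
        if start == -1 or end == -1:
            raise ValueError(f"Missing tag pair for '{tag.name}'")
        content = text[start + len(tag.tag_begin):end].strip()
        # Only fields flagged with parse_json go through json.loads.
        result[tag.name] = json.loads(content) if tag.parse_json else content
    return result


tags = [
    Tag("thought", "[THOUGHT]", "[/THOUGHT]"),
    Tag("speak", "[SPEAK]", "[/SPEAK]"),
    Tag("finish_discussion", "[FINISH_DISCUSSION]", "[/FINISH_DISCUSSION]", parse_json=True),
]
text = (
    "[THOUGHT]I should end the discussion soon.[/THOUGHT]\n"
    "[SPEAK]I agree with you.[/SPEAK]\n"
    "[FINISH_DISCUSSION]true[/FINISH_DISCUSSION]"
)
parsed = parse_multi_tagged(text, tags)
print(parsed["finish_discussion"])  # True
```

Because only the `finish_discussion` field is passed through `json.loads`, the free-text fields can contain quotes and newlines without any escaping.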
JSON / Python Object Type
MarkdownJsonObjectParser
`MarkdownJsonObjectParser` also uses the ```json and ``` tags in Markdown, but does not limit the content type. The content can be a list, dictionary, number, string, etc., which is parsed into a Python object via `json.loads`.
Initialization & Format Instruction Template

    from agentscope.parsers import MarkdownJsonObjectParser

    parser = MarkdownJsonObjectParser(
        content_hint="{A list of numbers.}"
    )

    print(parser.format_instruction)

You should respond a json object in a json fenced code block as follows:
```json
{A list of numbers.}
```
Parse Function

    res = parser.parse(
        ModelResponse(
            text="""Yes, here is the generated list
    ```json
    [1,2,3,4,5]
    ```
    """
        )
    )

    print(type(res.parsed))
    print(res.parsed)

<class 'list'>
[1, 2, 3, 4, 5]
Typical Use Cases
Werewolf Game
The werewolf game is a classic use case for dictionary parsers. In different stages of the game, the same agent needs to generate different identification fields in addition to `"thought"` and `"speak"`, such as whether the discussion is over, whether the seer uses its ability, whether the witch uses the antidote or the poison, and the vote.
AgentScope includes a built-in werewolf game example, which uses the `DictDialogAgent` class with different parsers to switch the target format flexibly. Using the post-processing functions of the parser, it separates “thought” from “speak” and controls the progress of the game.
More details can be found in the werewolf game source code.
ReAct Agent and Tool Usage
`ReActAgent` is an agent class built for tool usage in AgentScope. It is based on the ReAct algorithm and can be used with different tool functions. The tool calling and format parsing in `ReActAgent` are implemented similarly to the parser. For the detailed implementation, please refer to the source code.
Customized Parser
AgentScope provides a base class, `ParserBase`, for parsers. Developers can inherit from this base class and implement the `format_instruction` attribute and the `parse` method to create their own parser.
For dictionary parsing, you can also inherit from the `agentscope.parsers.DictFilterMixin` class to get post-processing for the dictionary type.

    from abc import ABC, abstractmethod

    from agentscope.models import ModelResponse


    class ParserBase(ABC):
        """The base class for model response parser."""

        format_instruction: str
        """The instruction for the response format."""

        @abstractmethod
        def parse(self, response: ModelResponse) -> ModelResponse:
            """Parse the response text to a specific object, and store it in the
            parsed field of the response object."""

        # ...

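As an illustration of the subclassing pattern, here is a minimal, self-contained sketch of a custom parser. So that it runs without AgentScope installed, `ModelResponse` and `ParserBase` are redefined as small stand-ins (an assumption: only the `text` and `parsed` fields are needed), and `CommaListParser` is a hypothetical parser invented for this example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ModelResponse:
    """Stand-in for agentscope.models.ModelResponse (assumed fields: text, parsed)."""
    text: str
    parsed: Optional[Any] = None


class ParserBase(ABC):
    """Stand-in mirroring the base class shown above."""
    format_instruction: str

    @abstractmethod
    def parse(self, response: ModelResponse) -> ModelResponse:
        ...


class CommaListParser(ParserBase):
    """Hypothetical parser that turns 'a, b, c' into ['a', 'b', 'c']."""
    format_instruction = "Respond with a single line of comma-separated items."

    def parse(self, response: ModelResponse) -> ModelResponse:
        items = [item.strip() for item in response.text.split(",")]
        # Store the result in the parsed field, dropping empty items.
        response.parsed = [item for item in items if item]
        return response


res = CommaListParser().parse(ModelResponse(text="villager, seer, witch"))
print(res.parsed)  # ['villager', 'seer', 'witch']
```

In a real application the subclass would inherit from the `ParserBase` provided by AgentScope instead of the stand-in defined here.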