agentscope.parsers package

Submodules

Module contents

Model response parser module.

class MarkdownCodeBlockParser(language_name: str, content_hint: str | None = None)

Bases: ParserBase

Base class for parsers that extract the content of a Markdown fenced code block from the response text.

parse(response: ModelResponse) -> ModelResponse

Extract the content between the tag_begin and tag_end in the response and store it in the parsed field of the response object.
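The extraction step described above can be sketched with plain string searches. This is a minimal illustration, not the library's implementation: the helper name extract_code_block and the choice to raise ValueError on a missing tag are assumptions made for the example.

```python
FENCE = "`" * 3  # three backticks, i.e. the Markdown fence marker


def extract_code_block(text: str, language_name: str) -> str:
    """Return the text between the opening fence tag and the next closing fence."""
    tag_begin = FENCE + language_name
    tag_end = FENCE

    start = text.find(tag_begin)
    if start == -1:
        raise ValueError(f"opening tag {tag_begin!r} not found")
    start += len(tag_begin)

    # Search for the closing tag only after the opening tag.
    end = text.find(tag_end, start)
    if end == -1:
        raise ValueError(f"closing tag {tag_end!r} not found")

    return text[start:end].strip()


reply = "Sure, here it is:\n" + FENCE + "python\nprint('hi')\n" + FENCE + "\nDone."
code = extract_code_block(reply, "python")
print(code)
```

Text outside the fenced block is ignored, so the model may add commentary around the block without affecting the extracted content.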

content_hint: str = '${{your_{language_name}_code}}'

The hint of the content.

format_instruction: str = 'You should generate {language_name} code in a {language_name} fenced code block as follows: \n```{language_name}\n{content_hint}\n```'

The instruction for the format of the code block.

name: str = '{language_name} block'

The name of the parser.

tag_begin: str = '```{language_name}'

The beginning tag.

tag_end: str = '```'

The ending tag.

class MarkdownJsonDictParser(content_hint: Any | None = None, required_keys: List[str] | None = None, keys_to_memory: str | bool | Sequence[str] = True, keys_to_content: str | bool | Sequence[str] = True, keys_to_metadata: str | bool | Sequence[str] = False)

Bases: MarkdownJsonObjectParser, DictFilterMixin

A parser for a JSON dictionary object in a Markdown fenced code block.

parse(response: ModelResponse) -> ModelResponse

Parse the text field of the response into a JSON dictionary object, store it in the parsed field of the response object, and check that all required keys exist.

content_hint: str = '{your_json_dictionary}'

The hint of the content.

property format_instruction: str

The format instruction for the JSON object. If a format_example is provided, it is used as the example.

name: str = 'json block'

The name of the parser.

required_keys: List[str]

A list of required keys in the JSON dictionary object. If the response is missing any of the required keys, a RequiredFieldNotFoundError is raised.
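The required-key check can be illustrated on an already parsed object. This is a sketch under stated assumptions: MissingRequiredKeysError is a hypothetical local stand-in for the library's RequiredFieldNotFoundError, and check_required_keys is not a library function.

```python
from typing import Any, Sequence


class MissingRequiredKeysError(Exception):
    """Hypothetical stand-in for the library's RequiredFieldNotFoundError."""


def check_required_keys(parsed: Any, required_keys: Sequence[str]) -> dict:
    """Validate that parsed is a dict containing every required key."""
    if not isinstance(parsed, dict):
        raise TypeError(f"expected a JSON dictionary, got {type(parsed).__name__}")

    missing = [key for key in required_keys if key not in parsed]
    if missing:
        raise MissingRequiredKeysError(f"missing required keys: {missing}")
    return parsed


ok = check_required_keys({"thought": "greet", "speak": "Hello"}, ["thought", "speak"])

try:
    check_required_keys({"thought": "greet"}, ["thought", "speak"])
except MissingRequiredKeysError as error:
    print(error)
```

Reporting all missing keys at once, rather than stopping at the first, makes it easier to ask the model for a corrected response in a single retry.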

tag_begin: str = '```json'

Opening tag for a code block.

tag_end: str = '```'

Closing tag for a code block.

class MarkdownJsonObjectParser(content_hint: Any | None = None)

Bases: ParserBase

A parser that converts the response text into a JSON object.

parse(response: ModelResponse) -> ModelResponse

Parse the response text into a JSON object and store it in the parsed field of the response object.
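As a rough illustration of this parsing step, the sketch below locates the json fenced block using the tag_begin and tag_end values documented for this class and decodes it with json.loads. The function name parse_json_block and the ValueError on missing tags are assumptions for the example, not the library's behavior.

```python
import json
from typing import Any

FENCE = "`" * 3  # three backticks
TAG_BEGIN = FENCE + "json"
TAG_END = FENCE


def parse_json_block(text: str) -> Any:
    """Decode the JSON value found inside the first json fenced block."""
    start = text.find(TAG_BEGIN)
    if start == -1:
        raise ValueError("opening json tag not found")
    start += len(TAG_BEGIN)

    end = text.find(TAG_END, start)
    if end == -1:
        raise ValueError("closing tag not found")

    # json.loads tolerates the surrounding newlines left by the fence.
    return json.loads(text[start:end])


reply = FENCE + 'json\n{"speak": "Hello", "scores": [1, 2]}\n' + FENCE
parsed = parse_json_block(reply)
print(parsed["speak"], parsed["scores"])
```

The decoded value is not restricted to dictionaries here; a fenced JSON array would be returned as a Python list.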

content_hint: str = '{your_json_object}'

The hint of the content.

property format_instruction: str

The format instruction for the JSON object. If a format_example is provided, it is used as the example.

name: str = 'json block'

The name of the parser.

tag_begin: str = '```json'

Opening tag for a code block.

tag_end: str = '```'

Closing tag for a code block.

class MultiTaggedContentParser(*tagged_contents: TaggedContent, keys_to_memory: str | bool | Sequence[str] | None = True, keys_to_content: str | bool | Sequence[str] | None = True, keys_to_metadata: str | bool | Sequence[str] | None = False, keys_allow_missing: List[str] | None = None)

Bases: ParserBase, DictFilterMixin

Parse the response text by multiple tags and return a dict of their contents. Asking an LLM to generate a JSON dictionary object directly may not be a good idea, because the output can involve escape characters and other issues. Instead, the LLM can be asked to generate text wrapped in tags, which is then parsed into the final JSON dictionary object.

parse(response: ModelResponse) -> ModelResponse

Parse the response text by tags and return a dict of their contents in the parsed field of the model response object. If a tagged content sets parse_json to True, its content is parsed as a JSON object with json.loads.

format_instruction = 'Respond with specific tags as outlined below{json_required_hint}\n{tag_lines_format}'

The instruction for the format of the tagged content.

json_required_hint = ', and the content between {} MUST be a JSON object:'

If any tagged content sets parse_json to True, this hint is added to the instruction to remind the model to generate a JSON object for it.

class ParserBase
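The tag-based approach can be sketched as follows. The Tag dataclass is a hypothetical stand-in that mirrors the fields of TaggedContent, the two templates are copied from the attributes above, and the way the hint is filled in build_instruction is an assumption rather than the library's exact logic.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class Tag:
    """Hypothetical stand-in mirroring the fields of TaggedContent."""
    name: str
    tag_begin: str
    content_hint: str
    tag_end: str
    parse_json: bool = False


def build_instruction(*tags: Tag) -> str:
    """Fill the two instruction templates listed above (assumed usage)."""
    tag_lines = "\n".join(t.tag_begin + t.content_hint + t.tag_end for t in tags)
    json_tags = [f"{t.tag_begin} and {t.tag_end}" for t in tags if t.parse_json]
    hint = ""
    if json_tags:
        hint = ", and the content between {} MUST be a JSON object:".format(
            ", ".join(json_tags)
        )
    return f"Respond with specific tags as outlined below{hint}\n{tag_lines}"


def parse_tagged(text: str, *tags: Tag) -> Dict[str, Any]:
    """Extract the content of each tag, decoding JSON where requested."""
    result: Dict[str, Any] = {}
    for t in tags:
        start = text.find(t.tag_begin)
        if start == -1:
            raise ValueError(f"missing tag {t.tag_begin!r}")
        start += len(t.tag_begin)
        end = text.find(t.tag_end, start)
        if end == -1:
            raise ValueError(f"missing tag {t.tag_end!r}")
        content = text[start:end].strip()
        result[t.name] = json.loads(content) if t.parse_json else content
    return result


thought = Tag("thought", "[THOUGHT]", "what you thought", "[/THOUGHT]")
finish = Tag("finish", "[FINISH]", "true or false", "[/FINISH]", parse_json=True)

instruction = build_instruction(thought, finish)
reply = "[THOUGHT]The user greeted me.[/THOUGHT]\n[FINISH]false[/FINISH]"
parsed = parse_tagged(reply, thought, finish)
print(parsed)
```

Only the content marked with parse_json is decoded, so free-form text such as the thought never has to be valid JSON.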

Bases: ABC

The base class for model response parsers.

abstract parse(response: ModelResponse) -> ModelResponse

Parse the response text into a specific object and store it in the parsed field of the response object.
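The contract of the abstract parse method can be illustrated with a toy subclass. Everything named here is hypothetical: Response is a minimal stand-in for ModelResponse with only a text and a parsed field, and Parser and UppercaseParser exist only for this sketch.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Response:
    """Hypothetical stand-in for ModelResponse (only two fields)."""
    text: str
    parsed: Optional[Any] = None


class Parser(ABC):
    """Hypothetical stand-in for ParserBase."""

    @abstractmethod
    def parse(self, response: Response) -> Response:
        """Read response.text, fill response.parsed, return the response."""


class UppercaseParser(Parser):
    """Toy parser whose parsed result is the upper-cased text."""

    def parse(self, response: Response) -> Response:
        response.parsed = response.text.upper()
        return response


result = UppercaseParser().parse(Response(text="hello"))
print(result.parsed)  # HELLO
```

Because parse is abstract, the base class itself cannot be instantiated; each concrete parser supplies its own conversion from text to a parsed object.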

class RegexTaggedContentParser(tagged_content_pattern: str = '<(?P<name>[^>]+)>(?P<content>.*?)</\\1?>', format_instruction: str | None = None, try_parse_json: bool = True, required_keys: List[str] | None = None, keys_to_memory: str | bool | Sequence[str] = True, keys_to_content: str | bool | Sequence[str] = True, keys_to_metadata: str | bool | Sequence[str] = False)

Bases: ParserBase, DictFilterMixin

A regex-based tagged content parser, which extracts tagged content according to the provided regex pattern. Unlike the other parsers, it can extract multiple tagged contents without knowing the keys in advance. The parsed result is a dictionary stored in the parsed field of the model response.

Compared with the other parsers, this parser is more flexible and can be used in dynamic scenarios where

- the keys are not known in advance
- the number of tagged contents is not fixed

Note: Without knowing the keys in advance, it is hard to prepare a format instruction template that fits different scenarios. Therefore, the user is asked to provide the format instruction in the constructor. Alternatively, users can construct and manage the prompt themselves.

Example

By default, the parser uses a regex pattern to extract tagged content in the following format:

```
<{name1}>{content1}</{name1}>
<{name2}>{content2}</{name2}>
```

The parser extracts the content as the following dictionary:

```
{
    "name1": content1,
    "name2": content2,
}
```

parse(response: ModelResponse) -> ModelResponse

Parse the response text by the regex pattern, and return a dict of the content in the parsed field of the response.

Parameters:

response (ModelResponse) – The response to be parsed.

Returns:

The response, with the parsed result stored in its parsed field.

Return type:

ModelResponse

property format_instruction: str

The format instruction for the tagged content.
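The default pattern from the constructor signature can be tried directly with the standard re module. The sketch below is an illustration, not the library's code: the re.DOTALL flag, the whitespace stripping, and the fallback to the raw string when json.loads fails are assumptions about how try_parse_json might behave.

```python
import json
import re
from typing import Any, Dict

# Default pattern from the signature above, written as a raw string.
DEFAULT_PATTERN = r"<(?P<name>[^>]+)>(?P<content>.*?)</\1?>"


def parse_by_regex(
    text: str,
    pattern: str = DEFAULT_PATTERN,
    try_parse_json: bool = True,
) -> Dict[str, Any]:
    """Collect every tagged span into a dict keyed by tag name."""
    result: Dict[str, Any] = {}
    # DOTALL lets the content span multiple lines (an assumption).
    for match in re.finditer(pattern, text, flags=re.DOTALL):
        raw = match.group("content").strip()
        value: Any = raw
        if try_parse_json:
            try:
                value = json.loads(raw)
            except json.JSONDecodeError:
                # Not valid JSON: keep the raw string.
                value = raw
        result[match.group("name")] = value
    return result


reply = '<speak>Hello there</speak>\n<count>3</count>\n<items>["a", "b"]</items>'
parsed = parse_by_regex(reply)
print(parsed)
```

No tag names are declared up front: the keys of the resulting dictionary come entirely from the tags that appear in the response.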

class TaggedContent(name: str, tag_begin: str, content_hint: str, tag_end: str, parse_json: bool = False)

Bases: object

A tagged content object to store the tag name, tag begin, content hint and tag end.

content_hint: str

The hint of the content.

name: str

The name of the tagged content, which is used as the key in the extracted dictionary.

parse_json: bool

Whether to parse the content as a JSON object.

tag_begin: str

The beginning tag.

tag_end: str

The ending tag.