agentscope.parsers.regex_tagged_content_parser

The parser for dynamic tagged content

class RegexTaggedContentParser(tagged_content_pattern: str = '<(?P<name>[^>]+)>(?P<content>.*?)</\\1?>', format_instruction: str | None = None, try_parse_json: bool = True, required_keys: List[str] | None = None, keys_to_memory: str | bool | Sequence[str] = True, keys_to_content: str | bool | Sequence[str] = True, keys_to_metadata: str | bool | Sequence[str] = False)

Bases: ParserBase, DictFilterMixin

A regex tagged content parser, which extracts tagged content according to the provided regex pattern. Unlike other parsers, it can extract multiple tagged items without knowing their keys in advance. The parsed result is a dictionary stored in the parsed field of the model response.

Compared with other parsers, this parser is more flexible and suits dynamic scenarios where:

- the keys are not known in advance
- the number of tagged items is not fixed

Note: Because the keys are not known in advance, it is hard to prepare a format instruction template that fits every scenario. The user is therefore asked to provide the format instruction in the constructor. Alternatively, the user can construct and manage the prompt themselves.
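As one illustration of a user-supplied format instruction, the sketch below builds an instruction string from a list of tag names. The helper `build_format_instruction` and the tag names are hypothetical and not part of AgentScope; only the `format_instruction` constructor argument comes from the signature above.

```python
from typing import Sequence


def build_format_instruction(tags: Sequence[str]) -> str:
    """Build a prompt fragment asking the model to wrap each field in tags.

    Hypothetical helper for illustration; not part of AgentScope.
    """
    lines = ["Respond using the following tagged format:"]
    for tag in tags:
        # One line per field, e.g. <thought>{thought}</thought>
        lines.append(f"<{tag}>{{{tag}}}</{tag}>")
    return "\n".join(lines)


instruction = build_format_instruction(["thought", "speak"])
print(instruction)
```

A string built this way could then be passed as the `format_instruction` argument of the constructor.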

Example

By default, the parser uses a regex pattern to extract tagged content in the following format:

```
<{name1}>{content1}</{name1}>
<{name2}>{content2}</{name2}>
```

The parser will extract the content as the following dictionary:

```
{
    "name1": content1,
    "name2": content2,
}
```
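To make the default pattern concrete, here is a minimal sketch that applies it with Python's standard `re` module. It is not the library's implementation: the function `extract_tagged` is a hypothetical name, and the use of `re.DOTALL` (so that content may span several lines) is an assumption of this sketch.

```python
import re
from typing import Dict

# Default pattern from the class signature, written as a raw string.
# `\1` refers back to the tag name, and `?` makes it optional, so both
# `</name>` and `</>` are accepted as closing tags.
DEFAULT_PATTERN = r"<(?P<name>[^>]+)>(?P<content>.*?)</\1?>"


def extract_tagged(text: str, pattern: str = DEFAULT_PATTERN) -> Dict[str, str]:
    """Collect every tagged span into a {name: content} dictionary."""
    return {
        match.group("name"): match.group("content")
        # re.DOTALL is an assumption here: it lets `.` match newlines.
        for match in re.finditer(pattern, text, flags=re.DOTALL)
    }


text = "<name1>hello</name1>\n<name2>42</name2>"
result = extract_tagged(text)
print(result)  # {'name1': 'hello', 'name2': '42'}
```

Because the keys come from the matched tag names, the same function handles any number of tags without configuration.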

parse(response: ModelResponse) → ModelResponse

Parse the response text with the regex pattern, and store the extracted content as a dict in the parsed field of the response.

Parameters: response (ModelResponse) – The response to be parsed.
Returns: The response with the parsed field as the parsed result.
Return type: ModelResponse
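The constructor arguments `try_parse_json` and `required_keys` are not described on this page. The sketch below shows one plausible reading of them using only the standard library: each extracted value is tried as JSON and left as a string if decoding fails, and missing required tags raise an error. The function `parse_tagged` and the choice of `ValueError` are hypothetical; the actual class may behave differently.

```python
import json
import re
from typing import Any, Dict, List, Optional

PATTERN = r"<(?P<name>[^>]+)>(?P<content>.*?)</\1?>"


def parse_tagged(
    text: str,
    try_parse_json: bool = True,
    required_keys: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Hypothetical stand-in for parse(); not the AgentScope implementation."""
    raw = {
        m.group("name"): m.group("content")
        for m in re.finditer(PATTERN, text, flags=re.DOTALL)
    }

    # Assumed behavior: fail early when a required tag is absent.
    missing = [key for key in (required_keys or []) if key not in raw]
    if missing:
        raise ValueError(f"Missing required tags: {missing}")

    if not try_parse_json:
        return raw

    parsed: Dict[str, Any] = {}
    for key, value in raw.items():
        try:
            parsed[key] = json.loads(value)
        except json.JSONDecodeError:
            # Assumed behavior: keep the raw string when it is not valid JSON.
            parsed[key] = value
    return parsed


reply = '<thought>I should greet</thought><score>7</score><tags>["a", "b"]</tags>'
parsed = parse_tagged(reply, required_keys=["thought"])
print(parsed)
```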

property format_instruction: str

The format instruction for the tagged content.