agentscope.parsers.regex_tagged_content_parser module

The parser for dynamic tagged content

class agentscope.parsers.regex_tagged_content_parser.RegexTaggedContentParser(tagged_content_pattern: str = '<(?P<name>[^>]+)>(?P<content>.*?)</\\1?>', format_instruction: str | None = None, try_parse_json: bool = True, required_keys: List[str] | None = None, keys_to_memory: str | bool | Sequence[str] = True, keys_to_content: str | bool | Sequence[str] = True, keys_to_metadata: str | bool | Sequence[str] = False) [source]

Bases: ParserBase, DictFilterMixin

A regex tagged content parser, which extracts tagged content according to the provided regex pattern. Unlike other parsers, it can extract multiple pieces of tagged content without knowing the keys in advance. The parsed result is a dictionary stored in the parsed field of the model response.

Compared with other parsers, this parser is more flexible and can be used in dynamic scenarios where

  • the keys are not known in advance

  • the number of tagged content items is not fixed

Note: Without knowing the keys in advance, it is hard to prepare a format instruction template that fits different scenarios. Therefore, the user is asked to provide the format instruction in the constructor. Alternatively, the user can construct and manage the prompt themselves.

Example

By default, the parser uses a regex pattern to extract tagged content in the following format:

```
<{name1}>{content1}</{name1}>
<{name2}>{content2}</{name2}>
```

The parser will extract the content as the following dictionary:

```
{
    "name1": content1,
    "name2": content2,
}
```
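
To make the mapping from tags to dictionary keys concrete, here is a minimal standard-library sketch of the extraction step. It is not the library's implementation: the helper name `extract_tagged_content`, the use of `re.DOTALL`, and the sample reply are assumptions made for illustration. Only the pattern itself is taken from the default in the signature.

```python
import json
import re

# Default pattern from the class signature: a named tag, lazy content,
# and a closing tag that repeats the name (or is left empty, "</>").
TAGGED_CONTENT_PATTERN = r"<(?P<name>[^>]+)>(?P<content>.*?)</\1?>"


def extract_tagged_content(text: str, try_parse_json: bool = True) -> dict:
    """Collect every <name>content</name> pair in `text` into a dict."""
    results = {}
    # re.DOTALL is an assumption of this sketch; it lets content span lines.
    for match in re.finditer(TAGGED_CONTENT_PATTERN, text, flags=re.DOTALL):
        content = match.group("content")
        if try_parse_json:
            try:
                content = json.loads(content)
            except json.JSONDecodeError:
                pass  # not valid JSON: keep the raw string, raise nothing
        results[match.group("name")] = content
    return results


# Hypothetical model reply with two tags that were not declared anywhere.
reply = '<thought>check the weather</thought> <cities>["Paris", "Rome"]</cities>'
print(extract_tagged_content(reply))
```

In this sketch the `thought` value stays a string because it is not valid JSON, while the `cities` value becomes a Python list.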

__init__(tagged_content_pattern: str = '<(?P<name>[^>]+)>(?P<content>.*?)</\\1?>', format_instruction: str | None = None, try_parse_json: bool = True, required_keys: List[str] | None = None, keys_to_memory: str | bool | Sequence[str] = True, keys_to_content: str | bool | Sequence[str] = True, keys_to_metadata: str | bool | Sequence[str] = False) → None [source]

Initialize the regex tagged content parser.

Parameters:
  • tagged_content_pattern (Optional[str], defaults to "<(?P<name>[^>]+)>(?P<content>.*?)</\\1?>") – The regex pattern to extract tagged content. The pattern should contain two named groups: name and content. The name group is used as the key of the tagged content, and the content group is used as the value.

  • format_instruction (Optional[str], defaults to None) – The instruction for the format of the tagged content, which will be attached to the end of the prompt messages to remind the LLM to follow the format.

  • try_parse_json (bool, defaults to True) – Whether to try to parse the tagged content as JSON. Note that the parsing function won’t raise exceptions.

  • required_keys (Optional[List[str]], defaults to None) – The keys that are required in the tagged content.

  • keys_to_memory (Union[str, bool, Sequence[str]], defaults to True) – The keys to save to memory.

  • keys_to_content (Union[str, bool, Sequence[str]], defaults to True) – The keys to save to content.

  • keys_to_metadata (Union[str, bool, Sequence[str]], defaults to False) – The key or keys to be filtered in the to_metadata method. If it’s

      - False, None will be returned in the to_metadata method
      - str, the corresponding value will be returned
      - List[str], a filtered dictionary will be returned
      - True, the whole dictionary will be returned
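
The four cases listed for keys_to_metadata can be sketched as a small standalone function. This is a hedged illustration only: `filter_keys` is a hypothetical helper rather than a method of DictFilterMixin, and the handling of missing keys (a `KeyError` for a single key, silent skipping for a sequence) is a choice of this sketch, not documented behaviour.

```python
from typing import Any, Optional, Sequence, Union


def filter_keys(
    parsed: dict,
    keys: Union[str, bool, Sequence[str]],
) -> Optional[Any]:
    """Apply the four documented cases to a parsed dictionary."""
    if isinstance(keys, bool):
        # True: the whole dictionary; False: None.
        return parsed if keys else None
    if isinstance(keys, str):
        # A single key: return the corresponding value directly.
        return parsed[keys]
    # A sequence of keys: return a filtered dictionary.
    return {key: parsed[key] for key in keys if key in parsed}


# Hypothetical parsed result used only to exercise the four cases.
parsed = {"speak": "Hi", "thought": "greet the user", "score": 3}
print(filter_keys(parsed, False))
print(filter_keys(parsed, "speak"))
print(filter_keys(parsed, ["speak", "score"]))
print(filter_keys(parsed, True))
```

The `bool` check comes first so that `True` and `False` are never treated as key names or sequences.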

property format_instruction: str

The format instruction for the tagged content.

parse(response: ModelResponse) → ModelResponse [source]

Parse the response text by the regex pattern, and return a dict of the content in the parsed field of the response.

Parameters:

response (ModelResponse) – The response to be parsed.

Returns:

The response, with its parsed field set to the parsed result.

Return type:

ModelResponse
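
The overall flow of parse, including a check for required keys, might look roughly like the following sketch. Everything here is a stand-in: `StubResponse` is a hypothetical two-field substitute for ModelResponse, `parse_response` is not the library method, and `ValueError` is a placeholder for whatever error the library raises when a required key is missing.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Default pattern from the class signature.
PATTERN = r"<(?P<name>[^>]+)>(?P<content>.*?)</\1?>"


@dataclass
class StubResponse:
    """Hypothetical stand-in for ModelResponse (text in, parsed out)."""

    text: str
    parsed: Optional[dict] = None


def parse_response(
    response: StubResponse,
    required_keys: Optional[List[str]] = None,
) -> StubResponse:
    """Fill `response.parsed` with the tagged content found in its text."""
    found = {
        match.group("name"): match.group("content")
        for match in re.finditer(PATTERN, response.text, flags=re.DOTALL)
    }
    # Reject the response if any required tag is absent.
    missing = [key for key in (required_keys or []) if key not in found]
    if missing:
        raise ValueError(f"Missing required tags: {missing}")
    response.parsed = found
    return response


result = parse_response(
    StubResponse(text="<speak>Hello</speak><thought>greet</thought>"),
    required_keys=["speak"],
)
print(result.parsed)
```

The same response object is returned with its parsed field populated, which mirrors the documented return value.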