agentscope.tool¶

The tool module in agentscope.

class Toolkit[source]¶

Bases: StateModule

Toolkit is the core module for registering, managing, and deleting tool functions, MCP clients, and agent skills in AgentScope.

About tool functions:

Register and parse JSON schemas from their docstrings automatically.
Group-wise tools management, and agentic tools activation/deactivation.
Extend the tool function JSON schema dynamically with Pydantic BaseModel.
Tool function execution with unified streaming interface.

About MCP clients:

Register tool functions from MCP clients directly.
Client-level tool functions removal.

About Agent skills:

Register agent skills from the given directory.
Provide prompt for the registered skills to the agent.

__init__(agent_skill_instruction=None, agent_skill_template=None)[source]¶

Initialize the toolkit.

Parameters:

agent_skill_instruction (str | None, optional) – The instruction for agent skills in the system prompt. If not provided, a default instruction will be used.
agent_skill_template (str | None, optional) – The template to present one agent skill in the system prompt, which should contain {name}, {description}, and {dir} placeholders. If not provided, a default template will be used.

Return type:

None

create_tool_group(group_name, description, active=False, notes=None)[source]¶

Create a tool group to organize tool functions.

Parameters:

group_name (str) – The name of the tool group.
description (str) – The description of the tool group.
active (bool, defaults to False) – Whether the group is active, i.e. whether the tool functions in this group are included in the JSON schema.
notes (str | None, optional) – The notes used to remind the agent how to use the tool functions properly, which can be combined into the system prompt.

Return type:

None

update_tool_groups(group_names, active)[source]¶

Update the activation status of the given tool groups.

Parameters:

group_names (list[str]) – The list of tool group names to be updated.
active (bool) – Whether the tool groups should be activated (True) or deactivated (False).

Return type:

None
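
The effect of group activation on the JSON schema can be pictured with a small stand-alone model. This is only a sketch: it does not import agentscope, and `visible_tools`, `tools`, and `groups` are hypothetical names. It assumes, as the parameter descriptions in this section state, that tools in the "basic" group are always included while other groups are included only when active.

```python
# Stand-alone model of group activation; it does not use agentscope itself.
# `tools`, `groups`, and `visible_tools` are hypothetical names.

def visible_tools(tools: dict[str, str], groups: dict[str, bool]) -> list[str]:
    """Return tool names whose group is "basic" or currently active."""
    return [
        name
        for name, group in tools.items()
        if group == "basic" or groups.get(group, False)
    ]


# Tool name -> group name
tools = {"view_text_file": "basic", "web_search": "browsing", "run_sql": "database"}
# Group name -> active flag (new groups default to inactive)
groups = {"browsing": False, "database": False}

print(visible_tools(tools, groups))  # only the "basic" tool is visible

# Mirrors the intent of update_tool_groups(["browsing"], active=True)
groups["browsing"] = True
print(visible_tools(tools, groups))
```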

remove_tool_groups(group_names)[source]¶

Remove tool functions from the toolkit by their group names.

Parameters:

group_names (str | list[str]) – The group names to be removed from the toolkit.

Return type:

None

register_tool_function(tool_func, group_name='basic', preset_kwargs=None, func_description=None, json_schema=None, include_long_description=True, include_var_positional=False, include_var_keyword=False, postprocess_func=None, namesake_strategy='raise')[source]¶

Parameters:

tool_func (ToolFunction) – The tool function, which can be async or sync, streaming or not-streaming, but the response must be a ToolResponse object.
group_name (str | Literal[“basic”], defaults to “basic”) – The group that the tool function belongs to. Tools in the “basic” group are always included in the JSON schema, while the others are only included when their group is active.
preset_kwargs (dict[str, JSONSerializableObject] | None, optional) – Preset arguments by the user, which will not be included in the JSON schema, nor exposed to the agent.
func_description (str | None, optional) – The function description. If not provided, the description will be extracted from the docstring automatically.
json_schema (dict | None, optional) – Manually provided JSON schema for the tool function, which should be {“type”: “function”, “function”: {“name”: “function_name”, “description”: “xx”, “parameters”: {…}}}
include_long_description (bool, defaults to True) – Whether to include the long description when extracting the function description from the docstring.
include_var_positional (bool, defaults to False) – Whether to include the variable positional arguments (*args) in the function schema.
include_var_keyword (bool, defaults to False) – Whether to include the variable keyword arguments (**kwargs) in the function schema.
postprocess_func ((Callable[[ToolUseBlock, ToolResponse], ToolResponse | None] | Callable[[ToolUseBlock, ToolResponse], Awaitable[ToolResponse | None]]) | None, optional) – A post-processing function that will be called after the tool function is executed, taking the tool call block and tool response as arguments. The function can be either sync or async. If it returns None, the tool result will be returned as is. If it returns a ToolResponse, the returned block will be used as the final tool result.
namesake_strategy (Literal[‘raise’, ‘override’, ‘skip’, ‘rename’], defaults to ‘raise’) –
The strategy to handle the tool function name conflict:
- ‘raise’: raise a ValueError (default behavior).
- ‘override’: override the existing tool function with the new one.
- ‘skip’: skip the registration of the new tool function.
- ‘rename’: rename the new tool function by appending a random suffix to make it unique.

Return type:

None
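
The four namesake strategies can be illustrated with a plain dictionary acting as the registry. This is a hedged sketch rather than the library's implementation: `register` and `registry` are hypothetical names, and the format of the random suffix used for 'rename' is an assumption, since the documentation only says a random suffix is appended.

```python
import secrets
from typing import Callable, Optional


def register(
    registry: dict,
    name: str,
    func: Callable,
    strategy: str = "raise",
) -> Optional[str]:
    """Store func under name; return the name used, or None if skipped."""
    if name in registry:
        if strategy == "raise":
            raise ValueError(f"A tool named '{name}' is already registered.")
        if strategy == "skip":
            return None
        if strategy == "rename":
            # Assumption: a short hex token stands in for the random suffix.
            base = name
            while name in registry:
                name = f"{base}_{secrets.token_hex(3)}"
        elif strategy != "override":
            raise ValueError(f"Unknown strategy: {strategy}")
    registry[name] = func
    return name


registry: dict = {}
register(registry, "search", len)                       # first registration
register(registry, "search", max, "override")           # replaces len with max
skipped = register(registry, "search", min, "skip")     # keeps max, returns None
renamed = register(registry, "search", min, "rename")   # stored under a new name
print(sorted(registry))
```

With the default 'raise' strategy, a second `register(registry, "search", ...)` call would raise a ValueError instead.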

remove_tool_function(tool_name, allow_not_exist=True)[source]¶

Remove tool function from the toolkit by its name.

Parameters:

tool_name (str) – The name of the tool function to be removed.
allow_not_exist (bool) – Allow the tool function to not exist when removing.

Return type:

None

get_json_schemas()[source]¶

Get the JSON schemas from the tool functions that belong to the active groups.

Note

The preset keyword arguments are removed from the JSON schema, and the extended model is applied if it is set.

Example

Example of tool function JSON schemas¶

[
    {
        "type": "function",
        "function": {
            "name": "google_search",
            "description": "Search on Google.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The search query."
                    }
                },
                "required": ["query"]
            }
        }
    },
    ...
]

Returns:

A list of function JSON schemas.

Return type:

list[dict]
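
The note about preset keyword arguments can be made concrete with a small sketch. It is not the library's code: `hide_preset_kwargs` is a hypothetical helper, and the `api_key` parameter and its value are invented for illustration. The sketch only shows the idea that a preset argument disappears from the schema the agent sees.

```python
import copy


def hide_preset_kwargs(schema: dict, preset_kwargs: dict) -> dict:
    """Return a copy of schema without the parameters fixed by preset_kwargs."""
    result = copy.deepcopy(schema)
    params = result["function"]["parameters"]
    for key in preset_kwargs:
        params.get("properties", {}).pop(key, None)
    if "required" in params:
        params["required"] = [
            key for key in params["required"] if key not in preset_kwargs
        ]
    return result


full_schema = {
    "type": "function",
    "function": {
        "name": "google_search",
        "description": "Search on Google.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "The search query."},
                "api_key": {"type": "string", "description": "The API key."},
            },
            "required": ["query", "api_key"],
        },
    },
}

# Hypothetical preset value supplied by the developer, never shown to the agent
agent_view = hide_preset_kwargs(full_schema, {"api_key": "sk-hypothetical"})
print(list(agent_view["function"]["parameters"]["properties"]))
```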

set_extended_model(func_name, model)[source]¶

Set the extended model for a tool function, so that the original JSON schema will be extended.

Parameters:

func_name (str) – The name of the tool function.
model (Union[Type[BaseModel], None]) – The extended model to be set.

Return type:

None

async remove_mcp_clients(client_names)[source]¶

Remove tool functions from the MCP clients by their names.

Parameters:

client_names (list[str]) – The names of the MCP clients, which were used to initialize the client instances.

Return type:

None

async call_tool_function(tool_call)[source]¶

Execute the tool function by the ToolUseBlock and return the tool response chunk in unified streaming mode, i.e. an async generator of ToolResponse objects.

Note

The tool response chunk is accumulated.

Parameters:

tool_call (ToolUseBlock) – A tool call block.

Yields:

ToolResponse – The tool response chunk, in an accumulative manner.

Return type:

AsyncGenerator[ToolResponse, None]
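
"Accumulated" means that each yielded chunk repeats everything produced so far, so a consumer can simply keep the latest chunk. The sketch below models this with a stand-alone async generator. `Chunk` is a hypothetical, text-only stand-in for ToolResponse, and `stream_words` and `consume` are illustrative names, not part of agentscope.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncGenerator


@dataclass
class Chunk:
    """Hypothetical stand-in for ToolResponse with a text-only payload."""

    text: str
    is_last: bool = True


async def stream_words(words: list[str]) -> AsyncGenerator[Chunk, None]:
    """Yield chunks accumulatively: each chunk repeats all earlier text."""
    so_far = ""
    for index, word in enumerate(words):
        so_far += word
        yield Chunk(text=so_far, is_last=(index == len(words) - 1))


async def consume() -> list[str]:
    """Collect the text of every chunk in arrival order."""
    seen = []
    async for chunk in stream_words(["Hello", ", ", "world"]):
        seen.append(chunk.text)
    return seen


seen = asyncio.run(consume())
print(seen)  # ['Hello', 'Hello, ', 'Hello, world']
```

Because the chunks are cumulative, the final chunk alone carries the complete result; concatenating the chunks would duplicate text.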

async register_mcp_client(mcp_client, group_name='basic', enable_funcs=None, disable_funcs=None, preset_kwargs_mapping=None, postprocess_func=None, namesake_strategy='raise')[source]¶

Parameters:

mcp_client (MCPClientBase) – The MCP client instance to connect to the MCP server.
group_name (str, defaults to “basic”) – The group name that the tool functions will be added to.
enable_funcs (list[str] | None, optional) – The functions to be added into the toolkit. If None, all tool functions within the MCP servers will be added.
disable_funcs (list[str] | None, optional) – The functions that will be filtered out. If None, no tool functions will be filtered out.
preset_kwargs_mapping (dict[str, dict[str, Any]] | None, defaults to None) – The preset keyword arguments mapping, whose keys are the tool function names and values are the preset keyword arguments.
postprocess_func ((Callable[[ToolUseBlock, ToolResponse], ToolResponse | None] | Callable[[ToolUseBlock, ToolResponse], Awaitable[ToolResponse | None]]) | None, optional) – A post-processing function that will be called after the tool function is executed, taking the tool call block and tool response as arguments. The function can be either sync or async. If it returns None, the tool result will be returned as is. If it returns a ToolResponse, the returned block will be used as the final tool result.
namesake_strategy (Literal[‘raise’, ‘override’, ‘skip’, ‘rename’], defaults to ‘raise’) –
The strategy to handle the tool function name conflict:
- ‘raise’: raise a ValueError (default behavior).
- ‘override’: override the existing tool function with the new one.
- ‘skip’: skip the registration of the new tool function.
- ‘rename’: rename the new tool function by appending a random suffix to make it unique.

Return type:

None
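
The contract of postprocess_func (sync or async, with None meaning "keep the original result") can be sketched as follows. This is an illustration under stated assumptions, not the library's dispatch code: `apply_postprocess`, `keep`, and `shout` are hypothetical names, and plain dicts and strings stand in for ToolUseBlock and ToolResponse.

```python
import asyncio
import inspect
from typing import Any, Callable, Optional


async def apply_postprocess(
    func: Optional[Callable],
    tool_call: dict,
    response: Any,
) -> Any:
    """Run an optional sync or async post-processor over a tool result."""
    if func is None:
        return response
    result = func(tool_call, response)
    if inspect.isawaitable(result):
        # Async post-processors return an awaitable that must be awaited.
        result = await result
    # Returning None means "use the tool result as is".
    return response if result is None else result


def keep(tool_call: dict, response: str) -> None:
    """Sync post-processor that leaves the result untouched."""
    return None


async def shout(tool_call: dict, response: str) -> str:
    """Async post-processor that replaces the result."""
    return response.upper()


original = asyncio.run(apply_postprocess(keep, {"name": "demo"}, "done"))
replaced = asyncio.run(apply_postprocess(shout, {"name": "demo"}, "done"))
print(original, replaced)  # done DONE
```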

state_dict()[source]¶

Get the state dictionary of the toolkit.

Returns:

A dictionary containing the active tool group names.

Return type:

dict[str, Any]

load_state_dict(state_dict, strict=True)[source]¶

Load the state dictionary into the toolkit.

Parameters:

state_dict (dict) – The state dictionary to load, which should have “active_groups” key and its value must be a list of group names.
strict (bool, defaults to True) – If True, raises an error if any key in the module is not found in the state_dict. If False, skips missing keys.

Return type:

None

get_activated_notes()[source]¶

Get the notes from the active tool groups, which can be used to construct the system prompt for the agent.

Returns:

The combined notes from the active tool groups.

Return type:

str

reset_equipped_tools(**kwargs)[source]¶

This function allows you to activate or deactivate tool groups dynamically based on your current task requirements. Important: Each call sets the absolute final state of ALL tool groups, not incremental changes. Any group not explicitly set to True will be deactivated, regardless of its previous state.

Best practice: actively manage your tool groups. Activate only what you need for the current task, and promptly deactivate groups as soon as they are no longer needed to conserve context space.

The function will return the usage instructions for the activated tool groups, which you MUST pay attention to and follow. You can also reuse this function to check the notes of the tool groups.

Parameters:

kwargs (Any)

Return type:

ToolResponse
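
The "absolute final state" rule is easy to misread, so here is a stand-alone sketch of it. It assumes that the keyword arguments are group names mapped to booleans, which the description suggests but the signature does not spell out. `reset_groups` and `state` are hypothetical names, and the real method additionally returns usage notes in a ToolResponse.

```python
def reset_groups(groups: dict[str, bool], **kwargs: bool) -> dict[str, bool]:
    """Return the new state: only groups explicitly passed as True are active."""
    return {name: kwargs.get(name, False) is True for name in groups}


state = {"browsing": True, "database": False, "plotting": True}

# Only "database" is requested, so the previously active groups are switched off.
state = reset_groups(state, database=True)
print(state)  # {'browsing': False, 'database': True, 'plotting': False}
```

To keep "browsing" active while enabling "database", both would have to be passed as True in the same call.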

clear()[source]¶

Clear the toolkit, removing all tool functions and groups.

Return type:

None

register_agent_skill(skill_dir)[source]¶

Register agent skills from a given directory. This function will scan the directory, read metadata from the SKILL.md file, and add it to the skill related prompt. Developers can obtain the skills-related prompt by calling toolkit.get_agent_skill_prompt().

Note

This directory:
- Must include a SKILL.md file at the top level.
- The SKILL.md must have a YAML Front Matter including name and description fields.
- All files must specify a common root directory in their paths.

Parameters:

skill_dir (str) – The path to the skill directory.

Return type:

None
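
A minimal skill directory that satisfies the requirements above might look like the one built below. The skill name, description, and body are invented for illustration. `read_front_matter` is a hypothetical helper that only handles flat `key: value` pairs; it is a simplification and makes no claim about how AgentScope actually parses the file.

```python
import tempfile
from pathlib import Path

# Hypothetical SKILL.md content: YAML front matter followed by instructions.
SKILL_MD = """---
name: pdf-report
description: Build a PDF report from tabular data.
---
# PDF report

Detailed instructions for the agent go here.
"""


def read_front_matter(skill_dir: str) -> dict[str, str]:
    """Parse flat `key: value` pairs from the front matter of SKILL.md."""
    text = (Path(skill_dir) / "SKILL.md").read_text(encoding="utf-8")
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with YAML front matter")
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter of the front matter
            break
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    missing = {"name", "description"} - meta.keys()
    if missing:
        raise ValueError(f"Missing front matter fields: {sorted(missing)}")
    return meta


with tempfile.TemporaryDirectory() as skill_dir:
    (Path(skill_dir) / "SKILL.md").write_text(SKILL_MD, encoding="utf-8")
    meta = read_front_matter(skill_dir)

print(meta)
```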

remove_agent_skill(name)[source]¶

Remove an agent skill by its name.

Parameters:

name (str) – The name of the agent skill to be removed.

Return type:

None

get_agent_skill_prompt()[source]¶

Get the prompt for all registered agent skills, which can be attached to the system prompt for the agent.

The prompt consists of an overall instruction and the detailed descriptions of each skill, including its name, description, and directory.

Note

If no skill is registered, None will be returned.

Returns:

The combined prompt for all registered agent skills, or None if no skill is registered.

Return type:

str | None
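
How an instruction and a per-skill template with {name}, {description}, and {dir} placeholders combine into one prompt can be sketched with `str.format`. The instruction text, the template, and the way entries are joined are all assumptions made for illustration; the library's default wording is not reproduced here, and `build_skill_prompt` is a hypothetical name.

```python
from typing import Optional


def build_skill_prompt(
    instruction: str,
    template: str,
    skills: list[dict[str, str]],
) -> Optional[str]:
    """Combine an instruction with one formatted entry per skill."""
    if not skills:
        return None  # mirrors the documented behavior when nothing is registered
    entries = [template.format(**skill) for skill in skills]
    return "\n".join([instruction, *entries])


# Hypothetical instruction and template; not the library defaults.
instruction = "Available skills:"
template = "- {name}: {description} (directory: {dir})"
skills = [
    {
        "name": "pdf-report",
        "description": "Build a PDF report.",
        "dir": "skills/pdf-report",
    },
]

prompt = build_skill_prompt(instruction, template, skills)
print(prompt)
```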

class ToolResponse[source]¶

Bases: object

The result chunk of a tool call.

content: List[TextBlock | ImageBlock | AudioBlock | VideoBlock]¶: The execution output of the tool function.

metadata: dict | None = None¶: The metadata to be accessed within the agent, so that we don’t need to parse the tool result block.

stream: bool = False¶: Whether the tool output is streamed.

__init__(content, metadata=None, stream=False, is_last=True, is_interrupted=False, id=<factory>)¶

Parameters:

content (List[TextBlock | ImageBlock | AudioBlock | VideoBlock])
metadata (dict | None)
stream (bool)
is_last (bool)
is_interrupted (bool)
id (str)

Return type:

None

is_last: bool = True¶: Whether this is the last response in a stream tool execution.

is_interrupted: bool = False¶: Whether the tool execution is interrupted.

id: str¶: The identity of the tool response.

async execute_python_code(code, timeout=300, **kwargs)[source]¶

Execute the given Python code in a temporary file and capture the return code, standard output, and standard error. Note that you must print the output to get the result, and that the temporary file is removed right after the execution.

Parameters:

code (str) – The Python code to be executed.
timeout (float, defaults to 300) – The maximum time (in seconds) allowed for the code to run.
kwargs (Any)

Returns:

The response containing the return code, standard output, and standard error of the executed code.

Return type:

ToolResponse

async execute_shell_command(command, timeout=300, **kwargs)[source]¶

Execute the given command and return the return code, standard output, and standard error within <returncode></returncode>, <stdout></stdout> and <stderr></stderr> tags.

Parameters:

command (str) – The shell command to execute.
timeout (float, defaults to 300) – The maximum time (in seconds) allowed for the command to run.
kwargs (Any)

Returns:

The tool response containing the return code, standard output, and standard error of the executed command.

Return type:

ToolResponse
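
The tagged output format can be approximated with the standard library alone. The sketch below is not the library's implementation: `run_shell` is a hypothetical name, it returns a plain string rather than a ToolResponse, and the return code and message used on timeout are assumptions.

```python
import asyncio


async def run_shell(command: str, timeout: float = 300) -> str:
    """Run a shell command and wrap its results in tags."""
    proc = await asyncio.create_subprocess_shell(
        command,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        stdout, stderr = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        # Assumption: report -1 and a short message when the command times out.
        proc.kill()
        await proc.wait()
        return (
            "<returncode>-1</returncode><stdout></stdout>"
            f"<stderr>Timed out after {timeout} seconds.</stderr>"
        )
    return (
        f"<returncode>{proc.returncode}</returncode>"
        f"<stdout>{stdout.decode(errors='replace')}</stdout>"
        f"<stderr>{stderr.decode(errors='replace')}</stderr>"
    )


result = asyncio.run(run_shell("echo hello", timeout=10))
print(result)
```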

async view_text_file(file_path, ranges=None)[source]¶

View the file content in the specified range with line numbers. If ranges is not provided, the entire file will be returned.

Parameters:

file_path (str) – The target file path.
ranges (list[int] | None) – The range of lines to be viewed (e.g. lines 1 to 100: [1, 100]), inclusive. If not provided, the entire file will be returned. To view the last 100 lines, use [-100, -1].

Returns:

The tool response containing the file content or an error message.

Return type:

ToolResponse
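
The ranges semantics (1-based, inclusive, with negative values counting from the end) can be checked against a small model. `select_lines` is a hypothetical helper operating on an in-memory list. The "number: text" output format and the exact handling of negative values are assumptions based on the [-100, -1] example above.

```python
from typing import Optional


def select_lines(lines: list[str], ranges: Optional[list[int]] = None) -> list[str]:
    """Return numbered lines for an inclusive, 1-based range."""
    total = len(lines)
    if ranges is None:
        start, end = 1, total
    else:
        start, end = ranges
        # Assumption: negative values count from the end, so -1 is the last line.
        if start < 0:
            start = total + start + 1
        if end < 0:
            end = total + end + 1
    start = max(start, 1)
    end = min(end, total)
    return [f"{number}: {lines[number - 1]}" for number in range(start, end + 1)]


rows = [f"row {i}" for i in range(1, 11)]
print(select_lines(rows, [2, 4]))    # ['2: row 2', '3: row 3', '4: row 4']
print(select_lines(rows, [-3, -1]))  # ['8: row 8', '9: row 9', '10: row 10']
```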

async write_text_file(file_path, content, ranges=None)[source]¶

Create/Replace/Overwrite content in a text file. When ranges is provided, the content will be replaced in the specified range. Otherwise, the entire file (if exists) will be overwritten.

Parameters:

file_path (str) – The target file path.
content (str) – The content to be written.
ranges (list[int] | None, defaults to None) – The range of lines to be replaced. If None, the entire file will be overwritten.

Returns:

The tool response containing the result of the writing operation.

Return type:

ToolResponse

async insert_text_file(file_path, content, line_number)[source]¶

Insert the content at the specified line number in a text file.

Parameters:

file_path (str) – The target file path.
content (str) – The content to be inserted.
line_number (int) – The line number at which the content should be inserted, starting from 1. If it exceeds the number of lines in the file, the content will be appended to the end of the file.

Returns:

The tool response containing the result of the insertion operation.

Return type:

ToolResponse
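
The line_number rule can be modeled on a list of lines. This sketch assumes that the inserted content ends up at the given line number, pushing the existing line down. That is one reasonable reading of the description, not a confirmed detail of the implementation. `insert_at` is a hypothetical name.

```python
def insert_at(lines: list[str], content: str, line_number: int) -> list[str]:
    """Insert content so that it becomes line `line_number` (1-based)."""
    if line_number < 1:
        raise ValueError("line_number starts from 1")
    # Positions past the end of the file fall back to appending.
    index = min(line_number - 1, len(lines))
    return lines[:index] + [content] + lines[index:]


original = ["a", "b", "c"]
print(insert_at(original, "X", 2))   # ['a', 'X', 'b', 'c']
print(insert_at(original, "X", 99))  # ['a', 'b', 'c', 'X']
```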

dashscope_text_to_image(prompt, api_key, n=1, size='1024*1024', model='wanx-v1', use_base64=False)[source]¶

Generate image(s) based on the given prompt, and return image url(s) or base64 data.

Parameters:

prompt (str) – The text prompt to generate image.
api_key (str) – The api key for the dashscope api.
n (int, defaults to 1) – The number of images to generate.
size (Literal[“1024*1024”, “720*1280”, “1280*720”], defaults to “1024*1024”) – Size of the image.
model (str, defaults to “wanx-v1”) – The model to use, such as “wanx-v1”, “qwen-image”, “wan2.2-t2i-flash”, etc.
use_base64 (bool, defaults to False) – Whether to use base64 data for images.

Returns:

A ToolResponse containing the generated content (ImageBlock/TextBlock/AudioBlock) or error information if the operation failed.

Return type:

ToolResponse

dashscope_text_to_audio(text, api_key, model='sambert-zhichu-v1', sample_rate=48000)[source]¶

Convert the given text to audio.

Parameters:

text (str) – The text to be converted into audio.
api_key (str) – The api key for the dashscope API.
model (str, defaults to ‘sambert-zhichu-v1’) – The model to use. Full model list can be found in the official document.
sample_rate (int, defaults to 48000) – Sample rate of the audio.

Returns:

A ToolResponse containing the generated content (ImageBlock/TextBlock/AudioBlock) or error information if the operation failed.

Return type:

ToolResponse

dashscope_image_to_text(image_urls, api_key, prompt='Describe the image', model='qwen-vl-plus')[source]¶

Generate text based on the given images.

Parameters:

image_urls (str | Sequence[str]) – The url of single or multiple images.
api_key (str) – The api key for the dashscope api.
prompt (str, defaults to ‘Describe the image’) – The text prompt.
model (str, defaults to ‘qwen-vl-plus’) – The model to use in DashScope MultiModal API.

Returns:

A ToolResponse containing the generated content (ImageBlock/TextBlock/AudioBlock) or error information if the operation failed.

Return type:

ToolResponse

openai_text_to_image(prompt, api_key, n=1, model='dall-e-2', size='256x256', quality='auto', style='vivid', response_format='url')[source]¶

Generate image(s) based on the given prompt, and return image URL(s) or base64 data.

Parameters:

prompt (str) – The text prompt to generate images.
api_key (str) – The API key for the OpenAI API.
n (int, defaults to 1) – The number of images to generate.
model (Literal[“dall-e-2”, “dall-e-3”], defaults to “dall-e-2”) – The model to use for image generation.
size (Literal[“256x256”, “512x512”, “1024x1024”, “1792x1024”, “1024x1792”], defaults to “256x256”) – The size of the generated images. Must be one of 1024x1024, 1536x1024 (landscape), 1024x1536 (portrait), or auto (default value) for gpt-image-1; one of 256x256, 512x512, or 1024x1024 for dall-e-2; and one of 1024x1024, 1792x1024, or 1024x1792 for dall-e-3.
quality (Literal[“auto”, “standard”, “hd”, “high”, “medium”, “low”], defaults to “auto”) –
The quality of the image that will be generated.
- auto (default value) will automatically select the best quality for the given model.
- high, medium and low are supported for gpt-image-1.
- hd and standard are supported for dall-e-3.
- standard is the only option for dall-e-2.
style (Literal[“vivid”, “natural”], defaults to “vivid”) –
The style of the generated images. This parameter is only supported for dall-e-3. Must be one of vivid or natural.
- Vivid causes the model to lean towards generating hyper-real and dramatic images.
- Natural causes the model to produce more natural, less hyper-real looking images.
response_format (Literal[“url”, “b64_json”], defaults to “url”) –
The format in which generated images with dall-e-2 and dall-e-3 are returned.
- Must be one of “url” or “b64_json”.
- URLs are only valid for 60 minutes after the image has been generated.
- This parameter isn’t supported for gpt-image-1 which will always return base64-encoded images.

Returns:

A ToolResponse containing the generated content (ImageBlock/TextBlock/AudioBlock) or error information if the operation failed.

Return type:

ToolResponse

openai_text_to_audio(text, api_key, model='tts-1', voice='alloy', speed=1.0, res_format='mp3')[source]¶

Convert text to an audio file using a specified model and voice.

Parameters:

text (str) – The text to convert to audio.
api_key (str) – The API key for the OpenAI API.
model (Literal[“tts-1”, “tts-1-hd”], defaults to “tts-1”) – The model to use for text-to-speech conversion.
voice (Literal[“alloy”, “echo”, “fable”, “onyx”, “nova”, “shimmer”], defaults to “alloy”) – The voice to use for the audio output.
speed (float, defaults to 1.0) – The speed of the audio playback. A value of 1.0 is normal speed.
res_format (Literal[“mp3”, “wav”, “opus”, “aac”, “flac”, “pcm”], defaults to “mp3”) – The format of the audio file.

Returns:

A ToolResponse containing the generated content (ImageBlock/TextBlock/AudioBlock) or error information if the operation failed.

Return type:

ToolResponse

openai_edit_image(image_url, prompt, api_key, model='dall-e-2', mask_url=None, n=1, size='256x256', response_format='url')[source]¶

Edit an image based on the provided mask and prompt, and return the edited image URL(s) or base64 data.

Parameters:

image_url (str) – The file path or URL to the image that needs editing.
prompt (str) – The text prompt describing the edits to be made to the image.
api_key (str) – The API key for the OpenAI API.
model (Literal[“dall-e-2”, “gpt-image-1”], defaults to “dall-e-2”) – The model to use for image generation.
mask_url (str | None, defaults to None) – The file path or URL to the mask image that specifies the regions to be edited.
n (int, defaults to 1) – The number of edited images to generate.
size (Literal[“256x256”, “512x512”, “1024x1024”], defaults to “256x256”) – The size of the edited images.
response_format (Literal[“url”, “b64_json”], defaults to “url”) –
The format in which generated images are returned.
- Must be one of “url” or “b64_json”.
- URLs are only valid for 60 minutes after generation.
- This parameter isn’t supported for gpt-image-1 which will always return base64-encoded images.

Returns:

A ToolResponse containing the generated content (ImageBlock/TextBlock/AudioBlock) or error information if the operation failed.

Return type:

ToolResponse

openai_create_image_variation(image_url, api_key, n=1, model='dall-e-2', size='256x256', response_format='url')[source]¶

Create variations of an image and return the image URL(s) or base64 data.

Parameters:

image_url (str) – The file path or URL to the image from which variations will be generated.
api_key (str) – The API key for the OpenAI API.
n (int, defaults to 1) – The number of image variations to generate.
model (Literal[“dall-e-2”], defaults to “dall-e-2”) – The model to use for image variation.
size (Literal[“256x256”, “512x512”, “1024x1024”], defaults to “256x256”) – The size of the generated image variations.
response_format (Literal[“url”, “b64_json”], defaults to “url”) –
The format in which generated images are returned.
- Must be one of url or b64_json.
- URLs are only valid for 60 minutes after the image has been generated.

Returns:

A ToolResponse containing the generated content (ImageBlock/TextBlock/AudioBlock) or error information if the operation failed.

Return type:

ToolResponse

openai_image_to_text(image_urls, api_key, prompt='Describe the image', model='gpt-4o')[source]¶

Generate descriptive text for given image(s) using a specified model, and return the generated text.

Parameters:

image_urls (str | list[str]) – The URL or list of URLs pointing to the images that need to be described.
api_key (str) – The API key for the OpenAI API.
prompt (str, defaults to “Describe the image”) – The prompt that instructs the model on how to describe the image(s).
model (str, defaults to “gpt-4o”) – The model to use for generating the text descriptions.

Returns:

A ToolResponse containing the generated content (ImageBlock/TextBlock/AudioBlock) or error information if the operation failed.

Return type:

ToolResponse

openai_audio_to_text(audio_file_url, api_key, language='en', temperature=0.2)[source]¶

Convert an audio file to text using OpenAI’s transcription service.

Parameters:

audio_file_url (str) – The file path or URL to the audio file that needs to be transcribed.
api_key (str) – The API key for the OpenAI API.
language (str, defaults to “en”) – The language of the input audio in ISO-639-1 format (e.g., “en”, “zh”, “fr”). Improves accuracy and latency.
temperature (float, defaults to 0.2) – The temperature for the transcription, which affects the randomness of the output.

Returns:

A ToolResponse containing the generated content (ImageBlock/TextBlock/AudioBlock) or error information if the operation failed.

Return type:

ToolResponse