agentscope.model

The model module.

class ChatModelBase[source]

Bases: object

Base class for chat models.

__init__(model_name, stream)[source]

Initialize the chat model base class.

Parameters:
  • model_name (str) – The name of the model

  • stream (bool) – Whether the model output is streaming or not

Return type:

None

model_name: str

The model name

stream: bool

Whether the model output is streaming or not

abstract async __call__(*args, **kwargs)[source]

Call the chat model with the given arguments. Subclasses must implement this method.

Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

ChatResponse | AsyncGenerator[ChatResponse, None]
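
A concrete backend subclasses ChatModelBase, sets model_name and stream via super().__init__, and implements the async __call__. A minimal sketch, assuming a toy non-streaming EchoChatModel (the class name and its echo behavior are illustrative, not part of the library):

Example of subclassing ChatModelBase
from typing import Any

from agentscope.model import ChatModelBase, ChatResponse


class EchoChatModel(ChatModelBase):
    """A toy non-streaming backend that echoes the last message."""

    def __init__(self) -> None:
        super().__init__(model_name="echo", stream=False)

    async def __call__(self, *args: Any, **kwargs: Any) -> ChatResponse:
        messages = kwargs.get("messages") or (args[0] if args else [])
        last = messages[-1]["content"] if messages else ""
        # content is a sequence of blocks; a dict with the TextBlock
        # shape ("type"/"text") suffices for illustration.
        return ChatResponse(content=[{"type": "text", "text": last}])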

class ChatResponse[source]

Bases: DictMixin

The response of chat models.

content: Sequence[TextBlock | ToolUseBlock | ThinkingBlock]

The content of the chat response, which can include text blocks, tool use blocks, or thinking blocks.

id: str

The unique identifier of the chat response

__init__(content, id=<factory>, created_at=<factory>, type=<factory>, usage=<factory>)

Parameters:
  • content (Sequence[TextBlock | ToolUseBlock | ThinkingBlock])

  • id (str)

  • created_at (str)

  • type (Literal['chat'])

  • usage (ChatUsage | None)

Return type:

None

created_at: str

When the response was created

type: Literal['chat']

The type of the response, which is always ‘chat’.

usage: ChatUsage | None

The usage information of the chat response, if available.
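
A minimal sketch of constructing and reading a ChatResponse by hand (the plain dict stands in for a TextBlock; in practice the blocks are produced by a model backend):

Example of ChatResponse
from agentscope.model import ChatResponse

response = ChatResponse(content=[{"type": "text", "text": "Hello!"}])

# id, created_at, type and usage are filled in by their default factories.
print(response.type)  # "chat"
for block in response.content:
    if block["type"] == "text":
        print(block["text"])  # "Hello!"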

class DashScopeChatModel[source]

Bases: ChatModelBase

The DashScope chat model class, which unifies the Generation and MultimodalConversation APIs into one method.

__init__(model_name, api_key, stream=True, enable_thinking=None, generate_kwargs=None)[source]

Initialize the DashScope chat model.

Parameters:
  • model_name (str) – The model name.

  • api_key (str) – The dashscope API key.

  • stream (bool) – Whether to use streaming output or not.

  • enable_thinking (bool | None, optional) – Whether to enable thinking; only supported by Qwen3, QwQ and DeepSeek-R1. Refer to the DashScope documentation for more details.

  • generate_kwargs (dict[str, JSONSerializableObject] | None, optional) – The extra keyword arguments used in DashScope API generation, e.g. temperature, seed.

Return type:

None

async __call__(messages, tools=None, tool_choice=None, **kwargs)[source]

Get the response from the DashScope Generation/MultimodalConversation API by the given arguments.

Note

We unify the DashScope Generation and MultimodalConversation APIs into one method, since they accept similar arguments and share the same functionality.

Parameters:
  • messages (list[dict[str, Any]]) – A list of dictionaries, where role and content fields are required.

  • tools (list[dict] | None, default None) – The tools JSON schemas that the model can use.

  • tool_choice (Literal[“auto”, “none”, “any”, “required”] | str | None, default None) –

    Controls which (if any) tool is called by the model.

    Can be “auto”, “none”, or a specific tool name. For more details, please refer to https://help.aliyun.com/zh/model-studio/qwen-function-calling

  • **kwargs (Any) –

    The keyword arguments for DashScope chat completions API, e.g. temperature, max_tokens, top_p, etc. Please refer to DashScope documentation for more detailed arguments.

Return type:

ChatResponse | AsyncGenerator[ChatResponse, None]
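
A usage sketch (the model name and API key are placeholders; with stream=True the awaited call yields ChatResponse chunks, with stream=False it returns a single ChatResponse):

Example of DashScopeChatModel
import asyncio

from agentscope.model import DashScopeChatModel


async def main() -> None:
    model = DashScopeChatModel(
        model_name="qwen-max",         # placeholder model name
        api_key="YOUR_DASHSCOPE_KEY",  # placeholder API key
        stream=True,
    )
    # Streaming: iterate over the async generator of ChatResponse chunks.
    async for chunk in await model(
        messages=[{"role": "user", "content": "Hi!"}],
    ):
        print(chunk.content)


asyncio.run(main())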

class OpenAIChatModel[source]

Bases: ChatModelBase

The OpenAI chat model class.

__init__(model_name, api_key=None, stream=True, reasoning_effort=None, organization=None, client_args=None, generate_kwargs=None)[source]

Initialize the OpenAI client.

Parameters:
  • model_name (str) – The name of the model to use in the OpenAI API.

  • api_key (str, default None) – The API key for OpenAI API. If not specified, it will be read from the environment variable OPENAI_API_KEY.

  • stream (bool, default True) – Whether to use streaming output or not.

  • reasoning_effort (Literal[“low”, “medium”, “high”] | None, optional) – Reasoning effort, supported for o3, o4, etc. Please refer to OpenAI documentation for more details.

  • organization (str, default None) – The organization ID for OpenAI API. If not specified, it will be read from the environment variable OPENAI_ORGANIZATION.

  • client_args (dict, default None) – The extra keyword arguments to initialize the OpenAI client.

  • generate_kwargs (dict[str, JSONSerializableObject] | None, optional) – The extra keyword arguments used in OpenAI API generation, e.g. temperature, seed.

Return type:

None

async __call__(messages, tools=None, tool_choice=None, **kwargs)[source]

Get the response from OpenAI chat completions API by the given arguments.

Parameters:
  • messages (list[dict]) – A list of dictionaries, where role and content fields are required, and name field is optional.

  • tools (list[dict], default None) – The tools JSON schemas that the model can use.

  • tool_choice (Literal[“auto”, “none”, “any”, “required”] | str | None, default None) –

    Controls which (if any) tool is called by the model.

    Can be “auto”, “none”, “any”, “required”, or specific tool name. For more details, please refer to https://platform.openai.com/docs/api-reference/responses/create#responses_create-tool_choice

  • **kwargs (Any) – The keyword arguments for OpenAI chat completions API, e.g. temperature, max_tokens, top_p, etc. Please refer to the OpenAI API documentation for more details.

Returns:

The response from the OpenAI chat completions API.

Return type:

ChatResponse | AsyncGenerator[ChatResponse, None]
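
A usage sketch with a tool schema (the get_weather tool is illustrative; api_key is omitted, so it falls back to the OPENAI_API_KEY environment variable as documented above):

Example of OpenAIChatModel
import asyncio

from agentscope.model import OpenAIChatModel


async def main() -> None:
    # api_key omitted: read from the OPENAI_API_KEY environment variable.
    model = OpenAIChatModel(model_name="gpt-4o", stream=False)

    tools = [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # illustrative tool
                "description": "Get the current weather of a city.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "city": {"type": "string"},
                    },
                    "required": ["city"],
                },
            },
        },
    ]

    response = await model(
        messages=[{"role": "user", "content": "Weather in Paris?"}],
        tools=tools,
        tool_choice="auto",
    )
    # Expect a ToolUseBlock (or a TextBlock) in response.content.
    print(response.content)


asyncio.run(main())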

class AnthropicChatModel[source]

Bases: ChatModelBase

The Anthropic model wrapper for AgentScope.

__init__(model_name, api_key=None, max_tokens=2048, stream=True, thinking=None, client_args=None, generate_kwargs=None)[source]

Initialize the Anthropic chat model.

Parameters:
  • model_name (str) – The model name.

  • api_key (str) – The Anthropic API key.

  • stream (bool) – Whether to use streaming output or not.

  • max_tokens (int) – Limit the maximum token count the model can generate.

  • thinking (dict | None, default None) –

    Configuration for Claude’s internal reasoning process.

    Example of thinking
    {
        "type": "enabled" | "disabled",
        "budget_tokens": 1024
    }
    

  • client_args (dict | None, optional) – The extra keyword arguments to initialize the Anthropic client.

  • generate_kwargs (dict[str, JSONSerializableObject] | None, optional) – The extra keyword arguments used in Anthropic API generation, e.g. temperature, top_p.

Return type:

None

async __call__(messages, tools=None, tool_choice=None, **generate_kwargs)[source]

Get the response from Anthropic chat completions API by the given arguments.

Parameters:
  • messages (list[dict]) – A list of dictionaries, where role and content fields are required, and name field is optional.

  • tools (list[dict], default None) –

    The tools JSON schemas in the following format:

    Example of tools JSON schemas
    [
        {
            "type": "function",
            "function": {
                "name": "xxx",
                "description": "xxx",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "param1": {
                            "type": "string",
                            "description": "..."
                        },
                        # Add more parameters as needed
                    },
                    "required": ["param1"]
                }
            }
        },
        # More schemas here
    ]

  • tool_choice (Literal[“auto”, “none”, “any”, “required”] | str | None, default None) –

    Controls which (if any) tool is called by the model.

    Can be “auto”, “none”, “any”, “required”, or specific tool name. For more details, please refer to https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/implement-tool-use

  • **generate_kwargs (Any) – The keyword arguments for Anthropic chat completions API, e.g. temperature, top_p, etc. Please refer to the Anthropic API documentation for more details.

Returns:

The response from the Anthropic chat completions API.

Return type:

ChatResponse | AsyncGenerator[ChatResponse, None]
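
A usage sketch with extended thinking enabled (the model name, API key and token budget are placeholders; the thinking dict follows the shape documented above):

Example of AnthropicChatModel
import asyncio

from agentscope.model import AnthropicChatModel


async def main() -> None:
    model = AnthropicChatModel(
        model_name="claude-sonnet-4-0",  # placeholder model name
        api_key="YOUR_ANTHROPIC_KEY",    # placeholder API key
        max_tokens=2048,
        stream=False,
        thinking={"type": "enabled", "budget_tokens": 1024},
    )
    response = await model(
        messages=[{"role": "user", "content": "Why is the sky blue?"}],
    )
    # Thinking and text blocks arrive together in response.content.
    for block in response.content:
        print(block)


asyncio.run(main())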

class OllamaChatModel[source]

Bases: ChatModelBase

The Ollama chat model class in agentscope.

__init__(model_name, stream=False, options=None, keep_alive='5m', enable_thinking=None, host=None, **kwargs)[source]

Initialize the Ollama chat model.

Parameters:
  • model_name (str) – The name of the model.

  • stream (bool, default False) – Whether to use streaming output or not.

  • options (dict, default None) – Additional parameters to pass to the Ollama API. These can include temperature etc.

  • keep_alive (str, default “5m”) – Duration to keep the model loaded in memory. The format is a number followed by a unit suffix (s for seconds, m for minutes, h for hours).

  • enable_thinking (bool | None, default None) – Whether to enable thinking, only for models such as qwen3, deepseek-r1, etc. For more details, please refer to https://ollama.com/search?c=thinking

  • host (str | None, default None) – The host address of the Ollama server. If None, uses the default address (typically http://localhost:11434).

  • **kwargs (Any) – Additional keyword arguments to pass to the base chat model class.

Return type:

None

async __call__(messages, tools=None, tool_choice=None, **kwargs)[source]

Get the response from Ollama chat completions API by the given arguments.

Parameters:
  • messages (list[dict]) – A list of dictionaries, where role and content fields are required, and name field is optional.

  • tools (list[dict], default None) – The tools JSON schemas that the model can use.

  • tool_choice (Literal[“auto”, “none”, “any”, “required”] | str | None, default None) –

    Controls which (if any) tool is called by the model.

    Can be “auto”, “none”, “any”, “required”, or specific tool name.

  • **kwargs (Any) – The keyword arguments for the Ollama chat completions API, e.g. think, etc. Please refer to the Ollama API documentation for more details.

Returns:

The response from the Ollama chat completions API.

Return type:

ChatResponse | AsyncGenerator[ChatResponse, None]
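
A usage sketch against a local Ollama server (the model name is a placeholder for any locally pulled model; host is omitted, so the default address http://localhost:11434 is used):

Example of OllamaChatModel
import asyncio

from agentscope.model import OllamaChatModel


async def main() -> None:
    model = OllamaChatModel(
        model_name="qwen3",            # placeholder: any pulled model works
        stream=False,
        options={"temperature": 0.7},  # passed through to the Ollama API
        keep_alive="10m",              # keep the model loaded for 10 minutes
    )
    response = await model(
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.content)


asyncio.run(main())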

class GeminiChatModel[source]

Bases: ChatModelBase

The Google Gemini chat model class in agentscope.

__init__(model_name, api_key, stream=True, thinking_config=None, client_args=None, generate_kwargs=None)[source]

Initialize the Gemini chat model.

Parameters:
  • model_name (str) – The name of the Gemini model to use, e.g. “gemini-2.5-flash”.

  • api_key (str) – The API key for Google Gemini.

  • stream (bool, default True) – Whether to use streaming output or not.

  • thinking_config (dict | None, optional) –

    Thinking config, supported models are 2.5 Pro, 2.5 Flash, etc. Refer to https://ai.google.dev/gemini-api/docs/thinking for more details.

    Example of thinking_config
    {
        "include_thoughts": True, # enable thoughts or not
        "thinking_budget": 1024   # Max tokens for reasoning
    }
    

  • client_args (dict, default None) – The extra keyword arguments to initialize the Gemini client.

  • generate_kwargs (dict[str, JSONSerializableObject] | None, optional) – The extra keyword arguments used in Gemini API generation, e.g. temperature, seed.

Return type:

None

async __call__(messages, tools=None, tool_choice=None, **config_kwargs)[source]

Call the Gemini model with the provided arguments.

Parameters:
  • messages (list[dict[str, Any]]) – A list of dictionaries, where role and content fields are required.

  • tools (list[dict] | None, default None) – The tools JSON schemas that the model can use.

  • tool_choice (Literal[“auto”, “none”, “any”, “required”] | str | None, default None) –

    Controls which (if any) tool is called by the model.

    Can be “auto”, “none”, “any”, “required”, or specific tool name. For more details, please refer to https://ai.google.dev/gemini-api/docs/function-calling?hl=en&example=meeting#function_calling_modes

  • **config_kwargs (Any) – The keyword arguments for Gemini chat completions API.

Return type:

ChatResponse | AsyncGenerator[ChatResponse, None]
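
A usage sketch with thinking enabled (the API key is a placeholder; the model name and the thinking_config shape are taken from the documentation above):

Example of GeminiChatModel
import asyncio

from agentscope.model import GeminiChatModel


async def main() -> None:
    model = GeminiChatModel(
        model_name="gemini-2.5-flash",
        api_key="YOUR_GEMINI_KEY",  # placeholder API key
        stream=True,
        thinking_config={
            "include_thoughts": True,  # surface ThinkingBlocks in the output
            "thinking_budget": 1024,   # max tokens for reasoning
        },
    )
    # Streaming: iterate over the async generator of ChatResponse chunks.
    async for chunk in await model(
        messages=[{"role": "user", "content": "Hi!"}],
    ):
        print(chunk.content)


asyncio.run(main())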