agentscope.model

The model module.

class ChatModelBase[source]

Bases: object

Base class for chat models.

__init__(model_name, stream)[source]

Initialize the chat model base class.

Parameters:
  • model_name (str) – The name of the model

  • stream (bool) – Whether the model output is streaming or not

Return type:

None

model_name: str

The model name

stream: bool

Whether the model output is streaming or not

abstract async __call__(*args, **kwargs)[source]

Call the chat model with the given arguments. Subclasses must implement this method.

Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

ChatResponse | AsyncGenerator[ChatResponse, None]
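
A concrete backend subclasses ChatModelBase, sets model_name and stream via super().__init__, and implements the async __call__. A minimal sketch, assuming a toy non-streaming EchoChatModel (the class name and its echo behavior are illustrative, not part of the library):

Example of subclassing ChatModelBase
from typing import Any

from agentscope.model import ChatModelBase, ChatResponse


class EchoChatModel(ChatModelBase):
    """A toy non-streaming backend that echoes the last message."""

    def __init__(self) -> None:
        super().__init__(model_name="echo", stream=False)

    async def __call__(self, *args: Any, **kwargs: Any) -> ChatResponse:
        messages = kwargs.get("messages") or (args[0] if args else [])
        last = messages[-1]["content"] if messages else ""
        # content is a sequence of blocks; a dict with the TextBlock
        # shape ("type"/"text") suffices for illustration.
        return ChatResponse(content=[{"type": "text", "text": last}])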

class ChatResponse[source]

Bases: DictMixin

The response of chat models.

content: Sequence[TextBlock | ToolUseBlock | ThinkingBlock]

The content of the chat response, which can include text blocks, tool use blocks, or thinking blocks.

id: str

The unique identifier of the chat response

__init__(content, id=<factory>, created_at=<factory>, type=<factory>, usage=<factory>)

Parameters:
  • content (Sequence[TextBlock | ToolUseBlock | ThinkingBlock])

  • id (str)

  • created_at (str)

  • type (Literal['chat'])

  • usage (ChatUsage | None)

Return type:

None

created_at: str

When the response was created

type: Literal['chat']

The type of the response, which is always ‘chat’.

usage: ChatUsage | None

The usage information of the chat response, if available.
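
A minimal sketch of constructing and reading a ChatResponse by hand (the plain dict stands in for a TextBlock; in practice the blocks are produced by a model backend):

Example of ChatResponse
from agentscope.model import ChatResponse

response = ChatResponse(content=[{"type": "text", "text": "Hello!"}])

# id, created_at, type and usage are filled in by their default factories.
print(response.type)  # "chat"
for block in response.content:
    if block["type"] == "text":
        print(block["text"])  # "Hello!"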

class DashScopeChatModel[source]

Bases: ChatModelBase

The DashScope chat model class, which unifies the Generation and MultimodalConversation APIs into one method.

__init__(model_name, api_key, stream=True, enable_thinking=None, generate_kwargs=None)[source]

Initialize the DashScope chat model.

Parameters:
  • model_name (str) – The model name.

  • api_key (str) – The dashscope API key.

  • stream (bool) – Whether to use streaming output or not.

  • enable_thinking (bool | None, optional) – Whether to enable thinking; only supported by Qwen3, QwQ and DeepSeek-R1. Refer to the DashScope documentation for more details.

  • generate_kwargs (dict[str, JSONSerializableObject] | None, optional) – The extra keyword arguments used in DashScope API generation, e.g. temperature, seed.

Return type:

None

async __call__(messages, tools=None, tool_choice=None, **kwargs)[source]

Get the response from the DashScope Generation/MultimodalConversation API by the given arguments.

Note

We unify the DashScope Generation and MultimodalConversation APIs into one method, since they accept similar arguments and share the same functionality.

Parameters:
  • messages (list[dict[str, Any]]) – A list of dictionaries, where role and content fields are required.

  • tools (list[dict] | None, default None) – The tools JSON schemas that the model can use.

  • tool_choice (Literal[“auto”, “none”, “any”, “required”] | str | None, default None) –

    Controls which (if any) tool is called by the model.

    Can be “auto”, “none”, or a specific tool name. For more details, please refer to https://help.aliyun.com/zh/model-studio/qwen-function-calling

  • **kwargs (Any) –

    The keyword arguments for DashScope chat completions API, e.g. temperature, max_tokens, top_p, etc. Please refer to DashScope documentation for more detailed arguments.

Return type:

ChatResponse | AsyncGenerator[ChatResponse, None]
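
A usage sketch (the model name and API key are placeholders; with stream=True the awaited call yields ChatResponse chunks, with stream=False it returns a single ChatResponse):

Example of DashScopeChatModel
import asyncio

from agentscope.model import DashScopeChatModel


async def main() -> None:
    model = DashScopeChatModel(
        model_name="qwen-max",         # placeholder model name
        api_key="YOUR_DASHSCOPE_KEY",  # placeholder API key
        stream=True,
    )
    # Streaming: iterate over the async generator of ChatResponse chunks.
    async for chunk in await model(
        messages=[{"role": "user", "content": "Hi!"}],
    ):
        print(chunk.content)


asyncio.run(main())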

class OpenAIChatModel[source]

Bases: ChatModelBase

The OpenAI chat model class.

__init__(model_name, api_key=None, stream=True, reasoning_effort=None, organization=None, client_args=None, generate_kwargs=None)[source]

Initialize the OpenAI client.

Parameters:
  • model_name (str) – The name of the model to use in the OpenAI API.

  • api_key (str, default None) – The API key for OpenAI API. If not specified, it will be read from the environment variable OPENAI_API_KEY.

  • stream (bool, default True) – Whether to use streaming output or not.

  • reasoning_effort (Literal[“low”, “medium”, “high”] | None, optional) – Reasoning effort, supported for o3, o4, etc. Please refer to OpenAI documentation for more details.

  • organization (str, default None) – The organization ID for OpenAI API. If not specified, it will be read from the environment variable OPENAI_ORGANIZATION.

  • client_args (dict, default None) – The extra keyword arguments to initialize the OpenAI client.

  • generate_kwargs (dict[str, JSONSerializableObject] | None, optional) – The extra keyword arguments used in OpenAI API generation, e.g. temperature, seed.

Return type:

None

async __call__(messages, tools=None, tool_choice=None, **kwargs)[source]

Get the response from OpenAI chat completions API by the given arguments.

Parameters:
  • messages (list[dict]) – A list of dictionaries, where role and content fields are required, and name field is optional.

  • tools (list[dict], default None) – The tools JSON schemas that the model can use.

  • tool_choice (Literal[“auto”, “none”, “any”, “required”] | str | None, default None) –

    Controls which (if any) tool is called by the model.

    Can be “auto”, “none”, “any”, “required”, or specific tool name. For more details, please refer to https://platform.openai.com/docs/api-reference/responses/create#responses_create-tool_choice

  • **kwargs (Any) – The keyword arguments for OpenAI chat completions API, e.g. temperature, max_tokens, top_p, etc. Please refer to the OpenAI API documentation for more details.

Returns:

The response from the OpenAI chat completions API.

Return type:

ChatResponse | AsyncGenerator[ChatResponse, None]
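
A usage sketch with a tool schema (the get_weather tool is illustrative; api_key is omitted, so it falls back to the OPENAI_API_KEY environment variable as documented above):

Example of OpenAIChatModel
import asyncio

from agentscope.model import OpenAIChatModel


async def main() -> None:
    # api_key omitted: read from the OPENAI_API_KEY environment variable.
    model = OpenAIChatModel(model_name="gpt-4o", stream=False)

    tools = [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # illustrative tool
                "description": "Get the current weather of a city.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "city": {"type": "string"},
                    },
                    "required": ["city"],
                },
            },
        },
    ]

    response = await model(
        messages=[{"role": "user", "content": "Weather in Paris?"}],
        tools=tools,
        tool_choice="auto",
    )
    # Expect a ToolUseBlock (or a TextBlock) in response.content.
    print(response.content)


asyncio.run(main())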

class AnthropicChatModel[source]

Bases: ChatModelBase

The Anthropic model wrapper for AgentScope.

__init__(model_name, api_key=None, max_tokens=2048, stream=True, thinking=None, client_args=None, generate_kwargs=None)[source]

Initialize the Anthropic chat model.

Parameters:
  • model_name (str) – The model name.

  • api_key (str) – The Anthropic API key.

  • stream (bool) – Whether to use streaming output or not.

  • max_tokens (int) – Limit the maximum token count the model can generate.

  • thinking (dict | None, default None) –

    Configuration for Claude’s internal reasoning process.

    Example of thinking
    {
        "type": "enabled" | "disabled",
        "budget_tokens": 1024
    }
    

  • client_args (dict | None, optional) – The extra keyword arguments to initialize the Anthropic client.

  • generate_kwargs (dict[str, JSONSerializableObject] | None, optional) – The extra keyword arguments used in Anthropic API generation, e.g. temperature, top_p.

Return type:

None

async __call__(messages, tools=None, tool_choice=None, **generate_kwargs)[source]

Get the response from Anthropic chat completions API by the given arguments.

Parameters:
  • messages (list[dict]) – A list of dictionaries, where role and content fields are required, and name field is optional.

  • tools (list[dict], default None) –

    The tools JSON schemas in the following format:

    Example of tools JSON schemas
    [
        {
            "type": "function",
            "function": {
                "name": "xxx",
                "description": "xxx",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "param1": {
                            "type": "string",
                            "description": "..."
                        },
                        # Add more parameters as needed
                    },
                    "required": ["param1"]
                }
            }
        },
        # More schemas here
    ]

  • tool_choice (Literal[“auto”, “none”, “any”, “required”] | str | None, default None) –

    Controls which (if any) tool is called by the model.

    Can be “auto”, “none”, “any”, “required”, or specific tool name. For more details, please refer to https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/implement-tool-use

  • **generate_kwargs (Any) – The keyword arguments for Anthropic chat completions API, e.g. temperature, top_p, etc. Please refer to the Anthropic API documentation for more details.

Returns:

The response from the Anthropic chat completions API.

Return type:

ChatResponse | AsyncGenerator[ChatResponse, None]
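
A usage sketch with extended thinking enabled (the model name, API key and token budget are placeholders; the thinking dict follows the shape documented above):

Example of AnthropicChatModel
import asyncio

from agentscope.model import AnthropicChatModel


async def main() -> None:
    model = AnthropicChatModel(
        model_name="claude-sonnet-4-0",  # placeholder model name
        api_key="YOUR_ANTHROPIC_KEY",    # placeholder API key
        max_tokens=2048,
        stream=False,
        thinking={"type": "enabled", "budget_tokens": 1024},
    )
    response = await model(
        messages=[{"role": "user", "content": "Why is the sky blue?"}],
    )
    # Thinking and text blocks arrive together in response.content.
    for block in response.content:
        print(block)


asyncio.run(main())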

class OllamaChatModel[source]

Bases: ChatModelBase

The Ollama chat model class in agentscope.

__init__(model_name, stream=False, options=None, keep_alive='5m', enable_thinking=None, host=None, **kwargs)[source]

Initialize the Ollama chat model.

Parameters:
  • model_name (str) – The name of the model.

  • stream (bool, default False) – Whether to use streaming output or not.

  • options (dict, default None) – Additional parameters to pass to the Ollama API. These can include temperature etc.

  • keep_alive (str, default “5m”) – Duration to keep the model loaded in memory. The format is a number followed by a unit suffix (s for seconds, m for minutes, h for hours).

  • enable_thinking (bool | None, default None) – Whether to enable thinking, only for models such as qwen3, deepseek-r1, etc. For more details, please refer to https://ollama.com/search?c=thinking

  • host (str | None, default None) – The host address of the Ollama server. If None, uses the default address (typically http://localhost:11434).

  • **kwargs (Any) – Additional keyword arguments to pass to the base chat model class.

Return type:

None

async __call__(messages, tools=None, tool_choice=None, **kwargs)[source]

Get the response from Ollama chat completions API by the given arguments.

Parameters:
  • messages (list[dict]) – A list of dictionaries, where role and content fields are required, and name field is optional.

  • tools (list[dict], default None) – The tools JSON schemas that the model can use.

  • tool_choice (Literal[“auto”, “none”, “any”, “required”] | str | None, default None) –

    Controls which (if any) tool is called by the model.

    Can be “auto”, “none”, “any”, “required”, or specific tool name.

  • **kwargs (Any) – The keyword arguments for the Ollama chat completions API, e.g. think, etc. Please refer to the Ollama API documentation for more details.

Returns:

The response from the Ollama chat completions API.

Return type:

ChatResponse | AsyncGenerator[ChatResponse, None]
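
A usage sketch against a local Ollama server (the model name is a placeholder for any locally pulled model; host is omitted, so the default address http://localhost:11434 is used):

Example of OllamaChatModel
import asyncio

from agentscope.model import OllamaChatModel


async def main() -> None:
    model = OllamaChatModel(
        model_name="qwen3",            # placeholder: any pulled model works
        stream=False,
        options={"temperature": 0.7},  # passed through to the Ollama API
        keep_alive="10m",              # keep the model loaded for 10 minutes
    )
    response = await model(
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.content)


asyncio.run(main())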

class GeminiChatModel[source]

Bases: ChatModelBase

The Google Gemini chat model class in agentscope.

__init__(model_name, api_key, stream=True, thinking_config=None, client_args=None, generate_kwargs=None)[source]

Initialize the Gemini chat model.

Parameters:
  • model_name (str) – The name of the Gemini model to use, e.g. “gemini-2.5-flash”.

  • api_key (str) – The API key for Google Gemini.

  • stream (bool, default True) – Whether to use streaming output or not.

  • thinking_config (dict | None, optional) –

    Thinking config, supported models are 2.5 Pro, 2.5 Flash, etc. Refer to https://ai.google.dev/gemini-api/docs/thinking for more details.

    Example of thinking_config
    {
        "include_thoughts": True, # enable thoughts or not
        "thinking_budget": 1024   # Max tokens for reasoning
    }
    

  • client_args (dict, default None) – The extra keyword arguments to initialize the Gemini client.

  • generate_kwargs (dict[str, JSONSerializableObject] | None, optional) – The extra keyword arguments used in Gemini API generation, e.g. temperature, seed.

Return type:

None

async __call__(messages, tools=None, tool_choice=None, **config_kwargs)[source]

Call the Gemini model with the provided arguments.

Parameters:
  • messages (list[dict[str, Any]]) – A list of dictionaries, where role and content fields are required.

  • tools (list[dict] | None, default None) – The tools JSON schemas that the model can use.

  • tool_choice (Literal[“auto”, “none”, “any”, “required”] | str | None, default None) –

    Controls which (if any) tool is called by the model.

    Can be “auto”, “none”, “any”, “required”, or specific tool name. For more details, please refer to https://ai.google.dev/gemini-api/docs/function-calling?hl=en&example=meeting#function_calling_modes

  • **config_kwargs (Any) – The keyword arguments for Gemini chat completions API.

Return type:

ChatResponse | AsyncGenerator[ChatResponse, None]
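
A usage sketch with thinking enabled (the API key is a placeholder; the model name and the thinking_config shape are taken from the documentation above):

Example of GeminiChatModel
import asyncio

from agentscope.model import GeminiChatModel


async def main() -> None:
    model = GeminiChatModel(
        model_name="gemini-2.5-flash",
        api_key="YOUR_GEMINI_KEY",  # placeholder API key
        stream=True,
        thinking_config={
            "include_thoughts": True,  # surface ThinkingBlocks in the output
            "thinking_budget": 1024,   # max tokens for reasoning
        },
    )
    # Streaming: iterate over the async generator of ChatResponse chunks.
    async for chunk in await model(
        messages=[{"role": "user", "content": "Hi!"}],
    ):
        print(chunk.content)


asyncio.run(main())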