Model¶
In this tutorial, we introduce the model APIs integrated in AgentScope, show how to use them, and explain how to integrate new model APIs. The supported model APIs and providers include:
API | Class | Compatible | Streaming | Tools | Vision | Reasoning
---|---|---|---|---|---|---
OpenAI | OpenAIChatModel | vLLM, DeepSeek | ✅ | ✅ | ✅ | ✅
DashScope | DashScopeChatModel | | ✅ | ✅ | ✅ | ✅
Anthropic | AnthropicChatModel | | ✅ | ✅ | ✅ | ✅
Gemini | GeminiChatModel | | ✅ | ✅ | ✅ | ✅
Ollama | OllamaChatModel | | ✅ | ✅ | ✅ | ✅
Note
When deploying with vLLM, you need to configure the appropriate tool-calling parameters for your model, such as --enable-auto-tool-choice and --tool-call-parser. For more details, refer to the official vLLM documentation.
Note
For OpenAI-compatible models (e.g. vLLM, DeepSeek), developers can use the OpenAIChatModel class and specify the API endpoint via the client_args parameter: client_args={"base_url": "http://your-api-endpoint"}. For example:
OpenAIChatModel(client_args={"base_url": "http://localhost:8000/v1"})
Note
Model behavior parameters (such as temperature, maximum output length, etc.) can be preset in the constructor via the generate_kwargs parameter. For example:
OpenAIChatModel(generate_kwargs={"temperature": 0.3, "max_tokens": 1000})
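Putting the two notes above together, a minimal sketch of a model pointed at a self-hosted OpenAI-compatible endpoint might look like the following. The model name and the placeholder API key are assumptions; adjust them to your own deployment:
from agentscope.model import OpenAIChatModel

model = OpenAIChatModel(
    model_name="Qwen/Qwen2.5-7B-Instruct",  # hypothetical deployed model
    api_key="EMPTY",  # placeholder; many local servers do not check the key
    client_args={"base_url": "http://localhost:8000/v1"},
    generate_kwargs={"temperature": 0.3, "max_tokens": 1000},
)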
To provide unified model interfaces, the above model classes have the following common behaviors:
- The first three arguments of the __call__ method are messages, tools and tool_choice, representing the input messages, the JSON schemas of the tool functions, and the tool selection mode, respectively.
- The return type is either a ChatResponse instance, or an async generator of ChatResponse instances in streaming mode.
Note
Model APIs differ in their input message formats; refer to Prompt Formatter for more details.
The ChatResponse instance contains the generated thinking/text/tool-use content, together with its identity, creation time, and usage information.
import asyncio
import os
from agentscope.message import TextBlock, ToolUseBlock, ThinkingBlock, Msg
from agentscope.model import ChatResponse, DashScopeChatModel
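# Construct a ChatResponse by hand to show the thinking/text/tool-use blocks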
response = ChatResponse(
content=[
ThinkingBlock(
type="thinking",
thinking="I should search for AgentScope on Google.",
),
TextBlock(type="text", text="I'll search for AgentScope on Google."),
ToolUseBlock(
type="tool_use",
id="642n298gjna",
name="google_search",
input={"query": "AgentScope?"},
),
],
)
print(response)
ChatResponse(content=[{'type': 'thinking', 'thinking': 'I should search for AgentScope on Google.'}, {'type': 'text', 'text': "I'll search for AgentScope on Google."}, {'type': 'tool_use', 'id': '642n298gjna', 'name': 'google_search', 'input': {'query': 'AgentScope?'}}], id='2025-10-17 10:33:33.997_9e9f66', created_at='2025-10-17 10:33:33.997', type='chat', usage=None, metadata=None)
Taking DashScopeChatModel as an example, we can use it to create a chat model instance and call it with messages and tools:
async def example_model_call() -> None:
"""An example of using the DashScopeChatModel."""
model = DashScopeChatModel(
model_name="qwen-max",
api_key=os.environ["DASHSCOPE_API_KEY"],
stream=False,
)
res = await model(
messages=[
{"role": "user", "content": "Hi!"},
],
)
# You can directly create a ``Msg`` object with the response content
msg_res = Msg("Friday", res.content, "assistant")
print("The response:", res)
print("The response as Msg:", msg_res)
asyncio.run(example_model_call())
The response: ChatResponse(content=[{'type': 'text', 'text': 'Hello! How can I assist you today?'}], id='2025-10-17 10:33:35.666_f2790f', created_at='2025-10-17 10:33:35.667', type='chat', usage=ChatUsage(input_tokens=10, output_tokens=9, time=1.668544, type='chat'), metadata=None)
The response as Msg: Msg(id='7KzLbFj2dsZv36u7UAzfjt', name='Friday', content=[{'type': 'text', 'text': 'Hello! How can I assist you today?'}], role='assistant', metadata=None, timestamp='2025-10-17 10:33:35.667', invocation_id='None')
Streaming¶
To enable streaming, set the stream parameter in the model constructor to True.
When streaming is enabled, the __call__ method returns an async generator that yields ChatResponse instances as they are generated by the model.
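Since the return type depends on the stream flag, code that must handle both modes can branch on the type of the result. Below is a minimal sketch, where model and messages are placeholders:
async def handle_response(model, messages) -> None:
    """A sketch of handling both non-streaming and streaming returns."""
    res = await model(messages=messages)
    if isinstance(res, ChatResponse):
        # Non-streaming mode: a single complete response
        print(res.content)
    else:
        # Streaming mode: an async generator of cumulative chunks
        async for chunk in res:
            print(chunk.content)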
Note
The streaming mode in AgentScope is designed to be cumulative, meaning the content in each chunk contains all the previous content plus the newly generated content.
async def example_streaming() -> None:
"""An example of using the streaming model."""
model = DashScopeChatModel(
model_name="qwen-max",
api_key=os.environ["DASHSCOPE_API_KEY"],
stream=True,
)
generator = await model(
messages=[
{
"role": "user",
"content": "Count from 1 to 20, and just report the number without any other information.",
},
],
)
print("The type of the response:", type(generator))
i = 0
async for chunk in generator:
print(f"Chunk {i}")
print(f"\ttype: {type(chunk.content)}")
print(f"\t{chunk}\n")
i += 1
asyncio.run(example_streaming())
The type of the response: <class 'async_generator'>
Chunk 0
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1'}], id='2025-10-17 10:33:36.933_0d6379', created_at='2025-10-17 10:33:36.933', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=1, time=1.265137, type='chat'), metadata=None)
Chunk 1
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n'}], id='2025-10-17 10:33:37.034_40fe55', created_at='2025-10-17 10:33:37.034', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=4, time=1.36589, type='chat'), metadata=None)
Chunk 2
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4'}], id='2025-10-17 10:33:37.131_92a902', created_at='2025-10-17 10:33:37.131', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=7, time=1.463065, type='chat'), metadata=None)
Chunk 3
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n'}], id='2025-10-17 10:33:37.225_3c122b', created_at='2025-10-17 10:33:37.225', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=10, time=1.557228, type='chat'), metadata=None)
Chunk 4
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n'}], id='2025-10-17 10:33:37.437_4913ef', created_at='2025-10-17 10:33:37.438', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=16, time=1.769359, type='chat'), metadata=None)
Chunk 5
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n1'}], id='2025-10-17 10:33:37.684_b4f576', created_at='2025-10-17 10:33:37.684', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=22, time=2.016106, type='chat'), metadata=None)
Chunk 6
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n1'}], id='2025-10-17 10:33:37.870_9a07de', created_at='2025-10-17 10:33:37.870', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=28, time=2.201397, type='chat'), metadata=None)
Chunk 7
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14\n1'}], id='2025-10-17 10:33:38.363_790a8a', created_at='2025-10-17 10:33:38.363', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=34, time=2.694774, type='chat'), metadata=None)
Chunk 8
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14\n15\n16\n1'}], id='2025-10-17 10:33:38.739_e020db', created_at='2025-10-17 10:33:38.739', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=40, time=3.070521, type='chat'), metadata=None)
Chunk 9
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14\n15\n16\n17\n18\n1'}], id='2025-10-17 10:33:38.926_63f55a', created_at='2025-10-17 10:33:38.926', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=46, time=3.257741, type='chat'), metadata=None)
Chunk 10
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14\n15\n16\n17\n18\n19\n20'}], id='2025-10-17 10:33:39.216_779bfd', created_at='2025-10-17 10:33:39.216', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=50, time=3.548289, type='chat'), metadata=None)
Chunk 11
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14\n15\n16\n17\n18\n19\n20'}], id='2025-10-17 10:33:39.239_f934bf', created_at='2025-10-17 10:33:39.239', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=50, time=3.570888, type='chat'), metadata=None)
Reasoning¶
AgentScope supports reasoning models by providing the ThinkingBlock.
async def example_reasoning() -> None:
"""An example of using the reasoning model."""
model = DashScopeChatModel(
model_name="qwen-turbo",
api_key=os.environ["DASHSCOPE_API_KEY"],
enable_thinking=True,
)
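    # With streaming enabled (the default here), the call returns an async
    # generator; since chunks are cumulative, the last one holds the full
    # response, so we only keep the final chunk below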
res = await model(
messages=[
{"role": "user", "content": "Who am I?"},
],
)
last_chunk = None
async for chunk in res:
last_chunk = chunk
print("The final response:")
print(last_chunk)
asyncio.run(example_reasoning())
The final response:
ChatResponse(content=[{'type': 'thinking', 'thinking': 'Okay, the user is asking "Who am I?" which is a deep philosophical question. I need to approach this carefully.\n\nFirst, I should acknowledge that this is a complex question with no simple answer. Different fields like philosophy, psychology, and neuroscience have various perspectives. I should mention some of these to give a well-rounded response.\n\nIn philosophy, there\'s the concept of self from thinkers like Descartes, who said "I think, therefore I am." Then there\'s the Buddhist view of no-self, which challenges the idea of a permanent identity. I should explain these briefly.\n\nFrom a psychological standpoint, the self is shaped by experiences, relationships, and personal growth. Maybe mention Carl Jung\'s persona and shadow aspects. Also, the idea that identity is dynamic and changes over time.\n\nNeuroscience might look at the brain\'s role in creating the sense of self, like the default mode network. But I should note that this is still an area of ongoing research.\n\nI should also consider the user\'s possible intent. They might be seeking self-reflection or existential insights. It\'s important to encourage them to explore their own experiences and values. Maybe suggest questions they can ask themselves to better understand their identity.\n\nI need to avoid making assumptions about their personal situation. Keep the response open-ended and supportive. Make sure it\'s clear that the answer depends on their perspective and experiences.\n\nCheck if there\'s any cultural context I should be aware of, but since it\'s a general question, stick to universal concepts. Avoid jargon to keep it accessible. Conclude by emphasizing that the journey of self-discovery is personal and ongoing.'}, {'type': 'text', 'text': 'The question "Who am I?" is one of the most profound and enduring inquiries in human history, touching on philosophy, psychology, spirituality, and science. Here’s a reflection on this question from multiple perspectives:\n\n### 1. **Philosophical Perspectives** \n - **Western Philosophy**: Thinkers like Descartes ("I think, therefore I am") emphasize the mind or consciousness as the core of identity. Others, like existentialists (e.g., Sartre), argue that identity is not fixed but shaped by choices and actions. \n - **Eastern Thought**: Buddhism teaches the concept of *anatta* (no-self), suggesting that the "self" is an illusion—a collection of impermanent physical and mental components. Hinduism, by contrast, often speaks of the *atman* (eternal soul) as the true self. \n\n### 2. **Psychological View** \n - The self is shaped by experiences, relationships, and cultural influences. Psychologists like Carl Jung discussed the "persona" (the mask we present to the world) and the "shadow" (unacknowledged aspects of the psyche). Identity is dynamic, evolving through growth, trauma, and self-reflection. \n\n### 3. **Neuroscience** \n - The brain constructs the sense of self through neural networks, particularly the *default mode network* (DMN), which is active during self-referential thinking. However, this "self" is a construct—part of the brain’s storytelling mechanism to create coherence in experience. \n\n### 4. **Spiritual/Existential** \n - Many spiritual traditions view the self as more than the physical or ego-driven identity. For example, in mysticism, the "true self" might be seen as connected to something greater (e.g., God, the universe, or collective consciousness). 
\n - Existentialists argue that the self is defined by freedom and responsibility—what you choose to become. \n\n### 5. **Personal Reflection** \n - The answer to "Who am I?" may depend on your values, passions, relationships, and how you navigate the world. It’s not a static answer but a continuous process of discovery. Questions like *"What do I value?"* or *"What gives my life meaning?"* can help uncover layers of your identity. \n\n### Final Thought: \nYou are both a product of your history and a creator of your future. The "you" that exists today is shaped by your experiences, but it is also capable of growth, change, and transformation. The journey of self-discovery is deeply personal—what matters is how you engage with it. \n\nIf you’re exploring this question, it might be worth reflecting on what aspects of yourself feel most authentic, and what you hope to become. 🌱'}], id='2025-10-17 10:33:48.325_1c78f5', created_at='2025-10-17 10:33:48.325', type='chat', usage=ChatUsage(input_tokens=12, output_tokens=909, time=9.081152, type='chat'), metadata=None)
Tools API¶
Model providers differ in their tools APIs, e.g. the JSON schema of the tool functions and the tool call/response formats. To provide a unified interface, AgentScope addresses this by:
- Providing a unified tool-call block ToolUseBlock and tool-response block ToolResultBlock.
- Providing a unified tools interface in the __call__ method of the model classes, which accepts a list of tool JSON schemas as follows:
json_schemas = [
{
"type": "function",
"function": {
"name": "google_search",
"description": "Search for a query on Google.",
"parameters": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "The search query.",
},
},
"required": ["query"],
},
},
},
]
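With such a schema list, a call that exposes the tool to the model might look like the sketch below. The tool_choice value and the response handling are illustrative assumptions; in a full agent loop, each ToolUseBlock would be executed and its result sent back as a ToolResultBlock:
async def example_tools() -> None:
    """A sketch of passing tool JSON schemas to the model."""
    model = DashScopeChatModel(
        model_name="qwen-max",
        api_key=os.environ["DASHSCOPE_API_KEY"],
        stream=False,
    )
    res = await model(
        messages=[{"role": "user", "content": "Search AgentScope on Google."}],
        tools=json_schemas,
        tool_choice="auto",  # assumption: let the model decide
    )
    # Inspect any tool calls the model generated
    for block in res.content:
        if block["type"] == "tool_use":
            print("Tool:", block["name"], "Input:", block["input"])

asyncio.run(example_tools())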
Further Reading¶
Total running time of the script: (0 minutes 14.333 seconds)