Model¶
In this tutorial, we introduce the model APIs integrated into AgentScope, how to use them, and how to integrate new model APIs. The supported model APIs and providers include:
API | Class | Compatible | Streaming | Tools | Vision | Reasoning
---|---|---|---|---|---|---
OpenAI | OpenAIChatModel | vLLM, DeepSeek | ✅ | ✅ | ✅ | ✅
DashScope | DashScopeChatModel | | ✅ | ✅ | ✅ | ✅
Anthropic | AnthropicChatModel | | ✅ | ✅ | ✅ | ✅
Gemini | GeminiChatModel | | ✅ | ✅ | ✅ | ✅
Ollama | OllamaChatModel | | ✅ | ✅ | ✅ | ✅
Note
When using vLLM, you need to configure the appropriate tool-calling parameters for different models during deployment, such as --enable-auto-tool-choice, --tool-call-parser, etc. For more details, refer to the official vLLM documentation.
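Because vLLM exposes an OpenAI-compatible endpoint, you can reuse the OpenAI model class to talk to a local vLLM server. Below is a minimal sketch; the base URL and model name are placeholders, and it assumes OpenAIChatModel forwards client_args to the underlying OpenAI client:

from agentscope.model import OpenAIChatModel

# Placeholder deployment: adjust the base URL and model name to your server.
vllm_model = OpenAIChatModel(
    model_name="Qwen/Qwen2.5-7B-Instruct",  # placeholder model name
    api_key="EMPTY",  # vLLM ignores the key unless one is configured
    client_args={"base_url": "http://localhost:8000/v1"},  # assumed kwarg
    stream=False,
)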
To provide a unified model interface, the above model classes share the following conventions:
- The first three arguments of the __call__ method are messages, tools and tool_choice, representing the input messages, the JSON schemas of the tool functions, and the tool selection mode, respectively.
- The return type is either a ChatResponse instance, or an async generator of ChatResponse instances in streaming mode.
Note
Different model APIs differ in the input message format, refer to Prompt Formatter for more details.
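Under this unified interface, a tool-assisted call is written the same way for every provider. The sketch below assumes an existing model instance and the tool JSON schemas described in the Tools API section; the tool_choice value "auto" is an assumed selection mode:

async def call_with_tools(model, json_schemas) -> None:
    """Sketch of the unified __call__ signature shared by the model classes."""
    res = await model(
        messages=[{"role": "user", "content": "Find AgentScope on Google."}],
        tools=json_schemas,  # JSON schemas of the available tool functions
        tool_choice="auto",  # assumed mode: let the model decide
    )
    print(res.content)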
The ChatResponse instance contains the generated thinking/text/tool-use content, together with its id, creation time, and usage information.
import asyncio
import json
import os
from agentscope.message import TextBlock, ToolUseBlock, ThinkingBlock, Msg
from agentscope.model import ChatResponse, DashScopeChatModel
response = ChatResponse(
content=[
ThinkingBlock(
type="thinking",
thinking="I should search for AgentScope on Google.",
),
TextBlock(type="text", text="I'll search for AgentScope on Google."),
ToolUseBlock(
type="tool_use",
id="642n298gjna",
name="google_search",
input={"query": "AgentScope?"},
),
],
)
print(response)
ChatResponse(content=[{'type': 'thinking', 'thinking': 'I should search for AgentScope on Google.'}, {'type': 'text', 'text': "I'll search for AgentScope on Google."}, {'type': 'tool_use', 'id': '642n298gjna', 'name': 'google_search', 'input': {'query': 'AgentScope?'}}], id='2025-09-27 13:33:58.701_63d42c', created_at='2025-09-27 13:33:58.701', type='chat', usage=None, metadata=None)
Taking DashScopeChatModel as an example, we can use it to create a chat model instance and call it with messages and tools:
async def example_model_call() -> None:
"""An example of using the DashScopeChatModel."""
model = DashScopeChatModel(
model_name="qwen-max",
api_key=os.environ["DASHSCOPE_API_KEY"],
stream=False,
)
res = await model(
messages=[
{"role": "user", "content": "Hi!"},
],
)
# You can directly create a ``Msg`` object with the response content
msg_res = Msg("Friday", res.content, "assistant")
print("The response:", res)
print("The response as Msg:", msg_res)
asyncio.run(example_model_call())
The response: ChatResponse(content=[{'type': 'text', 'text': 'Hello! How can I assist you today?'}], id='2025-09-27 13:34:00.008_5d5b9d', created_at='2025-09-27 13:34:00.009', type='chat', usage=ChatUsage(input_tokens=10, output_tokens=9, time=1.306145, type='chat'), metadata=None)
The response as Msg: Msg(id='PC7nHudGKQEuvaZSJWxUrw', name='Friday', content=[{'type': 'text', 'text': 'Hello! How can I assist you today?'}], role='assistant', metadata=None, timestamp='2025-09-27 13:34:00.009', invocation_id='None')
Streaming¶
To enable streaming, set the stream parameter in the model constructor to True. When streaming is enabled, the __call__ method returns an async generator that yields ChatResponse instances as they are generated by the model.
Note
The streaming mode in AgentScope is designed to be cumulative, meaning the content in each chunk contains all the previous content plus the newly generated content.
async def example_streaming() -> None:
"""An example of using the streaming model."""
model = DashScopeChatModel(
model_name="qwen-max",
api_key=os.environ["DASHSCOPE_API_KEY"],
stream=True,
)
generator = await model(
messages=[
{
"role": "user",
"content": "Count from 1 to 20, and just report the number without any other information.",
},
],
)
print("The type of the response:", type(generator))
i = 0
async for chunk in generator:
print(f"Chunk {i}")
print(f"\ttype: {type(chunk.content)}")
print(f"\t{chunk}\n")
i += 1
asyncio.run(example_streaming())
The type of the response: <class 'async_generator'>
Chunk 0
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1'}], id='2025-09-27 13:34:00.968_7b4e4c', created_at='2025-09-27 13:34:00.968', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=1, time=0.957984, type='chat'), metadata=None)
Chunk 1
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n'}], id='2025-09-27 13:34:01.508_c04c54', created_at='2025-09-27 13:34:01.508', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=4, time=1.498095, type='chat'), metadata=None)
Chunk 2
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4'}], id='2025-09-27 13:34:01.879_fbf1d0', created_at='2025-09-27 13:34:01.879', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=7, time=1.868638, type='chat'), metadata=None)
Chunk 3
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n'}], id='2025-09-27 13:34:01.955_3c6745', created_at='2025-09-27 13:34:01.955', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=10, time=1.944528, type='chat'), metadata=None)
Chunk 4
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n'}], id='2025-09-27 13:34:02.141_0b6767', created_at='2025-09-27 13:34:02.141', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=16, time=2.130855, type='chat'), metadata=None)
Chunk 5
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n1'}], id='2025-09-27 13:34:02.360_765fb2', created_at='2025-09-27 13:34:02.360', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=22, time=2.349501, type='chat'), metadata=None)
Chunk 6
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n1'}], id='2025-09-27 13:34:02.525_4a2038', created_at='2025-09-27 13:34:02.525', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=28, time=2.514623, type='chat'), metadata=None)
Chunk 7
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14\n1'}], id='2025-09-27 13:34:02.746_3e97da', created_at='2025-09-27 13:34:02.746', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=34, time=2.735373, type='chat'), metadata=None)
Chunk 8
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14\n15\n16\n1'}], id='2025-09-27 13:34:03.208_5544ec', created_at='2025-09-27 13:34:03.208', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=40, time=3.198175, type='chat'), metadata=None)
Chunk 9
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14\n15\n16\n17\n18\n1'}], id='2025-09-27 13:34:03.366_d7b553', created_at='2025-09-27 13:34:03.366', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=46, time=3.355353, type='chat'), metadata=None)
Chunk 10
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14\n15\n16\n17\n18\n19\n20'}], id='2025-09-27 13:34:03.549_cd499b', created_at='2025-09-27 13:34:03.549', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=50, time=3.539147, type='chat'), metadata=None)
Chunk 11
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14\n15\n16\n17\n18\n19\n20'}], id='2025-09-27 13:34:03.574_3a0944', created_at='2025-09-27 13:34:03.574', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=50, time=3.564132, type='chat'), metadata=None)
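Since every chunk carries the full cumulative content, the latest chunk always supersedes the earlier ones. If you only want to display the newly generated text, you can diff each chunk against what has already been printed. A minimal sketch, assuming the content consists of text blocks only, as in the example above:

async def example_print_deltas() -> None:
    """Print only the incremental text from cumulative streaming chunks."""
    model = DashScopeChatModel(
        model_name="qwen-max",
        api_key=os.environ["DASHSCOPE_API_KEY"],
        stream=True,
    )
    generator = await model(
        messages=[{"role": "user", "content": "Count from 1 to 5."}],
    )
    printed = ""
    async for chunk in generator:
        # Join all text blocks in the cumulative content of this chunk
        text = "".join(
            block["text"] for block in chunk.content if block["type"] == "text"
        )
        # Print only the part that was not printed for a previous chunk
        print(text[len(printed):], end="", flush=True)
        printed = text
    print()

asyncio.run(example_print_deltas())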
Reasoning¶
AgentScope supports reasoning models by providing the ThinkingBlock.
async def example_reasoning() -> None:
"""An example of using the reasoning model."""
model = DashScopeChatModel(
model_name="qwen-turbo",
api_key=os.environ["DASHSCOPE_API_KEY"],
enable_thinking=True,
)
res = await model(
messages=[
{"role": "user", "content": "Who am I?"},
],
)
last_chunk = None
async for chunk in res:
last_chunk = chunk
print("The final response:")
print(last_chunk)
asyncio.run(example_reasoning())
The final response:
ChatResponse(content=[{'type': 'thinking', 'thinking': 'Okay, the user asked "Who am I?" I need to figure out how to respond. First, I should consider that this is a philosophical question, which can have many interpretations. The user might be seeking self-awareness, identity, or existential meaning.\n\nI should start by acknowledging the depth of the question and the various perspectives. It\'s important to mention different angles, like philosophical, psychological, and spiritual. For example, in philosophy, thinkers like Descartes and Sartre have explored identity. Psychologically, it might relate to self-concept and personal experiences. Spiritually, it could involve the soul or higher self.\n\nI should also consider if the user is looking for a more personal answer, but since I don\'t have access to their personal information, I need to keep it general. Maybe include examples of how different disciplines approach this question. Also, note that the answer can be subjective and vary based on individual beliefs.\n\nMake sure the response is open-ended, encouraging the user to reflect on their own experiences and beliefs. Avoid making assumptions and keep the tone supportive and thought-provoking. Check for clarity and ensure that the answer is comprehensive without being too verbose.'}, {'type': 'text', 'text': 'The question "Who am I?" is one of the most profound and enduring inquiries in philosophy, spirituality, and personal reflection. It touches on identity, consciousness, and the nature of existence. Here are some perspectives to consider:\n\n### 1. **Philosophical Perspective** \n - **Descartes** famously said, *"I think, therefore I am,"* suggesting that self-awareness is the core of identity. \n - **Existentialists** like Sartre argued that identity is not fixed but shaped by choices and actions. \n - **Eastern philosophies** (e.g., Buddhism, Advaita Vedanta) often emphasize that the "self" is not a fixed entity but a transient collection of experiences, thoughts, and attachments. \n\n### 2. **Psychological Perspective** \n - Your identity is shaped by memories, relationships, values, and experiences. It evolves over time as you grow and encounter new challenges. \n - Carl Jung spoke of the "self" as the integration of the conscious and unconscious mind, while modern psychology highlights the role of social roles and personal narratives. \n\n### 3. **Spiritual/Existential Perspective** \n - Many spiritual traditions suggest that the "true self" transcends the physical body and ego. For example: \n - In **Buddhism**, the self is seen as impermanent and illusory (*anatta*). \n - In **Hinduism**, the "Atman" is the eternal soul, distinct from the temporary physical form. \n - Some might see the question as a path to inner discovery, where the answer lies in self-acceptance, purpose, or connection to something greater. \n\n### 4. **Personal Reflection** \n - You might define yourself through your passions, relationships, goals, or the impact you have on others. \n - The answer could be fluid, changing as you learn, grow, and experience life. \n\n### A Simple Truth: \nYou are **the sum of your experiences, thoughts, and choices**—but also more than that. The search for "who you are" is a journey, not a destination. What matters is how you engage with that question and what you choose to create from it. \n\nWould you like to explore this further? 
🌱'}], id='2025-09-27 13:34:11.356_b9576a', created_at='2025-09-27 13:34:11.356', type='chat', usage=ChatUsage(input_tokens=12, output_tokens=719, time=7.776397, type='chat'), metadata=None)
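The reasoning content arrives as ThinkingBlock entries alongside ordinary TextBlock entries, so you can separate the model's deliberation from its final answer by filtering on the block type. A minimal sketch; the helper name below is ours, not part of the API:

def split_reasoning(res: ChatResponse) -> tuple[str, str]:
    """Separate the thinking content of a response from its final answer."""
    thinking = "".join(
        block["thinking"]
        for block in res.content
        if block["type"] == "thinking"
    )
    answer = "".join(
        block["text"] for block in res.content if block["type"] == "text"
    )
    return thinking, answer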
Tools API¶
Model providers differ in their tools APIs, e.g. in the tool JSON schema and the tool call/response formats. To provide a unified interface, AgentScope addresses this by:

- Providing a unified tool call block (ToolUseBlock) and tool result block (ToolResultBlock).
- Providing a unified tools interface in the __call__ method of the model classes, which accepts a list of tool JSON schemas, as follows:
json_schemas = [
{
"type": "function",
"function": {
"name": "google_search",
"description": "Search for a query on Google.",
"parameters": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "The search query.",
},
},
"required": ["query"],
},
},
},
]
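Passing these schemas to any model class makes the tools available to the model; when it decides to call one, the response content contains a ToolUseBlock. A minimal sketch reusing the DashScopeChatModel from earlier (whether a tool call is actually emitted depends on the prompt and the model):

async def example_tool_call() -> None:
    """Call the model with the schemas above and inspect any tool use."""
    model = DashScopeChatModel(
        model_name="qwen-max",
        api_key=os.environ["DASHSCOPE_API_KEY"],
        stream=False,
    )
    res = await model(
        messages=[{"role": "user", "content": "Search AgentScope on Google."}],
        tools=json_schemas,
    )
    for block in res.content:
        if block["type"] == "tool_use":
            print("Tool:", block["name"], "Input:", json.dumps(block["input"]))

asyncio.run(example_tool_call())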
Further Reading¶