.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "tutorial/task_model.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_tutorial_task_model.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_tutorial_task_model.py:

.. _model:

Model
====================

This tutorial introduces the model APIs integrated in AgentScope, shows how to use them, and explains how to integrate new model APIs.
The supported model APIs and providers include:

.. list-table::
    :header-rows: 1

    * - API
      - Class
      - Compatible
      - Streaming
      - Tools
      - Vision
      - Reasoning
    * - OpenAI
      - ``OpenAIChatModel``
      - vLLM, DeepSeek
      - ✅
      - ✅
      - ✅
      - ✅
    * - DashScope
      - ``DashScopeChatModel``
      -
      - ✅
      - ✅
      - ✅
      - ✅
    * - Anthropic
      - ``AnthropicChatModel``
      -
      - ✅
      - ✅
      - ✅
      - ✅
    * - Gemini
      - ``GeminiChatModel``
      -
      - ✅
      - ✅
      - ✅
      - ✅
    * - Ollama
      - ``OllamaChatModel``
      -
      - ✅
      - ✅
      - ✅
      - ✅

.. note::
    When using vLLM, you need to configure the appropriate tool calling parameters for each model at deployment time, such as ``--enable-auto-tool-choice`` and ``--tool-call-parser``. For more details, refer to the official vLLM documentation.

.. note::
    For OpenAI-compatible models (e.g. vLLM, DeepSeek), developers can use the ``OpenAIChatModel`` class and specify the API endpoint through the ``client_kwargs`` parameter: ``client_kwargs={"base_url": "http://your-api-endpoint"}``. For example:

    .. code-block:: python

        OpenAIChatModel(client_kwargs={"base_url": "http://localhost:8000/v1"})

.. note::
    Model behavior parameters (such as temperature and maximum length) can be preset in the constructor via the ``generate_kwargs`` parameter. For example:

    .. code-block:: python

        OpenAIChatModel(generate_kwargs={"temperature": 0.3, "max_tokens": 1000})

To provide a unified model interface, the model classes above share the following conventions:

- The first three arguments of the ``__call__`` method are ``messages``, ``tools`` and ``tool_choice``, representing the input messages, the JSON schemas of the tool functions, and the tool selection mode, respectively.
- The return value is either a ``ChatResponse`` instance or, in streaming mode, an async generator of ``ChatResponse`` instances.

.. note::
    Model APIs differ in their input message format; refer to :ref:`prompt` for more details.

The ``ChatResponse`` instance contains the generated thinking/text/tool use content, an identifier, the creation time, and usage information.

.. GENERATED FROM PYTHON SOURCE LINES 80-105

.. code-block:: Python

    import asyncio
    import json
    import os

    from agentscope.message import TextBlock, ToolUseBlock, ThinkingBlock, Msg
    from agentscope.model import ChatResponse, DashScopeChatModel

    response = ChatResponse(
        content=[
            ThinkingBlock(
                type="thinking",
                thinking="I should search for AgentScope on Google.",
            ),
            TextBlock(type="text", text="I'll search for AgentScope on Google."),
            ToolUseBlock(
                type="tool_use",
                id="642n298gjna",
                name="google_search",
                input={"query": "AgentScope?"},
            ),
        ],
    )

    print(response)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    ChatResponse(content=[{'type': 'thinking', 'thinking': 'I should search for AgentScope on Google.'}, {'type': 'text', 'text': "I'll search for AgentScope on Google."}, {'type': 'tool_use', 'id': '642n298gjna', 'name': 'google_search', 'input': {'query': 'AgentScope?'}}], id='2026-04-15 04:16:05.552_908fa5', created_at='2026-04-15 04:16:05.552', type='chat', usage=None, metadata=None)

.. GENERATED FROM PYTHON SOURCE LINES 106-107

Taking ``DashScopeChatModel`` as an example, we can create a chat model instance and call it with messages and tools:

.. GENERATED FROM PYTHON SOURCE LINES 107-132

.. code-block:: Python

    async def example_model_call() -> None:
        """An example of using the DashScopeChatModel."""
        model = DashScopeChatModel(
            model_name="qwen-max",
            api_key=os.environ["DASHSCOPE_API_KEY"],
            stream=False,
        )

        res = await model(
            messages=[
                {"role": "user", "content": "Hi!"},
            ],
        )

        # You can directly create a ``Msg`` object with the response content
        msg_res = Msg("Friday", res.content, "assistant")

        print("The response:", res)
        print("The response as Msg:", msg_res)


    asyncio.run(example_model_call())

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    The response: ChatResponse(content=[{'type': 'text', 'text': 'Hello! How can I assist you today?'}], id='f1c9be2a-bb89-9409-93de-0271de1d0c0a', created_at='2026-04-15 04:16:06.936', type='chat', usage=ChatUsage(input_tokens=10, output_tokens=9, time=1.383779, type='chat', metadata=GenerationUsage(input_tokens=10, output_tokens=9)), metadata=None)
    The response as Msg: Msg(id='eiVRQg45FJYbbsTAxQywMB', name='Friday', content=[{'type': 'text', 'text': 'Hello! How can I assist you today?'}], role='assistant', metadata={}, timestamp='2026-04-15 04:16:06.937', invocation_id='None')

.. GENERATED FROM PYTHON SOURCE LINES 133-140

Streaming
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To enable streaming mode, set the ``stream`` parameter in the model constructor to ``True``.
When streaming is enabled, the ``__call__`` method returns an **async generator** that yields ``ChatResponse`` instances as the model generates them.

.. note::
    The streaming mode in AgentScope is designed to be **cumulative**, meaning the content in each chunk contains all the previous content plus the newly generated content.

.. GENERATED FROM PYTHON SOURCE LINES 140-170

.. code-block:: Python

    async def example_streaming() -> None:
        """An example of using the streaming model."""
        model = DashScopeChatModel(
            model_name="qwen-max",
            api_key=os.environ["DASHSCOPE_API_KEY"],
            stream=True,
        )

        generator = await model(
            messages=[
                {
                    "role": "user",
                    "content": "Count from 1 to 20, and just report the number without any other information.",
                },
            ],
        )
        print("The type of the response:", type(generator))

        i = 0
        async for chunk in generator:
            print(f"Chunk {i}")
            print(f"\ttype: {type(chunk.content)}")
            print(f"\t{chunk}\n")
            i += 1


    asyncio.run(example_streaming())

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    The type of the response: <class 'async_generator'>
    Chunk 0
        type: <class 'list'>
        ChatResponse(content=[{'type': 'text', 'text': '1'}], id='c49746a7-d4b2-94a5-93eb-06a26335eb67', created_at='2026-04-15 04:16:08.439', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=1, time=1.500653, type='chat', metadata=GenerationUsage(input_tokens=27, output_tokens=1)), metadata=None)

    Chunk 1
        type: <class 'list'>
        ChatResponse(content=[{'type': 'text', 'text': '1\n2\n'}], id='c49746a7-d4b2-94a5-93eb-06a26335eb67', created_at='2026-04-15 04:16:08.530', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=4, time=1.591905, type='chat', metadata=GenerationUsage(input_tokens=27, output_tokens=4)), metadata=None)

    Chunk 2
        type: <class 'list'>
        ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4'}], id='c49746a7-d4b2-94a5-93eb-06a26335eb67', created_at='2026-04-15 04:16:08.637', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=7, time=1.699145, type='chat', metadata=GenerationUsage(input_tokens=27, output_tokens=7)), metadata=None)

    Chunk 3
        type: <class 'list'>
        ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n'}], id='c49746a7-d4b2-94a5-93eb-06a26335eb67', created_at='2026-04-15 04:16:08.731', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=10, time=1.792784, type='chat', metadata=GenerationUsage(input_tokens=27, output_tokens=10)), metadata=None)

    Chunk 4
        type: <class 'list'>
        ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n'}], id='c49746a7-d4b2-94a5-93eb-06a26335eb67', created_at='2026-04-15 04:16:08.949', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=16, time=2.010623, type='chat', metadata=GenerationUsage(input_tokens=27, output_tokens=16)), metadata=None)

    Chunk 5
        type: <class 'list'>
        ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n1'}], id='c49746a7-d4b2-94a5-93eb-06a26335eb67', created_at='2026-04-15 04:16:09.119', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=22, time=2.180876, type='chat', metadata=GenerationUsage(input_tokens=27, output_tokens=22)), metadata=None)

    Chunk 6
        type: <class 'list'>
        ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n1'}], id='c49746a7-d4b2-94a5-93eb-06a26335eb67', created_at='2026-04-15 04:16:09.304', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=28, time=2.36589, type='chat', metadata=GenerationUsage(input_tokens=27, output_tokens=28)), metadata=None)

    Chunk 7
        type: <class 'list'>
        ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14\n1'}], id='c49746a7-d4b2-94a5-93eb-06a26335eb67', created_at='2026-04-15 04:16:09.643', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=34, time=2.70504, type='chat', metadata=GenerationUsage(input_tokens=27, output_tokens=34)), metadata=None)

    Chunk 8
        type: <class 'list'>
        ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14\n15\n16\n1'}], id='c49746a7-d4b2-94a5-93eb-06a26335eb67', created_at='2026-04-15 04:16:10.274', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=40, time=3.336189, type='chat', metadata=GenerationUsage(input_tokens=27, output_tokens=40)), metadata=None)

    Chunk 9
        type: <class 'list'>
        ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14\n15\n16\n17\n18\n1'}], id='c49746a7-d4b2-94a5-93eb-06a26335eb67', created_at='2026-04-15 04:16:10.471', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=46, time=3.532911, type='chat', metadata=GenerationUsage(input_tokens=27, output_tokens=46)), metadata=None)

    Chunk 10
        type: <class 'list'>
        ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14\n15\n16\n17\n18\n19\n20'}], id='c49746a7-d4b2-94a5-93eb-06a26335eb67', created_at='2026-04-15 04:16:10.698', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=50, time=3.759568, type='chat', metadata=GenerationUsage(input_tokens=27, output_tokens=50)), metadata=None)

    Chunk 11
        type: <class 'list'>
        ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14\n15\n16\n17\n18\n19\n20'}], id='c49746a7-d4b2-94a5-93eb-06a26335eb67', created_at='2026-04-15 04:16:10.716', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=50, time=3.777345, type='chat', metadata=GenerationUsage(input_tokens=27, output_tokens=50)), metadata=None)

.. GENERATED FROM PYTHON SOURCE LINES 171-175

Reasoning
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

AgentScope supports reasoning models by providing the ``ThinkingBlock``.

.. GENERATED FROM PYTHON SOURCE LINES 175-200

.. code-block:: Python

    async def example_reasoning() -> None:
        """An example of using the reasoning model."""
        model = DashScopeChatModel(
            model_name="qwen-turbo",
            api_key=os.environ["DASHSCOPE_API_KEY"],
            enable_thinking=True,
        )

        res = await model(
            messages=[
                {"role": "user", "content": "Who am I?"},
            ],
        )

        last_chunk = None
        async for chunk in res:
            last_chunk = chunk
        print("The final response:")
        print(last_chunk)


    asyncio.run(example_reasoning())

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    The final response:
    ChatResponse(content=[{'type': 'thinking', 'thinking': 'Okay, the user asked "Who am I?" That\'s a pretty broad question. Let me think about how to approach this.\n\nFirst, I need to consider the context. The user might be asking about their identity in a personal sense, but since they\'re interacting with an AI, maybe they\'re curious about the nature of the AI itself. However, the question is phrased as "Who am I?" which typically refers to the person asking the question.\n\nBut since I\'m an AI, I can\'t know the user\'s identity unless they provide information. So I should respond by asking for more details. However, the user might be testing me or just being philosophical. \n\nI should also check if there\'s any cultural or linguistic nuance I\'m missing. In some contexts, "Who am I?" could be a rhetorical question or part of a larger conversation. But without more context, it\'s hard to say.\n\nI need to make sure my response is helpful and guides the user to provide more information if needed. Maybe ask them to clarify what they mean by "who am I?" and offer to help based on the context they provide.\n\nAlso, considering privacy, I shouldn\'t make assumptions about the user\'s identity. It\'s important to be respectful and not overstep. So the best approach is to ask for clarification and offer assistance in a way that\'s open-ended.'}, {'type': 'text', 'text': 'The question "Who am I?" is profound and can be interpreted in many ways. Here are a few possibilities based on context:\n\n1. **Philosophical/Existential**: If you\'re asking about your identity in a broader sense, it relates to self-awareness, purpose, and the essence of your being. This is a timeless question explored by philosophers, scientists, and spiritual traditions.\n\n2. **Personal/Individual**: If you\'re seeking to understand your own identity, it involves reflecting on your values, experiences, relationships, and goals. It’s a deeply personal journey.\n\n3. **AI Context**: If you’re asking about *me* (Qwen), I am an AI assistant designed to help with information, tasks, and conversations. My "identity" is defined by my programming, training data, and purpose to assist users like you.\n\n4. **Rhetorical/Playful**: Sometimes the question is used to provoke thought or express confusion. If that’s the case, feel free to share more about what you’re reflecting on!\n\nIf you’d like to explore this further, could you clarify what aspect of "who I am" you’re curious about? I’m here to help! 😊'}], id='b9e71429-60c7-97e6-9f0e-e80ad38b834e', created_at='2026-04-15 04:16:19.133', type='chat', usage=ChatUsage(input_tokens=12, output_tokens=527, time=8.413336, type='chat', metadata=GenerationUsage(input_tokens=12, output_tokens=527)), metadata=None)

.. GENERATED FROM PYTHON SOURCE LINES 201-209

Tools API
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Model providers differ in their tools APIs, e.g. in the tools JSON schema and in the tool call/response format.
To provide a unified interface, AgentScope addresses this by:

- Providing a unified tool call block, ``ToolUseBlock``, and a unified tool response block, ``ToolResultBlock``.
- Providing a unified tools interface in the ``__call__`` method of the model classes, which accepts a list of tool JSON schemas as follows:

.. GENERATED FROM PYTHON SOURCE LINES 209-230

.. code-block:: Python

    json_schemas = [
        {
            "type": "function",
            "function": {
                "name": "google_search",
                "description": "Search for a query on Google.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {
                            "type": "string",
                            "description": "The search query.",
                        },
                    },
                    "required": ["query"],
                },
            },
        },
    ]

.. GENERATED FROM PYTHON SOURCE LINES 231-236

Further Reading
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- :ref:`message`
- :ref:`prompt`

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 13.589 seconds)

.. _sphx_glr_download_tutorial_task_model.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: task_model.ipynb <task_model.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: task_model.py <task_model.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: task_model.zip <task_model.zip>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
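Because the streaming mode described in the Streaming section is cumulative, a consumer that only wants the newly generated text has to subtract what it has already seen. The following is a minimal sketch of that idea. It does not call AgentScope: ``fake_cumulative_stream`` is a hypothetical stand-in that yields plain cumulative strings, and ``collect_deltas`` is an illustrative helper name, not part of the library.

.. code-block:: python

    import asyncio
    from typing import AsyncGenerator, List


    async def fake_cumulative_stream() -> AsyncGenerator[str, None]:
        """Hypothetical stand-in for a streaming model.

        Each yielded string contains all previous text plus the new text,
        mirroring the cumulative chunks shown in the Streaming section.
        """
        for text in ["1", "1\n2\n", "1\n2\n3\n4"]:
            yield text


    async def collect_deltas() -> List[str]:
        """Recover only the newly generated part of each cumulative chunk."""
        deltas: List[str] = []
        seen = ""  # text received so far
        async for text in fake_cumulative_stream():
            # The new part is whatever follows the text already seen.
            deltas.append(text[len(seen):])
            seen = text
        return deltas


    deltas = asyncio.run(collect_deltas())
    print(deltas)

Joining the recovered deltas reproduces the last cumulative chunk, which is why keeping only the final chunk is enough when just the complete response is needed.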