Streaming Mode

AgentScope supports streaming output for the following APIs in both terminal and AgentScope Studio.

API

Class

Streaming

OpenAI Chat API

OpenAIChatWrapper

DashScope Chat API

DashScopeChatWrapper

Gemini Chat API

GeminiChatWrapper

ZhipuAI Chat API

ZhipuAIChatWrapper

Ollama Chat API

OllamaChatWrapper

LiteLLM Chat API

LiteLLMChatWrapper

Anthropic Chat API

AnthropicChatWrapper

This section will show how to enable streaming mode in AgentScope and handle the streaming response within an agent.

Enabling Streaming Output

AgentScope supports streaming output by providing a stream parameter in model wrapper class. You can directly specify the stream parameter in initialization or configuration.

  • Specifying in Initialization

from agentscope.models import DashScopeChatWrapper
import os

model = DashScopeChatWrapper(
    config_name="_",
    model_name="qwen-max",
    api_key=os.environ["DASHSCOPE_API_KEY"],
    stream=True,  # Enabling the streaming output
)
  • Specifying in Configuration

model_config = {
    "model_type": "dashscope_chat",
    "config_name": "qwen_config",
    "model_name": "qwen-max",
    "stream": True,
}

With the above configuration, we can obtain streaming output with built-in agents in AgentScope.

Next, we show how to handle the streaming output within an agent.

Handling Streaming Response

Once we enable the streaming output, the returned model response will contain a generator in its stream field.

prompt = [{"role": "user", "content": "Hi!"}]

response = model(prompt)
print("The type of response.stream:", type(response.stream))
The type of response.stream: <class 'generator'>

We can iterate over the generator to get the streaming text. A boolean value will also be yielded to indicate whether the current chunk is the last one.

for index, chunk in enumerate(response.stream):
    print(f"{index}.", chunk)
    print(f"Current text field:", response.text)
0. (False, 'Hello')
Current text field: Hello
1. (False, 'Hello! How can I')
Current text field: Hello! How can I
2. (False, 'Hello! How can I assist you today?')
Current text field: Hello! How can I assist you today?
3. (True, 'Hello! How can I assist you today?')
Current text field: Hello! How can I assist you today?

Note

Note the generator is incremental and one-time.

During the iterating, the text field in the response will concatenate sub strings automatically.

To be compatible with non-streaming mode, you can also directly use response.text to obtain all text at once.

prompt = [{"role": "user", "content": "Hi!"}]
response = model(prompt)
print(response.text)
Hello! How can I assist you today?

Displaying Like Typewriter

To display the streaming text like a typewriter, AgentScope provides a speak function within the AgentBase class. If a generator is given, the speak function will iterate over the generator and print the text like a typewriter in terminal or AgentScope Studio.

def reply(*args, **kwargs):
    # ...
    self.speak(response.stream)
    # ...

To be compatible with both streaming and non-streaming mode, we use the following code snippet for all built-in agents in AgentScope.

def reply(*args, **kwargs):
    # ...
    self.speak(response.stream or response.text)
    # ...

Total running time of the script: (0 minutes 2.699 seconds)

Gallery generated by Sphinx-Gallery