Streaming Mode
AgentScope supports streaming output for the following APIs in both terminal and AgentScope Studio.
API | Class | Streaming
---|---|---
OpenAI Chat API | OpenAIChatWrapper | ✓
DashScope Chat API | DashScopeChatWrapper | ✓
Gemini Chat API | GeminiChatWrapper | ✓
ZhipuAI Chat API | ZhipuAIChatWrapper | ✓
Ollama Chat API | OllamaChatWrapper | ✓
LiteLLM Chat API | LiteLLMChatWrapper | ✓
Anthropic Chat API | AnthropicChatWrapper | ✓
This section will show how to enable streaming mode in AgentScope and handle the streaming response within an agent.
Enabling Streaming Output
AgentScope supports streaming output via a stream parameter in the model wrapper classes. You can specify the stream parameter either at initialization or in the model configuration.
Specifying in Initialization
import os

from agentscope.models import DashScopeChatWrapper

model = DashScopeChatWrapper(
    config_name="_",
    model_name="qwen-max",
    api_key=os.environ["DASHSCOPE_API_KEY"],
    stream=True,  # Enable streaming output
)
Specifying in Configuration
model_config = {
    "model_type": "dashscope_chat",
    "config_name": "qwen_config",
    "model_name": "qwen-max",
    "stream": True,
}
With the above configuration, we can obtain streaming output with built-in agents in AgentScope.
Next, we show how to handle the streaming output within an agent.
Handling Streaming Response
Once we enable the streaming output, the returned model response will contain a generator in its stream field.
prompt = [{"role": "user", "content": "Hi!"}]
response = model(prompt)
print("The type of response.stream:", type(response.stream))
The type of response.stream: <class 'generator'>
We can iterate over the generator to get the streaming text. Each yielded chunk is a tuple of a boolean, indicating whether the current chunk is the last one, and the text accumulated so far.
for index, chunk in enumerate(response.stream):
    print(f"{index}.", chunk)
    print("Current text field:", response.text)
0. (False, 'Hello')
Current text field: Hello
1. (False, 'Hello! How can I')
Current text field: Hello! How can I
2. (False, 'Hello! How can I assist you today?')
Current text field: Hello! How can I assist you today?
3. (True, 'Hello! How can I assist you today?')
Current text field: Hello! How can I assist you today?
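The contract shown above, where each chunk is an (is_last, accumulated_text) tuple and the generator is consumed incrementally, can be emulated with a plain Python generator. The mock_stream function and its chunk texts below are illustrative stand-ins, not part of AgentScope:

```python
from typing import Generator, Iterable, Tuple


def mock_stream(pieces: Iterable[str]) -> Generator[Tuple[bool, str], None, None]:
    """Yield (is_last, accumulated_text) tuples, mimicking response.stream."""
    pieces = list(pieces)
    accumulated = ""
    for i, piece in enumerate(pieces):
        accumulated += piece
        yield (i == len(pieces) - 1, accumulated)


stream = mock_stream(["Hello", "! How can I", " assist you today?"])
chunks = list(stream)
print(chunks[-1])    # The last chunk carries the full text
print(list(stream))  # A generator is one-time: now exhausted -> []
```

Note how the second iteration yields nothing: once consumed, the generator is exhausted, which is exactly why the real response.stream can only be iterated once.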
Note
Note that the generator is incremental and can only be iterated once.
During iteration, the text field of the response concatenates the yielded substrings automatically.
To stay compatible with non-streaming mode, you can also use response.text directly to obtain all the text at once.
prompt = [{"role": "user", "content": "Hi!"}]
response = model(prompt)
print(response.text)
Hello! How can I assist you today?
Displaying Like a Typewriter
To display the streaming text like a typewriter, AgentScope provides a speak function within the AgentBase class. If a generator is given, the speak function will iterate over the generator and print the text like a typewriter in the terminal or in AgentScope Studio.
def reply(self, *args, **kwargs):
    # ...
    self.speak(response.stream)
    # ...
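Under the hood, a typewriter-style printer only needs to emit the delta between successive accumulated chunks. The following is a minimal sketch of that idea under the (is_last, accumulated_text) contract, not AgentScope's actual speak implementation:

```python
import sys
from typing import Iterator, Tuple


def typewriter(stream: Iterator[Tuple[bool, str]]) -> str:
    """Print only the newly generated suffix of each chunk; return the full text."""
    printed = ""
    for _is_last, accumulated in stream:
        delta = accumulated[len(printed):]  # Only the text not yet printed
        sys.stdout.write(delta)
        sys.stdout.flush()
        printed = accumulated
    sys.stdout.write("\n")
    return printed


# A hand-written stream of (is_last, accumulated_text) tuples:
chunks = [(False, "Hello"), (False, "Hello! How"), (True, "Hello! How are you?")]
full_text = typewriter(iter(chunks))
```

Because each chunk carries the accumulated text rather than a fragment, computing the suffix also handles a final chunk that repeats the full text: its delta is simply empty.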
To be compatible with both streaming and non-streaming mode, we use the following code snippet for all built-in agents in AgentScope.
def reply(self, *args, **kwargs):
    # ...
    self.speak(response.stream or response.text)
    # ...
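The or fallback works because the stream field is None when streaming is disabled, so the expression picks the generator when one is present and falls back to the plain text otherwise. A minimal stand-in (FakeResponse is a hypothetical class for illustration, not AgentScope's actual response type):

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Tuple


@dataclass
class FakeResponse:
    """Hypothetical stand-in for a model response with optional streaming."""
    text: str
    stream: Optional[Iterator[Tuple[bool, str]]] = None


streaming = FakeResponse(text="", stream=iter([(True, "Hi!")]))
non_streaming = FakeResponse(text="Hi!")

# stream or text: the generator wins when present, the text otherwise.
assert (streaming.stream or streaming.text) is streaming.stream
assert (non_streaming.stream or non_streaming.text) == "Hi!"
```

This lets a single speak(response.stream or response.text) call serve both modes without an explicit if/else branch.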
Total running time of the script: (0 minutes 2.699 seconds)