agentscope.service.multi_modality.dashscope_services module
Use the DashScope API to generate images, convert text to audio, and convert images to text. Refer to the official documentation for more details: https://dashscope.aliyun.com/
- dashscope_image_to_text(image_urls: str | Sequence[str], api_key: str, prompt: str = 'Describe the image', model: str = 'qwen-vl-plus') → ServiceResponse
Generate text based on the given images.
- Parameters:
image_urls (Union[str, Sequence[str]]) – The URL of a single image, or a sequence of URLs for multiple images.
api_key (str) – The API key for the DashScope API.
prompt (str, defaults to ‘Describe the image’) – The text prompt.
model (str, defaults to ‘qwen-vl-plus’) – The model to use in DashScope MultiModal API.
- Returns:
A dictionary with two fields: `status` and `content`. If `status` is ServiceExecStatus.SUCCESS, `content` is the generated text.
- Return type:
ServiceResponse
Example
image_url = "image.jpg"
prompt = "Describe the image"
print(dashscope_image_to_text(image_url, "{api_key}", prompt))
> {'status': 'SUCCESS', 'content': 'A beautiful sunset in the mountains'}
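The status check described under Returns can be sketched as follows. This is a minimal illustration rather than the library's own code: `Status` is a stand-in for `ServiceExecStatus`, the response is a plain dictionary standing in for `ServiceResponse`, and `text_or_raise` is a hypothetical helper. It also assumes that `content` carries an error description when the call fails, which this page does not state.

```python
from enum import Enum
from typing import Any, Mapping


class Status(Enum):
    """Stand-in for ServiceExecStatus (assumption, so the sketch runs offline)."""

    SUCCESS = "SUCCESS"
    ERROR = "ERROR"


def text_or_raise(response: Mapping[str, Any]) -> str:
    """Return the generated text, or raise if the call did not succeed."""
    if response["status"] is not Status.SUCCESS:
        # Assumption: on failure, content holds an error description.
        raise RuntimeError(f"image-to-text failed: {response['content']}")
    return str(response["content"])


# Stand-in for the value dashscope_image_to_text would return.
fake_response = {
    "status": Status.SUCCESS,
    "content": "A beautiful sunset in the mountains",
}
caption = text_or_raise(fake_response)
print(caption)
```

In real code the comparison would be made against `ServiceExecStatus.SUCCESS` on the object returned by `dashscope_image_to_text`.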
- dashscope_text_to_audio(text: str, api_key: str, save_dir: str, model: str = 'sambert-zhichu-v1', sample_rate: int = 48000) → ServiceResponse
Convert the given text to audio.
- Parameters:
text (str) – The text to be converted into audio.
api_key (str) – The API key for the DashScope API.
save_dir (str) – The directory to save the generated audio.
model (str, defaults to ‘sambert-zhichu-v1’) – The model to use. The full model list is available at https://help.aliyun.com/zh/dashscope/model-list
sample_rate (int, defaults to 48000) – Sample rate of the generated audio, in Hz.
- Returns:
A dictionary with two fields: `status` and `content`. If `status` is ServiceExecStatus.SUCCESS, `content` is a dictionary whose key “audio_path” maps to the path of the generated audio file.
- Return type:
ServiceResponse
Example
text = "How is the weather today?"
print(dashscope_text_to_audio(text, "{api_key}", "{save_dir}"))
> {'status': 'SUCCESS', 'content': {'audio_path': 'AUDIO_PATH'}}
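Since `save_dir` is a required argument, a caller will usually prepare the directory first and then read `audio_path` from the returned content. The sketch below shows that flow under stated assumptions: `prepare_save_dir` and `audio_path_from` are hypothetical helpers, the file name `speech.wav` is invented for illustration, and the content dictionary is a stand-in for what a successful call returns. This page does not say whether the service creates a missing directory itself, so the sketch creates it up front.

```python
import os
import tempfile
from typing import Any, Mapping


def prepare_save_dir(base: str, name: str = "audio") -> str:
    """Create (if needed) and return the directory to pass as save_dir."""
    save_dir = os.path.join(base, name)
    os.makedirs(save_dir, exist_ok=True)
    return save_dir


def audio_path_from(content: Mapping[str, Any]) -> str:
    """Extract the audio path from the content of a successful response."""
    path = content.get("audio_path")
    if not isinstance(path, str) or not path:
        raise KeyError("audio_path missing from response content")
    return path


with tempfile.TemporaryDirectory() as tmp:
    save_dir = prepare_save_dir(tmp)
    dir_ready = os.path.isdir(save_dir)
    # Stand-in for the content of a successful dashscope_text_to_audio call;
    # the file name is hypothetical.
    fake_content = {"audio_path": os.path.join(save_dir, "speech.wav")}
    audio_path = audio_path_from(fake_content)

print(audio_path)
```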
- dashscope_text_to_image(prompt: str, api_key: str, n: int = 1, size: Literal['1024*1024', '720*1280', '1280*720'] = '1024*1024', model: str = 'wanx-v1', save_dir: str | None = None) → ServiceResponse
Generate image(s) based on the given prompt, and return image url(s).
- Parameters:
prompt (str) – The text prompt used to generate the image(s).
api_key (str) – The API key for the DashScope API.
n (int, defaults to 1) – The number of images to generate.
size (Literal["1024*1024", "720*1280", "1280*720"], defaults to "1024*1024") – Size of the image.
model (str, defaults to ‘wanx-v1’) – The model to use.
save_dir (Optional[str], defaults to None) – The directory in which to save the generated images. If not specified, the web URLs of the images are returned instead.
- Returns:
A dictionary with two fields: `status` and `content`. If `status` is ServiceExecStatus.SUCCESS, `content` is a dict with key ‘fig_paths’ whose value is a list of the paths to the generated images.
- Return type:
ServiceResponse
Example
prompt = "A beautiful sunset in the mountains"
print(dashscope_text_to_image(prompt, "{api_key}"))
> {
>     'status': 'SUCCESS',
>     'content': {'image_urls': ['IMAGE_URL1', 'IMAGE_URL2']}
> }
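Because `size` accepts only three literal values, it can be useful to validate arguments locally before spending an API call. The following is a minimal sketch of such a check: `ImageSize` mirrors the `Literal` in the signature above, and `check_image_request` is a hypothetical helper, not part of the module. The lower bound on `n` is an assumption, and any upper limit on `n` is not documented here, so it is not checked.

```python
from typing import Literal, get_args

# Mirrors the Literal type of the `size` parameter in the signature.
ImageSize = Literal["1024*1024", "720*1280", "1280*720"]


def check_image_request(n: int, size: str) -> None:
    """Raise ValueError if the arguments cannot form a valid request."""
    if n < 1:
        # Assumption: at least one image must be requested.
        raise ValueError(f"n must be at least 1, got {n}")
    allowed = get_args(ImageSize)
    if size not in allowed:
        raise ValueError(f"size must be one of {allowed}, got {size!r}")


# Passes silently for the documented defaults.
check_image_request(n=1, size="1024*1024")

# An unsupported size is rejected before any request is sent.
try:
    check_image_request(n=1, size="512*512")
except ValueError as exc:
    print(exc)
```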