agentscope.tuner

The learning module of AgentScope, covering reinforcement learning (RL) and supervised fine-tuning (SFT).

tune(*, workflow_func, judge_func=None, train_dataset=None, eval_dataset=None, model=None, auxiliary_models=None, algorithm=None, project_name=None, experiment_name=None, monitor_type=None, config_path=None)

Train the agent workflow with the specified configuration.

Parameters:

workflow_func (WorkflowType) – The learning workflow function to execute.
judge_func (JudgeType, optional) – The judge function used to evaluate the workflow output. Defaults to None.
train_dataset (DatasetConfig, optional) – The training dataset for the learning process. Defaults to None.
eval_dataset (DatasetConfig, optional) – The evaluation dataset for the learning process. Defaults to None.
model (TunerModelConfig, optional) – The model to be tuned. Defaults to None.
auxiliary_models (dict[str, TunerModelConfig], optional) – A dictionary of auxiliary models, used for LLM-as-a-Judge or for acting as other agents in multi-agent scenarios. Defaults to None.
algorithm (AlgorithmConfig, optional) – The tuning algorithm configuration. Defaults to None.
project_name (str, optional) – Name of the project. Defaults to None.
experiment_name (str, optional) – Name of the experiment. Leave as None to use a timestamp. Defaults to None.
monitor_type (str, optional) – Type of monitor to use. One of ‘tensorboard’, ‘wandb’, ‘mlflow’, or ‘swanlab’. Leave as None to use tensorboard. Defaults to None.
config_path (str, optional) – Path to a Trinity YAML configuration file. If provided, only workflow_func is required; any other arguments that are passed override the corresponding fields in the config. Defaults to None.

Return type:

None
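
The precedence rule described for config_path can be pictured with plain dictionaries. This is only a sketch of the override idea: apply_overrides and the field names are hypothetical, it assumes that an argument left as None keeps the value from the file, and it does not reflect how AgentScope or Trinity merges configurations internally.

```python
from typing import Any, Dict


def apply_overrides(file_config: Dict[str, Any], **overrides: Any) -> Dict[str, Any]:
    """Return a copy of file_config in which non-None overrides win.

    Hypothetical helper: it illustrates the precedence rule only and is
    not part of agentscope.tuner.
    """
    merged = dict(file_config)
    for key, value in overrides.items():
        if value is not None:  # assumption: None means "keep the file value"
            merged[key] = value
    return merged


# Values that would come from a YAML file (hypothetical field names).
file_config = {"project_name": "from-yaml", "monitor_type": "tensorboard"}

merged = apply_overrides(file_config, project_name="my-project", monitor_type=None)
print(merged)
```

Here the explicitly passed project_name replaces the value from the file, while monitor_type, left as None, keeps the file's setting.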

class AlgorithmConfig

Bases: BaseModel

Algorithm configuration for tuning.

algorithm_type: str

learning_rate: float

group_size: int

batch_size: int

save_interval_steps: int

eval_interval_steps: int

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class WorkflowOutput

Bases: BaseModel

The output of a workflow function.

reward: float | None

response: Any | None

metrics: Dict[str, float] | None

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
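
To show how the three fields fit together, here is a minimal sketch of a workflow function that returns a response and a metric and leaves the reward to a judge. It runs without AgentScope installed: WorkflowOutputSketch is a stand-in dataclass that mirrors the documented fields, and the None defaults, the async form, the single task argument, and echo_workflow itself are all assumptions, since the exact WorkflowType signature is not given here.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class WorkflowOutputSketch:
    """Stand-in mirroring the documented WorkflowOutput fields."""

    reward: Optional[float] = None  # assumption: may be left unset
    response: Optional[Any] = None
    metrics: Optional[Dict[str, float]] = None


async def echo_workflow(task: Dict[str, Any]) -> WorkflowOutputSketch:
    """Hypothetical workflow that 'answers' by echoing the question.

    A real workflow would call the model being tuned at this point.
    """
    answer = task["question"].strip()
    return WorkflowOutputSketch(
        response=answer,
        metrics={"response_chars": float(len(answer))},
    )


output = asyncio.run(echo_workflow({"question": " What is 2 + 2? "}))
print(output.reward, output.response, output.metrics)
```

The reward is left as None in this sketch on the assumption that a separate judge function supplies it.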

class JudgeOutput

Bases: BaseModel

The output of a judge function.

reward: float

metrics: Dict[str, float] | None

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
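
A judge turns a workflow response into a numeric reward. The sketch below uses exact-match scoring as one simple possibility. JudgeOutputSketch is a stand-in dataclass mirroring the documented fields so the example runs without AgentScope; exact_match_judge, its (task, response) arguments, and the "answer" key are hypothetical, and the real JudgeType signature may differ.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class JudgeOutputSketch:
    """Stand-in mirroring the documented JudgeOutput fields."""

    reward: float
    metrics: Optional[Dict[str, float]] = None


def exact_match_judge(task: Dict[str, Any], response: Any) -> JudgeOutputSketch:
    """Hypothetical judge: reward 1.0 for an exact match, else 0.0."""
    correct = str(response).strip() == str(task["answer"]).strip()
    return JudgeOutputSketch(
        reward=1.0 if correct else 0.0,
        metrics={"exact_match": float(correct)},
    )


task = {"question": "What is 2 + 2?", "answer": "4"}
good = exact_match_judge(task, " 4 ")
bad = exact_match_judge(task, "5")
print(good.reward, bad.reward)
```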

class DatasetConfig

Bases: BaseModel

Dataset configuration for tuning. Compatible with the Hugging Face dataset format. AgentScope loads the dataset from the given path using datasets.load_dataset.

path: str

name: str | None

split: str | None

total_epochs: int

total_steps: int | None

preview(n=5)

Preview the dataset information.

Parameters:

n (int) – Number of samples to preview. Defaults to 5.

Return type:

List

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
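
Because loading goes through datasets.load_dataset, the path, name, and split fields plausibly correspond to that function's parameters of the same names. The sketch below builds the keyword arguments without touching the network. DatasetConfigSketch and load_dataset_kwargs are hypothetical stand-ins, the one-to-one forwarding is an assumption, and the dataset identifier is just an illustrative value.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class DatasetConfigSketch:
    """Stand-in with the loading-related DatasetConfig fields."""

    path: str
    name: Optional[str] = None
    split: Optional[str] = None


def load_dataset_kwargs(cfg: DatasetConfigSketch) -> Dict[str, Any]:
    """Build keyword arguments for datasets.load_dataset.

    Assumption: the three fields are forwarded unchanged.
    """
    return {"path": cfg.path, "name": cfg.name, "split": cfg.split}


cfg = DatasetConfigSketch(path="openai/gsm8k", name="main", split="train")
kwargs = load_dataset_kwargs(cfg)
print(kwargs)

# With the `datasets` package installed, the equivalent call would be:
#   datasets.load_dataset(**kwargs)
```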

class TunerModelConfig

Bases: BaseModel

Model configuration for tuning.

model_path: str

max_model_len: int

temperature: float

top_p: float

max_tokens: int

enable_thinking: bool | None

tensor_parallel_size: int

inference_engine_num: int

tool_call_parser: str

reasoning_parser: str

get_config()

Get the model configuration.

Returns:

The model configuration dictionary.

Return type:

Dict[str, Any]

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

check_workflow_function(func)

Check if the given function is a valid WorkflowType.

Parameters:

func (Callable) – The function to check.

Return type:

None

check_judge_function(func)

Check if the given function is a valid JudgeType.

Parameters:

func (Callable) – The function to check.

Return type:

None