agentscope.tuner
The learning module of AgentScope, covering reinforcement learning (RL) and supervised fine-tuning (SFT).
- tune(*, workflow_func, judge_func=None, train_dataset=None, eval_dataset=None, model=None, auxiliary_models=None, algorithm=None, project_name=None, experiment_name=None, monitor_type=None, config_path=None)
Train the agent workflow with the specified configuration.
- Parameters:
workflow_func (WorkflowType) – The learning workflow function to execute.
judge_func (JudgeType, optional) – The judge function used to evaluate the workflow output. Defaults to None.
train_dataset (DatasetConfig, optional) – The training dataset for the learning process. Defaults to None.
eval_dataset (DatasetConfig, optional) – The evaluation dataset for the learning process. Defaults to None.
model (TunerModelConfig, optional) – The model to be tuned. Defaults to None.
auxiliary_models (dict[str, TunerModelConfig], optional) – A dictionary of auxiliary models, used for LLM-as-a-Judge or to act as other agents in multi-agent scenarios. Defaults to None.
algorithm (AlgorithmConfig, optional) – The tuning algorithm configuration. Defaults to None.
project_name (str, optional) – Name of the project. Defaults to None.
experiment_name (str, optional) – Name of the experiment. Leave as None to use a timestamp. Defaults to None.
monitor_type (str, optional) – Type of monitor to use. One of ‘tensorboard’, ‘wandb’, ‘mlflow’, ‘swanlab’. Leave as None to use tensorboard. Defaults to None.
config_path (str, optional) – Path to a Trinity YAML configuration file. If provided, only workflow_func is required; any other arguments override the corresponding fields in the config. Defaults to None.
- Return type:
None
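The override and fallback rules described for config_path, experiment_name and monitor_type can be sketched in plain Python. This is not the library's implementation: resolve_run_settings, the flat dictionary layout, and the timestamp format are all hypothetical, and the sketch assumes that an argument left as None means "not provided", so the value from the configuration file is kept.

```python
from datetime import datetime
from typing import Any, Dict, Optional

# Monitor names listed in the monitor_type description.
SUPPORTED_MONITORS = {"tensorboard", "wandb", "mlflow", "swanlab"}


def resolve_run_settings(
    file_config: Dict[str, Any],
    *,
    project_name: Optional[str] = None,
    experiment_name: Optional[str] = None,
    monitor_type: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: merge explicit arguments over a loaded config.

    Assumption: None means "not provided", so the file value is kept.
    """
    settings = dict(file_config)  # copy so the caller's dict is untouched
    explicit = {
        "project_name": project_name,
        "experiment_name": experiment_name,
        "monitor_type": monitor_type,
    }
    for key, value in explicit.items():
        if value is not None:
            settings[key] = value  # explicit argument wins over the file

    # Documented fallbacks: tensorboard monitor, timestamp experiment name.
    if settings.get("monitor_type") is None:
        settings["monitor_type"] = "tensorboard"
    if settings.get("experiment_name") is None:
        # The timestamp format here is an assumption for illustration.
        settings["experiment_name"] = datetime.now().strftime("%Y%m%d-%H%M%S")

    if settings["monitor_type"] not in SUPPORTED_MONITORS:
        raise ValueError(f"Unsupported monitor_type: {settings['monitor_type']!r}")
    return settings


file_config = {"project_name": "from-yaml", "monitor_type": "wandb"}
resolved = resolve_run_settings(file_config, project_name="my-project")
print(resolved["project_name"], resolved["monitor_type"])
```

Here the explicit project_name replaces the file value, while monitor_type is taken from the file because no argument was passed for it.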
- class AlgorithmConfig
Bases: BaseModel
Algorithm configuration for tuning.
- algorithm_type: str
- learning_rate: float
- group_size: int
- batch_size: int
- save_interval_steps: int
- eval_interval_steps: int
- model_config: ClassVar[ConfigDict] = {}
Pydantic model configuration; should be a dictionary conforming to pydantic.config.ConfigDict.
- class WorkflowOutput
Bases: BaseModel
The output of a workflow function.
- reward: float | None
- response: Any | None
- metrics: Dict[str, float] | None
- model_config: ClassVar[ConfigDict] = {}
Pydantic model configuration; should be a dictionary conforming to pydantic.config.ConfigDict.
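As a rough illustration of the shape of a workflow result, the sketch below uses a stand-in dataclass that mirrors the three WorkflowOutput fields, so it runs without AgentScope installed. The names WorkflowOutputSketch and echo_workflow, the None defaults, and the single-argument async signature are assumptions; the real workflow function signature is not specified in this reference.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class WorkflowOutputSketch:
    """Stand-in mirroring the documented WorkflowOutput fields."""

    reward: Optional[float] = None  # default is an assumption
    response: Optional[Any] = None  # default is an assumption
    metrics: Optional[Dict[str, float]] = None  # default is an assumption


async def echo_workflow(task: Dict[str, Any]) -> WorkflowOutputSketch:
    """Hypothetical workflow: produce a response and one metric.

    A real workflow would call the model being tuned; here the response
    is derived from the task so the example runs without any model.
    """
    answer = str(task.get("question", "")).upper()
    return WorkflowOutputSketch(
        response=answer,
        metrics={"response_length": float(len(answer))},
    )  # reward is left as None in this sketch


output = asyncio.run(echo_workflow({"question": "ping"}))
print(output)
```

Because reward is optional here but required in JudgeOutput, leaving it as None presumably lets a separate judge function supply the score; treat that reading as an inference, not documented behavior.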
- class JudgeOutput
Bases: BaseModel
The output of a judge function.
- reward: float
- metrics: Dict[str, float] | None
- model_config: ClassVar[ConfigDict] = {}
Pydantic model configuration; should be a dictionary conforming to pydantic.config.ConfigDict.
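A judge turns a workflow response into a mandatory reward plus optional metrics. The sketch below shows one possible exact-match judge using a stand-in dataclass with the same two fields as JudgeOutput. The names JudgeOutputSketch and exact_match_judge, the (task, response) signature, and the "answer" key in the task are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class JudgeOutputSketch:
    """Stand-in mirroring the documented JudgeOutput fields."""

    reward: float
    metrics: Optional[Dict[str, float]] = None  # default is an assumption


def exact_match_judge(task: Dict[str, Any], response: Any) -> JudgeOutputSketch:
    """Hypothetical judge: reward 1.0 on a case-insensitive exact match."""
    expected = str(task["answer"]).strip().lower()
    actual = str(response).strip().lower()
    correct = expected == actual
    return JudgeOutputSketch(
        reward=1.0 if correct else 0.0,
        metrics={"exact_match": float(correct)},
    )


good = exact_match_judge({"answer": "Paris"}, " paris ")
bad = exact_match_judge({"answer": "Paris"}, "London")
print(good.reward, bad.reward)
```

An LLM-as-a-Judge variant would replace the string comparison with a call to one of the auxiliary models, but that call is outside what this reference specifies.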
- class DatasetConfig
Bases: BaseModel
Dataset configuration for tuning. Compatible with the Hugging Face dataset format. AgentScope will load the dataset from the given path using datasets.load_dataset.
- path: str
- name: str | None
- split: str | None
- total_epochs: int
- total_steps: int | None
- preview(n=5)
Preview the dataset information.
- Parameters:
n (int) – Number of samples to preview. Defaults to 5.
- Return type:
List
- model_config: ClassVar[ConfigDict] = {}
Pydantic model configuration; should be a dictionary conforming to pydantic.config.ConfigDict.
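Since the dataset is loaded with datasets.load_dataset, the path, name and split fields plausibly map onto that function's keyword arguments of the same names. The sketch below builds those arguments from a stand-in dataclass without importing the datasets package. DatasetConfigSketch, load_dataset_kwargs, the default values, and the one-to-one mapping are assumptions, and the dataset identifier is only an illustrative value.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class DatasetConfigSketch:
    """Stand-in mirroring the documented DatasetConfig fields."""

    path: str
    name: Optional[str] = None
    split: Optional[str] = None
    total_epochs: int = 1  # default is an assumption
    total_steps: Optional[int] = None


def load_dataset_kwargs(cfg: DatasetConfigSketch) -> Dict[str, Any]:
    """Hypothetical helper: build keyword arguments for load_dataset."""
    kwargs: Dict[str, Any] = {"path": cfg.path}
    if cfg.name is not None:
        kwargs["name"] = cfg.name  # dataset configuration (subset) name
    if cfg.split is not None:
        kwargs["split"] = cfg.split  # e.g. "train" or "test"
    return kwargs


cfg = DatasetConfigSketch(path="openai/gsm8k", name="main", split="train")
kwargs = load_dataset_kwargs(cfg)
print(kwargs)

# With the `datasets` package installed, the arguments could be used as:
#   from datasets import load_dataset
#   dataset = load_dataset(**kwargs)
```

Omitting keys whose value is None leaves the choice of defaults to load_dataset itself.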
- class TunerModelConfig
Bases: BaseModel
Model configuration for tuning.
- model_path: str
- max_model_len: int
- temperature: float
- top_p: float
- max_tokens: int
- enable_thinking: bool | None
- tensor_parallel_size: int
- inference_engine_num: int
- tool_call_parser: str
- reasoning_parser: str
- get_config()
Get the model configuration.
- Returns:
The model configuration dictionary.
- Return type:
Dict[str, Any]
- model_config: ClassVar[ConfigDict] = {}
Pydantic model configuration; should be a dictionary conforming to pydantic.config.ConfigDict.