evaluators
Classes
- Evaluator: Base class for all evaluators in the PRT-RL framework.
- NumberOfStepsEvaluator: Evaluator that evaluates the agent's performance to reach a minimum reward threshold within the lowest number of steps.
- RewardEvaluator: Evaluators are used to assess the performance of agents or policies.
- class prt_rl.common.evaluators.Evaluator(eval_freq: int = 1)
Base class for all evaluators in the PRT-RL framework. This class provides a common interface for evaluating agents in different environments with different objectives.
- Parameters:
eval_freq (int) – Frequency of evaluation in terms of steps, iterations, or optimization steps.
- __init__(eval_freq: int = 1) -> None
Initialize the evaluator with the evaluation frequency.
- Parameters:
eval_freq (int) – Frequency of evaluation in terms of steps, iterations, or optimization steps.
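To illustrate what an evaluation frequency means in practice, here is a minimal sketch. `PeriodicEvaluator` and `is_due` are hypothetical names invented for this example, and the modulo rule is an assumption; the source does not show how `Evaluator` decides when an evaluation is due.

```python
class PeriodicEvaluator:
    """Hypothetical stand-in for illustration; not the PRT-RL Evaluator."""

    def __init__(self, eval_freq: int = 1) -> None:
        if eval_freq < 1:
            raise ValueError("eval_freq must be a positive integer")
        self.eval_freq = eval_freq

    def is_due(self, iteration: int) -> bool:
        # Assumed rule: evaluate whenever the counter is a positive
        # multiple of eval_freq.
        return iteration > 0 and iteration % self.eval_freq == 0


every_call = PeriodicEvaluator()              # default eval_freq=1
every_fifth = PeriodicEvaluator(eval_freq=5)

due_default = [i for i in range(1, 11) if every_call.is_due(i)]
due_fifth = [i for i in range(1, 11) if every_fifth.is_due(i)]
print(due_default)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(due_fifth)    # [5, 10]
```

With the default `eval_freq=1`, an evaluation would be due on every call under this assumed rule; larger values thin the evaluations out.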
- class prt_rl.common.evaluators.NumberOfStepsEvaluator(env: EnvironmentInterface, reward_threshold: float, num_episodes: int = 1, logger: Logger | None = None, keep_best: bool = False, eval_freq: int = 1, deterministic: bool = False)
Evaluator that measures how few steps an agent needs to reach a minimum reward threshold. It is intended for cases where an agent can achieve the maximum desired reward and you want to determine which agent learns the fastest.
- Parameters:
env (EnvironmentInterface) – The environment to evaluate the agent in.
reward_threshold (float) – The minimum reward threshold to achieve.
num_episodes (int) – The number of episodes to run for evaluation.
logger (Optional[Logger]) – Logger for evaluation metrics.
keep_best (bool) – Whether to keep the best agent based on evaluation performance.
eval_freq (int) – Frequency of evaluation in terms of steps, iterations, or optimization steps.
deterministic (bool) – Whether to use a deterministic policy during evaluation.
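The idea of comparing agents by how quickly they reach a reward threshold can be sketched as follows. This is not the `NumberOfStepsEvaluator` implementation: `first_step_reaching` and the evaluation histories are hypothetical, and the sketch assumes each evaluation yields a `(num_steps, mean_reward)` pair.

```python
from typing import Optional, Sequence, Tuple


def first_step_reaching(
    history: Sequence[Tuple[int, float]], reward_threshold: float
) -> Optional[int]:
    """Return the step count of the first evaluation that met the threshold.

    `history` holds (num_steps, mean_reward) pairs in training order.
    Returns None if the threshold was never reached.
    """
    for num_steps, mean_reward in history:
        if mean_reward >= reward_threshold:
            return num_steps
    return None


# Made-up evaluation histories for two agents (illustrative values only).
agent_a = [(1000, 50.0), (2000, 180.0), (3000, 200.0)]
agent_b = [(1000, 90.0), (2000, 199.0), (3000, 201.0)]

threshold = 195.0
steps_a = first_step_reaching(agent_a, threshold)
steps_b = first_step_reaching(agent_b, threshold)
print(steps_a, steps_b)  # 3000 2000
```

In this made-up data both agents eventually exceed the threshold, so final reward alone would not separate them; the step count at which each first crosses it does.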
- class prt_rl.common.evaluators.RewardEvaluator(env: EnvironmentInterface, num_episodes: int = 1, logger: Logger | None = None, keep_best: bool = False, eval_freq: int = 1, deterministic: bool = False)
Evaluators are used to assess the performance of agents or policies.
The eval_freq value must be in the same units as the iteration value passed to the evaluate method. For example, if eval_freq is set in steps, then num_steps should be passed as the iteration value. Otherwise, evaluations will not occur at the intended times.
- Parameters:
env (EnvironmentInterface) – The environment to evaluate the agent in.
num_episodes (int) – The number of episodes to run for evaluation.
logger (Optional[Logger]) – Logger for evaluation metrics.
keep_best (bool) – Whether to keep the best agent based on evaluation performance.
eval_freq (int) – Frequency of evaluation in terms of steps, iterations, or optimization steps.
deterministic (bool) – Whether to use a deterministic policy during evaluation.
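The sketch below shows why the units of `eval_freq` and the iteration value need to agree. It does not call the library: `evaluation_points` is a hypothetical helper, and the modulo rule it applies is an assumption about how a frequency check might work.

```python
from typing import Iterable, List


def evaluation_points(counter_values: Iterable[int], eval_freq: int) -> List[int]:
    """Return the counter values at which an evaluation would fire.

    Assumed rule: fire when the counter is a positive multiple of eval_freq.
    """
    return [v for v in counter_values if v > 0 and v % eval_freq == 0]


steps_per_iteration = 250                 # illustrative rollout size
iterations = list(range(1, 13))           # training iterations 1..12
num_steps = [i * steps_per_iteration for i in iterations]  # 250..3000

eval_freq = 1000  # intended to mean "every 1000 environment steps"

with_steps = evaluation_points(num_steps, eval_freq)
with_iterations = evaluation_points(iterations, eval_freq)
print(with_steps)       # [1000, 2000, 3000]
print(with_iterations)  # []
```

Passing the step counter triggers three evaluations over the run, while passing the iteration counter against the same step-based `eval_freq` triggers none, because the counter never reaches 1000.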