historical_observation
Classes
- class prt_rl.env.adapters.historical_observation.HistoricalObservationAdapter(env: EnvironmentInterface, num_steps: int = 4, include_actions: bool = True, append_last_action: bool = False)
Adapter that augments observations with a fixed-length observation history and optional action history.
- Parameters:
env (EnvironmentInterface) – The environment to adapt
num_steps (int) – Number of observations to stack in the augmented observation.
include_actions (bool) – If True, include previous actions between stacked observations.
append_last_action (bool) – If True and include_actions is True, append the most recent action to the end of the observation stack.
Example (num_steps=3):
- include_actions=False: [o_{t-2}, o_{t-1}, o_t]
- include_actions=True: [o_{t-2}, a_{t-2}, o_{t-1}, a_{t-1}, o_t]
- include_actions=True, append_last_action=True: [o_{t-2}, a_{t-2}, o_{t-1}, a_{t-1}, o_t, a_{t-1}]
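The ordering in the example above can be reproduced symbolically. The following is a minimal sketch, not the adapter's implementation: `stack_layout` is a hypothetical helper that only returns string labels, and it assumes the interleaving is exactly as the parameter descriptions state.

```python
def stack_layout(num_steps, include_actions=True, append_last_action=False):
    """Return symbolic labels for the augmented observation at time t.

    Hypothetical helper for illustration only; it is not part of prt_rl.
    """
    def label(kind, offset):
        # offset 0 -> "o_t", offset 2 -> "o_{t-2}"
        return f"{kind}_t" if offset == 0 else f"{kind}_{{t-{offset}}}"

    layout = []
    for offset in range(num_steps - 1, -1, -1):
        layout.append(label("o", offset))
        # An action sits between consecutive observations, never after o_t.
        if include_actions and offset > 0:
            layout.append(label("a", offset))
    # Per the parameter description, the extra action is appended only
    # when include_actions is also True.
    if include_actions and append_last_action:
        layout.append(label("a", 1))
    return layout


print(stack_layout(3, include_actions=False))
print(stack_layout(3, include_actions=True))
print(stack_layout(3, include_actions=True, append_last_action=True))
```

The three printed lists correspond to the three cases listed for num_steps=3.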
- get_num_envs() → int
Returns the number of environments in the interface.
- Returns:
Number of environments
- Return type:
int
- get_parameters()
Returns the EnvParams object, which contains the observation and action sizes needed to set up RL agents.
- Returns:
environment parameters object
- Return type:
EnvParams
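Because the adapter changes the observation layout, the observation size an agent sees differs from that of the wrapped environment. As a rough sketch, assuming the stacked observations and actions are concatenated flat along the feature dimension (an assumption, not confirmed by this page), the augmented size could be estimated as follows. `augmented_obs_dim`, `obs_dim`, and `act_dim` are hypothetical names, not fields of EnvParams.

```python
def augmented_obs_dim(obs_dim, act_dim, num_steps=4,
                      include_actions=True, append_last_action=False):
    """Estimate the flat size of the augmented observation.

    Hypothetical helper; assumes flat concatenation of all entries.
    """
    total = num_steps * obs_dim                # num_steps stacked observations
    if include_actions:
        total += (num_steps - 1) * act_dim     # actions between observations
        if append_last_action:
            total += act_dim                   # most recent action appended
    return total


# Example: 3-dimensional observations, 1-dimensional actions, default num_steps=4
print(augmented_obs_dim(3, 1))                             # 4*3 + 3*1 = 15
print(augmented_obs_dim(3, 1, append_last_action=True))    # 15 + 1 = 16
print(augmented_obs_dim(3, 1, include_actions=False))      # 4*3 = 12
```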
- reset(*args, **kwargs)
Resets the environment to the initial state and returns the initial observation.
- Parameters:
seed (int | None) – Sets the random seed.
- Returns:
Tuple containing the initial observation tensor and the info dictionary
- Return type:
Tuple
- step(action)
Steps the simulation using the action tensor and returns the new trajectory.
- Parameters:
action (torch.Tensor) – Action tensor with shape (# env, # actions)
- Returns:
Tuple containing the next state, reward, and done tensors and the info dictionary
- Return type:
Tuple
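To show how reset and step could maintain a rolling history, here is a self-contained sketch in plain Python. It is not the prt_rl implementation: `ToyEnv` and `HistorySketch` are hypothetical stand-ins, observations and actions are unbatched lists rather than tensors of shape (# env, ...), and padding the history at reset with copies of the first observation and zero actions is an assumption.

```python
from collections import deque


class ToyEnv:
    """Hypothetical single environment with a 1-D observation (the step count)."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0], {}

    def step(self, action):
        self.t += 1
        done = self.t >= 3
        return [float(self.t)], 1.0, done, {}


class HistorySketch:
    """Illustrative rolling history with interleaved actions."""

    def __init__(self, env, num_steps=3, act_dim=1):
        self.env = env
        self.num_steps = num_steps
        self.act_dim = act_dim
        # Bounded deques drop the oldest entry automatically on append.
        self.obs_hist = deque(maxlen=num_steps)
        self.act_hist = deque(maxlen=num_steps - 1)

    def _augment(self):
        # Interleave as [o_{t-2}, a_{t-2}, o_{t-1}, a_{t-1}, o_t].
        out = []
        acts = list(self.act_hist)
        for i, obs in enumerate(self.obs_hist):
            out.extend(obs)
            if i < len(acts):
                out.extend(acts[i])
        return out

    def reset(self):
        obs, info = self.env.reset()
        # Assumption: pad with the first observation and zero actions.
        self.obs_hist.extend([obs] * self.num_steps)
        self.act_hist.extend([[0.0] * self.act_dim] * (self.num_steps - 1))
        return self._augment(), info

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.act_hist.append(list(action))
        self.obs_hist.append(obs)
        return self._augment(), reward, done, info


env = HistorySketch(ToyEnv(), num_steps=3)
obs, info = env.reset()
print(obs)                      # history padded with the initial observation
obs, reward, done, info = env.step([5.0])
print(obs)                      # newest action sits just before newest observation
```

Bounded deques keep the window at a fixed length without manual index bookkeeping; a tensor-based version would more likely roll or concatenate along a feature dimension.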