historical_observation

Classes

class prt_rl.env.adapters.historical_observation.HistoricalObservationAdapter(env: EnvironmentInterface, num_steps: int = 4, include_actions: bool = True, append_last_action: bool = False)

Adapter that augments observations with a fixed-length observation history and optional action history.

Parameters:

env (EnvironmentInterface) – The environment to adapt
num_steps (int) – Number of observations to stack in the augmented observation.
include_actions (bool) – If True, include previous actions between stacked observations.
append_last_action (bool) – If True and include_actions is True, append the most recent action to the end of the observation stack.

Example (num_steps=3):
- include_actions=False: [o_{t-2}, o_{t-1}, o_t]
- include_actions=True: [o_{t-2}, a_{t-2}, o_{t-1}, a_{t-1}, o_t]
- include_actions=True, append_last_action=True: [o_{t-2}, a_{t-2}, o_{t-1}, a_{t-1}, o_t, a_{t-1}]
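The ordering described by include_actions and append_last_action can be illustrated with a small standalone sketch. It does not use prt_rl and is not the adapter's implementation. The helper name `stack_history` is hypothetical, and strings stand in for the observation and action tensors, so only the layout is shown.

```python
from typing import List, Sequence


def stack_history(
    observations: Sequence[str],
    actions: Sequence[str],
    include_actions: bool = True,
    append_last_action: bool = False,
) -> List[str]:
    """Hypothetical helper: interleave observations with the actions between them.

    `observations` holds the last num_steps observations, oldest first.
    `actions` holds the num_steps - 1 actions taken between them.
    """
    if len(actions) != len(observations) - 1:
        raise ValueError("expected one action between each pair of observations")
    stacked: List[str] = []
    for i, obs in enumerate(observations):
        stacked.append(obs)
        # An action follows every observation except the newest one.
        if include_actions and i < len(actions):
            stacked.append(actions[i])
    # Optionally repeat the most recent action at the end of the stack.
    if include_actions and append_last_action and actions:
        stacked.append(actions[-1])
    return stacked


obs = ["o_{t-2}", "o_{t-1}", "o_t"]
acts = ["a_{t-2}", "a_{t-1}"]
layout_plain = stack_history(obs, acts, include_actions=False)
layout_actions = stack_history(obs, acts, include_actions=True)
layout_appended = stack_history(obs, acts, include_actions=True, append_last_action=True)
```

With num_steps=3, the three results correspond to the three layouts listed in the example for append_last_action.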

close() → None

Closes the environment and cleans up any resources.

get_num_envs() → int

Returns the number of environments in the interface.

Returns: Number of environments
Return type: int

get_parameters()

Returns the EnvParams object which contains information about the sizes of observations and actions needed for setting up RL agents.

Returns: Environment parameters object
Return type: EnvParams
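Because the adapter stacks observations and, optionally, actions, the observation size an agent sees will generally be larger than that of the wrapped environment. The sketch below works through the arithmetic under an assumption the source does not confirm: that observations and actions are flat vectors concatenated along the last dimension. The function name `augmented_obs_size` and its `obs_dim` and `act_dim` arguments are hypothetical and are not fields of EnvParams.

```python
def augmented_obs_size(
    obs_dim: int,
    act_dim: int,
    num_steps: int = 4,
    include_actions: bool = True,
    append_last_action: bool = False,
) -> int:
    """Hypothetical helper: length of the stacked vector under flat concatenation."""
    # num_steps observations are always present.
    size = num_steps * obs_dim
    if include_actions:
        # One action sits between each pair of consecutive observations.
        size += (num_steps - 1) * act_dim
        if append_last_action:
            # The most recent action is repeated at the end.
            size += act_dim
    return size


# Example: 8-dimensional observations, 2-dimensional actions, num_steps=4.
size_plain = augmented_obs_size(8, 2, include_actions=False)
size_actions = augmented_obs_size(8, 2, include_actions=True)
size_appended = augmented_obs_size(8, 2, include_actions=True, append_last_action=True)
```

Under these assumptions the three sizes are 32, 38, and 40. Check the EnvParams object returned by get_parameters() for the values the adapter actually reports.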

reset(*args, **kwargs)

Resets the environment to the initial state and returns the initial observation.

Parameters: seed (int | None) – Sets the random seed.
Returns: Tuple of tensors containing the initial observation and info dictionary
Return type: Tuple

step(action)

Steps the simulation using the action tensor and returns the new trajectory.

Parameters: action (torch.Tensor) – Action tensor with shape (# env, # actions)
Returns: Tuple containing the next state, reward, done flag, and info dictionary
Return type: Tuple
