historical_observation

Classes

class prt_rl.env.adapters.historical_observation.HistoricalObservationAdapter(env: EnvironmentInterface, num_steps: int = 4, include_actions: bool = True, append_last_action: bool = False)

Adapter that augments observations with a fixed-length observation history and optional action history.

Parameters:
  • env (EnvironmentInterface) – The environment to adapt

  • num_steps (int) – Number of observations to stack in the augmented observation.

  • include_actions (bool) – If True, include previous actions between stacked observations.

  • append_last_action (bool) –

    If True and include_actions is True, append the most recent action to the end of the observation stack. Example (num_steps=3):

    - include_actions=False: [o_{t-2}, o_{t-1}, o_t]
    - include_actions=True: [o_{t-2}, a_{t-2}, o_{t-1}, a_{t-1}, o_t]
    - include_actions=True, append_last_action=True: [o_{t-2}, a_{t-2}, o_{t-1}, a_{t-1}, o_t, a_{t-1}]
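The ordering in the example above can be reproduced with a small standalone sketch. It only generates symbolic labels and does not use prt_rl; the function name `stacked_layout` is hypothetical and is not part of the adapter's API.

```python
def stacked_layout(num_steps=4, include_actions=True, append_last_action=False):
    """Return symbolic labels for the augmented observation, oldest entry first.

    Illustration only: this mirrors the documented example, not the
    adapter's actual implementation.
    """
    def label(kind, offset):
        # offset 0 -> "o_t", offset 2 -> "o_{t-2}"
        return f"{kind}_t" if offset == 0 else f"{kind}_{{t-{offset}}}"

    layout = []
    for offset in range(num_steps - 1, -1, -1):
        layout.append(label("o", offset))
        # An action sits between consecutive observations, never after o_t.
        if include_actions and offset > 0:
            layout.append(label("a", offset))

    if include_actions and append_last_action:
        # The documented example repeats a_{t-1} at the end of the stack.
        layout.append(label("a", 1))
    return layout


print(stacked_layout(num_steps=3, include_actions=False))
print(stacked_layout(num_steps=3, include_actions=True))
print(stacked_layout(num_steps=3, include_actions=True, append_last_action=True))
```

Note that append_last_action has no effect in this sketch unless include_actions is also True, matching the parameter description.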

close() → None

Closes the environment and cleans up any resources.

get_num_envs() → int

Returns the number of environments in the interface.

Returns:

Number of environments

Return type:

int

get_parameters()

Returns the EnvParams object which contains information about the sizes of observations and actions needed for setting up RL agents.

Returns:

environment parameters object

Return type:

EnvParams
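When sizing an agent's input layer, it can help to estimate the length of the augmented observation. The sketch below assumes that observations and actions are flat vectors concatenated along the feature dimension, which this page does not confirm; `augmented_obs_dim` is a hypothetical helper, and the sizes reported by get_parameters() should be treated as authoritative.

```python
def augmented_obs_dim(obs_dim, act_dim, num_steps=4,
                      include_actions=True, append_last_action=False):
    """Estimate the flattened length of the augmented observation.

    Assumption (not confirmed by the documentation): each observation has
    obs_dim features, each action has act_dim features, and all entries
    are concatenated into a single vector per environment.
    """
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")

    size = num_steps * obs_dim
    if include_actions:
        # One action between each pair of consecutive observations.
        size += (num_steps - 1) * act_dim
        if append_last_action:
            # One extra action appended after the newest observation.
            size += act_dim
    return size


# Example with 3-dimensional observations and 1-dimensional actions.
print(augmented_obs_dim(3, 1, num_steps=3, include_actions=False))
print(augmented_obs_dim(3, 1, num_steps=3, include_actions=True))
print(augmented_obs_dim(3, 1, num_steps=3, include_actions=True, append_last_action=True))
```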

reset(*args, **kwargs)

Resets the environment to the initial state and returns the initial observation.

Parameters:

seed (int | None) – Sets the random seed.

Returns:

Tuple containing the initial observation tensor and the info dictionary

Return type:

Tuple

step(action)

Steps the simulation using the action tensor and returns the resulting transition.

Parameters:

action (torch.Tensor) – Action tensor with shape (# env, # actions)

Returns:

Tuple containing the next state, reward, and done tensors, followed by the info dictionary

Return type:

Tuple