jhu_envs

Classes

class prt_rl.env.wrappers.jhu_envs.JhuWrapper(jhu_name: str, *, render_mode: str | None = None, device: str = 'cpu', **kwargs)

Wraps the JHU environments in the Environment interface.

The JHU environments are games and puzzles that were used in the JHU 705.741 RL course.

Parameters:
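  • jhu_name (str) – Name of the JHU environment to wrap

  • render_mode (str, optional) – Sets the render mode [‘human’, ‘rgb_array’]. Default: None.

  • device (str, optional) – Device string from the constructor signature (presumably the torch device used for returned tensors). Default: ‘cpu’.

  • **kwargs – Arguments to pass to the JHU environment constructor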

Examples

```python
from prt_rl.env.wrappers import JhuWrapper
from prt_rl.common.policy import RandomPolicy

# Assumption: "KArmBandits" is a valid jhu_name. The constructor signature
# takes an environment name string, not an environment instance.
env = JhuWrapper("KArmBandits")
policy = RandomPolicy(env_params=env.get_parameters())

# reset() returns an (observation, info) tuple
state, info = env.reset(seed=0)
done = False

while not done:
    action = policy.get_action(state)
    state, reward, done, info = env.step(action)
```

close() → None

Closes the environment and cleans up any resources.
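Because `close()` takes no arguments, the standard-library `contextlib.closing` helper can guarantee cleanup even if an exception is raised mid-episode. The following is a minimal sketch; `StubEnv` is a hypothetical stand-in used only so the snippet runs without prt_rl installed, and any object with a `close()` method could take its place.

```python
from contextlib import closing


class StubEnv:
    """Hypothetical stand-in that exposes only close(); not part of prt_rl."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        # A real wrapper would release renderers or other resources here.
        self.closed = True


env = StubEnv()
with closing(env) as e:
    # Interact with the environment here (reset, step, ...).
    pass

# On leaving the with-block, closing() has called env.close().
```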

get_num_envs() → int

Returns the number of environments in the interface.

Returns:

Number of environments

Return type:

int

get_parameters() → EnvParams

Returns the EnvParams object, which contains information about the sizes of observations and actions needed for setting up RL agents.

Returns:

Environment parameters object

Return type:

EnvParams

reset(seed: int | None = None) → Tuple[Tensor, Dict[str, Any]]

Resets the environment to the initial state and returns the initial observation.

Parameters:

seed (int | None) – Random seed for the environment. Default: None.

Returns:

Tuple containing the initial observation tensor and the info dictionary

Return type:

Tuple[Tensor, Dict[str, Any]]

step(action: Tensor) → Tuple[Tensor, Tensor, Tensor, Dict[str, Any]]

Steps the simulation using the action tensor and returns the resulting transition.

Parameters:

action (torch.Tensor) – Action tensor with shape (# env, # actions)

Returns:

Tuple containing the next state, reward, and done tensors, followed by the info dictionary

Return type:

Tuple[Tensor, Tensor, Tensor, Dict[str, Any]]