human

Classes

GameControllerPolicy

The game controller policy allows interactive control of an agent with discrete or continuous actions.

KeyboardPolicy

The keyboard policy allows interactive control of the agent using keyboard input.

class prt_rl.common.policies.human.GameControllerPolicy(key_action_map: Dict[Key, int] | Dict[Key, int | str], blocking: bool = True)

The game controller policy allows interactive control of an agent with discrete or continuous actions.

For continuous actions, the key_action_map maps a game controller input to an action index rather than a value. For example, ‘JOY_RIGHT_X’: 1 would map the x direction of the right joystick to action index 1.

Notes

Avoid blocking with continuous actions, because it results in jerky agents. I also need to consider half joystick moves: for example, if the input is speed, you should not have to hold a joystick down to go nowhere. I should accept a default value for the actions as well.
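The index mapping for continuous actions can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: `axes_to_action`, its parameters, and the use of plain strings as input names are hypothetical, and the `default` argument stands in for the default action value proposed in the notes.

```python
from typing import Dict, List


def axes_to_action(
    axis_values: Dict[str, float],
    key_action_map: Dict[str, int],
    num_actions: int,
    default: float = 0.0,
) -> List[float]:
    """Place each controller axis reading at its mapped action index.

    Hypothetical helper for illustration only; not part of prt_rl.
    """
    # Start every action dimension at the default value.
    action = [default] * num_actions
    for key, index in key_action_map.items():
        # Only overwrite dimensions whose input was actually read.
        if key in axis_values:
            action[index] = axis_values[key]
    return action


# The right joystick's x axis drives action index 1; index 0 keeps the default.
action = axes_to_action({"JOY_RIGHT_X": 0.5}, {"JOY_RIGHT_X": 1}, num_actions=2)
```

Here the map stores where a reading goes, not what the action is, which is why an unread input leaves its dimension at the default.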

Parameters:
  • key_action_map (Dict[Key, int] | Dict[Key, int | str]) – mapping from a game controller input (Key) to an action value for discrete actions, or to an action index for continuous actions

  • blocking (bool, optional) – Whether the policy blocks at each step. Defaults to True.

Raises:

class Key(value)

act(obs: Tensor, deterministic: bool = True) → Tuple[Tensor, Dict[str, Tensor]]

Gets a game controller input and maps it to the action space.

Parameters:

obs (torch.Tensor) – A tensor representing the current state of the environment.

Returns:

A tuple containing the action tensor for the controller input and a dictionary mapping strings to tensors.

class prt_rl.common.policies.human.KeyboardPolicy(key_action_map: Dict[str, int], blocking: bool = True)

The keyboard policy allows interactive control of the agent using keyboard input.

Notes

I could modify this to implement “sticky” keys, so that in non-blocking mode the last key pressed remains the action until a new key is pressed. Alternatively, a default value could be set, and the action would return to that default when the key is released.
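The two alternatives in this note can be compared with a small sketch. It assumes nothing about how KeyboardPolicy is implemented: the `KeyState` class, its `sticky` flag, and `default_action` are hypothetical names introduced only for illustration.

```python
from typing import Dict, Optional


class KeyState:
    """Tracks the current action for a non-blocking keyboard policy.

    Hypothetical helper for illustration only; not part of prt_rl.
    """

    def __init__(
        self, key_action_map: Dict[str, int], default_action: int, sticky: bool
    ) -> None:
        self.key_action_map = key_action_map
        self.default_action = default_action
        self.sticky = sticky
        self.current_action = default_action

    def update(self, pressed_key: Optional[str]) -> int:
        """Return this step's action; pressed_key is None when no key is held."""
        if pressed_key in self.key_action_map:
            # A mapped key is held: it always sets the action.
            self.current_action = self.key_action_map[pressed_key]
        elif not self.sticky:
            # Key released (or unmapped): fall back to the default action.
            self.current_action = self.default_action
        # With sticky=True the previous action is kept unchanged.
        return self.current_action
```

With `sticky=True` a released key leaves the agent repeating its last action, while `sticky=False` returns it to the default, which may suit environments that have a no-op action.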

Parameters:
  • key_action_map (Dict[str, int]) – mapping from key string to action value

  • blocking (bool) – If blocking is True the simulation will wait for keyboard input at each step (synchronous), otherwise the simulation will not block and use the most up-to-date value (asynchronous). Default is True.
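The difference between the two blocking modes can be sketched with a queue standing in for the keyboard listener. This is an assumption-laden illustration rather than the policy's actual mechanism: `read_key`, the `events` queue, and `last_key` are hypothetical names.

```python
import queue
from typing import Optional


def read_key(
    events: "queue.Queue[str]", blocking: bool, last_key: Optional[str] = None
) -> Optional[str]:
    """Read a key from an event queue.

    Hypothetical stand-in for a keyboard listener; not part of prt_rl.
    """
    if blocking:
        # Synchronous: wait here until a key event arrives.
        return events.get()
    # Asynchronous: drain pending events and keep the most recent one.
    key = last_key
    while True:
        try:
            key = events.get_nowait()
        except queue.Empty:
            # Nothing left to read; return the latest value seen so far.
            return key
```

In blocking mode the simulation advances one step per key press, whereas in non-blocking mode the simulation keeps stepping and simply picks up whatever was pressed most recently.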

Example

from prt_rl.common.policies.human import KeyboardPolicy

# Map key names to discrete action values.
policy = KeyboardPolicy(
    key_action_map={
        'up': 0,
        'down': 1,
        'left': 2,
        'right': 3,
    },
    blocking=True,
)

# obs is a torch.Tensor holding the current observation.
action, info = policy.act(obs)

act(obs: Tensor, deterministic: bool = True) → Tuple[Tensor, Dict[str, Tensor]]

Gets a keyboard press and maps it to the action space.

Parameters:
  • obs (torch.Tensor) – A tensor representing the current state of the environment.

  • deterministic (bool) – If True, the policy will not sample random actions. Defaults to True.

Returns:

A tuple containing a tensor with the action value based on the key pressed and a dictionary mapping strings to tensors.

Return type:

Tuple[torch.Tensor, Dict[str, torch.Tensor]]