human#
Classes#
The game controller policy allows interactive control of an agent with discrete or continuous actions.
The keyboard policy allows interactive control of the agent using keyboard input.
- class prt_rl.common.policies.human.GameControllerPolicy(key_action_map: Dict[Key, int] | Dict[Key, int | str], blocking: bool = True)[source]#
The game controller policy allows interactive control of an agent with discrete or continuous actions.
For continuous actions, the key_action_map maps a game controller input to an action index rather than a value. For example, ‘JOY_RIGHT_X’: 1 would map the x direction of the right joystick to action index 1. .. rubric:: Notes
You don’t want to use blocking with continuous actions because this would result in jerky agents. I also need to consider half joystick moves. For example if th input is speed you don’t want to hold a joystick down to go nowhere. I should accept a default value for the actions as well.
- Parameters:
- Raises:
ImportError – If inputs is not installed.
AssertionError – If a game controller is not found
- class prt_rl.common.policies.human.KeyboardPolicy(key_action_map: Dict[str, int], blocking: bool = True)[source]#
The keyboard policy allows interactive control of the agent using keyboard input.
Notes
I could modify this to implement “sticky” keys, so in non-blocking the last key pressed stays the action until a new key is pressed. Alternatively, you could set a default value and the action goes back to a default when the key is released.
- Parameters:
env_params (EnvParams) – environment parameters
key_action_map (Dict[str, int]) – mapping from key string to action value
blocking (bool) – If blocking is True the simulation will wait for keyboard input at each step (synchronous), otherwise the simulation will not block and use the most up-to-date value (asynchronous). Default is True.
Example
from prt_rl.utils.policies import KeyboardPolicy
- policy = KeyboardPolicy(
env_params=env.get_parameters(), key_action_map={
‘up’: 0, ‘down’: 1, ‘left’: 2, ‘right’: 3,
}, blocking=True
)
action_td = policy.get_action(state_td)
- act(obs: Tensor, deterministic: bool = True) Tuple[Tensor, Dict[str, Tensor]][source]#
Gets a keyboard press and maps it to the action space.
- Parameters:
obs (torch.Tensor) – A tensor representing the current state of the environment.
deterministic (bool) – If True, the policy will not sample random actions. Defaults to True.
- Returns:
A tensor with the action value based on the key pressed.
- Return type: