image_pipeline
Classes
- class prt_sim.gymnasium.image_pipeline.ImagePipeline(dataset_root: Path | None = None, render_mode: str | None = None, num_image_samples: int = 1, max_steps: int = 20, device: device = device(type='cpu'))
A template Gymnasium environment that simulates an image-processing pipeline.
This environment uses a single fixed task algorithm (a YOLO object detector); the agent's actions build a dynamic preprocessing pipeline.
- Observation:
A grayscale (H, W, 1) or RGB (H, W, 3) uint8 image
- Action:
Discrete(K) - select which processing operation to apply (placeholder)
- Termination:
Fixed horizon or an internal condition (customize in step)
- Truncation:
Ends when a max step count is reached
- Render modes:
‘rgb_array’ returns the current image as an np.ndarray
‘human’ (optional): implement if you want a GUI viewer
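The observation contract above (a uint8 image shaped (H, W, 1) or (H, W, 3)) can be checked with a small helper. This is a minimal sketch and not part of prt_sim; the name `is_valid_observation` is hypothetical and the image sizes are arbitrary.

```python
import numpy as np


def is_valid_observation(obs) -> bool:
    """Return True if obs is a uint8 image shaped (H, W, 1) or (H, W, 3)."""
    return (
        isinstance(obs, np.ndarray)
        and obs.dtype == np.uint8
        and obs.ndim == 3
        and obs.shape[2] in (1, 3)
    )


# Illustrative inputs (sizes chosen arbitrarily for the example).
gray = np.zeros((32, 48, 1), dtype=np.uint8)         # grayscale, valid
rgb = np.zeros((32, 48, 3), dtype=np.uint8)          # RGB, valid
float_img = np.zeros((32, 48, 3), dtype=np.float32)  # wrong dtype, invalid
```

A check like this may be useful in tests of a custom wrapper, where a preprocessing step could silently change the dtype or drop the channel axis.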
- property np_random: Generator
Returns the environment’s internal _np_random, which, if not set, is initialised with a random seed.
- Returns:
An instance of np.random.Generator
- property np_random_seed: int
Returns the environment’s internal _np_random_seed, which, if not set, is first initialised with a random int as seed.
If np_random_seed was set directly instead of through reset() or set_np_random_through_seed(), the seed will take the value -1.
- Returns:
the seed of the current np_random, or -1 if the seed of the rng is unknown
- Return type:
int
- reset(*, seed: int | None = None, options: Dict[str, Any] | None = None) → Tuple[ndarray, Dict[str, Any]]
Resets the environment to an initial state and returns an initial observation.
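- Parameters:
seed (int | None) – Optional seed for the environment’s random number generator.
options (dict | None) – Optional dictionary of additional reset options.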
- Returns:
A tuple (observation, info):
observation (np.ndarray): The initial observation of the space.
info (dict): A dictionary containing auxiliary information about the reset.
- Return type:
Tuple[ndarray, Dict[str, Any]]
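Gymnasium environments conventionally use the seed passed to reset() to seed np_random, so that anything drawn from that generator is reproducible. The sketch below illustrates the idea with a plain np.random.Generator. Whether ImagePipeline selects its image samples this way is an assumption, and `sample_indices` is a hypothetical helper, not part of prt_sim.

```python
import numpy as np


def sample_indices(seed, num_items, num_samples):
    """Draw num_samples distinct indices from range(num_items) with a seeded RNG."""
    rng = np.random.default_rng(seed)
    return rng.choice(num_items, size=num_samples, replace=False)


# The same seed yields the same draw; this is what makes seeded resets repeatable.
first = sample_indices(seed=123, num_items=100, num_samples=5)
second = sample_indices(seed=123, num_items=100, num_samples=5)
```

Passing seed=None to np.random.default_rng instead draws fresh entropy from the operating system, so repeated calls would generally differ.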
- set_wrapper_attr(name: str, value: Any, *, force: bool = True) → bool
Sets the attribute name on the environment to value. See Wrapper.set_wrapper_attr for more info.
- step(action: Dict[str, Any]) → Tuple[ndarray, float, bool, bool, Dict[str, Any]]
Run one timestep of the environment’s dynamics. When end of episode is reached, you are responsible for calling reset() to reset this environment’s state.
- Parameters:
action (dict) – An action provided by the agent. This is a dictionary with the keys:
“algorithm” (int): the index of the algorithm to apply
“parameters” (np.ndarray): the parameters for the algorithm, scaled between 0 and 1
- Returns:
A tuple (observation, reward, terminated, truncated, info):
observation (np.ndarray): Agent’s observation of the current environment.
reward (float): Amount of reward returned after the previous action.
terminated (bool): Whether the episode has ended. Further step() calls will return undefined results.
truncated (bool): Whether the episode was truncated (max steps reached). Further step() calls will return undefined results.
info (dict): Contains auxiliary diagnostic information (helpful for debugging, and sometimes learning).
- Return type:
Tuple[ndarray, float, bool, bool, Dict[str, Any]]
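The reset()/step() contract described above can be exercised with a short episode loop. The sketch below uses a hypothetical stand-in class, `StubPipelineEnv`, so that it runs without prt_sim or a dataset. It only mimics the documented signatures and the action dictionary layout. Its brightness-offset behaviour, zero reward, and image size are illustrative assumptions, not what ImagePipeline does.

```python
import numpy as np


class StubPipelineEnv:
    """Hypothetical stand-in that mimics the documented reset()/step() signatures."""

    def __init__(self, max_steps=20):
        self.max_steps = max_steps
        self._steps = 0
        self._image = np.zeros((8, 8, 3), dtype=np.uint8)

    def reset(self, *, seed=None, options=None):
        self._steps = 0
        self._image = np.zeros((8, 8, 3), dtype=np.uint8)
        return self._image, {}

    def step(self, action):
        # Assumption for illustration: map the first normalized parameter
        # (in [0, 1]) to a brightness offset in [0, 10].
        offset = int(round(float(action["parameters"][0]) * 10))
        brighter = self._image.astype(np.int16) + offset
        self._image = np.clip(brighter, 0, 255).astype(np.uint8)
        self._steps += 1
        truncated = self._steps >= self.max_steps  # step budget exhausted
        return self._image, 0.0, False, truncated, {}


def constant_policy(obs):
    """Always pick algorithm 0 with a mid-range parameter."""
    return {"algorithm": 0, "parameters": np.array([0.5], dtype=np.float32)}


def run_episode(env, policy):
    """Step until terminated or truncated; the caller must reset() before reuse."""
    obs, info = env.reset(seed=0)
    total_reward, steps = 0.0, 0
    while True:
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total_reward += reward
        steps += 1
        if terminated or truncated:
            break
    return steps, total_reward, obs


steps, total_reward, final_obs = run_episode(StubPipelineEnv(max_steps=5), constant_policy)
```

With the real environment, the same `run_episode` loop should apply unchanged, provided the policy returns a dictionary with the “algorithm” and “parameters” keys.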
- property unwrapped: Env[ObsType, ActType]
Returns the base non-wrapped environment.
- Returns:
The base non-wrapped gymnasium.Env instance
- Return type:
Env