fovi.sensing.policies

class fovi.sensing.policies.BaseSaccadePolicy(retinal_transform, n_fixations)[source]

Bases: Module

Base class for SaccadeNet saccade/fixation policies.

Provides functionality for sampling multiple fixation points from images.

__init__(retinal_transform, n_fixations)[source]

Initialize the base saccade policy.

Parameters:
  • retinal_transform (RetinalTransform) – The retinal transform object used to apply retinal transformations to the images.

  • n_fixations (int) – The number of fixations to generate per image.

get_random_crop(height, width, scale, ratio)[source]

Generate a random crop with specified scale and aspect ratio.

Parameters:
  • height (int) – Image height.

  • width (int) – Image width.

  • scale (float or tuple) – Scale factor(s) for crop area.

  • ratio (float or tuple) – Aspect ratio(s) for the crop.

Returns:

  • list: Normalized fixation center [y, x]

  • list: Fixation size [height, width]

Return type:

tuple
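
Example (illustrative sketch, not the fovi implementation). The block below shows one common way to sample a crop from an area range and an aspect-ratio range, in the spirit of random-resized-crop augmentation. The names sketch_random_crop, _as_range, and max_tries are hypothetical, as are the default ranges. It also assumes that "normalized" means coordinates in [0, 1] and that the crop must lie fully inside the image; the actual method may differ on any of these points.

```python
import math
import random


def _as_range(value):
    """Accept a single float or a (min, max) pair and return a (min, max) pair."""
    if isinstance(value, (int, float)):
        return (float(value), float(value))
    lo, hi = value
    return (float(lo), float(hi))


def sketch_random_crop(height, width, scale=(0.08, 1.0), ratio=(3 / 4, 4 / 3), max_tries=10):
    """Hypothetical stand-in for get_random_crop (NOT the fovi implementation).

    Returns ([y, x], [crop_h, crop_w]) with the center normalized to [0, 1].
    """
    scale_lo, scale_hi = _as_range(scale)
    ratio_lo, ratio_hi = _as_range(ratio)
    area = height * width
    for _ in range(max_tries):
        # Draw the crop area as a fraction of the image area.
        target_area = area * random.uniform(scale_lo, scale_hi)
        # Draw the aspect ratio (width / height) log-uniformly.
        aspect = math.exp(random.uniform(math.log(ratio_lo), math.log(ratio_hi)))
        crop_w = int(round(math.sqrt(target_area * aspect)))
        crop_h = int(round(math.sqrt(target_area / aspect)))
        if 0 < crop_h <= height and 0 < crop_w <= width:
            # Place the crop uniformly so that it lies fully inside the image.
            top = random.randint(0, height - crop_h)
            left = random.randint(0, width - crop_w)
            center = [(top + crop_h / 2) / height, (left + crop_w / 2) / width]
            return center, [crop_h, crop_w]
    # Assumed fallback: a centered crop covering the whole image.
    return [0.5, 0.5], [height, width]
```

Rejection sampling with a bounded number of tries keeps the sketch simple; extreme aspect ratios that do not fit the image fall back to the full frame.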

get_random_nearcenter_fixation(height, width, scale, ratio, normalized_dist_from_center)[source]

Generate a random fixation near the center with specified constraints.

Parameters:
  • height (int) – Image height.

  • width (int) – Image width.

  • scale (float or tuple) – Scale factor(s) for crop area.

  • ratio (float or tuple) – Aspect ratio(s) for the crop.

  • normalized_dist_from_center (float) – Maximum normalized distance from center.

Returns:

  • list: Normalized fixation center [y, x]

  • list: Fixation size [height, width]

Return type:

tuple
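
Example (illustrative sketch, not the fovi implementation). One plausible reading of the near-center constraint is that the fixation center is drawn from a disc of radius normalized_dist_from_center around the image center, in normalized [0, 1] coordinates. The function name sketch_nearcenter_fixation is hypothetical. For brevity the sketch accepts only (min, max) tuples for scale and ratio, and it does not require the crop to stay inside the image; all of these are assumptions.

```python
import math
import random


def sketch_nearcenter_fixation(height, width, scale, ratio, normalized_dist_from_center):
    """Hypothetical stand-in for get_random_nearcenter_fixation (NOT the fovi code)."""
    # Crop size: one draw of the area fraction and the aspect ratio (width / height).
    area = random.uniform(scale[0], scale[1]) * height * width
    aspect = math.exp(random.uniform(math.log(ratio[0]), math.log(ratio[1])))
    # Clamp the size to the image bounds (at least one pixel per side).
    crop_h = min(height, max(1, int(round(math.sqrt(area / aspect)))))
    crop_w = min(width, max(1, int(round(math.sqrt(area * aspect)))))

    # Center: uniform over a disc of the given radius around (0.5, 0.5).
    # The square root makes the density uniform in area rather than in radius.
    radius = normalized_dist_from_center * math.sqrt(random.random())
    angle = random.uniform(0.0, 2.0 * math.pi)
    center = [0.5 + radius * math.sin(angle), 0.5 + radius * math.cos(angle)]
    return center, [crop_h, crop_w]
```

Setting normalized_dist_from_center to 0 in this sketch always yields a center fixation at [0.5, 0.5].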

sample_fixations(img_size, n=1, area_range=None, ratio=None, norm_dist_from_center=None)[source]

Sample multiple fixations for batch processing.

Parameters:
  • img_size (tuple) – Image size (height, width).

  • n (int) – Number of fixations to sample. Defaults to 1.

  • area_range (tuple, optional) – Scale range for crop area. Defaults to None.

  • ratio – Aspect ratio range. Defaults to None.

  • norm_dist_from_center (float, optional) – Maximum normalized distance from center. Defaults to None.

Returns:

  • torch.Tensor: Fixation locations of shape (n, 2)

  • np.ndarray: Fixation sizes of shape (n, 2)

Return type:

tuple

class fovi.sensing.policies.MultiRandomSaccadePolicy(retinal_transform, n_fixations=2, crop_area_range=[0.08, 1], add_aspect_variation=False, nonrandom_val=False, val_crop_size=1, nonrandom_first=False, norm_dist_from_center=None)[source]

Bases: BaseSaccadePolicy

Multi-random saccade policy for generating fixations in images.

This policy randomly selects multiple fixation points within the image, with configurable constraints on crop area, aspect ratio, and position.

retinal_transform

The retinal transform object used for sampling and transforming images.

Type:

RetinalTransform

n_fixations

The number of fixations to generate.

Type:

int

fixation_size

The size of the fixation area.

Type:

int

multi_policy

Indicates if the policy is a multi-policy (i.e., it can handle multiple fixations).

Type:

bool

nonrandom_val

Whether to make validation fixations deterministic.

Type:

bool

norm_dist_from_center

If not None, changes how fixations are sampled: instead of accepting any valid crop, the policy draws a fixation whose center lies within a fractional distance of norm_dist_from_center from the center of the image.

Type:

float

__init__(retinal_transform, n_fixations=2, crop_area_range=[0.08, 1], add_aspect_variation=False, nonrandom_val=False, val_crop_size=1, nonrandom_first=False, norm_dist_from_center=None)[source]

Initialize the multi-random saccade policy.

Parameters:
  • retinal_transform (RetinalTransform) – The retinal transform object used to apply retinal transformations to the images.

  • n_fixations (int, optional) – The number of fixations to generate. Defaults to 2.

  • crop_area_range (list, optional) – Range of crop area fractions [min, max]. Defaults to [0.08, 1].

  • add_aspect_variation (bool, optional) – Whether to add aspect ratio variation to crops. Defaults to False.

  • nonrandom_val (bool, optional) – Whether to make validation fixations deterministic (center). Defaults to False.

  • val_crop_size (float, optional) – Crop size fraction for validation. Defaults to 1.

  • nonrandom_first (bool, optional) – Whether to force the first fixation to be at center. Defaults to False.

  • norm_dist_from_center (float, optional) – Maximum normalized distance from center for fixation sampling. Defaults to None.

forward(x, n_fixations=None, fixations=None, fixation_size=None, area_range=None)[source]

Forward pass for the MultiRandomSaccadePolicy.

This method generates multiple random fixations for the input images and applies the retinal transform to each fixation.

Parameters:
  • x (torch.Tensor) – The input images of shape (n, c, h, w), where n is the batch size, c is the number of channels, h is the height, and w is the width.

  • n_fixations (int, optional) – The number of fixations to generate. Defaults to None, which uses the default number of fixations set in the policy.

  • fixations (list of torch.Tensor, optional) – A list of pre-defined fixations. Defaults to None, which generates random fixations.

  • fixation_size (int, optional) – The size of the fixation area. Defaults to None, which uses the default fixation size set in the policy.

  • area_range (tuple, optional) – The range of areas to sample from for the fixation size. Defaults to None.

Returns:

Dictionary containing:
  • x_fixs (torch.Tensor): The transformed images.

  • fixations (torch.Tensor): The fixation coordinates.

  • fixation_sizes (torch.Tensor): The fixation sizes.

  • fix_deltas (torch.Tensor): The fixation deltas.

Return type:

dict

class fovi.sensing.policies.NoSaccadePolicy(retinal_transform)[source]

Bases: BaseSaccadePolicy

Simple wrapper that does not apply any fixations to the input images.

retinal_transform

The retinal transform object used to apply retinal transformations to the images.

Type:

RetinalTransform

__init__(retinal_transform)[source]

Parameters:

retinal_transform (RetinalTransform) – The retinal transform object used to apply retinal transformations to the images.

forward(x, f1=None, area_range=None, n_fixations=None, fixation_size=None, fixations=None)[source]

Forward pass for the NoSaccadePolicy.

Parameters:
  • x (torch.Tensor) – The input image.

  • f1 (torch.Tensor, optional) – The first fixation coordinates. A center fixation is used only when this is None.

  • area_range (tuple, optional) – Unused, for compatibility with other policies.

  • n_fixations (int, optional) – Unused, for compatibility with other policies.

Returns:

Dictionary containing:
  • x_fixs (torch.Tensor): The transformed image.

  • fixations (torch.Tensor): The fixation coordinates.

  • fixation_sizes (torch.Tensor): The fixation sizes.

  • fix_deltas (torch.Tensor): The fixation deltas.

Return type:

dict

class fovi.sensing.policies.PolicyRegistry[source]

Bases: object

Registry for fixation policy builder functions.

This registry stores builder functions that can construct policy instances from a SaccadeNet object. This allows external repositories to register custom policies without modifying the SaccadeNet code.

__init__()[source]

register(name, builder_fn)[source]

Register a policy builder function.

Parameters:
  • name (str) – Policy name to register.

  • builder_fn (callable) – Function that takes a SaccadeNet instance and returns a policy instance.

get(name)[source]

Get a policy builder by name.

Parameters:

name (str) – Policy name to retrieve.

Returns:

Builder function for the policy.

Return type:

callable

Raises:

ValueError – If the policy name is not found in the registry.

has(name)[source]

Check if a policy is registered.

Parameters:

name (str) – Policy name to check.

Returns:

True if the policy is registered, False otherwise.

Return type:

bool
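
Example (illustrative sketch, not the fovi implementation). The register/get/has contract described above can be mirrored with a small dictionary-backed class. SketchPolicyRegistry and the "always_center" builder are hypothetical names used only for illustration, and the plain string passed to the builder stands in for a SaccadeNet instance. The real PolicyRegistry may store or validate builders differently.

```python
class SketchPolicyRegistry:
    """Hypothetical stand-in mirroring the documented PolicyRegistry interface."""

    def __init__(self):
        # Maps policy name -> builder function.
        self._builders = {}

    def register(self, name, builder_fn):
        """Store a builder that takes a model and returns a policy."""
        self._builders[name] = builder_fn

    def get(self, name):
        """Return the builder for `name`, or raise ValueError if it is unknown."""
        if name not in self._builders:
            raise ValueError(f"Unknown policy: {name!r}")
        return self._builders[name]

    def has(self, name):
        """Return True if a builder is registered under `name`."""
        return name in self._builders


# Usage: register a builder, then construct a policy from a model object.
registry = SketchPolicyRegistry()
registry.register("always_center", lambda model: {"policy": "always_center", "model": model})

if registry.has("always_center"):
    policy = registry.get("always_center")("my_model")  # "my_model" is a placeholder
```

Storing builder functions rather than policy instances defers construction until a model exists, which is what lets an external package register a policy without touching the model code.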