fovi.sensing.policies

class fovi.sensing.policies.BaseSaccadePolicy(retinal_transform, n_fixations)

Bases: Module

Base class for SaccadeNet saccade/fixation policies.

Provides functionality for sampling multiple fixation points from images.

__init__(retinal_transform, n_fixations)

Initialize the base saccade policy.

Parameters:

retinal_transform (RetinalTransform) – The retinal transform object used to apply retinal transformations to the images.
n_fixations (int) – The number of fixations to generate per image.

get_random_crop(height, width, scale, ratio)

Generate a random crop with specified scale and aspect ratio.

Parameters:

height (int) – Image height.
width (int) – Image width.
scale (float or tuple) – Scale factor(s) for crop area.
ratio (float or tuple) – Aspect ratio(s) for the crop.

Returns:

list: Normalized fixation center [y, x]
list: Fixation size [height, width]

Return type:

tuple
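
The sampling procedure behind get_random_crop is not spelled out here. As a rough illustration only, the sketch below assumes a RandomResizedCrop-style approach: draw an area fraction from scale, draw an aspect ratio (width / height) log-uniformly from ratio, place the crop uniformly inside the image, and return the center normalized to [0, 1]. The function name sketch_random_crop, the rng parameter, the retry count, and the [0, 1] normalization are all assumptions and not part of fovi.

```python
import math
import random


def _as_range(value):
    """Accept a single float or a (min, max) pair and return a (min, max) pair."""
    if isinstance(value, (tuple, list)):
        lo, hi = value
        return float(lo), float(hi)
    return float(value), float(value)


def sketch_random_crop(height, width, scale=(0.08, 1.0), ratio=(3 / 4, 4 / 3), rng=None):
    """Hypothetical stand-in for get_random_crop; not the library's implementation.

    Returns ([y, x] center normalized to [0, 1], [crop_height, crop_width]).
    """
    rng = rng or random.Random()
    scale_lo, scale_hi = _as_range(scale)
    ratio_lo, ratio_hi = _as_range(ratio)
    area = height * width

    for _ in range(10):  # retry a few times until the crop fits in the image
        target_area = area * rng.uniform(scale_lo, scale_hi)
        # Sample the aspect ratio (width / height) uniformly in log space.
        aspect = math.exp(rng.uniform(math.log(ratio_lo), math.log(ratio_hi)))
        crop_w = int(round(math.sqrt(target_area * aspect)))
        crop_h = int(round(math.sqrt(target_area / aspect)))
        if 0 < crop_w <= width and 0 < crop_h <= height:
            top = rng.randint(0, height - crop_h)
            left = rng.randint(0, width - crop_w)
            center = [(top + crop_h / 2) / height, (left + crop_w / 2) / width]
            return center, [crop_h, crop_w]

    # Fallback: a centered crop covering the whole image.
    return [0.5, 0.5], [height, width]


center, size = sketch_random_crop(224, 224, rng=random.Random(0))
print(center, size)
```

Passing a seeded random.Random makes the draw reproducible, which is convenient when comparing against the real method's output conventions.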

get_random_nearcenter_fixation(height, width, scale, ratio, normalized_dist_from_center)

Generate a random fixation near the center with specified constraints.

Parameters:

height (int) – Image height.
width (int) – Image width.
scale (float or tuple) – Scale factor(s) for crop area.
ratio (float or tuple) – Aspect ratio(s) for the crop.
normalized_dist_from_center (float) – Maximum normalized distance from center.

Returns:

list: Normalized fixation center [y, x]
list: Fixation size [height, width]

Return type:

tuple
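
One plausible reading of the near-center constraint is sketched below: the crop size is drawn as for a random crop, and the center is drawn uniformly from a disc of radius normalized_dist_from_center around the image center, measured in normalized coordinates. Whether fovi uses a Euclidean disc, a per-axis box, or some other measure is not stated here, so treat the distance definition, the [0, 1] normalization, the function name sketch_nearcenter_fixation, and the rng parameter as assumptions.

```python
import math
import random


def sketch_nearcenter_fixation(height, width, scale, ratio,
                               normalized_dist_from_center, rng=None):
    """Hypothetical stand-in for get_random_nearcenter_fixation.

    Returns ([y, x] center normalized to [0, 1], [crop_height, crop_width]).
    """
    rng = rng or random.Random()
    scale_lo, scale_hi = scale if isinstance(scale, (tuple, list)) else (scale, scale)
    ratio_lo, ratio_hi = ratio if isinstance(ratio, (tuple, list)) else (ratio, ratio)

    # Crop size: area fraction from `scale`, log-uniform aspect ratio (width / height).
    target_area = height * width * rng.uniform(scale_lo, scale_hi)
    aspect = math.exp(rng.uniform(math.log(ratio_lo), math.log(ratio_hi)))
    crop_h = min(height, max(1, round(math.sqrt(target_area / aspect))))
    crop_w = min(width, max(1, round(math.sqrt(target_area * aspect))))

    # Center: uniform over a disc around (0.5, 0.5). Taking the square root of
    # the uniform draw keeps the density uniform over the disc's area.
    radius = normalized_dist_from_center * math.sqrt(rng.random())
    angle = rng.uniform(0.0, 2.0 * math.pi)
    center = [0.5 + radius * math.sin(angle), 0.5 + radius * math.cos(angle)]
    return center, [crop_h, crop_w]


center, size = sketch_nearcenter_fixation(
    224, 224, (0.08, 1.0), (3 / 4, 4 / 3), 0.25, rng=random.Random(1)
)
print(center, size)
```

This sketch does not clamp the crop to the image bounds, so a large crop centered away from the middle can extend past the edge; how the library handles that case is not covered here.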

sample_fixations(img_size, n=1, area_range=None, ratio=None, norm_dist_from_center=None)

Sample multiple fixations for batch processing.

Parameters:

img_size (tuple) – Image size (height, width).
n (int) – Number of fixations to sample. Defaults to 1.
area_range – Scale range for crop area. Defaults to None.
ratio – Aspect ratio range. Defaults to None.
norm_dist_from_center (float, optional) – Maximum normalized distance from center. Defaults to None.

Returns:

torch.Tensor: Fixation locations of shape (n, 2)
np.ndarray: Fixation sizes of shape (n, 2)

Return type:

tuple

class fovi.sensing.policies.MultiRandomSaccadePolicy(retinal_transform, n_fixations=2, crop_area_range=[0.08, 1], add_aspect_variation=False, nonrandom_val=False, val_crop_size=1, nonrandom_first=False, norm_dist_from_center=None)

Bases: BaseSaccadePolicy

Multi-random saccade policy for generating fixations in images.

This policy randomly selects multiple fixation points within the image, with configurable constraints on crop area, aspect ratio, and position.

retinal_transform

The retinal transform object used for sampling and transforming images.

Type: RetinalTransform

n_fixations

The number of fixations to generate.

Type: int

fixation_size

The size of the fixation area.

Type: int

multi_policy

Indicates if the policy is a multi-policy (i.e., it can handle multiple fixations).

Type: bool

nonrandom_val

Whether to make validation fixations deterministic.

Type: bool

norm_dist_from_center

If not None, changes how fixations are sampled: instead of accepting any valid crop, each fixation is drawn within a fractional distance of norm_dist_from_center from the center of the image.

Type: float

__init__(retinal_transform, n_fixations=2, crop_area_range=[0.08, 1], add_aspect_variation=False, nonrandom_val=False, val_crop_size=1, nonrandom_first=False, norm_dist_from_center=None)

Initialize the multi-random saccade policy.

Parameters:

retinal_transform (RetinalTransform) – The retinal transform object used to apply retinal transformations to the images.
n_fixations (int, optional) – The number of fixations to generate. Defaults to 2.
crop_area_range (list, optional) – Range of crop area fractions [min, max]. Defaults to [0.08, 1].
add_aspect_variation (bool, optional) – Whether to add aspect ratio variation to crops. Defaults to False.
nonrandom_val (bool, optional) – Whether to make validation fixations deterministic (center). Defaults to False.
val_crop_size (float, optional) – Crop size fraction for validation. Defaults to 1.
nonrandom_first (bool, optional) – Whether to force the first fixation to be at center. Defaults to False.
norm_dist_from_center (float, optional) – Maximum normalized distance from center for fixation sampling. Defaults to None.

forward(x, n_fixations=None, fixations=None, fixation_size=None, area_range=None)

Forward pass for the MultiRandomSaccadePolicy.

This method generates multiple random fixations for the input images and applies the retinal transform to each fixation.

Parameters:

x (torch.Tensor) – The input images of shape (n, c, h, w), where n is the batch size, c is the number of channels, h is the height, and w is the width.
n_fixations (int, optional) – The number of fixations to generate. Defaults to None, which uses the default number of fixations set in the policy.
fixations (list of torch.Tensor, optional) – A list of pre-defined fixations. Defaults to None, which generates random fixations.
fixation_size (int, optional) – The size of the fixation area. Defaults to None, which uses the default fixation size set in the policy.
area_range (tuple, optional) – The range of areas to sample from for the fixation size. Defaults to None.

Returns:

Dictionary containing:

x_fixs (torch.Tensor): The transformed images.
fixations (torch.Tensor): The fixation coordinates.
fixation_sizes (torch.Tensor): The fixation sizes.
fix_deltas (torch.Tensor): The fixation deltas.

Return type:

dict

class fovi.sensing.policies.NoSaccadePolicy(retinal_transform)

Bases: BaseSaccadePolicy

Simple wrapper that does not apply any fixations to the input images.

retinal_transform

The retinal transform object used to apply retinal transformations to the images.

Type: RetinalTransform

__init__(retinal_transform)

Parameters: retinal_transform (RetinalTransform) – The retinal transform object used to apply retinal transformations to the images.

forward(x, f1=None, area_range=None, n_fixations=None, fixation_size=None, fixations=None)

Forward pass for the NoSaccadePolicy.

Parameters:

x (torch.Tensor) – The input image.
f1 (torch.Tensor, optional) – The first fixation coordinates. Leave as None to use a center fixation.
area_range (tuple, optional) – Unused, for compatibility with other policies.
n_fixations (int, optional) – Unused, for compatibility with other policies.

Returns:

Dictionary containing:

x_fixs (torch.Tensor): The transformed image.
fixations (torch.Tensor): The fixation coordinates.
fixation_sizes (torch.Tensor): The fixation sizes.
fix_deltas (torch.Tensor): The fixation deltas.

Return type:

dict

class fovi.sensing.policies.PolicyRegistry

Bases: object

Registry for fixation policy builder functions.

This registry stores builder functions that can construct policy instances from a SaccadeNet object. This allows external repositories to register custom policies without modifying the SaccadeNet code.

__init__()

register(name, builder_fn)

Register a policy builder function.

Parameters:

name (str) – Policy name to register.
builder_fn (callable) – Function that takes a SaccadeNet instance and returns a policy instance.

get(name)

Get a policy builder by name.

Parameters: name (str) – Policy name to retrieve.
Returns: Builder function for the policy.
Return type: callable
Raises: ValueError – If the policy name is not found in the registry.

has(name)

Check if a policy is registered.

Parameters: name (str) – Policy name to check.
Returns: True if the policy is registered, False otherwise.
Return type: bool
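
To make the register / get / has contract concrete, here is a minimal self-contained sketch of a registry with the same three methods and the documented ValueError on unknown names. It is a stand-in written for illustration, not fovi's PolicyRegistry: the class name SketchPolicyRegistry, the dict-based storage, the error message, the build_center_policy builder, and the SimpleNamespace used in place of a SaccadeNet instance are all hypothetical.

```python
from types import SimpleNamespace


class SketchPolicyRegistry:
    """Illustrative stand-in mirroring the documented register/get/has methods."""

    def __init__(self):
        # Maps policy name -> builder function (assumed storage layout).
        self._builders = {}

    def register(self, name, builder_fn):
        """Store a builder that takes a network object and returns a policy."""
        self._builders[name] = builder_fn

    def get(self, name):
        """Return the builder for `name`, raising ValueError if it is unknown."""
        if name not in self._builders:
            raise ValueError(
                f"Unknown policy {name!r}; registered: {sorted(self._builders)}"
            )
        return self._builders[name]

    def has(self, name):
        """Return True if a builder is registered under `name`."""
        return name in self._builders


def build_center_policy(net):
    # Hypothetical builder: reads settings from the network-like object and
    # returns a simple description of the policy it would construct.
    return {"kind": "center", "n_fixations": net.n_fixations}


registry = SketchPolicyRegistry()
registry.register("center", build_center_policy)

fake_net = SimpleNamespace(n_fixations=3)  # stands in for a SaccadeNet instance
if registry.has("center"):
    policy = registry.get("center")(fake_net)
    print(policy)
```

Because builders receive the network object rather than a fixed argument list, an external repository can read whatever configuration it needs from that object, which matches the stated goal of adding policies without modifying the SaccadeNet code.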