fovi.arch.knn
- class fovi.arch.knn.KNNBaseLayer[source]
Bases: object
Abstract base class implementing basic KNN functionality for foveated vision.
This class provides the foundation for KNN-based operations in foveated neural networks, including distance computation in both visual and cortical spaces.
- _compute_knns(batch_size=None)[source]
Compute distances between input and output coordinates in visual or cortical space, using a batched approach to limit memory demands for high-resolution coordinate systems.
This method supports multiple distance computation strategies:
- Euclidean distances in visual space (an interesting baseline)
- Euclidean distances in cortical space (most typical; fast and bio-plausible)
- Geodesic distances on the cortical surface (slower and more exact, but very well approximated by Euclidean distances for typical small local RFs, so usually unnecessary. We tend to use it for ViTs, which have fewer coordinates, larger kernels, and only one KNNConv, but not for CNNs, which have many layers of KNNConvs with more coordinates and smaller kernels.)
- Parameters:
batch_size (int, optional) – Number of output coordinates to process at once. If None, uses a default based on available memory.
- Returns:
- (knn_indices, knn_distances) where:
knn_indices: Tensor of shape (k, num_output_coords) with indices of k nearest neighbors
knn_distances: Tensor of shape (k, num_output_coords) with distances to k nearest neighbors
- Return type:
tuple of (torch.Tensor, torch.Tensor)
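The batched Euclidean strategy can be sketched as follows. This is a minimal illustration of the idea, not the fovi implementation; names and the numpy backend are assumptions.

```python
import numpy as np

def batched_knn(in_coords, out_coords, k, batch_size=1024):
    """Sketch of batched KNN: process output coordinates in chunks so only
    one (batch_size, n_in) distance block is ever materialized at once."""
    n_out = out_coords.shape[0]
    knn_indices = np.empty((k, n_out), dtype=np.int64)
    knn_distances = np.empty((k, n_out), dtype=np.float64)
    for start in range(0, n_out, batch_size):
        stop = min(start + batch_size, n_out)
        # Pairwise Euclidean distances for this chunk only: (chunk, n_in)
        d = np.linalg.norm(
            out_coords[start:stop, None, :] - in_coords[None, :, :], axis=-1
        )
        idx = np.argsort(d, axis=1)[:, :k]  # k nearest inputs per output coord
        knn_indices[:, start:stop] = idx.T
        knn_distances[:, start:stop] = np.take_along_axis(d, idx, axis=1).T
    return knn_indices, knn_distances
```

Peak memory scales with `batch_size * n_in` rather than `n_out * n_in`, which is the point of the batching.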
- _compute_geodesic_distances()[source]
Compute geodesic distances using the existing approach (already memory efficient).
- _compute_euclidean_distances_batched(batch_size)[source]
Compute euclidean distances using batched approach.
- _compute_all_distances()[source]
Compute distances between all input and output coordinates in visual or cortical space. Typically this isn’t used, but we need it for the PartitioningPatchEmbedding.
This method supports multiple distance computation strategies:
- Euclidean distances in visual space (an interesting baseline)
- Euclidean distances in cortical space (most typical; fast and bio-plausible)
- Geodesic distances on the cortical surface (slower and more exact, but very well approximated by Euclidean distances for typical small local RFs, so usually unnecessary. We tend to use it for ViTs, which have fewer coordinates, larger kernels, and only one KNNConv, but not for CNNs, which have many layers of KNNConvs with more coordinates and smaller kernels.)
- Returns:
Distance matrix between input and output coordinates.
- Return type:
torch.Tensor
- class fovi.arch.knn.KNNGetterLayer(k, in_coords, out_coords, device='cuda', sample_cortex=True, batch_size=None)[source]
Bases: Module, KNNBaseLayer
Simple KNN layer that gathers the k nearest neighbors and returns the input data reformatted into neighborhoods.
- __init__(k, in_coords, out_coords, device='cuda', sample_cortex=True, batch_size=None)[source]
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(X_l)[source]
Get KNN neighborhoods.
- Parameters:
X_l (torch.Tensor) – Input features of shape [batch_size, channels, num_nodes].
- Returns:
Gathered neighborhood features (the input reformatted into KNN neighborhoods).
- Return type:
torch.Tensor
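The gather step this layer performs can be sketched with plain fancy indexing. This is an illustrative numpy sketch, not the fovi implementation; the output layout with a separate k axis is an assumption.

```python
import numpy as np

def gather_knn_neighborhoods(X, knn_indices):
    # X: (batch, channels, n_in) node features
    # knn_indices: (k, n_out) indices of each output node's k nearest inputs
    # Advanced indexing broadcasts the (k, n_out) index over batch and
    # channels, reformatting the input into neighborhoods:
    # result shape (batch, channels, k, n_out).
    return X[:, :, knn_indices]
```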
- class fovi.arch.knn.KNNPoolingLayer(k, in_coords, out_coords, mode='max', device='cuda', sample_cortex=True, gauss_sigma=10, batch_size=None)[source]
Bases: Module, KNNBaseLayer
K-Nearest Neighbors pooling layer for foveated vision.
This layer performs pooling operations over k-nearest neighbors in either visual or cortical space, supporting various pooling modes.
- Parameters:
k (int) – Number of nearest neighbors to consider.
in_coords (SamplingCoords) – Input sampling coordinates object.
out_coords (SamplingCoords) – Output sampling coordinates object.
mode (str) – Pooling mode (‘max’, ‘avg’, ‘sum’, ‘gaussian’).
device (str) – PyTorch device to run on.
sample_cortex (bool or str) – Whether/how to sample cortical space:
- False: sample the visual field
- True: sample cortical space using Euclidean distances (fast, approximate)
- ‘geodesic’: sample cortical space using geodesic distances (slower, more accurate)
gauss_sigma (float) – Sigma for mode=’gaussian’ pooling; unused for other modes.
batch_size (int, optional) – Number of output coordinates to process at once for memory efficiency. If None, uses a default based on available memory.
- __init__(k, in_coords, out_coords, mode='max', device='cuda', sample_cortex=True, gauss_sigma=10, batch_size=None)[source]
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- _compute_gaussian_weights()[source]
Compute Gaussian weights for local pooling based on distances between output nodes and their input neighbors.
- Returns:
Gaussian weights of shape (k, num_output_nodes)
- Return type:
torch.Tensor
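A minimal sketch of distance-based Gaussian weighting: each neighbor is weighted by a Gaussian of its distance, then the weights of each output node are normalized over the k neighbors. The exact kernel and normalization used by fovi are assumptions of this sketch.

```python
import numpy as np

def gaussian_pool_weights(knn_distances, gauss_sigma=10.0):
    # knn_distances: (k, num_output_nodes) distances to each node's neighbors.
    w = np.exp(-knn_distances**2 / (2.0 * gauss_sigma**2))
    # Normalize over k so each output node's weights sum to 1.
    return w / w.sum(axis=0, keepdims=True)
```

Closer neighbors receive larger weights, so mode=’gaussian’ behaves like a smooth, distance-aware average.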
- forward(X_l)[source]
Apply KNN pooling to input features.
- Parameters:
X_l (torch.Tensor) – Input features of shape [batch_size, channels, num_nodes].
- Returns:
Pooled features of shape [batch_size, channels, num_output_nodes].
- Return type:
torch.Tensor
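The pooling forward pass reduces each gathered neighborhood along its k axis. A minimal numpy sketch of the max/avg/sum modes (the gaussian mode additionally applies distance-based weights); function and variable names are illustrative.

```python
import numpy as np

def knn_pool(X, knn_indices, mode='max'):
    # X: (batch, channels, num_nodes); knn_indices: (k, num_output_nodes)
    neighborhoods = X[:, :, knn_indices]   # (batch, channels, k, n_out)
    if mode == 'max':
        return neighborhoods.max(axis=2)
    if mode == 'avg':
        return neighborhoods.mean(axis=2)
    if mode == 'sum':
        return neighborhoods.sum(axis=2)
    raise ValueError(f"unsupported mode: {mode}")
```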
- class fovi.arch.knn.KNNConvLayer(in_channels, out_channels, k, in_coords, out_coords, device='cuda', arch_flag='', sample_cortex=True, bias=False, ref_frame_side_length=None, batch_size=None)[source]
Bases: Module, KNNBaseLayer
K-Nearest Neighbors convolution layer for foveated vision.
This layer performs convolution operations over k-nearest neighbors in either visual or cortical space, with learnable filters aligned to spatial kernels.
- Parameters:
in_channels (int) – Number of input channels.
out_channels (int) – Number of output channels.
k (int) – Number of nearest neighbors to consider.
in_coords (SamplingCoords) – Input sampling coordinates object.
out_coords (SamplingCoords) – Output sampling coordinates object.
device (str) – PyTorch device to run on.
arch_flag (str) – Architecture flag for reference coordinate computation.
sample_cortex (bool) – Whether to sample cortical space.
bias (bool) – Whether to use bias in convolution.
ref_frame_side_length (int, optional) – Manual specification of reference frame side length.
batch_size (int, optional) – Number of output coordinates to process at once for memory efficiency. If None, uses a default based on available memory.
- __init__(in_channels, out_channels, k, in_coords, out_coords, device='cuda', arch_flag='', sample_cortex=True, bias=False, ref_frame_side_length=None, batch_size=None)[source]
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- _pad_and_gather_knns(X_l)[source]
Pad input features and gather KNN features.
- Parameters:
X_l (torch.Tensor) – Input features.
- Returns:
Gathered KNN features.
- Return type:
torch.Tensor
- _apply_local_rf(knn_features)[source]
Reweight features according to the local receptive field (i.e., the kernel mapping that aligns KNNs with the reference kernel).
- Parameters:
knn_features (torch.Tensor) – KNN features tensor to process
- Returns:
Processed features after applying local receptive field
- Return type:
torch.Tensor
- _apply_local_rf_to_weights()[source]
Reweight convolutional weights according to the local receptive field (i.e., the kernel mapping that aligns KNNs with the reference kernel).
Only used for visualization purposes; in the forward pass, the local_rf is applied to the KNN features instead.
- forward(X_l)[source]
Apply convolution using k-nearest neighbors.
- Parameters:
X_l (torch.Tensor) – Node features from layer l [batch, d_l, N_l]
- Returns:
Node features from layer l+1 [batch, d_l+1, N_l+1]
- Return type:
torch.Tensor
- compute_reference_coords(arch_flag)[source]
Compute reference coordinates for the convolution kernel.
The reference coordinates define the shape of the convolution kernel and can be:
- A square grid matching the KNN size
- A circular grid matching the KNN size
- The first neighborhood of coordinates (foveal reference)
- Parameters:
arch_flag (str) – Architecture flag indicating the reference coordinate style. Checked for containing:
- ‘fovref’: use the first neighborhood as reference (not typically used)
- ‘circref’: use a circular reference grid (not typically used)
- ‘doubleres’: double the resolution of the reference grid (smoother alignment)
- default: use a square reference grid with the same total number of elements as the neighborhood
Note
Sets self.ref_coords (torch.Tensor): Reference coordinates tensor of shape [num_ref_coords, 2]
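The default square reference grid can be sketched as below. The coordinate range, spacing, and the side-length rule (ceil of the square root of k) are assumptions of this sketch, not the fovi implementation.

```python
import numpy as np

def square_ref_grid(k, doubleres=False):
    # Square grid with roughly k positions, centered on the origin;
    # 'doubleres' doubles the grid resolution for smoother alignment.
    side = int(np.ceil(np.sqrt(k)))
    if doubleres:
        side *= 2
    lin = np.linspace(-1.0, 1.0, side)
    gx, gy = np.meshgrid(lin, lin, indexing='ij')
    return np.stack([gx.ravel(), gy.ravel()], axis=1)  # (side**2, 2)
```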
- compute_local_rf()[source]
Compute local receptive field weights for each output coordinate.
This method computes the mapping between input KNN coordinates and reference coordinates to determine how each input point contributes to the output. Each KNN neighbor is mapped to its nearest reference grid position (one-hot assignment).
- Returns:
- Local receptive field weights of shape [n_out, k, n_ref] where:
n_out is number of output coordinates
k is number of nearest neighbors
n_ref is number of reference coordinates
- Return type:
torch.Tensor
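The one-hot nearest-reference assignment can be sketched as follows. This is an illustrative numpy version; it assumes the neighbor coordinates are already expressed in the reference kernel's frame, which the fovi implementation handles internally.

```python
import numpy as np

def compute_local_rf_onehot(knn_coords, ref_coords):
    # knn_coords: (n_out, k, 2) coordinates of each output node's k neighbors
    # ref_coords: (n_ref, 2) reference kernel positions
    # Distance from every neighbor to every reference position:
    d = np.linalg.norm(
        knn_coords[:, :, None, :] - ref_coords[None, None, :, :], axis=-1
    )
    nearest = d.argmin(axis=-1)                     # (n_out, k)
    n_out, k = nearest.shape
    local_rf = np.zeros((n_out, k, ref_coords.shape[0]))
    # One-hot: each neighbor contributes only at its nearest reference slot.
    local_rf[np.arange(n_out)[:, None], np.arange(k)[None, :], nearest] = 1.0
    return local_rf
```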
- load_conv2d_weights(conv2d: Conv2d, strict: bool = False)[source]
Load weights from a pretrained nn.Conv2d into this layer.
The Conv2d kernel is resampled to match ref_grid_size if necessary. The H and W dimensions are transposed to align the Conv2d weight convention with the grid_sample coordinate convention used by this layer: (row, col) -> (col, row).
- Parameters:
conv2d – A nn.Conv2d layer to load weights from.
strict – If True, raises error if shapes don’t match. If False, resamples the kernel to match ref_grid_size.
- class fovi.arch.knn.KNNDepthwiseSeparableConvLayer(in_channels, out_channels, k, in_coords, out_coords, device='cuda', arch_flag='', sample_cortex=True, bias=False, batch_size=None, ref_frame_side_length=None)[source]
Bases: KNNConvLayer
Depthwise separable KNN convolution layer for foveated vision.
This layer implements depthwise separable convolution over k-nearest neighbors, which reduces computational complexity compared to standard KNN convolution.
- Parameters:
in_channels (int) – Number of input channels.
out_channels (int) – Number of output channels.
k (int) – Number of nearest neighbors to consider.
in_coords (SamplingCoords) – Input sampling coordinates object.
out_coords (SamplingCoords) – Output sampling coordinates object.
device (str) – PyTorch device to run on.
arch_flag (str) – Architecture flag for reference coordinate computation.
sample_cortex (bool) – Whether to sample cortical space.
bias (bool) – Whether to use bias in convolution.
batch_size (int, optional) – Number of output coordinates to process at once for memory efficiency.
- __init__(in_channels, out_channels, k, in_coords, out_coords, device='cuda', arch_flag='', sample_cortex=True, bias=False, batch_size=None, ref_frame_side_length=None)[source]
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(X_l)[source]
Apply convolution using k-nearest neighbors.
- Parameters:
X_l (torch.Tensor) – Node features from layer l [batch, d_l, N_l]
- Returns:
Node features from layer l+1 [batch, d_l+1, N_l+1]
- Return type:
torch.Tensor
- class fovi.arch.knn.KNNDepthwiseConvLayer(in_channels, out_channels, k, in_coords, out_coords, device='cuda', arch_flag='', sample_cortex=True, bias=False, batch_size=None, ref_frame_side_length=None)[source]
Bases: KNNConvLayer
Depthwise KNN convolution layer for foveated vision.
This layer implements depthwise convolution over k-nearest neighbors, where each input channel is convolved separately.
- Parameters:
in_channels (int) – Number of input channels.
out_channels (int) – Number of output channels.
k (int) – Number of nearest neighbors to consider.
in_coords (SamplingCoords) – Input sampling coordinates object.
out_coords (SamplingCoords) – Output sampling coordinates object.
device (str) – PyTorch device to run on.
arch_flag (str) – Architecture flag for reference coordinate computation.
sample_cortex (bool) – Whether to sample cortical space.
bias (bool) – Whether to use bias in convolution.
batch_size (int, optional) – Number of output coordinates to process at once for memory efficiency.
- __init__(in_channels, out_channels, k, in_coords, out_coords, device='cuda', arch_flag='', sample_cortex=True, bias=False, batch_size=None, ref_frame_side_length=None)[source]
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(X_l)[source]
Apply convolution using k-nearest neighbors.
- Parameters:
X_l (torch.Tensor) – Node features from layer l [batch, d_l, N_l]
- Returns:
Node features from layer l+1 [batch, d_l+1, N_l+1]
- Return type:
torch.Tensor
- fovi.arch.knn.compute_receptive_field(knn_indices_list, layer_of_interest, unit_of_interest, input_size, plot_layer=0)[source]
Compute the effective receptive field of a unit mapped to the input space.
- Parameters:
knn_indices_list (list) – List of knn_indices matrices for each layer.
layer_of_interest (int) – Index of the layer where the unit of interest resides.
unit_of_interest (int) – Index of the unit of interest in the layer of interest.
input_size (int) – Total number of units in the input space.
plot_layer (int, optional) – Layer to plot from. Defaults to 0.
- Returns:
Counter array of shape (input_size,) indicating the occurrence count of input units.
- Return type:
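The idea of mapping a unit back to the input space can be sketched by expanding each unit into the k input units it reads from, layer by layer. This is an illustrative sketch (plot_layer omitted, cost grows as k to the depth), not the fovi implementation.

```python
import numpy as np

def receptive_field_counts(knn_indices_list, layer_of_interest,
                           unit_of_interest, input_size):
    # Walk backwards from the unit of interest: at each layer, expand every
    # current unit into its k input units, keeping multiplicity.
    units = [unit_of_interest]
    for layer in range(layer_of_interest, -1, -1):
        knn = np.asarray(knn_indices_list[layer])  # (k, n_units_at_layer)
        units = [int(i) for u in units for i in knn[:, u]]
    # Count how often each input unit is reached.
    counts = np.zeros(input_size, dtype=np.int64)
    for u in units:
        counts[u] += 1
    return counts
```

Thresholding the counts at zero gives the binary receptive field computed by compute_binary_receptive_field.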
- fovi.arch.knn.compute_binary_receptive_field(knn_indices_list, layer_of_interest, unit_of_interest, input_size, plot_layer=0)[source]
Compute the receptive field comprising all units that contribute at all to the unit of interest.
- Parameters:
knn_indices_list (list) – List of knn_indices matrices for each layer.
layer_of_interest (int) – Index of the layer where the unit of interest resides.
unit_of_interest (int) – Index of the unit of interest in the layer of interest.
input_size (int) – Total number of units in the input space.
plot_layer (int, optional) – Layer to plot from. Defaults to 0.
- Returns:
Binary array of shape (input_size,) indicating which input units contribute to the unit of interest.
- Return type:
- fovi.arch.knn.get_in_out_coords(in_res, fov, cmf_a, stride, style='isotropic', auto_match_cart_resources=1, in_cart_res=None, device='cuda', in_coords=None, force_out_match_less_than=True, max_out_coord_val=1)[source]
Convenience function to generate input and output coordinates for KNN layers.
- Parameters:
in_res (int) – Input resolution.
fov (float) – Field of view diameter in degrees.
cmf_a (float) – The a parameter in the cortical magnification function M(r) = 1/(r+a); smaller values give stronger foveation.
stride (int) – Stride factor for downsampling.
style (str, optional) – Sampling style. Defaults to ‘isotropic’.
auto_match_cart_resources (int, optional) – Auto-match parameter. Defaults to 1.
in_cart_res (int, optional) – Input cartesian resolution. Defaults to None.
device (str, optional) – PyTorch device. Defaults to ‘cuda’.
in_coords (SamplingCoords, optional) – Pre-computed input coordinates. Defaults to None.
force_out_match_less_than (bool, optional) – When auto_match_cart_resources is set, constrains the number of output coordinates to be no greater than the target cartesian resolution; if False, the closest match is chosen, which could be greater. Defaults to True.
max_out_coord_val (int or str, optional) – Maximum output coordinate value. Defaults to 1.
- Returns:
- A tuple containing:
SamplingCoords: Input coordinates
SamplingCoords: Output coordinates
int: Output cartesian resolution
- Return type:
tuple
- fovi.arch.knn.get_knn_conv_layer(name: str)[source]
Get a KNN convolution layer class by name.
- Parameters:
name – Name of the layer class.
- Returns:
The layer class.
- Raises:
ValueError – If name is not in the registry.