fovi.arch.alexnet

class fovi.arch.alexnet.ConvBlock(in_c, out_c, kernel_size, stride, padding, dilation=1, groups=1, inp=None, conv=nn.Conv2d, norm=partial(nn.GroupNorm, 32), act=partial(nn.ReLU, inplace=True), dropout=None, pool=partial(nn.MaxPool2d(3, 2)), after_pool=None, polar=False, out=None)[source]

Bases: Sequential

A configurable convolutional block with optional normalization, activation, dropout, and pooling.

This block provides a flexible way to construct convolutional layers with various preprocessing and postprocessing operations, including support for polar coordinate padding.

Parameters:

in_c (int) – Number of input channels.
out_c (int) – Number of output channels.
kernel_size (int) – Size of the convolutional kernel.
stride (int) – Stride of the convolution.
padding (int) – Padding for the convolution.
dilation (int, optional) – Dilation rate for the convolution. Defaults to 1.
groups (int, optional) – Number of groups for grouped convolution. Defaults to 1.
inp (callable, optional) – Input preprocessing layer factory. Defaults to None.
conv (nn.Module, optional) – Convolution layer class. Defaults to nn.Conv2d.
norm (callable, optional) – Normalization layer factory. Defaults to partial(nn.GroupNorm, 32), i.e. GroupNorm with 32 groups.
act (callable, optional) – Activation function factory. Defaults to partial(nn.ReLU, inplace=True).
dropout (callable, optional) – Dropout layer factory. Defaults to None.
pool (callable, optional) – Pooling layer factory. Defaults to MaxPool2d(3, 2) (kernel size 3, stride 2).
after_pool (callable, optional) – Post-pooling layer factory. Defaults to None.
polar (bool, optional) – Whether to use polar coordinate padding. Defaults to False.
out (callable, optional) – Output postprocessing layer factory. Defaults to None.
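
Several parameters above are "layer factories": callables that build a layer when the block is constructed. The following is a minimal sketch of that pattern, not the library's implementation. It uses a stand-in class instead of nn.GroupNorm so that it runs without PyTorch, and it assumes (this is not confirmed by the reference) that the block calls the normalization factory with the output channel count and skips any factory that is None.

```python
from functools import partial


class GroupNormStandIn:
    """Stand-in mirroring the nn.GroupNorm(num_groups, num_channels) argument order."""

    def __init__(self, num_groups, num_channels):
        # nn.GroupNorm rejects channel counts that are not divisible by num_groups.
        if num_channels % num_groups != 0:
            raise ValueError("num_channels must be divisible by num_groups")
        self.num_groups = num_groups
        self.num_channels = num_channels


# A layer factory binds the fixed arguments now and receives the rest later.
norm_factory = partial(GroupNormStandIn, 32)


def build_norm(factory, out_c):
    """Assumed calling convention: factory(out_c), or no layer when factory is None."""
    return None if factory is None else factory(out_c)


norm = build_norm(norm_factory, 64)   # 64 channels split into 32 groups
skipped = build_norm(None, 64)        # None disables the layer

try:
    build_norm(norm_factory, 48)      # 48 is not a multiple of 32
    bad_ok = True
except ValueError:
    bad_ok = False
```

Under that assumed convention, the default norm would require out_c to be a multiple of 32; a different factory such as partial(nn.GroupNorm, 8) relaxes the constraint.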

__init__(in_c, out_c, kernel_size, stride, padding, dilation=1, groups=1, inp=None, conv=nn.Conv2d, norm=partial(nn.GroupNorm, 32), act=partial(nn.ReLU, inplace=True), dropout=None, pool=partial(nn.MaxPool2d(3, 2)), after_pool=None, polar=False, out=None)[source]

Initialize internal Module state, shared by both nn.Module and ScriptModule.

fovi.arch.alexnet.get_backbone(in_channels=3, kernels=baseline_alexnet_kernels['base'], w=1, preprocess=None, inp=lambda *_: ..., conv=lambda *_: ..., norm=lambda idx, args: ..., act=lambda *_: ..., dropout=lambda *_: ..., pool=lambda idx, *_: ..., after_pool=None, out=lambda *_: ..., avgpool=lambda *_: ..., polar=False)[source]

Build a configurable AlexNet-style backbone network.

This function constructs a backbone network using the specified kernel configurations and layer factories. It provides extensive customization options for each component.

Parameters:

in_channels (int, optional) – Number of input channels. Defaults to 3.
kernels (list, optional) – List of kernel specifications (out_c, ks, stride, pad, groups). Defaults to baseline_alexnet_kernels['base'].
w (float, optional) – Width multiplier for channel counts. Defaults to 1.
preprocess (callable, optional) – Preprocessing layer factory. Defaults to None.
inp (callable, optional) – Input layer factory function(idx, args). Defaults to None.
conv (callable, optional) – Convolution layer factory function(idx, args). Defaults to nn.Conv2d.
norm (callable, optional) – Normalization layer factory function(idx, args). Defaults to BatchNorm2d.
act (callable, optional) – Activation layer factory function(idx, args). Defaults to ReLU.
dropout (callable, optional) – Dropout layer factory function(idx, args). Defaults to None.
pool (callable, optional) – Pooling layer factory function(idx, args). Defaults to MaxPool2d after layers 0, 1, 4.
after_pool (callable, optional) – Post-pooling layer factory. Defaults to None.
out (callable, optional) – Output layer factory function(idx, args). Defaults to None.
avgpool (callable, optional) – Final average pooling layer factory. Defaults to AdaptiveAvgPool2d((6,6)).
polar (bool, optional) – Whether to use polar coordinate padding. Defaults to False.

Returns:

The constructed backbone network.

Return type:

nn.Sequential
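
The per-layer factories and the width multiplier described above can be sketched in plain Python. This is an illustration only: the kernel values are the classic AlexNet layout and are not necessarily the contents of baseline_alexnet_kernels['base'], scale_kernels is a hypothetical helper, and how get_backbone actually rounds scaled channel counts or what it passes as args is assumed, not documented here.

```python
# Illustrative specs in the documented (out_c, ks, stride, pad, groups) order.
# Assumption: classic AlexNet values, NOT read from the library.
example_kernels = [
    (64, 11, 4, 2, 1),
    (192, 5, 1, 2, 1),
    (384, 3, 1, 1, 1),
    (256, 3, 1, 1, 1),
    (256, 3, 1, 1, 1),
]


def scale_kernels(kernels, w=1.0):
    """Hypothetical helper: scale each out_c by the width multiplier w."""
    return [
        (int(out_c * w), ks, stride, pad, groups)
        for out_c, ks, stride, pad, groups in kernels
    ]


def pool_factory(idx, *_):
    """Per-layer factory: describe a pool after layers 0, 1 and 4, else None."""
    return ("MaxPool2d", 3, 2) if idx in (0, 1, 4) else None


half_width = scale_kernels(example_kernels, w=0.5)
pool_plan = [pool_factory(idx, spec) for idx, spec in enumerate(half_width)]

for idx, (spec, pool) in enumerate(zip(half_width, pool_plan)):
    print(idx, spec, pool)
```

Because each factory receives the layer index, a single callable can vary the layer type or disable it per position, which is how the default pooling is restricted to layers 0, 1 and 4.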

fovi.arch.alexnet.get_repr_size(model, in_channels=3, img_size=224)[source]

Compute the flattened representation size of a model’s output.

Runs a forward pass with a random input to determine the output size.

Parameters:

model (nn.Module) – The model to analyze.
in_channels (int, optional) – Number of input channels. Defaults to 3.
img_size (int, optional) – Input image size (assumed square). Defaults to 224.

Returns:

The flattened output size of the model.

Return type:

int
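
get_repr_size measures the size empirically with a forward pass. As a cross-check, the same number can be traced by hand with the standard Conv2d/MaxPool2d output-size formula. The sketch below assumes a plain stack of convolutions with MaxPool2d(3, 2) after layers 0, 1 and 4 and a final AdaptiveAvgPool2d((6, 6)); the kernel values are the classic AlexNet layout and are an assumption, not the library's defaults.

```python
def out_size(n, kernel_size, stride, padding, dilation=1):
    """Output length along one axis for Conv2d / MaxPool2d (floor mode)."""
    return (n + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# Assumed specs in (out_c, ks, stride, pad, groups) order (classic AlexNet).
kernels = [
    (64, 11, 4, 2, 1),
    (192, 5, 1, 2, 1),
    (384, 3, 1, 1, 1),
    (256, 3, 1, 1, 1),
    (256, 3, 1, 1, 1),
]


def trace_repr_size(kernels, img_size=224, pool_after=(0, 1, 4), avgpool=(6, 6)):
    """Return (flattened size after avgpool, spatial size before avgpool)."""
    size = img_size
    for idx, (_out_c, ks, stride, pad, _groups) in enumerate(kernels):
        size = out_size(size, ks, stride, pad)      # convolution
        if idx in pool_after:
            size = out_size(size, 3, 2, 0)          # MaxPool2d(3, 2)
    channels = kernels[-1][0]
    # Adaptive average pooling fixes the final spatial size regardless of `size`.
    return channels * avgpool[0] * avgpool[1], size


repr_size, pre_avgpool = trace_repr_size(kernels)
print(repr_size, pre_avgpool)
```

For these assumed specs the trace gives 256 * 6 * 6 = 9216 features. Custom layers, or polar padding that changes the effective padding, would make the hand trace diverge, which is one reason to measure with a forward pass instead.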

fovi.arch.alexnet.build_model(weights=None, progress=True, backbone=None, mlp=None, backbone_kwargs=None, mlp_kwargs=None, repr_size_in_channels=3, img_size=224)[source]

Build a complete model with backbone and MLP projection head.

Constructs a model by combining a backbone network with an MLP projection head, wrapped in a BackboneProjectorWrapper for a unified forward pass.

Parameters:

weights – Optional pretrained weights to load. Defaults to None.
progress (bool, optional) – Whether to show progress bar when loading weights. Defaults to True.
backbone (nn.Module, optional) – Pre-built backbone module. If None, one is created using backbone_kwargs. Defaults to None.
mlp (nn.Module, optional) – Pre-built MLP module. If None, one is created using mlp_kwargs. Defaults to None.
backbone_kwargs (dict, optional) – Keyword arguments for get_backbone(). Defaults to None.
mlp_kwargs (dict, optional) – Keyword arguments for get_mlp(). Defaults to None.
repr_size_in_channels (int, optional) – Input channels for computing repr size. Defaults to 3.
img_size (int, optional) – Image size for computing repr size. Defaults to 224.

Returns:

The complete model with backbone and projector.

Return type:

BackboneProjectorWrapper

fovi.arch.alexnet.alexnet2023_baseline(mlp_kwargs=None, weights=None, progress=True, img_size=224, **backbone_kwargs)[source]

Create an AlexNet-2023 baseline model with configurable backbone and MLP.

This is a convenience wrapper around build_model() for creating AlexNet-style models with default configurations.

Parameters:

mlp_kwargs (dict, optional) – Keyword arguments for the MLP head. Defaults to None.
weights – Optional pretrained weights to load. Defaults to None.
progress (bool, optional) – Whether to show progress bar when loading weights. Defaults to True.
img_size (int, optional) – Input image size. Defaults to 224.
**backbone_kwargs – Additional keyword arguments passed to get_backbone().

Returns:

The complete AlexNet-2023 model.

Return type:

BackboneProjectorWrapper
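
The way the convenience wrapper routes its arguments can be sketched with stand-in functions. Everything below is hypothetical scaffolding that runs without the library: build_model_standin and baseline_standin only mimic the documented signatures, and the "hidden" key is an invented placeholder, since the keyword arguments accepted by get_mlp() are not listed in this reference.

```python
def build_model_standin(backbone=None, backbone_kwargs=None, mlp_kwargs=None,
                        img_size=224):
    """Stand-in for build_model(): records which configuration would be used."""
    backbone_kwargs = dict(backbone_kwargs or {})   # None means "no overrides"
    mlp_kwargs = dict(mlp_kwargs or {})
    return {
        # A pre-built backbone takes precedence; otherwise build from kwargs.
        "backbone": backbone if backbone is not None else ("built", backbone_kwargs),
        "mlp_kwargs": mlp_kwargs,
        "img_size": img_size,
    }


def baseline_standin(mlp_kwargs=None, img_size=224, **backbone_kwargs):
    """Stand-in for alexnet2023_baseline(): extra keywords go to the backbone."""
    return build_model_standin(
        backbone_kwargs=backbone_kwargs,
        mlp_kwargs=mlp_kwargs,
        img_size=img_size,
    )


# `w` and `polar` are documented get_backbone() parameters; "hidden" is made up.
cfg = baseline_standin(mlp_kwargs={"hidden": 512}, w=0.5, polar=True)
print(cfg)
```

The practical consequence is that backbone options are passed flat (w=0.5, polar=True), while head options must be grouped in the mlp_kwargs dictionary.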