fovi.arch.resnet

fovi.arch.resnet.conv3x3_polar(in_planes: int, out_planes: int, stride: int = 1, groups: int = 1, dilation: int = 1) → Conv2d[source]

Create a 3x3 convolution with polar coordinate padding.

Parameters:

in_planes (int) – Number of input channels.
out_planes (int) – Number of output channels.
stride (int, optional) – Convolution stride. Defaults to 1.
groups (int, optional) – Number of groups for grouped convolution. Defaults to 1.
dilation (int, optional) – Dilation rate. Defaults to 1.

Returns:

A sequential module with PolarPadder and Conv2d.

Return type:

nn.Sequential
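
As a rough guide to how stride and dilation change the spatial size, the standard convolution size formula can be applied to the padded input. The sketch below is plain Python and does not call fovi. It assumes the polar padding equals the dilation (as torchvision's conv3x3 does) and that the Conv2d adds no padding of its own; neither assumption is confirmed by this page, and the helper names are hypothetical.

```python
def conv_out_size(n: int, kernel: int, pad: int, stride: int = 1, dilation: int = 1) -> int:
    """Standard convolution output length along one axis."""
    padded = n + 2 * pad                      # length after padding both sides
    effective_kernel = dilation * (kernel - 1) + 1
    return (padded - effective_kernel) // stride + 1


def conv3x3_out_size(n: int, stride: int = 1, dilation: int = 1) -> int:
    """Output length for a 3x3 kernel, ASSUMING pad == dilation."""
    return conv_out_size(n, kernel=3, pad=dilation, stride=stride, dilation=dilation)


same_size = conv3x3_out_size(56)             # stride 1 keeps the size
halved = conv3x3_out_size(56, stride=2)      # stride 2 halves it
```

Under these assumptions, stride 1 preserves the input size for any dilation, and stride 2 halves it.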

fovi.arch.resnet.conv_polar(in_planes: int, out_planes: int, kernel_size, pad: int, stride: int = 1, groups: int = 1, dilation: int = 1, **kwargs) → Conv2d[source]

Create a convolution with polar coordinate padding.

Parameters:

in_planes (int) – Number of input channels.
out_planes (int) – Number of output channels.
kernel_size (int or tuple) – Size of the convolution kernel.
pad (int) – Amount of polar padding to apply.
stride (int, optional) – Convolution stride. Defaults to 1.
groups (int, optional) – Number of groups for grouped convolution. Defaults to 1.
dilation (int, optional) – Dilation rate. Defaults to 1.
**kwargs – Additional keyword arguments passed to Conv2d.

Returns:

A sequential module with PolarPadder and Conv2d.

Return type:

nn.Sequential
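
To illustrate what padding with angular wraparound can look like, here is a minimal pure-Python sketch on a nested list. It is not the PolarPadder implementation: it assumes rows index radius and columns index angle, that the angular axis is padded circularly, and that the radial axis is zero-padded. The function name polar_pad is hypothetical.

```python
def polar_pad(grid, pad):
    """Pad a 2D grid (rows = radius, columns = angle).

    ASSUMPTIONS: the angular axis wraps around (circular padding) and the
    radial axis is zero-padded. Requires 0 <= pad <= number of columns.
    """
    if pad == 0:
        return [row[:] for row in grid]
    # Wrap each row: the last `pad` angles go in front, the first `pad` behind.
    wrapped = [row[-pad:] + row + row[:pad] for row in grid]
    width = len(wrapped[0])
    top = [[0] * width for _ in range(pad)]
    bottom = [[0] * width for _ in range(pad)]
    return top + wrapped + bottom


grid = [[1, 2, 3],
        [4, 5, 6]]
padded = polar_pad(grid, 1)
```

With the wrapped columns in place, a kernel centered on the first angular position sees the last angular position as its neighbor, which is the wraparound behavior the polar convolutions rely on.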

class fovi.arch.resnet.BasicBlockPolar(inplanes: int, planes: int, stride: int = 1, downsample: Module | None = None, groups: int = 1, base_width: int = 64, dilation: int = 1, norm_layer: Callable[[...], Module] | None = None)[source]

Bases: Module

Basic residual block for polar coordinate ResNet.

A basic block with two 3x3 polar convolutions and a residual connection. Uses polar padding to handle wraparound in the angular dimension.

expansion

Channel expansion factor (always 1 for BasicBlock).

Type: int

expansion: int = 1

__init__(inplanes: int, planes: int, stride: int = 1, downsample: Module | None = None, groups: int = 1, base_width: int = 64, dilation: int = 1, norm_layer: Callable[[...], Module] | None = None) → None[source]

Initialize the BasicBlockPolar.

Parameters:

inplanes (int) – Number of input channels.
planes (int) – Number of output channels.
stride (int, optional) – Stride for the first convolution. Defaults to 1.
downsample (nn.Module, optional) – Downsampling module for residual path. Defaults to None.
groups (int, optional) – Number of groups (must be 1). Defaults to 1.
base_width (int, optional) – Base width (must be 64). Defaults to 64.
dilation (int, optional) – Dilation rate (must be 1). Defaults to 1.
norm_layer (callable, optional) – Normalization layer factory. Defaults to BatchNorm2d.

forward(x: Tensor) → Tensor[source]

Run the block's forward pass: two 3x3 polar convolutions followed by the residual connection.

Note

Call the module instance (for example, block(x)) rather than forward() directly. Calling the instance runs any registered hooks, while a direct forward() call silently skips them.
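
The difference between calling the instance and calling forward() can be seen in a toy model. TinyModule below is a deliberately simplified stand-in written for this illustration; it is not nn.Module and omits most of what PyTorch's call machinery does.

```python
class TinyModule:
    """Simplified stand-in for nn.Module: __call__ wraps forward with hooks."""

    def __init__(self):
        self._forward_hooks = []

    def register_forward_hook(self, hook):
        self._forward_hooks.append(hook)

    def forward(self, x):
        return x + 1

    def __call__(self, x):
        out = self.forward(x)
        # Hooks run only on this path, never when forward() is called directly.
        for hook in self._forward_hooks:
            hook(self, x, out)
        return out


seen = []
block = TinyModule()
block.register_forward_hook(lambda module, inp, out: seen.append(out))

direct = block.forward(1)   # hook is skipped
called = block(1)           # hook runs once
```

Both calls return the same value, but only the second one is recorded by the hook.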

class fovi.arch.resnet.BottleneckPolar(inplanes: int, planes: int, stride: int = 1, downsample: Module | None = None, groups: int = 1, base_width: int = 64, dilation: int = 1, norm_layer: Callable[[...], Module] | None = None)[source]

Bases: Module

Bottleneck residual block for polar coordinate ResNet.

A bottleneck block with 1x1 -> 3x3 polar -> 1x1 convolutions and a residual connection. Uses polar padding for the 3x3 convolution to handle wraparound in the angular dimension.

Note

torchvision's Bottleneck places the downsampling stride at the 3x3 convolution (self.conv2), whereas the original implementation described in “Deep residual learning for image recognition” (https://arxiv.org/abs/1512.03385) places it at the first 1x1 convolution (self.conv1). This variant is also known as ResNet V1.5 and improves accuracy according to https://ngc.nvidia.com/catalog/model-scripts/nvidia:resnet_50_v1_5_for_pytorch.

expansion

Channel expansion factor (always 4 for Bottleneck).

Type: int

expansion: int = 4

__init__(inplanes: int, planes: int, stride: int = 1, downsample: Module | None = None, groups: int = 1, base_width: int = 64, dilation: int = 1, norm_layer: Callable[[...], Module] | None = None) → None[source]

Initialize the BottleneckPolar.

Parameters:

inplanes (int) – Number of input channels.
planes (int) – Number of output channels (before expansion).
stride (int, optional) – Stride for the 3x3 convolution. Defaults to 1.
downsample (nn.Module, optional) – Downsampling module for residual path. Defaults to None.
groups (int, optional) – Number of groups for grouped convolution. Defaults to 1.
base_width (int, optional) – Base width for computing intermediate channels. Defaults to 64.
dilation (int, optional) – Dilation rate for 3x3 convolution. Defaults to 1.
norm_layer (callable, optional) – Normalization layer factory. Defaults to BatchNorm2d.
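
The interaction of planes, groups, base_width, and expansion can be sketched as simple arithmetic. The formula for the intermediate width is the one used by torchvision's Bottleneck; that BottleneckPolar computes it the same way is an assumption, and bottleneck_channels is a hypothetical helper.

```python
def bottleneck_channels(planes, groups=1, base_width=64, expansion=4):
    """Channel counts after each convolution of a bottleneck block.

    ASSUMPTION: intermediate width follows torchvision's formula.
    """
    width = int(planes * (base_width / 64.0)) * groups
    return {
        "conv1_out": width,               # 1x1 reduce
        "conv2_out": width,               # 3x3 (polar) convolution
        "conv3_out": planes * expansion,  # 1x1 expand
    }


plain = bottleneck_channels(64)
grouped = bottleneck_channels(128, groups=32, base_width=4)
```

With the defaults, the block's output always has planes * 4 channels, which is why the next block's inplanes must account for the expansion factor.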

forward(x: Tensor) → Tensor[source]

Run the block's forward pass: 1x1 -> 3x3 polar -> 1x1 convolutions followed by the residual connection.

Note

Call the module instance (for example, block(x)) rather than forward() directly. Calling the instance runs any registered hooks, while a direct forward() call silently skips them.

class fovi.arch.resnet.ResNet(block: Type[BasicBlock | Bottleneck | BasicBlockPolar | BottleneckPolar], layers: List[int], num_classes: int = 1000, zero_init_residual: bool = False, groups: int = 1, width_per_group: int = 64, pre_block_pooling=True, main_block_stride=1, polar=False, no_fc=False, out_map_size=1, channel_mult=1, replace_stride_with_dilation: List[bool] | None = None, norm_layer: Callable[[...], Module] | None = None)[source]

Bases: ResNet

ResNet with polar coordinate support and configurable strides.

Extends the ResNet implementation with support for polar coordinates and with options to remove strides in the main blocks, which preserves higher feature-map resolution. Pooling before the first main block can optionally be removed as well.

Parameters:

block (Type) – Block class to use (BasicBlock, Bottleneck, or polar variants).
layers (List[int]) – Number of blocks in each of the 4 layers.
num_classes (int, optional) – Number of output classes. Defaults to 1000.
zero_init_residual (bool, optional) – Whether to zero-initialize the last BN in each residual branch. Defaults to False.
groups (int, optional) – Number of groups for grouped convolution. Defaults to 1.
width_per_group (int, optional) – Base width per group. Defaults to 64.
pre_block_pooling (bool, optional) – Whether to apply pooling before first block. Defaults to True.
main_block_stride (int, optional) – Stride for main blocks (normally 2). Defaults to 1.
polar (bool, optional) – Whether to use polar coordinate padding. Defaults to False.
no_fc (bool, optional) – Whether to exclude the final FC layer. Defaults to False.
out_map_size (int, optional) – Output spatial size after adaptive pooling. Defaults to 1.
channel_mult (float, optional) – Channel width multiplier. Defaults to 1.
replace_stride_with_dilation (List[bool], optional) – Whether to replace stride with dilation in each of the last 3 layers. Defaults to None.
norm_layer (callable, optional) – Normalization layer factory. Defaults to BatchNorm2d.
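
The effect of pre_block_pooling and main_block_stride on feature-map resolution can be estimated with the usual size arithmetic. The sketch below assumes a torchvision-style layout (7x7 stride-2 stem convolution, 3x3 stride-2 max pool, and main_block_stride applied to layers 2 through 4). The actual fovi layout is not documented here, so treat these numbers as illustrative; final_map_size is a hypothetical helper.

```python
def conv_out(n, kernel, pad, stride):
    """Output length of a convolution or pooling window along one axis."""
    return (n + 2 * pad - kernel) // stride + 1


def final_map_size(img_size=224, pre_block_pooling=True, main_block_stride=1):
    """Spatial size before adaptive pooling, under the assumptions above."""
    size = conv_out(img_size, kernel=7, pad=3, stride=2)   # assumed stem conv
    if pre_block_pooling:
        size = conv_out(size, kernel=3, pad=1, stride=2)   # assumed max pool
    for _ in range(3):                                     # assumed: layers 2-4
        size = conv_out(size, kernel=3, pad=1, stride=main_block_stride)
    return size


default_size = final_map_size()                       # main_block_stride=1
classic_size = final_map_size(main_block_stride=2)    # standard ResNet strides
```

Under these assumptions, a 224-pixel input ends at 7x7 with main_block_stride=2 and stays at 56x56 with the default of 1, which is the resolution gain the class description refers to.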

__init__(block: Type[BasicBlock | Bottleneck | BasicBlockPolar | BottleneckPolar], layers: List[int], num_classes: int = 1000, zero_init_residual: bool = False, groups: int = 1, width_per_group: int = 64, pre_block_pooling=True, main_block_stride=1, polar=False, no_fc=False, out_map_size=1, channel_mult=1, replace_stride_with_dilation: List[bool] | None = None, norm_layer: Callable[[...], Module] | None = None) → None[source]

Initialize the ResNet. The parameters are those listed for the class above.

fovi.arch.resnet.resnet18(pretrained: bool = False, progress: bool = True, polar=False, **kwargs: Any) → ResNet[source]

ResNet-18 model from “Deep Residual Learning for Image Recognition”.

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet.
progress (bool) – If True, displays a progress bar of the download to stderr.
polar (bool, optional) – Whether to use polar coordinate padding. Defaults to False.

fovi.arch.resnet.resnet34(pretrained: bool = False, progress: bool = True, polar=False, **kwargs: Any) → ResNet[source]

ResNet-34 model from “Deep Residual Learning for Image Recognition”.

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet.
progress (bool) – If True, displays a progress bar of the download to stderr.
polar (bool, optional) – Whether to use polar coordinate padding. Defaults to False.

fovi.arch.resnet.resnet50(pretrained: bool = False, progress: bool = True, polar=False, **kwargs: Any) → ResNet[source]

ResNet-50 model from “Deep Residual Learning for Image Recognition”.

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet.
progress (bool) – If True, displays a progress bar of the download to stderr.
polar (bool, optional) – Whether to use polar coordinate padding. Defaults to False.

fovi.arch.resnet.resnet101(pretrained: bool = False, progress: bool = True, polar=False, **kwargs: Any) → ResNet[source]

ResNet-101 model from “Deep Residual Learning for Image Recognition”.

Parameters:

pretrained (bool) – If True, returns a model pre-trained on ImageNet.
progress (bool) – If True, displays a progress bar of the download to stderr.
polar (bool, optional) – Whether to use polar coordinate padding. Defaults to False.
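
For orientation, the four factories correspond to the standard block types and per-layer block counts from the ResNet paper. The table below records those standard configurations in plain Python and checks that they add up to the advertised depths. That the fovi factories pass exactly these values to ResNet is an assumption, and the names in the sketch are hypothetical.

```python
# Standard ResNet configurations: (block type, blocks in each of the 4 layers).
STANDARD_CONFIGS = {
    "resnet18": ("basic", [2, 2, 2, 2]),
    "resnet34": ("basic", [3, 4, 6, 3]),
    "resnet50": ("bottleneck", [3, 4, 6, 3]),
    "resnet101": ("bottleneck", [3, 4, 23, 3]),
}

# A basic block has 2 convolutions on its main path, a bottleneck has 3.
CONVS_PER_BLOCK = {"basic": 2, "bottleneck": 3}


def weighted_layer_count(name):
    """Depth as counted in the paper: block convolutions + stem conv + FC."""
    block, layers = STANDARD_CONFIGS[name]
    return CONVS_PER_BLOCK[block] * sum(layers) + 2


depths = {name: weighted_layer_count(name) for name in STANDARD_CONFIGS}
```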

fovi.arch.resnet.get_repr_size(model, in_channels=3, img_size=224)[source]

fovi.arch.resnet.resnet_ssl(layers=50, mlp_kwargs='8192-8192-8192', weights=None, progress=True, polar=False, **kwargs: Any) → ResNet[source]

Wrapper for an SSL-style ResNet model, used alongside the Harvard Vision Lab model_rearing_workshop.
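
The default mlp_kwargs='8192-8192-8192' looks like the dash-separated layer-width strings used for projector heads in several self-supervised learning codebases. Assuming that reading (it is not confirmed by this page), such a string could be turned into layer dimensions as sketched below. Both helper names and the 2048-dimensional input are hypothetical.

```python
from typing import List, Tuple


def parse_mlp_spec(spec: str) -> List[int]:
    """Split a dash-separated width string into integers (ASSUMED format)."""
    return [int(width) for width in spec.split("-")]


def mlp_layer_dims(in_dim: int, spec: str) -> List[Tuple[int, int]]:
    """(in_features, out_features) for each linear layer of the MLP head."""
    widths = parse_mlp_spec(spec)
    return list(zip([in_dim] + widths[:-1], widths))


# 2048 is used here only as an example backbone output size.
dims = mlp_layer_dims(2048, "8192-8192-8192")
```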