fovi.utils.fastaugs.functional_tensor

Lightly edited from https://github.com/pytorch/vision/blob/main/torchvision/transforms/functional_tensor.py

Modified with a small tweak so that transforms over image batches can use different parameters for each image. Ultimately this enables me to perform these transforms on the GPU, which is faster.
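As an illustration of the per-image-parameter idea (a hedged sketch only, not this module's exact API), broadcasting lets each image in a batch take its own factor without leaving the GPU:

```python
import torch

# Illustrative sketch: apply a different brightness factor to each image
# in an (N, C, H, W) batch via broadcasting, so the whole batch can be
# processed in one GPU op instead of a per-image Python loop.
imgs = torch.rand(4, 3, 32, 32)               # batch of 4 RGB images
factors = torch.tensor([0.5, 0.8, 1.0, 1.2])  # one factor per image
out = (imgs * factors.view(-1, 1, 1, 1)).clamp(0.0, 1.0)
```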

class fovi.utils.fastaugs.functional_tensor.Tensor

Bases: TensorBase

_clear_non_serializable_cached_data()[source]

Clears any data cached in the tensor’s __dict__ that would prevent the tensor from being serialized.

For example, subclasses with custom dispatched sizes / strides cache this info in non-serializable PyCapsules within the __dict__, and this must be cleared out for serialization to function.

Any subclass that overrides this MUST call super()._clear_non_serializable_cached_data(). Additional data cleared within the override must be able to be re-cached transparently to avoid breaking subclass functionality.

align_to(*names)[source]

Permutes the dimensions of the self tensor to match the order specified in names, adding size-one dims for any new names.

All of the dims of self must be named in order to use this method. The resulting tensor is a view on the original tensor.

All dimension names of self must be present in names. names may contain additional names that are not in self.names; the output tensor has a size-one dimension for each of those new names.

names may contain up to one Ellipsis (...). The Ellipsis is expanded to be equal to all dimension names of self that are not mentioned in names, in the order that they appear in self.

Python 2 does not support Ellipsis but one may use a string literal instead ('...').

Parameters:

names (iterable of str) – The desired dimension ordering of the output tensor. May contain up to one Ellipsis that is expanded to all unmentioned dim names of self.

Examples:

>>> tensor = torch.randn(2, 2, 2, 2, 2, 2)
>>> named_tensor = tensor.refine_names('A', 'B', 'C', 'D', 'E', 'F')

# Move the F and E dims to the front while keeping the rest in order
>>> named_tensor.align_to('F', 'E', ...)

Warning

The named tensor API is experimental and subject to change.

backward(gradient=None, retain_graph=None, create_graph=False, inputs=None)[source]

Computes the gradient of current tensor wrt graph leaves.

The graph is differentiated using the chain rule. If the tensor is non-scalar (i.e. its data has more than one element) and requires gradient, the function additionally requires specifying a gradient. It should be a tensor of matching type and shape, that represents the gradient of the differentiated function w.r.t. self.

This function accumulates gradients in the leaves - you might need to zero .grad attributes or set them to None before calling it. See Default gradient layouts for details on the memory layout of accumulated gradients.

Note

If you run any forward ops, create gradient, and/or call backward in a user-specified CUDA stream context, see Stream semantics of backward passes.

Note

When inputs are provided and a given input is not a leaf, the current implementation will call its grad_fn (though it is not strictly needed to get these gradients). It is an implementation detail on which the user should not rely. See https://github.com/pytorch/pytorch/pull/60521#issuecomment-867061780 for more details.

Parameters:
  • gradient (Tensor, optional) – The gradient of the function being differentiated w.r.t. self. This argument can be omitted if self is a scalar. Defaults to None.

  • retain_graph (bool, optional) – If False, the graph used to compute the grads will be freed; If True, it will be retained. The default is None, in which case the value is inferred from create_graph (i.e., the graph is retained only when higher-order derivative tracking is requested). Note that in nearly all cases setting this option to True is not needed and often can be worked around in a much more efficient way.

  • create_graph (bool, optional) – If True, graph of the derivative will be constructed, allowing to compute higher order derivative products. Defaults to False.

  • inputs (Sequence[Tensor], optional) – Inputs w.r.t. which the gradient will be accumulated into .grad. All other tensors will be ignored. If not provided, the gradient is accumulated into all the leaf Tensors that were used to compute the tensors. Defaults to None.
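A minimal example of the default case (scalar loss, so gradient may be omitted):

```python
import torch

# backward() on a scalar accumulates d(loss)/dx into the leaf's .grad.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
loss = (x ** 2).sum()
loss.backward()
print(x.grad)  # 2 * x -> tensor([2., 4., 6.])
```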

detach()

Returns a new Tensor, detached from the current graph.

The result will never require gradient.

This method also affects forward mode AD gradients and the result will never have forward mode AD gradients.

Note

Returned Tensor shares the same storage with the original one. In-place modifications on either of them will be seen, and may trigger errors in correctness checks.
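The storage-sharing behavior in the note above can be seen directly:

```python
import torch

# detach() returns a gradient-free tensor that shares storage with the
# original, so in-place edits are visible on both.
a = torch.ones(3, requires_grad=True)
b = a.detach()   # b never requires grad
b[0] = 5.0       # a's data changes too (a[0] is now 5.0)
```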

detach_()

Detaches the Tensor from the graph that created it, making it a leaf. Views cannot be detached in-place.

This method also affects forward mode AD gradients and the result will never have forward mode AD gradients.

dim_order(ambiguity_check=False) tuple[source]

Returns the uniquely determined tuple of int describing the dim order or physical layout of self.

The dim order represents how dimensions are laid out in memory of dense tensors, starting from the outermost to the innermost dimension.

Note that the dim order may not always be uniquely determined. If ambiguity_check is True, this function raises a RuntimeError when the dim order cannot be uniquely determined. If ambiguity_check is a list of memory formats, this function raises a RuntimeError when the tensor cannot be interpreted as exactly one of the given memory formats, or when the dim order cannot be uniquely determined. If ambiguity_check is False, it returns one of the legal dim orders without checking uniqueness. Otherwise, it raises a TypeError.

Parameters:

ambiguity_check (bool or List[torch.memory_format]) – The check method for ambiguity of dim order.

Examples:

>>> torch.empty((2, 3, 5, 7)).dim_order()
(0, 1, 2, 3)
>>> torch.empty((2, 3, 5, 7)).transpose(1, 2).dim_order()
(0, 2, 1, 3)
>>> torch.empty((2, 3, 5, 7), memory_format=torch.channels_last).dim_order()
(0, 2, 3, 1)
>>> torch.empty((1, 2, 3, 4)).dim_order()
(0, 1, 2, 3)
>>> try:
...     torch.empty((1, 2, 3, 4)).dim_order(ambiguity_check=True)
... except RuntimeError as e:
...     print(e)
The tensor does not have unique dim order, or cannot map to exact one of the given memory formats.
>>> torch.empty((1, 2, 3, 4)).dim_order(
...     ambiguity_check=[torch.contiguous_format, torch.channels_last]
... )  # It can be mapped to contiguous format
(0, 1, 2, 3)
>>> try:
...     torch.empty((1, 2, 3, 4)).dim_order(ambiguity_check="ILLEGAL") # type: ignore[arg-type]
... except TypeError as e:
...     print(e)
The ambiguity_check argument must be a bool or a list of memory formats.

Warning

The dim_order tensor API is experimental and subject to change.

eig(eigenvectors=False)[source]
index(positions, dims)[source]

Index a regular tensor by binding specified positions to dims.

This converts a regular tensor to a first-class tensor by binding the specified positional dimensions to Dim objects.

Parameters:
  • positions – Tuple of dimension positions to bind

  • dims – Dim objects or tuple of Dim objects to bind to

Returns:

First-class tensor with specified dimensions bound

is_shared()[source]

Checks if tensor is in shared memory.

This is always True for CUDA tensors.

istft(n_fft: int, hop_length: int | None = None, win_length: int | None = None, window: Tensor | None = None, center: bool = True, normalized: bool = False, onesided: bool | None = None, length: int | None = None, return_complex: bool = False)[source]

See torch.istft()

lstsq(other)[source]
lu(pivot=True, get_infos=False)[source]

See torch.lu()

module_load(other, assign=False)[source]

Defines how to transform other when loading it into self in load_state_dict().

Used when get_swap_module_params_on_conversion() is True.

It is expected that self is a parameter or buffer in an nn.Module and that other is the value in the state dictionary with the corresponding key; this method defines how other is remapped before being swapped with self via swap_tensors() in load_state_dict().

Note

This method should always return a new object that is not self or other. For example, the default implementation returns self.copy_(other).detach() if assign is False or other.detach() if assign is True.

Parameters:
  • other (Tensor) – value in state dict with key corresponding to self

  • assign (bool) – the assign argument passed to nn.Module.load_state_dict()

norm(p: float | str | None = 'fro', dim=None, keepdim=False, dtype=None)[source]

See torch.norm()

refine_names(*names)[source]

Refines the dimension names of self according to names.

Refining is a special case of renaming that “lifts” unnamed dimensions. A None dim can be refined to have any name; a named dim can only be refined to have the same name.

Because named tensors can coexist with unnamed tensors, refining names gives a nice way to write named-tensor-aware code that works with both named and unnamed tensors.

names may contain up to one Ellipsis (...). The Ellipsis is expanded greedily; it is expanded in-place to fill names to the same length as self.dim() using names from the corresponding indices of self.names.

Python 2 does not support Ellipsis but one may use a string literal instead ('...').

Parameters:

names (iterable of str) – The desired names of the output tensor. May contain up to one Ellipsis.

Examples:

>>> imgs = torch.randn(32, 3, 128, 128)
>>> named_imgs = imgs.refine_names('N', 'C', 'H', 'W')
>>> named_imgs.names
('N', 'C', 'H', 'W')

>>> tensor = torch.randn(2, 3, 5, 7, 11)
>>> tensor = tensor.refine_names('A', ..., 'B', 'C')
>>> tensor.names
('A', None, None, 'B', 'C')

Warning

The named tensor API is experimental and subject to change.

register_hook(hook)[source]

Registers a backward hook.

The hook will be called every time a gradient with respect to the Tensor is computed. The hook should have the following signature:

hook(grad) -> Tensor or None

The hook should not modify its argument, but it can optionally return a new gradient which will be used in place of grad.

This function returns a handle with a method handle.remove() that removes the hook from the module.

Note

See Backward Hooks execution for more information on how and when this hook is executed, and how its execution is ordered relative to other hooks.

Example:

>>> v = torch.tensor([0., 0., 0.], requires_grad=True)
>>> h = v.register_hook(lambda grad: grad * 2)  # double the gradient
>>> v.backward(torch.tensor([1., 2., 3.]))
>>> v.grad
tensor([2., 4., 6.])

>>> h.remove()  # removes the hook
register_post_accumulate_grad_hook(hook)[source]

Registers a backward hook that runs after grad accumulation.

The hook will be called after all gradients for a tensor have been accumulated, meaning that the .grad field has been updated on that tensor. The post accumulate grad hook is ONLY applicable for leaf tensors (tensors without a .grad_fn field). Registering this hook on a non-leaf tensor will error!

The hook should have the following signature:

hook(param: Tensor) -> None

Note that, unlike other autograd hooks, this hook operates on the tensor that requires grad and not the grad itself. The hook can in-place modify and access its Tensor argument, including its .grad field.

This function returns a handle with a method handle.remove() that removes the hook from the module.

Note

See Backward Hooks execution for more information on how and when this hook is executed, and how its execution is ordered relative to other hooks. Since this hook runs during the backward pass, it will run in no_grad mode (unless create_graph is True). You can use torch.enable_grad() to re-enable autograd within the hook if you need it.

Example:

>>> v = torch.tensor([0., 0., 0.], requires_grad=True)
>>> lr = 0.01
>>> # simulate a simple SGD update
>>> h = v.register_post_accumulate_grad_hook(lambda p: p.add_(p.grad, alpha=-lr))
>>> v.backward(torch.tensor([1., 2., 3.]))
>>> v
tensor([-0.0100, -0.0200, -0.0300], requires_grad=True)

>>> h.remove()  # removes the hook
reinforce(reward)[source]
rename(*names, **rename_map)[source]

Renames dimension names of self.

There are two main usages:

self.rename(**rename_map) returns a view on tensor that has dims renamed as specified in the mapping rename_map.

self.rename(*names) returns a view on tensor, renaming all dimensions positionally using names. Use self.rename(None) to drop names on a tensor.

One cannot specify both positional args names and keyword args rename_map.

Examples:

>>> imgs = torch.rand(2, 3, 5, 7, names=('N', 'C', 'H', 'W'))
>>> renamed_imgs = imgs.rename(N='batch', C='channels')
>>> renamed_imgs.names
('batch', 'channels', 'H', 'W')

>>> renamed_imgs = imgs.rename(None)
>>> renamed_imgs.names
(None, None, None, None)

>>> renamed_imgs = imgs.rename('batch', 'channel', 'height', 'width')
>>> renamed_imgs.names
('batch', 'channel', 'height', 'width')

Warning

The named tensor API is experimental and subject to change.

rename_(*names, **rename_map)[source]

In-place version of rename().

resize(*sizes)[source]
resize_as(tensor)[source]
share_memory_()[source]

Moves the underlying storage to shared memory.

This is a no-op if the underlying storage is already in shared memory and for CUDA tensors. Tensors in shared memory cannot be resized.

See torch.UntypedStorage.share_memory_() for more details.
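A tiny example of the call and its observable effect:

```python
import torch

# share_memory_() moves the underlying storage to shared memory (a no-op
# if it is already shared); typically used before handing tensors to
# torch.multiprocessing workers.
t = torch.zeros(3)
t.share_memory_()
print(t.is_shared())  # True
```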

solve(other)[source]
split(split_size, dim=0)[source]

See torch.split()

stft(n_fft: int, hop_length: int | None = None, win_length: int | None = None, window: Tensor | None = None, center: bool = True, pad_mode: str = 'reflect', normalized: bool = False, onesided: bool | None = None, return_complex: bool | None = None, align_to_window: bool | None = None)[source]

See torch.stft()

Warning

This function changed signature at version 0.4.1. Calling with the previous signature may cause an error or return an incorrect result.

storage() torch.TypedStorage[source]

Returns the underlying TypedStorage.

Warning

TypedStorage is deprecated. It will be removed in the future, and UntypedStorage will be the only storage class. To access the UntypedStorage directly, use Tensor.untyped_storage().

storage_type() type[source]

Returns the type of the underlying storage.

symeig(eigenvectors=False)[source]
to_sparse_coo()[source]

Convert a tensor to coordinate format.

Examples:

>>> dense = torch.randn(5, 5)
>>> sparse = dense.to_sparse_coo()
>>> sparse._nnz()
25
unflatten(dim, sizes) Tensor[source]

See torch.unflatten().

unique(sorted=True, return_inverse=False, return_counts=False, dim=None)[source]

Returns the unique elements of the input tensor.

See torch.unique()
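A short example of the default (sorted) behavior, with counts:

```python
import torch

# unique() returns sorted unique values by default; return_counts gives
# the multiplicity of each value in the input.
x = torch.tensor([1, 3, 2, 3, 1, 1])
vals, counts = x.unique(return_counts=True)
print(vals)    # tensor([1, 2, 3])
print(counts)  # tensor([3, 1, 2])
```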

unique_consecutive(return_inverse=False, return_counts=False, dim=None)[source]

Eliminates all but the first element from every consecutive group of equivalent elements.

See torch.unique_consecutive()

fovi.utils.fastaugs.functional_tensor.grid_sample(input: Tensor, grid: Tensor, mode: str = 'bilinear', padding_mode: str = 'zeros', align_corners: bool | None = None) Tensor[source]

Compute grid sample.

Given an input and a flow-field grid, computes the output using input values and pixel locations from grid.

Currently, only spatial (4-D) and volumetric (5-D) input are supported.

In the spatial (4-D) case, for input with shape \((N, C, H_\text{in}, W_\text{in})\) and grid with shape \((N, H_\text{out}, W_\text{out}, 2)\), the output will have shape \((N, C, H_\text{out}, W_\text{out})\).

For each output location output[n, :, h, w], the size-2 vector grid[n, h, w] specifies input pixel locations x and y, which are used to interpolate the output value output[n, :, h, w]. In the case of 5D inputs, grid[n, d, h, w] specifies the x, y, z pixel locations for interpolating output[n, :, d, h, w]. The mode argument specifies the interpolation method (nearest or bilinear) used to sample the input pixels.

grid specifies the sampling pixel locations normalized by the input spatial dimensions. Therefore, it should have most values in the range of [-1, 1]. For example, values x = -1, y = -1 is the left-top pixel of input, and values x = 1, y = 1 is the right-bottom pixel of input.

If grid has values outside the range of [-1, 1], the corresponding outputs are handled as defined by padding_mode. Options are

  • padding_mode="zeros": use 0 for out-of-bound grid locations,

  • padding_mode="border": use border values for out-of-bound grid locations,

  • padding_mode="reflection": use values at locations reflected by the border for out-of-bound grid locations. For location far away from the border, it will keep being reflected until becoming in bound, e.g., (normalized) pixel location x = -3.5 reflects by border -1 and becomes x' = 1.5, then reflects by border 1 and becomes x'' = -0.5.

Note

This function is often used in conjunction with affine_grid() to build Spatial Transformer Networks .

Note

When using the CUDA backend, this operation may induce nondeterministic behaviour in its backward pass that is not easily switched off. Please see the notes on randomness for background.

Note

NaN values in grid would be interpreted as -1.

Parameters:
  • input (Tensor) – input of shape \((N, C, H_\text{in}, W_\text{in})\) (4-D case) or \((N, C, D_\text{in}, H_\text{in}, W_\text{in})\) (5-D case)

  • grid (Tensor) – flow-field of shape \((N, H_\text{out}, W_\text{out}, 2)\) (4-D case) or \((N, D_\text{out}, H_\text{out}, W_\text{out}, 3)\) (5-D case)

  • mode (str) – interpolation mode to calculate output values 'bilinear' | 'nearest' | 'bicubic'. Default: 'bilinear' Note: mode='bicubic' supports only 4-D input. When mode='bilinear' and the input is 5-D, the interpolation mode used internally will actually be trilinear. However, when the input is 4-D, the interpolation mode will legitimately be bilinear.

  • padding_mode (str) – padding mode for outside grid values 'zeros' | 'border' | 'reflection'. Default: 'zeros'

  • align_corners (bool, optional) – Geometrically, we consider the pixels of the input as squares rather than points. If set to True, the extrema (-1 and 1) are considered as referring to the center points of the input’s corner pixels. If set to False, they are instead considered as referring to the corner points of the input’s corner pixels, making the sampling more resolution agnostic. This option parallels the align_corners option in interpolate(), and so whichever option is used here should also be used there to resize the input image before grid sampling. Default: False

Returns:

output Tensor

Return type:

output (Tensor)

Warning

When align_corners = True, the grid positions depend on the pixel size relative to the input image size, and so the locations sampled by grid_sample() will differ for the same input given at different resolutions (that is, after being upsampled or downsampled). The default behavior up to version 1.2.0 was align_corners = True. Since then, the default behavior has been changed to align_corners = False, in order to bring it in line with the default for interpolate().

Note

mode='bicubic' is implemented using the cubic convolution algorithm with \(\alpha=-0.75\). The constant \(\alpha\) may differ from package to package. For example, PIL and OpenCV use -0.5 and -0.75 respectively. This algorithm may “overshoot” the range of values it’s interpolating. For example, it may produce negative values or values greater than 255 when interpolating input in [0, 255]. Clamp the results with torch.clamp() to ensure they are within the valid range.
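A quick sanity check using the standard torch.nn.functional implementation (which this module's grid_sample mirrors): an identity affine grid reproduces the input, provided align_corners matches between affine_grid() and grid_sample().

```python
import torch
import torch.nn.functional as F

# Identity affine -> identity grid -> grid_sample reproduces the input.
img = torch.arange(16.0).view(1, 1, 4, 4)
theta = torch.tensor([[[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]]])     # identity affine, shape (N, 2, 3)
grid = F.affine_grid(theta, list(img.shape), align_corners=False)
out = F.grid_sample(img, grid, mode='bilinear', align_corners=False)
print(torch.allclose(out, img, atol=1e-5))    # True
```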

fovi.utils.fastaugs.functional_tensor.conv2d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1) Tensor

Applies a 2D convolution over an input image composed of several input planes.

This operator supports TensorFloat32.

See Conv2d for details and output shape.

Note

In some circumstances when given tensors on a CUDA device and using CuDNN, this operator may select a nondeterministic algorithm to increase performance. If this is undesirable, you can try to make the operation deterministic (potentially at a performance cost) by setting torch.backends.cudnn.deterministic = True. See the notes on randomness for more information.

Note

This operator supports complex data types i.e. complex32, complex64, complex128.

Parameters:
  • input – input tensor of shape \((\text{minibatch} , \text{in\_channels} , iH , iW)\)

  • weight – filters of shape \((\text{out\_channels} , \frac{\text{in\_channels}}{\text{groups}} , kH , kW)\)

  • bias – optional bias tensor of shape \((\text{out\_channels})\). Default: None

  • stride – the stride of the convolving kernel. Can be a single number or a tuple (sH, sW). Default: 1

  • padding

    implicit paddings on both sides of the input. Can be a string {‘valid’, ‘same’}, single number or a tuple (padH, padW). Default: 0 padding='valid' is the same as no padding. padding='same' pads the input so the output has the same shape as the input. However, this mode doesn’t support any stride values other than 1.

    Warning

    For padding='same', if the weight is even-length and dilation is odd in any dimension, a full pad() operation may be needed internally, lowering performance.

  • dilation – the spacing between kernel elements. Can be a single number or a tuple (dH, dW). Default: 1

  • groups – split input into groups, both \(\text{in\_channels}\) and \(\text{out\_channels}\) should be divisible by the number of groups. Default: 1

Examples:

>>> # With square kernels and equal stride
>>> filters = torch.randn(8, 4, 3, 3)
>>> inputs = torch.randn(1, 4, 5, 5)
>>> F.conv2d(inputs, filters, padding=1)
fovi.utils.fastaugs.functional_tensor.interpolate(input: Tensor, size: int | None = None, scale_factor: list[float] | None = None, mode: str = 'nearest', align_corners: bool | None = None, recompute_scale_factor: bool | None = None, antialias: bool = False) Tensor[source]

Down/up samples the input.

Tensor interpolated to either the given size or the given scale_factor

The algorithm used for interpolation is determined by mode.

Currently temporal, spatial and volumetric sampling are supported, i.e. expected inputs are 3-D, 4-D or 5-D in shape.

The input dimensions are interpreted in the form: mini-batch x channels x [optional depth] x [optional height] x width.

The modes available for resizing are: nearest, linear (3D-only), bilinear, bicubic (4D-only), trilinear (5D-only), area, nearest-exact

Parameters:
  • input (Tensor) – the input tensor

  • size (int or Tuple[int] or Tuple[int, int] or Tuple[int, int, int]) – output spatial size.

  • scale_factor (float or Tuple[float]) – multiplier for spatial size. If scale_factor is a tuple, its length has to match the number of spatial dimensions; input.dim() - 2.

  • mode (str) – algorithm used for upsampling: 'nearest' | 'linear' | 'bilinear' | 'bicubic' | 'trilinear' | 'area' | 'nearest-exact'. Default: 'nearest'

  • align_corners (bool, optional) – Geometrically, we consider the pixels of the input and output as squares rather than points. If set to True, the input and output tensors are aligned by the center points of their corner pixels, preserving the values at the corner pixels. If set to False, the input and output tensors are aligned by the corner points of their corner pixels, and the interpolation uses edge value padding for out-of-boundary values, making this operation independent of input size when scale_factor is kept the same. This only has an effect when mode is 'linear', 'bilinear', 'bicubic' or 'trilinear'. Default: False

  • recompute_scale_factor (bool, optional) – recompute the scale_factor for use in the interpolation calculation. If recompute_scale_factor is True, then scale_factor must be passed in and scale_factor is used to compute the output size. The computed output size will be used to infer new scales for the interpolation. Note that when scale_factor is floating-point, it may differ from the recomputed scale_factor due to rounding and precision issues. If recompute_scale_factor is False, then size or scale_factor will be used directly for interpolation. Default: None.

  • antialias (bool, optional) – flag to apply anti-aliasing. Default: False. Using anti-alias option together with align_corners=False, interpolation result would match Pillow result for downsampling operation. Supported modes: 'bilinear', 'bicubic'.

Note

With mode='bicubic', interpolation can overshoot the input range. For some dtypes, it can produce negative values or values greater than 255 for images. Explicitly call result.clamp(min=0, max=255) if you want to reduce the overshoot when displaying the image. For uint8 inputs, a saturating cast is already performed, so no manual clamp is needed.

Note

Mode mode='nearest-exact' matches Scikit-Image and PIL nearest-neighbour interpolation algorithms and fixes known issues with mode='nearest'. Mode mode='nearest' is kept for backward compatibility; it matches OpenCV’s buggy INTER_NEAREST interpolation algorithm.

Note

The gradients for the dtype float16 on CUDA may be inaccurate in the upsample operation when using modes ['linear', 'bilinear', 'bicubic', 'trilinear', 'area']. For more details, please refer to the discussion in issue#104157.

Note

This operation may produce nondeterministic gradients when given tensors on a CUDA device. See the notes on randomness for more information.
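A short example of the most common case, again using the standard torch.nn.functional version that this module's interpolate follows:

```python
import torch
import torch.nn.functional as F

# Upsample a 2x2 image to 4x4 with nearest-neighbour interpolation:
# each input pixel is replicated into a 2x2 block.
x = torch.tensor([[[[1.0, 2.0],
                    [3.0, 4.0]]]])           # (N, C, H, W) = (1, 1, 2, 2)
up = F.interpolate(x, scale_factor=2, mode='nearest')
print(up.shape)      # torch.Size([1, 1, 4, 4])
print(up[0, 0, 0])   # tensor([1., 1., 2., 2.])
```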

fovi.utils.fastaugs.functional_tensor.torch_pad(input: Tensor, pad: list[int], mode: str = 'constant', value: float | None = None) Tensor

pad(input, pad, mode="constant", value=None) -> Tensor

Pads tensor.

Padding size:

The padding size by which to pad some dimensions of input is described starting from the last dimension and moving forward. \(\left\lfloor\frac{\text{len(pad)}}{2}\right\rfloor\) dimensions of input will be padded. For example, to pad only the last dimension of the input tensor, then pad has the form \((\text{padding\_left}, \text{padding\_right})\); to pad the last 2 dimensions of the input tensor, then use \((\text{padding\_left}, \text{padding\_right},\) \(\text{padding\_top}, \text{padding\_bottom})\); to pad the last 3 dimensions, use \((\text{padding\_left}, \text{padding\_right},\) \(\text{padding\_top}, \text{padding\_bottom},\) \(\text{padding\_front}, \text{padding\_back})\).

Padding mode:

See torch.nn.CircularPad2d, torch.nn.ConstantPad2d, torch.nn.ReflectionPad2d, and torch.nn.ReplicationPad2d for concrete examples on how each of the padding modes works. Constant padding is implemented for arbitrary dimensions. Circular, replicate and reflection padding are implemented for padding the last 3 dimensions of a 4D or 5D input tensor, the last 2 dimensions of a 3D or 4D input tensor, or the last dimension of a 2D or 3D input tensor.

Note

When using the CUDA backend, this operation may induce nondeterministic behaviour in its backward pass that is not easily switched off. Please see the notes on randomness for background.

Parameters:
  • input (Tensor) – N-dimensional tensor

  • pad (tuple) – m-elements tuple, where \(\frac{m}{2} \leq\) input dimensions and \(m\) is even.

  • mode – 'constant', 'reflect', 'replicate' or 'circular'. Default: 'constant'

  • value – fill value for 'constant' padding. Default: 0

Examples:

>>> t4d = torch.empty(3, 3, 4, 2)
>>> p1d = (1, 1) # pad last dim by 1 on each side
>>> out = F.pad(t4d, p1d, "constant", 0)  # effectively zero padding
>>> print(out.size())
torch.Size([3, 3, 4, 4])
>>> p2d = (1, 1, 2, 2) # pad last dim by (1, 1) and 2nd to last by (2, 2)
>>> out = F.pad(t4d, p2d, "constant", 0)
>>> print(out.size())
torch.Size([3, 3, 8, 4])
>>> t4d = torch.empty(3, 3, 4, 2)
>>> p3d = (0, 1, 2, 1, 3, 3) # pad by (0, 1), (2, 1), and (3, 3)
>>> out = F.pad(t4d, p3d, "constant", 0)
>>> print(out.size())
torch.Size([3, 9, 7, 3])
fovi.utils.fastaugs.functional_tensor.get_image_size(img: Tensor) List[int][source]
fovi.utils.fastaugs.functional_tensor.get_image_num_channels(img: Tensor) int[source]
fovi.utils.fastaugs.functional_tensor.convert_image_dtype(image: Tensor, dtype: dtype = torch.float) Tensor[source]
fovi.utils.fastaugs.functional_tensor.vflip(img: Tensor) Tensor[source]
fovi.utils.fastaugs.functional_tensor.hflip(img: Tensor) Tensor[source]
fovi.utils.fastaugs.functional_tensor.crop(img: Tensor, top: int, left: int, height: int, width: int) Tensor[source]
fovi.utils.fastaugs.functional_tensor.rgb_to_grayscale(img: Tensor, num_output_channels: int = 1) Tensor[source]
fovi.utils.fastaugs.functional_tensor.adjust_brightness(img: Tensor, brightness_factor: float) Tensor[source]
fovi.utils.fastaugs.functional_tensor.adjust_contrast(img: Tensor, contrast_factor: float) Tensor[source]
fovi.utils.fastaugs.functional_tensor.adjust_hue(img: Tensor, hue_factor: float) Tensor[source]
fovi.utils.fastaugs.functional_tensor.adjust_hue_fast(img: Tensor, hue_factor: float) Tensor[source]
fovi.utils.fastaugs.functional_tensor.adjust_saturation(img: Tensor, saturation_factor: float) Tensor[source]
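The flip transforms listed above reduce to plain tensor ops; a sketch of what vflip/hflip amount to on an (N, C, H, W) batch (an illustration, not this module's exact code):

```python
import torch

# Vertical/horizontal flips over an (N, C, H, W) batch are flips along
# the spatial dimensions, so they apply to every image at once.
imgs = torch.arange(8.0).view(1, 1, 2, 4)
vflipped = imgs.flip(-2)   # flip rows (vertical flip)
hflipped = imgs.flip(-1)   # flip columns (horizontal flip)
```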
fovi.utils.fastaugs.functional_tensor.mat3(value)[source]

Identity matrix with the given value on the diagonal.

fovi.utils.fastaugs.functional_tensor._get_sbc_mat(s, b, c)[source]

Adjusting saturation, brightness, and contrast are linear ops, so they can be combined into a single matrix.
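The idea can be sketched as follows (an illustration only, not this module's implementation; the luma weights and blending formulas are assumptions):

```python
import torch

# Sketch: saturation blends each pixel with its grayscale value and
# brightness is a uniform scale -- both are linear maps on RGB, so they
# compose into a single 3x3 matrix applied once per pixel. (Contrast
# blends with a constant gray level, adding an offset term, which a
# homogeneous 4x4 form like mat3-with-offset can absorb.)
GRAY = torch.tensor([0.299, 0.587, 0.114])     # assumed luma weights

def saturation_mat(s: float) -> torch.Tensor:
    # s=1 -> identity, s=0 -> full grayscale projection (rows all GRAY)
    return s * torch.eye(3) + (1.0 - s) * GRAY.expand(3, 3)

def brightness_mat(b: float) -> torch.Tensor:
    return b * torch.eye(3)

M = brightness_mat(1.2) @ saturation_mat(0.5)  # one combined linear op
img = torch.rand(3, 8, 8)
out = torch.einsum('ij,jhw->ihw', M, img)      # apply matrix per pixel
```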

fovi.utils.fastaugs.functional_tensor.color_jitter(img: Tensor, hue_factor: float, saturation_factor: float, brightness_factor: float, contrast_factor: float) Tensor[source]
fovi.utils.fastaugs.functional_tensor.random_color_jitter(b: Tensor, idx, hue_factor: float, saturation_factor: float, brightness_factor: float, contrast_factor: float)[source]

Run color jitter on the subset of images specified by idx.

fovi.utils.fastaugs.functional_tensor.adjust_gamma(img: Tensor, gamma: float, gain: float = 1) Tensor[source]
fovi.utils.fastaugs.functional_tensor.center_crop(img: Tensor, output_size: None) Tensor[source]

DEPRECATED

fovi.utils.fastaugs.functional_tensor.five_crop(img: Tensor, size: None) List[Tensor][source]

DEPRECATED

fovi.utils.fastaugs.functional_tensor.ten_crop(img: Tensor, size: None, vertical_flip: bool = False) List[Tensor][source]

DEPRECATED

fovi.utils.fastaugs.functional_tensor.pad(img: Tensor, padding: List[int], fill: int = 0, padding_mode: str = 'constant') Tensor[source]
fovi.utils.fastaugs.functional_tensor.resize(img: Tensor, size: List[int], interpolation: str = 'bilinear', max_size: int | None = None, antialias: bool | None = None) Tensor[source]
fovi.utils.fastaugs.functional_tensor.affine(img: Tensor, matrix: List[float], interpolation: str = 'nearest', fill: List[float] | None = None) Tensor[source]
fovi.utils.fastaugs.functional_tensor.rotate(img: Tensor, matrix: List[float], interpolation: str = 'nearest', expand: bool = False, fill: List[float] | None = None) Tensor[source]
fovi.utils.fastaugs.functional_tensor.perspective(img: Tensor, perspective_coeffs: List[float], interpolation: str = 'bilinear', fill: List[float] | None = None) Tensor[source]
fovi.utils.fastaugs.functional_tensor.gaussian_blur(img: Tensor, kernel_size: List[int], sigma: List[float]) Tensor[source]
fovi.utils.fastaugs.functional_tensor.invert(img: Tensor) Tensor[source]
fovi.utils.fastaugs.functional_tensor.posterize(img: Tensor, bits: int) Tensor[source]
fovi.utils.fastaugs.functional_tensor.solarize(img: Tensor, threshold: float) Tensor[source]
fovi.utils.fastaugs.functional_tensor.adjust_sharpness(img: Tensor, sharpness_factor: float) Tensor[source]
fovi.utils.fastaugs.functional_tensor.autocontrast(img: Tensor) Tensor[source]
fovi.utils.fastaugs.functional_tensor.equalize(img: Tensor) Tensor[source]