fovi.trainer
- class fovi.trainer.Trainer(gpu, cfg: DictConfig, load_checkpoint=True)[source]
Bases: object
- __init__(gpu, cfg: DictConfig, load_checkpoint=True)[source]
Initialize the trainer from a Hydra configuration.
- Parameters:
gpu – Which GPU to run on (or None to use the CPU)
cfg – Hydra configuration object
load_checkpoint – Whether to load checkpoint
- create_optimizer()[source]
Create and configure optimizers for model and probes.
Sets up separate optimizers for the main model and linear probes, with appropriate weight decay settings and learning rate scaling.
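The separation into decayed and non-decayed parameter groups and the learning rate scaling can be sketched in plain Python. The exclusion rule shown (no weight decay for biases and normalization parameters) and the linear batch-size scaling are common conventions assumed here for illustration, not read from this Trainer's code.

```python
# Sketch of optimizer setup with separate weight-decay groups and LR scaling.
# The name-based exclusion rule and the reference batch size of 256 are
# assumptions; the real Trainer takes these from its Hydra config.

def split_param_groups(named_params, weight_decay=0.05):
    """Partition parameters by name into decay / no-decay groups."""
    decay, no_decay = [], []
    for name, _param in named_params:
        if name.endswith(".bias") or "norm" in name:
            no_decay.append(name)
        else:
            decay.append(name)
    return [
        {"params": decay, "weight_decay": weight_decay},
        {"params": no_decay, "weight_decay": 0.0},
    ]

def scaled_lr(base_lr, batch_size, reference_batch=256):
    """Linear learning-rate scaling with batch size."""
    return base_lr * batch_size / reference_batch

groups = split_param_groups([
    ("backbone.conv1.weight", None),
    ("backbone.conv1.bias", None),
    ("backbone.norm1.weight", None),
])
print([g["weight_decay"] for g in groups])  # [0.05, 0.0]
print(scaled_lr(1e-3, 512))                 # 0.002
```

In practice each group would feed into a torch optimizer (one for the main model, one for the probes); the grouping logic is the part worth seeing in isolation.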
- create_train_loader(train_dataset, subset=None, batches_ahead=3, phase='train')[source]
Create training data loader with appropriate transforms and augmentation.
- Parameters:
train_dataset – Dataset to draw training samples from
subset (int, optional) – Number of samples to restrict the dataset to. Defaults to None.
batches_ahead (int, optional) – Number of batches to prefetch. Defaults to 3.
phase (str, optional) – Loader phase. Defaults to 'train'.
- Returns:
Configured data loader for training
- Return type:
FlashLoader
- create_val_loader(val_dataset, subset=None, ratio=1.)[source]
Create validation data loader with center crop transforms.
- create_standard_loader(dataset, batch_size, num_workers, resolution)[source]
Create standard data loader with basic transforms.
- Parameters:
dataset – Dataset to load from
batch_size (int) – Batch size
num_workers (int) – Number of data-loading worker processes
resolution (int) – Target image resolution
- Returns:
Standard PyTorch data loader with basic image transforms (ToTensor, Resize, Normalize) applied
- Return type:
DataLoader
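The ToTensor → Resize → Normalize pipeline follows torchvision's transform-composition pattern, which can be illustrated without torch. The Compose class and the stand-in transforms below only mirror the semantics (apply each step in sequence; Normalize computes (x - mean) / std); the mean/std values are placeholders.

```python
# Minimal illustration of the transform-composition pattern behind
# ToTensor / Resize / Normalize. Real code would use torchvision.transforms
# on image tensors; values here are placeholder pixel intensities.

class Compose:
    def __init__(self, transforms):
        self.transforms = transforms
    def __call__(self, x):
        for t in self.transforms:   # apply each transform in sequence
            x = t(x)
        return x

def scale_to_unit(max_val=255.0):
    # ToTensor-like step: map 0..255 pixel values into 0..1
    return lambda xs: [x / max_val for x in xs]

def normalize(mean, std):
    # Normalize semantics: (x - mean) / std per value
    return lambda xs: [(x - mean) / std for x in xs]

pipeline = Compose([scale_to_unit(), normalize(mean=0.5, std=0.5)])
print(pipeline([0.0, 127.5, 255.0]))  # [-1.0, 0.0, 1.0]
```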
- create_model_and_scaler()[source]
Create and configure the neural network model and gradient scaler.
- Returns:
(model, scaler) where model is the configured neural network and scaler is the gradient scaler for mixed precision training
- Return type:
tuple
- train()[source]
Execute the main training loop.
Runs training for the specified number of epochs, performing validation at regular intervals and saving checkpoints. Handles learning rate scheduling and early stopping.
- Returns:
Training statistics for all epochs
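The loop structure described above (periodic validation, checkpointing on improvement, early stopping) can be sketched in plain Python. The names, the patience counter, and the precomputed validation losses are all hypothetical; the real Trainer reads intervals and stopping criteria from its Hydra config.

```python
# Skeleton of an epoch loop with periodic validation and early stopping.
# val_losses is a precomputed list (one entry per validation run) so the
# control flow can be shown without a model; this is a sketch, not fovi's code.

def run_training(num_epochs, val_interval, val_losses, patience=3):
    best, bad_runs, stats = float("inf"), 0, []
    val_idx = 0
    for epoch in range(num_epochs):
        stats.append({"epoch": epoch})           # one training epoch (elided)
        if (epoch + 1) % val_interval == 0:      # validate at regular intervals
            loss = val_losses[val_idx]
            val_idx += 1
            if loss < best:
                best, bad_runs = loss, 0         # improvement: checkpoint here
            else:
                bad_runs += 1
            if bad_runs >= patience:             # early stopping
                break
    return stats, best

stats, best = run_training(10, 2, [0.9, 0.8, 0.85, 0.85, 0.85], patience=2)
print(len(stats), best)  # 8 0.8
```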
- load_checkpoint(ckpt=None)[source]
Load model and optimizer state from checkpoint.
- Parameters:
ckpt (dict, optional) – Checkpoint dictionary. If None, loads from default checkpoint file in log folder.
- save_checkpoint(epoch)[source]
Save model and optimizer state to checkpoint file.
- Parameters:
epoch (int) – Current training epoch
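The checkpoint round trip can be sketched with plain dicts. The real methods serialize model and optimizer state_dicts with torch.save / torch.load; pickle and a temp file stand in here so the shape of the flow, including load_checkpoint's documented "use the given dict, else read the default file" behavior, is visible.

```python
import os, pickle, tempfile

# Simplified stand-in for the torch.save / torch.load checkpoint round trip.
# Keys and contents are illustrative, not the Trainer's exact checkpoint schema.

def save_checkpoint(path, epoch, model_state, optim_state):
    ckpt = {"epoch": epoch, "model": model_state, "optimizer": optim_state}
    with open(path, "wb") as f:
        pickle.dump(ckpt, f)

def load_checkpoint(path, ckpt=None):
    # Mirrors the documented behaviour: use the given dict if provided,
    # otherwise read the checkpoint file.
    if ckpt is None:
        with open(path, "rb") as f:
            ckpt = pickle.load(f)
    return ckpt["epoch"], ckpt["model"], ckpt["optimizer"]

path = os.path.join(tempfile.mkdtemp(), "ckpt.pkl")
save_checkpoint(path, 7, {"w": [1.0]}, {"lr": 0.01})
epoch, model, optim = load_checkpoint(path)
print(epoch, model["w"])  # 7 [1.0]
```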
- val_loop(return_preds=False, repeats=None)[source]
Execute validation loop.
Computes validation metrics for all n_fixations values in self.n_fixations_val. Runs a single forward pass with max(n_fixations_val) and slices outputs to evaluate at each fixation count.
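The single-pass-then-slice strategy can be sketched with plain lists: run the model once at max(n_fixations_val), then evaluate each smaller count on a prefix of the per-fixation outputs. The averaging used here is a placeholder aggregation, not necessarily what the model applies.

```python
# One forward pass at max(n_fixations_val); each smaller fixation count is
# evaluated on a prefix slice of the per-fixation outputs, with no extra
# forward passes. Averaging is a placeholder aggregation.

def evaluate_at_fixation_counts(per_fixation_outputs, n_fixations_val):
    assert len(per_fixation_outputs) == max(n_fixations_val)
    results = {}
    for n in n_fixations_val:
        prefix = per_fixation_outputs[:n]       # slice the single pass
        results[n] = sum(prefix) / len(prefix)  # placeholder aggregation
    return results

print(evaluate_at_fixation_counts([2, 4, 6, 8], [1, 2, 4]))
# {1: 2.0, 2: 3.0, 4: 5.0}
```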
- compute_activations(loader, layer_names=['projector'], fixation_size=None, area_range=None, training=False, n_fixations=None, max_batches=None, setting='supervised', do_postproc=False, **kwargs)[source]
Extract activations from specified layers for a given data loader.
Runs the model on data from the loader and captures intermediate activations from the specified layers using forward hooks.
- Parameters:
loader – Data loader to iterate over.
layer_names (list, optional) – List of layer names to capture activations from. Defaults to ['projector'].
fixation_size (int or tuple, optional) – Size of fixation patches. Defaults to None.
area_range (list, optional) – [min, max] range of crop areas. Defaults to None.
training (bool, optional) – Whether to use training mode. Defaults to False.
n_fixations (int, optional) – Number of fixations per image. Defaults to None.
max_batches (int, optional) – Maximum number of batches to process. Defaults to None.
setting (str, optional) – Forward pass setting ('supervised' or 'ssl'). Defaults to 'supervised'.
do_postproc (bool, optional) – Whether to apply post-processing. Defaults to False.
**kwargs – Additional arguments passed to get_activations.
- Returns:
- (outputs, activations, targets) where:
outputs (np.ndarray): Model outputs of shape (N, …).
activations (dict): Dict mapping layer names to activation arrays.
targets (np.ndarray): Target labels of shape (N,).
- Return type:
tuple
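The forward-hook pattern behind this method, registering a callback on each named layer, running the forward pass once, and collecting what each callback captured, can be emulated without torch. The toy Layer class and layer names below are illustrative; torch's register_forward_hook plays the role of the hooks list.

```python
# Pure-Python emulation of the forward-hook pattern used by
# compute_activations. In torch, module.register_forward_hook(fn) attaches
# the callback; here a plain hooks list stands in for that machinery.

class Layer:
    def __init__(self, fn):
        self.fn, self.hooks = fn, []
    def __call__(self, x):
        out = self.fn(x)
        for hook in self.hooks:
            hook(self, x, out)   # mimics torch's hook(module, input, output)
        return out

def capture_activations(layers, layer_names, x):
    activations = {}
    for name in layer_names:
        layers[name].hooks.append(
            lambda mod, inp, out, name=name: activations.setdefault(name, out)
        )
    for layer in layers.values():   # the single forward pass through the stack
        x = layer(x)
    return x, activations

layers = {"encoder": Layer(lambda x: x * 2), "projector": Layer(lambda x: x + 1)}
out, acts = capture_activations(layers, ["projector"], 3)
print(out, acts)  # 7 {'projector': 7}
```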
- initialize_remote_logger()[source]
Initialize remote logging (e.g., wandb) for experiment tracking.
- classmethod exec(gpu, cfg)[source]
Execute training with the given configuration.
- Parameters:
gpu (int) – GPU device ID
cfg (DictConfig) – Training configuration
- fovi.trainer.find_config(base_fn, load, model_dirs=['../models', SAVE_DIR + '/logs', SLOW_DIR + '/logs'], device='cuda')[source]
Search for and load model configuration from multiple directories.
Attempts to load the configuration from each directory in order until one succeeds. If not found locally, attempts to download from HuggingFace Hub.
- Parameters:
base_fn (str) – Base filename/directory name to search for.
load (bool) – Whether to load model weights.
model_dirs (list, optional) – List of directories to search. Defaults to ['../models', SAVE_DIR + '/logs', SLOW_DIR + '/logs'].
device (str, optional) – Device to load weights onto. Defaults to ‘cuda’.
- Returns:
(cfg, state_dict, model_key) from load_config.
- Return type:
tuple
- Raises:
ValueError – If model is not found in any of the directories or on HuggingFace Hub.
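The search order described above (try each directory in turn, fall back to a remote download, raise ValueError if everything fails) can be sketched as a small loop. The loader and hub-fetch callables here are stand-ins for the real load_config and HuggingFace Hub logic.

```python
import os

# Sketch of find_config's fallback chain: first local directory that
# succeeds wins, then a remote fetch, then ValueError. load_local and
# fetch_remote are hypothetical stand-ins for the real functions.

def find_config_sketch(base_fn, model_dirs, load_local, fetch_remote):
    for d in model_dirs:
        candidate = os.path.join(d, base_fn)
        try:
            return load_local(candidate)   # first directory that succeeds wins
        except FileNotFoundError:
            continue
    try:
        return fetch_remote(base_fn)       # e.g. a HuggingFace Hub download
    except Exception:
        raise ValueError(f"model {base_fn!r} not found locally or remotely")

def fake_load(path):
    # Pretend only paths under a 'logs' directory exist locally.
    if "logs" not in path:
        raise FileNotFoundError(path)
    return {"cfg": path}

result = find_config_sketch("run1", ["../models", "save/logs"], fake_load,
                            lambda fn: {"cfg": "hub"})
print(result)  # {'cfg': 'save/logs/run1'}
```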