Launch Ensemble Training on Compute Cloud

flyvis_cli.training.train

Script to train an ensemble of models.

train_models
train_models(args, kwargs)

Launch training jobs for an ensemble of models.

Parameters:

Name    Type        Description                                          Default
args    Namespace   Command-line arguments.                              required
kwargs  List[str]   Additional keyword arguments as a list of strings.   required
Source code in flyvis_cli/training/train.py
def train_models(args: argparse.Namespace, kwargs: List[str]) -> None:
    """
    Launch training jobs for an ensemble of models.

    Args:
        args: Command-line arguments.
        kwargs: Additional keyword arguments as a list of strings.
    """
    launch_range(
        args.start,
        args.end,
        args.ensemble_id,
        args.task_name,
        args.nP,
        args.gpu,
        args.q,
        args.train_script,
        args.dry,
        kwargs,
    )
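The function above forwards the namespace fields to launch_range positionally, followed by the list of extra options. The following is a minimal sketch of a programmatic call, not the library's own code: launch_range is replaced by a hypothetical stand-in that only records what it receives, and every value in the namespace (including the gpu, q, and train_script strings and the override "key=value") is an illustrative placeholder.

```python
import argparse
from typing import List

# Hypothetical stand-in for the real launch_range: it records the arguments
# it receives and submits nothing.
received = {}


def launch_range(*positional):
    received["args"] = positional


def train_models(args: argparse.Namespace, kwargs: List[str]) -> None:
    """Same body as the source listing: forward everything to launch_range."""
    launch_range(
        args.start,
        args.end,
        args.ensemble_id,
        args.task_name,
        args.nP,
        args.gpu,
        args.q,
        args.train_script,
        args.dry,
        kwargs,
    )


# Placeholder values; the real types and defaults are not stated in this page.
args = argparse.Namespace(
    start=0,
    end=4,
    ensemble_id="0045",
    task_name="flow",
    nP="4",
    gpu="1",
    q="some_queue",
    train_script="train_single.py",
    dry=True,
)
train_models(args, ["key=value"])
print(received["args"])
```

Because the fields are passed by position, the namespace must provide all nine attributes; a missing one raises AttributeError before anything is launched.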
usage:
flyvis train [-h] [--start START] [--end END] [...] --ensemble_id ENSEMBLE_ID --task_name TASK_NAME [hydra_options...]
       or
train.py [-h] [--start START] [--end END] [...] --ensemble_id ENSEMBLE_ID --task_name TASK_NAME [hydra_options...]

For a full list of hydra options and default arguments, run: flyvis train-single --help

Train an ensemble of models. Launches a job for each model on the compute cloud.

options:
  -h, --help            show this help message and exit
  --start START         Start id of ensemble.
  --end END             End id of ensemble.
  --nP NP               Number of processors.
  --gpu GPU             Number of GPUs.
  --q Q                 Queue.
  --ensemble_id ENSEMBLE_ID
                        Id of the ensemble, e.g. 0045.
  --task_name TASK_NAME
                        Name given to the task, e.g., flow.
  --train_script TRAIN_SCRIPT
                        Script to run for training. Default: /groups/turaga/home/lappalainenj/FlyVis/private/flyvision/flyvis_cli/training/train_single.py
  --dry                 Perform a dry run without actually launching jobs.
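The help text above can be mirrored with a small argparse parser. This is a sketch under assumptions, not the actual flyvis parser: the int type for --start and --end, the absence of defaults, and the use of parse_known_args to separate the trailing hydra options are all guesses, and "key=value" is a placeholder override.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser that mirrors the options listed in the help text."""
    parser = argparse.ArgumentParser(
        prog="flyvis train",
        description=(
            "Train an ensemble of models. "
            "Launches a job for each model on the compute cloud."
        ),
    )
    # Assumption: start and end are integers; the help text does not say.
    parser.add_argument("--start", type=int, help="Start id of ensemble.")
    parser.add_argument("--end", type=int, help="End id of ensemble.")
    parser.add_argument("--nP", help="Number of processors.")
    parser.add_argument("--gpu", help="Number of GPUs.")
    parser.add_argument("--q", help="Queue.")
    parser.add_argument(
        "--ensemble_id", required=True, help="Id of the ensemble, e.g. 0045."
    )
    parser.add_argument(
        "--task_name", required=True, help="Name given to the task, e.g., flow."
    )
    parser.add_argument("--train_script", help="Script to run for training.")
    parser.add_argument(
        "--dry",
        action="store_true",
        help="Perform a dry run without actually launching jobs.",
    )
    return parser


argv = [
    "--start", "0",
    "--end", "2",
    "--ensemble_id", "0045",
    "--task_name", "flow",
    "--dry",
    "key=value",  # placeholder for a trailing hydra option
]
# parse_known_args returns unrecognized arguments as a separate list,
# which is one way to collect the hydra options.
args, hydra_options = build_parser().parse_known_args(argv)
print(args)
print(hydra_options)
```

With this split, args plays the role of the Namespace parameter of train_models and hydra_options the role of its kwargs list.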