########################## Configuring the Training ########################## Anemoi training is designed so you can adjust key parts of the models and training process without needing to modify the underlying code. A basic introduction to the configuration system is provided in the `getting started `_ section. This section will go into more detail on how to configure the training pipeline. *********************** Default Config Groups *********************** A typical config file will start with specifying the default config settings at the top as follows: .. code:: yaml defaults: - data: zarr - dataloader: native_grid - diagnostics: evaluation - hardware: example - graph: multi_scale - model: gnn - training: default - _self_ These are group configs for each section. The options after the defaults are then used to override the configs, by assigning new features and keywords. You can also find these defaults in other configs, like the ``hardware``, which implements: .. code:: yaml defaults: - paths: example - files: example ***************************** YAML-based config overrides ***************************** The config files are written in YAML format. This allows for easy overrides of the default settings. For example, to change the model from the default GNN to a transformer, you can use the following config in the config groups.: .. code:: yaml model: transformer This will override the default model config with the transformer model. You can also override individual settings. For example, to change the learning rate from the default value of 0.625e-4 to 1e-3, you can add the following to the config you're using: .. code:: yaml training: lr: rate: 1e-3 You can also change the GPU count to whatever you have available: .. code:: yaml hardware: num_gpus_per_node: 1 This matches the interface of the underlying defaults in Anemoi training. Example Config File =================== Here is an example of a config file that changes the model to a transformer, the learning rate to 1e-3, and the number of GPUs to 1. We also need to specify the paths to the data, output, and graph data and give the names of the files to use. You can get a dataset from the `Anemoi Datasets catalogue `_ or create one using the `Anemoi Datasets `_ package. You can create a graph using `Anemoi Graphs `_ or one will be created for you at runtime. Note that you must specify a filename for the graph, here we use `first_graph_m320.pt`. You'll also notice we've specified a resolution for the data, this must match the dataset you provide. .. code:: yaml defaults: - data: zarr - dataloader: native_grid - diagnostics: evaluation - hardware: example - graph: multi_scale - model: transformer # Change from default group - training: default - _self_ data: resolution: n320 hardware: num_gpus_per_node: 1 paths: output: /home/username/anemoi/training/output data: /home/username/anemoi/datasets graph: /home/username/anemoi/training/graphs files: dataset: datset-n320-2019-2021-6h.zarr graph: first_graph_n320.pt training: lr: rate: 1e-3 When we save this `example.yaml` file, we can run the training with this config using: .. code:: bash anemoi-training train --config-name=example.yaml ******************************* Command-line config overrides ******************************* It is also possible to use command line config overrides. We can switch out group configs using .. code:: bash anemoi-training train model=transformer or override individual config entries such as .. code:: bash anemoi-training train diagnostics.plot.enabled=False or combine everything together .. code:: bash anemoi-training train --config-name=debug.yaml model=transformer diagnostics.plot.enabled=False