Models
The models module provides several neural network architectures that work with graph input data and follow an encoder-processor-decoder structure.
Encoder-Processor-Decoder Model
The model defines a network architecture with configurable encoder, processor, and decoder components (Lang et al. (2024a)).
- class anemoi.models.models.encoder_processor_decoder.AnemoiModelEncProcDec(*, model_config: DictConfig, data_indices: dict, statistics: dict, n_step_input: int, n_step_output: int, graph_data: HeteroData)
Bases: BaseGraphModel
Message passing graph neural network.
- forward(x: dict[str, Tensor], *, model_comm_group: ProcessGroup | None = None, grid_shard_sizes: dict[str, list[int] | None] | None = None, **kwargs) dict[str, Tensor]
Forward pass of the model.
- Parameters:
- Returns:
Output of the model, with the same shape as the input (sharded if input is sharded)
- Return type:
dict[str, Tensor]
Residual connections (including graph-based truncation) are configured in the model config; see Residual connections for details.
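As a rough illustration of the encoder-processor-decoder flow described in this section, the sketch below chains three plain Python callables over a dictionary keyed by dataset name. It is not the AnemoiModelEncProcDec implementation. The function run_encoder_processor_decoder, the three stand-in components, and the dataset name "data" are all hypothetical, and each dataset is handled independently here purely for simplicity.

```python
from typing import Callable

Vector = list[float]


def run_encoder_processor_decoder(
    x: dict[str, Vector],
    encoder: Callable[[Vector], Vector],
    processor: Callable[[Vector], Vector],
    decoder: Callable[[Vector], Vector],
    num_processor_steps: int = 2,
) -> dict[str, Vector]:
    """Toy pipeline: encode each dataset, refine the hidden state, decode back."""
    outputs: dict[str, Vector] = {}
    for name, values in x.items():
        hidden = encoder(values)  # data nodes -> hidden representation
        for _ in range(num_processor_steps):
            hidden = processor(hidden)  # repeated update on the hidden representation
        outputs[name] = decoder(hidden)  # hidden representation -> data nodes
    return outputs


# Hypothetical stand-ins for the learned components (assumptions, not library code).
def pair_mean_encoder(values: Vector) -> Vector:
    """Halve the resolution by averaging neighbouring pairs."""
    return [(values[i] + values[i + 1]) / 2 for i in range(0, len(values), 2)]


def smoothing_processor(hidden: Vector) -> Vector:
    """Mix every hidden value with the mean of all hidden values."""
    mean = sum(hidden) / len(hidden)
    return [0.5 * v + 0.5 * mean for v in hidden]


def repeat_decoder(hidden: Vector) -> Vector:
    """Restore the input resolution by repeating each hidden value twice."""
    return [v for v in hidden for _ in range(2)]


x = {"data": [1.0, 3.0, 5.0, 7.0]}
y = run_encoder_processor_decoder(
    x, pair_mean_encoder, smoothing_processor, repeat_decoder
)
```

The output dictionary has the same keys and per-dataset length as the input, mirroring the "same shape as the input" contract stated for forward.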
Ensemble Encoder-Processor-Decoder Model
The ensemble model architecture implements the AIFS-CRPS approach (Lang et al. (2024b)).
Key features:
Based on the base encoder-processor-decoder architecture
Injects noise in the processor for each ensemble member using anemoi.models.layers.normalization.ConditionalLayerNorm
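To make the noise-injection idea concrete, here is a minimal pure-Python sketch of a conditional layer normalisation: the features are normalised, then scaled and shifted by amounts that depend on a per-member noise value. It is an illustration under simplifying assumptions, not the ConditionalLayerNorm implementation. The scalar noise, the two scalar weights, and the function names are all hypothetical.

```python
import math
import random


def layer_norm(x: list[float], eps: float = 1e-5) -> list[float]:
    """Normalise a feature vector to zero mean and (approximately) unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def conditional_layer_norm(
    x: list[float], noise: float, scale_weight: float, shift_weight: float
) -> list[float]:
    """Layer norm whose scale and shift are functions of a conditioning value.

    Assumption: the condition is one scalar per ensemble member and the
    conditioning maps are single scalar weights.
    """
    normed = layer_norm(x)
    scale = 1.0 + scale_weight * noise  # reduces to plain layer norm when noise == 0
    shift = shift_weight * noise
    return [scale * v + shift for v in normed]


rng = random.Random(0)
features = [0.5, -1.0, 2.0, 0.0]

# One independent noise draw per ensemble member, applied to the same features.
member_noise = [rng.gauss(0.0, 1.0) for _ in range(3)]
members = [conditional_layer_norm(features, z, 0.1, 0.2) for z in member_noise]
```

Because every member receives a different noise value, identical input features produce different normalised outputs, which is what lets the ensemble members diverge.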
- class anemoi.models.models.ens_encoder_processor_decoder.AnemoiEnsModelEncProcDec(*, model_config: DictConfig, data_indices: dict, statistics: dict, graph_data: HeteroData, n_step_input: int, n_step_output: int)
Bases: AnemoiModelEncProcDec
Message passing graph neural network with ensemble functionality.
- forward(x: dict[str, Tensor], *, fcstep: int, model_comm_group: ProcessGroup | None = None, grid_shard_sizes: dict[str, list[int] | None] | None = None, **kwargs) dict[str, Tensor]
Forward operator.
- Parameters:
x (dict[str, torch.Tensor]) – Input tensor, shape (bs, m, e, n, f)
fcstep (int) – Forecast step
model_comm_group (ProcessGroup, optional) – Model communication group
grid_shard_sizes (DatasetShardSizes, optional) – Per-dataset shard sizes for the grid dimension. None means the corresponding dataset is replicated, not sharded.
**kwargs – Additional keyword arguments
- Returns:
Output tensor per dataset
- Return type:
dict[str, Tensor]
For the training-side CRPS setup, including loss, truncation, and ensemble-specific configuration changes, see Ensemble CRPS-based training.
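The grid_shard_sizes argument of forward can be read as follows: a dataset mapped to None is replicated on every rank, while a dataset mapped to a list is split along the grid dimension. The sketch below shows one way to turn that into a local index range. It assumes the list holds one shard size per rank, in rank order, which the text above does not state. The helper local_grid_slice is hypothetical and not part of anemoi.models.

```python
from typing import Optional

ShardSizes = Optional[dict[str, Optional[list[int]]]]


def local_grid_slice(
    grid_shard_sizes: ShardSizes, dataset: str, rank: int, grid_size: int
) -> slice:
    """Return the range of grid points that `rank` holds for `dataset`.

    Assumption: a list entry contains one shard size per rank, in rank order.
    """
    sizes = None if grid_shard_sizes is None else grid_shard_sizes.get(dataset)
    if sizes is None:
        # Replicated, not sharded: every rank sees the full grid.
        return slice(0, grid_size)
    start = sum(sizes[:rank])  # grid points owned by lower ranks
    return slice(start, start + sizes[rank])


# Hypothetical dataset names: "a" is sharded over three ranks, "b" is replicated.
shard_sizes = {"a": [4, 3, 3], "b": None}
slice_a_rank1 = local_grid_slice(shard_sizes, "a", rank=1, grid_size=10)
slice_b_rank1 = local_grid_slice(shard_sizes, "b", rank=1, grid_size=10)
```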
Hierarchical Encoder-Processor-Decoder Model
This model extends the standard encoder-processor-decoder architecture by introducing a hierarchical processor.
Key features:
Requires a predefined list of hidden nodes, [hidden_1, …, hidden_n]
Nodes must be sorted to match the expected flow of information: data -> hidden_1 -> … -> hidden_n -> … -> hidden_1 -> data
Supports hierarchical level processing through the enable_hierarchical_level_processing configuration. This argument determines whether a processor is added at each hierarchy level or only at the final level.
Channel scaling: 2^n * config.num_channels where n is the hierarchy level
By default, the mappers at hierarchy level n use 2^n * config.num_channels channels. The channel count therefore doubles with each level, so processing capacity grows with the depth of the hierarchy.
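The two features above, node ordering and channel scaling, can be sketched in a few lines of Python. This is only an illustration: the helpers hierarchy_flow and channels_per_level are hypothetical, and the choice that n starts at 0 for the first hidden level is an assumption, since the indexing of levels is not specified here.

```python
def hierarchy_flow(hidden_nodes: list[str], data_node: str = "data") -> list[str]:
    """List the node sets visited: data -> hidden_1 -> ... -> hidden_n -> ... -> hidden_1 -> data."""
    way_back = hidden_nodes[-2::-1]  # hidden_{n-1}, ..., hidden_1 (empty if n == 1)
    return [data_node, *hidden_nodes, *way_back, data_node]


def channels_per_level(num_channels: int, num_levels: int) -> list[int]:
    """Channel count 2^n * num_channels per level (assumption: n starts at 0)."""
    return [2**n * num_channels for n in range(num_levels)]


hidden = ["hidden_1", "hidden_2", "hidden_3"]
flow = hierarchy_flow(hidden)
widths = channels_per_level(num_channels=64, num_levels=len(hidden))
```

With three hidden levels and a base width of 64, the widths are 64, 128, and 256, and the deepest level appears exactly once in the flow.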
- class anemoi.models.models.hierarchical.AnemoiModelEncProcDecHierarchical(*, model_config: DictConfig, data_indices: dict, statistics: dict, n_step_input: int, n_step_output: int, graph_data: HeteroData)
Bases: AnemoiModelEncProcDec
Message passing hierarchical graph neural network.
- forward(x: dict[str, Tensor], model_comm_group: ProcessGroup | None = None, grid_shard_sizes: dict[str, list[int] | None] | None = None, **kwargs) dict[str, Tensor]
Forward pass of the model.
- Parameters:
- Returns:
Output of the model, with the same shape as the input (sharded if input is sharded)
- Return type:
dict[str, Tensor]