Preprocessing

The preprocessing module is used to pre- and post-process the data. The module contains the following classes:

class anemoi.models.preprocessing.BasePreprocessor(config=None, data_indices: IndexCollection | None = None, statistics: dict | None = None)

Bases: Module

Base class for data pre- and post-processors.

forward(x, in_place: bool = True, inverse: bool = False) → Tensor

Process the input tensor.

Parameters:
  • x (torch.Tensor) – Input tensor

  • in_place (bool) – Whether to process the tensor in place

  • inverse (bool) – Whether to inverse transform the input

Returns:

Processed tensor

Return type:

torch.Tensor

transform(x, in_place: bool = True) → Tensor

Process the input tensor.

inverse_transform(x, in_place: bool = True) → Tensor

Inverse process the input tensor.
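The relationship between forward, transform and inverse_transform can be sketched as follows. This is a minimal illustration, not the library's implementation: ScaleByTen is a hypothetical class, plain Python lists stand in for torch tensors, and the dispatch on inverse is an assumption based on the parameter descriptions above.

```python
class ScaleByTen:
    """Hypothetical stand-in for a BasePreprocessor subclass (not part of anemoi)."""

    def transform(self, x, in_place=True):
        if not in_place:
            x = list(x)  # work on a copy so the caller's data stays untouched
        for i, value in enumerate(x):
            x[i] = value * 10.0
        return x

    def inverse_transform(self, x, in_place=True):
        if not in_place:
            x = list(x)
        for i, value in enumerate(x):
            x[i] = value / 10.0
        return x

    def forward(self, x, in_place=True, inverse=False):
        # Assumed dispatch: `inverse` selects which of the two methods runs.
        if inverse:
            return self.inverse_transform(x, in_place=in_place)
        return self.transform(x, in_place=in_place)


processor = ScaleByTen()
data = [1.0, 2.0]

copied = processor.forward(data, in_place=False)   # `data` is left unchanged
modified = processor.forward(data, in_place=True)  # `data` itself is overwritten
restored = processor.forward(modified, inverse=True)
```

With in_place=True the returned object is the input itself, so callers that still need the raw values should pass in_place=False.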

class anemoi.models.preprocessing.Processors(processors: list, inverse: bool = False)

Bases: Module

A collection of processors.

forward(x, in_place: bool = True) → Tensor

Process the input tensor.

Parameters:
  • x (torch.Tensor) – Input tensor

  • in_place (bool) – Whether to process the tensor in place

Returns:

Processed tensor

Return type:

torch.Tensor
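A collection of processors behaves like a pipeline. The sketch below shows one way such a pipeline can be undone, assuming that the inverse collection applies the same steps in reverse order. Add, Multiply and Chain are hypothetical names, floats stand in for tensors, and the reversal is an assumption rather than documented behaviour of Processors.

```python
class Add:
    """Toy processor: adds a constant, or subtracts it when inverse=True."""

    def __init__(self, amount):
        self.amount = amount

    def __call__(self, x, inverse=False):
        return x - self.amount if inverse else x + self.amount


class Multiply:
    """Toy processor: multiplies by a factor, or divides when inverse=True."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, x, inverse=False):
        return x / self.factor if inverse else x * self.factor


class Chain:
    """Hypothetical stand-in for a Processors-style collection."""

    def __init__(self, processors, inverse=False):
        self.inverse = inverse
        # Assumption: undoing a pipeline means running its steps last-to-first.
        self.processors = list(reversed(processors)) if inverse else list(processors)

    def __call__(self, x):
        for processor in self.processors:
            x = processor(x, inverse=self.inverse)
        return x


steps = [Add(2.0), Multiply(4.0)]
pre_processors = Chain(steps)
post_processors = Chain(steps, inverse=True)

encoded = pre_processors(1.0)        # (1 + 2) * 4
decoded = post_processors(encoded)   # 12 / 4 - 2
```

Building the forward and inverse collections from the same list keeps pre- and post-processing consistent with each other.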

Normalizer

The normalizer module is used to normalize the data. The module contains the following classes:

class anemoi.models.preprocessing.normalizer.InputNormalizer(config=None, data_indices: IndexCollection | None = None, statistics: dict | None = None)

Bases: BasePreprocessor

Normalizes input data with a configurable method.

transform(x: Tensor, in_place: bool = True, data_index: Tensor | None = None) → Tensor

Normalizes an input tensor x of shape […, nvars].

Normalization done in-place unless specified otherwise.

The default use case assumes either the full batch tensor or the full input tensor. A data index based on the full data can be supplied to choose which variables to normalise.

Parameters:
  • x (torch.Tensor) – Data to normalize

  • in_place (bool, optional) – Normalize in-place, by default True

  • data_index (Optional[torch.Tensor], optional) – Normalize only the specified indices, by default None

Returns:

Normalized data

Return type:

torch.Tensor

inverse_transform(x: Tensor, in_place: bool = True, data_index: Tensor | None = None) → Tensor

Denormalizes an input tensor x of shape […, nvars | nvars_pred].

Denormalization done in-place unless specified otherwise.

The default use case assumes either the full batch tensor or the full output tensor. A data index based on the full data can be supplied to choose which variables to denormalise.

Parameters:
  • x (torch.Tensor) – Data to denormalize

  • in_place (bool, optional) – Denormalize in-place, by default True

  • data_index (Optional[torch.Tensor], optional) – Denormalize only the specified indices, by default None

Returns:

Denormalized data

Return type:

torch.Tensor
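The role of data_index can be illustrated with a small mean-std example. This is a sketch under several assumptions: mean-std is only one possible normalization method, nested lists stand in for a tensor of shape […, nvars], the functions return copies rather than working in place, and data_index is read as "column j of x corresponds to variable data_index[j] of the full data". The function names are hypothetical.

```python
def normalize(rows, mean, std, data_index=None):
    """Return a mean-std normalized copy of `rows` (each row holds nvars values)."""
    if data_index is None:
        data_index = range(len(mean))  # default: rows hold the full variable set
    # Column j of a row uses the statistics of variable data_index[j].
    return [[(v - mean[k]) / std[k] for v, k in zip(row, data_index)] for row in rows]


def denormalize(rows, mean, std, data_index=None):
    """Undo `normalize` using the same statistics and index."""
    if data_index is None:
        data_index = range(len(mean))
    return [[v * std[k] + mean[k] for v, k in zip(row, data_index)] for row in rows]


# Statistics for three variables of the full data (illustrative values).
mean = [280.0, 0.0, 1000.0]
std = [10.0, 5.0, 50.0]

full_norm = normalize([[290.0, 5.0, 950.0]], mean, std)

# A tensor that only holds variables 2 and 0 of the full data, in that order.
subset_norm = normalize([[1050.0, 270.0]], mean, std, data_index=[2, 0])
subset_back = denormalize(subset_norm, mean, std, data_index=[2, 0])
```

Passing the same data_index to both calls is what makes the round trip exact for a tensor that holds only part of the variables.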

Imputer

The imputer module is used to impute missing values in the data. The module contains the following classes:

class anemoi.models.preprocessing.imputer.BaseImputer(config=None, data_indices: IndexCollection | None = None, statistics: dict | None = None)

Bases: BasePreprocessor, ABC

Base class for Imputers.

get_nans(x: Tensor) → Tensor

Get the NaN mask from the data.

transform(x: Tensor, in_place: bool = True) → Tensor

Impute missing values in the input tensor.

inverse_transform(x: Tensor, in_place: bool = True) → Tensor

Impute missing values in the input tensor.

class anemoi.models.preprocessing.imputer.InputImputer(config=None, data_indices: IndexCollection | None = None, statistics: dict | None = None)

Bases: BaseImputer

Imputes missing values using the statistics supplied.

Expects the config to have keys corresponding to available statistics and values as lists of variables to impute:

```
default: "none"
mean:
  - y
maximum:
  - x
minimum:
  - q
```
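How such a config could translate into replacement values is sketched below. The layout of statistics (a dict of per-variable lists), the variable order, and the helper names build_fill_values and impute are all assumptions made for illustration. Treating default: "none" as "leave unlisted variables untouched" is likewise an assumption.

```python
import math


def build_fill_values(config, statistics, variables):
    """Map each variable listed in the config to its replacement value."""
    fill = {}
    for key, names in config.items():
        if key == "default":
            continue  # assumed meaning of "none": unlisted variables are not imputed
        for name in names:
            fill[name] = statistics[key][variables.index(name)]
    return fill


def impute(row, variables, fill):
    """Return a copy of `row` with NaNs replaced for the configured variables."""
    return [
        fill[name] if name in fill and math.isnan(value) else value
        for name, value in zip(variables, row)
    ]


variables = ["x", "y", "q"]
statistics = {
    "mean": [1.0, 2.0, 3.0],
    "maximum": [10.0, 20.0, 30.0],
    "minimum": [-1.0, -2.0, -3.0],
}
config = {"default": "none", "mean": ["y"], "maximum": ["x"], "minimum": ["q"]}

fill = build_fill_values(config, statistics, variables)
imputed = impute([math.nan, 5.0, math.nan], variables, fill)
```

Here x is filled with its maximum and q with its minimum, while the valid value of y is kept.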

class anemoi.models.preprocessing.imputer.ConstantImputer(config=None, data_indices: IndexCollection | None = None, statistics: dict | None = None)

Bases: BaseImputer

Imputes missing values using the constant value.

Expects the config to have keys corresponding to the constant values to fill in and values as lists of variables to impute:

```
default: "none"
1:
  - y
5.0:
  - x
3.14:
  - q
```
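In this config the keys themselves are the fill values. The sketch below inverts the mapping into one constant per variable and applies it to a single sample. The helper name constants_from_config and the dict-based sample are hypothetical, and the handling of the default key is an assumption.

```python
import math


def constants_from_config(config):
    """Invert {constant: [variables]} into {variable: constant}."""
    fill = {}
    for key, names in config.items():
        if key == "default":
            continue  # assumed meaning of "none": unlisted variables are not imputed
        for name in names:
            fill[name] = float(key)
    return fill


config = {"default": "none", 1: ["y"], 5.0: ["x"], 3.14: ["q"]}
fill = constants_from_config(config)

sample = {"x": math.nan, "y": 0.5, "q": math.nan}
imputed = {
    name: fill.get(name, value) if math.isnan(value) else value
    for name, value in sample.items()
}
```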

class anemoi.models.preprocessing.imputer.CopyImputer(config=None, data_indices: IndexCollection | None = None, statistics: dict | None = None)

Bases: BaseImputer

Imputes missing values by copying them from another variable:

```
default: "none"
variable_to_copy:
  - variable_missing_1
  - variable_missing_2
```

transform(x: Tensor, in_place: bool = True) → Tensor

Impute missing values in the input tensor.
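Copy imputation can be sketched on a single sample as follows. The function copy_impute is hypothetical, a dict stands in for the variable dimension of a tensor, and the sketch returns a copy instead of working in place.

```python
import math


def copy_impute(sample, config):
    """Fill NaNs in each target variable with the value of its source variable."""
    out = dict(sample)
    for source, targets in config.items():
        if source == "default":
            continue  # assumed meaning of "none": unlisted variables are not imputed
        for target in targets:
            if math.isnan(out[target]):
                out[target] = sample[source]
    return out


config = {
    "default": "none",
    "variable_to_copy": ["variable_missing_1", "variable_missing_2"],
}
sample = {
    "variable_to_copy": 7.0,
    "variable_missing_1": math.nan,
    "variable_missing_2": 2.0,
}
filled = copy_impute(sample, config)
```

Only variable_missing_1 is overwritten, because variable_missing_2 already holds a valid value.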

class anemoi.models.preprocessing.imputer.DynamicMixin

Bases: object

Mixin to add dynamic imputation behavior. To be used when NaN maps change at different timesteps.

get_nans(x: Tensor) → Tensor

Override to calculate NaN locations dynamically.

transform(x: Tensor, in_place: bool = True) → Tensor

Impute missing values in the input tensor.

inverse_transform(x: Tensor, in_place: bool = True) → Tensor

Impute missing values in the input tensor.
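The difference a dynamic NaN map makes can be seen by comparing it with an imputer that records NaN locations once and reuses them. Both classes below are hypothetical, and the caching in StaticMaskImputer is an assumption about what a non-dynamic imputer might do, not a description of the library's code.

```python
import math


class StaticMaskImputer:
    """Toy imputer that records NaN positions on the first call and reuses them."""

    def __init__(self, fill_value):
        self.fill_value = fill_value
        self.nan_locations = None

    def get_nans(self, x):
        return [math.isnan(v) for v in x]

    def transform(self, x):
        if self.nan_locations is None:
            self.nan_locations = self.get_nans(x)  # computed once, then cached
        return [self.fill_value if m else v for v, m in zip(x, self.nan_locations)]


class DynamicMaskMixin:
    """Toy mixin: recompute the NaN mask on every call."""

    def transform(self, x):
        mask = self.get_nans(x)
        return [self.fill_value if m else v for v, m in zip(x, mask)]


class DynamicImputer(DynamicMaskMixin, StaticMaskImputer):
    """The mixin comes first so that its transform overrides the cached one."""


step_0 = [math.nan, 1.0, 2.0]
step_1 = [3.0, math.nan, 4.0]  # the NaN has moved between timesteps

static = StaticMaskImputer(0.0)
static.transform(step_0)
static_out = static.transform(step_1)   # stale mask: the new NaN survives

dynamic = DynamicImputer(0.0)
dynamic.transform(step_0)
dynamic_out = dynamic.transform(step_1)  # fresh mask: the new NaN is filled
```

Listing the mixin before the concrete imputer in the bases mirrors the order used by the Dynamic* classes in this module.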

class anemoi.models.preprocessing.imputer.DynamicInputImputer(config=None, data_indices: IndexCollection | None = None, statistics: dict | None = None)

Bases: DynamicMixin, InputImputer

Imputes missing values using the statistics supplied and a dynamic NaN map.

class anemoi.models.preprocessing.imputer.DynamicConstantImputer(config=None, data_indices: IndexCollection | None = None, statistics: dict | None = None)

Bases: DynamicMixin, ConstantImputer

Imputes missing values using the constant value and a dynamic NaN map.

class anemoi.models.preprocessing.imputer.DynamicCopyImputer(config=None, data_indices: IndexCollection | None = None, statistics: dict | None = None)

Bases: DynamicMixin, CopyImputer

Dynamic Copy imputation behavior.

transform(x: Tensor, in_place: bool = True) → Tensor

Impute missing values in the input tensor.

inverse_transform(x: Tensor, in_place: bool = True) → Tensor

Impute missing values in the input tensor.

Remapper

The remapper module is used to remap one variable to multiple other variables that have been listed in data.remapped. The module contains the following classes:

class anemoi.models.preprocessing.remapper.Remapper(config=None, data_indices: IndexCollection | None = None, statistics: dict | None = None)

Bases: BasePreprocessor, ABC

Remaps and converts single variables.
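As an illustration of remapping one variable to several, an angle can be represented by its cosine and sine so that values just below 360 degrees and just above 0 degrees end up close together. The choice of a cosine/sine pair and the function names are assumptions for this sketch; the remappings actually available depend on the configuration.

```python
import math


def remap_angle(degrees):
    """Remap one angular variable to two variables: (cos, sin)."""
    radians = math.radians(degrees)
    return math.cos(radians), math.sin(radians)


def restore_angle(cos_value, sin_value):
    """Inverse remapping: recover the angle in [0, 360) from (cos, sin)."""
    return math.degrees(math.atan2(sin_value, cos_value)) % 360.0


cos_350, sin_350 = remap_angle(350.0)
cos_10, sin_10 = remap_angle(10.0)
restored = restore_angle(cos_350, sin_350)
```

The two angles differ by 340 in degrees but share the same cosine and have sines of equal size and opposite sign, which is the continuity the remapped representation provides.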