metadata

class anemoi.inference.metadata.Metadata(metadata: dict[str, Any], supporting_arrays: dict[str, ndarray[tuple[Any, ...], dtype[Any]]] = {})

Bases: LegacyMixin

Base Metadata class.

property target_explicit_times: Any

Return the target explicit times from the training configuration.

property input_explicit_times: Any

Return the input explicit times from the training configuration.

property data_frequency: Any

Get the data frequency.

print_indices(print=<bound method Logger.info of <Logger anemoi.inference.metadata (WARNING)>>) None

Print data and model indices for debugging purposes.

property lagged: list[timedelta]

Return the list of steps for the multi_step_input fields.

property timestep: timedelta

Time step of the model.

property precision: str | int

Return the precision of the model (bits per float).

property variable_to_input_tensor_index: MappingProxyType

Return the mapping between variable name and input tensor index.

property variable_to_output_tensor_index: MappingProxyType

Return the mapping between variable name and output tensor index.

property input_tensor_index_to_variable: MappingProxyType

Return the mapping between input tensor index and variable name.

property output_tensor_index_to_variable: MappingProxyType

Return the mapping between output tensor index and variable name.
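As an illustration of how a name-to-index mapping and its inverse relate, the sketch below builds both as read-only MappingProxyType views. The variable names and indices are made up for the example; a real checkpoint defines its own.

```python
from types import MappingProxyType

# Hypothetical variable names and tensor positions, for illustration only.
variable_to_index = MappingProxyType({"2t": 0, "10u": 1, "10v": 2})

# The inverse mapping goes from tensor index back to variable name.
index_to_variable = MappingProxyType(
    {index: name for name, index in variable_to_index.items()}
)

# MappingProxyType is a read-only view: item assignment raises TypeError.
try:
    variable_to_index["msl"] = 3
    read_only = False
except TypeError:
    read_only = True
```

Because the mappings are read-only views, callers that need a modified mapping should copy it into a regular dict first.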

property number_of_grid_points: int

Return the number of grid points per field.

property number_of_input_features: int

Return the number of input features.

property model_computed_variables: tuple

The initial-condition variables that must be computed rather than retrieved.

property multi_step_input: int

Number of past steps needed for the initial conditions tensor.
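The relationship between multi_step_input, timestep and lagged can be sketched as follows. This is a minimal sketch that assumes the input steps are spaced one timestep apart and end at the initial time; this reference does not state the spacing, and the values are hypothetical.

```python
from datetime import timedelta

# Hypothetical values; a real checkpoint supplies its own.
timestep = timedelta(hours=6)
multi_step_input = 2

# Assumption: input steps are spaced one timestep apart, ending at the
# initial time (offset zero).
lagged = sorted(-i * timestep for i in range(multi_step_input))
```

Under these assumptions the result holds one offset per input step, from the oldest step up to offset zero.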

property multi_step_output: int

Number of future steps predicted by a single forward pass of the model.

property prognostic_output_mask: ndarray[tuple[Any, ...], dtype[Any]]

Return the prognostic output mask.

property prognostic_input_mask: ndarray[tuple[Any, ...], dtype[Any]]

Return the prognostic input mask.

property computed_time_dependent_forcings: tuple[ndarray, list]

Return the indices and names of the computed forcings that are not constant in time.

Deprecated since version 0.6.4: This will be removed in 0.7.0. Use select_variables_and_mask instead.

property computed_constant_forcings: tuple[ndarray[tuple[Any, ...], dtype[Any]], list[str]]

Return the indices and names of the computed forcings that are constant in time.

Deprecated since version 0.6.4: This will be removed in 0.7.0. Use select_variables_and_mask instead.

has_supporting_array(name: str) bool

Check if the metadata has a supporting array with the given name.

Parameters:

name (str) – The name of the supporting array.

Returns:

True if the supporting array exists, False otherwise.

Return type:

bool

property variables: tuple

Return the variables as found in the training dataset.

property variables_metadata: dict[str, Any]

Return the variables and their metadata as found in the training dataset.

property diagnostic_variables: list

Variables that are marked as diagnostic.

Deprecated since version 0.6.4: This will be removed in 0.7.0. Use select_variables instead.

property prognostic_variables: list

Variables that are marked as prognostic.

Deprecated since version 0.6.4: This will be removed in 0.7.0. Use select_variables instead.

property index_to_variable: MappingProxyType

Return a mapping from index to variable name.

property typed_variables: dict[str, Variable]

Return the variables as strongly typed Variable objects, keyed by name.

property accumulations: list

Return the indices of the variables that are accumulations.

name_fields(fields: FieldList, namer: Callable[[...], str] | None = None) FieldList

Name fields using the provided namer.

Parameters:
  • fields (FieldList) – The fields to name.

  • namer (callable, optional) – The namer function, by default None.

Returns:

The named fields.

Return type:

FieldList

sort_by_name(fields: FieldList, *args: Any, namer: Callable[[...], Any] | None = None, **kwargs: Any) FieldList

Sort fields by name.

Parameters:
  • fields (ekd.FieldList) – The fields to sort.

  • args (Any) – Additional arguments.

  • namer (callable, optional) – The namer function, by default None.

  • kwargs (Any) – Additional keyword arguments.

Returns:

The sorted fields.

Return type:

ekd.FieldList

default_namer(*args: Any, **kwargs: Any) Callable[[...], str]

Return a callable that can be used to name earthkit-data fields.

Parameters:
  • args (Any) – Additional arguments.

  • kwargs (Any) – Additional keyword arguments.

Returns:

The namer function.

Return type:

Callable
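To show the idea of a namer factory, the sketch below returns a callable that turns field metadata into a name. It is not the library's implementation: make_namer, the single dict argument and the param_level naming rule are all assumptions made for illustration.

```python
def make_namer():
    """Hypothetical factory returning a namer for dict-like field metadata."""

    def namer(metadata):
        # Assumption: fields with a level are named "<param>_<level>".
        if "levelist" in metadata:
            return f"{metadata['param']}_{metadata['levelist']}"
        return metadata["param"]

    return namer


namer = make_namer()

# Made-up field metadata standing in for earthkit-data fields.
fields = [{"param": "t", "levelist": 850}, {"param": "2t"}]
names = sorted(namer(field) for field in fields)
```

The same callable can serve both for naming and for sorting, which is why name_fields and sort_by_name accept an optional namer.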

property grid: str | None

Return the grid information.

property area: str | None

Return the area information.

select_variables(*, include: list[str] | None = None, exclude: list[str] | None = None, has_mars_requests: bool = False) list[str]

Select variables by category.

Parameters:
  • include (List[str]) – Categories to include.

  • exclude (List[str]) – Categories to exclude.

  • has_mars_requests (bool) – If True, only include variables that have MARS requests.

Returns:

The list of variables.

Return type:

List[str]
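The include and exclude filters can be pictured with the sketch below. It assumes a variable is kept when it has at least one of the include categories and none of the exclude categories; the actual matching rule is not stated here. The helper select and the category table are hypothetical.

```python
# Hypothetical categories per variable, for illustration only.
categories = {
    "2t": {"prognostic"},
    "tp": {"diagnostic", "accumulation"},
    "cos_latitude": {"computed", "forcing", "constant"},
}


def select(categories, include=None, exclude=None):
    """Keep variables matching any include category and no exclude category."""
    selected = []
    for name, cats in categories.items():
        if include is not None and not cats & set(include):
            continue
        if exclude is not None and cats & set(exclude):
            continue
        selected.append(name)
    return selected


forcings = select(categories, include=["forcing"])
not_computed = select(categories, exclude=["computed"])
```

Leaving both filters at None keeps every variable in this sketch.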

mars_input_requests() Iterator[dict[str, Any]]

Generate MARS input requests.

Returns:

The MARS requests.

Return type:

Iterator[DataRequest]

mars_by_levtype(levtype: str) tuple[set, set]

Get MARS parameters and levels by levtype.

Parameters:

levtype (str) – The levtype to filter by.

Returns:

The parameters and levels.

Return type:

tuple
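A rough sketch of grouping by levtype is given below. The per-variable metadata layout and the helper by_levtype are assumptions for illustration; only the (parameters, levels) return shape follows the signature above.

```python
# Hypothetical per-variable MARS metadata, for illustration only.
variables = {
    "t_850": {"param": "t", "levtype": "pl", "levelist": 850},
    "t_500": {"param": "t", "levtype": "pl", "levelist": 500},
    "2t": {"param": "2t", "levtype": "sfc"},
}


def by_levtype(variables, levtype):
    """Collect the parameters and levels of variables with the given levtype."""
    params, levels = set(), set()
    for mars in variables.values():
        if mars.get("levtype") != levtype:
            continue
        params.add(mars["param"])
        if "levelist" in mars:
            levels.add(mars["levelist"])
    return params, levels


params, levels = by_levtype(variables, "pl")
```

Returning sets removes duplicates, so a parameter that appears on several levels is listed once.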

mars_requests(*, variables: list[str], dates: list[str | datetime | int], use_grib_paramid: bool = False, always_split_time: bool = False, patch_request: Callable[[dict[str, Any]], dict[str, Any]] | None = None, dont_fail_for_missing_paramid: bool = False, **kwargs: Any) list[dict[str, Any]]

Generate MARS requests for the given variables and dates.

Parameters:
  • variables (list[str]) – The list of variables.

  • dates (list[Date]) – The list of dates.

  • use_grib_paramid (bool, optional) – Whether to use GRIB paramid, by default False.

  • always_split_time (bool, optional) – Whether to always split time, by default False.

  • patch_request (Optional[Callable], optional) – A callable to patch the request, by default None.

  • dont_fail_for_missing_paramid (bool, optional) – Whether to not fail for missing param ids, by default False.

  • **kwargs (Any) – Additional keyword arguments.

Returns:

The list of MARS requests.

Return type:

List[DataRequest]
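A patch_request callable takes a request dictionary and returns a request dictionary. The sketch below shows one possible shape; the sample request and the added expver key are made up, and how mars_requests applies the callable internally is not shown here.

```python
from typing import Any


def patch_request(request: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical patch: return a copy of the request with an extra key."""
    patched = dict(request)
    patched["expver"] = "0001"
    return patched


# A made-up request, standing in for one produced by mars_requests.
request = {"param": "2t", "levtype": "sfc"}
patched = patch_request(request)
```

Returning a copy rather than mutating the argument keeps the original request available for comparison or logging.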

simple_mars_requests(*, variables: list[str]) Iterator[dict[str, Any]]

Generate MARS requests for the given variables.

Parameters:

variables (list) – The list of variables.

Returns:

The MARS requests.

Return type:

Iterator[DataRequest]

Raises:

ValueError – If no variables are requested or if a variable is not found in the metadata.

report_error() None

Report an error with provenance information.

validate_environment(*, all_packages: bool = False, on_difference: Literal['warn', 'error', 'ignore', 'return'] = 'warn', exempt_packages: list[str] | None = None) bool | str

Validate environment of the checkpoint against the current environment.

Parameters:
  • all_packages (bool, optional) – Check all packages in the environment (True) or just anemoi’s (False), by default False.

  • on_difference (Literal['warn', 'error', 'ignore', 'return'], optional) – What to do on difference, by default “warn”.

  • exempt_packages (list[str], optional) – List of packages to exempt from the check, by default EXEMPT_PACKAGES.

Returns:

If on_difference is ‘return’, the formatted text of the differences. Otherwise a boolean: True if the environment is valid, False otherwise.

Return type:

Union[bool, str]

Raises:
  • RuntimeError – If found difference and on_difference is ‘error’

  • ValueError – If on_difference is not one of ‘warn’, ‘error’, ‘ignore’ or ‘return’.
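The on_difference modes can be summarised with the sketch below. handle_differences is a hypothetical helper, not part of the library, and it assumes that any value outside the four listed in the signature is rejected with ValueError.

```python
import warnings


def handle_differences(differences, on_difference="warn"):
    """Hypothetical dispatcher mirroring the documented on_difference modes."""
    if on_difference not in ("warn", "error", "ignore", "return"):
        raise ValueError(f"Invalid on_difference: {on_difference!r}")
    if on_difference == "return":
        # Formatted text of the differences instead of a boolean.
        return "\n".join(differences)
    if differences and on_difference == "error":
        raise RuntimeError("\n".join(differences))
    if differences and on_difference == "warn":
        warnings.warn("\n".join(differences))
    # True if the environment is valid, False otherwise.
    return not differences
```

Callers that pass on_difference='return' should expect a string, which is why the return type is a union of bool and str.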

open_dataset(*, use_original_paths: bool | None = None, from_dataloader: str | None = None) tuple[Any, Any]

Open the dataset.

Parameters:
  • use_original_paths (bool) – Whether to use the original paths.

  • from_dataloader (str, optional) – The dataloader to use, by default None.

Returns:

The opened dataset and its arguments.

Return type:

tuple

open_dataset_args_kwargs(*, use_original_paths: bool, from_dataloader: str | None = None) tuple[Any, Any]

Get the arguments and keyword arguments for opening the dataset.

Parameters:
  • use_original_paths (bool) – Whether to use the original paths.

  • from_dataloader (str, optional) – The dataloader to use, by default None.

Returns:

The arguments and keyword arguments.

Return type:

tuple

variable_categories() dict

Get the categories of variables.

Returns:

The categories of variables.

Return type:

dict

load_supporting_array(name: str) ndarray[tuple[Any, ...], dtype[Any]]

Load a supporting array by name.

Parameters:

name (str) – The name of the supporting array.

Returns:

The supporting array.

Return type:

FloatArray

Raises:

ValueError – If the supporting array is not found.
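Since load_supporting_array raises ValueError for unknown names, optional arrays are best guarded with has_supporting_array. The sketch below uses a hypothetical stand-in class, ArrayStore, with plain lists in place of NumPy arrays so that it runs without any dependency.

```python
class ArrayStore:
    """Hypothetical stand-in for the two supporting-array methods."""

    def __init__(self, supporting_arrays):
        self._arrays = supporting_arrays

    def has_supporting_array(self, name):
        return name in self._arrays

    def load_supporting_array(self, name):
        if name not in self._arrays:
            raise ValueError(f"Supporting array not found: {name}")
        return self._arrays[name]


store = ArrayStore({"latitudes": [50.0, 49.75]})

# Check first, so that a missing optional array yields None, not an error.
mask = (
    store.load_supporting_array("mask")
    if store.has_supporting_array("mask")
    else None
)
latitudes = store.load_supporting_array("latitudes")
```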

property supporting_arrays: dict[str, ndarray[tuple[Any, ...], dtype[Any]]]

Return the supporting arrays.

property latitudes: ndarray[tuple[Any, ...], dtype[Any]] | None

Return the latitudes.

property longitudes: ndarray[tuple[Any, ...], dtype[Any]] | None

Return the longitudes.

property grid_points_mask: ndarray[tuple[Any, ...], dtype[Any]] | None

Return the grid points mask.

provenance_training() dict[str, Any]

Get the environment configuration recorded at training time.

Returns:

The environmental configuration.

Return type:

dict

sources(path: str) list

Get the sources from the metadata.

Parameters:

path (str) – The path to the sources.

Returns:

The list of sources.

Return type:

list

Raises:

ValueError – If not all paths were fixed.

print_variable_categories(print=<bound method Logger.info of <Logger anemoi.inference.metadata (WARNING)>>) None

Print the variable categories for debugging purposes.

patch(patch: dict) None

Patch the metadata with the given patch.

Parameters:

patch (dict) – The patch to apply.
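One plausible reading of patching is a recursive merge of the patch into the metadata dictionary, sketched below. Whether patch merges nested keys or replaces whole top-level entries is not stated here, so treat the semantics, the helper merge and the sample keys as assumptions.

```python
def merge(target: dict, patch: dict) -> None:
    """Recursively merge patch into target in place (assumed semantics)."""
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            # Both sides are dicts: descend instead of overwriting.
            merge(target[key], value)
        else:
            target[key] = value


# Made-up metadata and patch, for illustration only.
metadata = {"dataset": {"frequency": "6h", "resolution": "o96"}}
merge(metadata, {"dataset": {"frequency": "1h"}})
```

With a recursive merge, keys that the patch does not mention keep their original values.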

class anemoi.inference.metadata.SingleDatasetMetadata(metadata: dict[str, Any], supporting_arrays: dict[str, ndarray[tuple[Any, ...], dtype[Any]]] = {})

Bases: Metadata

Legacy single-dataset metadata.

class anemoi.inference.metadata.MultiDatasetMetadata(metadata: dict[str, Any], supporting_arrays: dict[str, dict[str, ndarray[tuple[Any, ...], dtype[Any]]]] = {}, dataset_name='data')

Bases: Metadata

Map metadata for a multi-dataset checkpoint to a specific dataset name.

property dataset_names: list

List of canonical dataset names.

property timestep: timedelta

Time step of the model.

property multi_step_input: int

Number of past steps needed for the initial conditions tensor.

property multi_step_output: int

Number of future steps predicted by a single forward pass of the model.

property input_explicit_times: Any

Explicit times of the input steps used for the temporal downscaler.

property target_explicit_times: Any

Explicit times of the target steps used for the temporal downscaler.

property variable_to_input_tensor_index: MappingProxyType

Return the mapping between variable name and input tensor index.

property variable_to_output_tensor_index: MappingProxyType

Return the mapping between variable name and output tensor index.

property input_tensor_index_to_variable: MappingProxyType

Return the mapping between input tensor index and variable name.

property output_tensor_index_to_variable: MappingProxyType

Return the mapping between output tensor index and variable name.

variable_categories() dict[str, set[str]]

Get the categories of variables.

Returns:

The categories of variables.

Return type:

dict

class anemoi.inference.metadata.SourceMetadata(parent: Metadata, name: str, metadata: dict, supporting_arrays: dict = {})

Bases: Metadata

Holds the metadata of a single source. It contains only the dataset and supporting_arrays parts of the metadata; everything else is forwarded to the parent metadata object.
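The forwarding idea can be illustrated with plain attribute delegation. The sketch below is not how SourceMetadata is implemented; Parent, Source and their attributes are hypothetical and only show the pattern of keeping some data locally and deferring the rest to a parent.

```python
class Parent:
    """Hypothetical parent with one attribute to forward."""

    timestep_hours = 6


class Source:
    """Hypothetical source: own dataset data, everything else from parent."""

    def __init__(self, parent, name, dataset):
        self.parent = parent
        self.name = name
        self.dataset = dataset

    def __getattr__(self, attribute):
        # Only called when normal lookup fails, so own attributes win.
        return getattr(self.parent, attribute)


source = Source(Parent(), "era5", {"variables": ["2t"]})
```

Here source.dataset comes from the source itself, while source.timestep_hours is resolved on the parent.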

property latitudes: ndarray[tuple[Any, ...], dtype[Any]] | None

Return the latitudes.

property longitudes: ndarray[tuple[Any, ...], dtype[Any]] | None

Return the longitudes.

property grid_points_mask: ndarray[tuple[Any, ...], dtype[Any]] | None

Return the grid points mask.