metadata

class anemoi.inference.metadata.Metadata(metadata: dict[str, Any], supporting_arrays: dict[str, ndarray[tuple[Any, ...], dtype[Any]]] = {})

Bases: LegacyMixin

Base Metadata class.

property target_explicit_times: Any: Return the target explicit times from the training configuration.

property input_explicit_times: Any: Return the input explicit times from the training configuration.

property data_frequency: Any: Get the data frequency.

print_indices(print=<bound method Logger.info of <Logger anemoi.inference.metadata (WARNING)>>) → None: Print data and model indices for debugging purposes.

property lagged: list[timedelta]: Return the list of steps for the multi_step_input fields.

property timestep: timedelta: Return the model time step.

property precision: str | int: Return the precision of the model (bits per float).

property variable_to_input_tensor_index: MappingProxyType: Return the mapping between variable name and input tensor index.

property variable_to_output_tensor_index: MappingProxyType: Return the mapping between variable name and output tensor index.

property input_tensor_index_to_variable: MappingProxyType: Return the mapping between input tensor index and variable name.

property output_tensor_index_to_variable: MappingProxyType: Return the mapping between output tensor index and variable name.
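
The four mapping properties above come in inverse pairs (name to index, index to name). The sketch below shows that relationship with a hand-made mapping. The variable names and indices are invented for illustration and do not come from a real checkpoint; only `MappingProxyType` itself is taken from the signatures above.

```python
from types import MappingProxyType

# Hypothetical name -> index mapping, standing in for
# `variable_to_input_tensor_index` (values invented for illustration).
variable_to_index = MappingProxyType({"2t": 0, "10u": 1, "10v": 2})

# Inverting it gives the shape of `input_tensor_index_to_variable`.
index_to_variable = MappingProxyType(
    {index: name for name, index in variable_to_index.items()}
)

# A MappingProxyType is a read-only view: item assignment raises TypeError.
try:
    variable_to_index["msl"] = 3
    is_read_only = False
except TypeError:
    is_read_only = True
```

Because the mappings are read-only views, code that needs to modify one should copy it into a plain `dict` first.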

property number_of_grid_points: int: Return the number of grid points per field.

property number_of_input_features: int: Return the number of input features.

property model_computed_variables: tuple: The initial-condition variables that must be computed rather than retrieved.

property multi_step_input: int: Number of past steps needed for the initial conditions tensor.
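
As a rough illustration of how `lagged`, `timestep` and `multi_step_input` relate, the sketch below derives a list of lagged offsets from the other two values. The formula (one offset per past step, counted backwards from the reference time and sorted oldest first) is an assumption about the property's behaviour, not a copy of the library code. The helper name `lagged_steps` and the 6-hour step are hypothetical.

```python
from datetime import timedelta


def lagged_steps(timestep: timedelta, multi_step_input: int) -> list[timedelta]:
    """Return one offset per input step, oldest first (assumed convention)."""
    # Step 0 is the reference time; earlier steps are negative multiples.
    return sorted(-i * timestep for i in range(multi_step_input))


# Two input steps, six hours apart (illustrative values).
steps = lagged_steps(timedelta(hours=6), 2)
```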

property multi_step_output: int: Number of future steps predicted by a single model forward pass.

property prognostic_output_mask: ndarray[tuple[Any, ...], dtype[Any]]: Return the prognostic output mask.

property prognostic_input_mask: ndarray[tuple[Any, ...], dtype[Any]]: Return the prognostic input mask.

property computed_time_dependent_forcings: tuple[ndarray, list]: Return the indices and names of the computed forcings that are not constant in time.

Deprecated since version 0.6.4: This will be removed in 0.7.0. Use select_variables_and_mask instead.

property computed_constant_forcings: tuple[ndarray[tuple[Any, ...], dtype[Any]], list[str]]: Return the indices and names of the computed forcings that are constant in time.

Deprecated since version 0.6.4: This will be removed in 0.7.0. Use select_variables_and_mask instead.

has_supporting_array(name: str) → bool

Check if the metadata has a supporting array with the given name.

Parameters:: name (str) – The name of the supporting array.
Returns:: True if the supporting array exists, False otherwise.
Return type:: bool

property variables: tuple: Return the variables as found in the training dataset.

property variables_metadata: dict[str, Any]: Return the variables and their metadata as found in the training dataset.

property diagnostic_variables: list: Variables that are marked as diagnostic.

Deprecated since version 0.6.4: This will be removed in 0.7.0. Use select_variables instead.

property prognostic_variables: list: Variables that are marked as prognostic.

Deprecated since version 0.6.4: This will be removed in 0.7.0. Use select_variables instead.

property index_to_variable: MappingProxyType: Return a mapping from index to variable name.

property typed_variables: dict[str, Variable]: Return the strongly typed variables.

property accumulations: list: Return the indices of the variables that are accumulations.

name_fields(fields: FieldList, namer: Callable[[...], str] | None = None) → FieldList

Name fields using the provided namer.

Parameters:

fields (FieldList) – The fields to name.
namer (callable, optional) – The namer function, by default None.

Returns:

The named fields.

Return type:

FieldList

sort_by_name(fields: FieldList, *args: Any, namer: Callable[[...], Any] | None = None, **kwargs: Any) → FieldList

Sort fields by name.

Parameters:

fields (ekd.FieldList) – The fields to sort.
args (Any) – Additional arguments.
namer (callable, optional) – The namer function, by default None.
kwargs (Any) – Additional keyword arguments.

Returns:

The sorted fields.

Return type:

ekd.FieldList

default_namer(*args: Any, **kwargs: Any) → Callable[[...], str]

Return a callable that can be used to name earthkit-data fields.

Parameters:

args (Any) – Additional arguments.
kwargs (Any) – Additional keyword arguments.

Returns:

The namer function.

Return type:

Callable

property grid: str | None: Return the grid information.

property area: str | None: Return the area information.

select_variables(*, include: list[str] | None = None, exclude: list[str] | None = None, has_mars_requests: bool = False) → list[str]

Get variables from input.

Parameters:

include (List[str]) – Categories to include.
exclude (List[str]) – Categories to exclude.
has_mars_requests (bool) – If True, only include variables that have MARS requests.

Returns:

The list of variables.

Return type:

List[str]
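
The text above does not spell out the exact matching rule for `include` and `exclude`. The sketch below shows one plausible reading, in which a variable is kept if it belongs to at least one included category and to none of the excluded ones. The function `select_by_category`, the category names and the variable names are all hypothetical stand-ins, not the library's implementation.

```python
def select_by_category(categories, include=None, exclude=None):
    """Filter variable names by category (assumed any-match semantics)."""
    selected = []
    for name, cats in categories.items():
        # Keep only variables sharing at least one category with `include`.
        if include is not None and not cats & set(include):
            continue
        # Drop variables sharing any category with `exclude`.
        if exclude is not None and cats & set(exclude):
            continue
        selected.append(name)
    return selected


# Invented variable -> categories table, shaped like dict[str, set[str]].
categories = {
    "2t": {"prognostic"},
    "tp": {"diagnostic", "accumulation"},
    "cos_latitude": {"computed", "forcing", "constant"},
}

kept = select_by_category(
    categories, include=["prognostic", "diagnostic"], exclude=["accumulation"]
)
```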

mars_input_requests() → Iterator[dict[str, Any]]

Generate MARS input requests.

Returns:: The MARS requests.
Return type:: Iterator[DataRequest]

mars_by_levtype(levtype: str) → tuple[set, set]

Get MARS parameters and levels by levtype.

Parameters:: levtype (str) – The levtype to filter by.
Returns:: The parameters and levels.
Return type:: tuple
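
To make the `tuple[set, set]` return shape concrete, the sketch below collects parameters and levels for one levtype from a small hand-written table. The table layout (`param`, `levtype`, `levelist` keys), the helper name `params_and_levels` and all values are assumptions made for illustration. The real method reads this information from the checkpoint metadata.

```python
def params_and_levels(variables: dict, levtype: str) -> tuple[set, set]:
    """Collect the parameters and levels of all variables with this levtype."""
    params, levels = set(), set()
    for info in variables.values():
        if info.get("levtype") != levtype:
            continue
        params.add(info["param"])
        # Surface variables carry no level in this illustrative table.
        if "levelist" in info:
            levels.add(info["levelist"])
    return params, levels


# Invented per-variable MARS information.
variables = {
    "2t": {"param": "2t", "levtype": "sfc"},
    "t_850": {"param": "t", "levtype": "pl", "levelist": 850},
    "t_500": {"param": "t", "levtype": "pl", "levelist": 500},
    "z_500": {"param": "z", "levtype": "pl", "levelist": 500},
}

pl_params, pl_levels = params_and_levels(variables, "pl")
```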

mars_requests(*, variables: list[str], dates: list[str | datetime | int], use_grib_paramid: bool = False, always_split_time: bool = False, patch_request: Callable[[dict[str, Any]], dict[str, Any]] | None = None, dont_fail_for_missing_paramid: bool = False, **kwargs: Any) → list[dict[str, Any]]

Generate MARS requests for the given variables and dates.

Parameters:

variables (list[str]) – The list of variables.
dates (list[Date]) – The list of dates.
use_grib_paramid (bool, optional) – Whether to use GRIB paramid, by default False.
always_split_time (bool, optional) – Whether to always split time, by default False.
patch_request (Optional[Callable], optional) – A callable to patch the request, by default None.
dont_fail_for_missing_paramid (bool, optional) – Whether to not fail for missing param ids, by default False.
**kwargs (Any) – Additional keyword arguments.

Returns:

The list of MARS requests.

Return type:

List[DataRequest]
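
According to the signature, `patch_request` is a callable that takes one request dictionary and returns a request dictionary. The sketch below shows a callable of that shape applied to a hand-written request. The function name `force_operational_class` and the keys and values of the sample request are illustrative assumptions, since the exact contents of the generated requests are not described here.

```python
from typing import Any


def force_operational_class(request: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the request with the `class` key set."""
    patched = dict(request)  # copy, so the caller's dict is left untouched
    patched["class"] = "od"
    return patched


# Hand-written stand-in for one generated request.
sample_request = {"param": ["2t"], "levtype": "sfc", "date": "20240101", "time": "0000"}
patched_request = force_operational_class(sample_request)
```

Returning a modified copy rather than mutating the argument keeps the callable safe to use even if the same request object is reused elsewhere.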

simple_mars_requests(*, variables: list[str]) → Iterator[dict[str, Any]]

Generate MARS requests for the given variables.

Parameters:: variables (list) – The list of variables.
Returns:: The MARS requests.
Return type:: Iterator[DataRequest]
Raises:: ValueError – If no variables are requested or if a variable is not found in the metadata.

report_error() → None: Report an error with provenance information.

validate_environment(*, all_packages: bool = False, on_difference: Literal['warn', 'error', 'ignore', 'return'] = 'warn', exempt_packages: list[str] | None = None) → bool | str

Validate environment of the checkpoint against the current environment.

Parameters:

all_packages (bool, optional) – Check all packages in the environment (True) or just anemoi’s (False), by default False.
on_difference (Literal['warn', 'error', 'ignore', 'return'], optional) – What to do on difference, by default “warn”.
exempt_packages (list[str], optional) – List of packages to exempt from the check, by default EXEMPT_PACKAGES.

Returns:

If on_difference is not ‘return’, a boolean: True if the environment is valid, False otherwise. If on_difference is ‘return’, the formatted text of the differences.

Return type:

Union[bool, str]

Raises:

RuntimeError – If found difference and on_difference is ‘error’
ValueError – If on_difference is not one of ‘warn’, ‘error’, ‘ignore’ or ‘return’.
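
The sketch below illustrates how the `on_difference` modes described above could be dispatched once differences have been found. It is a stand-in written from the parameter and return descriptions, not the library's code. The function name `handle_difference`, the sample message, and the choice to return False in the ‘warn’ and ‘ignore’ modes are assumptions.

```python
import logging

LOG = logging.getLogger("environment_check_sketch")


def handle_difference(text: str, on_difference: str = "warn"):
    """React to a detected environment difference (assumed behaviour)."""
    if on_difference == "warn":
        LOG.warning(text)
        return False
    if on_difference == "error":
        raise RuntimeError(text)
    if on_difference == "ignore":
        return False
    if on_difference == "return":
        # Hand the formatted differences back to the caller.
        return text
    raise ValueError(f"Invalid value for on_difference: {on_difference!r}")


report = handle_difference("package versions differ", "return")
```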

open_dataset(*, use_original_paths: bool | None = None, from_dataloader: str | None = None) → tuple[Any, Any]

Open the dataset.

Parameters:

use_original_paths (bool, optional) – Whether to use the original paths, by default None.
from_dataloader (str, optional) – The dataloader to use, by default None.

Returns:

The opened dataset and its arguments.

Return type:

tuple

open_dataset_args_kwargs(*, use_original_paths: bool, from_dataloader: str | None = None) → tuple[Any, Any]

Get the arguments and keyword arguments for opening the dataset.

Parameters:

use_original_paths (bool) – Whether to use the original paths.
from_dataloader (str, optional) – The dataloader to use, by default None.

Returns:

The arguments and keyword arguments.

Return type:

tuple

variable_categories() → dict

Get the categories of variables.

Returns:: The categories of variables.
Return type:: dict

load_supporting_array(name: str) → ndarray[tuple[Any, ...], dtype[Any]]

Load a supporting array by name.

Parameters:: name (str) – The name of the supporting array.
Returns:: The supporting array.
Return type:: FloatArray
Raises:: ValueError – If the supporting array is not found.
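
Since `load_supporting_array` raises ValueError for an unknown name, callers can check with `has_supporting_array` first. The sketch below shows that pattern against a small stand-in class. `ArrayStore` and `load_or_default` are hypothetical names that only mimic the two documented methods, and plain lists are used in place of NumPy arrays to keep the example free of dependencies.

```python
class ArrayStore:
    """Hypothetical stand-in exposing the two supporting-array methods."""

    def __init__(self, arrays: dict):
        self._arrays = dict(arrays)

    def has_supporting_array(self, name: str) -> bool:
        return name in self._arrays

    def load_supporting_array(self, name: str):
        if name not in self._arrays:
            raise ValueError(f"Supporting array not found: {name!r}")
        return self._arrays[name]


def load_or_default(store, name: str, default=None):
    """Load an array if present, otherwise return a default value."""
    if store.has_supporting_array(name):
        return store.load_supporting_array(name)
    return default


store = ArrayStore({"latitudes": [45.0, 46.0], "longitudes": [5.0, 6.0]})
latitudes = load_or_default(store, "latitudes")
missing = load_or_default(store, "grid_points_mask")
```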

property supporting_arrays: dict[str, ndarray[tuple[Any, ...], dtype[Any]]]: Return the supporting arrays.

property latitudes: ndarray[tuple[Any, ...], dtype[Any]] | None: Return the latitudes.

property longitudes: ndarray[tuple[Any, ...], dtype[Any]] | None: Return the longitudes.

property grid_points_mask: ndarray[tuple[Any, ...], dtype[Any]] | None: Return the grid points mask.

provenance_training() → dict[str, Any]

Get the environmental configuration when trained.

Returns:: The environmental configuration.
Return type:: dict

sources(path: str) → list

Get the sources from the metadata.

Parameters:: path (str) – The path to the sources.
Returns:: The list of sources.
Return type:: list
Raises:: ValueError – If not all paths were fixed.

print_variable_categories(print=<bound method Logger.info of <Logger anemoi.inference.metadata (WARNING)>>) → None: Print the variable categories for debugging purposes.

patch(patch: dict) → None

Patch the metadata with the given patch.

Parameters:: patch (dict) – The patch to apply.
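
How a patch is combined with the existing metadata is not described above. A common approach for nested dictionaries, shown here purely as an assumption, is a recursive merge in which nested dictionaries are merged key by key and every other value is overwritten. The helper name `merge` and the keys used below are invented for illustration.

```python
def merge(main: dict, patch: dict) -> None:
    """Recursively merge `patch` into `main`, modifying `main` in place."""
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(main.get(key), dict):
            # Both sides are dicts: descend instead of replacing wholesale.
            merge(main[key], value)
        else:
            main[key] = value


# Invented metadata fragment and patch.
metadata = {"dataset": {"name": "old", "frequency": "6h"}, "seed": 1}
merge(metadata, {"dataset": {"name": "new"}, "extra": True})
```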

class anemoi.inference.metadata.SingleDatasetMetadata(metadata: dict[str, Any], supporting_arrays: dict[str, ndarray[tuple[Any, ...], dtype[Any]]] = {})

Bases: Metadata

Legacy single-dataset metadata.

class anemoi.inference.metadata.MultiDatasetMetadata(metadata: dict[str, Any], supporting_arrays: dict[str, dict[str, ndarray[tuple[Any, ...], dtype[Any]]]] = {}, dataset_name='data')

Bases: Metadata

Map metadata for a multi-dataset checkpoint to a specific dataset name.

property dataset_names: list: List of canonical dataset names.

property timestep: timedelta: Return the model time step.

property multi_step_input: int: Number of past steps needed for the initial conditions tensor.

property multi_step_output: int: Number of future steps predicted by a single model forward pass.

property input_explicit_times: Any: Explicit times of the input steps used for the temporal downscaler.

property target_explicit_times: Any: Explicit times of the target steps used for the temporal downscaler.

property variable_to_input_tensor_index: MappingProxyType: Return the mapping between variable name and input tensor index.

property variable_to_output_tensor_index: MappingProxyType: Return the mapping between variable name and output tensor index.

property input_tensor_index_to_variable: MappingProxyType: Return the mapping between input tensor index and variable name.

property output_tensor_index_to_variable: MappingProxyType: Return the mapping between output tensor index and variable name.

variable_categories() → dict[str, set[str]]

Get the categories of variables.

Returns:: The categories of variables.
Return type:: dict

class anemoi.inference.metadata.SourceMetadata(parent: Metadata, name: str, metadata: dict, supporting_arrays: dict = {})

Bases: Metadata

An object that holds metadata of a source. It is only the dataset and supporting_arrays parts of the metadata. The rest is forwarded to the parent metadata object.
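
The description above is a delegation pattern: the object answers a few questions itself and forwards the rest to its parent. The sketch below shows one generic way to express that in Python with `__getattr__`. It is not a claim about how SourceMetadata is implemented, and the classes `ParentSketch` and `SourceSketch` and all their attribute values are hypothetical.

```python
class ParentSketch:
    """Hypothetical parent holding model-wide information."""

    timestep = "6h"
    precision = 16


class SourceSketch:
    """Hypothetical source that owns its grid and forwards everything else."""

    def __init__(self, parent, name, latitudes):
        self.parent = parent
        self.name = name
        self.latitudes = latitudes

    def __getattr__(self, attribute):
        # Called only when normal lookup fails, so attributes set in
        # __init__ are served locally and the rest come from the parent.
        return getattr(self.parent, attribute)


source = SourceSketch(ParentSketch(), "era5", [45.0, 46.0])
local_value = source.latitudes
forwarded_value = source.timestep
```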

property latitudes: ndarray[tuple[Any, ...], dtype[Any]] | None: Return the latitudes.

property longitudes: ndarray[tuple[Any, ...], dtype[Any]] | None: Return the longitudes.

property grid_points_mask: ndarray[tuple[Any, ...], dtype[Any]] | None: Return the grid points mask.