Inputs

The input methods listed below are used to fetch the initial conditions of a model run. They will also be the source of forcings during the model run, unless a forcing entry is specified in the configuration (see Forcings).

Datasets

You can use the dataset that was used during training as input. It can be test, training or validation, corresponding to the entries given during training as dataloader.test, dataloader.training and dataloader.validation respectively.

input: test

test is the default input if no input is specified.

If the training happened on a different computer and the dataset files are not available on the current computer, you can use anemoi-datasets’s Configuration to define a search path to the datasets. To enable this, you have to set the use_original_paths option to false.

input:
  test:
    use_original_paths: false

You can also provide a full dataset specification as follows:

input:
  dataset:
    join:
    - dataset: dataset-1
      select: [ 2t, tp ]
    - dataset: dataset-2
      drop: [ 2t, tp ]

See Opening datasets in the documentation of the anemoi-datasets package for more information on how to open datasets.

grib

You can specify the input as grib to read the data from a GRIB file.

input:
  grib: /path/to/grib/file.grib

For more options, see GRIB input.

icon_grib_file

The icon_grib_file input is a class dedicated to reading ICON GRIB files.

input:
  icon_grib_file:
    path: /path/to/grib/file.grib
    grid: icon_grid_0026_R03B07_G.nc
    refinement_level_c: 5

The grid entry refers to a NetCDF file that contains the definition of the ICON grid in radians. The refinement_level_c parameter is used to specify the refinement level of the ICON grid. The icon_grib_file input also accepts the namer parameter of the GRIB input.

Note

Once the grids are stored in the checkpoint by Anemoi, the icon_grib_file input will become obsolete.

mars

You can also specify the input as mars to read the data from ECMWF’s MARS archive. This requires the ecmwf-api-client package to be installed, and the user to have an ECMWF account.

input: mars

You can also specify some of the MARS keywords as options. The default is to retrieve the data from the operational analysis (class=od). You can change that to use ERA5 reanalysis data (class=ea).

input:
  mars:
    class: ea

The mars input also accepts the namer parameter of the GRIB input.

cds

You can also specify the input as cds to read the data from the Climate Data Store. This requires the cdsapi package to be installed, and the user to have a CDS account.

input:
  cds:
    dataset: ???

As the CDS contains a plethora of datasets, you can specify the dataset you want to use with the key dataset.

This can be a str, in which case the dataset is used for all requests, or a dict with any number of levels, which is descended based on the key/values of each request.

You can use * to represent any value not explicitly given for a key, e.g. set a dataset for param: 2t and use param: * to cover any other param.

input:
  cds:
    # Dataset examples
    ## As a string
    dataset:
      'reanalysis-era5-pressure-levels'

    ## As a simple dictionary
    dataset:
      levtype:
        pl: reanalysis-era5-pressure-levels
        sfc: reanalysis-era5-single-levels

    ## As a complex dictionary
    dataset:
      stream:
        oper:
          levtype:
            pl: reanalysis-era5-pressure-levels
            sfc: reanalysis-era5-single-levels
        an:
          # ... Other datasets
        '*': # Any other stream
          # ... Other datasets

In the above example, the dataset reanalysis-era5-pressure-levels is used for all requests with levtype: pl, and reanalysis-era5-single-levels is used for all requests with levtype: sfc.
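The lookup described above can be sketched in Python. This is only an illustration of the idea, not the actual anemoi-inference implementation: the function name resolve_dataset, the error handling, the assumption that request values are scalars, and the dataset name some-other-dataset are all hypothetical.

```python
from typing import Any, Mapping, Union

# A dataset specification is either a plain name or a nested mapping.
DatasetSpec = Union[str, Mapping[str, Any]]


def resolve_dataset(spec: DatasetSpec, request: Mapping[str, Any]) -> str:
    """Descend a nested dataset mapping using the values found in a request.

    Hypothetical helper: each mapping level is assumed to hold exactly one
    request key (e.g. "levtype") whose value maps request values to either
    a dataset name or a further nested level. Request values are assumed
    to be scalars (strings), not lists.
    """
    while not isinstance(spec, str):
        if not isinstance(spec, Mapping) or len(spec) != 1:
            raise ValueError(f"Expected a single-key mapping, got {spec!r}")
        # Unpack the only (key, choices) pair of this level.
        ((key, choices),) = spec.items()
        if not isinstance(choices, Mapping):
            raise ValueError(f"Expected a mapping of choices under {key!r}")
        value = request.get(key)
        if value in choices:
            spec = choices[value]   # exact match on the request value
        elif "*" in choices:
            spec = choices["*"]     # wildcard: any value not listed
        else:
            raise KeyError(f"No dataset for {key}={value!r} and no '*' entry")
    return spec


# Example specification; "some-other-dataset" is a made-up placeholder name.
spec = {
    "stream": {
        "oper": {
            "levtype": {
                "pl": "reanalysis-era5-pressure-levels",
                "sfc": "reanalysis-era5-single-levels",
            }
        },
        "*": "some-other-dataset",
    }
}

print(resolve_dataset(spec, {"stream": "oper", "levtype": "sfc"}))
# prints: reanalysis-era5-single-levels
print(resolve_dataset(spec, {"stream": "enda", "levtype": "pl"}))
# prints: some-other-dataset
```

In this sketch a plain string is returned unchanged, so the same function covers both the str and the dict forms of the dataset key.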

Additionally, any keyword argument can be passed and will be added to all requests, e.g. for ERA5 data, product_type: 'reanalysis' is needed.

input:
  cds:
    dataset:
      levtype:
        pl: reanalysis-era5-pressure-levels
        sfc: reanalysis-era5-single-levels
    product_type: 'reanalysis'