Inputs
The input methods listed below are used to fetch the initial conditions
of a model run. They will also be the source of forcings during the
model run, unless a forcing entry is specified in the configuration
(see Forcings).
Datasets
You can use the dataset that was used during training as input. It can
be test, training or validation, corresponding to the entries given
during training as dataloader.test, dataloader.training and
dataloader.validation respectively.
input: test
test is the default input if no input is specified.
If the training happened on a different computer and the dataset files
are not available on the current computer, you can use
anemoi-datasets’s Configuration to define a search path to the
datasets. To enable this, you have to set the use_original_paths
option to false.
input:
  test:
    use_original_paths: false
You can also provide a full dataset specification as follows:
input:
  dataset:
    join:
      - dataset: dataset-1
        select: [ 2t, tp ]
      - dataset: dataset-2
        drop: [ 2t, tp ]
See Opening datasets in the documentation of the anemoi-datasets package for more information on how to open datasets.
grib
You can specify the input as grib to read the data from a GRIB file.
input:
  grib: /path/to/grib/file.grib
For more options, see GRIB input.
icon_grib_file
The icon_grib_file input is a class dedicated to reading ICON GRIB
files. It is used as follows:
input:
  icon_grib_file:
    path: /path/to/grib/file.grib
    grid: icon_grid_0026_R03B07_G.nc
    refinement_level_c: 5
The grid entry refers to a NetCDF file that contains the definition
of the ICON grid in radians. The refinement_level_c parameter is
used to specify the refinement level of the ICON grid. The
icon_grib_file input also accepts the namer parameter of the
GRIB input.
Note
Once the grids are stored in the checkpoint by Anemoi, the
icon_grib_file input will become obsolete.
mars
You can also specify the input as mars to read the data from ECMWF’s
MARS archive. This requires the ecmwf-api-client package to be
installed, and the user to have an ECMWF account.
input: mars
You can also specify some of the MARS keywords as options. The default
is to retrieve the data from the operational analysis (class=od).
You can change that to use ERA5 reanalysis data (class=ea).
input:
  mars:
    class: ea
The mars input also accepts the namer parameter of the GRIB input.
cds
You can also specify the input as cds to read the data from the
Climate Data Store. This requires the cdsapi package to be installed,
and the user to have a CDS account.
input:
  cds:
    dataset: ???
As the CDS contains a plethora of datasets, you can specify the dataset you want to use with the dataset key.
This can be a str, in which case the dataset is used for all requests, or a dict with any number of levels, which is descended based on the key/value pairs of each request.
You can use * to match any value of a key that is not listed explicitly, e.g. set a dataset for param: 2t and use param: * to cover any other param.
input:
  cds:
    # Dataset examples
    ## As a string
    dataset:
      'reanalysis-era5-pressure-levels'
    ## As a simple dictionary
    dataset:
      levtype:
        pl: reanalysis-era5-pressure-levels
        sfc: reanalysis-era5-single-levels
    ## As a complex dictionary
    dataset:
      stream:
        oper:
          levtype:
            pl: reanalysis-era5-pressure-levels
            sfc: reanalysis-era5-single-levels
        an:
          # ... Other datasets
        '*': # Any other stream
          # ... Other datasets
In the above example, the dataset reanalysis-era5-pressure-levels is used for all requests with levtype: pl, and reanalysis-era5-single-levels is used for all requests with levtype: sfc.
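The way a nested dataset entry is descended for each request can be sketched in Python. This is only an illustration of the behaviour described above, not the anemoi-inference implementation: the helper name resolve_dataset, the rule that an exact match is tried before '*', and the placeholder name other-dataset are all assumptions.

```python
from typing import Any, Mapping, Union

# A dataset entry is either a plain name or {key: {value: sub-entry, ...}}.
DatasetSpec = Union[str, Mapping[str, Any]]


def resolve_dataset(spec: DatasetSpec, request: Mapping[str, Any]) -> str:
    """Hypothetical helper: pick the CDS dataset name for one request."""
    # A plain string applies to every request, so the loop is skipped.
    while not isinstance(spec, str):
        if len(spec) != 1:
            raise ValueError("each level must name exactly one request key")
        key, choices = next(iter(spec.items()))
        value = request.get(key)
        if value in choices:
            spec = choices[value]  # exact match on the request value
        elif "*" in choices:
            spec = choices["*"]  # assumption: '*' is the fallback
        else:
            raise KeyError(f"no dataset configured for {key}={value!r}")
    return spec


spec = {
    "stream": {
        "oper": {
            "levtype": {
                "pl": "reanalysis-era5-pressure-levels",
                "sfc": "reanalysis-era5-single-levels",
            }
        },
        "*": "other-dataset",  # placeholder, not a real CDS dataset name
    }
}

print(resolve_dataset(spec, {"stream": "oper", "levtype": "pl"}))
print(resolve_dataset(spec, {"stream": "enda", "levtype": "pl"}))
```

With this sketch, a request whose stream is not listed falls through to the '*' entry, while a request with stream: oper is resolved one level further using its levtype.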
Additionally, any extra key can be passed and will be added to all requests, e.g. for ERA5 data, product_type: 'reanalysis' is needed.
input:
  cds:
    dataset:
      levtype:
        pl: reanalysis-era5-pressure-levels
        sfc: reanalysis-era5-single-levels
    product_type: 'reanalysis'
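As a rough illustration of how such extra keys could end up in every request, the Python sketch below merges them into a per-request dictionary. The helper name with_extra_keys and the choice to let per-request keys take precedence on conflict are assumptions made for the example, not documented behaviour.

```python
from typing import Any, Dict, Mapping


def with_extra_keys(
    request: Mapping[str, Any], extra: Mapping[str, Any]
) -> Dict[str, Any]:
    """Hypothetical helper: add the configured extra keys to one request."""
    merged = dict(extra)
    merged.update(request)  # assumption: per-request keys win on conflict
    return merged


# Extra keys as they would appear next to `dataset` in the cds entry.
extra = {"product_type": "reanalysis"}

request = with_extra_keys({"param": "2t", "levtype": "sfc"}, extra)
print(request)
```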