NumPy to NumPy API

The simplest way to run an inference from a checkpoint is to provide the initial state as a dictionary containing NumPy arrays for each input variable.

You then create a Runner object and call the run method, which will yield the state at each time step. Below is a simple code example to illustrate this:

import datetime

import numpy as np

from anemoi.inference.runners.simple import SimpleRunner

# Create a runner with the checkpoint file
runner = SimpleRunner("checkpoint.ckpt")

# Select a starting date
date = datetime.datetime(2024, 10, 25)

# Assuming that the initial conditions require two
# dates, e.g. T0 and T-6

multi_step_input = 2

# Define the grid

latitudes = np.linspace(90, -90, 181)  # 1 degree resolution
longitudes = np.linspace(0, 359, 360)

number_of_points = len(latitudes) * len(longitudes)
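latitudes, longitudes = np.meshgrid(latitudes, longitudes)
# Flatten so that each array has one entry per grid point, like the fields
latitudes, longitudes = latitudes.flatten(), longitudes.flatten()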

# Create the initial state

input_state = {
    "date": date,
    "latitudes": latitudes,
    "longitudes": longitudes,
    "fields": {
        "2t": np.random.rand(multi_step_input, number_of_points),
        "msl": np.random.rand(multi_step_input, number_of_points),
        "z_500": np.random.rand(multi_step_input, number_of_points),
        ...: ...,
    },
}

# Run the model

for state in runner.run(input_state=input_state, lead_time=240):
    # This is the date of the new state
    print("New state:", state["date"])

    # This is the value of a field for that date
    print("Forecasted 2t:", state["fields"]["2t"])

The field names are those that were provided at training time, which are the names given to the fields when the training dataset was created.

States

A state is a Python dictionary with the following keys:

  • date: a datetime.datetime object that represents the date at which the state is valid.

  • latitudes: a NumPy array with the list of latitudes that match the data values of the fields.

  • longitudes: a NumPy array with the corresponding list of longitudes. It must have the same size as the latitudes array.

  • fields: a dictionary that maps field names to their data.

Each field is given as a NumPy array. If the model is multi-step, it needs to be initialised with fields from two or more dates, so the values must be two-dimensional arrays with the shape (number-of-dates, number-of-grid-points). Otherwise, the values can be one-dimensional arrays. The first dimension is expected to represent the dates in ascending order, and the date entry of the state must be the last of those dates.
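The layout rules above can be checked before a state is handed to the runner. The following is a minimal sketch that uses only NumPy; check_state is a hypothetical helper written for this illustration, not part of anemoi-inference, and the four-point grid and field values are made up.

```python
import datetime

import numpy as np


def check_state(state, multi_step_input):
    """Hypothetical helper: raise ValueError if the state breaks the layout rules."""
    n_points = np.asarray(state["latitudes"]).size
    if np.asarray(state["longitudes"]).size != n_points:
        raise ValueError("latitudes and longitudes must have the same size")

    # Multi-step models need (number-of-dates, number-of-grid-points);
    # a single-step model may also be given a one-dimensional array.
    allowed = {(multi_step_input, n_points)}
    if multi_step_input == 1:
        allowed.add((n_points,))

    for name, values in state["fields"].items():
        shape = np.asarray(values).shape
        if shape not in allowed:
            raise ValueError(f"{name}: shape {shape}, expected one of {sorted(allowed)}")


# Made-up data on a four-point grid, for two dates (T-6 and T0)
n_points = 4
t_minus_6 = np.zeros(n_points)
t0 = np.ones(n_points)

state = {
    "date": datetime.datetime(2024, 10, 25),  # date of the last row (T0)
    "latitudes": np.array([90.0, 90.0, -90.0, -90.0]),
    "longitudes": np.array([0.0, 180.0, 0.0, 180.0]),
    # Rows are stacked in ascending date order: oldest first, T0 last
    "fields": {"2t": np.stack([t_minus_6, t0])},
}

check_state(state, multi_step_input=2)
print(state["fields"]["2t"].shape)
```

Stacking with np.stack keeps the oldest date in row 0 and the state date in the last row, which is the ordering described above.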

As it iterates, the model produces new states in the same format. The date is the forecast date, and the fields hold the forecast values as NumPy arrays. These arrays are one-dimensional (the number of grid points), even if the model is multi-step.
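Because every output state carries one-dimensional field arrays, a forecast can be gathered into a single array with one row per time step. The sketch below illustrates this with a stand-in generator; fake_run and collect_field are hypothetical names used only here, and the 6-hour step is an assumption for the example. With a real checkpoint, the iterable would be the result of runner.run(...).

```python
import datetime

import numpy as np


def fake_run(n_steps, n_points, start, step_hours=6):
    """Hypothetical stand-in for runner.run(): yields states in the format described above."""
    for i in range(1, n_steps + 1):
        yield {
            "date": start + datetime.timedelta(hours=i * step_hours),
            # One-dimensional array: one value per grid point
            "fields": {"2t": np.full(n_points, float(i))},
        }


def collect_field(states, name):
    """Gather one field from an iterable of states.

    Returns the list of dates and an array of shape (n_steps, n_points).
    """
    dates, values = [], []
    for state in states:
        dates.append(state["date"])
        values.append(np.asarray(state["fields"][name]))
    return dates, np.stack(values, axis=0)


start = datetime.datetime(2024, 10, 25)
dates, series = collect_field(fake_run(n_steps=4, n_points=5, start=start), "2t")
print(series.shape)
```

Copying the values out of each state as it is yielded avoids relying on the state dictionary staying unchanged between iterations.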

Checkpoints

Newer versions of anemoi-training store the latitudes and longitudes used during training in the checkpoint. In that case, the grid definition in the example code above can be simplified as follows:

latitudes = runner.checkpoint.latitudes
longitudes = runner.checkpoint.longitudes