Overview
Dimensions
Training datasets are large array-like objects encoded in Zarr format.
The array has the following dimensions:

The first dimension is time, the second is the variables (e.g. temperature, pressure), the third is the ensemble, and the fourth is the grid point values.
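As a rough illustration of this layout, the sketch below uses a NumPy array as a stand-in for a dataset. The sizes and the names `n_times`, `n_variables`, `n_ensemble` and `n_grid` are hypothetical and chosen only to make the axis order visible; a real dataset is far larger and is read from a Zarr store rather than created in memory.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
n_times, n_variables, n_ensemble, n_grid = 8, 3, 1, 100

# Stand-in for a dataset: axes are (time, variable, ensemble, grid point).
ds = np.zeros((n_times, n_variables, n_ensemble, n_grid), dtype=np.float32)

# Selecting one time step keeps the remaining three axes.
state = ds[0]
print(state.shape)  # (3, 1, 100)

# Selecting one variable at one time step, for the first ensemble member,
# gives one value per grid point.
field = ds[0, 1, 0]
print(field.shape)  # (100,)
```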
Chunking
“Chunks” are the basic unit of data storage in Zarr: a chunk is the smallest piece of data that is read from disk in a single operation.
By default, the array is chunked along the time dimension, so the whole state of the atmosphere at a given time is loaded in one go.
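To make the effect of this layout concrete, here is a minimal sketch in plain Python. It assumes that each chunk holds exactly one time step and spans every other axis; the array shape and the helper `chunks_read` are hypothetical and are not part of any library API.

```python
# Hypothetical array shape: (time, variable, ensemble, grid point).
shape = (8, 3, 1, 100)

# Assumed chunk layout: one chunk per time step, spanning every other axis.
chunks = (1,) + shape[1:]


def chunks_read(start: int, stop: int, time_chunk: int = chunks[0]) -> int:
    """Count the chunks touched when reading time steps start..stop-1."""
    first = start // time_chunk
    last = (stop - 1) // time_chunk
    return last - first + 1


print(chunks)             # (1, 3, 1, 100)
print(chunks_read(4, 5))  # 1
print(chunks_read(4, 6))  # 2
```

Under this assumption, reading one time step touches a single chunk, and reading two consecutive time steps touches two.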

This structure provides an efficient way to build the training dataset, because the input and the target of the model are simply consecutive slices of the array along the time dimension.
# A slice of two consecutive time steps unpacks into an input/target pair.
x, y = ds[n : n + 2]
y_hat = model.predict(x)
loss = model.loss(y, y_hat)