Migration System

This page gives general information on how the migration system works. It is intended for contributors who need to write a migration script, users who want to understand how their checkpoints are updated, and future contributors to the migration code.

The goal of the migration system is to let users keep a checkpoint trained with one version of anemoi-models and use it with a newer or older version, even when a change in between would otherwise have broken the checkpoint.

This spares users from retraining a full model just because a layer has been renamed. It also gives contributors more flexibility to make changes they would otherwise have avoided for fear of breaking existing checkpoints.

General Overview

Migrations are stored in anemoi-models in migrations.scripts as an ordered list of scripts. Each script contains:

  • some metadata, such as the version of the migration system and the version of anemoi-models,

  • a migrate function to migrate checkpoints,

  • optionally, a migrate_setup function to fix import issues.
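As an illustration, a migration script that renames a layer could look roughly like the sketch below. It is not taken from anemoi-models: MigrationMetadata is redefined locally as a stand-in so that the example runs on its own, and the keys of the versions dict, the version numbers, the layer names and the "state_dict" checkpoint layout are all assumptions.

```python
from dataclasses import dataclass
from typing import Any, MutableMapping


# Stand-in for anemoi.models.migrations.migrator.MigrationMetadata, redefined
# here only so that the sketch runs without anemoi-models installed.
@dataclass
class MigrationMetadata:
    versions: dict
    final: bool = False


# Metadata of the script. The dict keys and version numbers are placeholders.
metadata = MigrationMetadata(
    versions={"migration": "1.0.0", "anemoi-models": "0.9.1"},
)


def migrate(ckpt: MutableMapping[str, Any]) -> MutableMapping[str, Any]:
    """Rename a layer prefix in the stored weights (hypothetical layer names)."""
    old_prefix, new_prefix = "model.encoder_old.", "model.encoder."
    state_dict = ckpt["state_dict"]  # assumed checkpoint layout
    # Iterate over a copy of the keys because the dict is modified in the loop.
    for key in list(state_dict):
        if key.startswith(old_prefix):
            state_dict[new_prefix + key[len(old_prefix):]] = state_dict.pop(key)
    return ckpt
```

The migrate function receives the checkpoint as a mapping and returns it, which matches the callback signature documented for Migration.migrate further down.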

Similarly, the checkpoint contains migration information that describes its migration state:

  • the name of the migration: corresponds to the filename of the script in anemoi-models,

  • the metadata: same as in the migration scripts,

  • the signature: a hash digest of the original migration script. This is used to detect whether already executed scripts have changed. For now, a mismatch only logs a warning, but a more complex behavior could be added in the future.
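The signature check can be pictured with the following minimal sketch. The choice of SHA-256 and the function names script_signature and has_changed are assumptions made for the example; the text above only states that the signature is a hash digest and that a mismatch logs a warning.

```python
import hashlib
import logging

LOGGER = logging.getLogger(__name__)


def script_signature(source: str) -> str:
    """Return a hex digest of the script source (SHA-256 is an assumption)."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def has_changed(name: str, stored_signature: str, current_source: str) -> bool:
    """Warn, without failing, when an executed script no longer matches."""
    changed = script_signature(current_source) != stored_signature
    if changed:
        LOGGER.warning("Migration %s changed since it was executed.", name)
    return changed
```

Comparing digests rather than full sources keeps the information stored in the checkpoint small while still detecting any edit to an already executed script.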

Compatibility groups

Some changes cannot be migrated, for example a change in architecture that adds trainable weights. When this happens, a “final” migration script needs to be created. “Final” migrations act as separators between groups of migrations that are compatible with one another. For example, consider this list of migrations in anemoi-models:

Name               Version    Compatibility group

migration 1        0.8.1      1

migration 2        0.8.3      1

final migration    0.9.0      2

migration 3        0.10.5     2

final migration    0.12.0     3

migration 4        0.12.2     3

The listing also shows the compatibility groups, each of which gathers migrations that are compatible with one another.

For example, for a checkpoint trained on version 0.8.1, migration 1 is already registered in the checkpoint. This checkpoint can be migrated to be used with all versions of its compatibility group (group 1) up until (and excluding) 0.9.0.

Similarly, a checkpoint trained on version 0.12.2 can be downgraded up until (and including) 0.12.0.
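The grouping rule can be sketched as follows, using the migrations listed above. The tuple layout and the helper names compatibility_groups and versions_in_group are made up for this example; the only rule taken from the text is that each “final” migration opens a new group.

```python
from typing import List, Tuple

# (name, version, is_final), in the order of the listing above.
MIGRATIONS: List[Tuple[str, str, bool]] = [
    ("migration 1", "0.8.1", False),
    ("migration 2", "0.8.3", False),
    ("final migration", "0.9.0", True),
    ("migration 3", "0.10.5", False),
    ("final migration", "0.12.0", True),
    ("migration 4", "0.12.2", False),
]


def compatibility_groups(migrations: List[Tuple[str, str, bool]]) -> List[int]:
    """Number the groups from 1; every final migration starts a new group."""
    group = 1
    groups = []
    for _name, _version, is_final in migrations:
        if is_final:
            group += 1
        groups.append(group)
    return groups


def versions_in_group(migrations: List[Tuple[str, str, bool]], group: int) -> List[str]:
    """Return the migration versions that belong to the given group."""
    pairs = zip(migrations, compatibility_groups(migrations))
    return [version for (_name, version, _final), g in pairs if g == group]
```

With this data, versions_in_group(MIGRATIONS, 3) starts at 0.12.0, which is why a checkpoint trained on 0.12.2 cannot be downgraded any further back.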

Note

Checkpoints only store the migration information of their own compatibility group. The “final” migration of a group can also be seen as the first migration of the following group. In fact, a “final” migration is always the first registered migration of its group, and acts as a marker of the compatibility group of the checkpoint. The first compatibility group is an exception and does not start with a “final” migration.

Resolution algorithm

The operations to execute are decided by the following resolution algorithm. To follow along, here is an example:

In anemoi-models    In the checkpoint

migration 1         migration 1

migration 2         migration 2

migration 5

migration 6

migration 7

  • First, we check whether the checkpoint contains extra migrations, that is, migrations that are not present in anemoi-models. If so, fail.

  • Then, we run every migration that is missing from the checkpoint, in order (here migrations 5, 6 and 7).

In the example, this produces:

  • MIGRATE migration 5

  • MIGRATE migration 6

  • MIGRATE migration 7
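The two steps above can be sketched as a small function. This is a simplified illustration, not the code of the migrator: it ignores compatibility groups and downgrades, the resolve function is hypothetical, and IncompatibleCheckpoint is a local stand-in for IncompatibleCheckpointException.

```python
from typing import List


class IncompatibleCheckpoint(Exception):
    """Local stand-in for IncompatibleCheckpointException."""


def resolve(in_models: List[str], in_ckpt: List[str]) -> List[str]:
    """Return the operations to run, in the order of the anemoi-models list."""
    known = set(in_models)
    # Step 1: a migration registered in the checkpoint but unknown to
    # anemoi-models means the checkpoint cannot be handled.
    extra = [name for name in in_ckpt if name not in known]
    if extra:
        raise IncompatibleCheckpoint(f"Unknown migrations in checkpoint: {extra}")
    # Step 2: run every migration the checkpoint has not executed yet.
    executed = set(in_ckpt)
    return [f"MIGRATE {name}" for name in in_models if name not in executed]
```

Called with the five migrations of the example on the anemoi-models side and migrations 1 and 2 on the checkpoint side, it returns the three MIGRATE operations listed above.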

Executed migrations

The whole history of migrations is stored in the metadata of the checkpoint. It can be accessed through:

>>> history = metadata.get("migrations", {}).get("history", [])
>>> for executed_migration in history:
...    print(executed_migration)
{ "type": "migrate", "name": "migration_name2.py", "signature": "[...]" }

Migrator

exception anemoi.models.migrations.migrator.IncompatibleCheckpointException

Bases: BaseException

The provided checkpoint cannot be migrated because it is too old or too recent.

exception anemoi.models.migrations.migrator.IncompleteMigrationScript

Bases: BaseException

The migration script is missing some mandatory content (metadata).

class anemoi.models.migrations.migrator.MigrationVersions

Bases: dict

class anemoi.models.migrations.migrator.MigrationMetadata(versions: MigrationVersions, final: bool = False)

Bases: object

Metadata object of the migration.

versions: MigrationVersions

Migration and anemoi-models versions.

final: bool = False

Whether the migration is final.

class anemoi.models.migrations.migrator.SerializedMigration

Bases: TypedDict

The serialized migration stored in the checkpoint

name: str

Name of the migration

signature: str

The signature of the script. Can be used to detect if a script changed.

class anemoi.models.migrations.migrator.Migration(name: str, metadata: MigrationMetadata, signature: str, migrate: Callable[[MutableMapping[str, Any]], MutableMapping[str, Any]] | None = None, migrate_setup: Callable[[MigrationContext], None] | None = None)

Bases: object

Represents a migration

name: str

Name of the migration

metadata: MigrationMetadata

Tracked metadata

signature: str

Signature of the migration. Can be used to detect if the script changed

migrate: Callable[[MutableMapping[str, Any]], MutableMapping[str, Any]] | None = None

Callback to execute the migration

migrate_setup: Callable[[MigrationContext], None] | None = None

Setup function to execute before loading the checkpoint. This can be used to mock missing modules or attributes.

classmethod from_serialized(migration: SerializedMigration) Migration

Alternative constructor that loads the migration from the serialized migration dict stored in the checkpoint. The resulting migration does not contain the migrate or migrate_setup callbacks, as they are not serialized.

Parameters:

migration (SerializedMigration) – The serialized migration dict

Returns:

The migration.

Return type:

Migration

serialize() SerializedMigration

Serialize this migration

Returns:

The serialized dict to store in the checkpoint.

Return type:

SerializedMigration
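The round trip between serialize and from_serialized can be illustrated with reduced stand-in classes. The sketch below keeps only the name and signature fields and leaves out the metadata for brevity; it mirrors the documented signatures but is not the anemoi-models implementation.

```python
from dataclasses import dataclass
from typing import Any, Callable, MutableMapping, Optional, TypedDict

Ckpt = MutableMapping[str, Any]


class SerializedMigration(TypedDict):
    """Reduced stand-in: only the fields used in this sketch."""

    name: str
    signature: str


@dataclass
class Migration:
    """Reduced stand-in for the Migration class documented above."""

    name: str
    signature: str
    migrate: Optional[Callable[[Ckpt], Ckpt]] = None

    @classmethod
    def from_serialized(cls, migration: SerializedMigration) -> "Migration":
        # Callbacks are not stored in the checkpoint, so migrate stays None.
        return cls(name=migration["name"], signature=migration["signature"])

    def serialize(self) -> SerializedMigration:
        # Only plain data is kept; the callback is dropped on purpose.
        return {"name": self.name, "signature": self.signature}
```

A migration loaded from a checkpoint can therefore be compared with the scripts shipped in anemoi-models by name and signature, but it cannot be executed on its own.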

class anemoi.models.migrations.migrator.MigrationOp(run: Callable[[MutableMapping[str, Any]], MutableMapping[str, Any]], migration: Migration)

Bases: object

Migration Operation

class anemoi.models.migrations.migrator.MissingAttribute(*args, **kwargs)

Bases: object

Placeholder type when encountering ImportError or AttributeError in Unpickler.find_class
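The general technique of substituting a placeholder in Unpickler.find_class can be sketched with the standard pickle module alone. This is an illustration under assumptions, not the code of the migrator: MissingAttribute is redefined locally, and TolerantUnpickler, tolerant_loads and the legacy_pkg module are hypothetical names.

```python
import io
import pickle
import sys
import types


class MissingAttribute:
    """Local stand-in placeholder; accepts and ignores any arguments."""

    def __init__(self, *args, **kwargs):
        pass


class TolerantUnpickler(pickle.Unpickler):
    """Unpickler that replaces unresolvable classes with the placeholder."""

    def find_class(self, module, name):
        try:
            return super().find_class(module, name)
        except (ImportError, AttributeError):
            return MissingAttribute


def tolerant_loads(data: bytes):
    return TolerantUnpickler(io.BytesIO(data)).load()


# Demo: pickle an object from a temporary module, then make the module vanish.
legacy = types.ModuleType("legacy_pkg")


class OldLayer:
    pass


OldLayer.__module__ = "legacy_pkg"
OldLayer.__qualname__ = "OldLayer"
legacy.OldLayer = OldLayer
sys.modules["legacy_pkg"] = legacy
payload = pickle.dumps({"layer": OldLayer()})
del sys.modules["legacy_pkg"]  # simulates a module removed in a newer version

restored = tolerant_loads(payload)
```

After loading, restored["layer"] is an instance of the placeholder rather than of the class that no longer exists, so the rest of the checkpoint remains readable.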

class anemoi.models.migrations.migrator.SaveCkpt(ckpt_dir: Path)

Bases: object

Useful for testing. Used in the save_ckpt fixture.

Setup Context

class anemoi.models.migrations.setup_context.MigrationContext

Bases: object

A context object allowing setup callbacks to access some utilities:

  • context.move_attribute("pkg.start.MyClass", "pkg.end.MyRenamedClass") to update paths to attributes.

  • context.move_module("pkg.start", "pkg.end") to move a full module.

  • context.delete_attribute("pkg.mod.MyClass") to remove a class. You can use “*” as a wildcard for the attribute name: context.delete_attribute("pkg.mod.*") will remove all attributes from the module.

delete_attribute(path: str) None

Indicate that an attribute has been deleted. Any class referencing this module will be replaced by a MissingAttribute object.

Parameters:

path (str) – Path to the attribute. For example pkg.mod.MyClass.

delete_module(path: str) None

Mark a module for deletion.

move_attribute(path_start: str, path_end: str) None

Move and rename an attribute between modules.

Parameters:
  • path_start (str) – Starting module path

  • path_end (str) – End module path

move_module(path_start: str, path_end: str) None

Move a module.

Parameters:
  • path_start (str) – Starting module path

  • path_end (str) – End module path
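A migrate_setup callback using these methods could look like the sketch below. RecordingContext is a hypothetical stand-in that only records calls, so that the example runs without anemoi-models; the real MigrationContext acts on checkpoint loading instead. All module and class paths are invented for the example.

```python
from typing import List, Tuple


class RecordingContext:
    """Stand-in exposing the four documented methods; it only records calls."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, ...]] = []

    def move_attribute(self, path_start: str, path_end: str) -> None:
        self.calls.append(("move_attribute", path_start, path_end))

    def move_module(self, path_start: str, path_end: str) -> None:
        self.calls.append(("move_module", path_start, path_end))

    def delete_attribute(self, path: str) -> None:
        self.calls.append(("delete_attribute", path))

    def delete_module(self, path: str) -> None:
        self.calls.append(("delete_module", path))


def migrate_setup(context) -> None:
    """Describe the import changes needed before the checkpoint is loaded."""
    # A class was renamed and moved to another module.
    context.move_attribute("pkg.start.MyClass", "pkg.end.MyRenamedClass")
    # A whole module was moved.
    context.move_module("pkg.old_layers", "pkg.layers")
    # Everything in a removed module is replaced by placeholders.
    context.delete_attribute("pkg.legacy.*")


context = RecordingContext()
migrate_setup(context)
```

Keeping these declarations in migrate_setup, separate from migrate, reflects the description above: the setup function runs before the checkpoint is loaded, while migrate operates on the loaded checkpoint.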