Distributed
graph
- anemoi.models.distributed.graph.shard_tensor(input_: Tensor, dim: int, shapes: tuple, mgroup: ProcessGroup, gather_in_backward: bool = True) Tensor
Shard tensor.
Keeps only the part of the tensor that is relevant for the current rank.
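A minimal sketch of calling shard_tensor, assuming torch.distributed has been initialised and mgroup is a valid model communication group. The structure of shapes (a tuple of per-rank shard shapes along the split dimension, built here with torch.tensor_split) is an assumption based on the signature, not confirmed API behaviour.

```python
import torch
import torch.distributed as dist

from anemoi.models.distributed.graph import shard_tensor


def shard_node_features(x: torch.Tensor, mgroup: dist.ProcessGroup) -> torch.Tensor:
    """Keep only this rank's slice of the node dimension (dim 0)."""
    world_size = dist.get_world_size(group=mgroup)
    # Assumed: `shapes` holds the shape of each rank's shard along dim 0.
    shapes = tuple(chunk.shape for chunk in torch.tensor_split(x, world_size, dim=0))
    return shard_tensor(x, dim=0, shapes=shapes, mgroup=mgroup)
```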
- anemoi.models.distributed.graph.gather_tensor(input_: Tensor, dim: int, shapes: tuple, mgroup: ProcessGroup) Tensor
Gather tensor.
Gathers tensor shards from all ranks in the model communication group.
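A hedged sketch of the shard/gather round trip: shard along dim 0, do some local work, then reassemble the full tensor with gather_tensor. As above, the exact contents of shapes are an assumption based on the documented signature; the local computation is a placeholder.

```python
import torch
import torch.distributed as dist

from anemoi.models.distributed.graph import gather_tensor, shard_tensor


def shard_then_gather(x: torch.Tensor, mgroup: dist.ProcessGroup) -> torch.Tensor:
    """Shard along dim 0, compute locally, then reassemble the full tensor."""
    world_size = dist.get_world_size(group=mgroup)
    shapes = tuple(chunk.shape for chunk in torch.tensor_split(x, world_size, dim=0))
    local = shard_tensor(x, dim=0, shapes=shapes, mgroup=mgroup)
    local = local * 2.0  # placeholder for sharded computation
    return gather_tensor(local, dim=0, shapes=shapes, mgroup=mgroup)
```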
- anemoi.models.distributed.graph.reduce_tensor(input_: Tensor, mgroup: ProcessGroup) Tensor
Reduce tensor.
Reduces tensor across ranks.
- Parameters:
input_ (Tensor) – Input tensor
mgroup (ProcessGroup) – model communication group
- Returns:
Reduced tensor.
- Return type:
Tensor
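A minimal sketch of reduce_tensor, following the documented signature: each rank holds a partial result computed on its shard and the reduction combines them across the model communication group. The loss-style use case is illustrative only.

```python
import torch
import torch.distributed as dist

from anemoi.models.distributed.graph import reduce_tensor


def reduce_partial(local_output: torch.Tensor, mgroup: dist.ProcessGroup) -> torch.Tensor:
    """Combine per-rank partial results into a single reduced tensor."""
    partial = local_output.sum(dim=0, keepdim=True)  # partial statistic on this rank's shard
    return reduce_tensor(partial, mgroup)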
- anemoi.models.distributed.graph.sync_tensor(input_: Tensor, dim: int, shapes: tuple, mgroup: ProcessGroup) Tensor
Sync tensor.
Performs a gather in the forward pass and an all-reduce followed by a split in the backward pass.
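A hedged sketch of where sync_tensor might be used: a layer that needs the full (unsharded) tensor in the forward pass while keeping gradients consistent with the sharded layout in the backward pass. The shapes bookkeeping and the follow-up operation are assumptions for illustration.

```python
import torch
import torch.distributed as dist

from anemoi.models.distributed.graph import sync_tensor


def apply_on_full_tensor(local: torch.Tensor, shapes: tuple, mgroup: dist.ProcessGroup) -> torch.Tensor:
    """Gather the full tensor for an op that needs it; backward stays gradient-consistent."""
    full = sync_tensor(local, dim=0, shapes=shapes, mgroup=mgroup)
    return torch.relu(full)  # placeholder for an operation on the unsharded tensor
```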
khop_edges
- anemoi.models.distributed.khop_edges.get_k_hop_edges(nodes: Tensor, edge_attr: Tensor, edge_index: Tensor | SparseTensor, num_hops: int = 1) tuple[Tensor | SparseTensor, Tensor]
Return the k-hop subgraph around the given nodes (1 hop by default).
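A small sketch of get_k_hop_edges on a toy graph, using the PyTorch Geometric edge_index convention (shape [2, num_edges]). The return order (edge index first, then edge attributes) is taken from the documented return type and is otherwise an assumption.

```python
import torch

from anemoi.models.distributed.khop_edges import get_k_hop_edges

edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 0]])  # 4 directed edges on a ring
edge_attr = torch.randn(4, 8)                            # one feature row per edge
local_nodes = torch.tensor([0, 1])                       # nodes held by this rank

sub_edge_index, sub_edge_attr = get_k_hop_edges(local_nodes, edge_attr, edge_index, num_hops=1)
```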
- anemoi.models.distributed.khop_edges.sort_edges_1hop_sharding(num_nodes: int | tuple[int, int], edge_attr: Tensor, edge_index: Tensor | SparseTensor, mgroup: ProcessGroup | None = None) tuple[Tensor | SparseTensor, Tensor, list, list]
Rearranges edges into 1-hop neighbourhoods for sharding across GPUs.
- Parameters:
num_nodes (int | tuple[int, int]) – number of nodes
edge_attr (Tensor) – edge attributes
edge_index (Tensor | SparseTensor) – edge index
mgroup (ProcessGroup, optional) – model communication group
- Returns:
edges sorted into 1-hop neighbourhoods, edge attributes of the sorted edges, shapes of the edge-index shards for partitioning between GPUs, shapes of the edge-attribute shards for partitioning between GPUs
- Return type:
tuple[Tensor | SparseTensor, Tensor, list, list]
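A hedged sketch of sort_edges_1hop_sharding on the same toy graph; the unpacking order follows the Returns entry above, and mgroup=None is used here to stand in for a single-GPU run, which is an assumption based on the optional parameter.

```python
import torch

from anemoi.models.distributed.khop_edges import sort_edges_1hop_sharding

num_nodes = 4
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 0]])
edge_attr = torch.randn(4, 8)

sorted_edge_index, sorted_edge_attr, index_shapes, attr_shapes = sort_edges_1hop_sharding(
    num_nodes, edge_attr, edge_index, mgroup=None
)
```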
- anemoi.models.distributed.khop_edges.sort_edges_1hop_chunks(num_nodes: int | tuple[int, int], edge_attr: Tensor, edge_index: Tensor | SparseTensor, num_chunks: int) tuple[list[Tensor], list[Tensor | SparseTensor]]
Rearranges edges into 1-hop neighbourhood chunks.
- Parameters:
num_nodes (int | tuple[int, int]) – number of nodes
edge_attr (Tensor) – edge attributes
edge_index (Tensor | SparseTensor) – edge index
num_chunks (int) – number of chunks
- Returns:
list of sorted edge-attribute chunks, list of sorted edge_index chunks
- Return type:
tuple[list[Tensor], list[Tensor | SparseTensor]]
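A hedged sketch of sort_edges_1hop_chunks, e.g. to process edges chunk by chunk and bound peak memory during message passing. The unpacking order (attribute chunks first, then edge_index chunks) follows the Returns entry above; the per-chunk loop body is a placeholder.

```python
import torch

from anemoi.models.distributed.khop_edges import sort_edges_1hop_chunks

edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 0]])
edge_attr = torch.randn(4, 8)

attr_chunks, index_chunks = sort_edges_1hop_chunks(4, edge_attr, edge_index, num_chunks=2)
for attr, idx in zip(attr_chunks, index_chunks):
    pass  # process one 1-hop neighbourhood chunk at a time
```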
shapes
transformer
- anemoi.models.distributed.transformer.shard_heads(input_: Tensor, shapes: list, mgroup: ProcessGroup) Tensor
Shard heads.
Gathers e.g. a query, key or value tensor along the sequence dimension via all-to-all communication and shards it along the head dimension for parallel self-attention computation. Expected format is (batch_size, … heads, sequence_length, channels).
- Parameters:
input_ (Tensor) – Input tensor
shapes (list) – shapes of shards
mgroup (ProcessGroup) – model communication group
- Returns:
Sharded heads.
- Return type:
Tensor
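A minimal sketch of shard_heads: a query tensor that enters sharded along the sequence dimension is re-sharded along the head dimension before the attention matmuls. The (batch_size, heads, sequence_length, channels) layout follows the expected format stated above; assuming shapes holds the per-rank sequence shard shapes.

```python
import torch
import torch.distributed as dist

from anemoi.models.distributed.transformer import shard_heads


def to_head_parallel(q: torch.Tensor, shapes: list, mgroup: dist.ProcessGroup) -> torch.Tensor:
    # q: (batch_size, num_heads, local_sequence_length, channels)
    return shard_heads(q, shapes=shapes, mgroup=mgroup)
```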
- anemoi.models.distributed.transformer.shard_sequence(input_: Tensor, shapes: list, mgroup: ProcessGroup) Tensor
Shard sequence.
Gathers e.g. a query, key or value tensor along the head dimension via all-to-all communication and shards it along the sequence dimension for parallel MLP and LayerNorm computation. Expected format is (batch_size, … heads, sequence_length, channels).
- Parameters:
input_ (Tensor) – Input tensor
shapes (list) – shapes of shards
mgroup (ProcessGroup) – model communication group
- Returns:
Sharded sequence.
- Return type:
Tensor
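A hedged sketch of a sequence-parallel attention step combining the two helpers: query/key/value enter sharded along the sequence dimension, are re-sharded along heads for the attention computation, and the output is returned to sequence sharding for the following MLP/LayerNorm. Only shard_heads and shard_sequence come from this module; the attention call and names are illustrative.

```python
import torch
import torch.distributed as dist
import torch.nn.functional as F

from anemoi.models.distributed.transformer import shard_heads, shard_sequence


def parallel_attention(q, k, v, shapes: list, mgroup: dist.ProcessGroup) -> torch.Tensor:
    # Re-shard from local sequence / all heads to full sequence / local heads.
    q, k, v = (shard_heads(t, shapes=shapes, mgroup=mgroup) for t in (q, k, v))
    out = F.scaled_dot_product_attention(q, k, v)  # full sequence, local heads
    # Back to local sequence / all heads for the following MLP and LayerNorm.
    return shard_sequence(out, shapes=shapes, mgroup=mgroup)
```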
utils
- anemoi.models.distributed.utils.get_memory_format(tensor: Tensor)
Helper routine to get the memory format.
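A minimal sketch of get_memory_format, following the documented signature; assuming the return value is a torch.memory_format such as torch.channels_last or torch.contiguous_format, which can be recorded before a collective and restored afterwards.

```python
import torch

from anemoi.models.distributed.utils import get_memory_format

x = torch.randn(2, 4, 8, 8).to(memory_format=torch.channels_last)
fmt = get_memory_format(x)  # e.g. torch.channels_last
```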