Mutual Refinement and Composition

03 Oct, 2025

Introduction

Composition is a fundamental operation on Layouts. An example one encounters "in the wild" is the composition of data with a Thread-Value Layout, which then lets the user assign the appropriate portion of the data to each thread, as indicated in this picture, which I took from the CuTe docs:

TV_Layout

The categorical treatment of Layouts that Colfax established equips us with a step-by-step procedure to calculate Layout composition using the graphical representation of morphisms. In this blogpost I want to recap Chapter 4 of the Colfax paper and thereby conclude my series of blogposts on this paper.

Consider $A = (6, 6) : (6, 1)$ and $B = (12, 3, 6) : (1, 72, 12)$.

Let's find the corresponding morphism representations. Let $f$ be the morphism that encodes $A$ and $g$ the morphism that encodes $B$.

$f$:

$s_{0} d_{0} = 1 = d_{2}$, append $(6)$, traverse to $d_{1}$
$s_{2} d_{2} = 6 = d_{1}$, append $(6)$.
We have therefore $f : S \to T = (6, 6) \to (6, 6)$ over $α = (2, 1)$.

$g$:

$s_{0} d_{0} = 1 = d_{1}$, append $(s_{1}) = (12)$, traverse to $d_{3}$
$s_{1} d_{1} = 12 = d_{3}$, append $(s_{3}) = (6)$, traverse to $d_{2}$
$s_{3} d_{3} = 72 = d_{2}$, append $(s_{2}) = (3)$.
We have therefore $g : U \to V = (12, 3, 6) \to (12, 6, 3)$ over $β = (1, 3, 2)$.
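The traversal above can be sketched in a few lines of Python. This is only a minimal sketch, assuming a flat Layout with positive strides that is tractable in the sense used here; `layout_to_morphism` is a hypothetical helper name and not part of CuTe or the Colfax repository. Whenever the next stride is larger than the running product, the sketch adds a codomain entry that no mode maps to.

```python
def layout_to_morphism(shape, stride):
    """Return (codomain, alpha) for a flat layout shape:stride.

    alpha is 1-indexed: mode i of the layout is sent to entry alpha[i] of
    the codomain. Assumes positive strides and a tractable layout.
    """
    # Visit the modes in order of increasing stride.
    order = sorted(range(len(shape)), key=lambda i: stride[i])
    codomain = []
    alpha = [0] * len(shape)
    running = 1  # plays the role of s_k * d_k from the previous step
    for i in order:
        s, d = shape[i], stride[i]
        if d <= 0 or d % running != 0:
            raise ValueError("not tractable under the assumptions of this sketch")
        if d != running:
            # Gap in the strides: add a codomain entry that no mode maps to.
            codomain.append(d // running)
        codomain.append(s)
        alpha[i] = len(codomain)
        running = s * d
    return tuple(codomain), tuple(alpha)


print(layout_to_morphism((6, 6), (6, 1)))
print(layout_to_morphism((12, 3, 6), (1, 72, 12)))
```

The two calls reproduce $T = (6, 6)$ with $α = (2, 1)$ and $V = (12, 6, 3)$ with $β = (1, 3, 2)$.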

This can be depicted visually:

Composition Intro

Note that we attempt to compute the Layout composition $B \circ A$ that corresponds to $g \circ f$ .

We see that this is problematic, because we can't "input" the arrows coming from $f$ into the start of $g$: the codomain of $f$ is $(6, 6)$, while the domain of $g$ is $(12, 3, 6)$.

To be clear, what we want is a picture like this, where arrows flow from $f^{'}$ into the domain of $g^{'}$ and from there into the codomain of $g^{'}$. We see below that, loosely speaking, this is achieved by splitting up the $6$ into a $2$ and a $3$ in the codomain of $f^{'}$ and by factoring the $12$ in the domain of $g^{'}$ into $6$ and $2$.

Composition Into 2

Here we see the concept visually.

Composition Intro 3

Let's recap the concept of refinement. $((2, (2, 2)))$ refines $((2, 4))$, which in turn refines $(8)$, but not the other way around. That is because in our framework $S^{'}$ refines $S$ if $S^{'}$ can be obtained from $S$ by replacing each entry of $S$ with a nested tuple of the same size. Let us think for a moment about what this means geometrically, using a simple example:

$(2, 4)$ can be thought of geometrically as a rectangle. $(2, (2, 2))$ also describes the shape of a rectangle, but one of its dimensions is itself described as a rectangle. This is a useful and important concept in CuTe because it allows us to implement highly non-trivial patterns in an intuitive way, as you can convince yourself by following the examples from my blogpost about hierarchical Layouts.
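The refinement relation can be written down as a short recursive check. The following is a minimal sketch, assuming nested tuples are represented as Python tuples with integer leaves; `size` and `refines` are hypothetical helper names chosen for this illustration.

```python
from math import prod


def size(t):
    """Product of all integer leaves of a nested tuple."""
    return t if isinstance(t, int) else prod(size(e) for e in t)


def refines(fine, coarse):
    """True if `fine` arises from `coarse` by replacing entries with
    nested tuples of the same size."""
    if isinstance(coarse, int):
        # An entry may stay as it is or become any nested tuple of equal size.
        return size(fine) == coarse
    return (
        isinstance(fine, tuple)
        and len(fine) == len(coarse)
        and all(refines(f, c) for f, c in zip(fine, coarse))
    )


print(refines(((2, (2, 2)),), ((2, 4),)))  # the chain from the text
print(refines(((2, 4),), (8,)))
print(refines((8,), ((2, 4),)))            # the other direction
```

The first two calls correspond to the chain above, the third one to the reverse direction that does not hold.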

A mutual refinement can be thought of in terms of the following diagram

Mutual refinement

In other words, we want to refine the codomain of one function and the domain of the other such that all possible values of the codomain of the first function lie in the domain of the second. This in turn means that we can compose the two functions we obtain by mutually refining.

Algo

This is an algorithm to find mutual refinement between two nested tuples.

Let's apply it to $\text{codomain}(f) = T = (6, 6)$ and $\text{domain}(g) = U = (12, 3, 6)$.

Initially we have $X = T, Y = U, X^{'} = (), Y^{'} = (), X_{m} = (), Y_{m} = ()$. We will now compute the mutual refinement $T^{'}, U^{'}$.

Note that $\text{len}(X) = 2$ and $\text{len}(Y) = 3$.

First step:

i=1, j=1
1. $\text{entry}_{1}(X) = 6$ divides $\text{entry}_{1}(Y) = 12$ $\to$ $X_{m} = (6)$ $\to$ $X^{'} = (6)$
2. $Y_{m} = (6)$, $\text{entry}_{1}(Y) = 2$, $i \to 2$
i=2, j=1
1. $\text{entry}_{1}(Y) = 2$ divides $\text{entry}_{2}(X) = 6$ $\to$ $X_{m} = (2)$, $Y_{m} = (6, 2)$, $Y^{'} = ((6, 2))$
2. $Y_{m} = ()$, $\text{entry}_{2}(X) = 3$, $j \to 2$
i=2, j=2
1. $\text{entry}_{2}(X) = \text{entry}_{2}(Y) = 3$ $\to$ $X_{m} = (2, 3)$, $X^{'} = (6, (2, 3))$
2. $Y_{m} = (3)$, $Y^{'} = ((6, 2), 3)$, $Y_{m} = ()$
3. $i \to 3, j \to 3$

Fill $Y^{'}$ with the remaining entries from $Y$: $Y^{'} = ((6, 2), 3, 6)$.

So we obtain $T^{'} = (6, (2, 3))$ and $U^{'} = ((6, 2), 3, 6)$.
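The walk through the two tuples can be sketched in Python as follows. This is my own reading of the trace above rather than the exact algorithm from the paper: it assumes flat input tuples, returns `None` when neither entry divides the other, and the handling of leftover entries at the end is an assumption. `mutual_refinement` and `_pack` are hypothetical names.

```python
def _pack(mode):
    """A single factor stays a bare entry, several factors become a tuple."""
    return mode[0] if len(mode) == 1 else tuple(mode)


def mutual_refinement(X, Y):
    """Return (X', Y') for flat tuples X and Y, or None if the walk gets stuck."""
    X, Y = list(X), list(Y)          # working copies, entries get divided down
    Xp, Yp, Xm, Ym = [], [], [], []  # X', Y' and the modes under construction
    i = j = 0
    while i < len(X) and j < len(Y):
        x, y = X[i], Y[j]
        if y % x == 0:               # entry_i(X) divides entry_j(Y)
            Xm.append(x)
            Ym.append(x)
            Xp.append(_pack(Xm))     # the X entry is used up
            Xm = []
            i += 1
            Y[j] = y // x
            if Y[j] == 1:            # the Y entry is used up as well
                Yp.append(_pack(Ym))
                Ym = []
                j += 1
        elif x % y == 0:             # entry_j(Y) divides entry_i(X)
            Xm.append(y)
            Ym.append(y)
            Yp.append(_pack(Ym))     # the Y entry is used up
            Ym = []
            j += 1
            X[i] = x // y
        else:
            return None
    # Assumption: a partially consumed entry keeps its remainder as last factor.
    if Ym:
        Ym.append(Y[j])
        Yp.append(_pack(Ym))
        j += 1
    if Xm:
        Xm.append(X[i])
        Xp.append(_pack(Xm))
        i += 1
    # Fill up with the entries that were never touched.
    Xp.extend(X[i:])
    Yp.extend(Y[j:])
    return tuple(Xp), tuple(Yp)


print(mutual_refinement((6, 6), (12, 3, 6)))
```

For $T = (6, 6)$ and $U = (12, 3, 6)$ the sketch follows the same three steps as the trace and ends with the same $T^{'}$ and $U^{'}$.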

Diagram

The diagram on the left shows

T'   U'
T    U

We know that the domain of the second morphism needs to contain the codomain of the first morphism (after all, that is the whole reason we perform mutual refinement). This is depicted by the fact that the upper $6$ only has an outgoing arrow, while all the other entries have both an ingoing and an outgoing arrow.

Branching 1

Note that we take all "branching arrows" at the second stage, branch at the first stage instead, and then have an "identity arrow" pointing from $i \to i$.

Branching 2

Here we simply replace every incoming branch with an identity arrow $i \to i$.

Let's take a look again at the above diagram and show how to resolve the diagram:

Diagram Compose 1

The left side is obtained by the process provided above:

Find the mutual refinement.
Bring each part separately into the desired form using the process above. This gives us the left side.

From here we can drop the "mutual refinement" layer from the middle, because by construction the arrows on its left and right side are $i \to i$. This gives us the right diagram, from which we can read off $g^{'} \circ f^{'} = ((2, 3), 6) \to (6, 2, 6, 3)$ over $α = (2, 4, 1)$.
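Dropping the middle layer amounts to composing the two index maps. A minimal sketch, assuming the refined maps are $f^{'} : (2, 3, 6) \to (6, 2, 3)$ over $(2, 3, 1)$ and $g^{'} : (6, 2, 3, 6) \to (6, 2, 6, 3)$ over $(1, 2, 4, 3)$; both are my reading of the diagram, and `compose_maps` is a hypothetical helper name.

```python
def compose_maps(f_map, g_map):
    """Compose two 1-indexed index maps: entry k is sent to g_map[f_map[k]]."""
    return tuple(g_map[p - 1] for p in f_map)


f_refined = (2, 3, 1)     # assumed map of f': (2, 3, 6) -> (6, 2, 3)
g_refined = (1, 2, 4, 3)  # assumed map of g': (6, 2, 3, 6) -> (6, 2, 6, 3)

print(compose_maps(f_refined, g_refined))  # (2, 4, 1)
```

Following each arrow through the layer in this way gives the map $(2, 4, 1)$ stated above.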

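This in turn yields the Layout:

$\text{shape}(L) = ((2, 3), 6)$
$\text{stride}(L^{♭}) = (6, 72, 1)$, where each stride is the product of the codomain entries in front of the arrow's target: $6$, $6 \cdot 2 \cdot 6 = 72$ and $1$
equip the stride with the profile of the shape: $((6, 72), 1)$
$L_{g^{'} \circ f^{'}} = ((2, 3), 6) : ((6, 72), 1)$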
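Reading the Layout off the morphism can also be sketched in code: the stride of a domain entry is the product of all codomain entries in front of its target, and the flat strides are then regrouped to mirror the nesting of the shape. `flat_strides` and `with_profile` are hypothetical helper names, and the map is assumed to be 1-indexed as in the text.

```python
from math import prod


def flat_strides(codomain, alpha):
    """Stride of each domain entry: product of the codomain entries before its target."""
    return tuple(prod(codomain[:p - 1]) for p in alpha)


def with_profile(flat, profile):
    """Regroup the flat values so that they mirror the nesting of `profile`."""
    values = iter(flat)

    def build(p):
        if isinstance(p, int):
            return next(values)
        return tuple(build(e) for e in p)

    return build(profile)


shape = ((2, 3), 6)
strides = flat_strides((6, 2, 6, 3), (2, 4, 1))
print(strides)                       # (6, 72, 1)
print(with_profile(strides, shape))  # ((6, 72), 1)
```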

We have a useful theorem that connects the composition $C = L_{g^{'} \circ f^{'}}$ to $B \circ A$.

Formula

So to obtain the composition we just coalesce $C$ relative to $\text{shape}(A)$, i.e. $B \circ A = \text{coal}(C, \text{shape}(A))$.

That means we have a clear procedure to compute composition:

Determine the standard representation of $A$ and $B$
Determine the mutual refinement as we did above
Form the diagram using the mutual refinement, i.e. "connect" the codomain of $f$ and the domain of $g$ to the mutual refinement "layer"
Use the above mechanism to resolve the diagram. For incoming branches, let them branch from their corresponding source and from there map $i \to i$. For outgoing branches, map $i \to i$ instead of branching "into" a source
Read off the Layout from the diagram
Use the above formula to obtain the composition from the Layout we read off the diagram (i.e. coalesce over the shape of the Layout we apply first)
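A simple way to sanity check the outcome of this procedure is to evaluate the Layouts as plain functions and compare them index by index. The sketch below does this for the first example with $A = (6, 6) : (6, 1)$, $B = (12, 3, 6) : (1, 72, 12)$ and the result $((2, 3), 6) : ((6, 72), 1)$. It assumes the usual colexicographic convention for turning a linear index into coordinates; `flatten` and `layout_fn` are hypothetical helpers and not taken from CuTe.

```python
def flatten(t):
    """Flatten a nested tuple of integers into a flat tuple."""
    return (t,) if isinstance(t, int) else tuple(x for e in t for x in flatten(e))


def layout_fn(shape, stride):
    """Return the function k -> offset of the layout shape:stride."""
    flat_shape, flat_stride = flatten(shape), flatten(stride)

    def evaluate(k):
        offset = 0
        for s, d in zip(flat_shape, flat_stride):
            offset += (k % s) * d  # coordinate in this mode times its stride
            k //= s
        return offset

    return evaluate


A = layout_fn((6, 6), (6, 1))
B = layout_fn((12, 3, 6), (1, 72, 12))
C = layout_fn(((2, 3), 6), ((6, 72), 1))

# C should agree with "first A, then B" on every index of A's domain.
print(all(C(k) == B(A(k)) for k in range(36)))
```

Such a brute-force comparison does not replace the diagrammatic argument, but it catches slips when reading strides off a diagram.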

Let's use the most complex example given in the Paper and analyse it step by step:

Step_1

We see that $f$ corresponds to $A$ with $\text{shape}(A) = (6, 6)$ and $\text{stride}(A) = (5, 2 \cdot 6 \cdot 5) = (5, 60)$.
We see that $g$ corresponds to $B$ with $\text{shape}(B) = (10, 360)$ and $\text{stride}(B) = (2, 3 \cdot 10 \cdot 2) = (2, 60)$.

Step_2

We will not go through the mechanical procedure above, but we can confirm visually that the refinement is correct: $3 \cdot 2 = 6$, $5 \cdot 2 = 10$ and $3 \cdot 2 \cdot 6 \cdot 10 = 360$.

Step_3

Forming the diagram here simply means inserting the "mutual refinement layer" in the middle and connecting both the codomain of $f$ and the domain of $g$ to it.

Step_4

Resolving the diagram is the process from above.
Left side of the mutual refinement layer: for example, replace $6$ by the two points it branches to ($2$ and $3$) and map them via $i \to i$.
Right side: this is even simpler. Simply "resolve" all branches via $i \to i$.

Step_5

We remove the "mutual refinement layer". This can easily be done by tracing where the three arrows entering it come from and where they go to. For example, $2$ goes into the layer at $2$ and from there to $2$, so we can replace the two arrows by one arrow going directly from $2$ to $2$, and similarly for the others.

Step_6

Read the Layout off:
$\text{shape}(L_{g^{'} \circ f^{'}}) = ((2, 3), 6)$
the flattened stride is $(10, 60, 360)$ (see above how to obtain it)
$\text{stride}(L_{g^{'} \circ f^{'}}) = ((10, 60), 360)$ is the flattened stride equipped with the profile of the shape.

Step_7

$\text{coal}(C, \text{shape}(A))$
The relative modes $(2, 3) : (10, 60)$ and $6 : 360$ are already coalesced, therefore the Layout is coalesced over the shape and we are done. The relative modes are determined by the procedure described in the relative coalesce chapter of the paper.
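To see why nothing changes in this last step, here is a minimal sketch that coalesces each relative mode on its own. It assumes that the relative modes are already given as flat shape and stride tuples, and it uses the usual merging rule that two neighbouring modes $s_0 : d_0$ and $s_1 : d_1$ can be merged when $d_1 = s_0 d_0$. It is not the general relative coalesce from the paper, and `coalesce_flat` and `coalesce_by_mode` are hypothetical names.

```python
def coalesce_flat(shape, stride):
    """Merge neighbouring modes s0:d0, s1:d1 whenever d1 == s0 * d0."""
    out_shape, out_stride = [], []
    for s, d in zip(shape, stride):
        if s == 1:
            continue  # size-1 modes carry no information
        if out_shape and d == out_shape[-1] * out_stride[-1]:
            out_shape[-1] *= s  # contiguous with the previous mode: merge
        else:
            out_shape.append(s)
            out_stride.append(d)
    if not out_shape:
        return (1,), (0,)  # everything had size 1
    return tuple(out_shape), tuple(out_stride)


def coalesce_by_mode(modes):
    """Coalesce every relative mode separately (assumption of this sketch)."""
    return [coalesce_flat(s, d) for s, d in modes]


C_modes = [((2, 3), (10, 60)), ((6,), (360,))]
print(coalesce_by_mode(C_modes))       # unchanged: 2 * 10 = 20 is not 60
print(coalesce_flat((2, 3), (1, 2)))   # a mode that does merge, into 6:1
```

Since $2 \cdot 10 = 20 \neq 60$, the mode $(2, 3) : (10, 60)$ cannot be merged, which matches the observation above.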

Conclusion

This concludes the blogpost about the Colfax paper. Please consider starring the repository they released along with the paper. The codebase delivered there can also produce many of the figures shown here and can of course be used to let the computer perform the calculations. However, it is helpful to know what things are supposed to look like, so that we can sanity check the computer output and, above all, understand how to interpret it.