layers

`layers` ¤

`DEFAULT_LAYER_FUSE_OPT_RULES = {SumCollapsePattern: apply_sum_collapse, TuckerPattern: apply_tucker, CandecompPattern: apply_candecomp}` `module-attribute` ¤

`DEFAULT_LAYER_SHATTER_OPT_RULES = {DenseKroneckerPattern: apply_dense_tensordot, TensorDotKroneckerPattern: apply_tensordot_tensordot}` `module-attribute` ¤

`CandecompPattern` ¤

Bases: LayerOptPatternDefn

Detect combinations of Hadamard and sum layers that can be merged into CP-T layers.

Source code in cirkit/backend/torch/optimization/layers.py

class CandecompPattern(LayerOptPatternDefn):
    """Detect combinations of Hadamard and sum layers that can be merged into CP-T layers."""

    @classmethod
    def is_output(cls) -> bool:
        return False

    @classmethod
    def entries(cls) -> Sequence[type[TorchLayer]]:
        return [TorchSumLayer, TorchHadamardLayer]

    @classmethod
    def sub_patterns(cls) -> Sequence[dict[str, ParameterOptPattern]]:
        return [{} for _ in cls.entries()]

    @classmethod
    def config_patterns(cls) -> list[dict[str, Any]]:
        return [{"arity": 1}, {}]

`config_patterns()` `classmethod` ¤

Source code in cirkit/backend/torch/optimization/layers.py

@classmethod
def config_patterns(cls) -> list[dict[str, Any]]:
    return [{"arity": 1}, {}]

`entries()` `classmethod` ¤

Source code in cirkit/backend/torch/optimization/layers.py

@classmethod
def entries(cls) -> Sequence[type[TorchLayer]]:
    return [TorchSumLayer, TorchHadamardLayer]

`is_output()` `classmethod` ¤

Source code in cirkit/backend/torch/optimization/layers.py

@classmethod
def is_output(cls) -> bool:
    return False

`sub_patterns()` `classmethod` ¤

Source code in cirkit/backend/torch/optimization/layers.py

@classmethod
def sub_patterns(cls) -> Sequence[dict[str, ParameterOptPattern]]:
    return [{} for _ in cls.entries()]

`DenseKroneckerPattern` ¤

Bases: LayerOptPatternDefn

Detect sum layers whose weight parameter has a Kronecker node as its output node.

The goal of this pattern is to replace the expensive matrix multiplication of the sum layer by exploiting the Kronecker-product decomposition of its parameters.

Given the sum layer parameters $W=A \otimes B$, with $A$ of shape $(a_1,\dots,a_n)$ and $B$ of shape $(b_1,\dots,b_n)$:

$$
\begin{align*}
(Wx)_{kl} &= ((A \otimes B) x)_{kl} \\
&= (B (A x)^{T})_{kl}
\end{align*}
$$

As $W$ has shape $(a_1b_1,\dots,a_nb_n)$, it is significantly cheaper to apply $A$ and $B$ separately than to multiply by $W$ directly.
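The identity can be checked numerically. The following is a minimal NumPy sketch, not the library's implementation: the factor shapes are hypothetical, only the two-dimensional (matrix) case is covered, and batch and fold dimensions are ignored. With NumPy's row-major reshape the factored result comes out as $AXB^{T}$, which is the transpose of the $B(Ax)^{T}$ form above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical small factor shapes, chosen only for illustration.
A = rng.standard_normal((3, 4))  # shape (a_1, a_2)
B = rng.standard_normal((2, 5))  # shape (b_1, b_2)
x = rng.standard_normal(4 * 5)   # input vector of size a_2 * b_2

# Naive route: materialize W = A ⊗ B, of shape (a_1 * b_1, a_2 * b_2).
W = np.kron(A, B)
naive = W @ x

# Factored route: view x as an (a_2, b_2) matrix and apply each factor once,
# so the large matrix W is never built.
X = x.reshape(4, 5)
factored = (A @ X @ B.T).reshape(-1)

assert np.allclose(naive, factored)
```

Both routes return the same vector of size $a_1 b_1$; only the naive one allocates the full Kronecker product.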

Source code in cirkit/backend/torch/optimization/layers.py

class DenseKroneckerPattern(LayerOptPatternDefn):
    r"""Detect sum layers whose weight parameter has a Kronecker node as its output node.

    The goal of this pattern is to replace the expensive matrix multiplication of
    the sum layer by exploiting the Kronecker-product decomposition of its parameters.

    Given the sum layer parameters $W=A \otimes B$,
    with $A$ of shape $(a_1,\dots,a_n)$ and $B$ of shape $(b_1,\dots,b_n)$:
    $$
        \begin{align*}
        (Wx)_{kl} &=((A \otimes B) x)_{kl}\\
        &= (B (A x)^{T})_{kl}
        \end{align*}
    $$

    As $W$ has shape $(a_1b_1,\dots,a_nb_n)$, it is significantly cheaper to apply
    $A$ and $B$ separately than to multiply by $W$ directly.
    """

    @classmethod
    def is_output(cls) -> bool:
        return False

    @classmethod
    def entries(cls) -> Sequence[type[TorchLayer]]:
        return [TorchSumLayer]

    @classmethod
    def sub_patterns(cls) -> Sequence[dict[str, ParameterOptPattern]]:
        return [{"weight": KroneckerOutParameterPattern}]

    @classmethod
    def config_patterns(cls) -> list[dict[str, Any]]:
        return [{"arity": 1}]

`config_patterns()` `classmethod` ¤

Source code in cirkit/backend/torch/optimization/layers.py

@classmethod
def config_patterns(cls) -> list[dict[str, Any]]:
    return [{"arity": 1}]

`entries()` `classmethod` ¤

Source code in cirkit/backend/torch/optimization/layers.py

@classmethod
def entries(cls) -> Sequence[type[TorchLayer]]:
    return [TorchSumLayer]

`is_output()` `classmethod` ¤

Source code in cirkit/backend/torch/optimization/layers.py

@classmethod
def is_output(cls) -> bool:
    return False

`sub_patterns()` `classmethod` ¤

Source code in cirkit/backend/torch/optimization/layers.py

@classmethod
def sub_patterns(cls) -> Sequence[dict[str, ParameterOptPattern]]:
    return [{"weight": KroneckerOutParameterPattern}]

`SumCollapsePattern` ¤

Bases: LayerOptPatternDefn

Detect adjacent sum layers that can be fused.

Source code in cirkit/backend/torch/optimization/layers.py

class SumCollapsePattern(LayerOptPatternDefn):
    """Detect adjacent sum layers that can be fused."""

    @classmethod
    def is_output(cls) -> bool:
        return False

    @classmethod
    def entries(cls) -> Sequence[type[TorchLayer]]:
        return [TorchSumLayer, TorchSumLayer]

    @classmethod
    def sub_patterns(cls) -> Sequence[dict[str, ParameterOptPattern]]:
        return [{} for _ in cls.entries()]

    @classmethod
    def config_patterns(cls) -> list[dict[str, Any]]:
        return [{"arity": 1}, {}]

`config_patterns()` `classmethod` ¤

Source code in cirkit/backend/torch/optimization/layers.py

@classmethod
def config_patterns(cls) -> list[dict[str, Any]]:
    return [{"arity": 1}, {}]

`entries()` `classmethod` ¤

Source code in cirkit/backend/torch/optimization/layers.py

@classmethod
def entries(cls) -> Sequence[type[TorchLayer]]:
    return [TorchSumLayer, TorchSumLayer]

`is_output()` `classmethod` ¤

Source code in cirkit/backend/torch/optimization/layers.py

@classmethod
def is_output(cls) -> bool:
    return False

`sub_patterns()` `classmethod` ¤

Source code in cirkit/backend/torch/optimization/layers.py

@classmethod
def sub_patterns(cls) -> Sequence[dict[str, ParameterOptPattern]]:
    return [{} for _ in cls.entries()]

`TensorDotKroneckerPattern` ¤

Bases: LayerOptPatternDefn

Detect dot layers whose weight parameter has a Kronecker node as its output node.

The goal of this pattern is to replace the expensive matrix multiplication of the dot layer by exploiting the Kronecker-product decomposition of its parameters.

Given the dot layer parameters $W=A \otimes B$, with $A$ of shape $(a_1,\dots,a_n)$ and $B$ of shape $(b_1,\dots,b_n)$:

$$
\begin{align*}
(Wx)_{kl} &= ((A \otimes B) x)_{kl} \\
&= (B (A x)^{T})_{kl}
\end{align*}
$$

As $W$ has shape $(a_1b_1,\dots,a_nb_n)$, it is significantly cheaper to apply $A$ and $B$ separately than to multiply by $W$ directly.
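To put a number on the saving, the sketch below counts scalar multiplications for the two-factor matrix case. The function names and sizes are hypothetical, and the count ignores batch and fold dimensions as well as additions.

```python
def dense_cost(a_out: int, a_in: int, b_out: int, b_in: int) -> int:
    """Multiplications for one product with the full (a_out*b_out, a_in*b_in) matrix."""
    return (a_out * b_out) * (a_in * b_in)


def factored_cost(a_out: int, a_in: int, b_out: int, b_in: int) -> int:
    """Multiplications when A is applied first and B second."""
    # Step 1: A @ X, with X of shape (a_in, b_in) -> a_out * a_in * b_in
    # Step 2: (A @ X) @ B.T                       -> a_out * b_in * b_out
    return a_out * a_in * b_in + a_out * b_in * b_out


# Hypothetical sizes: both factors are 64 x 64, so W would be 4096 x 4096.
full = dense_cost(64, 64, 64, 64)
split = factored_cost(64, 64, 64, 64)
print(full, split, full / split)
```

The gap widens as the factors grow, since the dense cost scales with the product of all four sizes while each factored step involves only three of them.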

Source code in cirkit/backend/torch/optimization/layers.py

class TensorDotKroneckerPattern(LayerOptPatternDefn):
    r"""Detect dot layers whose weight parameter has a Kronecker node as its output node.

    The goal of this pattern is to replace the expensive matrix multiplication of
    the dot layer by exploiting the Kronecker-product decomposition of its parameters.

    Given the dot layer parameters $W=A \otimes B$,
    with $A$ of shape $(a_1,\dots,a_n)$ and $B$ of shape $(b_1,\dots,b_n)$:
    $$
        \begin{align*}
        (Wx)_{kl} &=((A \otimes B) x)_{kl}\\
        &= (B (A x)^{T})_{kl}
        \end{align*}
    $$

    As $W$ has shape $(a_1b_1,\dots,a_nb_n)$, it is significantly cheaper to apply
    $A$ and $B$ separately than to multiply by $W$ directly.
    """

    @classmethod
    def is_output(cls) -> bool:
        return False

    @classmethod
    def entries(cls) -> Sequence[type[TorchLayer]]:
        return [TorchTensorDotLayer]

    @classmethod
    def sub_patterns(cls) -> Sequence[dict[str, ParameterOptPattern]]:
        return [{"weight": KroneckerOutParameterPattern}]

    @classmethod
    def config_patterns(cls) -> list[dict[str, Any]]:
        return [{}]

`config_patterns()` `classmethod` ¤

Source code in cirkit/backend/torch/optimization/layers.py

@classmethod
def config_patterns(cls) -> list[dict[str, Any]]:
    return [{}]

`entries()` `classmethod` ¤

Source code in cirkit/backend/torch/optimization/layers.py

@classmethod
def entries(cls) -> Sequence[type[TorchLayer]]:
    return [TorchTensorDotLayer]

`is_output()` `classmethod` ¤

Source code in cirkit/backend/torch/optimization/layers.py

@classmethod
def is_output(cls) -> bool:
    return False

`sub_patterns()` `classmethod` ¤

Source code in cirkit/backend/torch/optimization/layers.py

@classmethod
def sub_patterns(cls) -> Sequence[dict[str, ParameterOptPattern]]:
    return [{"weight": KroneckerOutParameterPattern}]

`TuckerPattern` ¤

Bases: LayerOptPatternDefn

Detect combinations of sum and Kronecker product layers that can be merged into a Tucker layer.

Source code in cirkit/backend/torch/optimization/layers.py

class TuckerPattern(LayerOptPatternDefn):
    """Detect combinations of sum and Kronecker product layers to merge into a Tucker layer."""

    @classmethod
    def is_output(cls) -> bool:
        return False

    @classmethod
    def entries(cls) -> Sequence[type[TorchLayer]]:
        return [TorchSumLayer, TorchKroneckerLayer]

    @classmethod
    def sub_patterns(cls) -> Sequence[dict[str, ParameterOptPattern]]:
        return [{} for _ in cls.entries()]

    @classmethod
    def config_patterns(cls) -> list[dict[str, Any]]:
        return [{"arity": 1}, {}]

`config_patterns()` `classmethod` ¤

Source code in cirkit/backend/torch/optimization/layers.py

@classmethod
def config_patterns(cls) -> list[dict[str, Any]]:
    return [{"arity": 1}, {}]

`entries()` `classmethod` ¤

Source code in cirkit/backend/torch/optimization/layers.py

@classmethod
def entries(cls) -> Sequence[type[TorchLayer]]:
    return [TorchSumLayer, TorchKroneckerLayer]

`is_output()` `classmethod` ¤

Source code in cirkit/backend/torch/optimization/layers.py

@classmethod
def is_output(cls) -> bool:
    return False

`sub_patterns()` `classmethod` ¤

Source code in cirkit/backend/torch/optimization/layers.py

@classmethod
def sub_patterns(cls) -> Sequence[dict[str, ParameterOptPattern]]:
    return [{} for _ in cls.entries()]

`apply_candecomp(compiler, match)` ¤

Construct the CPT layer fusing one Sum and one Hadamard layer.
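As an illustration of what such a fusion buys, here is a minimal NumPy sketch. It assumes the Hadamard layer feeds the sum layer, as the pattern order `[TorchSumLayer, TorchHadamardLayer]` suggests, and it uses hypothetical sizes with no batch or fold dimensions. It is not the `TorchCPTLayer` implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
k, o = 4, 3                       # hypothetical input and output unit counts
x = rng.standard_normal(k)        # first input of the Hadamard layer
y = rng.standard_normal(k)        # second input of the Hadamard layer
W = rng.standard_normal((o, k))   # weight of the sum layer

# Two separate layers: element-wise product, then a linear sum.
h = x * y
two_step = W @ h

# Fused form: a single contraction over the shared index i.
fused = np.einsum("i,i,oi->o", x, y, W)

assert np.allclose(two_step, fused)
```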

Parameters:

Name	Type	Description	Default
`compiler`	`TorchCompiler`	The current compiler doing the optimization.	required
`match`	`LayerOptMatch`	The match to optimize.	required

Returns:

Type	Description
`tuple[TorchCPTLayer]`	The CPT layer replacing the sum and Hadamard layers.

Source code in cirkit/backend/torch/optimization/layers.py

def apply_candecomp(compiler: "TorchCompiler", match: LayerOptMatch) -> tuple[TorchCPTLayer]:
    r"""Construct the CPT layer fusing one Sum and one Hadamard layer.

    Args:
        compiler (TorchCompiler): The current compiler doing the optimization.
        match (LayerOptMatch): The match to optimize.

    Returns:
        tuple[TorchCPTLayer]: The CPT layer replacing the sum and hadamard layers.
    """
    dense = cast(TorchSumLayer, match.entries[0])
    hadamard = cast(TorchHadamardLayer, match.entries[1])
    cpt = TorchCPTLayer(
        hadamard.num_input_units,
        dense.num_output_units,
        hadamard.arity,
        weight=dense.weight,
        semiring=compiler.semiring,
    )
    return (cpt,)

`apply_dense_tensordot(compiler, match)` ¤

Return the two dot layers that replace a sum layer parameterized by a Kronecker product.

Parameters:

Name	Type	Description	Default
`compiler`	`TorchCompiler`	The current compiler doing the optimization.	required
`match`	`LayerOptMatch`	The match to optimize.	required

Returns:

Type	Description
`tuple[TorchTensorDotLayer, TorchTensorDotLayer]`	The two dot layers that replace the sum layer.

Source code in cirkit/backend/torch/optimization/layers.py

def apply_dense_tensordot(
    compiler: "TorchCompiler", match: LayerOptMatch
) -> tuple[TorchTensorDotLayer, TorchTensorDotLayer]:
    r"""Return the two dot layers that replace a sum layer parameterized by a Kronecker product.

    Args:
        compiler (TorchCompiler): The current compiler doing the optimization.
        match (LayerOptMatch): The match to optimize.

    Returns:
        tuple[TorchTensorDotLayer, TorchTensorDotLayer]: The two dot layers that
            replace the sum layer.
    """
    dense = cast(TorchSumLayer, match.entries[0])
    weight_patterns = match.sub_entries[0]["weight"]
    kronecker = cast(TorchKroneckerParameter, weight_patterns[0].entries[0])
    return _apply_tensordot_rule(
        compiler, dense.num_input_units, dense.num_output_units, dense.weight, kronecker
    )

`apply_sum_collapse(compiler, match)` ¤

Fuse two sum nodes together.

This function expands the two layers into a single sum layer whose weight is the matrix product of the two sums' parameters.

Indeed, given two sums with parameters $W_1$ and $W_2$:

$$S_1=W_1X$$

$$S_2=W_2S_1$$

$$S_2=W_2W_1X$$

The fused sum layer therefore has weight $W_2W_1$.
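The derivation is plain associativity of matrix multiplication, which a short NumPy sketch can confirm. Shapes are hypothetical, and the example works in ordinary real arithmetic without the semiring or fold handling of the actual layers.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((5, 3))  # first sum: 3 input units -> 5 units
W2 = rng.standard_normal((2, 5))  # second sum: 5 units -> 2 output units
X = rng.standard_normal((3, 7))   # 3 input units, 7 columns (e.g. a batch)

# Two sums applied one after the other.
S1 = W1 @ X
S2 = W2 @ S1

# One fused sum with weight W2 W1, computed once ahead of time.
W_fused = W2 @ W1

assert W_fused.shape == (2, 3)
assert np.allclose(S2, W_fused @ X)
```

The fused weight maps the inputs of the first sum directly to the outputs of the second, so the intermediate result $S_1$ is never materialized.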

Parameters:

Name	Type	Description	Default
`compiler`	`TorchCompiler`	The current compiler	required
`match`	`LayerOptMatch`	The match to replace	required

Returns:

Type	Description
`tuple[TorchSumLayer]`	The single sum layer that computes both sums at once.

Source code in cirkit/backend/torch/optimization/layers.py

def apply_sum_collapse(compiler: "TorchCompiler", match: LayerOptMatch) -> tuple[TorchSumLayer]:
    """Fuse two sum nodes together.

    This function expands the two layers into a single
    sum layer whose weight is the matrix product of the
    two sums' parameters.

    Indeed, given two sums with parameters $W_1$ and $W_2$:
    $$S_1=W_1X$$
    $$S_2=W_2S_1$$
    $$S_2=W_2W_1X$$

    The fused sum layer therefore has weight $W_2W_1$.

    Args:
        compiler (TorchCompiler): The current compiler
        match: The match to replace

    Returns:
        tuple[TorchSumLayer]: The single sum layer that computes
            both sums at once.
    """
    dense1 = cast(TorchSumLayer, match.entries[0])
    dense2 = cast(TorchSumLayer, match.entries[1])
    weight = TorchParameter.from_binary(
        TorchMatMulParameter(dense1.weight.shape, dense2.weight.shape),
        dense1.weight,
        dense2.weight,
    )
    dense = TorchSumLayer(
        dense2.num_input_units,
        dense1.num_output_units,
        arity=dense2.arity,
        weight=weight,
        semiring=compiler.semiring,
    )
    return (dense,)

`apply_tensordot_tensordot(compiler, match)` ¤

Return the two dot layers that replace a dot layer parameterized by a Kronecker product.

Parameters:

Name	Type	Description	Default
`compiler`	`TorchCompiler`	The current compiler doing the optimization.	required
`match`	`LayerOptMatch`	The match to optimize.	required

Returns:

Type	Description
`tuple[TorchTensorDotLayer, TorchTensorDotLayer]`	The two dot layers that replace the original dot layer.

Source code in cirkit/backend/torch/optimization/layers.py

def apply_tensordot_tensordot(
    compiler: "TorchCompiler", match: LayerOptMatch
) -> tuple[TorchTensorDotLayer, TorchTensorDotLayer]:
    r"""Return the two dot layers that replace a dot layer parameterized by a Kronecker product.

    Args:
        compiler (TorchCompiler): The current compiler doing the optimization.
        match (LayerOptMatch): The match to optimize.

    Returns:
        tuple[TorchTensorDotLayer, TorchTensorDotLayer]: The two dot layers that
            replace the original dot layer.
    """

    tdot = cast(TorchTensorDotLayer, match.entries[0])
    weight_patterns = match.sub_entries[0]["weight"]
    kronecker = cast(TorchKroneckerParameter, weight_patterns[0].entries[0])
    return _apply_tensordot_rule(
        compiler, tdot.num_input_units, tdot.num_output_units, tdot.weight, kronecker
    )

`apply_tucker(compiler, match)` ¤

Create a Tucker layer that computes the sum of a Kronecker product.

This optimization rewrites the full operation as a single einsum, which avoids computing the intermediate tensor produced by the Kronecker product.

The output of the Kronecker product, which takes the vectors $x$ and $y$ of sizes $a$ and $b$ respectively (no batch or fold dimensions, for simplicity), can be written as the following einsum:

\[ a,b \rightarrow ab \]

We then flatten the output into a vector $z$ of size $i=a \times b$, which is used in the einsum for the sum. Given the parameter matrix $W$ of shape $(o,i)$, the sum $Wz$ is:

\[ i,oi \rightarrow o \]

Now let's reshape the tensors to re-introduce the $a$ and $b$ dimensions. The sum would be written as:

\[ ab, oab \rightarrow o \]

Finally, we can substitute the vectors $x$ and $y$ for the result of the Kronecker product:

\[ a,b,oab \rightarrow o \]

This avoids materializing the intermediate Kronecker product, and it is exactly what the Tucker layer computes.
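The einsum rewriting above can be reproduced step by step in NumPy. This is a minimal sketch with hypothetical sizes and no batch or fold dimensions, not the `TorchTuckerLayer` implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, o = 3, 4, 2                     # hypothetical sizes
x = rng.standard_normal(a)
y = rng.standard_normal(b)
W = rng.standard_normal((o, a * b))   # sum weight of shape (o, i), with i = a * b

# Step 1: Kronecker product of x and y, then flattened into z.
z = np.einsum("a,b->ab", x, y).reshape(a * b)

# Step 2: the sum layer applied to z.
two_step = np.einsum("i,oi->o", z, W)

# Fused: reshape W to (o, a, b) and contract with x and y directly,
# so the size-(a * b) intermediate z is never built.
fused = np.einsum("a,b,oab->o", x, y, W.reshape(o, a, b))

assert np.allclose(two_step, fused)
```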

Parameters:

Name	Type	Description	Default
`compiler`	`TorchCompiler`	The current compiler.	required
`match`	`LayerOptMatch`	The match to replace.	required

Returns:

Type	Description
`tuple[TorchTuckerLayer]`	The Tucker layer merging the two operations.

Source code in cirkit/backend/torch/optimization/layers.py

def apply_tucker(compiler: "TorchCompiler", match: LayerOptMatch) -> tuple[TorchTuckerLayer]:
    r"""Create a Tucker layer that computes the sum of a Kronecker product.

    This optimization rewrites the full operation as a single einsum,
    which avoids computing the intermediate tensor produced by the
    Kronecker product.

    The output of the Kronecker product, which takes the vectors $x$ and $y$ of sizes $a$
    and $b$ respectively (no batch or fold dimensions, for simplicity), can be written
    as the following einsum:

    $$
        a,b \rightarrow ab
    $$

    We then flatten the output into a vector $z$ of size $i=a \times b$,
    which is used in the einsum for the sum.
    Given the parameter matrix $W$ of shape $(o,i)$, the sum $Wz$ is:

    $$
        i,oi \rightarrow o
    $$

    Now let's reshape the tensors to re-introduce the $a$ and $b$ dimensions.
    The sum would be written as:

    $$
        ab, oab \rightarrow o
    $$

    Finally, we can substitute the vectors $x$ and $y$ for the result of
    the Kronecker product:

    $$
        a,b,oab \rightarrow o
    $$

    This avoids materializing the intermediate Kronecker product,
    and it is exactly what the Tucker layer computes.

    Args:
        compiler (TorchCompiler): The current compiler.
        match (LayerOptMatch): The match to replace.

    Returns:
       tuple[TorchTuckerLayer]: The tucker layer merging the two operations.
    """
    dense = cast(TorchSumLayer, match.entries[0])
    kronecker = cast(TorchKroneckerLayer, match.entries[1])
    tucker = TorchTuckerLayer(
        kronecker.num_input_units,
        dense.num_output_units,
        kronecker.arity,
        weight=dense.weight,
        semiring=compiler.semiring,
    )
    return (tucker,)
