gemm_block_simd

Function gemm_block_simd 

Source
pub fn gemm_block_simd(a: MatrixBlock, b: MatrixBlock, c: &mut MatrixBlock)
Expand description

Scalar (non-SIMD) implementation of 64 x 64 block GEMM over F_2.

Computes C = A * B + C where all arithmetic is in F_2 (XOR for addition, AND for multiplication).

§Algorithm

For each row i of A:
  For each bit position k in A[i]:
    If A[i][k] == 1:
      C[i] ^= B[k]  (XOR row k of B into row i of C)

Note that this is not quite the standard algorithm for matrix multiplication. The standard algorithm would require us to iterate over a column of A for every output bit. The three loops in the algorithm are independent, so we can instead iterate over outputs and then move down the columns of A. This way, we only need to consider one limb of A at a time, and we don’t need to do bit extractions (except for iterating over the bits of a limb).