Approximating Faster Transformers
Optimizing matrix multiplication is a problem as old as time. The product of two matrices \(\mathbf{A}\) and \(\mathbf{B}\) can be viewed as a collection of inner products: each entry is the inner product of a row of \(\mathbf{A}\) with a column of \(\mathbf{B}\).
shreshtha.hashnode.dev · 27 min read
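To make the row-by-column view of the product concrete, here is a minimal sketch in plain Python. The helper names `dot` and `matmul` and the example matrices are illustrative assumptions, not code from the article, and the loop is meant to show the definition rather than to be fast.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(u, v))


def matmul(A, B):
    """Multiply A (n x k) by B (k x m) using row-by-column inner products.

    Hypothetical helper: matrices are lists of rows (lists of numbers).
    """
    if not A or not B or len(A[0]) != len(B):
        raise ValueError("inner dimensions of A and B must match")
    # zip(*B) transposes B, so each item is one column of B.
    columns_of_B = list(zip(*B))
    # Entry (i, j) of the result is <row i of A, column j of B>.
    return [[dot(row, col) for col in columns_of_B] for row in A]


# Illustrative 2 x 2 example (values chosen arbitrarily).
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = matmul(A, B)
print(C)  # [[19, 22], [43, 50]]
```

Written this way, an n x k by k x m product takes n * m inner products of length k, so n * m * k scalar multiplications in total. That count is the cost that faster or approximate schemes try to reduce.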