Flash Attention
This blog is about Flash Attention, an optimized way to compute attention in transformers.
Previously we worked with dot-product attention, which has O(N^2) complexity, that is, quadratic in the sequence length N. To overcome the complexity constrain...
dlwithkiran.dev · 4 min read