This blog is about Flash Attention, an optimized way to build transformers. Previously, we worked with dot-product attention, which has a complexity of O(N^2), that is, quadratic in the sequence length N. To overcome the complexity constrain...
dlwithkiran.dev · 4 min read
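
To make the quadratic cost of dot-product attention concrete, here is a minimal NumPy sketch of the naive single-head version. It is not Flash Attention and not code from this post. The function name `dot_product_attention`, the shapes, and the random inputs are assumptions chosen for illustration. The point is only that the intermediate score matrix has shape (N, N), so memory and compute grow with N^2.

```python
import numpy as np

def dot_product_attention(q, k, v):
    """Naive scaled dot-product attention for one head (illustrative sketch).

    q, k, v are arrays of shape (N, d), where N is the sequence length.
    Returns the attention output and the shape of the score matrix.
    """
    d = q.shape[-1]
    # Every query is compared with every key: this is the (N, N) matrix
    # responsible for the quadratic cost.
    scores = q @ k.T / np.sqrt(d)
    # Row-wise softmax, shifted by the row maximum for numerical stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, scores.shape

# Hypothetical sizes, picked only to keep the example small.
rng = np.random.default_rng(0)
N, d = 8, 4
q, k, v = (rng.standard_normal((N, d)) for _ in range(3))
out, score_shape = dot_product_attention(q, k, v)
```

Doubling N in this sketch quadruples the number of entries in `scores`, which is the bottleneck the post refers to.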