Building a FlashAttention Hardware Tile with Verilator: Exploring ML Accelerator Microarchitectures

Transformers have become the backbone of modern AI models, powering everything from GPT-5 to Gemini and DeepSeek. But at the hardware level, their attention layers are still a challenge — dense, memor