Vinayak Gautam in vinayakgautam01.hashnode.dev · Mar 9 · 10 min read
Why Your Parallel Code Might Be Stalling: 6 Surprising Insights from Parallel Histograms
GitHub repo: gpu-parallel-patterns. Colab: Colab Benchmark Histogram. GPU/Env: Tesla T4 / Driver 580.82.07 / CUDA 12.8. How to reproduce: scripts/bootstrap_colab.sh → scripts/tests.sh → scripts/bench_…

Vinayak Gautam in vinayakgautam01.hashnode.dev · Mar 7 · 16 min read
The Hidden Geometry of Code: Why Stencils Rule (and Break) High-Performance Computing
GitHub repo: gpu-parallel-patterns. Colab: Colab Benchmark Stencil. GPU/Env: Tesla T4 / Driver 580.82.07 / CUDA 12.8. How to reproduce: scripts/bootstrap_colab.sh → scripts/tests.sh → scripts/bench_st…

Vinayak Gautam in vinayakgautam01.hashnode.dev · Feb 28 · 20 min read
GPU Parallel Patterns: 2D Convolution on CUDA
GitHub repo: gpu-parallel-patterns. Colab: Colab Benchmark Convolution. GPU/Env: Tesla T4 / Driver 580.82.07 / CUDA 12.8. How to reproduce: scripts/bootstrap_colab.sh → scripts/test.sh → scripts/benc…