Tag feed

#hpc

31 posts · 8 followers

WingEdge777 · wingedge777.hashnode.dev · May 29 · 27 min read

[CUDA in Practice] SGEMM — Beating cuBLAS: A Deep Dive into Peak-Performance Matrix Multiplication in Pure CUDA C++

0. Preface: The Last Stand of Scalar Compute. Warning: Extremely dense content ahead, with many diagrams, heavy bit-manipulation, and memory-mapping derivations. Best read on a PC. Target audience: R

Ece Özen İldem · codebeyondtheearth.hashnode.dev · Mar 17 · 13 min read

Is Python Really That Slow… or Just a Lazy Snake? - CERN School of Computing Diaries!

Hsss... Hello again, Earthlingsss! Tonight, I feel like I’m speaking to you from the corridors of Slytherin. I’ve been using Python for nearly 10 years (woah!), and I’ve never really worried about its

Daniel Alves Rosel · blog.alves.world · Mar 15 · 8 min read

TPUs for Reinforcement Learning and Behavioral Modeling

Given that you have created a TPU deployment on GCP, you are now probably trying to figure out how to run your carefully crafted training on this compute. This overview will highlight my learning from

Vinayak Gautam · vinayakgautam01.hashnode.dev · Mar 9 · 10 min read

Why Your Parallel Code Might Be Stalling: 6 Surprising Insights from Parallel Histograms

GitHub repo: gpu-parallel-patterns · Colab: Colab Benchmark Histogram · GPU/Env: Tesla T4 / Driver 580.82.07 / CUDA 12.8 · How to reproduce: scripts/bootstrap_colab.sh → scripts/tests.sh → scripts/bench_

Vinayak Gautam · vinayakgautam01.hashnode.dev · Mar 7 · 16 min read

The Hidden Geometry of Code: Why Stencils Rule (and Break) High-Performance Computing

GitHub repo: gpu-parallel-patterns · Colab: Colab Benchmark Stencil · GPU/Env: Tesla T4 / Driver 580.82.07 / CUDA 12.8 · How to reproduce: scripts/bootstrap_colab.sh → scripts/tests.sh → scripts/bench_st

Srilakshmi Sripathi · remomentum-labs.hashnode.dev · Mar 4 · 3 min read

HelixScale: HPC-Optimized BioNemo Orchestrator

"In the high-stakes world of Drug Discovery, the bottleneck isn't just the science—it's the infrastructure. During my technical sabbatical, I've focused on mastering the 'Substrate Gap': the friction

Vinayak Gautam · vinayakgautam01.hashnode.dev · Feb 28 · 20 min read

GPU Parallel Patterns: 2D Convolution on CUDA

GitHub repo: gpu-parallel-patterns · Colab: Colab Benchmark Convolution · GPU/Env: Tesla T4 / Driver 580.82.07 / CUDA 12.8 · How to reproduce: scripts/bootstrap_colab.sh → scripts/test.sh → scripts/benc

Amir Reza Dalir · underthehood.hashnode.dev · Feb 26 · 3 min read

Docker vs Singularity: What Changes When You Move to HPC

If you've spent years with Docker and suddenly land on an HPC cluster that runs SingularityCE, the transition is smoother than you'd expect. Here's a condensed comparison covering the key differences.

Jiajun Wang (Jesse) · jiajun.de · Feb 12 · 5 min read

The Ultimate Beginner’s Guide to FAU HPC: From Zero to A100

So, you’ve received an invitation to use the High-Performance Computing (HPC) cluster at FAU (likely a Tier3 project). You want to run Deep Learning, VLM, or RL experiments, but you are staring at a black terminal screen and don't know where to start...

Billion Bison HK · billionbison.hashnode.dev · Jan 20 · 1 min read

NVIDIA Blackwell AI Servers & GPU Solutions – Powered by Billion Bison

Start your journey into next-generation computing with NVIDIA Blackwell architecture, designed for advanced AI training, machine learning, data centers, and high-performance computing (HPC). Billion Bison delivers enterprise-grade NVIDIA Blackwell GP...
