Unlocking Microsecond-Scale Latency: A Deep Dive into IMEX for Multi-GPU Inference
Nov 30, 2025 · 5 min read

Introduction

In the era of trillion-parameter models, the bottleneck for Large Language Model (LLM) inference is rarely raw compute capability alone. As we scale across multiple GPUs using Tensor Parallelism (TP), the dominant latency factor shifts toward inter-GPU communication.