1) Use a single enterprise GPU (H100‑80GB) with MXFP4 for the best $/throughput

GPT‑OSS‑120B ships with 4‑bit MXFP4 weights and a Mixture‑of‑Experts (MoE) design that activates only ~5.1B parameters per token, enabling efficient inference on a single 80 GB data‑center GPU.
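A quick back-of-envelope check shows why the weights fit: MXFP4 stores 4-bit elements plus one shared 8-bit scale per 32-element block, i.e. about 4.25 bits per parameter. A rough sketch (the ~120B total-parameter figure is the headline count, not an exact value):

```python
# Rough weight-memory estimate for GPT-OSS-120B in MXFP4.
# Assumptions: ~120e9 total parameters (approximate headline figure),
# and the MXFP4 layout of 4-bit elements + one 8-bit scale per 32-block.

PARAMS = 120e9          # total parameters (approximate)
GPU_GB = 80             # H100-80GB capacity

# MXFP4: 4-bit FP4 element + shared 8-bit scale per 32-element block
# -> 4 + 8/32 = 4.25 bits per parameter on average.
BITS_PER_PARAM = 4 + 8 / 32

weights_gb = PARAMS * BITS_PER_PARAM / 8 / 1e9
headroom_gb = GPU_GB - weights_gb

print(f"Weights: ~{weights_gb:.1f} GB of {GPU_GB} GB")
print(f"Headroom for KV cache / activations: ~{headroom_gb:.1f} GB")
```

Roughly 64 GB of weights, leaving on the order of 16 GB for KV cache and activations, which is why a single 80 GB card is enough where an unquantized 120B model would need multiple GPUs.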