@mauriziomorri
Biology, AI, Molecules and Machines
Nothing here yet.
Mar 10 · 3 min read · Most performance talk about large language models still fixates on raw compute, but long-context serving is usually a memory problem first. During decoding, the model must reuse the key-value cache fo
Feb 20 · 3 min read · If you run CI/CD the same way you did a few years ago, you are probably shipping faster than ever and also trusting more invisible machinery than ever. The pipeline is now part of your threat model. T