@m-kim

Mark Kim

tech

Joined November 2021

Mark Kim's blogs

It needs to be written down · m-kim.hashnode.dev · 59 posts

Recently published

Mark Kim · m-kim.hashnode.dev · Jun 1 · 5 min read

Running an LLM at home

How I learned to stop worrying and love LLMs. About a month ago, I found myself with a weekend to myself and I thought "Hmmm, it's been a while since I tried to run an LLM at home." It's been so long t...

Mark Kim · m-kim.hashnode.dev · Jan 10, 2024 · 1 min read

atomics, AMD Radeon and SYCL (Part 4)

Looking closely at the atomics used in the backwards rasterizer, it appears that the computation only occurs across a block. The atomics could be replaced with a reduction! So, I tried writing it in SYCL and there's warp stalling and it crashes the c...

Mark Kim · m-kim.hashnode.dev · Dec 28, 2023 · 1 min read

atomics, AMD Radeon and SYCL (part 3)

Well, I tried to use AdaptiveCpp, but there's some weirdness about memcpy and memset going on. And, it looks like I'm not the only one to notice that atomics are terrible. Here's something that just came up five days ago that's exactly what I'm strug...

Mark Kim · m-kim.hashnode.dev · Dec 22, 2023 · 1 min read

atomics, AMD Radeon and SYCL (Part 2)

I couldn't get AdaptiveCpp to compile the project. It segfaulted. Here are some resources for inlining assembly into SYCL for HIP/ROCm. Atomic performance issues in AdaptiveCpp. Dumping (or at least trying to) IR from icpx. HIP Clang inline assemb...

Mark Kim · m-kim.hashnode.dev · Dec 21, 2023 · 1 min read

atomics, AMD Radeon and SYCL

Performance is atrocious for atomics on the 6600XT, specifically with SYCL. There's a thread from the GROMACS developers on why they are using AdaptiveCpp. And a thread from the IntelLLVM team. How bad is it for Gaussian Splatting? 100 iterations of ~30k par...
