Bringing Up DeepSeek-V4-Flash on AMD MI300X

116 points by kkm 21 days ago on hackernews | 20 comments

Nice work! Would DeepSeek V4 Pro on 8xMI300X work with these patches?

Also, the vLLM patch accompanying the blog post: https://github.com/doublewordai/vllm-amd-blog-doubleword

We at doubleword are bullish on AMD for low-interactivity inference. It just takes a bigger lift on the software side...

brcmthrowaway | 21 days ago

Are you long AMD?

latchkey | 21 days ago

Interesting that you ask that as AMD hits another ATH.

brcmthrowaway | 20 days ago

Then you are definitely long on AMD.

latchkey | 20 days ago

More accurately... I'm long on a viable alternative to the current monopoly. We have two OSes for phones (Android and iOS); there is no reason why we shouldn't have the same for all AI hardware and software. The only one even close is AMD.

boxking | 20 days ago

Hello sir, I want to bulk order the ASRock BC-250. Is it still available?

latchkey | 20 days ago

lol find the discord!

boxking | 20 days ago

Yes sir, is there any possibility of finding 1000 pcs or more?

latchkey | 20 days ago

They are all over eBay. https://www.ebay.com/sch/i.html?_nkw=bc-250

I'm super curious what you would use them for.

boxking | 20 days ago

Many people want to buy this now. I am a reseller of the AMD BC-250, and it is popular right now.

boxking | 20 days ago

Thank you, sir. Actually, it is not easy to find on eBay: the sellers are small, so it is hard to find hundreds of units, and eBay prices are a bit high for me.

maCDzP | 21 days ago

I train on AMD MI250X and managed to get Gemma 4 31B to work - but it took a lot of work on the software side.

[OP] kkm | 21 days ago

This is very interesting. Are you planning to write about it?

latchkey | 21 days ago

Nice work and thanks for being a customer.

(CEO Hot Aisle)

erichocean | 20 days ago

I wish you guys could partner with Modular to get Mojo inference working on your hardware, e.g. https://www.modular.com/models/deepseek-v4-pro

latchkey | 20 days ago

Not sure I understand. If they support MI300X, their self-hosted option will run on our hardware.

edg5000 | 20 days ago

Checked out this company about a year ago and they only offered small models. Now I see they have GLM-fp8/Kimi and DeepSeek V4 Pro. Since workloads are predominantly cached input, I'm surprised to see no separate price for cached input vs uncached. I hope the prices will drop significantly; with these prices you'll end up with thousands in monthly costs quickly. Hopefully more hardware companies will be on the market in the coming years. If the Chinese eventually start competing with the current memory makers, maybe that will help.

alfiedotwtf | 20 days ago

It’s just weird that DeepSeek released a model that was not compatible with any of the usual engines. Without derez’s new project just to support DSv4, how long until it’s actually viable in llama? :(