I guess I look forward to seeing 'TheBloke' do a finetune with vram reqs. If my ... | Hacker News


wing-_-nuts on April 5, 2024 | on: Qwen1.5-32B: Fitting the Capstone of the Qwen1.5 L...

I guess I look forward to seeing 'TheBloke' do a finetune with VRAM reqs. If my 3090 could fit a good quality quant I'd be interested. I tried to run a Mixtral with VRAM / sys RAM offloading and got garbage output, but that's probably down to some mistake I made.
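Whether a quant of a 32B model fits on a 24 GB card like the 3090 can be roughly estimated from the parameter count and the bits per weight. The sketch below is back-of-the-envelope arithmetic only: the 32e9 parameter count is an assumption read off the "32B" in the model name, the bit widths are nominal, and real quant files carry extra overhead (scales, embeddings kept at higher precision). The KV cache and activations also need room on top of the weights.

```python
def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


VRAM_GIB = 24.0   # RTX 3090
N_PARAMS = 32e9   # assumption: taken from "32B"; the exact count may differ

for bits in (16, 8, 5, 4):
    size = weight_gib(N_PARAMS, bits)
    verdict = "fits" if size < VRAM_GIB else "does not fit"
    print(f"{bits:>2} bits/weight: ~{size:.1f} GiB of weights, {verdict} in {VRAM_GIB:.0f} GiB")
```

By this estimate a quant at around 4 to 5 bits per weight leaves a few GiB of headroom for context, while 8-bit weights alone already exceed the card.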

tosh on April 5, 2024

GGUF: https://huggingface.co/Qwen/Qwen1.5-32B-Chat-GGUF

AWQ: https://huggingface.co/Qwen/Qwen1.5-32B-Chat-AWQ

kken on April 5, 2024

Try Ollama or LM Studio. Mixtral and its finetunes work perfectly for me on my RTX3090 with offloading.
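For offloading, the main thing to decide is how many layers go to the GPU, with the rest staying in system RAM. A minimal sketch of that arithmetic follows, assuming all layers are the same size (not strictly true in practice). The file size, layer count, and VRAM budget in the example call are hypothetical placeholders, not measured values for any particular Mixtral quant.

```python
import math


def gpu_layers(model_gib: float, n_layers: int, vram_budget_gib: float) -> int:
    """Whole layers that fit in the VRAM budget, assuming equal-sized layers."""
    per_layer_gib = model_gib / n_layers
    return min(n_layers, math.floor(vram_budget_gib / per_layer_gib))


# Hypothetical numbers: a 26 GiB quantized model with 32 layers, and 20 GiB of
# the card's 24 GiB set aside for weights (the rest left for context/KV cache).
print(gpu_layers(model_gib=26.0, n_layers=32, vram_budget_gib=20.0))
```

Tools that support partial offload expose a setting for this layer count; starting a little below the estimate avoids running out of memory once the context grows.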
