
I guess I look forward to seeing 'TheBloke' do a finetune with VRAM requirements listed. If my 3090 could fit a good-quality quant I'd be interested. I tried to run a Mixtral model with VRAM / system RAM offloading and got garbage output, but that's probably down to some mistake I made.



Try Ollama or LM Studio. Mixtral and its finetunes work perfectly for me on my RTX 3090 with offloading.



