I guess I look forward to seeing 'TheBloke' do a finetune with VRAM requirements listed. If my 3090 could fit a good-quality quant, I'd be interested. I tried running a Mixtral model with VRAM / system RAM offloading and got garbage output, but that's probably down to some mistake I made.