Surely you’d need more ternary weights, though, to achieve the same performance?
A bit like how a Q4 quant is smaller than a Q8 but also tangibly worse, so the “compression” isn’t really like for like.
Either way, excited about more ternary progress.