| | Gemlite: Towards Building Custom Low-Bit Fused CUDA Kernels (mobiusml.github.io) |
| 47 points by un_ess on Aug 15, 2024 | past | 2 comments |
|
| | Aana SDK: open-source library for building multimodal AI applications (mobiusml.github.io) |
| 2 points by ibuildthings on July 17, 2024 | past | 1 comment |
|
| | Faster and Smaller Whisper: A Deep Dive into Quantization and Torch Compilation (mobiusml.github.io) |
| 3 points by freediver on June 4, 2024 | past |
|
| | Towards 1-bit Machine Learning Models (mobiusml.github.io) |
| 351 points by homarp on March 28, 2024 | past | 157 comments |
|
| | Half-Quadratic Quantization of Large Machine Learning Models (mobiusml.github.io) |
| 1 point by Jimmc414 on March 14, 2024 | past |
|
| | Half-Quadratic Quantization of Large Machine Learning Models (mobiusml.github.io) |
| 8 points by ibuildthings on Dec 7, 2023 | past | 1 comment |
|
| | Half-Quadratic Quantization of Large Machine Learning Models (mobiusml.github.io) |
| 2 points by mobicham on Dec 7, 2023 | past | 1 comment |
|
| | Low-Rank Pruning of Llama2 (mobiusml.github.io) |
| 2 points by ibuildthings on Nov 3, 2023 | past | 3 comments |
|