| | Making FlashAttention-4 faster for inference (modal.com) |
| 3 points by birdculture 13 hours ago | past | discuss |
|
| | Making FlashAttention-4 faster for inference (modal.com) |
| 2 points by matt_d 1 day ago | past | discuss |
|
| | Modal Major Outage (modal.com) |
| 4 points by hunkins 9 days ago | past | 2 comments |
|
| | Modal's Series C: Raising $355M at a $4.65B valuation (modal.com) |
| 2 points by yla92 19 days ago | past |
|
| | Cutting inference cold starts by 40x with LP, FUSE, C/R, and CUDA-checkpoint (modal.com) |
| 91 points by charles_irl 25 days ago | past | 18 comments |
|
| | How to Achieve Truly Serverless GPUs (modal.com) |
| 1 point by birdculture 27 days ago | past |
|
| | How to Achieve Serverless GPUs (modal.com) |
| 4 points by gmays 28 days ago | past |
|
| | How to Achieve Truly Serverless GPUs (modal.com) |
| 2 points by birdculture 29 days ago | past |
|
| | How to Achieve Truly Serverless GPUs (modal.com) |
| 3 points by birdculture 30 days ago | past |
|
| | How to Achieve Serverless GPUs (modal.com) |
| 8 points by charles_irl 31 days ago | past |
|
| | Boosting multimodal inference performance by >10% with a single Python dict (modal.com) |
| 16 points by jxmorris12 37 days ago | past |
|
| | Accelerating AI research that accelerates AI research (modal.com) |
| 3 points by tosh 3 months ago | past |
|
| | Agents need good developer experience too (modal.com) |
| 2 points by birdculture 4 months ago | past |
|
| | Agents need good developer experience too (modal.com) |
| 1 point by birdculture 4 months ago | past |
|
| | Three types of LLM workloads and how to serve them (modal.com) |
| 75 points by charles_irl 4 months ago | past | 5 comments |
|
| | LLM architecture has evolved from GPT-2 to GPT-OSS (2025) (modal.com) |
| 2 points by jxmorris12 4 months ago | past |
|
| | Keeping 20k GPUs Healthy (modal.com) |
| 3 points by aburan28 4 months ago | past |
|
| | Keeping 20k GPUs healthy (modal.com) |
| 134 points by jxmorris12 4 months ago | past | 62 comments |
|
| | High-Performance LLM Inference (modal.com) |
| 1 point by birdculture 4 months ago | past |
|
| | Keeping 20k GPUs Healthy (modal.com) |
| 2 points by MasterScrat 5 months ago | past |
|
| | Keeping 20k GPUs Healthy (modal.com) |
| 3 points by birdculture 5 months ago | past |
|
| | Keeping 20,000 GPUs Healthy (modal.com) |
| 1 point by susam 5 months ago | past |
|
| | Keeping 20k GPUs Healthy (modal.com) |
| 3 points by birdculture 5 months ago | past |
|
| | Keeping 20k GPUs Healthy (modal.com) |
| 3 points by birdculture 5 months ago | past | 1 comment |
|
| | What Is Arithmetic Bandwidth? (modal.com) |
| 2 points by jxmorris12 5 months ago | past |
|
| | Keeping 10k GPUs Healthy (modal.com) |
| 2 points by birdculture 5 months ago | past | 1 comment |
|
| | GPU memory snapshots: sub-second startup (2025) (modal.com) |
| 27 points by jxmorris12 5 months ago | past | 13 comments |
|
| | Sandboxed Claude Code GIF Creator (modal.com) |
| 1 point by birdculture 5 months ago | past |
|
| | Agents need good developer experience too (modal.com) |
| 1 point by birdculture 5 months ago | past |
|
| | Host overhead is killing your inference efficiency (modal.com) |
| 3 points by birdculture 5 months ago | past |