| | Anthropic Is Taking AI Welfare Seriously. I'm Not Sure It Knows What It's Measu (lesswrong.com) |
| 2 points by joozio 57 minutes ago | past | discuss |
|
| | Estimating No-Cot Task-Completion Time Horizons of Frontier AI Models (lesswrong.com) |
| 3 points by kqr 1 day ago | past | discuss |
|
| | Even "illegible" Mythos reasoning traces seem pretty legible (lesswrong.com) |
| 10 points by kqr 1 day ago | past | 1 comment |
|
| | Only Law Can Prevent Extinction (lesswrong.com) |
| 3 points by paulpauper 2 days ago | past | discuss |
|
| | Would you choose a simulated utopia or the real world? (lesswrong.com) |
| 3 points by paulpauper 3 days ago | past | discuss |
|
| | Language models manipulating their own internal states (lesswrong.com) |
| 2 points by afpx 3 days ago | past | discuss |
|
| | How far behind are open models? (lesswrong.com) |
| 2 points by gmays 4 days ago | past | discuss |
|
| | Bun's Migration from Zig to Rust as a Potential Case Study for Gradual Disempow (lesswrong.com) |
| 2 points by joozio 5 days ago | past | discuss |
|
| | Logits as a new monitor for evaluation awareness (lesswrong.com) |
| 2 points by aranguri 9 days ago | past | discuss |
|
| | Running an Air Purifier on Batteries (lesswrong.com) |
| 2 points by mhb 10 days ago | past | discuss |
|
| | Babble and Prune (lesswrong.com) |
| 4 points by Ariarule 10 days ago | past | discuss |
|
| | There are only four skills: design, technical, management and physical (lesswrong.com) |
| 3 points by surprisetalk 10 days ago | past | discuss |
|
| | Where does the race to automate AI research end? (lesswrong.com) |
| 1 point by joozio 11 days ago | past | discuss |
|
| | Taking the Training Wheels Off: Aligning LLMs Without Personas (lesswrong.com) |
| 4 points by joozio 12 days ago | past | 1 comment |
|
| | I hired 5 people to sit behind me and make me productive for a month (2023) (lesswrong.com) |
| 6 points by LorenDB 12 days ago | past | 1 comment |
|
| | Why AI safety researchers should consider a contract research manager position (lesswrong.com) |
| 4 points by joozio 13 days ago | past | discuss |
|
| | How far behind are open models? (lesswrong.com) |
| 5 points by vesteny77 13 days ago | past | 1 comment |
|
| | Probabilistic, Reformative Justice (lesswrong.com) |
| 9 points by mdurana 14 days ago | past |
|
| | AI Researchers, Ask Yourself These 6 Questions to Strengthen Your Moral Muscles (lesswrong.com) |
| 2 points by yurivish 14 days ago | past | 1 comment |
|
| | Mnemonic portraits for 19,023 human genes (lesswrong.com) |
| 1 point by brinedew 16 days ago | past |
|
| | How far behind are open models? (lesswrong.com) |
| 11 points by alecco 16 days ago | past | 5 comments |
|
| | A Year Late, Claude Beats Pokémon (lesswrong.com) |
| 1 point by szatkus 17 days ago | past |
|
| | Many portions of Magnifica Humanitas appear to be AI-written (lesswrong.com) |
| 3 points by dev_hugepages 17 days ago | past | 1 comment |
|
| | Claude, Author of the Humanitas (lesswrong.com) |
| 1 point by doener 18 days ago | past |
|
| | Overview and Comments on Pope Leo's Magnifica Humanitas on AI (lesswrong.com) |
| 2 points by mnicky 18 days ago | past | 1 comment |
|
| | Claude, Author of the Humanitas (lesswrong.com) |
| 2 points by cubefox 18 days ago | past | 1 comment |
|
| | Judging AGI Output (2020) (lesswrong.com) |
| 2 points by merelydev 18 days ago | past |
|
| | Chinese Room re-visited: How LLM's have real but different understanding of word (lesswrong.com) |
| 3 points by stevefan1999 18 days ago | past | 1 comment |
|
| | Cognitive Security as an AI Safety Cause Area (lesswrong.com) |
| 2 points by joozio 19 days ago | past |
|
| | Implications of Predicting the Next Token (lesswrong.com) |
| 3 points by cubefox 19 days ago | past | 1 comment |
|