Hacker Times

Situational awareness or just remembering specific tokens related to the strategy to "play dead" in its reasoning traces?


Imagine: an LLM trained on the best thrillers, spy stories, politics, history, manipulation techniques, psychology, sociology, sci-fi... I wonder where it got the idea for deception?



