
Or maybe they don’t actually cache (fully), but lie and simply don’t charge the user for now. At least half the users, who are probably also the ones sending the most similar tokens/prompts, wouldn’t notice the difference in latency (or care).



If it actually cost that much RAM, they would almost certainly add extra controls to the API for managing cache lifetime, e.g. a 'please cache this for X minutes' flag, or a setting for a single-reuse cache (the most common use case).
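To make the two suggested knobs concrete, here is a rough sketch of what a cache with a caller-chosen lifetime and a single-reuse mode could look like. Everything in it is hypothetical: `PrefixCache`, `ttl_seconds`, and `single_use` are names made up for illustration and do not correspond to any real provider's API.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Entry:
    value: str          # stand-in for the cached prefix state
    expires_at: float   # clock reading at which the entry becomes stale
    single_use: bool    # evict after the first successful read


class PrefixCache:
    """Hypothetical prompt-prefix cache with caller-controlled lifetime."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Entry] = {}

    def put(self, key: str, value: str, ttl_seconds: float,
            single_use: bool = False) -> None:
        # "Please cache this for X seconds", optionally for one re-use only.
        self._entries[key] = Entry(value, self._clock() + ttl_seconds, single_use)

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        if self._clock() >= entry.expires_at:
            # Lifetime elapsed: free the memory and report a miss.
            del self._entries[key]
            return None
        if entry.single_use:
            # Single re-use: free the memory as soon as it has been read once.
            del self._entries[key]
        return entry.value
```

The point of both options is the same: the caller tells the provider when the memory can be released, instead of the provider guessing.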



