One thing on my "bucket list" is to "use a bloom filter for something". They seem like such awesome data structures but I've never found a place to use one :-(
Chrome uses them for the "safe browsing" filter. Google HQ makes a bloom filter out of all the blocked websites and sends out the filter occasionally in an update. That way Chrome can test the URL you're about to visit for membership in the huge blacklist without taking much network/disk/memory/CPU. Of course there could be false positives but Chrome only has to phone home to double-check on possible matches. http://blog.alexyakunin.com/2010/03/nice-bloom-filter-applic...
This is exactly the scenario that a bloom filter is for.
You have an expensive lookup. You're caching information on success/fail so that you don't have to do the expensive lookup every time. But the caches are getting large.
What you do is replace the local caches with a bloomfilter. That data structure takes a bounded amount of memory. When it says, "No, I have not seen you before," you really haven't. And when it says, "I might recognize you," it is only sometimes right. However its mistakes will not really matter because you'll do the expensive lookup.
The tradeoff is that the more data you put into a bloom filter, the higher the odds are that it will think think you might have seen things before, and therefore the less useful it becomes. But in this caching situation, it saves you work even if the false positive rate is fairly high.