
Because a city/region/state can be uniquely identified with a postal code (hell, in Ireland, the entire address is encapsulated in the postal code), but the reverse is not true.

At scale, repeated low-cardinality columns matter a great deal.



There are ZIP codes that cover both a city and an unincorporated area. Furthermore, there are ZIP codes that span more than one state. A data model that renders these unrepresentable may come back to bite you.
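A minimal sketch of a model that keeps those cases representable, assuming a ZIP code maps to a set of (city, state) pairs rather than to a single one. The rows and the names `zip_to_places` and `states_for` are made up for illustration.

```python
from collections import defaultdict

# Hypothetical rows: (zip_code, city, state). All values are invented.
rows = [
    ("00001", "Springfield", "AA"),
    ("00001", None, "AA"),          # unincorporated area: no city
    ("00002", "Border Town", "AA"),
    ("00002", "Border Town", "BB"),  # same ZIP, second state
]

# One ZIP code maps to a set of (city, state) pairs, not to a single pair.
zip_to_places = defaultdict(set)
for zip_code, city, state in rows:
    zip_to_places[zip_code].add((city, state))


def states_for(zip_code):
    """Return every state a ZIP code touches, sorted; empty if unknown."""
    return sorted({state for _, state in zip_to_places.get(zip_code, set())})


print(states_for("00002"))
```

Anything downstream that wants "the" state for a ZIP code then has to pick one explicitly, which is where the ambiguity belongs.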


This assumption got me in trouble as a junior analyst years ago. I was asked to analyze our customer base and wrote something like the below. Management congratulated me on finding thousands more customers than we'd ever had before.

SELECT zipcode.rural_urban_code, COUNT(*) AS n_customer FROM customer INNER JOIN zipcode USING(zipcode) GROUP BY 1;
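To make the failure concrete, here is a small reproduction using SQLite from Python. It assumes the `zipcode` lookup table had more than one row per ZIP code (for example one per state), which is my guess at what went wrong. The `customer_id` and `state` columns and all the data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, zipcode TEXT);
-- Assumed shape: one row per (zipcode, state), so a ZIP code that
-- spans two states appears twice.
CREATE TABLE zipcode (zipcode TEXT, state TEXT, rural_urban_code TEXT);
INSERT INTO customer VALUES (1, '00001'), (2, '00001'), (3, '00002');
INSERT INTO zipcode VALUES
  ('00001', 'AA', 'rural'),
  ('00001', 'BB', 'rural'),
  ('00002', 'AA', 'urban');
""")

# The query from the comment: each customer is counted once per matching
# zipcode row, so customers in the two-state ZIP code are counted twice.
naive = conn.execute("""
SELECT zipcode.rural_urban_code, COUNT(*) AS n_customer
FROM customer INNER JOIN zipcode USING(zipcode)
GROUP BY 1 ORDER BY 1
""").fetchall()

# One possible fix: count distinct customers within each group.
deduped = conn.execute("""
SELECT zipcode.rural_urban_code,
       COUNT(DISTINCT customer.customer_id) AS n_customer
FROM customer INNER JOIN zipcode USING(zipcode)
GROUP BY 1 ORDER BY 1
""").fetchall()

print(naive)    # [('rural', 4), ('urban', 1)]
print(deduped)  # [('rural', 2), ('urban', 1)]
```

Three customers turn into five in the naive version. Note that `COUNT(DISTINCT ...)` only removes duplicates within a group; if the rows for one ZIP code carried different `rural_urban_code` values, a customer would still show up in more than one group.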


FYI, this is not true in the US. ZIP codes identify postal routes, not locations.


Saying ZIP codes uniquely identify a city/state/region is like saying "John" uniquely identifies a human :)


EDIT: TIL that there are cross-state ZIP codes.


Assumptions like these are almost never true in the real world.



