We usually call them "blinker tests" in our integration test suite. The reasons for blinker tests vary, but most are in line with what others here have already said: concurrency, especially correctly synchronizing test execution with whatever is happening in the asynchronous parts of the (distributed) system under test, is by far the biggest cause of problematic tests. It is often exacerbated by the difference in concurrent execution between developer machines with maybe 4-6 cores and the CI server with 50-80, which leads to "blinking" behavior that never shows up locally but appears every few builds on the CI server.
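To make the synchronization point concrete, here is a minimal sketch of one common fix: polling for the expected state with a timeout instead of sleeping for a fixed time and hoping the async part has finished. This is not our actual test code; `wait_until`, `FakeAsyncService` and the delays are all made up for illustration.

```python
import threading
import time


def wait_until(condition, timeout=5.0, interval=0.01):
    """Poll `condition` until it is truthy or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    # One last check, in case the condition became true right at the deadline.
    return bool(condition())


class FakeAsyncService:
    """Hypothetical stand-in for an asynchronous part of the system under test."""

    def __init__(self, delay):
        self.delay = delay
        self.processed = []
        self._lock = threading.Lock()

    def submit(self, item):
        # The item is processed on a background thread after `delay` seconds,
        # so the caller cannot know exactly when the result becomes visible.
        def work():
            time.sleep(self.delay)
            with self._lock:
                self.processed.append(item)

        threading.Thread(target=work, daemon=True).start()

    def has_processed(self, item):
        with self._lock:
            return item in self.processed


def test_item_is_eventually_processed():
    service = FakeAsyncService(delay=0.05)
    service.submit("order-1")
    # Instead of time.sleep(0.1) followed by an assert, wait for the state itself.
    assert wait_until(lambda: service.has_processed("order-1"), timeout=2.0)
```

The timeout can be generous, because it only costs time when the test is going to fail anyway; a fixed sleep either slows down every run or is too short on some machine.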
The second biggest cause is database transaction management and incorrect assumptions about when database changes become visible to other processes (which are in a way also concurrency problems, so it basically comes down to the same thing). The third biggest is unintentional nondeterminism in the software, like people assuming that a certain collection implementation has a deterministic order when it actually doesn't; someone was just lucky to get the same order every time while testing on the dev machine.