Mostly because they chose the wrong backend technology (which they have been doing repeatedly since their early days).
The right way to solve twitter would be to have 140-byte tweets sorted by a <userid,time> 64-bit key, with a few more attributes (all falls into 256-bytes neatly), shard them across servers and keep everything recent in memory.
Logging into a server would fetch the list of following to the front end server, broadcast the request to all tweet servers, wait 50ms or so for responses, merge, sort and HTML format them.
The front end servers would not need any memory or disk (could be an army of $500 servers behind a load balance, or a few beefy ones). The backend servers would have to have some beefy CPU and memory, but still ultra commodity (256 bytes/tween means 1GB=4M tweets, so one 64GB server=256M tweets). Shard for latency, redundancy, etc. Also, special case the Gagas/Kutchers of this world by giving them their own server, and/or have them broadcast to and cache their tweets in the front end servers (Spend 256MB memory on tweet cache in the front end servers, and you get 1M cached tweets - which would cover all of the popular people and then some).
> It's hard to make that realtime, because you'd need to some sort of broadcast every second or at least every few seconds.
Twitter isn't realtime either - they say they don't succeed to stay within the 5 seconds all the time, and when Gaga tweets it takes up to 5 minutes.
Furthermore, I was talking about broadcasting a request for updates on demand when needed. PGM/UDP can blast through hundreds of megabytes per second on gigabit connection. That's quite easy. 22MB/sec is nothing, even 100MB/sec is not much these days (though you might have to bond/team to make that reliable)
> It's hard to follow a lot of people (you could be getting a large number of replies), so there would need be a follow limit.
Not at all. Make the replies (at most) 5 tweets from each person you follow, with a "and there's more ..." flag in the reply, have the front end ask for more if it makes sense once the 50ms is done.
It's ok if the 300 people who follow one million people take 200ms instead of 50ms to get a reply. And if you want to make it quicker for them, have these ones (and only these ones) on a "push" rather than "pull" model. The vast majority of people follow less than 50 people, perhaps less than 20.
> Facebook has a follow limit and people don't really expect Facebook to be realtime - the central bit is not really.
People do expect facebook to be realtime, it mostly delivers (better than twitter), their limits are not hard (I know people who asked and got them lifted within a few minutes).
> Also, there is no one right way to do these things, in my view.
No, but there's a lot of wrong ways, and twitter keeps choosing among them.
The right way to solve twitter would be to have 140-byte tweets sorted by a <userid,time> 64-bit key, with a few more attributes (all falls into 256-bytes neatly), shard them across servers and keep everything recent in memory.
Logging into a server would fetch the list of following to the front end server, broadcast the request to all tweet servers, wait 50ms or so for responses, merge, sort and HTML format them.
The front end servers would not need any memory or disk (could be an army of $500 servers behind a load balance, or a few beefy ones). The backend servers would have to have some beefy CPU and memory, but still ultra commodity (256 bytes/tween means 1GB=4M tweets, so one 64GB server=256M tweets). Shard for latency, redundancy, etc. Also, special case the Gagas/Kutchers of this world by giving them their own server, and/or have them broadcast to and cache their tweets in the front end servers (Spend 256MB memory on tweet cache in the front end servers, and you get 1M cached tweets - which would cover all of the popular people and then some).
Network broadcast was invented for a reason.