If your app depends on lots of them, it's only going to run as fast as the slowest dependency. A 1% chance of poor performance isn't too bad, but the joint distribution of 20 microservices, each with a 1% chance, gets pretty ugly. In the normal case everything is great, but the failure modes of each service become a much bigger deal.
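A quick back-of-the-envelope sketch of that joint distribution (assuming the dependencies fail independently, which is optimistic):

```python
# Probability that at least one of N independent dependencies is slow,
# given each has a 1% chance of poor performance per request.
def p_any_slow(n, p_slow=0.01):
    return 1 - (1 - p_slow) ** n

print(round(p_any_slow(1), 3))   # 0.01  -- one dependency: barely noticeable
print(round(p_any_slow(20), 3))  # 0.182 -- 20 dependencies: ~18% of requests hit a slow path
```

So a per-service 99% isn't 99% end-to-end; the tail risk compounds with fan-out.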
It's a great architecture, but fan out of dependencies is a real risk.
Scaling, at least, is solved, because you can scale each service differently. Too many DB queries from your messaging service? Upgrade the DB, add another read slave. Too much CPU load on your image processing service? Add more image processing nodes.
Breaking your system out into multiple dependencies means you can scale not just your infrastructure as a whole, but individual parts of it based on demand, bottlenecks, usage, etc.
Netflix has talked in the past about how, because their systems are broken apart, they don't have to deal with these issues. Rating service having problems? Don't show user ratings. Search service offline for updates? Disable search. If Netflix were one giant (Rails? Django? Node?) app, it would be very difficult to cut out poorly-performing parts temporarily.
> Rating service having problems? Don't show user ratings. Search service offline for updates? Disable search
As an example of a (probably?) bad way to organize services, I worked on a project that had factored a role-based access control system into its own service. Every single web request hit this service, which made it a single point of failure, performance critical, impossible to temporarily disable, etc.
One alternative to centralized role servers is to use client certificates. I've used x509 certs for this purpose. They are pretty hairy, but so is rolling your own authentication/authorization/token system.
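As a sketch of the client-certificate approach using Python's standard `ssl` module (the file paths are placeholders, not from any real deployment):

```python
import ssl

# Server-side TLS context that demands a client certificate on every
# connection, so identity is established in the handshake itself rather
# than by a round trip to a central role service.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
ctx.verify_mode = ssl.CERT_REQUIRED   # reject clients that present no cert

# Placeholder paths -- load your server keypair and the CA that signs
# your client certs:
# ctx.load_cert_chain("server.pem", "server.key")
# ctx.load_verify_locations("client-ca.pem")
```

The hairy parts are exactly the ones commented out: issuing, distributing, and revoking the client certs.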
A low percentage of poor performance is much easier to achieve if your service is simple, often so simple its answers can be cached. Even if it's outright down, some requests can still be served from the cache.
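A minimal sketch of that serve-stale-on-failure pattern (the `fetch` callable and the TTL are made up for illustration):

```python
import time

CACHE = {}        # key -> (value, fetched_at)
FRESH_TTL = 60    # seconds to serve from cache without calling the service

def get_cached(key, fetch, now=None):
    now = time.time() if now is None else now
    hit = CACHE.get(key)
    if hit and now - hit[1] < FRESH_TTL:
        return hit[0]                 # fresh enough: no remote call at all
    try:
        value = fetch(key)            # call the (simple) service
    except Exception:
        if hit:
            return hit[0]             # service down: serve the stale answer
        raise                         # no cached copy to fall back on
    CACHE[key] = (value, now)
    return value
```

Stale answers beat errors for most read-heavy services like ratings or search suggestions.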
If you weave together 20 services to produce one mega-service, it's much harder to optimize for performance and even just keep the implementation correct. Caching of complicated multi-factor answers is less frequently possible.
Also, 20 microservices may feed, say, 5 large 'end-user' services in various combinations. If one of the microservices is slow, only the end-user services that actually use it are affected.
Monolithic large services are harder to combine, so the risk that an unrelated remote service somehow gets called in the process and slows things down is higher.
Bringing those dependencies in under one roof doesn't eliminate their risk; it does, however, increase the chance that an error in one of these small dependencies brings down the whole system.
With microservices if one of the services runs into trouble, you can ignore it and still serve the other 90-99% of your site without it. You can also deploy updates to services without having to deploy your entire site.
So you have short timeouts and retries that are load-balanced to different nodes. But ideally your services are fast even at their 99th percentile, so this isn't an issue. That is much easier to achieve in a small service than in a huge, complex one.
Um, maybe. 5 machines behind a load balancer. Normal case, load is even: 100 requests to each server. One server starts running into trouble, exceeding timeouts. Your load is now ~125 per server, because each client retries against the remaining machines. Is 125 enough to push the others over the "slow" threshold? That will further magnify the load.
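The arithmetic above, spelled out (5 servers, 500 req/s total, assuming the balancer redistributes evenly over the survivors; retries add load on top of this):

```python
TOTAL_LOAD = 5 * 100   # 500 requests across the pool in the normal case

def per_server_load(healthy_servers):
    # Retried requests land on whichever servers are still answering.
    return TOTAL_LOAD / healthy_servers

print(per_server_load(5))           # 100.0 -- normal case
print(per_server_load(4))           # 125.0 -- one server timing out
print(round(per_server_load(3), 1)) # 166.7 -- the cascade if 125 pushes another over
```

Each server that tips over raises the load on the rest, which is exactly the cascading-failure shape.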
The load balancer will spin up more machines, so now you have 10 machines leaning on whatever the back end is.
Yes, your approach is great, but you really have to understand the failure modes. If you're living on the edge, you could have a pretty un-fun cascading error.
I don't find that performance works on a %-chance basis. When things perform poorly, they're often consistently poor. With this architecture you can rapidly iterate on and scale your poorly performing core services, while the non-core services can run slowly w/o it being a big deal.
The usual alternative is redeploying and scaling the entire app to iterate on performance, which is a much slower process. If performance is a concern, microservices should be a big win.
All APIs have performance that varies, which is why SLAs are stated in percentiles: you may have 50% of requests finish in 200ms or less, 90% in 300ms or less, and 99% in 600ms or less. You can do work to narrow the variation, but performance is a % chance.
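Those percentile figures can be read straight off a latency sample; a sketch with made-up numbers matching the ones above:

```python
def percentile(samples, p):
    """Nearest-rank percentile: the value p% of samples fall at or below."""
    s = sorted(samples)
    k = max(0, min(len(s) - 1, round(p / 100 * len(s)) - 1))
    return s[k]

# 100 fake response times (ms): mostly fast, with a slow tail.
latencies = [200] * 50 + [300] * 40 + [600] * 10
print(percentile(latencies, 50))  # 200 -- half of requests in 200ms or less
print(percentile(latencies, 90))  # 300
print(percentile(latencies, 99))  # 600 -- the unlucky tail
```

The median looks great; the p99 is what your fan-out of 20 dependencies keeps hitting.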