The money quote:

"We made incorrect assumptions about the Express.js API without digging further into its code base. As a result, our misuse of the Express.js API was the ultimate root cause of our performance issue."

This situation is my biggest challenge with software these days. The advice to "just use FooMumbleAPI!" is rampant, and yet the quality of the implementations and the amount of review they have had vary all over the map. Consequently any decision to use such an API seems to require that you first read and review its entire implementation, otherwise you get the experience Netflix had. Good APIs make this worse in a way: you spend all that time reviewing them only to note they are well written, but each new version, which may have had less clued-in people committing changes, might need another review. So you can't just review once and leave it there. And when you find the 'bad' ones, you can send a note to the project, which can respond with anything from "great, thanks for the review!" to "if you don't like it, why not send us a pull request with what you think is a better version."

What this means in practice is that companies that use open source extensively in their operations become slower and slower to innovate, as they are carrying the weight of a thousand different systems of checks on code quality and robustness, while people using closed source will start delivering faster and faster as they effectively partition the review/quality question to the person selling them the software and focus on their product innovation.

There was an interesting, if unwitting, simulation of this going on inside Google when I left, where people could check in changes to the code base that had huge impacts across the company, causing other projects to slow to a halt (in terms of their own goals) while they ported to the new way of doing things. In this future world, changes like the recently hotly debated systemd change will incur costs while the users of the systems stop to re-implement in the new context, and there isn't anything to prevent them from paying this cost again and again. A particularly Machiavellian proprietary source vendor might fund programmers to create disruptive changes expressly to inflict such costs on their non-customers.

I know, too tin hat, but it is what I see coming.



You're assuming that your closed source vendors are perfectly aligned with you. In practice they almost inevitably seem to cause capture (https://en.wikipedia.org/wiki/Regulatory_capture).

Open/closed is a red herring here. Projects slowing down as they succeed seems to be a universal phenomenon, from startups to civilizations. Specialization leads to capture. I think almost exclusively about how to fix this: http://akkartik.name/about (you've seen and liked this), http://www.ribbonfarm.com/2014/04/09/the-legibility-tradeoff

Disclosure: google employee


Yep, closed source doesn't solve the problem either. If you believe that just because you're paying money for someone to take responsibility for a problem, they will actually solve the problem in a way that's amenable to you...well, there are numerous closed-source software vendors looking to sell you something.

In practice, the way to avoid this is to keep the software as simple as possible. Try to adjust to your users' most pressing current needs, not every need they might conceivably have. Killing features and deleting code is as important as launching features and writing code; make sure that your incentive systems reward this. Very often, third-party code gets pulled in to scratch one particular itch; if it's no longer itching, rip the code out. If it is still itching and you've built significant parts of your system around it, you may want to think about replacing the innards with a home-grown system.


When a software provider I use starts ripping out features I relied upon, I start looking for an alternate provider, one that isn't so eager to kill features. And in particular, I try not to learn or rely on any new features if the provider has a history of removing them.

It's better to be careful - very careful - about what you add, and to have a story for migration, than to remove features.


This depends on industry, of course - in consumer web it's much better to risk pissing off a few customers but make the majority of them happy than to keep all your existing customers but risk losing out on a new innovation that gives a competitor a toe-hold. Enterprise SaaS probably has different trade-offs, and software infrastructure probably different still.

This paradox, BTW, could be thought of as the full-employment theorem for entrepreneurs. As long as it is rational for a business to avoid change for fear of having to remove or support it later, there will exist changes that a company with no customers and no codebase could implement but no incumbent would dare to. Some of these are bound to be useful to some segment of the market, and that's why you get continued disruption in technology markets.


I agree; this is just part of the software iteration game. It has been Microsoft's modus operandi since they started, and Microsoft is still at it with getting people to move to Azure/mobile; the cycling is just happening further up the stack.

Some of the iterations are good, some are just for control; mostly, though, it is to keep developers locked in and chasing rather than innovating. Software has to change and iterate, but iterating too slowly or too quickly can both be harmful, and the worth of an iteration varies as platforms change.

On the topic of Node.js and Express, though: I think it is production solid and very fast (faster than PHP, Ruby, Python). I love it, and I think the Netflix developers are at fault here for not testing something a bit outside of the thoroughly tested default Express usage. That they switched to Restify so quickly after one problem probably tells you how they got into this problem in the first place, even if Restify is indeed better for this purpose. Any new change brings research, testing, and possible problems, open source or closed.
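
For the curious, here's a minimal sketch of what that kind of misuse can look like. I'm going from the write-up's hints, so treat this as an assumption; the route path and refresh interval are made up. Express matches requests by scanning its handler list in order, so if the same handlers get re-registered on a timer, the list grows and every request gets slower over time:

  var express = require('express');
  var app = express();

  function registerRoutes() {
    // Each call APPENDS a new handler; the old ones are never removed,
    // so the per-request linear scan grows without bound.
    app.get('/api/title/:id', function (req, res) {
      res.send('ok');
    });
  }

  registerRoutes();
  setInterval(registerRoutes, 60 * 1000); // the periodic "refresh" is the bug

  app.listen(3000);

The boring fix is to register routes once at startup, or to build a fresh Router on each refresh and swap it in whole.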

In the end, this is why microframeworks, which both of these Node frameworks are, win out: they are easier to inspect, and they live on after the hype (monolithic frameworks, not so much).


> What this means in practice is that companies that use open source extensively in their operations become slower and slower to innovate, as they are carrying the weight of a thousand different systems of checks on code quality and robustness, while people using closed source will start delivering faster and faster as they effectively partition the review/quality question to the person selling them the software and focus on their product innovation.

To contrast with what you said: I've worked at Microsoft, which is practically the company that invented NIH, and we had the same problem there. I think it's because, to paraphrase Alan Perlis, programming nowadays is less about building pyramids than about fitting fluctuating myriads of simpler organisms into place.


That is an excellent point. That we're building primarily distributed software rather than point software has two effects: one, the number of failure modes is quite high; and two, the 2nd- and 3rd-order interactions are not easily deduced by inspection. I like the organisms picture. I think of them as 'atoms' in a molecule, but organisms is much better; it brings their own idiosyncratic behavior into the mix.


This is why I like choosing open source technologies with some sort of commercial support available. The best support I've ever gotten was from MySQL AB (before Sun): the 10k we paid them for two years and three servers was affordable back then, even for a small startup. I had a MySQL engineer (if memory serves, he is now a MySQL community manager at Oracle) SSH into my server 34 minutes after a desperate call.

Disclaimer: I work for a company providing commercial support (scaling) for an open source project (Drupal).


Morgan definitely knows what he's doing. (-;


Actually, open source allows something that is not possible with commercial software. You don't have to read all of the code and understand all of the implementation details, but you have that option if you need it. I think having the option is a great benefit, as we can see from this example. When they started debugging, they could reference the Express source and figure out what was going on. If they were using some sort of commercial framework, they would have had to either refer to the docs or call the help desk.

Just because you have the option to read the implementation details of every library and service you are using doesn't mean that you have to. You only have to learn enough about it to decide whether you think it is a good addition to your stack, and to use it to do whatever you are trying to get done. But open source gives you the ability to figure out why you're using it wrong, or how it is broken when that time comes.

If you are saying that open source is not documented well enough because the developers fall back on "check the source," that is a different argument where commercial software may be better, but this is not true for any of the more common open source software I've used.

Making a decision of which software to depend on in your application is something that is always difficult whether you have access to the source code of all of the choices or not. It's a decision that you make with limited information. You only have extensive knowledge of the tools that you already have experience with, so using alternatives is always a risk, but understanding and managing that risk is part of the developer's job.

And with regard to keeping up with changes, you can always remain on previous versions for some time to avoid the slowdown associated with shifting to new APIs.


I agree with you that the greatest strength of open source is that you can just go read the code and (optionally) fix it. Netflix did this and it got them past their crisis.

I think you missed my point though, which was that there isn't any sort of fitness function being applied to open source. In the closed source environment, that function is 'price' (caveat walled gardens). By paying for the software, customers "vote with their wallet" on the things they like/want/use. What is more, if two folks are putting out the same capability, they are incentivised to compete on a vague sense of value. That creates a 'culling' function which operates independently of the software creators (the only way to sell more is to be "better").

Open source doesn't have that forcing function; there isn't really even a reputation function involved, where bad FOSS would give its authors a bad reputation that a user could pick up on (yes, we probably all know one or more committers who are jerks, but that isn't quite the same thing). And there is no barrier to entry, so we get multiple variations on the same conceptual solution (I've lost count of the number of CMSes or blogging packages out there, for example).

Finally, there are the various bits of entanglement. You wrote "And with regard to keeping up with changes, you can always remain on previous versions for some time to avoid the slowdown associated with shifting to new APIs," but that has not been my experience. As you're evolving your product, interdependencies in the APIs and packages you are using force you to upgrade as well. There is no island of stability in the FOSS world (well, maybe OpenBSD servers or something, but I have not seen them in the 'stack'). Backports only go on so long, and suddenly you're no longer getting updates from your source, as Ubuntu showed when they dropped updates for 12.x LTS. All those systems are still Shellshock-able or Heartbleed-able, or vulnerable to something else, because to upgrade that part you need to upgrade the next part, and so on and so on until you've upgraded everything. And you pay that price again and again and again and again.

Much of this is just different from the other model, which is not necessarily a bad model. But from a systemic point of view it is hard to partition the work, and that leads to inefficiencies: a single change in an API implementation that results in dozens of engineering groups re-implementing stuff, versus a migration model that allows people to move over gradually. The promise of open source is "no capital expense," but I worry the downside will become "much higher operational expense." And when the opex cost of open source is greater than the capex + opex cost of closed software, closed software wins. Of the folks who have tried to keep that equation from shifting in favor of closed source, I only see Red Hat as a success story.


There totally is a fitness function for open-source. People who find that an open-source program creates more hassles than it solves don't use it. Folks who blog about their hassles induce other people not to use it.

There are a number of open-source projects - web.py, Mongo, Angular, Ember - that I am actively avoiding because IMHO their benefits to me don't outweigh risks or previous hassles experienced. Much better to use tried-and-true open source projects like MySQL, Postgres, vanilla JS, and Django that I have had good experiences using. There are many others like Polymer or Docker that I am intrigued by, would love to give a try, but am sitting on the sidelines until other people uncover the most obvious pitfalls.

Economically, open-source projects are just startups where the monetary cost of use is zero. That doesn't mean that the total cost of use is zero, and as you're evaluating potential products, you need to be careful to figure in time expenditure spent debugging as a cost. But you are incredibly naive if you believe that just because you are paying a million dollars for a piece of software, you will spend less time debugging and integrating it than a well-used piece of open-source software. Hell, I once was the college intern writing that million-dollar piece of software. All it takes to sell something is chutzpah.


> People who find that an open-source program creates more hassles than it solves don't use it. Folks who blog about their hassles induce other people not to use it.

Would that it were so, but people still use Mongo. You seem to be positing perfect knowledge, whereas crummy software seems to creep into places either because of marketing or because it's the default in some system.

Docker, btw, is awesome. You can use it to replace virtualenv (which only handles Python libs) and system lib isolation in a much more scriptable, more robustly deployable fashion. So you can take Python projects developed at different times that use different revs of packages (and those packages can require different libs) and make deploying them way better. Try it; it's just awesome.


The same problems with imperfect knowledge exist with closed-source software too, but they're worse because you can't examine the source yourself and the vendor often has an incentive not to let flaws come to light.


> People who find that an open-source program creates more hassles than it solves don't use it.

But bad open-source software can live on to upset the productive lives of others forever, as the distribution cost is essentially free. With bad closed-source software, if the producer goes out of business (without a buyer), typically it becomes unavailable.


This is both a feature and a bug. It is definitely caveat emptor; you should do your due diligence before pulling in any piece of third-party software. However, there are many, many critical systems that incorporate unmaintained open-source libraries. libxml, for example, gets a bugfix release about every year, while TeX has been asymptotically approaching version pi since 1989. That's fine; those libraries are stable, they do what they do, and I'd much rather have them available than have them disappear like closed-source alternatives.


This mirrors my experience as well. Mature, well-known open source is usually of excellent quality. Many of the more "modern" packages are a lot more hassle.


You can't misuse closed source APIs? How would you know something is O(n) without seeing the source code?
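
About the best you can do without the source is probe it empirically: time the call at a few input sizes and watch the growth. A rough Node sketch, with Array#indexOf (known to be linear) standing in for whatever opaque call you suspect:

  function timeIt(fn) {
    var start = process.hrtime();
    fn();
    var d = process.hrtime(start);
    return d[0] * 1e9 + d[1]; // nanoseconds
  }

  [1e4, 1e5, 1e6].forEach(function (n) {
    var arr = [];
    for (var i = 0; i < n; i++) arr.push(i);
    var ns = timeIt(function () { arr.indexOf(-1); }); // a miss forces a full scan
    console.log(n + ' items: ' + Math.round(ns / 1000) + ' us');
  });

If ten times the input costs roughly ten times the time, the call is linear, whatever the docs do or don't tell you.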


I wondered the same thing too. I think the answer is here:

> people using closed source will start delivering faster and faster as they effectively partition the review/quality question to the person selling them the software

I guess the theory is that if you have a contractual customer/vendor relationship, then you can task the closed-source vendor to describe, fix, or alter their API.

But my (albeit limited) experience is that the customer is not always right; sometimes the vendor just says "that's the way it is" and you're stuck either living with it or porting to a different company/technology. Conversely, it's often possible to hire a vendor or consultant to provide expertise and support for some aspect of an open-source stack. And because it's open-source, if you don't like your consultant you can go find another one without having to change your technology.

In short, I think the speed-of-innovation thing might have more to do with internal teams trying to manage too much diversity of technology than with some inherent shortcoming of open source.


> What this means in practice is that companies that use open source extensively in their operations become slower and slower to innovate, as they are carrying the weight of a thousand different systems of checks on code quality and robustness, while people using closed source will start delivering faster and faster as they effectively partition the review/quality question to the person selling them the software and focus on their product innovation.

I think...your experiences at Google have altered your world view to the point where you don't see how things are happening at other organizations. Google's monolithic codebase, where everything builds against everything else, may work(?) for them, but the alternative is disciplined module management.

You never have to bump a version of working code if it's doing its job. Good open source projects should absolutely (and publicly) test against performance regressions. New versions should be minor/incremental and source-compatible.

I've never worked on a codebase the scale of Google's, but I fail to see how you can't mitigate your concerns, nor do I see commercial software as the solution.


It's funny: my first job out of college was at a startup with an ex-Sun CTO, and he insisted on a single monolithic codebase. After he left I tried to organize an (aborted) project to modularize the codebase, since many of the other engineers had run into problems with it as well.

Having worked at Google in the interim, and having a number of acquaintances at Microsoft (which uses the multiple-repository approach) I can see the pros and cons of both. The biggest benefit of the single codebase isn't technical, it's cultural. When you have a number of interdependent modules, then every change request and new feature has to go through that module's owner. If it's not their top priority (and it won't be), then getting your work done suddenly has a hard dependency on a team who is...generally pretty unresponsive, at least from your POV. The result is a lot of finger-pointing and political infighting, where every division thinks that every other division is a bunch of bozos.

The nice thing about Google's system (which, IIUIC, was Sun's as well, and is also Facebook's) is that when you have a hard dependency on another team and their priority list doesn't line up with yours, you can say "Well, can I make the change myself and you review it?" and the answer is usually yes. That means that people's default worldview is to assume busyness, not malice or stupidity, which makes the company as a whole function much better together. Yes, it creates a huge mess that someone will eventually have to clean up. But now you have a lot of options for how to clean it up, not a single point of failure: you can have the feature implementor do it, or the code owner, or it may get replaced entirely if the system is rewritten, or the feature may be unlaunched and no longer necessary, or someone may write an automated Clang or Refaster tool to fix a bunch of instances at once.

I'd compare it a lot to democratic capitalism: it's the worst system that exists, except for all the rest. When I was at Google, we all complained about how everybody else checked in changes that added complexity to our code. But if we couldn't do that, we'd all be complaining about how everybody else prevented us from getting our work done. I know which problem I'd rather have.


>When you have a number of interdependent modules, then every change request and new feature has to go through that module's owner. If it's not their top priority (and it won't be), then getting your work done suddenly has a hard dependency on a team who is...generally pretty unresponsive, at least from your POV.

Is that true? Again, I've never done development at that scale, so I could totally be wrong, but I see this as a problem with how changes are handled and releases are published.

I do work on some open source projects that are used by some big companies. The projects have very good reliability thanks to CI, specification testing, and perf testing for regressions, to the point that merging in minor bug fixes and pushing a minor release is pretty trivial.

Let's say only some of the teams practice good module management. OK, fine: you fork the repo, fix the changes, publish your fork as your new dependency, and move along on your way.
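
In the npm world, for instance, pointing at your fork is a one-line change in package.json. A sketch, with the org, package, and branch names all hypothetical:

  {
    "dependencies": {
      "somelib": "git+https://github.com/yourorg/somelib.git#perf-fix"
    }
  }

When upstream eventually merges your fix, you flip that line back to the registry version and delete the fork.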

Does this take discipline? Absolutely, but it severely reduces complexity in the code. It mandates a certain level of purity by forbidding certain dependencies in the vast majority of the codebase. The upsides are tremendous when compared to the slight inconvenience of fixing the occasional bug in someone else's module.


Imagine what happens at scale if everyone forks the repo when a critical feature they need isn't being worked on by the core team. Suddenly you have dozens of forks. Now imagine that the core team eventually gets around to publishing that bugfix you really do want. All of these incompatible forks need to down-integrate the change and work around the little tweaks they've added. You have a mess.

Open source works because the projects that become popular can stake out a particular territory and say "This is our responsibility, this is your responsibility, these are the hooks you can tap into, and if you don't like it, fork or use a different project." Flip the perspective and imagine yourself as the user of an open-source project. There's always a huge amount of work you have to do to customize exactly what the framework gives you to what the product needs. This is okay, because that's the bargain you know you're signing up for when you use the open-source framework, and it is after all what you're getting paid to do. If you don't believe it saves you more effort than it costs you, then don't use it. (In fact, there's a huge graveyard of projects on GitHub where the author tosses their personal way of doing things over the wall and then wonders why nobody uses it or contributes to it. This is why: the benefits don't outweigh the costs.)

Now imagine that you're at a big corporation. There is a lot of work that somebody has to do to make the product work right. All of you are getting paid by the same entity, the company. Who does the work? The team who wants it? The team whose codebase it's in?

Open source projects have the luxury of defined interfaces; they get to set the boundaries of what their code does themselves, and can turn away customers who need more than they provide. Internal teams do not: if they say no to a critical feature (regardless of how different it is from their existing interface), the product doesn't ship, and then an executive gets mad. So adding features is often non-negotiable; the features are significant enough that there's a real risk of impacting the stability or performance of the core product, and yet it's still the right thing for the business to do. The big problem is that each individual team wants clean code and well-defined interfaces, yet the benefit accrues to the company as a whole...hence, resentment between teams about who's got to spend significant work mucking up their pristine design.


I appreciate the response, but I don't follow the argument.

>All of these incompatible forks need to down-integrate the change and work around the little tweaks they've added. You have a mess.

Is that really better than having a mess without any demarcation of the software? Either way people are going to make a change. Your way involves no disincentive to making a quick fix that just your team needs. Modularized codebases mean that before submitting that PR or forking, you're forced to consider how it will affect everyone who uses the library.

I understand your point about modularizing codebases causing more friction between teams. That's quite insightful, and I hadn't considered it. However, I think that gets down to a cultural issue of what you value more: clean code that is easier to comprehend, or geniality between teams.


I learned a lot from this post; thanks! As an MS employee (but in research), I find it fascinating to study these differences. I also believe that the single-codebase view is inevitable, given the need for agility in the modern marketplace.


> Consequently any decision to use such an API seems to require that you first read and review its entire implementation, otherwise you get the experience Netflix had.

There's no getting around this: you or someone you trust (not necessarily at your company) needs to read and review both the API and the implementation details of the open source software that you use. Open source software isn't a hardware store you can dip into to get the latest parts you need for your project. Adding a dependency makes your code depend on other people's code. Just like internal code should go through a code review, so should external dependencies. This also explains why, for a lot of companies, it's easier to write their own thing rather than keep on top of other people's changes.

I happened to write about this yesterday; it explains my thoughts further: http://danielcompton.net/2014/11/19/dependencies.


Especially in high availability/load systems like Netflix's, IMO you need to reduce complexity, and that means fewer modules that depend on 69 other modules, each of which depends on 69 more, and so on. What's going on as this software proliferates is insane.

Then, you've got to be intimately familiar with every piece; there's no excuse when the source is readily available. I give these guys credit, though: they seem to take responsibility instead of just saying "express sucks". Some of their design choices seemed a little shaky.


This is precisely why — for some products in some industries — NIH is a reasonable strategy for writing good programs.


Please forgive me if I'm misinterpreting you, but the lament you make about open source software seems to me to be more about distributed systems. It is an interesting observation, and now that you've pointed it out I can observe the pattern at past and present employers. But I have seen that pattern on internal software, written by the company for the company. This makes me think the problem is an architectural one.

I hesitate to comment about the relative velocities of open source vs proprietary software because I do not have enough experience with commercial, third party software. My sample size is too small, but I'm inclined to agree with you.

I don't disagree with your Machiavellian conspiracy, either, but I've worked in marketing and advertising so I know that some of the villains are on the payroll. Maybe there needs to be a third category? There's open source software, written by someone who has no particular relationship to you. There's commercial software, written by someone who has a positive economic relationship with you. And then there is ... corporate?... software, written by someone who might think they have a zero sum relationship with you.


I would say this is more a matter of mature versus new software than of open versus closed source. Node isn't even at version 1 yet.



