
From the article:

> What did we learn from this harrowing experience? First, we need to fully understand our dependencies before putting them into production.

Is that the lesson to learn? That scares me, because a) it's impossible, and b) it lengthens the feedback loop, decreasing systemic ability to learn.

The lesson I'd learn from that would be something like "Roll new code out gradually and heavily monitor changes in the performance envelope."
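To make "roll out gradually and monitor the performance envelope" concrete, here's a rough sketch. Everything in it is made up for illustration: `sample_latencies` stands in for whatever your monitoring gives you at a given traffic fraction, and the stage sizes and the 25% tolerance are arbitrary assumptions, not recommendations.

```python
def p95(samples):
    """95th percentile using the nearest-rank method."""
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("need at least one latency sample")
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding.
    rank = max(1, (95 * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def staged_rollout(sample_latencies, baseline_p95_ms,
                   stages=(0.01, 0.10, 0.50, 1.00), tolerance=1.25):
    """Widen exposure one stage at a time and stop at the first regression.

    sample_latencies: hypothetical callback taking a traffic fraction and
        returning latency samples (ms) observed from the new code.
    Returns (last_good_fraction, ok).
    """
    last_good = 0.0
    for fraction in stages:
        observed = p95(sample_latencies(fraction))
        if observed > baseline_p95_ms * tolerance:
            # Outside the expected envelope: stay at the last good stage.
            return last_good, False
        last_good = fraction
    return last_good, True
```

The point is that a regression that only shows up under load gets caught at 50% of traffic rather than 100%, which limits the impact without needing to understand the dependency up front.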

Basically, I think the approach of trying to increase mean time between failures is self-limiting, because failure is how you learn. I think the right way forward for software is to focus on reducing incident impact and mean time to recovery.
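A bit of arithmetic on the trade-off, using the standard steady-state availability formula MTBF / (MTBF + MTTR), with MTBF taken here as mean uptime between failures. The numbers are invented for illustration.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state fraction of time the system is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Illustrative baseline: one failure a month, four hours to recover.
baseline = availability(720, 4)

# Failing half as often and recovering twice as fast buy the same uptime.
fail_half_as_often = availability(1440, 4)
recover_twice_as_fast = availability(720, 2)
```

On paper the two improvements are equivalent. The difference is that recovering faster doesn't cost you the failures you learn from.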



Without over-training on this one incident, and without guidance on how to get from here to there (I'm still working on that):

1. Don't get suckered by interfaces; share code. If you create code for others to share ("libraries"), stop trying to hide its workings.

2. You don't have to learn how everything works before you do anything. But you should expect to learn about internals proportional to the time you spend on a subsystem. Current software is too "lumpy" -- it requires days or months of effort before yielding large rewards. The first hour of investigation should yield an hour's reward.

3. "Production" is not a real construct. There will always be things that break so gradually that you won't notice until they've gone through all your processes. Give up on up-front prevention, focus instead on practicing online forensics. And that starts with building up experience on your dependencies.

More elaboration: http://akkartik.name/post/libraries2

My attempt at a solution: http://akkartik.name/about

My motto: reward curiosity.


> I think the right way forward for software is to focus on reducing incident impact and mean time to recovery.

So in this case, make sure you have a reliable way to measure performance, and maybe even enable it by default just to be sure.



