
an ok guideline unless, like most proverbs/sayings/mottos/slogans, it gets used as an absolute rule.

Amen!

a roller coaster ride through 10 files for what could have been a 30 line function.

sometimes it's better for the whole world to just copy-paste-alter your code.

Then what happens when you have almost a hundred copy/pasted slightly rewritten 15-30 line variations on the same theme? How do you refactor then? (Yes, I have seen this in production systems, and yes, it was very critical code!) As you say, it comes down to cost/benefit.

Basically, you just have to keep potential refactoring/rewrite costs down so that you are never trapped. Caveat: You can seldom predict the risks as well as you think you can. What you can depend on is staying observant of historical patterns in your codebase. It's hard to predict future business needs and architecture. On the other hand, it's often quite easy to see the historical trends in the codebase.

Really, the analogy of the physical file room is a great one. (Or if you're not familiar, a library, a tool chest, or any kind of physical inventory system works too.) You can have a file clerk that seems like "super-filer" because he never "wastes" time by putting files back, but this never works out in the long term. The same goes for a filing staff that never reorganizes the file room. Also, one can generally see the long term disaster developing long before it results in the dramatic disaster. You can see which shelves are getting filled up, and which drawers are getting overstuffed, usually weeks or months ahead of time.



> Then what happens when you have almost a hundred copy/pasted slightly rewritten 15-30 line variations on the same theme? How do you refactor then? (Yes, I have seen this in production systems, and yes, it was very critical code!) As you say, it comes down to cost/benefit.

The only thing that can happen: you understand the core function those 30 variations solve and introduce a fully parametrised solution to the problem, then replace all the call sites with calls to it. Generally there would be some way to tell where all these copy-pastes are, using something not much more complex than a regex. But even then, maybe just leaving it duplicated is better.
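
To make "fully parametrised solution" concrete, here is a minimal sketch in Python. Everything in it is hypothetical (the `DiscountRule` fields, `apply_discount`, and the two rule constants are invented for illustration); it only assumes the pasted variants differed in a handful of constants rather than in their control flow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiscountRule:
    """Hypothetical knobs that differed between the pasted copies."""
    threshold: float  # order total at which the discount kicks in
    rate: float       # fraction taken off the total
    cap: float        # largest discount allowed


def apply_discount(total: float, rule: DiscountRule) -> float:
    """Single shared implementation; each former copy becomes a rule."""
    if total < rule.threshold:
        return total
    discount = min(total * rule.rate, rule.cap)
    return round(total - discount, 2)


# Each old copy-pasted variant collapses to one line of data.
RETAIL = DiscountRule(threshold=100.0, rate=0.10, cap=50.0)
WHOLESALE = DiscountRule(threshold=1000.0, rate=0.15, cap=500.0)
```

A call site then reads `apply_discount(order_total, RETAIL)`. If the variants differ in structure and not just in constants, this shape stops fitting, which is where the "maybe just leave it duplicated" option comes back in.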


The concern I have with this approach is how easy it is to forget to alter one of the variants when you alter another.

How do you know the code you're touching has other versions that are semantically the same and should also be altered?[1]

How do you avoid having to fix the same bug several times because you bug-fixed one place but not the others?

How do you avoid the technical debt that builds over time when instances of the pattern within the codebase are each at subtly different "versions" with similar, but not identical, semantics (even though identical semantics would have worked fine)?

[1] Of course, DRY code has the inverse: How do you know if an existing function to do what you want to do already exists so you can avoid duplicating the extraction?

Where my opinion falls today is: A slightly more complex solution is often less risky and more maintainable than the straightforward duplication solution because at least the complex one looks complex to a would-be maintainer who will at least be aware of things up front, whereas duplicated code can have a bunch of hidden costs whenever it's touched that won't necessarily become apparent until later when the presence of that technical debt throws a monkey wrench into unrelated plans.


If your variants need to be similar, that alone is a great reason to abstract. It makes the abstraction more valuable.

Meanwhile, there could easily be transformative code that just computes some stuff you often need. In those cases altering one variant need not affect the others.


No, what will happen is that one of your colleagues fixes a bug in a couple of places, someone else fixes some other bugs somewhere else, and in the end the buggy duplicated code becomes a buggy mess in which no one has any idea what the right behaviour should be. How, how can leaving duplicates and starting this bloody mess ever be better? I honestly can't see it.


Seems equivalent to the case where you have one generalized function with a bunch of obscure special cases coded into it. I feel like having 30 functions means you can easily trace which parts of the program are exercising which special cases.
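
A rough sketch of that trade-off in Python, with made-up names throughout (`normalize`, its flags, and the two wrappers are all hypothetical). The generalized function carries the special cases as flags; thin named wrappers are one possible way to keep it greppable which caller exercises which case.

```python
from typing import Optional


def normalize(record: dict, *, strip_ws: bool = False,
              upper_id: bool = False,
              default_qty: Optional[int] = None) -> dict:
    """One generalized function; every flag is some caller's special case."""
    out = dict(record)  # copy so the caller's record is not mutated
    if strip_ws:
        out["name"] = out["name"].strip()
    if upper_id:
        out["id"] = out["id"].upper()
    if default_qty is not None and out.get("qty") is None:
        out["qty"] = default_qty
    return out


# Named entry points record which special cases each part of the
# program actually relies on, without duplicating the logic itself.
def normalize_for_billing(record: dict) -> dict:
    return normalize(record, strip_ws=True, upper_id=True)


def normalize_for_import(record: dict) -> dict:
    return normalize(record, strip_ws=True, default_qty=1)
```

With three flags this stays readable; with thirty, the flag combinations themselves become the thing nobody can trace.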


> The only thing that can happen - you understand the core function those 30 variations solve and introduce a full parametrised solution to the problem, then replace all places with calls to it. Generally there would be some way to tell where all these copy-pastes are using something not much more complex than a regex.

Knowing where these things were wasn't a problem. They were all on the class side of certain classes. (Yes, this was Smalltalk, but this entire subsystem didn't have a single instance variable in it!) I was on a team of 10, with some very smart guys. We all wanted to "understand the core function" in this subsystem but what it really was, was an object system, where objects were expressed as consecutive entries in a series of arrays. Every method resembled some kind of complex merge with multiple arrays and multiple incrementing indexes and varying side effects embedded in nested conditional logic. Only one developer understood the underlying object model, and she wasn't apt to share. Rather, it was the source of her job security. (Most days, she spent in the cafe on the 1st floor, reading a book, until she got notifications, then had to "consult.") If you pointed out the "unusual" nature of an entire Smalltalk subsystem without a single instance variable in it, she started talking to you about her PhD in Math.

No, you aren't such a genius, nor are my colleagues and I such dullards, that we only needed you to show up and point out a few simple truths.


If you change something that has that many parameters there's a very high chance you will introduce a bug. You'd better have a test for each variation. Or simply keep them separate so that they do not affect each other.


When you have to change something like that, it's time to write some tests, regardless of whether the code is in a single place or distributed.
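
One way to do that is a characterization test: pin down what each existing variant does today, then check any consolidated version against it. The sketch below uses Python's `unittest`; the three shipping functions are hypothetical stand-ins for two pasted variants and a proposed replacement, not code from any real system.

```python
import unittest


# Hypothetical stand-ins for two copy-pasted variants.
def shipping_cost_domestic(weight_kg: float) -> float:
    return 5.0 + 1.2 * weight_kg


def shipping_cost_export(weight_kg: float) -> float:
    return 5.0 + 1.2 * weight_kg + 15.0


# Proposed consolidated version (also hypothetical).
def shipping_cost(weight_kg: float, surcharge: float = 0.0) -> float:
    return 5.0 + 1.2 * weight_kg + surcharge


class CharacterizationTest(unittest.TestCase):
    def test_consolidated_matches_each_variant(self):
        # Pair every old variant with the parameters that should replace it.
        cases = [(shipping_cost_domestic, 0.0), (shipping_cost_export, 15.0)]
        for variant, surcharge in cases:
            for weight in (0.0, 1.0, 2.5, 40.0):
                with self.subTest(variant=variant.__name__, weight=weight):
                    self.assertAlmostEqual(
                        variant(weight), shipping_cost(weight, surcharge)
                    )
```

Saved as a module, this runs with `python -m unittest`. The same table of cases works whether you end up consolidating or not: it catches the "fixed one copy but not the others" drift either way.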


> When you have to change something like that, it's time to write some tests.

So what do you do if that's not practical? See my other comment in this subtree where I give more background information. Sometimes the cost/benefit doesn't work out at that moment. (And believe me, we would've loved to refactor that whole thing!)


Cry, mostly.

If you can't test it, this type of thing is going to be terrible regardless of whether it's in one place.



