This “the only thing that matters about code is whether it meets requirements” line is such a tired take, and I can’t imagine anyone seriously spouting it has ever had to maintain real software.
If no developer ever looks at the code, then the markdown files are the developer UX.
Whether you are tired of it or not, absolutely no one in your value chain - not your customers who give your company money, not your management chain - cares about your code beyond whether it meets the functional and non-functional requirements. They never did.
And, of course, whether it was done on time and on budget.
As a consumer of goods, I care about many of the “hows” of those goods just as much as the “whats”.
My home, which I own, is very much a “what” that keeps me warm and dry. But the “how” of its construction is the difference between (1) me cursing the amateur and careless decision-making of the builders and (2) quietly sipping a cocktail on the beach without a care in the world.
“How” doesn’t matter until it matters, like when you put too much weight onto that piece of particle board IKEA furniture.
I know where they fucked up and cost me thousands of dollars due to cutting corners during build-out and poor architectural decisions during planning. These kinds of things become very obvious during destructive inspection, which is probably why there are so many limitations on warranties; I digress.
He’s mildly controversial, but watch some @cyfyhomeinspections on YouTube to get a good idea of what you can infer about the “how” of home building and how it affects homeowners. It’s especially relevant here because he seems to specialize in inspecting homes in large developments where a single company builds out many homes very quickly, cuts tons of corners, and makes the same mistakes repeatedly, kind of like LLM-generated code.
So you’re saying that whether it’s humans or AI - when you delegate something to others, you have no idea whether it’s producing quality without checking yourself…
> you have no idea whether it’s producing quality without you checking yourself
No, I can have some idea. For example, “brand perception”, which can be negatively impacted pretty heavily if things go south too often. See: GitHub, most recently.
I mean, there are already companies that have a negative reputation regarding software quality due to significant outsourcing (consultancies), or bloated management (IBM), or whatever tf Oracle does. We don’t have to pretend there’s a universe where software quality matters; we already live in one. AI will just be one more way to tank your company’s reputation with regard to quality, even if you can maintain profitability otherwise through business development schemes.
> So as long as it is meeting the requirements of “it stays up consistently and doesn’t lose my code” you really don’t care how it was coded…
I don’t think we’ll come to common ground on this topic due to mismatched definitions of fundamental concepts of software engineering. Maybe let’s meet again in a year or two and reflect upon our disagreement.
If you maintain software used by tens of thousands to millions of people, you will quickly realize that no specified functional and non-functional requirements cover anywhere near all user workflows or observable behaviors.
If you mostly parachute in solutions as a consultant, or hand down architecture from above, you won’t have much experience with that, so it’s reasonable for you to underestimate it.
AWS S3 by itself is made up of 300 microservices. Absolutely no developer at AWS knows how every line of code was written.
The scalability requirements are part of the “non-functional requirements”. I know that the vibe-coded internal admin website will never be used by more than a dozen people, just like I know the ETL implementation can scale to the required number of transactions, because I actually tested it at that scale.
In fact, the one I gave to the client was my second attempt, because my first one fell flat on its face when I ran it at the required scale.
I'm not talking about scalability requirements. I'm talking about the different workflows that 10 million people will come up with when they use a program, workflows that won't exist in any requirements doc.
You're not understanding what I'm saying. Say you tell your agents to add a new feature to an app, and you do it by writing up a new requirements doc. If you don't review the code, they will change a million different "implementation details" in order to add that feature, and those changes will break workflows that aren't specified anywhere.
The code is the spec. No natural language specification will ever fully cover every behavior you care about in practice. No test suite will either.
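To make that concrete, here's a contrived C sketch (a hypothetical report, not from any real codebase). Nothing in any spec or test suite promises the within-department order, but users who see one order will come to depend on it:

```c
#include <stdio.h>
#include <stdlib.h>

struct employee { const char *name; int dept; };

/* Compares by department ONLY; the relative order of employees within a
   department is left unspecified, and qsort() makes no promise about it. */
static int by_dept(const void *a, const void *b) {
    const struct employee *x = a, *y = b;
    return (x->dept > y->dept) - (x->dept < y->dept);
}

int main(void) {
    struct employee staff[] = {
        { "alice", 2 }, { "bob", 1 }, { "carol", 2 }, { "dave", 1 },
    };
    qsort(staff, 4, sizeof staff[0], by_dept);
    /* Many implementations happen to print bob, dave, alice, carol.
       Swap in a different but equally "correct" sort and the report
       users built their workflow around silently reorders. */
    for (int i = 0; i < 4; i++)
        printf("%s (dept %d)\n", staff[i].name, staff[i].dept);
    return 0;
}
```

An agent (or a human) that changes this "implementation detail" violates no written requirement and still breaks someone.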
If you don't know this, you haven't maintained non-trivial software.
And have you never seen the havoc an overzealous developer can wreak on an existing code base without a testing harness? Let a developer loose with something like ReSharper, which has existed since at least the mid-2000s.
If your tests don’t cover your use cases, you are just as much in danger from a new developer. It’s an issue with your testing methodology in either case.
And there is also plan mode, which you should be reviewing.
Of course they can. Those kinds of developers cause problems constantly. It's one of the biggest reasons we have code reviews. Automated tests help too.
But even with all of that we still have bugs and broken workflows. Now take that human and remove most of their ability to reason about how code changes affect non-local functionality and make them type 1000x faster. And don't have anyone review their code.
The code is the spec; someone needs to be reviewing it.
I personally haven't made up my mind either way yet, but I imagine that a vibe-coding advocate could say to you that maintaining code makes sense only when the code is expensive to produce.
If the code is cheap to produce, you don't maintain it; you just throw it away and regenerate it.
If you have users, this only works if you have managed to encode nearly every user observable behavior into your test suite.
I’ve never seen this done even with LLMs. Not even close. And even if you did it, the test suite is almost definitely more complex than the code and will suffer from all the same maintainability problems.
For one, you don't let random devs hop on and off projects without code reviews, which is effectively what people who say they don't care about the code are advocating.
And two, agents are clearly worse at reasoning through code changes than humans are.
And the team lead with 7 developers isn’t going to be doing code reviews of all the code. At most he is going to be reviewing those critical paths.
I couldn’t care less about the implementation behind the vibe-coded admin website that will only be used by a dozen people. I care about the authorization.
Even for the ETL job, I cared only about the performance characteristics, concurrency, logging, and the correctness of the results.
>And the team lead with 7 developers isn’t going to be doing code reviews of all the code. At most he is going to be reviewing those critical paths.
Why would the team lead need to review all 7 developers? If you're regularly swapping out every single developer on a team, you're gonna have problems.
>I couldn’t care less about the implementation behind the vibe-coded admin website that will only be used by a dozen people. I care about the authorization.
If you only have 12 users sure do whatever you want. If you don't have users nothing is hard.
It was 12 users who monitored and managed the ETL job. If I had 1 million users, what difference would the front-end code have made if the backend architecture was secure, scalable, etc.? If the login is taking 2 minutes, I can guarantee you it’s not because the developer failed to write SOLID code…
There you go arguing with strawmen again. I don’t give a single flying flip about SOLID, or Clean Code, or GoF. People who read Clean Code as their first programming book and made it their identity have been the bane of my existence as a programmer.
It’s not about how long something is taking, although that is an observable behavior. It’s about how 1 million users over time will develop ways of using your product that you never thought about, much less documented or tested.
Perhaps you’ve heard the phrase “The purpose of the system is what it does”?
The system is not the spec or the tests. An agent is only reasoning about how to add a new feature, and the only thing preventing it from changing observable behavior is the tests. So if an agent is changing untested behavior, it’s changing the purpose of the system.
That’s not exactly a great argument, depending on undefined behavior. Should I as a developer depend on “undefined behavior” in C? (Yes, undefined behavior is an explicitly defined concept in the C spec.)
On a user-facing note, I did a project where I threw stats into DDB just for my own observation, knowing very well it was the worst database to use since it supports no aggregation-type queries (sum, average, etc.). I didn’t document it, I didn’t talk about it, and yet the developer on their side used it anyway, when I had specifically documented that he should subscribe to the SNS topic I emit events to and ingest the data into his own Oracle database.
No maintainer of, say, a C# or Java library is going to promise that private functions a developer got access to via reflection are never going to change.
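C has no reflection, but here's a single-file sketch (hypothetical names) of the same contract:

```c
#include <stdio.h>

/* --- "library" side: imagine this lives in lib.c --- */
static int internal_helper(int x) { return x * 2; }   /* in no public header */
int lib_double(int x) { return internal_helper(x); }  /* the documented API */

/* --- "user" side --- */
int main(void) {
    printf("%d\n", lib_double(21)); /* supported: 42 today, 42 next release */
    /* The reflection-style move is forward-declaring internal_helper() from
       another translation unit, or dlsym()ing a private symbol. It may work
       today, and the maintainer owes you nothing when the next release
       renames or deletes it. */
    return 0;
}
```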
I’m solely responsible for public documented interfaces and behaviors.
Oh, and that gets back to an earlier point: how do I know my systems will be maintainable? For the most part I design my systems to do a “thing” with clearly defined entry points, extension points, exit points, and interfaces. In the case I’m referring to above, it was a search system based on “agents” (some RAG-based, one using a Postgres database with a similarity search) and an orchestrator. You extend the system by adding a new Lambda, registering it, and prioritizing its results via my vibe-coded GUI.
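As a very rough sketch of that extension-point pattern (hypothetical names, plain C function pointers standing in for the Lambdas):

```c
#include <stdio.h>

typedef const char *(*search_fn)(const char *query);

static const char *rag_agent(const char *q)      { (void)q; return "rag result"; }
static const char *postgres_agent(const char *q) { (void)q; return "similarity-search result"; }

/* The extension point: to add an agent, append one row (the real system
   registers a new Lambda and its priority through the GUI instead). */
static struct agent {
    const char *name;
    int priority;   /* lower = ranked first; registry kept sorted by priority */
    search_fn search;
} registry[] = {
    { "postgres", 1, postgres_agent },
    { "rag",      2, rag_agent },
};

int main(void) {
    const char *query = "quarterly report";
    /* The orchestrator: fan out to every registered agent and emit
       results in priority order. */
    for (size_t i = 0; i < sizeof registry / sizeof registry[0]; i++)
        printf("%d [%s] %s\n", registry[i].priority,
               registry[i].name, registry[i].search(query));
    return 0;
}
```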
Apple, for instance, is famous for not caring if you used private APIs and they broke in a new version.
This is a topic I happen to know a little about. You as a programmer should probably avoid UB for the most part, but the key point here is that programmers don’t follow this rule.
A while back a study found that SQLite, PostgreSQL, GCC, LLVM, Python, OpenSSL, and Firefox all contained code that relied on signed integer overflow wrapping around. Even though the C spec says signed overflow is UB, almost every CPU you’ll run into uses two’s complement, so it naturally wraps.
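The canonical example, as a sketch (exact results depend on your compiler and flags):

```c
#include <stdio.h>
#include <limits.h>

/* UB: for signed int, x + 1 overflows when x == INT_MAX. On two's-complement
   hardware it wraps to INT_MIN, so this check "works" -- exactly the kind of
   code the study found shipping in real projects. An optimizer is allowed to
   assume the overflow never happens and fold the whole test to 0. */
int will_wrap(int x) {
    return x + 1 < x;
}

int main(void) {
    /* Typically prints 1 at -O0; GCC/Clang at -O2 may print 0, unless you
       compile with -fwrapv to make signed wrapping defined. */
    printf("%d\n", will_wrap(INT_MAX));
    return 0;
}
```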
When compiler authors tried to aggressively optimize based on that UB and broke everything, they had to roll it back and/or release flags (GCC’s -fwrapv, for example) so users could keep the old behavior.
This kind of stuff happens all the time. The C spec is nearly worthless on its own, because what matters is what the compilers implement, not what the spec tells them to implement. If you spend time talking to LLVM folks, breaking the world by changing some unspecified behavior is one of their top concerns.
And this is programmers who know how to read specs.
Imagine you’re working on software used by nearly every major movie studio. You think those users have ever read the spec for the software they’re using? They don’t care about UB; they don’t even know the concept exists.
It doesn’t matter how well tested I think my software is. Even very simple software will have unspecified and untested behavior. Give the software a little time and some users, and they will start exploiting that behavior. If I unleashed some agents on our code base to implement well-architected features without reviewing their output, and could somehow magically ensure that they didn’t break any workflow that we had documented, tested, or even knew about as an organization, the head of NBCUniversal would be on the phone with my boss’s boss’s boss’s boss demanding we change it back to the way it was within 24 hours.
Users depend on what the system does, not what you as a designer think it does. The purpose of a system is what it does. Not what it says it does.
We’ve been having this argument since the waterfall days. The code is the spec. We aren’t architects drawing blueprints; the code is the blueprint. If it were that easy to design systems up front, all code would already be generated from UML diagrams and flowcharts, like we’ve been able to do for decades.