So many! I manage a fund that buys small e-comm brands, and at this point the whole thing runs on a combination of AI and tools created with AI. My favorite is one that scrapes my Alibaba/WeChat/WhatsApp/email supplier convos daily and uses that to build a dashboard tracking the status of my orders.
This is definitely the most interesting question in a ton of AI applications. I think folks should really be spending a lot of time figuring out how to deterministically and reliably check AI outputs in order to reduce the amount of work a human has to check, and on building tools that speed up the checking process.
Thinking about all of the fake citations in legal submissions that have come up of late, it seems pretty straightforward to set up a regex that captures all forms in which a cited case might be written (I could be wrong but I'd assume there's some standard variety of formats) and search those against a database (again assuming such a database exists) to ensure they all exist.
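A minimal sketch of that first check, assuming a simplified "volume reporter page" citation format and a stand-in set of known citations in place of a real case-law database (both are assumptions for illustration; real citation formats are far more varied, and `KNOWN_CITATIONS` is hypothetical):

```python
import re

# Simplified pattern for "volume reporter page" citations such as
# "410 U.S. 113" or "347 F.3d 1024". Assumption: only a handful of
# federal reporters are covered here.
CITATION_RE = re.compile(
    r"\b(?P<volume>\d{1,4})\s+"
    r"(?P<reporter>U\.S\.|S\. ?Ct\.|F\. ?Supp\.(?: ?[23]d)?|F\.(?: ?[234]d)?)\s+"
    r"(?P<page>\d{1,5})\b"
)

# Hypothetical stand-in for a lookup against a real database of cases.
KNOWN_CITATIONS = {"410 U.S. 113"}


def extract_citations(text: str) -> list[str]:
    """Return every citation found in the text, in a normalized form."""
    found = []
    for m in CITATION_RE.finditer(text):
        reporter = m.group("reporter").replace(" ", "")
        found.append(f"{m.group('volume')} {reporter} {m.group('page')}")
    return found


def find_unverified(text: str, known: set[str]) -> list[str]:
    """Return citations that do not appear in the known set."""
    return [c for c in extract_citations(text) if c not in known]


if __name__ == "__main__":
    # The second citation is deliberately made up to stand in for a fake one.
    brief = "See Roe v. Wade, 410 U.S. 113 (1973); Doe v. Acme, 9999 F.3d 1 (2021)."
    print(find_unverified(brief, KNOWN_CITATIONS))
```

Anything the function returns would go to a human for a closer look; an empty list only means the citations exist, not that they say what the brief claims.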
Then for the tougher problem of making sure that the cited cases say whatever the document citing them says they do, you could have an LLM run through the document, pull out the text with the case name and text about why it's being cited, then read the case and try to determine whether the reason for citing it is valid. Rather than just give a yes/no, you'd put the doc in front of the user and let them jump from citation to citation. On each citation, it'd pop up a card that shows the literal text of why it's being cited, a judgement from the LLM of whether it matches what the case says, and snippets of text from the case as evidence + deeplinks to that text within the case.
Or maybe you wouldn't even want to give the LLM's judgement since people might rely on that without reading, but there's definitely a way to speed up the review.
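Here's roughly what the data behind one of those cards could look like. This is only a sketch: the class names, fields, and the `show_judgement` switch are all hypothetical, and the LLM step that would fill them in is left out entirely.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EvidenceSnippet:
    text: str      # literal text pulled from the cited case
    deeplink: str  # hypothetical link to that text within the case


@dataclass
class CitationCard:
    citation: str      # the citation as written in the document
    citing_text: str   # literal text explaining why the case is cited
    evidence: list[EvidenceSnippet] = field(default_factory=list)
    llm_judgement: Optional[str] = None  # may be withheld from the reviewer


def render_card(card: CitationCard, show_judgement: bool = False) -> str:
    """Build the plain-text card; the LLM judgement is hidden by default."""
    lines = [f"Citation: {card.citation}", f"Cited for: {card.citing_text}"]
    if show_judgement and card.llm_judgement is not None:
        lines.append(f"LLM judgement: {card.llm_judgement}")
    for snippet in card.evidence:
        lines.append(f'Evidence: "{snippet.text}" ({snippet.deeplink})')
    return "\n".join(lines)
```

Defaulting `show_judgement` to off reflects the worry above that reviewers might lean on the verdict instead of reading the evidence.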
I believe OpenEvidence does something like this with medical papers. If you ask it a medical question, it doesn't answer so much as link you directly to the relevant papers so you can read them and determine if they're useful. Avoids all of the potential risks of using an LLM but still hugely valuable and time-saving for docs.
1. ChatGPT 3.5 wrote me a script to pull some data out of Shopify and write it to a Google Sheet. Nothing remotely impressive by today's standards, but I had just commanded a computer to write code in plain English and it worked!
2. I own a bunch of e-comm brands, and with every new image model I tried to get product photography. Nothing worked until Nano Banana Pro, when suddenly I gave it a crappy iPhone pic of a product and got back a fully usable whitebox photo of it. Then I tried making the sort of infographic-style images you usually see on Amazon, and it nailed those too! In hindsight they weren't perfect, but more than good enough to use. I was about to ship that product to my photographer, and I would've had my designer make the infographic images, so that was the first time AI actually replaced a human contractor for me. Pretty big "Oh shit this is going to seriously impact employment" moment. Wrote about it here: https://theautomatedoperator.substack.com/p/ai-just-took-my-...
For the most part if you give it sample photos that have sufficient coverage of different angles, it's very good at faithfully reconstructing the product from whatever angle you choose.
The one exception I've encountered is baby mobiles. It really does not understand the physics there.
That depends if AI gets to the point where it can fully replace workers, as opposed to just augmenting them. I heard Alex Imas on a podcast recently talking about how a SWE can be running 10 agents to be 10x as productive, then that SWE is more valuable so firms should want to hire more SWEs and pay them more.
That works for a while, but what if AI gets to the point where it can manage the 10 agents as well as the SWE? Of course you could say the SWE can now manage 10 agents who each manage 10 agents so he's even more valuable, but that has to break down eventually. You don't need 1,000 SWEs each managing 10,000 agents - you hit a bottleneck in the ability to give them work fast enough (even if you need the SWE at the top at all).
I think it's easier to think of from the perspective of blue collar labor. It's further out there time-wise, but let's assume we get a humanoid robot that can do any labor a human can do. It costs $25,000 and maybe a couple grand a year to operate. Works 20 hours a day when it's not charging.
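To put a rough hourly number on that robot, here's a back-of-the-envelope sketch. The 5-year lifespan, reading "a couple grand" as $2,000, and running 365 days a year are assumptions I'm adding for illustration, not part of the scenario above.

```python
def robot_cost_per_hour(
    purchase_price: float,
    annual_operating_cost: float,
    hours_per_day: float,
    lifespan_years: float,
) -> float:
    """Spread the purchase price over the lifespan and divide by hours worked."""
    annual_cost = purchase_price / lifespan_years + annual_operating_cost
    annual_hours = hours_per_day * 365
    return annual_cost / annual_hours


if __name__ == "__main__":
    # $25,000 robot, $2,000/year to operate, 20 hours/day, assumed 5-year life.
    cost = robot_cost_per_hour(25_000, 2_000, 20, lifespan_years=5)
    print(f"${cost:.2f} per hour")
```

Even if the lifespan guess is off by a lot, the result stays far below any human wage, which is the point of the scenario.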
The construction worker it replaces isn't going to start managing a team of construction robots - there's already a GC doing that, and you can't scale building nearly the way you can scale writing code because of physical constraints. When the robot I've described exists, a huge swathe of the population is going to be unemployed. There's no competitor to hire them because the competitors just get robots too.
> The construction worker it replaces isn't going to start managing a team of construction robots
Some of them will. You've slashed construction labor cost to 5% of what it was before. That, plus a similar reduction throughout the supply chain, means we're going to start building a lot more stuff.
Even if we start building a lot more stuff, you still don't need those construction workers. You have a GC to manage the whole project, aided by an AI who's handling scheduling/operations/logistics. You have a detailed plan to build against.
So why do you need former construction workers to manage the robots? Why can't the GC and management AI run the whole thing?
Maybe there's some scenario where you still need something like one licensed person from each skilled trade to be responsible for the robots employing those trades. But there's no way you need everyone who worked on building sites managing robots, no matter how much construction you're doing.
That's pretty far away from where we're at. If things do get that far it's not going to be a problem. Eventually the robots will murder us in our sleep and our worries will be over.
For which poor unemployed people, who just got laid off due to AI, are the robots building houses? More abstractly, for whom are we creating these productivity miracles and this surplus? Does a rich person suddenly need a million iPhones for himself and himself alone?
"Past performance is not indicative of future results". People stopped working on farms because automation freed up manual labor and people could move up the value chain. We are getting to the point where there will be no chain left to go up, at all. If you don't see the difference, I don't know what else to tell you.
this time we're automating humans that can do anything that humans can do.
you should be asking what happened to the horses which were replaced by the tractors.
the answer is elsewhere in the thread, their pop dropped 88% or w/e, we didn't need them anymore.
"Becoming viable" does not mean "automatically put into execution". You still need to take overall demand into account.
Consider this: if demand was not a factor, anyone living in a moderately wealthy country would be practicing labor arbitrage and sending money to poorer places. Ask yourself why this doesn't happen.
1) Corporations moved manufacturing to where labor was cheap, but brought the goods back to sell them. This only works for as long as there is a healthy consumer market somewhere. If AI really gets to automate most white-collar work, there will be no healthy consumer market left anywhere around the world.
2) The essay touches on this: any of the previous offshoring / job displacement movements happened on a much longer timeframe than what is being pushed now by the powers that be.
In a world where it’s dramatically cheaper to build infrastructure like roads, power, and plumbing, lots more land becomes desirable as a place to live.
Take Phoenix, for example, once air conditioning became cheap and pervasive.
The argument isn’t that AI brings the labor cost down to 0 in isolation. It brings the labor cost of the same amount of production down. So you get more production (more things = more supply -> lower prices) out of less labor.
We don't need "AI" to figure out that technological advances increase productivity. The problem with your argument is assuming that increased productivity means overall reduced costs. It does not.
Healthcare, housing, education all have gone up despite increased productivity. Then you have things that are already so automated that there is no way to make them cheaper unless sacrificing quality - food, clothing, etc.
Then we have all the types of consumer products whose prices are completely decoupled from the cost of labor. No one in their right mind thinks the "cost of labor" has any relation whatsoever to Apple charging $1000 for an iPhone and/or Motorola charging $180 for a Moto G.
> Healthcare, housing, education all have gone up despite increased productivity.
The hypothesis of Baumol’s Cost Disease is that these industries are exactly where we should expect prices to rise because they’re still dependent on low-productivity-growth human labor.
We were talking about infrastructure costs under increasing labor productivity. Now what are we talking about?
If the premise is that AI won’t improve productivity in industries like healthcare, education, and housing construction, then why are we worried about “the dead economy”?
No. You are getting it backwards. The premise is: even if AI improves productivity, we the people are not going to benefit from it.
The mistake you are making is assuming that a system where productivity per unit of labor is higher automatically translates into increased global output. It does not. The dead economy theory is precisely the concern that we are heading to a world where machines can make practically everything on the cheap, but it won't matter because the moneyed class won't need to satisfy the demands of the general populace.
So we have a bunch of billionaires sitting around, surveying a world where a much smaller amount of labor will produce a much larger amount of output, and they collectively decide not to hire that labor or spend capital on the technology that generates that output in combination with that labor because… they have enough money already?
No, ffs. You are missing that if they can do whatever they want with just "a small amount of labor", then the whole system gets to a point where Capital becomes the bottleneck for global productivity. People can not be trained faster than the machines can be created, so all that capital will go to an increasingly smaller number of workers.
To illustrate the point: Facebook laid off thousands of developers at the same time that it was hiring AI researchers, paying them tens of millions of dollars as a signing bonus.
Facebook (Meta) mostly “makes” ad space. So in that case, they’re making more / better ad space for the same inputs.
Online advertising is a competitive business, so that means more bang for the buck for Facebook’s advertisers. Now those advertisers have more money to invest in making more / better of whatever they make.
It's also the 'labor theory of value'. That's the economic theory that Communism is based on. It has never been accurate and wasn't even considered legitimate during Marx's lifetime. It has possibly the worst track record of predictions of any theory ever conceived by people. Yet somehow academics still reference it. Nobody who actually is impacted by making the wrong economic predictions does though. Funny that...
> where subways cost 1/20th of what they do today.
1) We are talking about reducing the cost of labor, not overall costs.
2) Your logic only applies in the micro, not in the macro. If the cost of producing one thing goes down while the population keeps its purchasing power, then what you are saying would make sense. The whole point of the article is that accelerated automation can bring about a scenario where the cost of producing "things" goes down, but the economically active population shrinks drastically.
Someone will always have to prompt the AI, it can't just do that on its own. Or rather, maybe it can (you can just prompt it to "kindly do the needful" in a completely unspecific way) but the results won't be any good.
Sure, but the question is at what layer of abstraction do you have to prompt the AI?
You used to have to prompt the AI by starting to write the actual line of code you want, which it could autocomplete. Then you had to prompt it to write simple scripts or functions. The amount of scope you can prompt keeps getting bigger and bigger. Eventually, you have a PM or a CEO just telling it what features you need. Maybe it's a PM and a designer and a CEO and a CTO, but it will eventually get to the point where the number of people you need to do the prompting shrinks orders of magnitude from company sizes today. Maybe you just give the AI some money, prompt it to start a money-making business, and it goes out and does the same research and analysis that a seasoned entrepreneur would do to find an opportunity then builds out the business from there.
> the results won't be any good
Maybe, but I wouldn't bet on that. The trend over time has been that the results from prompting AI to do things have gotten better. I used to prompt it to build me dashboards and it would fail spectacularly. Now it one-shots them. Maybe the code is terrible (though doesn't matter for me, I'm the only one using it and I can verify the dashboard content is correct), but if the trend continues, it'll get better. Maybe the trend won't continue, but I've yet to come across a good explanation of why AI capabilities will just top out and cease improving forever.
The problem with the studies is that they're cases in which specific groups within the broader economy lost jobs. Those aren't really comparable to the (theoretical) path of job displacement of AI for a couple of reasons:
1. Those people didn't get substantial, ongoing financial assistance. If we end up in a UBI world, particularly one where the UBI people get is high enough to get more or less anything that's not very scarce (e.g. land in coastal cities), the negative economic component of job loss is removed.
2. Everyone else still had jobs. When you lose your job and everyone else continues to work and be successful (or at least you perceive that to be the case), there's a big hit to the status and meaning in your life. If everyone is affected in the same way, then your relative status to others remains unchanged, and everyone collectively needs to reorient society to find their meaning.
I'm not saying it will go well, but I do think there's a theoretically possible path where there is large scale unemployment but because we have nigh infinite productivity, everyone has access to unlimited non-scarce resources (including luxury cars and fine foods and whatever medical treatment they need), and we end up with an enormous number of competitive leagues of everything, events centered around music and arts, dinner parties and all manner of other social activities that are what give people meaning.
Let's grant the premise. UBI, significant enough to live well on, luxury cars and dinner parties for all.
Who sets the amount? Who controls the infrastructure producing the unlimited resources? What happens when you vote the wrong way, or protest the wrong policy, or simply become inconvenient?
A population with no economic function has no leverage with which to resist a reduction, a condition, or a withdrawal. You're describing a world where 99% of the people are entirely dependent on the goodwill of whoever owns the machines, and you're treating that goodwill as an unchanging variable. The history of every human institution suggests that power without accountability eventually behaves like...power without accountability. Even assuming the benevolence of the people holding all the cards isn't naive optimism, it's the same mistake that makes people say real communism just hasn't been tried yet.
Oh yeah, to be clear I fully agree with everything you've said. My core argument is that there will be sufficient economic productivity for everyone to live incredibly well. Whether or not that happens depends entirely on the people who control both economic and political power. That keeps me up at night, and things could go horribly wrong.
I guess the optimistic side of me thinks that benevolence wins out because there's no cost to it. There is plenty of competition among the wealthy for scarce resources, but food, medicine, and mass-produceable luxury goods are effectively free. Given that, it's probably just easier to give those away to everyone than to crush most of humanity by force. But that is absolutely naive optimism, because I really have no control in this situation and prefer feeling naive optimism to pessimism.
And on the communism front, I will just say that I find it some combination of deeply amusing/ironic/depressing that the people on the far left protesting AI because it'll take jobs are protesting the very technology that could, in fact, lead to the first successful incarnation of communism!
> Whether or not that happens depends entirely on the people who control both economic and political power.
Then we're doomed. The kinds of people who seek and amass that power are not the kinds of people who will treat the teeming unemployed masses with respect and largess.
> protesting the very technology that could, in fact, lead to the first successful incarnation of communism
Communism's failures are due to human social factors, not technological. You can't fix social problems with technology.
> Communism's failures are due to human social factors, not technological. You can't fix social problems with technology.
Yeah, that's fair. The argument would be that the core reason communism failed is that people inherently want to have greater status than those around them, so the ones who were in charge used their power to grant themselves a greater share of resources in order to demonstrate their greater status. If we have infinite non-scarce goods, whoever's in charge can still let everyone have as much as they need of those while demonstrating their greater status through scarce goods.
To be clear this isn't a prediction (I have no idea what's going to happen!), just the case I could see for this being the first version of communism that works. Though also it's not really communism, because everyone is not in fact equal; it's something like a pseudo-communist giant welfare state.
My feeling is it's not as bad of a metric as people think. Companies don't fully know the best way to use AI and things are changing rapidly, so you want people using a lot of tokens even on stuff that seems maybe kind of dumb on the surface, because if you find one useful thing and share it in the org that makes up for a lot of failures.
But I do think you also need to say, "To be clear, don't game the system. Any token usage that is even remotely justifiable as useful for the business is fine, and we will give you a lot of latitude. But if you're in the top 10% of token users, we are going to review your token usage, and if we find that you have a dozen agents perpetually running writing slam poetry, you're going to get fired."
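The "review the top 10%" part is easy to automate. A minimal sketch, assuming you already have per-user token totals in a dict (the function name and data shape are hypothetical):

```python
def users_to_review(token_usage: dict[str, int], top_percent: int = 10) -> list[str]:
    """Return the heaviest token users, highest first.

    Picks the top `top_percent` percent of users by total tokens,
    always including at least one user when there is any usage data.
    """
    if not token_usage:
        return []
    # Integer math avoids floating-point surprises when sizing the cutoff.
    count = max(1, len(token_usage) * top_percent // 100)
    ranked = sorted(token_usage, key=token_usage.get, reverse=True)
    return ranked[:count]
```

This only picks who gets reviewed. Deciding whether the usage was "remotely justifiable" is still a human call.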
NVidia will probably sue you for doing that, though.
Remember that the entire mantra of "productivity is a measure of how many shovels you break and replace" is only ever echoed by the one selling the shovels.
Great stuff. My favorite genre of writing about AI is seeing how it can be practically applied to non-tech jobs/businesses. Wish we had more of this.
I'm curious about the 60% automation of financial/forensic analysis - what's missing? Is it stuff that's purely blocked by model capabilities, or are there places where scaffolding is likely to bridge the gaps?
Also curious about the workflow - is this more individual, LLM-driven features or agentic workflows? Looked like the former from the product video but there wasn't a ton of UX shown there.
I ask largely because this seems like the sort of thing where you could really start to string these features together in such a way that you start with a description of the case and whatever files you have, and then an agent does its analysis of the docs, spins up action items (get missing docs, confirm that X ambiguous doc is what the AI characterized it as, etc.) and tracks the progress of all of them, leaving your forensic accountant there in a supervisory role, managing and providing expertise.
It feels like that's the way a lot of expert analysis jobs like this are headed. I've been working on the same sort of flow to use agents to manage my business. Started with LLM skills that could be used to handle tasks I used to do myself, and since then I've increasingly been having AI use those skills on its own without me invoking them and chain things together into full blown workflows. Some parts I'm still supervising closely, but others that have been working consistently for a while I now don't really watch unless Claude flags something for me to review on my dashboard.
100% agree with this. I think takes like OP's would be much more interesting if they staked out a position in the future. I think it's pretty uncontroversial to say that someone with a great deal of technical expertise is going to be a hugely more effective LLM user today.
The question that really matters is whether that will continue to be the case. My guess is that technical expertise matters less over time, and the ability to specify the desired outcome is eventually the only thing that becomes important. But I could be wrong! The direction this all goes is pretty fuzzy in my mind.
> My guess is that technical expertise matters less over time, and the ability to specify the desired outcome is eventually the only thing that becomes important
if you look at LLM-based coding as another step up in programming abstraction then it's clear this is the case. Think about the progression of programming languages. Over time, we go further and further from the hardware and closer and closer to specifying the desired outcome. The terminology, structure, and completeness of a user story that guides a coding agent to the desired output, and only the desired output, is the new programming language.
> if you look at LLM-based coding as another step up in programming abstraction then it's clear this is the case. Think about the progression of programming languages. Over time, we go further and further from the hardware and closer and closer to specifying the desired outcome. The terminology, structure, and completeness of a user story that guides a coding agent to the desired output, and only the desired output, is the new programming language.
But that entire narrative follows from one, single, very big "If". It is not a given that AIs are a step up in abstraction.
Like, copying the answers in a test isn't considered an abstraction, I don't consider copy-pasting AI into your codebase an abstraction.
in the case of tools like Claude Code there's no copy/pasting. Claude Code updates files directly, runs tests, starts/stops the server, and does everything else on its own (with your permission).
I guess to take it a step further, you can lay out your requirements in order, with guidance, in a markdown file called 'myprogram.md'. Then tell Claude Code to read that file and do what it says. In that way, myprogram.md, which is really your requirements doc, is the programming language being turned into the 1s and 0s the computer understands.
I think the problem with this logic is it's based on the capabilities of LLMs today and really fails to address the prospect that they will continue to improve.
I used to be a PM and am technically literate enough but can only very minimally write code. I have been using LLMs to build (or try to, at least) internal tools for my business since GPT-4.
In the early days, I'd get a little ways, then the LLM would start breaking things, and I'd try but fail to get it to fix things. But over successive generations, I was increasingly able to get it unstuck by offering suggestions on where it may have gone wrong. With Opus 4.7, I don't even really have to do that - if something isn't working it's usually sufficient to just tell it what's broken. It can figure out how to fix it without my input. And of course fewer things are broken in the first place.
So I think I'm very well positioned to understand how these things are improving - better able to get the LLM to do what I want than the post OP quoted from /vibecoding (though I am 99% sure that post is actually AI slop), but less so than most of the people posting in this thread. As they've improved, whatever ability I have to guess at the causes of problems based on my experience having seen things go wrong with products I've PMed has become less necessary to getting the right outcome.
I expect that trend to continue - increasingly the LLM won't need the guidance of people with a great deal of technical expertise. I basically no longer have to attempt to diagnose problems in order to get them fixed, though with the caveat that I am building internal tools for which I am the only user, so certainly much simpler in scope than the stuff OP is talking about.
> Without guidance, LLMs tend to paint themselves into a corner, because they’re generating code to solve individual prompts, not thinking holistically about an application’s architecture.
The crux of what I'm trying to say here is that I absolutely believe that this line is 100% true today, but I would be deeply cautious about assuming that it will continue to be true given the improvements in LLMs over the past few years.
If we assume that there will be an AI that is perfect in terms of ability to find vulnerabilities, cheap to run and widely available to everyone, then anyone can run it on any piece of software before deploying it. All vulnerabilities get found before they can be exploited.
One of the big challenges with cybersecurity is that attackers only need to find one exploit, while defenders need to stop everything. When you have a large surface area and limited resources, it's much easier to be the side that only has to succeed once. AI eliminates the limited resources problem.
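A toy model makes the asymmetry concrete. It assumes each vulnerability is caught by the defender independently with the same probability, which is my simplification for illustration and not how real systems behave:

```python
def attacker_success_probability(
    num_vulnerabilities: int, defender_catch_rate: float
) -> float:
    """Chance that at least one vulnerability slips past the defender.

    Assumes each vulnerability is caught independently with the same
    probability, so the defender catches all of them with probability
    defender_catch_rate ** num_vulnerabilities.
    """
    p_all_caught = defender_catch_rate ** num_vulnerabilities
    return 1.0 - p_all_caught


if __name__ == "__main__":
    # A defender who catches 99% of issues still loses often at scale.
    for n in (1, 10, 100):
        print(n, round(attacker_success_probability(n, 0.99), 3))
```

Under this model the attacker's odds only drop to zero when the catch rate is effectively perfect, which is the "perfect, cheap, widely available" AI premise above.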
I write a Substack about the whole thing and have a pretty comprehensive list here: https://theautomatedoperator.substack.com/p/15-ways-im-using...