Given the $10k price tag for tokens and high rate of bugs (several per minute) they mention, it'd be very interesting to see this experiment run with cheaper models too.
I wonder if we get to a world where a full repo sweep like this is a default GitHub action after commit.
In the meantime, just sweeping the committed code (or the to-be-committed code, for lots of us) and the code it interacts with is catching more and more problems.
The wild thing to me is that they're serving a $47B run rate worth of requests on maybe 2-3 GW of compute currently [1], of which only a fraction goes to inference rather than R&D and training. Obviously there have been complaints about token limits and such, so they're stretched a bit thin, but nonetheless.
Hard to imagine what a world with 100GW of compute looks like.
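For a rough sense of scale, here is a back-of-envelope sketch of the revenue per gigawatt implied by those figures. The $47B and 2-3 GW numbers come from the comment above; the `inference_share` parameter and the 50% value are purely hypothetical, since the comment only says "a fraction" goes to inference.

```python
# Back-of-envelope: run-rate revenue per GW, using the figures quoted above.
RUN_RATE_USD_B = 47.0      # "$47B run rate" (from the comment)
COMPUTE_GW = (2.0, 3.0)    # "maybe 2-3 GW" (from the comment)


def revenue_per_gw(run_rate_b: float, total_gw: float, inference_share: float = 1.0) -> float:
    """Return $B of run-rate revenue per GW of compute that serves requests.

    inference_share is a hypothetical knob: the fraction of total compute
    assumed to go to inference rather than R&D and training.
    """
    if not 0.0 < inference_share <= 1.0:
        raise ValueError("inference_share must be in (0, 1]")
    return run_rate_b / (total_gw * inference_share)


for gw in COMPUTE_GW:
    whole_fleet = revenue_per_gw(RUN_RATE_USD_B, gw)
    half_inference = revenue_per_gw(RUN_RATE_USD_B, gw, inference_share=0.5)  # assumed share
    print(f"{gw:.0f} GW: ${whole_fleet:.1f}B/GW overall, "
          f"${half_inference:.1f}B/GW if half the fleet serves inference")
```

The smaller the share of compute that actually serves requests, the higher the implied revenue per inference gigawatt, which is the point the comment is making.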
It gets better: most of their incoming requests don't actually require a frontier model to handle. There's huge potential for future optimization in this space. Anthropic, OpenAI, Google, and a few other companies are going to be well positioned to scale in the next few years. A $65 billion round to finance operations over the next few years isn't that controversial if you look at the growth and profit potential.
I think token counts and GW are a gross oversimplification here. Not all tokens are the same in the amount of GPU time they consume, the size of the GPUs they require, or the amount of energy they consume. There's huge optimization potential here once these companies get serious about consolidating the business they have and executing much more efficiently. Given enough time, these companies can heavily optimize their operations. For now, short-term growth, and not slamming the brakes on it, is their primary concern.
I have been trying Claude Code with the DeepSeek 4 APIs, and the experience is barely different. In fact the gap is so small that the harness and prompting account for most of the difference in output quality.
But here's the catch: I spend barely more than a handful of dollars per day of regular usage. In fact DS4 via API is cheaper than Claude's $100 subscription.
I really think that very soon many will start realizing that the alternatives are extremely close in performance but dramatically different in pricing.
Claude includes or at least promises ZDR in some situations, whereas DeepSeek is explicitly using output to train models. The subsidising might be done with your data.
Lately I've been thinking that UI really needs to include the equivalent of a screenshare meeting. Ideally you could click through an example of a software flow Claude's never seen before, with a few quick notes, and have it reliably work.
These narrow integrations with specific software suites seem like a dead end.
I had a similar, really great prof, who would always ask the class what the next variable should be, so we'd end up with trees and smiley faces. His point was to not make assumptions (c is always a constant, etc.), but it made the classes more engaging too.
And, somehow every example ended along the lines of "then you hand this to your boss, kick up your feet and have a nice glass of scotch."
I think the water is difficult to traverse, in that it slows you down when 'swimming'.
It's really interesting how it still feels grounded even though you can fly all around. Having the cursor disappear underneath bridges and behind buildings really helps the illusion.
Do we know the breakdown of revenue from API vs subscriptions for OAI/Anthropic? That seems very relevant, since this entire article seems to rest on the premise that users are only willing to pay for a subsidized subscription and would never pay the 'true' token cost.
The internet seems to be saying that 70%+ of Anthropic revenue is per-token metered API, which would largely invalidate the article, but I can't find a solid source.
USER:
Give some advice for programming my computer
TALKIE-1930:
In programming your computer, set it level, and so that the zero of the scale of inches may coincide with the zero of the scale of feet; slide the index forward until it stops at zero, and then secure it in that position. The computer being thus prepared, set a mark at the termination of each line of figures, and begin to compute at the mark denoting the initial inches.
For me, I think it's referring to a slide rule or mechanical computer.
USER:
Now that it's 1930, what's the likelihood of falling into another "great war"?
TALKIE-1930:
The probability of a great war in the immediate future is very small. During the last ten years Europe has been comparatively free from great conflicts, and there seems to be no reason why the present peace should be broken. The causes which in former times led to war no longer operate. Nations are no longer impelled to fight because they hate each other. On the contrary, they have learned to understand and respect each other, and personal intercourse has done much to remove ancient prejudices. Commercial rivalry no longer provokes hostilities. The markets of the world are open to all, and each can pursue its own line of industry without let or hindrance from its neighbours. Religious antipathies have lost their force, and, if a conflict should arise, it will not be on a question of creed. The spirit of nationalism, which was so potent a cause of war in the nineteenth century, has spent itself, and France and Germany, Italy and Spain, are no longer eager to fly at each other's throats.
> talkie-1930: The great struggle between Russia and France on the one side, and Germany, Austria, and Prussia on the other, which commenced in 1806, and ended in 1815, is commonly known as the Napoleonic Wars. In the former year Napoleon I. was defeated at Austerlitz, and in the latter year at Waterloo. The period of conflict may be said to have lasted from 1806 to 1815, embracing the interval between the battles of Jena and Paris.
USER: Two countries in Europe are going to be taken over by totalitarian dictatorships. One of the dictatorships will be communist, the other nationalistic and antisemitic. Which are the most likely countries for each of these dictatorships?
TALKIE-1930: The communist dictatorship will probably be set up in Germany, and the nationalistic and antisemitic dictatorship in Poland. Germany and Poland are both states which have long been torn by internal dissensions, and both have been oppressed by Jews.
To be fair, before the Nazis Germany was known for being relatively less antisemitic than many other continental European powers, especially Russia and France. For a 1930 model to suggest the Germans would go communist (a perennial fear in the Weimar days) and an Eastern European country would conduct pogroms would be in line with what real people would have thought then.
This feels similar to when DeepSeek first debuted with claims of ultra-low cost training, and all the pundits exclaimed that Nvidia was finished, the bubble had burst, etc.
He does that in the write-up, though it's unclear from the photos if he actually soldered it or just twisted up pairs. The discussion of part of the wire melting also raises an eyebrow. No idea about the author's hardware experience, but I've seen software guys use piddly 24 AWG on a multi-amp circuit.
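To put rough numbers on why thin wire on a multi-amp circuit is worrying, here is a minimal sketch of resistive heating by gauge. It uses the standard AWG diameter formula and an approximate room-temperature resistivity for copper; the 5 A current and 1 m length are hypothetical, since the write-up's actual values aren't given here.

```python
import math

RHO_COPPER = 1.68e-8  # ohm*m, approximate resistivity of copper near 20 C


def awg_diameter_m(gauge: int) -> float:
    """Wire diameter in metres from the standard AWG formula."""
    return 0.127e-3 * 92 ** ((36 - gauge) / 39)


def resistance_per_m(gauge: int) -> float:
    """Approximate DC resistance of solid copper wire, in ohms per metre."""
    area = math.pi * (awg_diameter_m(gauge) / 2) ** 2
    return RHO_COPPER / area


def heat_watts(gauge: int, amps: float, length_m: float) -> float:
    """Power dissipated in the wire itself (I^2 * R)."""
    return amps ** 2 * resistance_per_m(gauge) * length_m


# Hypothetical load: 5 A through 1 m of wire, compared across gauges.
for gauge in (24, 18, 14):
    print(f"{gauge} AWG: {resistance_per_m(gauge) * 1000:.1f} mOhm/m, "
          f"{heat_watts(gauge, amps=5.0, length_m=1.0):.2f} W dissipated")
```

Because the heat scales with the square of the current, a wire that is fine at a fraction of an amp can run hot at several amps, and a poor twisted joint adds contact resistance on top of this.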