Hacker Times | new | past | comments | ask | show | jobs | submit | robotswantdata's comments | login

Why are people still using Ollama? Seriously.

Lemonade or even llama.cpp are much better optimised and arguably just as easy to use.


`ollama serve` and `ollama run`

The devex is great and familiar to folks who have used Docker. Reading through the Lemonade documentation, it seems like a natural migration, but we're talking about two steps for getting started versus just one. So I'd need a reason to make that much change when I'm happy enough with Ollama.


Why not? Also serious.

It seems to just work every time I try to use it, the API is easy to work with, the model library is convenient. I've never hit any kind of snag that makes me look elsewhere.


Serious answer: I don't use it that much, it's what I happened to download like 1.5 years ago, and it works fine. Happy to see what may be a speed boost, and have little interest in switching to something else (unless my situation changes, of course).

I like Ollama, mostly because the CLI is pretty nice. Its desktop app makes some stupid choices, though: if a model supports tools, the UI should give me the "search" option, but it only shows up for cloud models.

I ran LM Studio for a while, but I don't really use local models much other than to mess about.


You can also use OpenWebUI locally which should give you a nice friendly UX once you set it up.

Don’t really get the purpose of this apart from throwaway projects.

For vibe coders, is it really “hours” setting up a database these days? GCP Cloud SQL + Drizzle ORM is minutes and actually scales, unlike a spreadsheet. Heck, Claude can even write you a deployment script and run it via the gcloud CLI.


Cloud SQL costs gazillions, sheet is free (other than selling your data)

>sheet is free (other than selling your data)

Except the sheets-to-api SaaS charges $9/month if you want more than 250 requests.


Cloud SQL's lowest tier is pennies a day, and this Ninja platform is also not free.

A spreadsheet is a misclick away from corruption, why not spend another prompt on getting Claude to configure a db?


Which works out to $100 USD / year. You might think that's trivial, but when you start provisioning multiple environments across multiple projects, it adds up.

It's a shame that Google hasn't managed to come up with a scale-to-zero option or a compatible serverless alternative.


Sheet Ninja is 108 USD / year and has tiny capacities on every metric. SQLite is free and would stomp this in every aspect on low-budget hosting. Even a tiny API that stores CSV would be orders of magnitude more efficient.

But what would scare me the most is that Google can easily shut this thing down.
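For scale: a zero-cost SQLite "backend" is a few lines of standard-library Python. (The `signups` table here is invented purely for illustration.)

```python
import sqlite3

# Zero-dependency local database; swap ":memory:" for a file path to persist.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE signups (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO signups (email) VALUES (?)",
    [("alice@example.com",), ("bob@example.com",)],
)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM signups").fetchone()[0]
print(count)  # 2
```

No request caps, no monthly fee, and no third party that can revoke your access.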


It is trivial to set up a database on GCP if you know what you are doing, and I would pay Google for that stability and for support in setting up multi-tenancy and regions.

Using Google spreadsheets as a backend will just cause them to charge everyone later.

Sheet Ninja isn't free. Even on their side, "free" does not mean what you think it means.


Set up a DB project and use the same Cloud SQL instance for all your DBs. I did that for years on non-prod and experimental projects. $100 is a bargain for what you get in terms of resiliency.

> Cloud sql lowest tier is pennies a day

Unless things have improved, it's also hideously slow: trivial queries on a small table take tens of milliseconds. Though I guess that if the alternative is Google Sheets, that's not really a concern.


You can fire up a burstable postgres for about $20/mo

Most are lucky to get a few sign ups.

> Cloud SQL costs gazillions,

WTF is "Cloud SQL"?

I have a postgresql server running on a $5/m VPS that I add DBs to as and when I explore some new idea.



SQLite is enough for 98% of all of these use cases, and 100% of the ones this would appeal to

Don't do this, guy; Cloud SQL costs a lot.

Costs a lot? It’s a bargain for globally resilient infrastructure.

db-f1-micro is about $10/month including storage for something that just works and can scale, be shifted on-prem, etc. You can run all your vibe-coded slop on one instance.


I think it can be useful if you want to use an existing Google Sheet, or if your users want to modify the database directly in Google Sheets, even though it seems pretty risky.

DGX workstations: expensive, but they allow PCIe cards as well.

https://marketplace.nvidia.com/en-us/enterprise/personal-ai-...


It's hilarious that not a single one of these has pricing listed anywhere public.

I don't think they expect anyone to actually buy these.

Most companies looking to buy these for developers would ideally have multiple people share one machine, and that sort of arrangement works much more naturally with a managed cloud machine than with the tower format presented here.

Confirming my hypothesis, this category of device is more or less absent from the used market. The only DGX workstation on eBay has a GPU from 2017, several generations ago.


Nvidia doesn’t list prices because they don’t sell the machines themselves. If you click through each of those links, the prices are listed on the distributor’s website. For example the Dell Pro Max with GB10 is $4,194.34 and you can even click “Add to Cart.”

I don't mean the small GB10s.

If you try to find the pricing of the GB300 towers even on the manufacturer sites, you'll see that it's not listed for any of the six or so models.


Because that's a different price point, getting near $100K, and availability is very limited. I don't think they're even selling it openly, just to a bunch of partners...

The MSI workstation is the one with some pricing floating around. Some distributors are quoting USD 96K with a wait time of 4 to 6 weeks [0]; others say 90K and also out of stock [1].

--

  0: https://www.cdw.com/product/msi-nvidia-gb300-wkstn-72c-grace-cpu/9087313?pfm=srh
  1: https://www.centralcomputer.com/msi-ct60-s8060-nvidia-dgx-station-cpu-memory-up-to-496gb-lpddr5x-nvidia-blackwell-ultra-gpu-1x-10-gbe-2x-400-gbe.html

> I don't think they're even selling it openly, just to a bunch of partners...

Yes, that's my point.


Isn't that because nobody has released one yet? They are brand new.

I don't think it's so odd; very few products above ~$50k have final prices listed for anyone to buy with one click.

Workstations above $50k are not that uncommon.

Older Xeon-based workstations easily reach that number.


If you put a 50 or 80K workstation in the HP store, it will say:

"Purchasing limit reached. To complete your order and provide you with the best customer experience, please call 1-877-888-8235"


'Important' people in organizations get them. They either ask for them, or the team that manages the shared GPU resources gets tired of their shit and they just give them one.

Yes, I agree this is the use case.

Since the user here is not paying for it directly, the manufacturer does not have any incentive to list prices anywhere.


There were plenty of them around when I worked at Nvidia. They definitely exist.

You have seen plenty of third party GB300 DGX workstations?

How much do those workstations cost? All of the different manufacturers links on that page lack pricing info and you have to contact them for pricing.

Cheapest I know of is around $96k.

$4000

$4k is for GB10 (DGX Spark reference design). $90-100k is for GB300 (DGX Station reference design).

Ignore the expected negativity; many here have not used the latest generation of voice agents in development. Even if it's used just as a router, I'd prefer that to waiting to get through.

I was agreeing with all the nay-saying comments, but yours made me see the idea as good. I guess the word "luxury" ruined it for OP.

But a speech-to-text and text-to-speech system that I know is "understanding" me would be great, rather than hold music. The shop could even sell it as "As a small shop, most of our employees are busy fixing cars, so we are using AI to help with calls" (although then people who are anxious about AI stealing jobs might hang up). The robot can ask me what I need, and then say "So for [this service], the price would be..." (to tell the caller what it has understood).

If the AI can even look at gaps in the shop's schedule and set an appointment time, the customer might even be happy that they just spent a minute on the phone instead of 10+...
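The scheduling part, at least, is ordinary interval arithmetic rather than anything AI-specific. A sketch under that assumption, where `find_free_slots` and the example times are hypothetical, not any vendor's API:

```python
from datetime import datetime, timedelta

def find_free_slots(booked, day_start, day_end, job_length):
    """Return start times where a job of job_length fits between bookings."""
    slots = []
    cursor = day_start
    for start, end in sorted(booked):
        if start - cursor >= job_length:
            slots.append(cursor)
        cursor = max(cursor, end)  # tolerate overlapping bookings
    if day_end - cursor >= job_length:
        slots.append(cursor)
    return slots

day = datetime(2025, 1, 6)
booked = [
    (day.replace(hour=9), day.replace(hour=11)),
    (day.replace(hour=13), day.replace(hour=15)),
]
free = find_free_slots(booked, day.replace(hour=8), day.replace(hour=17),
                       timedelta(hours=2))
print([t.hour for t in free])  # [11, 15]
```

The hard part the agent adds is turning "my brakes are squealing, can you fit me in Thursday?" into those arguments.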


I would rather just be sent to a regular old answering machine. Dealing with an AI is dehumanizing. In almost every case where I actually need to call a place, it's because I need to talk to them about something an automated system, like one for booking appointments, can't handle.

Congrats..?

A friend of mine worked for a call center that did car rentals; old people would call them and ask to rent a car.

Maybe the AI system should have "Press 1 to talk to AI, press 2 to leave a message" so experts like you can press 2.


I know it's intended to be dismissive, but I would appreciate the choice.

Even if the new model that came out last week totally fixed all the problems, this time for real, most people's experience with chatbots is that they are prone to misunderstanding or making false statements ("hallucinations").

I have yet to experience any degree of confidence in any output from an LLM, so I'd rather leave the message. I don't know how common this point of view is.


A brutal market for lemons: the last 100 times they heard a robovoice on the phone they had a terrible experience, and any money you spend fixing this is wasted, because the customer can't tell your robovoice is actually honest and capable of making commitments. They all sound perfectly confident and correct, even the ones that know nothing and will promise anything.

Sounds like the typical dealer experience, minus the AI.

The current deployments of chatbots are not the bar to compare with. There’s an incoming wave of extremely capable agents and process reimagining that is going to be highly disruptive.

Been in this space over a decade and this time really is different. It’s hard for humans to perceive the exponential, it will be slow then sudden.


Let's go back to waterfall even harder and write the super-correct and detailed design doc.

You jest, but this is precisely what we have done. Our customers have outright rejected Scrum. It is considered a waste of money.

At a recent AI workshop, management made clear that they see AI as rendering sprints and scrums obsolete, that Kanban makes a lot more sense, and that estimating effort/story points is also becoming meaningless. Which is a strong silver lining, if you ask me.

I want to understand how AI leads to this outcome.

I think it's to do with the bottleneck shifting away from code generation and towards specifying and reviewing and integrating code. The process of working with AI agents to produce specs, tech specs, code, and reviews lends itself more to a flow-based structure (like kanban).

Bear in mind this is a B2B enterprise company with a mix of legacy and greenfield. And management has invested heavily into designing a robust spec/context-based workflow for using agents. Might be different elsewhere.

Personally I don't think scrums, planning, retros etc were better than kanban even before AI, at least if you have switched-on, motivated and smart people on your team. They actually made things less agile, and story-points give a false sense of predictability. Imo the crucial factor may be that AI agents are smart and switched-on (with the right context).


It's a good excuse to move away from a shitty process; I'll take it! Fuck Scrum, fuck Agile. No one was doing it anyway. I had to quit an Agile job because I was shipping shit without ever getting a lick of feedback, and this was not some low-stakes webdev work; it was for planning expensive real-world installations.

The next AI I’m working on is going to be amazing and will change the world. Please back my Series A; you won’t regret it.

(Let’s not talk about my blockchain startup and my VR startup and my NFT startup). My house is nice though.


Are you sure about that one?

What exactly will these agents be able to do with enough consistency, accuracy, and reliability that people will want to hire them over humans?

In my experience with even the most basic implementation of agents, i.e. customer service chat bots, I literally cannot stand interacting with them even once. They are extremely unhelpful and I will hang up or immediately ask to speak to a human.


Obviously your support chatbot will talk to your flavor of clawd, which will call Claude Code, which will code a solution, which will be reviewed by Codex, which will merge and release it and then ping clawd to send an email to the user announcing that their issue has been fixed. /s, just in case.

I’ve been involved in building a system that reads structured data from a special form of contract used in a specific industry: prices, clauses, pick-up, delivery, etc. A couple hundred datapoints per contract.

We had many discussions around how to present and sell an imperfect system. The thing is, the potential customers today transcribe the contracts manually, and we quickly realized that people make a ton of mistakes doing that. It became obvious when we were working on assertion datasets ourselves.

It’s not a perfect system, and you have to consider how you use the data (aggregating for price indexing, for instance), but we’re actually doing better than what people achieve when they have to transcribe data for hours a day.
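One reason aggregate uses like price indexing tolerate imperfect extraction: robust statistics such as the median largely ignore occasional bad datapoints, while the mean does not. A toy illustration with invented numbers, not the actual contract data:

```python
from statistics import mean, median

# Suppose the true per-contract price is ~100, but two extraction
# errors (a dropped digit, an extra zero) slip into the dataset.
extracted = [100, 101, 99, 100, 1000, 98, 102, 100, 10, 101]

print(round(mean(extracted), 1))  # 181.1 -- badly skewed by two outliers
print(median(extracted))          # 100.0 -- unaffected
```

The same logic applies whether the errors come from an LLM or from a tired human transcriber.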

The voice agents in development right now feel 100x better than the chatbots companies currently have deployed.

I had the same opinion until a few months ago; now I would prefer the [redacted company so as to not give free marketing] AI agent. You’ll start seeing this wave in around 3-6 months, as most are in trials.


Just sounds like gassing because you are invested in it yourself.

Most support agents lack... well, agency. If you connect a chatbot to an FAQ, that's exactly what you get. Just another instance of enterprise software being badly designed, badly written etc. It doesn't mean that it's actually an impossible problem.

They won't ever give agents the ability to actually do things for customers that can impact the company in some kind of negative fashion. At least not willingly.

That's sort of the whole point of talking to customer service though. Getting something done that you want that involves them having to do work for you. AKA you taking value from the company.

So yeah they're basically always going to be useless garbage if put together according to business requirements.

Other services should just be automated already.


They'll do the same thing we do in software development - proper sandboxing, context curation, reviews on high impact actions. I presume real customer service is really expensive, as I've seen many companies prefer to just quickly refund, or drop you as a customer entirely, rather than fix your problem. It can't get much worse than that.

It's always different this time. It always will only take a couple more months or years. And then people move on to the next hype topic.

> It’s hard for humans to perceive the exponential, it will be slow then sudden.

True, but also there are perception biases that lead us to believe progress is exponential, even though it might as well be an S-curve.

I'm having a hard time finding the right terms, but I'm sure there is some bias to think that "the line goes up".


Well ain't this a chronological oddity. Always 6 months away!

I don't want Codex dammit! I'm a Claude Code man.


Wasn’t the point of openclaw to YOLO your credentials to the internet?

Only ever a creative prompt injection away from a leak.

Saw some smarter people using credential proxies, but no one acknowledges the very real risk that their “claws” will commit cybercrime on their behalf once breached.


Are we sure Claude Scale™ won’t appear next month? A specialist agent that turns your vibe coded mess into a production grade scaled solution on their infrastructure.

Expect Anthropic to want to capture more of the supply chain over time.


If they could they would, and if they can they will. Maybe it will appear next month, maybe five years from now; we don't know which is more likely. But I think that if agents could actually produce good, reliable software that can evolve over time, there's little they couldn't do, even beyond software. So it won't be (just) software developers being replaced, but also software users.

Yeah, which is why the solution has to be legislative. These companies are trying to take over the entire industry, and even if they won’t have as good a solution as someone who focuses on only one thing, they have the capital, distribution, and name recognition to kill any upstarts.

12-channel DDR5-5600 ECC is around 500 GB/s, which in the real world works very well for large MoE models.

You mean 500 GB/s, not Gb/s (actually 537 GB/s).
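That 537 GB/s figure is just the theoretical peak: channels × transfer rate × 8 bytes per 64-bit DDR5 channel. A quick sanity check:

```python
channels = 12
transfers_per_s = 5600e6   # DDR5-5600: 5600 MT/s
bytes_per_transfer = 8     # each channel is 64 bits wide

peak_gb_s = channels * transfers_per_s * bytes_per_transfer / 1e9
print(peak_gb_s)  # 537.6
```

Real-world sustained bandwidth will land below this peak, but it is the right ballpark for estimating MoE token throughput.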

Unfortunately that does not matter. Even on a cheap desktop motherboard, the system memory bandwidth is higher than that of 16-lane PCIe 5.0.

Therefore the memory bandwidth available to a discrete GPU is determined by its PCIe slot, not by the system memory.

If you install multiple GPUs, in many MBs that will halve the bandwidth of the PCIe slots, for an even lower memory throughput.


> in many MBs that will halve the bandwidth of the PCIe slots

Not on boards that have 12 channels of DDR5.

But yeah, squeezing an LLM from RAM through the PCIe bus is silly. I would expect it to be faster to just run a portion of the model on the CPU, llama.cpp-fashion.


It is much faster, yeah. llama.cpp supports swapping between system memory and GPU, but it’s recommended that you don’t use that feature, because it’s rarely the right call versus using the CPU to do inference on the model parts held in system memory.

Edit: the setting is "GGML_CUDA_ENABLE_UNIFIED_MEMORY=1"... useful if you have unified memory, very slow if you do not.


llama.cpp, AFAIK, does not run a portion of the model on the CPU. --cpu-moe just offloads weights to RAM, but they are still loaded onto the GPU for compute.

Talking about a dual-socket SP5 EPYC with 24 DIMM slots and 128 PCIe 5.0 lanes.

It’s fast for hybrid inference, if you get the KV and MoE layers tuned between the Blackwell card(s) and offloading.

We have a prototype unit and it’s very fast with large MoEs


Where’s the opt out ?

Hacker News is very upfront that they do not really care about deletion requests or anything of that sort, so the opt-out is to not use Hacker News.

Time to sue them to oblivion :D.

By posting comments on this site, you are relinquishing your right to that content. It belongs to YC and it is theirs to enforce, not yours. https://www.ycombinator.com/legal/

There is no such thing at https://qht.co/ when you create your account.

Max Schrems would like a word

Is this legal advice?

Do you want it to be? I think it's safe to assume that most comments are _not_ legal advice.

Create a new account every so often, don’t leave any identifying information, occasionally switch up the way you spell words (British/US English), and alternate using different slang words and shorthand.

And do what I do - paste everything into ChatGPT and have it rephrase it. Not because I need help writing, but because I’d rather not have my writing style used against me.

I can't stand this and will actively discriminate against comments I notice in that voice. Even this one has "Not because [..], but because [..]"

I get your sentiment, though I think it's likely that people, on average, are going to organically start writing more and more like LLMs.

It's already begun.

The good rephrasing will not include that voice.

This just gives OpenAI that data.

Perhaps you could use a local translation model to rephrase (such as TranslateGemma). If translating English to English doesn't achieve this effect then use an intermediate language, one the model is good at to not mangle meaning too much.


I run Qwen 3 locally, but I mention OpenAI on HN so people understand what I’m referring to.

Do the following: sample content from users on this page, https://qht.co/leaders, and ask the LLM to rephrase it in their voice.

I'm actually working on a browser extension to do just this, using adversarial stylometry techniques.

Look up "adversarial stylometry"
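For a sense of what adversarial stylometry is working against: attribution classifiers often lean on distributions of spelling variants and filler words, exactly the features the grandparent suggests varying. A toy fingerprint extractor (the marker set is invented for illustration; real systems use hundreds of features):

```python
import re
from collections import Counter

# Hypothetical markers: British/US spelling variants and slang.
MARKERS = {"colour", "color", "favourite", "favorite", "whilst", "while", "gonna"}

def style_fingerprint(text):
    """Count occurrences of a few spelling/slang markers in lowercased text."""
    words = re.findall(r"[a-z']+", text.lower())
    return dict(Counter(w for w in words if w in MARKERS))

print(style_fingerprint("My favourite colour, whilst I was gonna say grey."))
# {'favourite': 1, 'colour': 1, 'whilst': 1, 'gonna': 1}
```

Consistently alternating these markers, as suggested above, is precisely what degrades such a classifier.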

Funnily enough, if everyone did this (or at least made a new account often), it would be more destructive to what HN (purposefully) wants to do than deleting the occasional account's data.

The back button

Then one day forgetting to close the door of the crate…

But the dog is so used to the crate…
