I would counter that voice assistants are not a failed experiment. I would sugge...

rwc · on Jan 11, 2024

Things don't seem to be going well at Humane, either: https://www.fastcompany.com/91008117/ai-pin-humane-layoffs

chucksmash · on Jan 11, 2024

No horse in the race, but the layoff is only ten people from a 250-person company that is still hiring. I'm a little surprised it even merited an article.

modeless · on Jan 11, 2024

I agree, I just think these new projects based on LLMs should be considered a different category, AI agents or something. The traditional grammar-based voice assistant architecture used by Google Assistant, Siri, Alexa, Bixby, etc. is an abject failure, and not for lack of trying.

sokoloff · on Jan 11, 2024

If the Alexa/Echo-category of devices are meant to be some widely flexible and pervasively useful device from Star Trek, they're a failure.

If they're meant to be a usable $30 kitchen timer and music player, they're pretty great.

modeless · on Jan 11, 2024

Google Voice Search used the internal codename "Majel" in reference to Majel Barrett, who played the voice of the Star Trek computer. That was explicitly the ambition. It just didn't work out.

sokoloff · on Jan 11, 2024

OK. I believe you. They accidentally created something else that was very useful but different from what they set out to do.

Is that a story of failure or of success?

Starbucks launched to sell beans and espresso machines. YouTube launched as a video dating site. Are they also failures?

modeless · on Jan 11, 2024

I guess where we disagree is "very useful". If Google Assistant stopped working tomorrow I would hardly care at all. There are a couple of scenarios where it's slightly more convenient than using my phone (assuming I don't encounter one of its many failure modes) and that's about it. I'm sure the hands free aspect is important for certain people in certain situations but I think the vast majority of people just don't see a lot of value from it.

sokoloff · on Jan 11, 2024

Amazon alone has sold over half-a-billion Alexa-enabled units (around 10 of them to me).

I think people see more than $30 of value in them, at least as their revealed preferences suggest.

modeless · on Jan 11, 2024

Those sales were subsidized in expectation of future profitability that will never come (at least not without a ground-up redesign of the product around LLM-based AI agents or some other paradigm). Economically Alexa is a "colossal failure": https://arstechnica.com/gadgets/2022/11/amazon-alexa-is-a-co...

pests · on Jan 11, 2024

Yeah, my crock pot and ceiling fan are "Alexa-enabled". That's two.

usrusr · on Jan 11, 2024

But how often did that computer voice-acted by her really do significantly more than "set timer to thirty minutes"? Outside of some broken plots on the original series ("we have insufficient data to know the truth, let's ask the computer who will tell us anyways!"), it really was mostly mundane voice assistant stuff.

(I'm deliberately excluding the "ten words to 'author' a holodeck scene" part; that had always stretched my imagination a little too far, more "this can't work!" than space travel and transporter beams. Then Stable Diffusion happened.)

shiroiuma · on Jan 16, 2024

There were some scenes where Riker on the bridge asked the computer essentially a SQL query: "give me a list of star systems with parameters that fit X and cross-reference by Y..." "There are 3 systems which fit your query: ..."

ChildOfChaos · on Jan 11, 2024

That might have been the initial hope of the team, before Google killed it. It's been in the graveyard for years with zero updates; my Google Assistant Nest Mini is arguably worse than when I bought it.

I believe it would have been possible to make it better, but they didn't try. They just gave up, as with many Google products.

fshbbdssbbgdd · on Jan 11, 2024

I think way too much money has been sunk into those projects if their ambition is just to be a $30 kitchen timer and music player.

mlyle · on Jan 11, 2024

Yup, as far as something that can turn on a light, run a timer, convert some units, tell me the weather, and have an -okay- shot at some categories of random questions instead of me getting out my computer-- Google Assistant is just fine.

pseudosavant · on Jan 11, 2024

They do succeed in that way many times a day, every day, in my house.

a_gnostic · on Jan 11, 2024

As '80s telescreens they were fantastic.

fshbbdssbbgdd · on Jan 11, 2024

I think many parts of the architecture can be reused - in Alexa terms, all of the “skills” that integrate the assistant with various other services. IMO one of the main problems with assistants is that I don’t know what skills are available or how to invoke them. It’s like I’m a wizard who has to memorize all the spells I could be casting. It never happens because I don’t care enough. I think LLMs could potentially help by making it easier to discover and invoke those skills.

7thaccount · on Jan 11, 2024

The "spells" analogy is such a great way to explain how it feels to me to use these assistants. I'll play with one if I'm at a friend's house, but honestly can't see the appeal. Telling Google to change the color of the lighting or brightness just seems like something that is mostly a gimmick, unless you're maybe disabled, and then it may be a big quality of life improvement. The other stuff doubly so.

With ChatGPT I can see the appeal for certain tasks like having it create a custom text adventure for you, but I can't see it being too useful in my day to day life yet.

modeless · on Jan 11, 2024

"Skills" will be obsolete very soon. AI agents will use the same software tools and services that humans do. They won't need special separate AI-only interfaces.

I'm not excited about the Rabbit R1 as a hardware device but their software vision is exactly right and there are new startups coming out of stealth seemingly every day now attacking this problem.

vineyardmike · on Jan 11, 2024

Skills are just APIs that conform to a similar look. We'll definitely continue to have AI-only or developed-for-AI APIs for future "agents" to act against. They probably won't spend much effort formatting text to sound good to a person, but the infrastructure is here.
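To make "APIs that conform to a similar look" concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `Skill`, `SkillRegistry`, and the two example skills are hypothetical names, not Alexa's or anyone else's actual API. The point is only that each integration exposes the same shape (a name, a description, a handler), so a caller can list what exists and invoke it without memorizing anything.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Skill:
    """One integration exposed through a uniform shape (hypothetical)."""
    name: str
    description: str  # short text a caller or agent can read to choose a skill
    handler: Callable[[Dict[str, str]], str]


class SkillRegistry:
    """Holds skills by name and offers discovery plus invocation (hypothetical)."""

    def __init__(self) -> None:
        self._skills: Dict[str, Skill] = {}

    def register(self, skill: Skill) -> None:
        self._skills[skill.name] = skill

    def describe(self) -> Dict[str, str]:
        # Discovery: map each skill name to its description.
        return {s.name: s.description for s in self._skills.values()}

    def invoke(self, name: str, args: Dict[str, str]) -> str:
        # Invocation: every skill is called the same way, with a dict of arguments.
        if name not in self._skills:
            raise KeyError(f"unknown skill: {name}")
        return self._skills[name].handler(args)


# Two made-up skills, registered through the same interface.
registry = SkillRegistry()
registry.register(Skill(
    "set_timer",
    "Start a countdown timer; args: minutes",
    lambda a: f"timer set for {a['minutes']} minutes",
))
registry.register(Skill(
    "lights",
    "Turn a light on or off; args: room, state",
    lambda a: f"{a['room']} light {a['state']}",
))
```

Whether the caller is a grammar-based assistant or an LLM-based agent, it could read `registry.describe()` to see what is available and then call `registry.invoke("set_timer", {"minutes": "30"})`.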

modeless · on Jan 11, 2024

I disagree. These special APIs will not have the breadth of capabilities that the human UI does, so AIs will use the human UIs out of necessity. But I think in the long term we will eventually see a simplification of UIs. As it becomes less common for humans to actually use them, they will no longer need fancy animations or dark mode or client-side validation or pretty styling. In the extreme, a return to plain HTML forms that a human can use in a pinch but are mostly used by AI agents. At that point I guess you're blurring the lines between UI and API.

usrusr · on Jan 11, 2024

Isn't it the exact opposite? Interfaces we use every day can be dead simple, all they need is that they don't change behind our back. The accelerator pedal does not come with a footover pop-up "keep pressed to make car go". Interfaces we use once in a leap year on the other hand, that's where we need all the hand-holding we can get.

ramraj07 · on Jan 11, 2024

Not sure if any comment that holds Humane in high regard can be taken seriously though.

aspectmin · on Jan 11, 2024

It’s interesting. I had not heard the latest. Their initial videos looked promising.

I ordered a Rabbit R1. I can see it’s missing some key functionality (Bluetooth for headphones so everyone doesn’t have to hear?) But… I think it’s an example of a promising first step toward the era of real voice assistants/agents.

CWIZO · on Jan 11, 2024

> Bluetooth for headphones so everyone doesn’t have to hear?

This is a critical missing feature. I would never ever use this unless it's in my ear alone.

And I hope it doesn't become socially acceptable to carry these around, forcing everyone else to listen to whatever the device has to tell the user. There is already far too much noise, and there are far too many inconsiderate people, in this world.

usrusr · on Jan 11, 2024

But how is the input side sufficiently compatible with any state other than being alone? Voice surely does not qualify?

Even gesture control would be a tough sell, and even that only if it's not "AR, where you push buttons projected across your field of vision" but "AR, where you can do the equivalent of gamepad buttons with hand movement anywhere the device can see the hand". Voice output (earbuds) would be far too slow for that kind of interaction, because you can't skim a list. Compared to the strictly sequential nature of audio, screens are the equivalent of embarrassingly parallel.

By the way, that slowness of voice output vs screen is also what I consider the true motivation companies had for creating those essentially free voice assistants: searching for a product or service on a screen, even if it's just a small handheld screen, makes you pick from a list. With voice, on the other hand, going through the list is so slow and cumbersome that the chances of just picking the first result, "I'm feeling lucky", are much, much bigger. The value of placement (bought directly or bought indirectly, "this must be very relevant because we know how much they can spend on our other ad services") is just so much bigger with voice. Chances are people are less likely to listen to the second hit on voice than to go to the second page on screen.