Terrified to consider what happens when these scammers get hold of large language models here in a year or so. Rather than fading into the background as this article posits, I expect people to have models finetuned on convincing them to make purchases/send money. Probably trained by being pitted against other models which have been trained on the mark's social media feeds. Train the scambot to perfectly push your buttons by having it practice against your own style of thought as embodied by your social corpus.
"Train the scambot to perfectly push your buttons by having it practice against your own style of thought as embodied by your social corpus."
Fortunately, that's not really what these language models can do. They can easily be trained to mimic you, or to mimic what normal people reply to you with. But there's no way to train the transformer-based high-probability-next-word AIs to be superhumanly good at fooling you into doing something: there's no training data for it, and they probably can't represent such a complex objective in their internal representation. And the humans doing this stuff are having enough success that they probably have no desire to go chasing the super hard targets, the people with the wherewithal and motivation to chase them down and sue them (or... you know... worse, legal systems aren't a bound on everyone), even potentially across international lines.
You'll know when AI does get to that point, because suddenly the internet will be an amazingly interesting place with all sorts of amazingly good arguments you can hardly resist. I imagine few of us experience that sort of internet. (If you do, uh, watch out.)
You don’t need to automate the whole process, just use language models to establish rapport for a few weeks and have humans pick up the gullible ones at the bottom of the funnel.
For sure. Or even for a few days to start. It's basically the same playbook as Waymo: get computers to do more and more of the boring parts, have human operators take over when necessary, and use the additional data generated to improve the system.
> But there's no way to train the transformer-based high-probability-next-word AIs to be superhumanly good at fooling you into doing something, on the grounds of lack of training data
The conversations of all those human scammers would be perfect training data for this. You even know exactly which conversations led to payouts. Assuming you can get all your data in one place, of course.
My context is someone who isn't already falling for the scams. It's true that you can train a model to follow through with the people who fall for the scams the scammers already run, which is a fair point. My point is that you're not going to get a superhuman AI out of our current transformer technology that can talk you into believing you're a superintelligent camel from Arcturus IV and that if you don't immediately turn over your credit card number, the Star Alligator of the Galactic Core is going to eat your homeworld.
GPT-3 may even gamely try to do exactly that with the correct prompt! But it'll fail. The result won't be cognitively dangerous to anyone with a grip on reality, it'll be risible.
> You'll know when AI does get to that point, because suddenly the internet will be an amazingly interesting place with all sorts of amazingly good arguments you can't hardly resist. I imagine few of us experience that sort of internet. (If you do, uh, watch out.)
I've long assumed they do the exact opposite - try to filter out people who likely see through the game so they don't waste their time mining a hill with no gold.
And they do this by intentionally making basic mistakes and other easy-to-spot errors, so the clever people see themselves out; by the time the funnel reaches an actual human scammer, what's left is a highly probable sucker.
Exactly! Even Microsoft had a paper on this: ‘Why do Nigerian Scammers Say They are from Nigeria?’ [1].
‘By sending an email that repels all but the most gullible the scammer gets the most promising marks to self-select, and tilts the true to false positive ratio in his favor.’
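The paper's self-selection argument is easy to sketch numerically. A toy expected-value model, with entirely made-up rates (only the mechanism comes from the paper: an obvious tell lowers total replies but raises the fraction of repliers who will actually pay):

```python
# Toy model of scam-email self-selection. All numbers are illustrative.
# Assumption: every reply costs the scammer follow-up time, but only
# "gullible" repliers ever pay out.

def expected_profit(n_sent, reply_rate, gullible_fraction, payout, cost_per_reply):
    """Profit = payouts from gullible repliers minus time spent on all repliers."""
    replies = n_sent * reply_rate
    payouts = replies * gullible_fraction * payout
    followup_cost = replies * cost_per_reply
    return payouts - followup_cost

# Plausible pitch: many replies, but most repliers bail before paying.
plausible = expected_profit(1_000_000, 0.01, 0.001, 500, 5)   # -45000.0

# Obvious "Nigerian prince" pitch: far fewer replies, mostly gullible ones.
obvious = expected_profit(1_000_000, 0.001, 0.02, 500, 5)     # 5000.0

print(plausible, obvious)
```

Under these assumed numbers the cruder pitch wins precisely because follow-up time is costly, which is the paper's point (and note it flips if `cost_per_reply` drops toward zero, as the comment below about cheap labor argues).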
A theory which would be more convincing except that [i] saying they're from Nigeria also trips spam filters, which screen out the gullible along with everyone else, and yet the scripts haven't changed even now that filters stop the majority of gullible people from ever replying, and [ii] the more straightforward explanation is that they say they're from Nigeria because their ultimate objective is getting you to send money to Nigeria...
Ultimately, if you're in the business of spamming people on the other side of the world in the hope that 0.001% of them will ultimately send a money transfer worth a month's wages in local currency, your time probably isn't so valuable that you can't afford to deal with everyone who replies.
I just looked through some of the most recent spam messages I’ve received under Fastmail’s SpamAssassin setup, with a per-user Bayes filter (I’ve easily had enough spam to train it), Vade checking, and various block lists (some pertaining to the sender, some to other properties).
It’s a surprisingly even mixture of content, sender-related metadata, other message-related metadata, and unattributable (e.g. SH_HBL_EMAILS, ME_VADESCAM). Most of the time, any two of those four would be enough to reach the spam threshold of 5. Regularly, any one of at least three of them.
I should note that what I’m calling “sender-related metadata” is not penalising unknowns: it’s only penalising known-bads. Thus, it’s not really about sender reputation as a whole, but rather established bad sender reputation. The only form of penalising of unknowns that I’m aware of with Fastmail is when the sender is on a domain name registered within, I think, the last 72 hours.
When it comes to the more tailored things (oh, you somehow managed to spend two hours looking at my site, particularly liking my Rust FizzBuzz article, and wonder if I wouldn’t mind sharing a link to your Python guide, and you keep pestering me?), it’s only content, with everything else neutral. (In the specific example I cited there, the first message got BAYES_00, the second got BAYES_50, and the third BAYES_99 + BAYES_999 perhaps due to me manually marking the previous ones as spam but probably also from introducing the term “guest post” which I imagine my Bayes filter regards dimly.)
(I like the fact that I can inspect Fastmail’s spam filtering to quite some degree, and you can talk to their support about it as well and get more detail when desired. The big ones like Gmail are just completely opaque, with people poking and prodding at the edges to try to understand its caprice. Disclosure: I worked for Fastmail for a few years.)
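For anyone unfamiliar with how those rule hits combine: SpamAssassin scoring is additive, with each fired rule contributing a weight and the total compared against a threshold. A minimal sketch, where the rule names and weights are illustrative rather than Fastmail's actual configuration (only the threshold of 5 comes from the comment above):

```python
# Minimal sketch of SpamAssassin-style additive scoring.
# Rule weights below are assumptions for illustration, not real config.

RULE_SCORES = {
    "BAYES_99": 3.5,          # Bayes filter: ~99% spam probability
    "BAYES_999": 0.2,         # small extra bump past 99.9%
    "KNOWN_BAD_SENDER": 2.5,  # hypothetical sender-blocklist hit
    "NEW_DOMAIN": 1.5,        # hypothetical recently-registered-domain rule
}

SPAM_THRESHOLD = 5.0

def score(fired_rules):
    """Sum the weights of every rule that fired on a message."""
    return sum(RULE_SCORES.get(r, 0.0) for r in fired_rules)

def is_spam(fired_rules):
    return score(fired_rules) >= SPAM_THRESHOLD

# Any two of the heavier signals clear the threshold, matching the
# "any two of those four would be enough" observation:
print(is_spam(["BAYES_99", "KNOWN_BAD_SENDER"]))  # True  (6.0 >= 5.0)
print(is_spam(["NEW_DOMAIN"]))                    # False (1.5)
```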
That's fine, the point is to assign a cost and a liability paper trail to people who send, say, 100,000 emails.
It doesn't affect the average user, and it presents only nominal hoops for high-volume senders to jump through while erecting substantial barriers to criminals.
The various companies that spam me to offer their services, and then correctly remove me from their mailing list, still had no right to put me on their mailing list in the first place.
Per the article, the scammers pay a minimum of $8,000/person, plus the cost to feed, imprison, etc. Pretty sure a model that only requires electricity and GPUs will work out less expensive than that, especially when you consider that GPT-N (YaLM-1T?) can run as many scams concurrently as you have GPUs to run inference on, increasing your possible take, and won't have to sleep.
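A back-of-envelope version of that comparison, where only the $8,000-per-person figure comes from the article; the upkeep, GPU pricing, and concurrency numbers are guesses for illustration:

```python
# Back-of-envelope per-scam cost comparison. Only PERSON_UPFRONT is from
# the article; everything else is an assumed, illustrative number.

PERSON_UPFRONT = 8_000        # per the article, paid per person
PERSON_UPKEEP_MONTHLY = 300   # assumed food/housing/guards per month
SCAMS_PER_PERSON = 1          # a human works one conversation at a time

GPU_MONTHLY = 1_200           # assumed rental rate for one inference GPU
SCAMS_PER_GPU = 20            # assumed concurrent conversations per GPU

def monthly_cost_per_scam_human(months_amortized=12):
    """Amortize the upfront payment, add upkeep, divide by concurrency."""
    return (PERSON_UPFRONT / months_amortized + PERSON_UPKEEP_MONTHLY) / SCAMS_PER_PERSON

def monthly_cost_per_scam_gpu():
    return GPU_MONTHLY / SCAMS_PER_GPU

print(round(monthly_cost_per_scam_human(), 2))  # 966.67
print(monthly_cost_per_scam_gpu())              # 60.0
```

Even if the GPU numbers are off by an order of magnitude, the concurrency term does most of the work, which is the point being made above.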
I think we can probably rule out OpenAI and equivalent cloud services allowing people to use their APIs to run phone scams. It's even worse PR than bots saying racist things.
And if they need to train their own model, you can get a lot of slaves and poor wannabes for the price of one competent NLP engineer, and the slaves and poor wannabes are less likely to decide they're the brains of the outfit and cut you out of the loop.
I am morbidly curious what the locations, salaries and working conditions are like. Because obviously they have to recruit people who have some basic level of English language literacy, so that commands a bit of a wage premium (even in India or Bangladesh) over truly unskilled labor.
The article goes into this a little. Obviously there are different groups doing this with different setups, but in some documented examples it's basically slavery.
Yeah. I suspect we're only months, or a few years at most, from automation that's as good or better than these human slaves. Then they can A/B test their way to increasing effectiveness.
An interesting twist will be to pull the voices of your friends off social media videos and impersonate them to you.