There is an unfortunate incentive created when a "business" (MiTM) depends on "bot traffic", i.e., on the continued nuisance of bot traffic, to make money
If the "bot traffic" declines, then the "bot protection business" goes down with it
Cloudflare communications are sometimes careful to refer to traffic _labeled as_ bot traffic versus actual bot traffic
Because the "business" relies on the existence of "bot traffic", there's an incentive to broaden the scope of what is labeled as "bot traffic"
The false positive rate can be high. The public should see those statistics, but in truth it may be infeasible to compile them when there's no verification and the entire system relies on heuristics
"Bot protection" can be used to gather fingerprints for marketing
It can be used to force users to use certain software, e.g., certain browsers, and to enable JavaScript, subjecting users to data collection, surveillance and ads
Originally the motivation for avoiding "bot traffic" was based on behaviour, e.g., exceeding acceptable rates of usage by making too many requests in a given time period
Now it's possible to exclude traffic based on criteria such as what browser someone is using. NB. This is more than the "user-agent string". The company forces people to sign NDAs before telling them what it is doing to fingerprint www users
If residential proxies are the problem then why not go after the companies that provide them?
The truth is that those companies are not the problem. Their customers are so-called "tech" companies
Perhaps it's these so-called "tech" companies that are the problem
Certainly the problem is not the individual www user who doesn't use an "approved" graphical, JavaScript-enabled browser and who gets blocked or fingerprinted trying to make a single request
But that's who suffers from "bot protection" so that so-called "tech" companies can profit from data collection, surveillance and ads
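For reference, the behaviour-based criterion described above (too many requests in a given time period) needs nothing about the client beyond some identifier and a clock. A minimal sketch, assuming a per-client sliding window; the class name, the `client_id` key and the limits are all hypothetical, not any vendor's actual implementation:

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within any window_seconds span."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # client id -> timestamps of requests that were accepted
        self._hits = defaultdict(deque)

    def allow(self, client_id, now=None):
        # `now` can be injected for testing; otherwise use a monotonic clock.
        now = time.monotonic() if now is None else now
        hits = self._hits[client_id]
        # Forget accepted requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: reject based on behaviour alone
        hits.append(now)
        return True
```

Under a scheme like this, a user making a single request is always let through, whatever software they use.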
> Originally "bot traffic" was based on behaviour, e.g., exceeding acceptable rates of usage, making too many requests in a given time period, exceeding rate limits
> Now it's possible to exclude traffic based on criteria such as what browser someone is using
I'm pretty sure user-agent-based bot detection predates every request-rate-based method by quite a few years.
>It can be used to force users to use certain software, e.g., certain browsers, and to enable JavaScript, subjecting users to data collection, surveillance and ads
>Certainly the problem is not the individual www user who doesn't use an "approved" graphical, JavaScript-enabled browser and who gets blocked or fingerprinted trying to make a single request
The alternatives to JavaScript fingerprinting are either ineffective (TLS fingerprinting and/or IP rate limits), or even worse for privacy (e.g. attestation).
>If residential proxies are the problem then why not go after the companies that provide them
> The alternatives to javascript fingerprinting are either ineffective (TLS fingerprinting and/or IP rate limits), or even worse for privacy (eg. attestation).
JavaScript fingerprinting is itself ineffective; these kinds of checks only stop the most basic bots, and I'd argue the same for attestation.
It's ineffective only in the sense that, in the worst case, bots can buy used iPads or whatever and use a robot arm + camera to do the scraping, but each incremental step increases the cost for scrapers. TLS fingerprinting means you can't use curl/requests and call it a day. JavaScript makes it even more complicated by requiring a headful browser to solve challenges. The purpose is to increase the cost, not to eliminate all bots.
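To make the TLS fingerprinting point concrete, here is a rough sketch in the style of JA3, one publicly documented scheme (the thread does not say which scheme any particular vendor uses, so treat that choice as an assumption). It hashes the parameters a client offers in its TLS ClientHello; the function name and the numeric values below are made up for illustration:

```python
import hashlib


def ja3_style_fingerprint(tls_version, ciphers, extensions, curves, point_formats):
    """Hash ClientHello parameters into a JA3-style fingerprint.

    Each list is joined with "-", the five fields are joined with ",",
    and the resulting string is hashed with MD5.
    """
    fields = [
        str(tls_version),
        "-".join(str(c) for c in ciphers),
        "-".join(str(e) for e in extensions),
        "-".join(str(g) for g in curves),
        "-".join(str(p) for p in point_formats),
    ]
    return hashlib.md5(",".join(fields).encode("ascii")).hexdigest()


# Hypothetical ClientHello values, not captured from any real client.
client_a = ja3_style_fingerprint(771, [4865, 4866, 4867], [0, 10, 11, 13], [29, 23, 24], [0])
# Same cipher suites offered in a different order: a different fingerprint.
client_b = ja3_style_fingerprint(771, [4867, 4866, 4865], [0, 10, 11, 13], [29, 23, 24], [0])
```

Because the offered parameters and their order come from the TLS library rather than from anything the script author sets per request, a stock HTTP client tends to hash differently from a browser even when it sends a browser's user-agent string. Matching the browser means swapping or patching the TLS stack, which is the added cost being described.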