I'm pretty sure there are folks involved in drug testing for many sports, so saying they are doing nothing seems hyperbolic. Are there specific things you think the bodies in charge of drug testing should be doing but aren't? Genuinely curious.
Not sure how this helps. Olympic events already have relative rating systems that rank all the participants: pretty complicated, sport-dependent systems that determine qualification for the games and competition among all the competitors at the games. The problem is how to have separate competitions for different groups of participants when there isn't a universally shared agreement on who should be in which group.
If you have a relative skill rating system, then there's no need to split competitors into groups. But if you insist, then you can split them based on skill ratings (define a rating range for beginner, intermediate, advanced, etc.). And for games with one-on-one matchups, sampling from a Gaussian centered on each player's skill rating is good enough.
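A minimal sketch of both ideas (the bracket labels, cutoffs, and sigma are arbitrary illustrative choices, not from any real rating system):

```python
import random

# Hypothetical bracket cutoffs; real systems would tune these.
BRACKETS = [("beginner", 0, 1200), ("intermediate", 1200, 2000), ("advanced", 2000, 4000)]

def bracket(rating):
    """Return the label of the bracket whose range contains the rating."""
    for name, lo, hi in BRACKETS:
        if lo <= rating < hi:
            return name
    raise ValueError(f"rating {rating} out of range")

def sample_opponent(player_rating, pool, sigma=100, rng=random):
    """Match a player by sampling a target rating from a Gaussian centered
    on their own rating, then picking the pool member closest to it."""
    target = rng.gauss(player_rating, sigma)
    return min(pool, key=lambda r: abs(r - target))
```

With `sigma=0` this degenerates to always picking the closest-rated opponent; a larger sigma occasionally produces mismatched pairings, which is what keeps ratings calibrated across skill levels.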
It doesn't. In tennis, a 14 UTR whatever wins against a 13 UTR whatever. UTR is your effectiveness rating against every other player. Same in chess with Elo.
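The "effectiveness rating" framing corresponds to the standard Elo expected-score formula, under which a 400-point gap means roughly 10:1 odds. (UTR's internal model isn't public, so chess Elo is used here as the illustration.)

```python
def elo_expected_score(r_a, r_b):
    """Standard Elo expected score for player A against player B.
    Equal ratings give 0.5; a +400 advantage gives about 0.91."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))
```

This is why a one-point UTR gap (or a few hundred Elo points) translates to near-certain wins rather than a modest edge: the mapping from rating difference to win probability is logistic, not linear.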
The issue is women would disappear from professional sports. Sinner's 16.27 rating means that he double bagels Sabalenka's 13.29 essentially 100% of the time. The 500th ATP player has a UTR of 13.81, and half a point is quite a big gap, so he's still very much stronger than Sabalenka. You probably have to start looking well into the thousands for someone who is consistently beaten by her.
Only the top 200 players make money, the top 100 good money, and the top 50 ridiculous money.
So women would not be in something like the top 2000 of tennis players, or worse. Which would basically remove any incentive for women to participate in pro tennis at all.
I don't get how you can compare Sinner's UTR against Sabalenka's when they're based on two disparate groups' scores. Doesn't there need to be at least a modicum of cross-pollination to make a meaningful comparison?
There is some cross pollination. Women can play vs men, just usually don't. I'm fairly certain singles UTR is universal across players, it only distinguishes between doubles and singles UTR.
UTR can also include unranked games if one of the players submits a score and the other approves it.
Basically proving my point. Very few women in top chess. Currently there are 0 women in top 100 chess players. Only 3 women were ever in the top 100 chess players. And chess is not even a game where men have a natural advantage like in almost all of the physical sports.
I don't deny that there are very few women in top chess, but that wasn't your point. You said it would end up being all men at all the skill rating levels, which is not true. Take chess as an example: there are a lot more women at around 1500 elo than at 2500 elo. So if you host an intermediate-level tournament just for players around 1500 elo, plenty of women will participate.
The ratio of men to women at 1500 Elo in chess is worse than 90:1, so no, you host an intermediate-level tournament and it will be almost all men. Well, mostly boys, but that's current chess for you.
But it’s not just that. If there are no top women in any kind of leagues in chess, that will only further discourage women from participating competitively in chess in the first place.
Note that most competitive women chess players play in women’s only tournaments even though they can easily join open men’s tournaments as well. For various reasons, one being that these women’s only tournaments are where they have the best chance of winning or being in the top k for prizes.
The male-to-female ratio at 1500 elo is not 90:1, but more like 9:1. 10% is a visible minority.
But I see where our disagreement is. You think there ought to be more women in chess. I think different people can do different things, so women don't need to match men in every statistic and vice versa. If we open it up to universal participation and it turns out to be a male-dominated game, then let it be. I don't think there's anything wrong with that.
> I think different people can do different things, so women don't need to match men in every statistic and vice versa. If we open it up to universal participation and it turns out to be a male-dominated game, then let it be. I don't think there's anything wrong with that.
You don't have a say, though; others want to see women play chess against each other and will happily pay for and organize such events. Or do you want to make female-only events illegal? As long as they are legal, they will continue to be held.
Any tool that auto-updates carries the implication that behavior will change over time. And one criterion for being a skilled professional is having expert understanding of one's tools. That includes understanding the strengths and weaknesses of the tools (including variability of output) and making appropriate choices as a result. If you don't feel you can produce professional code with LLMs, then certainly you shouldn't use them. That doesn't mean others can't leverage LLMs as part of their process and produce professional results. Blindly accepting LLM output and vibe coding clearly doesn't consistently produce professional results. But that's different than saying professionals can't use LLMs in ways that are productive.
Even if one-shot LLM performance has plateaued (which I'm not convinced this data shows, given the omission of recent models that are widely claimed to be better), that misses the point that I see in my own work. The improved tooling and agent-based approaches that I'm using now make one-shot LLM performance only a small part of the puzzle in terms of how AI tools have accelerated the time from idea to decent code. For instance, the planning dialogs I now have with Claude are an important part of what's speeding things up for me. Also, the iterative use of AI to identify, track, and take care of small coding tasks (none of which are particularly challenging in terms of benchmarks) is simply more effective. Could this all have been done with the LLM engines of late 2024? Perhaps, but I think the fine-tuning (and conceivably the system prompts) that make the current LLMs more effective at agent-centered workflows (including tool use) are a big part of it. One-shot performance at challenging tasks is an interesting, certainly foundational, metric. But I don't think it captures the important advances I see in how LLMs have gotten better over the last year in ways that actually matter to me. I rarely have a well-defined programming challenge and the obligation to solve it in a single shot.
Seems like the ability to distinguish LLM versus 'good human' writing depends on the size of the writing sample you have to look at (assuming you think it can be done at all). And HN-scale posts are unlikely to be long enough for useful discernment.
Within a few years, LLMs will be indistinguishable from human text.
Think how easy it was to tell the differences a year or two ago. By 2030 there will be no way to ever tell.
The same is true of all video, and all generated content. The death of the Internet comes not from spam, or Facebook nonsense, but from the fact that soon you'll never know if you're interacting with a human or not.
Why like a post? Reply to it? Interact online? Why read a "news" story?
If I was X or Meta or Reddit, I would be looking at the end.
Teslas have the wrong sensor gear, coupled with immense randomness. Pesky pedestrians. Waymo seems to be doing quite well in comparison. Regardless, a cat isn't a dog, and real-world navigation isn't posting on Facebook.
It would be better to make a direct point, such as "It will never be flawless". That's not really a problem here; it only need be flawless most of the time.
My point was more just that assigning a year to "no way to ever tell" seems as fraught as assigning a year to virtually any technological achievement we haven't seen yet. :) My strong suspicion is that by 2030, LLMs will be everywhere in a real sense, but the output quality won't be materially better than we have now -- the LLMs will simply be much more efficient and less resource-intensive (and, perhaps, the training corpuses in common use will be less full of legal minefields than the current batch). I could absolutely be wrong, but I don't think so.
LLMs won’t destroy social media any more than it already is.
I don’t think I have ever had a meaningful human interaction with anyone on Twitter, Meta, or Reddit without already knowing them from somewhere else. Those sites are about interacting with information, not people. It’s purely transactional. Bots, spam, and bad actors are not new.
Meta has been a dumpster fire of spam and bots for over 15 years, the overwhelming majority of its existence.
Reddit has some pockets of meaningful interaction but you have to find them and the partitioned nature means that culture doesn’t spread across the site. It’s also full of bots and shills.
Nobody tells stories about meeting people on Twitter. At best it’s a microblog platform and at worst it’s X.
Ordinary people go to such sites for updates from friends, or to follow celebrities.
Their friends will start using more and more AI, and celebrities will become all AI.
Why read a friend's page if it's just AI drivel? The same goes for a celebrity.
It doesn't even need to be true. Burned once, people will never trust again. The humiliation of writing messages that your friend only has a bot summarize, and reply to, will kill it.
Imagine you speak to your friend, and they haven't even read any messages you wrote, but their AI responded? And you in turn. Imagine you've had dozens of conversations, but it was with a bot instead of your friend.
Your trust will be eroded.
Spam doesn't act like your friend. A bot does.
And the inability to distinguish will be the clincher. And yes, you won't know the difference, not after the AI is trained on their sent mail folder.
Please expand on the idea that LLMs are not trained on English to begin with. Not sure what you mean by this, as clearly many LLMs are trained on data that contains a lot of English. For instance, GPT-1 seems to have been trained on a purely English corpus.
And there's 'nothing wrong' with just writing code with variables named 'a1, a2, a3'. But when some poor sod has to dig through your mess to figure out what you had in mind it turns out that having an easier to discern logical structure to your code (or html) makes it better. I've dug through a lot of html. And there's a ton of ugly code smell out there. Layers and layers of "I don't really know what I'm doing but I guess it looks okay and I'll make it make sense later". I'm sure it pays the bills for someone. But it makes me sad.
Seems like if evolution managed to create intelligence from slime I wouldn't bet on there being some fundamental limit that prevents us from making something smarter than us.