I actually didn't mean to criticize Rapidata. I just think that a forced-choice question like this begs for low-effort answers. At least the respondents should have had the opportunity to explain their reasoning, like the LLMs did.
All good ^^, it's a fair point. We have come up with some fun ways to track people's reliability over time. The validation sets do contain plenty of forced-choice questions, though: those that have an empirical truth can be used directly to calculate a reliability score, while those that are subjective need to be re-asked after some time to check consistency. People who don't pass the thresholds would not be part of the 10'000 here.
But of course, if every human were told to take 3 minutes to think deeply about it and told that it's a trick question, they would most likely all get it right. It's the same with the LLMs: if you ask them like that, they will get it right most of the time. The low effort is kinda the point here.
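To make the reliability idea above concrete, here is a rough sketch of how the two checks could be combined. This is not Rapidata's actual implementation: the `Answer` type, the function names, and the threshold values (`min_acc`, `min_cons`) are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Answer:
    question_id: str
    choice: str


def accuracy(answers, truth):
    """Share of answers that match a known ground truth (objective questions)."""
    scored = [a for a in answers if a.question_id in truth]
    if not scored:
        return None  # respondent saw no validation questions with a known answer
    return sum(a.choice == truth[a.question_id] for a in scored) / len(scored)


def consistency(first, repeat):
    """Share of re-asked subjective questions answered the same way both times."""
    first_by_id = {a.question_id: a.choice for a in first}
    pairs = [
        (first_by_id[a.question_id], a.choice)
        for a in repeat
        if a.question_id in first_by_id
    ]
    if not pairs:
        return None  # nothing was re-asked yet
    return sum(x == y for x, y in pairs) / len(pairs)


def is_reliable(acc, cons, min_acc=0.8, min_cons=0.7):
    """Hypothetical thresholds; a missing score counts as not passing."""
    if acc is None or cons is None:
        return False
    return acc >= min_acc and cons >= min_cons
```

A respondent would only be included in the final sample if `is_reliable(...)` returns `True`; where exactly the cutoffs sit is a judgment call that this sketch does not try to answer.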
Gemini created a spontaneous benchmark ("explain color to a gravitational wave entity"), then tried to hijack the game by faking a voting phase. Models complied publicly but voted differently in private: https://oddbit.ai/peer-arena/games/699d03ab-b3c2-4d7e-b993-7...
The meta-discussion about how to discuss is part of what makes it interesting imo.
> And as macabre as it is, suicides are objective facts mostly unaffected by methodology, and unaffected by translation issues, cultural differences, etc.
I wouldn't be surprised if cultural differences are actually the largest factor that explains a country's suicide rate. Not easy to prove, of course, but I would be very careful drawing any conclusions from differences in suicide rates between countries with vastly different cultures.
I think you can also expect large differences in how countries report their suicide rates.
As I understand it, size is one of the key indicators of melanoma. But in some of these images, it’s difficult to tell whether the mole is 1 mm or 10 mm. I assume your image set doesn’t include size information. If you can find sources with rulers or some kind of scale, that would be very helpful.
FWIW @sungam - I'm one of the maintainers of the ISIC Archive, so feel free to let me know if finding/downloading data could be made easier. It's always interesting to see people using our data in the wild :)
> All European banks require you to have the app to be able to do anything with your account. This is more of a compliance/regulatory thing.
This is not true in Sweden. I use three different banks in Sweden, and they all offer equal or more functionality on their mobile websites.
This wasn’t always the case, though. In the early 2010s, I remember a bank blocking mobile user agents and referring to their app instead, due to “security”. I’m glad there has been some progress in the right direction since then.
In Sweden you have the option to capitalize software development costs, under some specific circumstances, but in general you would expense such costs immediately.
Some startups do it to window-dress their balance sheet, though. But making it compulsory is absurd.