It will be fascinating to see the facts of this case, but if it's proven that their algorithms are discriminatory, even by accident, I hope Workday is held accountable. Making sure your AI doesn't violate obvious discrimination laws should be basic engineering practice, and the courts should help remind people of that.
An AI class I took decades ago had just a one-day session on "AI ethics". Somehow, despite being short, it was memorable (or maybe because it was short...)
They said ethics demands that any AI that is going to pass judgment on humans must be able to explain its reasoning. An if-then rule that fired, or even a statistical correlation between A and B, would be fine. Fundamental fairness requires that if an automated system denies you a loan, a house, or a job, it be able to explain something you can challenge, fix, or at least understand.
LLMs may be able to provide that, but it would have to be carefully built into the system.
I'm sure you could get an LLM to create a plausible-sounding justification for every decision. It might not be related to the real reason, but coming up with text surely isn't the hard part there.
It seems solvable if you treat it as an architecture problem. I've been using LangGraph to force the model to extract and cite evidence before it runs any scoring logic. That creates an audit trail based on the flow rather than just opaque model outputs.
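A minimal sketch of that flow in plain Python (no LangGraph specifics; the function names and the evidence shape are made up for illustration): the scoring step is structurally unable to run unless a prior extraction step produced cited evidence, so every score carries its trail.

```python
# Plain-Python sketch of an "evidence before scoring" pipeline; in a real
# setup extract_evidence would be an LLM node in a graph framework.

def extract_evidence(doc: str) -> list[dict]:
    # Stand-in for the extraction step: return quoted snippets tagged with
    # a (hypothetical) source so later decisions can cite them.
    return [{"quote": s.strip(), "source": "resume"}
            for s in doc.split(".") if s.strip()]

def score(evidence: list[dict]) -> dict:
    # Scoring refuses to run without evidence, which is what makes the
    # audit trail trustworthy: it reflects the flow, not post-hoc text.
    if not evidence:
        raise ValueError("refusing to score without cited evidence")
    return {"score": len(evidence), "audit_trail": evidence}

result = score(extract_evidence("Shipped two projects. Led a team."))
print(result["score"], [e["quote"] for e in result["audit_trail"]])
```

The point of the structure is that the justification is gathered before the decision, not generated after it.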
It's not. If you look at chain-of-thought output long enough, you'll see instances where what the model delivers directly contradicts its "thoughts."
If your AI is *ist in effect but told not to be, it will just manifest as highlighting negative things more often for the people it has bad vibes for. Just like people will do.
Yes, they will; they'll rationalize whatever. This is most obvious with transcript editing, where you make the LLM 'say' things it wouldn't say and then ask it why.
I believe the point is that it's much easier to create a plausible justification than an accurate justification. So simply requiring that the system produce some kind of explanation doesn't help, unless there are rigorous controls to make sure it's accurate.
> Fundamental fairness requires that if an automated system denies you a loan, a house, or a job, it be able to explain something you can challenge, fix, or at least understand.
That could get interesting, as most companies will not provide feedback if you are denied employment.
Fair point. Maybe the requirement should be that the automated system provide an explanation that some human could review for fairness and correctness. While who receives the explanation may be a separate question, the drawback of LLMs judging people is that said explanation may not even exist.
The way I understand it, the law says decisions must be reviewed by a human (and, I'm guessing, should also be overridable), but this still leaves the question of how the review is done and what information the human has to make it.
I hate this. An explanation is only meaningful if it comes with accountability; knowing why I was denied does me no good if I have no avenue for effective recourse outside of a lawsuit.
No, I'm not super certain, but I believe most solvers are trained to be game-theory optimal (GTO), which means they assume every other player is also playing GTO. That means no strategy beats them in the long run, but they may not be playing the absolute best strategy against any particular opponent.
It doesn't seem to be the example here, but I know that X.transpose() does nothing if X has shape (3,) rather than (3,1), and (3,) is the common way vectors are represented in numpy. MATLAB forcing everything to be at least 2-D solves a bunch of these headaches.
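To make that concrete, this is standard numpy behavior:

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])    # shape (3,): a 1-D array
print(v.transpose().shape)       # (3,): transpose is a no-op on 1-D arrays

col = v.reshape(3, 1)            # an explicit (3, 1) column vector
print(col.transpose().shape)     # (1, 3): now it behaves as expected

# np.atleast_2d mimics MATLAB's "everything is at least 2-D" convention
print(np.atleast_2d(v).shape)    # (1, 3)
```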
Interesting... I wrote a similar post about MATLAB's syntax a while ago, and I still think MATLAB is one of the best calculators on the market.
RunMat is an interesting idea, but a lot of MATLAB's utility comes from the toolboxes, and unless RunMat supports every single toolbox I need, I'm going to be reaching for that expensive MATLAB license over and over again.
Yep! Makes sense. Though I think the cost of writing these toolboxes is trending toward zero.
It will have a really solid Rust-inspired package manager soon, and a single macro to expose a Rust function in the RunMat script's namespace (= easy to bring any part of the Rust ecosystem to RunMat).
I wouldn't be so sure that writing those toolboxes is cheap. You need an aerospace engineer to write the aero toolbox, or you are going to miss subtleties. I assume you need a biologist for the biology toolboxes. All of these domain experts are really expensive, and I would not trust a toolbox that hadn't been reviewed by them.
Even then... the reason we use the aero toolbox is because everybody in the aero industry trusts that MATLAB's results are accurate. I don't need to prove that the ECEF<->Keplerian conversions are correct, I can just show that I'm using the toolbox function and people assume it's correct. The aero toolbox is trusted.
When I've had to write similar code in Python, it's a massive pain to "prove" that my conversion code is correct. Often I've resorted to using MATLAB's trusted functions to generate "truth" data and then feeding it to Python to verify it gets the same results.
Obviously this is more work than just using the premade stuff that comes with the toolbox.
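The pattern looks something like this (toy conversion and made-up truth rows, just to show the shape of it; real truth data would be exported from the MATLAB toolbox function):

```python
import numpy as np

def spherical_to_cartesian(r, lat_deg, lon_deg):
    # Toy stand-in for a real conversion; the actual ECEF<->Keplerian
    # math would live here.
    lat, lon = np.deg2rad(lat_deg), np.deg2rad(lon_deg)
    return np.array([r * np.cos(lat) * np.cos(lon),
                     r * np.cos(lat) * np.sin(lon),
                     r * np.sin(lat)])

# "Truth" rows as they might be exported from MATLAB: (r, lat, lon, x, y, z).
# These particular values are made up for illustration.
truth = [
    (1.0,  0.0, 0.0, 1.0, 0.0, 0.0),
    (2.0, 90.0, 0.0, 0.0, 0.0, 2.0),
]

for r, lat, lon, *expected in truth:
    np.testing.assert_allclose(spherical_to_cartesian(r, lat, lon),
                               expected, atol=1e-9)
print("Python conversion matches truth data")
```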
Any MATLAB alternative faces the same trust issue. Until it reaches enough mindshare that people assume it's too popular to have incorrect math (which might not be a good assumption, but it is one people make about MATLAB), it doesn't actually replicate the main benefit of MATLAB: I don't need to check its work.
It's funny that you listed 1-based index as a strength, and another poster here lists it as a weakness. Goes to show there's really no agreement when it comes to indexing!
Yeah, it is very strange to equate using AI as a spell checker with an entirely AI-written article. Being charitable, maybe they meant asking the AI to rewrite your whole post rather than just using it to suggest comma placement, but as written the article seems to suggest that a blog post with grammar errors is more Human™ than one without.
Does this handle covariance between different variables? For example, the location of the object you're measuring your distance to presumably also has some error in its position, which may be correlated with your own position (if, for example, it comes from another GPS operating at a similar time).
Certainly a univariate model in the type system could be useful, but it would be extra powerful (and more correct) if it could handle covariance.
With this sampling-based approach you get correct covariance modeling for free. You just have to sample leaf values that are used in multiple places only once per evaluation, and it looks like they do exactly that: https://github.com/mattt/Uncertain/blob/962d4cc802a2b179685d...
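A toy version of why shared leaf samples give you correlation for free (pure Python, not the library's API; the GPS-error framing is just an example):

```python
import random

random.seed(0)
N = 50_000

# One uncertain leaf: our own position error, sampled once per evaluation.
our_pos = [random.gauss(0.0, 1.0) for _ in range(N)]

# Two derived quantities that both reuse the same leaf samples.
dist_to_a = [p + 10.0 for p in our_pos]
dist_to_b = [p - 10.0 for p in our_pos]

# Because the leaf is shared, the common error cancels in the difference;
# sampling the leaf independently for each use would get this wrong and
# report spurious variance.
diffs = [a - b for a, b in zip(dist_to_a, dist_to_b)]
print(all(abs(d - 20.0) < 1e-9 for d in diffs))
```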
I've been wondering for a while if a program could "learn" covariance somehow. Through real-world usage.
Otherwise, it feels to me that it'd be consistently wrong to model the variables as independent. And any program of notable size is gonna be far too big to consider correlations between all the variables.
As for how one might do the learning, I don't know yet!
Ok, but what happens when lib-a depends on lib-x:0.1.4 and lib-b depends on lib-x:0.1.5, even though each would have worked with any lib-x:0.1.*? Are those libraries just incompatible now? Lockfiles don't guarantee that new versions are compatible, but they do guarantee that if your code works in development, it will work in production (at least in terms of dependencies).
I assume Java gets around this by bundling libraries into the deployed .jar file. I think this is better than a lock file, but it doesn't make sense for scripting languages that don't have a build stage. (You won't have trouble convincing me that every language should have a proper build stage, but you might have trouble convincing the millions of lines of code already written in languages that don't.)
Python says "Yes." Every environment manager I've seen, if your version ranges don't overlap for all your dependencies, will end up failing to populate the environment. Known issue; some people's big Python apps just break sometimes and then three or four open source projects have to talk to each other to un-fsck the world.
npm says "No," but in a hilarious way: if lib-a emits objects from lib-x, and lib-b emits objects from lib-x, you'll end up with objects that all your debugging tools say are the same type, and that TypeScript statically says are the same type, but that don't `instanceof` the way two objects of the same type should. Conclusion: `instanceof` is sus in a large program; embrace the duck typing (and accept that maybe your a-originated lib-x objects can't be passed to b's functions without explosions, because I bet b didn't embrace the duck typing).
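The same hazard can be reproduced in Python terms (a made-up `Point` class standing in for lib-x): when two copies of "the same" class exist, as happens when npm installs lib-x twice, identity-based type checks fail even though the types look identical.

```python
# Each call creates a fresh class object, the way each installed copy of
# lib-x defines its own copy of the class.

def make_libx():
    class Point:                  # stand-in for a class inside lib-x
        def __init__(self, x):
            self.x = x
    return Point

PointA = make_libx()              # the copy lib-a bundles
PointB = make_libx()              # the copy lib-b bundles

p = PointA(1)
print(isinstance(p, PointA))      # True
print(isinstance(p, PointB))      # False: "same" type, different class object
```

Duck typing (checking for `p.x` rather than the class) sidesteps the problem, which is the conclusion above.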
> I assume java gets around this by bundling libraries into the deployed .jar file.
You are wrong; Maven just picks one of lib-x:0.1.4 or lib-x:0.1.5 depending on the ordering of the dependency tree.
Gradle suffers the same exact issue by default, because it inherits it from Maven (they use the same repository). You need to go out of your way to enable strict versioning policies and lock files.
Maven and Gradle make up the vast majority of all Java projects in the wild today. So, effectively, Maven is Java in terms of dependency management.
> Gradle suffers the same exact issue by default, because it inherits it from Maven
It's not exactly the same issue, because Gradle and Maven resolve conflicts differently: Maven uses nearest-wins (shortest path), which is affected by declaration ordering, while Gradle does full conflict resolution, selecting the highest version of a dependency found in the graph.
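A toy illustration of the two selection rules (not either tool's actual algorithm, just the policy difference):

```python
# Each entry is (path depth, version) for lib-x as reached through the tree.
paths = [(2, "0.1.4"),   # app -> lib-a -> lib-x:0.1.4 (declared first)
         (2, "0.1.5")]   # app -> lib-b -> lib-x:0.1.5

# Maven-style nearest-wins: shortest path, ties broken by declaration order
# (min() keeps the first entry on a tie).
maven_pick = min(paths, key=lambda p: p[0])[1]

# Gradle-style: highest version anywhere in the graph.
gradle_pick = max(paths, key=lambda p: tuple(map(int, p[1].split("."))))[1]

print(maven_pick, gradle_pick)   # 0.1.4 0.1.5
```

Same graph, different winner, which is exactly how the two builds end up with different lib-x versions.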
I'm glad you brought up Kojima, because I think he's a master of this New Literalism. I just watched my partner play Death Stranding 2, and it feels like every other cut-scene has an NPC turn to camera and explain the themes of the game. And I love it! It doesn't detract from the game's ability to express those themes through metaphor and gameplay.
Obviously subtlety is good, but choosing to be very literal can be an interesting artistic take. I don't think Kojima was trying to dumb down his message for audiences; I think it's a genuine artistic choice rooted in his style. While I didn't like it for other reasons, I think the same can be said of Megalopolis. I loved the scene where it's just a full-screen interview with Catiline, even if it was kind of dumb.
There's probably something interesting in how both the ten-thousandth grey-CGI Marvel movie and these more experimental artists are drawn to hyper-literalism right now, probably with some thoughts about the social internet thrown in. I'll have to think about it.