Just using git, you'd send a set of patches, which can be reviewed, tested and applied individually.
The PR workflow makes a patch series an indivisible set of changes, which must be reviewed, tested and applied in unison.
And stacked PRs try to work around this issue, but the issue is how PRs are implemented in the first place.
What you really want is the ability to review individual commits/patches again, rather than work on entire bundles at once. Stacked PRs seem like a second layer of abstraction to work around issues with the first layer of abstraction.
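For anyone who hasn't used the patch-series workflow, here is a rough sketch with plain git in a throwaway repo (directory names and commit messages are made up for illustration): each commit becomes its own patch file, and a maintainer can apply or skip each one individually.

```shell
set -e
work=$(mktemp -d)
git init -q "$work/src" && cd "$work/src"
git config user.email demo@example.com && git config user.name Demo
echo a > a.txt && git add a.txt && git commit -q -m "initial"
git branch -M main
git checkout -q -b series
echo b > b.txt && git add b.txt && git commit -q -m "refactor: prep work"
echo c > c.txt && git add c.txt && git commit -q -m "feat: new behavior"
# Export one .patch file per commit on the series branch:
git format-patch -o "$work/patches" main >/dev/null
# A maintainer can apply just the first patch and hold back the second:
git checkout -q main
git am "$work/patches"/0001-*.patch >/dev/null
```

After this, main contains the refactor but not the feature; the second patch can be reviewed, reworked, or dropped independently.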
The teams that I have worked with still apply the philosophy you’re describing, but they consider PRs to be the “commit”, i.e. the smallest thing that is sane to apply individually.
Then the commits in the PR are not held to the standard of being acceptable to apply, and they are squashed together when the PR is merged.
This allows for a work flow in which up until the PR is merged the “history of developing the PR” is preserved but once it is merged, the entire PR is applied as one change to the main branch.
This workflow combined with stacked PRs allows developers to think in terms of the “smallest reviewable and applicable change” without needing to ensure that during development their intermediate states are safe to apply to main.
Doesn’t this mean that a first review might request that a specific change be reverted, and then a later reviewer reviews that reversion? That’s essentially reviewing a noop, but understanding that it’s a noop requires carefully checking all previous now-invalidated changes.
Squashing is fine if you’re just making a mess of temporary commits as you work and you don’t want to keep any of those changes separate in master, but that’s not a useful review workflow. A lot of times I’ve built a feature in a way that decomposed naturally into e.g. two commits: one to do a preparatory refactor (which might have a lot of noisy and repetitive changes, like changing a function signature) and another to actually change the behavior. You want those changes to be separate because it makes the changes easier to review; the reviewer quickly skims the first commit, observes that it’s a mechanical refactor, and the change in behavior has its own, smaller commit without all the noise.
“What if there’s feedback and you need to make changes after the code review?” Then I do the same thing I did before I posted the code review: make separate “fixup” commits and do an interactive rebase to squash them into my commits. (And yes, I do validate that the intermediate commits build cleanly.)
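That fixup-and-autosquash step can be demonstrated non-interactively in a scratch repo (a sketch with made-up commits; GIT_SEQUENCE_EDITOR=true accepts git’s generated todo list so the interactive rebase runs unattended):

```shell
set -e
cd "$(mktemp -d)"
git init -q demo && cd demo
git config user.email demo@example.com && git config user.name Demo
echo readme > README && git add README && git commit -q -m "initial"
echo v1 > api.txt && git add api.txt && git commit -q -m "refactor: change signature"
echo feat > feature.txt && git add feature.txt && git commit -q -m "feat: new behavior"
# Review feedback belongs in the refactor commit, not in a new one:
echo v2 > api.txt && git add api.txt
git commit -q --fixup ":/refactor"   # records a "fixup! refactor: ..." commit
# Fold the fixup into its target commit:
GIT_SEQUENCE_EDITOR=true git rebase -i --autosquash HEAD~3
```

The history ends up with the same three commits as before the feedback, with the fix folded into the refactor commit rather than tacked on at the end.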
There’s nothing you get from stacked PRs that you don’t also get from saying “please review my feature branch commit by commit”.
Yes, what you’re describing is literally the thing GitHub has built, but instead of having to make a bunch of compromises, there is a dedicated UI and product metaphor for it.
Some examples of compromises:
You can’t partially merge a large “review commit by commit” PR, so you are forced to wait until it is all ready to merge.
> You can’t partially merge a large “review commit by commit” PR, so you are forced to wait until it is all ready to merge.
These are two different use cases. I thought we were talking about the one where a set of changes is more readable commit by commit but you still want to merge the whole set of changes, not the one where the change is too big to review and merge at once so you have to break it up into multiple reviews. The latter use case is more rare—frankly, it’s a bit of a red flag otherwise—and wasn’t difficult anyway.
Microsoft didn’t need to build anything because it was already built into Git. The only problem is, if people knew how to use Git, Microsoft couldn’t lock them into a proprietary version control platform.
Exactly! A stack of PRs is really the same beast as a branch of commits.
The traditional tools (mailing-lists, git branches, Phabricator) represented each change as a difference between an old version of the code and the proposed new version. I believe Phabricator literally stored the diff. They were called “diffs” and you could make a new one by copying and pasting into a <textarea> before pressing save*.
The newfangled stuff (GitHub and its clones) recorded your change as being between branches A and B, showed you the difference on the fly, and let you modify branch B. After fifteen years of this we are now seeing the option for branch A to be something other than main, or at least for this to be a well supported workflow.
In traditional git land, having your change as a first class object — an email or printout or ph/D1234 with the patch included — was the default workflow!
Right, a PR is "just" a set of commits (all must be in the same branch) that are intended to land atomically.
Stacked PRs are not breaking up a set of commits into divisible units. Like you said, you can already do that yourself. They let you continue to work off of a PR as your new base. This lets you continue to iterate asynchronously to a review of the earlier PRs, and build on top of them.
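A rough sketch of that stacking with plain git (pr-1 and pr-2 are hypothetical branch names standing in for the two PRs): pr-2 branches off pr-1 rather than main, and when review changes pr-1, pr-2 is rebased onto the updated tip.

```shell
set -e
cd "$(mktemp -d)"
git init -q stack && cd stack
git config user.email demo@example.com && git config user.name Demo
echo base > base.txt && git add base.txt && git commit -q -m "initial"
git branch -M main
git checkout -q -b pr-1                  # first PR: the groundwork
echo one > one.txt && git add one.txt && git commit -q -m "pr-1: groundwork"
git checkout -q -b pr-2                  # second PR: branches off pr-1, not main
echo two > two.txt && git add two.txt && git commit -q -m "pr-2: feature on top"
# Review feedback lands on pr-1 while pr-2 work continues:
git checkout -q pr-1
echo revised > one.txt && git commit -q -am "pr-1: address review"
# Carry pr-2 along onto the updated pr-1:
git rebase -q pr-1 pr-2
```

The hosted stacked-PR tools automate exactly this rebase-and-carry step across the whole stack; the underlying model is still just branches on branches.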
You often, very often, need to stage your work into reviewer-consumable units. Those units are the stack.
Still not sure this is the right solution. My problem is that if one of your first stages gets rejected in review or requires significant changes, it invalidates so much work that comes after it. When possible, I’ve always preferred to get small stuff merged into production as it happens rather than build an entire feature and put it up for review.
> it invalidates so much work that comes after it.
No, not necessarily.
I work on a large repo and new features often involve changes to 3 different services: 2 from the backend, and the frontend UI. Sending a single PR with changes to all 3 services is really not ideal: the total diff size in a feature I added recently was maybe 600+ lines, and the reviewers for frontend and backend changes are different people. The changes in the 2 backend services can be thought of as business logic on one side and interactions with external platforms on the other. The business logic can't work without integrating calls to external APIs, and the UI can't work without the business logic.
These days I open 3 separate PRs and the software only works once all 3 are merged and built. It would be great to have all of them as a single package that's still testable and reviewable as 3 distinct parts. The UI reviewer can check out the whole stacked PR and see it running locally with a functional backend, something that's not possible without a lot of manual work when we have 3 PRs.
The LLVM community used this model for years with Phabricator before it was EOL’d and the move to GitHub and PRs was forced. It’s a proven model and works very well in complex code bases with multiple components and dependencies that can have very different reviewer groups. E.g.:
1) A foundational change to the IR is the baseline commit
2) Then some tweaks on top to lay the groundwork for uses of that change
3) Some implementation of a new feature that uses the new IR change
4) A final change that flips the feature flag on to enable by default.
Each of these changes depends on the previous one. Without stacked PRs you have only one PR, and reviewing this is huge. Maybe thousands of lines of complex code. Worse, some reviewers only need to see some parts of it and not the rest.
Stacked diffs were a godsend and the LLVM community's number one complaint about moving to GitHub was losing this feature.
This works much better in Phabricator because commits to diffs are a 1:1 relationship, diffs are updated by amending the commit, etc. The GitHub implementation does seem a bit like gluing on an additional feature.
Seems to be quite simple, an App which wants to access this info just needs to set the permission for it.
Chrome doesn't seem to request that permission, so the OS doesn't provide the location-data to the app. So Chrome rather ended up in this state by doing nothing, not by explicitly doing something...
If your app targets Android 10 (API level 29) or higher and needs to retrieve unredacted EXIF metadata from photos, you need to declare the ACCESS_MEDIA_LOCATION permission in your app's manifest, then request this permission at runtime.
This is a common approach to "privacy" taken by orgs like Google.
You don't get to access or export your own data in order to protect your privacy, but Google still gets 100% access to it.
Some messaging apps do the same and won't let you take a screenshot of your own conversations. Like, someone sent me an address, but I can't take a screenshot to "protect my privacy".
Which messaging apps are those? I have only seen such behavior for one-time photos, where it makes sense (although one-time photos are security theater because nothing prevents you from taking a photo of the screen with another device).
Imagine my surprise when I attempted to record the iPhone mirroring application, which was running on macOS. Apple did a great job on their DRM because I simply recorded a black screen while I was attempting to play back a video from an app on the phone.
I'm sure it's given some businesses the confidence to invest in iOS app development, but it felt bad.
Yes, or a video editing app that wants you to buy it.
I'm not _entirely_ upset Apple is encouraging the market to develop high-quality solutions by allowing them to protect their revenue.
But it felt bad as if they were reaching into my Mac.
My iPhone is Apple’s playground. They let me use it. But I own my Mac, and if my eyes see something on the screen it feels dumb to send Tim Apple and Reed Hastings into my home office telling me “no no, get a capture card(?) or set up a DSLR to record your screen. But no direct recording, big guy!”
I also attempted to package ROCm on musl. Specifically, packaging it for Alpine Linux.
It truly is a nightmare to build the whole thing. I got past the custom LLVM fork and a dozen other packages, but eventually decided it had been too much of a time sink.
I’m using llama.cpp with its Vulkan support and it’s good enough for my uses. Vulkan is already there and just works. It’s probably on your host too, since so many other things rely on it anyway.
That said, I’d be curious to look at your build recipes. Maybe it can help power through the last bits of the Alpine port.
Interesting how Vulkan and ROCm are roughly the same age (~9 years), but one is incredibly more stable (and sometimes even more performant) for AI use cases as a side gig, while the other has AI as its primary raison d'être. Tells you a lot about the development teams behind them.
I've built llama.cpp against both Vulkan and ROCm on a Strix Halo dev box. I agree Vulkan is good enough, at least for my hobbyist purposes. ROCm has improved but I would say not worth the administrative overhead.
I realize it does not address the OP’s security concerns, but I’m having success running ROCm containers[0] on Alpine Linux specifically for llama.cpp. I also got vLLM to run in a ROCm container, but I didn’t have time to diagnose perf problems, and llama.cpp is working well for my needs.
I like the idea of Pijul, and checked it out a couple of years ago. Some basic quality of life features were missing, and are still missing.
For example, diffs can't show context. They show lines removed and added, but can't show the previous and following line because of implementation details.
It’s possible this is just a Nest limitation? It appears that all the data is there to construct a diff with context, and I’d hope the CLI would do so…
OTOH, this article goes too far to the opposite extreme:
> We isolated the vulnerable svc_rpc_gss_validate function, provided architectural context (that it handles network-parsed RPC credentials, that oa_length comes from the packet), and asked eight models to assess it for security vulnerabilities.
To follow your analogy, they pointed to the exact room where the gold was hidden, and their model found it. But finding the right room within the entire continent is honestly the hard part.
> Their job literally depends on them finding Mythos to be good, we can't trust a single word they say.
TFA is literally from a company whose business is finding other people’s vulnerabilities with AI. This article is the exact kind of incentive-driven bad study you’re criticizing.
Hell, the subtitle is literally "Why the moat is the system, not the model". It's literally them going, "pssh, we can do that too, invest in us instead"
Weird that they're co-opting the "Assisted-by:" trailer to tag software and model being used. This trailer was previously used to tag someone else who has assisted in the commit in some way. Now it has two distinct usages.
I like skills because they rely on the same tools which humans rely upon. A well-written skill can be read and used by a human too.
A skill is just a description for how to use an existing CLI tool. You don't need to write new code for the LLM to interact with some system. You just tell the LLM to use the same tool humans do. And if you find the CLI is lacking in some way, you can improve it and direct human usage benefits from that improvement too.
On the other hand, an MCP requires implementing a new API for a service, an API exclusive to LLMs, and keeping parallel documentation for that. Every hour of effort put into it is an hour that's taken away from improving the human-facing API and documentation.
The way skills are lazy-loaded when needed also keeps context clean when they're not used. To be fair, MCPs could be lazy-loaded the same way, that's just an implementation detail.
> If files get deleted on the local host, they get deleted from OneDrive/Dropbox too.
Dropbox, at least, does offer file history but I'm talking about protecting against hardware failure here more than a user deleting their own files. That's the use-case I've personally dealt with more often than not. "I dropped my phone in the pool, how do I get my pictures back", "My laptop won't turn on anymore, just shows a folder with a question mark on it when I try to boot", etc. Self-inflicted or just general hardware failure is the main issue people deal with in my experience.