Maintaining shadow branches for GitHub PRs

17 points by MaskRay a day ago on lobsters | 24 comments

jmtd | a day ago

What (in your opinion) are the drawbacks of merging the main branch back into the PR branch: I ask because you don’t mention it (at least in “the problem”) section

[OP] MaskRay | a day ago

The PR branch likely has an outdated base. When I switch between the latest main branch and the PR branch, there will be unneeded rebuilds (due to unrelated file change). For a large project like llvm-project, lagging for 1 day ensures that I will have to rebuild almost everything.

OK, add some commands for demonstration:

git switch main; git pull; ninja -C build

# Switching to a feature branch with an outdated base requires numerous rebuilds.
git switch feature0
git merge origin/main  # I prefer `git rebase main` to remove merge commits, which clutter the history
ninja -C build

# Switching to another feature branch with an outdated base requires numerous rebuilds.
git switch feature1
git merge origin/main
ninja -C build

jmtd | a day ago

I found having separate work trees for separate in-progress PRs really helpful for working on OpenJDK (another large code base). In fact since OpenJDK’s major versions are largely maintained in separate repositories, this also helps to share git object storage between them, and to make cherry-picking and otherwise referencing stuff from each of them trivial in the others.

[OP] MaskRay | 23 hours ago

I agree that git worktree is situationally useful. For example, I use /tmp/llvm-22 for the release/22.x release branch.

However, each work tree requires a build tree, increasing the number of builds, which is exactly the issue I want to avoid...

jevinskie | a day ago

Comment removed by author

roetlich | a day ago

But, why not just use merge instead of rebasing all the time?

edk- | a day ago

I can't speak for OP but I really don't like using merge commits for features, and forbid it on anything I maintain. It breaks bisect as well as just generally making the history hard to follow. Doing merges and then one big rebase at the end can get messy and even if you recorded conflict resolutions during your merges, ime they don't always apply.

On the other hand keeping your feature branches rebased works great with either merge commits or rebase merges. And it is a lot nicer to work with locally if you share my habit of working on multiple things at the same time and then sorting the changes into appropriate branches later. So I always do it, and live with the less-than-great way it shows in forges.

natkr | a day ago

It breaks bisect

No it doesn't. Just run git bisect start with --first-parent.

as well as just generally making the history hard to follow.

Always run git log with either --graph or --first-parent. If your GUI doesn't support them, get one that actually cares to understand the data model they're built against.

Doing merges and then one big rebase at the end can get messy

Yes, running the break-my-history command will break your history. I'd argue that its name should reflect that property, but I don't make that decision.

edk- | a day ago

Well, no, the problem I have with bisect and merges is fundamental to the concept of a merge: you can get a correct answer out of it, but the correct answer is less useful when it blames the merge, rather than a specific one of the commits that were merged.

git log --graph becomes unreadable pretty fast to me, even for relatively slow development with short-lived branches. The information is displayed, but it is just inherently more complicated information than a linear history. At work there are often dozens of active branches at any given moment and the graph is just impenetrable noise.

natkr | a day ago

If you want to drill down into the branch then you can just… start a new bisect.

edk- | a day ago

I will admit this has only happened to me a couple of times, but that doesn't work if the failure was introduced by the act of combining both sides of the merge. If your history is linear you don't have this problem because one of the commits has to come second and it'll get the blame. The first time this got me I considered writing a tool that would rebase the parallel histories against each other both ways to identify both of the clashing commits, but it seems kind of tricky and it couldn't be automatic if there were conflicts, so I didn't think it would be worth it.

natkr | a day ago

There I'd argue that the integration (merge) is the right place for the error to be. Now you have both a ground truth for the original context where the branch worked, and a record of any decisions made in the process! I'd take that any day over gaslighting myself about whether it ever worked.

jevinskie | a day ago

As a practical point, for the author, because LLVM policy does not allow for merge commits.

[OP] MaskRay | a day ago

https://github.com/llvm/llvm-project/ is configured to only allow "Squash and merge" . (It allowed "Merge" temporarily when new subprojects (bolt/ mlir/) were initially integrated ). The merge commits in the PR do not matter. GitHub will do an internal rebase when a PR is merged.

I am not a git power user. Dealing with merge commits is very inconvenient. Beside the bisection inconvenience,

# Listing fixup commits ignoring upstream merges requires the clumsy --first-parent.
git log --first-parent

llvm-project uses the "Pull request title and description" setting for squash merging. This means updating the final commit message requires editing via the web UI. I prefer editing the local commit message and syncing the PR description from it.

Added the arguments above to the blog post.

tobin_baker | a day ago

"Squash and merge" is only appropriate for low-trust environments where developers can't be trusted to clean up their history before merging. I guess that lowest-common-denominator default is appropriate for GitHub but not for a project with skilled and conscientious contributors.

tavianator | 19 hours ago

Do you not believe that LLVM has "skilled and conscientious contributors"?

stig | a day ago

It’s worse. Squash and Merge in an environment where folks can merge their own work once they get an approval means your repo gets filled with commit messages containing pure garbage, including many lines of “fix” “typo” “wat” “try this”. That’s why we don’t allow squash and merge in my team’s repo at work. (I trust my team members, but not all drive-by contributors.) It’s the only way to ensure that we can review the commit messages going into the repo.

enpo | a day ago

I like to tell a story in a complex PR. By doing that, you can review commit by commit instead of everything at once.

When I work locally, I amend commits and break them up if necessary before pushing for the first time. When I have pushed the branch for the first time, I usually address feedback in a separate commit at the end, but it would have been more smooth to amend the original commit.

When the commits are detailed and split up like that, it's very nice to git blame years afterwards. I'm my own favourite author when doing blames because the commit messages are (imho) very good :)

dutchie | a day ago

You can (and should!) still do that and merge your branch in with a merge commit instead of a fast-forward though. It's up to an individual development group whether they want to do a traditional merge or a rebase/fast-forward (or even a squash merge, if they really want to invalidate all that hard work making commits atomic).

In my head, rebase is a tool intended to move commits from one place to another. All the --interactive amending stuff is just something that sort of ended up in the code that had to rewrite commits anyway (no idea if this is actually how the development actually shook out).

[OP] MaskRay | a day ago

Sorry that the blog post was not clear.

So, if I updated my main branch and plan to switch a feature branch, I always run

git rebase main feature

to minimize the number of modified files. (Then I use pr-shadow to maintain a shadow PR branch (e.g., pr/feature) that only receives fast-forward commits (including merge commits)).

I don't know how to do merge origin/main without unnecessarily updating file timestamps. The merge commits look clumsy so I tend to avoid them.

natkr | a day ago

If you work on several branches in parallel then you can use worktrees to materialize each into a separate directory and avoid them clobbering each other.

dutchie | a day ago

I was mostly responding to my reading of @enpo's comment, which I took to mean that using rebase instead of merge lets you create a nice clean history. I wasn't speaking to the blog post at all really: I've not had much experience of a repo which is fast-moving enough to require this "shadow PR branch" strategy. In pretty much every project I've worked on it's been fine just to rebase/force-push if necessary (and often mine will be the only PR "in-flight" anyway).

To restate my point, It's perfectly reasonable to use git rebase -i to amend your branch while never changing the base commit and then creating a merge commit onto the main branch at the end.

enpo | a day ago

Yes, I was also commenting the general case of merge vs. rebase.

My comment touched on how I organize and prettify the feature branch itself. When the branch has passed review, I merge it to the main branch (or rebase if the project requires it – but I must admit that I get sad if I have to squash AND rebase, because the nice commit history is worth keeping).

roetlich | a day ago

Comment removed by author

ThatsInteresting | a day ago

Am I missing something? Isn't force pushing not recommended to begin with?

I work in an environment where it's mostly not permitted to force push and enforced by hooks so the workflow ends up being: you work on your branch, you rebase, it fails, you fight with it a bit, you abort that, you create a new branch, review the reflog and cherry pick, fix as you go, curse everyone and their ancestors and pets, you realize your separate commit upstream was the one that caused your problem, curse git, stomp away and get coffee, fix it, push 😅 Really, I usually do ff-only on my local branches, rebase minimally, and have to do merges upstream so try to avoid having extra ones because that makes a potential rebase difficult. Maybe I'm just fortunate that we don't stomp on each other's branches that often.

[OP] MaskRay | a day ago

Force pushing is indeed discouraged. Listed some problems:

  • The UI displays "force-pushed the BB branch from X to Y". Clicking "compare" shows git diff X..Y, which includes unrelated upstream commits—not the actual patch difference. For a project like LLVM with 100+ commits daily, this makes the comparison essentially useless.
  • Inline comments may become "outdated" or misplaced after force pushes.
  • If your commit message references an issue or another PR, each force push creates a new link on the referenced page, cluttering it with duplicate mentions. (Adding backticks around the link text works around this, but it's not ideal.)