Reviewing so called Pull Requests at $dayjob

18 points by runxiyu 12 hours ago on lobsters | 14 comments

After 5 years working in open source it is somewhat shocking going back into the corporate world and seeing how unprofessional the version control and communication practices are by comparison. Top-posting replies to email is, of course, the norm for the rare occasions where text-based communication doesn't flow through near-ephemeral chat software. The concept of amending commits into a reasonable history of atomic changes doesn't exist, every PR is just a series of "fixing PR comment" commits that are all squashed into a single commit with a verbose templated message that is too long to read.

sunshowers | 11 hours ago

I don't think top-posting is any more or less professional than interleaved responses. It's just a different community norm.

Atomic and bisectable commits are wonderful, but the GitHub UI makes amends painful for reviewers. (Apparently they're working on stacked PRs which should make the situation better.) I use a tool called spr to do amends + rebases locally and turn them into new commits + merges on the remote. It's a bit jank but otherwise works quite well — I routinely make stacks of 5-10 PRs and landed a 41 commit stack at work last summer this way.

Meta, which doesn't use GitHub, is all-in on stacked diffs. I know this because I helped set up these workflows during my time there.

calvin | 7 hours ago

I don't think top-posting is any more or less professional than interleaved responses. It's just a different community norm.

Yeah, I have no qualms against top posting. Mailing lists for (Unixy) open source software are pretty much the only places I ever see people complain about it; mailing lists in general, corporate email threads, personal emails, etc are top posting and no one cares.

atk | 3 hours ago

I don't think it's a matter of professionalism. And yes, it is a different norm.

But that being said, it's absolutely the worst format of communication.

Not only is it absolutely impossible to read such an email chain from a single email. Microsoft's own garbage email clients struggle with the format that I believe they themselves popularised.

Combined with the fact that HTML emails, corporation's favourite, have no remotely standardised concept of "this section is a quote from another email", it's absolute hell dealing with long email chains in this format.

I guess you can also add idiotically lengthy email signatures with an A4 page worth of ~~GPL~~legal boilerplate to the list of things to blame.

I really think anyone who doesn't think that top posting is crazy has never been in a sufficiently corporate, sufficiently long, and sufficiently branched and forwarded email chain of doom.

Interleaved/bottom posting stays sensical way longer than Microsoft's favourite format.

Ambroisie | 8 hours ago

It's just a different community norm.

While I agree with you, and won't fault people for it, I keep finding grating, as my uni's insistence on respecting the netiquette has enlightened me to the superiority of inline-posting.

winter | 7 hours ago

I use a tool called spr to do amends + rebases locally and turn them into new commits + merges on the remote. It's a bit jank but otherwise works quite well — I routinely make stacks of 5-10 PRs and landed a 41 commit stack at work last summer this way.

Which fork for jj do you use, if any?

PuercoPop | 3 hours ago

https://github.com/jennings/jj-spr spr is not a fork of jujutsu it works on top of regular jujutsu

winter | 2 hours ago

Right, I was asking which fork of spr do jj people use :) thanks!

: PuercoPop | an hour ago
I linked you to the repository that ~sunshowers spr fork recommends https://github.com/sunshowers/spr

oger | 11 hours ago

No idea which web-based review system the author is using; but at least Github and Gitlab allow viewing the diffs between force-pushes (e.g. in Github there's a line in the Conversation tab like "oger force-pushed the fix-issue-12345 branch from abcd1234 to 5678cdef" and a "Compare" button).

I still find it difficult to review these "diffs of diffs", but I guess that's not so much a UI issue but rather that this is a difficult problem in general. Or are there UIs that present this in a more manageable way?

lalitm | 10 hours ago

In my experience, Gerrit has a far better experience of these "diff of diffs" by tracking each push to the server as a "patchset" and allowing you to diff arbitrary patch sets against each other. This is very similar to how Critique, Google's internal code review tool does it though it operates more on a "file by file" level than Gerrit.

roryokane | 10 hours ago

GitLab tracks pushed versions and supports arbitrary diffing of them: GitLab docs > Merge requests > Compare diff versions. I’m not sure if that’s the same thing you’re referring to – do Gerrit’s diffs of patchsets compare the files within them (which is what GitLab does) or each diff (with - and + markers) against the target branch?

One problem with GitLab’s system is that if the MR is rebased against master in version 4, then the diff from MR versions 2 to 6 will include both changes to code in the MR and changes to other code on master. The changes from master are usually irrelevant for a reviewer wanting to understand how the MR has changed since their last review, though I can imagine it would sometimes be useful. At least the MR’s main page includes a log like “rebased against master (2 commits)”, so a reviewer has the option to read those commits’ changes separately, then mentally subtract them when they read the diff from MR versions 2 to 6.

: lalitm | 10 hours ago
It's been a very long time since I used Gitlab seriously so it's very possible it's gotten much better.

do Gerrit’s diffs of patchsets compare the files within them (which is what GitLab does) or each diff (with - and + markers) against the target branch?

It supports both. You have a drop down selector to choose the base and the target and it shows you the diff between those two. The base can be the actual branch you're targeting or one of the patchsets. And the target can be any patchset after the base.

My above comment definitely still applies to GitHub though whose tracking of diff of diffs is poor.

It also has a annoying habit of losing the anchor for comments meaning you have to go to the main timeline page to resolve them.

chipx86 | an hour ago

Yeah, we've handled this problem for about 20 years or so in Review Board through an interdiff filtering algorithm. It's not AI or anything, but it analyzes the two uploaded diffs defining the range and the resulting interdiff and the upstream changes, and generates a resulting interdiff that filters out any upstream changes, showing only the author's changes.

Works well, and doesn't require you to do anything special with the diffs or use any particular version control system. But it's a hard problem and edge cases can always happen.

I actually just wrapped up work on v3 of this algorithm last week for an upcoming release.