Thursday, December 9, 2021

Linq DistinctBy

I've had to search for this quite a few times so I figured it was probably time to write it up. I actually did find this answer somewhere else (here, if you're interested) and modified it only slightly to follow my standards.

As I often (somehow) forget, the default .Distinct() extension in System.Linq for an IEnumerable compares items using the type's default equality comparer. For a class that doesn't override Equals() and GetHashCode(), that means two objects only count as the same when they are references to the same exact object. I can say with a good amount of confidence that in 16+ years of working in C# that's never actually been what I'm trying to do when I use .Distinct(). My most common usage is determining whether two objects have the same data on them, usually one specific field (like an ID field, for example). There are Stack Overflow posts and extension libraries out there that include this, but I'm pretty loath to bring in a whole library for one simple extension method. Which brings us to today.

I'm using Dapper to get a list of objects from the database and then using the splitOn feature to get the children of those objects. Think of a teacher and students: I'm getting all the teachers in the school and then all of each teacher's students in a single query, then I want to break the students out according to their teachers by using the splitOn feature of Dapper. That's no problem and doesn't even require me to use .Distinct(). If I also include the clubs that each teacher oversees, I could easily end up with duplicate students in my results. The easiest way to get the distinct students in my results would be to use the .Distinct() extension method included in System.Linq, if only that worked the way it seems like it should. Instead, I'll have to write my own. So here we are.


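// Requires: using System; using System.Collections.Generic; using System.Linq;
// Extension methods must be declared inside a static class.
public static IEnumerable<T> DistinctBy<T>(this IEnumerable<T> list, Func<T, object> propertySelector) where T : class
{
  // Group the items by the selected key and keep the first item from each group.
  return list.GroupBy(propertySelector).Select(x => x.First());
}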

That's it, really. The somewhat obvious flaw is that we'll take the first match we find, but if you're looking for distinct objects, that really shouldn't be too big of a deal. Hopefully this helps you (even if "you" are really just future me).

Wednesday, December 1, 2021

Removing A Single Commit From Master

I was asked today how you would go about rolling back a single commit in git, but keeping the commits that came after that one. It turned out to be pretty easy* so I figured I'd write it up for future me to refer back to.

To set the stage, I created a new directory and ran git init on it, then added nine blank text files to it. For simplicity I just named them First.txt, Second.txt, etc. All nine files were committed in the first commit after init. I then edited each file, adding some arbitrary text, and committed after each edit. I added "some text goes here" to First.txt, saved it, and committed that change, then repeated that process for each of the other eight files. That gave me a total of 10 commits (the initial commit and then one for each change to a file).
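If you want to follow along, the setup described above can be sketched as a small script. This is my approximation rather than the exact commands used for the post: the temporary directory, the commit messages, and the example name and email are all assumptions made so the script runs on a clean machine.

```shell
#!/bin/sh
set -e

# Work in a throwaway directory so nothing real is touched (assumption).
repo=$(mktemp -d)
cd "$repo"
git init -q

# Repository-local identity so commits work without any global git config.
git config user.name "Example"
git config user.email "example@example.com"

files="First Second Third Fourth Fifth Sixth Seventh Eighth Ninth"

# Initial commit: nine empty text files.
for name in $files; do
  : > "$name.txt"
done
git add .
git commit -q -m "Add nine empty files"

# One commit per file edit, nine in total.
for name in $files; do
  echo "some text goes here" > "$name.txt"
  git commit -q -am "Add text to $name.txt"
done

# Show the resulting history, newest commit first.
git log --oneline
```

The commit IDs you get will differ from the ones in this post, since they depend on the author, timestamp, and message of each commit.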

I reviewed my commits by running git log --oneline (the --oneline parameter just shows a summary view instead of the full log) to find the commit I wanted to skip. I decided on skipping the changes to Third.txt, which is commit b8b2f8c. Since that's the one I want to skip, I actually need the ID of its parent (the commit just before it) to use in my rebase, which is 5b124e0.

Now that I know which commit I'm going to skip I'm ready to use an interactive rebase to drop it from the history.
  • git rebase -i 5b124e0
  • <text editor launches displaying all commits beginning with the one I want to drop in ascending order>
  • change "pick" in the first line (pick b8b2f8c <commit message>) to "drop"
  • save
  • close text editor
That's it! Since none of the subsequent commits touched Third.txt and only Third.txt was affected by the commit I dropped, there were no conflicts. If I look at my commit log now I see it has all of the commits except b8b2f8c and I can check Third.txt to see that it is empty (as it was after the initial commit).
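The steps above go through the interactive editor. As a hedged alternative for scripts, the same result can usually be reached without an editor by using git rebase --onto, which replays everything after the unwanted commit on top of its parent. This is a different route from the interactive rebase described above, and the four-file repository, commit messages, and identity below are assumptions made to keep the sketch self-contained.

```shell
#!/bin/sh
set -e

# Throwaway repository with a smaller version of the setup (assumption).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

files="First Second Third Fourth"
for name in $files; do
  : > "$name.txt"
done
git add .
git commit -q -m "Add empty files"

for name in $files; do
  echo "some text goes here" > "$name.txt"
  git commit -q -am "Add text to $name.txt"
done

# Most recent commit that touched Third.txt: the one to drop.
target=$(git log -1 --format=%H -- Third.txt)

# Replay every commit after $target onto $target's parent,
# which leaves $target itself out of the history.
git rebase -q --onto "$target^" "$target"

git log --oneline
```

After this runs, the log no longer contains the Third.txt edit and Third.txt is empty again, matching what the interactive drop produces when there are no conflicts.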

That's the easy case. What happens when you have subsequent commits that have touched the same files as the commit you're trying to drop? You're going to have to manually merge them in. Using the same setup as before, I made additional changes to Third.txt and created a new commit for it (941a0d1). Then I went through the same steps as above, but this time instead of getting a nice friendly message about how everything worked I get the following:
Auto-merging Third.txt
CONFLICT (content): Merge conflict in Third.txt
error: could not apply 941a0d1... <commit message>
hint: Resolve all conflicts manually, mark them as resolved with
hint: 'git add/rm <conflicted_files>', then run 'git rebase --continue'.
hint: You can instead skip this commit: run 'git rebase --skip'.
hint: To abort and get back to the state before 'git rebase', run 'git rebase --abort'.
Could not apply 941a0d1... <commit message>
When I open up Third.txt I can see that there's a merge conflict that I can clear up. I remove the unwanted changes and leave in the changes from my last commit, save and close. Now I run git add Third.txt to mark the conflict as resolved, commit the result with git commit -m "Resolve conflict in Third.txt", and then run git rebase --continue and I'm done! I can see that the commit I wanted to drop is gone, but everything else is there. The only difference is that now the latest commit has my new commit message "Resolve conflict in Third.txt" instead of the original commit message I had in there and the SHA1 (commit ID) of that latest commit has changed (it is no longer 941a0d1), as have the IDs of the other commits that were replayed after the dropped one.
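Here is a minimal sketch of the conflict case from start to finish. It is an approximation under a few assumptions: it uses a single file, made-up commit messages and identity, and git rebase --onto instead of the interactive editor so it can run unattended. It also resolves the conflict by overwriting the file rather than editing the conflict markers by hand, and it keeps the original commit message by setting GIT_EDITOR=true for git rebase --continue.

```shell
#!/bin/sh
set -e

# Throwaway repository (assumption).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

: > Third.txt
git add Third.txt
git commit -q -m "Add empty Third.txt"

echo "some text goes here" > Third.txt
git commit -q -am "Add text to Third.txt"
target=$(git rev-parse HEAD)   # the commit to drop

echo "more text goes here" >> Third.txt
git commit -q -am "Add more text to Third.txt"

# Dropping $target makes the later commit conflict, so the rebase
# stops and returns a non-zero exit status.
if git rebase --onto "$target^" "$target" >/dev/null 2>&1; then
  echo "rebase finished without conflicts"
else
  # Keep only the text from the later commit, then mark the file resolved.
  echo "more text goes here" > Third.txt
  git add Third.txt
  # GIT_EDITOR=true accepts the original commit message without opening an editor.
  GIT_EDITOR=true git rebase --continue >/dev/null
fi

git log --oneline
```

If something goes wrong partway through, git rebase --abort returns the branch to the state it was in before the rebase started.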

* This is easy and straightforward if none of the commits after the one you're dropping touch the files that it changed. If any of them do, things get a bit messier, but it's still doable.