Monday, November 14, 2022

If you don't tolerate it in new code you should not tolerate it in old code either

Let's assume that you are working on a code base and notice that it has some minor issue. For argument's sake we'll say that it has some self written functionality and that the language's standard library has added identical functionality recently. Let's further assume that that said implementation behaves exactly the same as the self written one. At this point you might decide to clean up the code base, make it use the stdlib implementation and delete the custom code. This seems like a nice cleanup so you then file merge request to get the thing changed.

It might be accepted and merged without problems. On the other hand it might be blocked, either temporarily or permanently. There are several valid reasons for not merging the change. For example:

  1. That part of the code does not have sufficient tests so we don't know if the change is safe.
  2. A release is coming up soon, so we are minimizing all changes that might cause regressions, no matter how minor. The merge can only be done after the release is out.
  3. We need to support old platform X, whose stdlib does not have that functionality.
  4. It's not immediately obvious that the two implementations behave identically (for example, because the old implementation has load-bearing bugs) so a change might introduce bugs.

Getting blocked like this is a bit unfortunate, but these things happen. The important thing, however, is that all these are solid technical reason for not doing the cleanup (or at least not do it immediately). Things get bad when you get blocked by some other reason, such by your reviewer asking "why are you even doing this at all". This is a reasonable question, so let's examine it in detail.

Suppose that instead of submitting a cleanup commit you are instead submitting a piece of completely new functionality. In this MR you have chosen to reimplement a piece of standard library code (for no actual gain, just because you were not aware of its existence). The review comment that you should get is "You are reimplementing stdlib functionality, delete that code here and use the standard library instead". This is a valid review comment and something that should be heeded.

The weird thing here is that this is in a way the exact same change, but it is either not acceptable or absolutely necessary depending on whether parts of the code are already inside your repo or not. This is weird and should not be the case, but human beings are strangely irrational and their "value functions" are highly asymmetric. This can lead to lots of review fighting if one person really wants to fix the issue whereas some other one does not see the value in it. The only real solution is to have a policy on this, specifically that submitting a change that fixes an issue that would be unacceptable in new code is, by itself, a sufficient reason to do the work but not to merge it without technical review. This shifts the discussion from "should this be done at all" to "what are the technical risks and merits of this change", which is the way reviews should be done.

I should emphasize that this does not mean that you should go and immediately spend 100% of your time fixing all these existing issues in your code base. You could, but probably it is not worth it. What this guideline merely says that if you happen to come across an issue like this and if you feel like fixing it then you can do it and reviewers can't block you merely by "I personally don't like this" line of reasoning.

A nastier version

Suppose again you are a person who cares about the quality of your work. That you want to write code that is of sufficiently good quality and actually care about things like usability, code understandability and long term maintainability. For you people there exists a blocker comment that is much worse than the one above. In fact is the single biggest red flag for working conditions I have ever encountered:

But we already have [functionality X] and it works.

This question does have a grain of truth in it, existing code does have value. Sadly it is most commonly used as a convenient way to hardblock the change without needing to actually think about it or having to explain your reasoning.

Several times I've had to make some change into existing code and hours and days of debugging later found that the existing self written code has several bugs that a stdlib implementation definitely would not have. After reporting my findings and suggesting a cleanup the proposal has been knocked out with reasoning above. As a consultant I have to remain calm and respect the client's decision but the answer I have wanted to shout every time is: "No it's not! It's completely broken. I just spent [a large amount of time] fixing it because it was buggy and looking at the code makes me suspect that there are a ton of similar bugs in it. Your entire premise is wrong and thus every conclusion drawn from it is incorrect."

And, to reiterate, even this is not a reason by itself to actually go and change the code. It might get changed, it might not. The point is the prevailing attitude around minor fixups and how it matches your personal desire of getting things done. If you are a "fixing things proactively makes sense" kind of person you are probably not going to be happy in a "let's hide under the bed and pretend everything is fine" kind of organization and vice versa.

No comments:

Post a Comment