Code commenting: one of the casualties of outsourcing

During college I worked as a computer programmer intern at the National Institute of Standards and Technology. I had the opportunity to work on all sorts of nifty cutting-edge physics simulations using some serious science. Unfortunately, everything was written in VB 6, C++ .NET, or Fortran, but you can’t have it all, and .NET is actually pretty decent compared to some of the alternatives.

One of the programs I worked on was originally written by a Korean researcher working at NIST, so it technically wasn’t outsourced, but the problems I’m about to describe are relevant nonetheless. The code was rather hard to understand, especially the variable names, which followed some kind of naming convention that was completely foreign to me. Luckily, the code was actually decently commented. In Korean. Not that it would’ve helped me if I had been able to read Korean, because sometime between the original writing of the code and when it got to me, all of the nice UTF-8 comments were corrupted down to 7-bit ASCII. So they appeared as complete gibberish that nobody could understand. If you’ve ever viewed binary executable data as text, you know what I’m talking about.

My best guess is that another American maintenance programmer before me edited the program in an IDE that wasn’t set up to understand UTF-8. He must not have noticed when all of the nicely formatted Korean comments turned into gibberish, or maybe he didn’t care. Either way, by the time the comments got to me, they were thoroughly worthless. Well, not quite. Their presence at least alerted me to sections of the code that required extra attention, because they were generally non-trivial.
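For the curious, here is a rough sketch in Python of how this sort of damage can happen. It is not a reconstruction of what actually happened to that program: the source line and its Korean comment are made up for illustration, and the two failure modes shown (an editor that assumes a single-byte encoding, and a tool that keeps only 7 bits per byte) are just my guesses at plausible culprits.

```python
# Hypothetical source line; the Korean comment is an invented example.
original = "total / count  # 평균을 계산한다"

# What actually sits on disk when the file is saved as UTF-8.
utf8_bytes = original.encode("utf-8")

# Guess 1: an editor that assumes a single-byte encoding (Latin-1 here)
# turns every byte into its own character, so each Hangul syllable
# becomes three unrelated symbols.
mojibake = utf8_bytes.decode("latin-1")

# Guess 2: a tool that keeps only 7 bits per byte clears the high bit
# of every byte. The ASCII code survives, the comment does not.
stripped = bytes(b & 0x7F for b in utf8_bytes).decode("ascii")

# The first kind of damage can be undone if nobody edits the garbage.
recovered = mojibake.encode("latin-1").decode("utf-8")

# The second kind cannot: the dropped bits are gone for good.
lossy = stripped.encode("ascii") != utf8_bytes

print(mojibake)
print(stripped)
print(recovered == original, lossy)
```

The practical point is that the first kind of corruption is only a display problem until someone saves the file in the wrong encoding, while the second kind destroys information and leaves nothing to recover from.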

Code maintainability is thus one of the biggest casualties of outsourcing. If the coders you’re outsourcing to don’t speak English, or if they at least don’t bother to comment the code in English, you’ll be facing significantly higher code maintenance costs down the line. That’s just something to keep in mind. In the long run, you save money by hiring local programmers. At least that’s the official line I’m sticking with, seeing as how doing so directly benefits me (hey, did I ever say I wasn’t a biased blogger?).

5 Responses to “Code commenting: one of the casualties of outsourcing”

  1. T2A` Says:

    Binary data doesn’t have to be executable to end up gibberish in a text editor. :P

  2. arensb Says:

    “especially the variable names, which followed some kind of naming convention that was completely foreign to me.”

    It could be worse, since it sounds as though the variable names were ASCII strings, at least.

    In Perl and, I believe, C9x, any character that can appear in a word may be used in an identifier. In a Unicode world, that means the author could have used Korean identifiers, and your job would have been that much harder.

  3. arensb Says:

    T2A` wrote: “Binary data doesn’t have to be executable to end up gibberish in a text editor. :P”

    Heck, if you’ve ever edited a sendmail.cf, you know that it doesn’t even need to be binary to be gibberish.

  4. Ed Says:

    Bad comments are quite grim as well. Sometimes it takes more time to understand the comments someone else wrote than to understand their code; “proper” comments can be gibberish as well.
    I never did any serious professional programming but in my (hobby) C programs the only comments I tend to write are bad jokes, expletives, complaints about how much time it took to write and debug that, copywrong notices, etc etc. I then pass it along.
    If anyone comes back to me with comments about my comments, I know I got someone really interested in my program and do all the best I can to explain it.

  5. T2A` Says:

    Properly named variables are a big factor in code readability too, but lots of people don’t seem to notice this. The overall layout of functions, methods, and classes plays a big role as well.