Archive for the ‘Code Quality’ Category
Learning a New Programming Language, with Life (part 4)
While preparing a course syllabus on coding in Python, for an upcoming high school class, I remembered static analysis. I haven’t decided how soon to introduce the topic, but I thought I’d better check how my own sample program fared.
No warnings or errors from Python itself. (If there had been, I would have addressed them already!)
So I installed Pylint and tried its default settings. How about that. Not zero at all!
In fact, 54 coding convention messages, 12 warnings, and 1 recommendation for refactoring.
It was easy enough to clean up whitespace issues (helped to turn on “view whitespace” in my Notepad2).
And yes, many comment lines were too long. I left-justified them (but kept the indentation Python requires) for easy reading.
Some of the warnings are for my “TODO” comments: an extra reminder to do, or drop, next steps I’d identified earlier. The Pylint message count is now down to 2 informational messages, 28 coding convention messages, 13 warnings, and 1 refactoring recommendation. The remaining messages are valuable for my future Python learning (and teaching):
Locally disabling unused-variable (W0612) (locally-disabled)
I disabled those warnings because I know the code needs those variables. But I’ll have to explain why.
(Pylint doesn’t forget: the 2 new informational messages remind me that I’ve suppressed two warnings.)
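For reference, a local suppression looks roughly like this. It is a minimal sketch, not an excerpt from my program: `grid_size` and `column_count` are hypothetical names, and only the `# pylint: disable=unused-variable` comment is actual Pylint syntax.

```python
def grid_size():
    """Return a hypothetical (columns, rows) pair."""
    return 40, 20


def column_count():
    """Return only the number of columns."""
    # 'rows' has to be unpacked to match the tuple but is never read,
    # which is the kind of case unused-variable (W0612) is meant to flag.
    # The trailing comment disables the message for this line only.
    columns, rows = grid_size()  # pylint: disable=unused-variable
    return columns


print(column_count())
```

A trailing comment like this applies to a single line, which keeps the suppression next to the code it excuses and leaves room for a note explaining why.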
Invalid attribute name “maxRowMain” (invalid-name)
Not just that name. Most of my object names. I’ll have to find a good object-naming convention and use it.
R: 69, 4: Too many branches (13/12) (too-many-branches)
Just today I heard a Python lecturer on YouTube say, “if you’re not refactoring, you’re not learning.” Yes, that function is the longest. Not so complex, but could be simpler and easier to understand.
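One common way to reduce branch count is to replace a chain of if/elif tests with a lookup. The sketch below is hypothetical and is not the function Pylint flagged: it uses Conway's Game of Life rules as a stand-in, and the names `LIVE_NEXT_GENERATION` and `next_cell_state` are invented for illustration.

```python
# (currently_alive, live_neighbors) combinations that produce a live cell.
# Every other combination produces a dead cell.
LIVE_NEXT_GENERATION = {
    (True, 2),   # survival with two neighbors
    (True, 3),   # survival with three neighbors
    (False, 3),  # birth with exactly three neighbors
}


def next_cell_state(currently_alive, live_neighbors):
    """Return True if the cell is alive in the next generation."""
    # A single membership test replaces a branch per rule.
    return (currently_alive, live_neighbors) in LIVE_NEXT_GENERATION


print(next_cell_state(True, 2), next_cell_state(True, 4))
```

The rules now sit in one table that can be read at a glance, and the function body has no branches for Pylint to count.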
Attribute ‘maxRowMain’ defined outside __init__ (attribute-defined-outside-init)
A few of those also. I will have to go back and learn again about __init__: when to use it and why.
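The last two messages can be addressed together. In this minimal sketch the class `LifeBoard` and its methods are hypothetical; only the attribute name is borrowed from the message above and respelled in the snake_case style that Pylint's default naming rules expect.

```python
class LifeBoard:
    """Hypothetical board that tracks its highest valid row index."""

    def __init__(self, row_count):
        # Giving every attribute an initial value here means a reader
        # (and Pylint) can see the object's full state in one place.
        self.max_row_main = row_count - 1

    def resize(self, row_count):
        """Update the highest row index after the board changes size."""
        # Assigning to an attribute that __init__ already created does
        # not trigger attribute-defined-outside-init.
        self.max_row_main = row_count - 1


board = LifeBoard(10)
board.resize(5)
print(board.max_row_main)
```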
All in all, a good learning session, and direction on what I need to learn next.
Thanks to static analysis with Pylint.
Very Short Coding Standard
I’ve read any number of coding standards. They’re usually long, too detailed for any developer to memorize, and not automatically testable. So I was pleased to see Gerard Holzmann’s attempt at something much shorter, in the June 2006 issue of Computer Magazine: The Power of 10: Rules for Developing Safety-Critical Code.
Holzmann is the lead technologist at JPL’s Laboratory for Reliable Software, so his proposal is definitely worth reading.
But coding standards always bring up arguments about what should or shouldn’t be included, as well as doubts about how to implement them and at what cost.
There’s even a question, in a project that has neither code reviews nor static analysis, of which to start first.
In that spirit, then, I offer this Very Short Coding Standard (VSCS) as a possible starting point for projects that need to “do something”. Just two items: one from the human side of things, which requires code review to verify, and one based on a static analysis tool that everyone already has installed.
VSCS Rule #1: Use Meaningful Entity Names
VSCS Rule #2: Maintain Zero Compiler Warnings
My experience tells me that you get a big improvement in readability if you can come up with entity names that are obvious to reviewers as well as to yourself. For the power of compiler warnings, read Holzmann’s Rule 10.
Failure to meet either should break the build and be handled immediately by refactoring.
VSCS is short, but not as easy to deploy as it might seem.
If your organization thinks it should do much more, first get this working and then go further.
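Rule #2 is phrased for compiled languages. As a rough Python analog only, the standard `warnings` module can promote warnings to exceptions, so that a warning stops the run instead of scrolling past. The functions `legacy_area` and `run_strict` below are hypothetical names for illustration.

```python
import warnings


def legacy_area(width, height):
    """Hypothetical function that still works but emits a warning."""
    warnings.warn("legacy_area is deprecated", DeprecationWarning)
    return width * height


def run_strict(func, *args):
    """Call func with every warning promoted to an exception."""
    with warnings.catch_warnings():
        # "error" turns any warning raised inside this block into an
        # exception; the previous filters are restored on exit.
        warnings.simplefilter("error")
        return func(*args)


try:
    run_strict(legacy_area, 2, 3)
except DeprecationWarning as warning:
    print("build broken:", warning)
```

The same policy can be applied to a whole run with `python -W error`, which is closer in spirit to a compiler's warnings-as-errors switch.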
Good Code is Specific
I caught myself gazing at the cover of Robert L. Glass' updated edition, Software Conflict 2.0: The Art and Science of Software Engineering. (Reviewers say that the book is good and thought-provoking. It is, though I don't know if anyone meant even the cover.) What's on the cover is an off-center photograph of a code listing. I doubt he actually expected anyone to read the code. But I did. With all due credit to Glass, the sample reads, in part:
if(find_func(token)) { /*rich …
call();
[…]
}
else *value = find_v …
get_token();etc.
Judging by the one specific word — "token" — in the code snippet, it's probably part of a parser or compiler of some sort. Or a generic sample written to look that way. But that's just the point: most of the code in software books we expect to learn from is generic. The variable names are bland and neutral, presumably in order to make the example as general as possible.
This is a problem, because real code is specific.
Any software that, to use a popular expression, "adds value", does so precisely because the developer wrote specific functionality to solve someone's problem today.
Wouldn't you expect the source code for such software to reflect that goal?
Digression. Consider furniture. I often do because, at least in our neighborhood, every street seems to have at least one dumpster filling up with out-of-fashion kitchen cabinets and bathroom fixtures. The old making way for the new. It's the age of renovations!
When I was growing up, furniture was solid wood, and purchased as individual pieces chosen for beauty and purpose. As a child, I understood that you had to be very well off to buy custom-made furniture. The rest of us took what was in the store window.
Now things are different. I watch my wife choose the new bathroom sink cabinet, or closet for the kids' bedroom. Modern furniture is made out of malleable components (plywood veneer) cut to fit the exact shape and layout of that small corner of the room where we can fit furniture and still be able to walk around the room.
As I looked at the new cabinet, with its outline following the corner of the wall, and its cookie-cutter back with an opening for otherwise hidden water piping, I realized that I had never learned to build a cabinet like that. Specific. Non-rectangular. Adjusted to fit.
We also went shopping for a new couch (after 10 years of a fine couch that the dog thought we'd chosen to be his bed). There too I got an education. A good couch is custom-built as well: choices of material, thickness and angle of the back, shape of the feet.
Real furniture is specific. Even if it doesn't look custom-built, it is.
I didn't learn that in shop class. We learned how to make square tables, rectangular bird houses. The most complex project maybe was a chess board. Lots of little squares fit into a square frame. After that, I copied what I saw in catalogs, using the same generic patterns. This training did not make me into a professional furniture-maker. Not at all.
Back to software. A lot of code looks like my chessboard. Neat. Functional. Lots of generic variable and function names like "ret_val" and "MessageHandler". Surprisingly hard to understand and maintain, though, because there's little clue, from reading the code, about what the software is supposed to do.
What's the solution? I'm not sure. I know what the end result looks like: code that contains meaningful, specific names. Like "ParseSubjectField" in a spam filter, or "BoilOverTemperature" in a nuclear reactor system.
Maybe a good start is:
- Use meaningful, that is to say, specific, names for things in your code
- Read other people's real code and, where you don't understand, ask, and suggest specific names
- Find a good carpenter and follow him (or her) around
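To make the contrast concrete, here is a small hypothetical sketch of the same logic written twice. Neither function comes from real code; `parse_subject_field` borrows the spam-filter name mentioned above, respelled in Python's snake_case, and the header format is an assumption.

```python
# Generic: nothing in the names says what problem is being solved.
def handle(msg):
    ret_val = msg.split(":", 1)[1].strip()
    return ret_val


# Specific: the names state the job, assuming a "Subject: ..." header line.
def parse_subject_field(header_line):
    subject_text = header_line.split(":", 1)[1].strip()
    return subject_text


print(parse_subject_field("Subject: You have won a prize"))
```

Both functions behave identically; only the second tells a reviewer what the software is supposed to do.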