Talk About Quality

Tom Harris

Archive for November 2008

Code Quality and the Machine

with one comment

I’m reading and excellent book Expert C Programming: Deep C Secrets, by Peter Van Der Linden. It’s the book all C programmers need, because it’s an explanation of why this ever-popular language works (or doesn’t work) the way it does. It also prompts me to review why “code quality” is necessary and what it is.

Code Quality

Ways of writing code that affect software maintenance time and correctness (the “people side”), and that affect computer execution performance and correctness (the “machine side”).

Naturally, it follows that good quality code is code which is written so that maintenance is easy and execution is fast, efficient, and correct.

Today, for a change, I’d like to talk about the “machine side” of things. Re-reading about the details of C, a language known for being high-level but “close to the machine”, made me want to review, from the bottom up, what a computer is, so that code, and code quality, can be placed in context.

I’m taking a big risk offering these definitions without looking them up (I may do that later), but here goes. I am trying to give only the essentials—the absolute minimum required to define the terms. Even though I am an electronics engineer, I have deliberately left out the word “electronic” as an unnecessary popularization of one application of electricity. At the same time, apologies in advance to physicists and chemists who will notice my skipping over their levels of mechanics and electricity. Keeping it simple here.

Machine

A thing which allows action at a distance. Generally has a defined input-output function: person does this to it, and it produces that response to the action.

Simple Machine

There’s a famous short list of them out there—here’s a fun example. To name just one, a lever: press this down over here, and over there, that goes up.

State Machine

A machine that has more than one state, or position of its parts, that it can be in. Specific actions take it from state to state. State machines can be mechanical. Even a see-saw is a state machine.

Clocked State Machine

A state machine that proceeds from state to state by having each state create the next action, which action is applied at the next independently-determined, regular time interval. Not suprisingly, the pendulum clock is the prototype, mechanical clocked state machine. Hence the name “clock” in computers (which we didn’t get to yet).

Electric State Machine

A state machine whose “position” is in fact the pattern of electrical charge. Even a lightbulb is an electric state machine. So is a bit of computer memory.

Computer

A clocked electrical state machine. As we will see later, this definition is enough to make it a generic machine—a machine that can do almost anything people want it to do.

Digital Computer

A computer where all the states are combinations of parts’ states which can only take N fixed integer values.

Binary Digital Computer

A computer where N is 2. Generally the two values are called 0 and 1. But of course the 0 and 1 don’t exist physically. They appear as two different charge patterns in the electrical parts of the computer.

Machine Language

A small set of binary numbers, with corresponding computer state-change responses. When a special part of the computer is forced to take on the state represented by one of these numbers (popularly called “loaded into memory”), at the next (one or a few) clock cycle(s), the computer will change to the corresponding new state. Also, a machine language is written by the computer parts manufacturer, and supplied with it.

Assembly Language

A small set of letter combinations which map 1:1 to the machine language. Exist only because most people remember letter combinations better than number combinations.

Computer Program

A list of combinations of language elements (“statements”) that, when loaded into memory along with a “start” instruction, cause a computer to proceed automatically from state to state.

High-level Language

A set of words, and rules for combining them, that, when used in a computer program, which is passed through another computer program (called a “compiler” if processed all at once, or an “interpreter” if processed one word at a time), produce a machine language computer program.

Software Design (activity)

Deciding how a computer program should be organized to best cause the running program to be compiled or interpreted so that the computer will do what was required. (See “Requirements”, immediately below.)

Requirements

A set of statements, in a human natural language, each one containing “shall” or “must”, which mostly describe how a computer should respond to actions applied to it.

Well, that was a lot, but much shorter than a college textbook!

Where Code Quality Fits In

Go back and read “high-level language”. That’s where code quality fits in, and why it’s a challenge. The code must both represent the design description, and meet the constraints of the particular high-level language. Further, that language may have been written for the convenience of the compiler writer. Finally, a large part of the machine language that makes up the running program does not come from the developer’s high-level language program, but from third-party programs written by multiple hardware manufacturers. It’s like copying a painting while looking at the painting in a mirror, and looking at the canvas in another mirror.

No wonder that under those constraints, writing code that is both clear to people, and correct for the compiler, is difficult. But with the twin tools of code review and static analysis, it is possible.

Written by Tom Harris

November 25, 2008 at 10:57 am

Posted in Technology and Society

Tagged with ,

Focusing Code Review

leave a comment »

In an article Using Static Analysis to Find Bugs, in the September/October 2008 issue of “IEEE Software”, authors Nathaniel Ayewah et al conclude,

We need to develop procedures and best practices that make using static-analysis tools more effective than alternative uses of developer time, such as spending additional time performing manual code review or writing test cases.

In an otherwise fine article, just one key word–alternative–is misleading, promoting a common misconception. Static analysis tools are not an alternative to code review, nor can they replace it. Rather, static analysis tools are a guide to code review, helping the developer better use the time in “manual” code review. Of course, “manual” code review isn’t really “manual” — it’s not about the hands, but about the mind, understanding the code.

When using a static analysis tool, like the FindBugs of the article, or other types of tools, that measure and mark up the code for complexity or code coverage, the benefit is twofold. First, an automatic tool goes through a lot of code, and identifies which parts are most urgent to review. Second, the reports pose challenging questions for the code author or maintainer to answer by re-reading and explaining the code.

Where I work, on some projects at least, there is a formal policy for using static analysis tools. They are run automatically on all builds, and the results are tracked towards targets. But most important is the questions they generate: Why is this piece of code written this way? Is it what we meant? Why couldn’t the compiler, or lint tool, understand it without warning? Why does the code coverage tool think there are so many branches in these few lines of code? All these questions are generated by the tool, and mapped automatically to the code. But it still takes code review to answer the questions, and to adjust the code to make it better.

Written by Tom Harris

November 18, 2008 at 11:37 pm

Posted in Code Review