Talk About Quality

Tom Harris

Archive for the ‘Maintenance’ Category

What’s the difference?

What kind of tool is required for code review? What are we trying to accomplish? How much of the code must be reviewed at a time, and why? These are some of the questions that we get caught up in, and that prevent us from deploying effective code review in a development organization. One way to cut through the arguments and move forward is to realize that there are different types of code review, with different goals. Here are some parameters of two major but different types:

Code Review type 1

Key question: Does this change accomplish what it was supposed to, while doing no damage?
Desired code author response to comments: Revise changes
Under review: changed files with diff
Tool: diff comparer
Required feature as a code review tool: right-click submit comment
Nice-to-have features (but could be in a separate app): right-click definitions

Code Review type 2

Key question: Is this code designed well, so that it is easy to fix and extend?
Desired code author response to comments: Refactoring / renaming
Under review: any subset of entire codebase, starting from a given area of the code
Tool: source code browser
Required feature as a code review tool: right-click submit comment
Nice-to-have features (but could be in a separate app): code editing
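
To make the "changed files with diff" of type 1 concrete, here is a minimal sketch using Python's standard difflib module. The file name and the C fragment are invented for illustration; a real review tool would take them from version control.

```python
import difflib

def review_diff(old_text, new_text, name="example.c"):
    """Return a unified diff of two versions of a file, as a reviewer would see it."""
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=f"a/{name}",   # hypothetical path labels
        tofile=f"b/{name}",
    )
    return "".join(diff)

# Invented example: the change fixes an off-by-one loop bound.
old = "int total = 0;\nfor (i = 0; i <= n; i++)\n    total += a[i];\n"
new = "int total = 0;\nfor (i = 0; i < n; i++)\n    total += a[i];\n"
print(review_diff(old, new))
```

The reviewer of a type 1 review only needs the removed and added lines plus a little context to answer the key question. A type 2 review has no such diff, which is why it calls for a source code browser instead.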

Written by Tom Harris

November 17, 2009 at 12:01 pm

Product Quality: 3-in-1

It’s pretty well known that there are three ways to improve software quality: improve process, improve methods, and improve tools. But these are the generic concepts — applicable to any software development effort. Today’s topic is more specific and closer to home. Product quality.

Isn’t it obvious? The software product is the running software, and product quality is how well it meets its requirements. Simply put — does it do what people expect it to do, and not crash or lose data?

But organizations which release software focusing only on that one aspect of product quality are often unpleasantly surprised by inconsistency. They are doing all the right things — following a good process, using good methods and tools, and regularly reviewing all those to see what can be improved. Yet sometimes the product is better, sometimes worse. Why?

External product quality (in software, how well the software serves its users when it’s running) is not enough. There’s also internal quality: the readability of the codebase. Code that’s well written, meaning easy to understand quickly, is easier to maintain, which means fewer programmer errors and fewer schedule slips. We can add to that clear and organized debug output, which avoids false positives and gives an accurate record of when the running software seemed to be fine but was really operating on the edge, just barely recovering from failures.

But there’s another part of the product that’s often overlooked. Overlooked when reorganizing teams, or just when parceling out the daily assignments from multiple projects. The developers. If you take a single iteration of a software product, with development at its best, you’ll apply best process, methods, and tools to put out the next version of the software, and maybe even have codebase improvement (cleanup, refactoring) on the list too. Get to release and you deliver good running software, more maintainable code and … the new state of the team that developed it.

The team members may have learned new methods or technologies. Or just how to avoid pitfalls from the past. Equally important, that particular team learned how to work together — not in general, but specifically in today’s process, methods & tools, and codebase. And with each other. This third part of the product cannot be ignored or thrown away, just as nobody would think to delete a code branch delivered to the customer, or switch out a working library in mid-project.

Looking further at this third component of the product — the team that delivered it — we see clearly that those quality improvement concepts of process, methods, and tools do not exist in a vacuum. They are not just abstract ideas. Rather, they exist in the team members who develop a product, and are unified by teamwork — which means this team learning to work together with these people on this product.

So when you think of product quality improvement, plan to improve all of these:

  1. How the product works
  2. How the codebase looks
  3. How the team works

And at the end of an iteration, make sure to review and protect all three. That’s the guarantee that the next version of the product will be even better.

Written by Tom Harris

July 18, 2009 at 11:26 pm

The Only Valid Measure of Code Quality

It’s Thom Holwerda — keeping things simple for us:

Simple Code Quality Metric

Here’s his site — with a search for “code quality” (warning: if you can’t ignore “Evony” web game ad graphics, stay away).

Written by Tom Harris

July 16, 2009 at 4:13 am

Software Developer

That’s right, not “development” but “developer”.

The latest issue of The Embedded Muse newsletter—#179 (pdf)—has a reference to an interesting and very different kind of book about C. It’s about the business pressures and the thought processes of C software developers. And a lot more. It’s also long—1600 pages! So download a copy and skim for what interests you:

The New C Standard: An Economic and Cultural Commentary

Some lines that caught my eye, just in the first 100 pages, skimming with the “Page Down” key:

“Software developer interaction with source code occurs over a variety of timescales [for example, those] whose timescale are measured in seconds.”

“The act of reading and writing software has an immediate personal cost.”

“Source code faults are nearly always clichés; that is, developers tend to repeat the mistakes of others and their own previous mistakes.”

“A list of possible deviations should be an integral part of any coding guideline.”

“The following are different ways of reading source code, as it might be applied during code reviews …”

Other related references from the book’s author Derek M. Jones (http://www.knosof.co.uk/):

And somewhat related, from another author, Les Hatton (http://www.leshatton.org/):

Code Quality and the Machine

I’m reading an excellent book, Expert C Programming: Deep C Secrets, by Peter van der Linden. It’s the book all C programmers need, because it explains why this ever-popular language works (or doesn’t work) the way it does. It also prompts me to review why “code quality” is necessary and what it is.

Code Quality

Ways of writing code that affect software maintenance time and correctness (the “people side”), and that affect computer execution performance and correctness (the “machine side”).

Naturally, it follows that good quality code is code which is written so that maintenance is easy and execution is fast, efficient, and correct.

Today, for a change, I’d like to talk about the “machine side” of things. Re-reading about the details of C, a language known for being high-level but “close to the machine”, made me want to review, from the bottom up, what a computer is, so that code, and code quality, can be placed in context.

I’m taking a big risk offering these definitions without looking them up (I may do that later), but here goes. I am trying to give only the essentials—the absolute minimum required to define the terms. Even though I am an electronics engineer, I have deliberately left out the word “electronic” as an unnecessary popularization of one application of electricity. At the same time, apologies in advance to physicists and chemists who will notice my skipping over their levels of mechanics and electricity. Keeping it simple here.

Machine

A thing which allows action at a distance. Generally has a defined input-output function: person does this to it, and it produces that response to the action.

Simple Machine

There’s a famous short list of them out there—here’s a fun example. To name just one, a lever: press this down over here, and over there, that goes up.

State Machine

A machine that has more than one state, or position of its parts, that it can be in. Specific actions take it from state to state. State machines can be mechanical. Even a see-saw is a state machine.
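
As a rough sketch of this definition, the see-saw can be written as a table of states and actions. The state and action names below are invented for illustration.

```python
# Transition table: (current state, action) -> next state.
# State and action names are hypothetical labels for the see-saw.
SEESAW = {
    ("left_down", "push_right"): "right_down",
    ("right_down", "push_left"): "left_down",
}

def step(state, action):
    """Apply one action; an action with no table entry leaves the state unchanged."""
    return SEESAW.get((state, action), state)

state = "left_down"
for action in ["push_right", "push_right", "push_left"]:
    state = step(state, action)
print(state)  # left_down
```

Pushing down on the side that is already down changes nothing, which is why the second "push_right" has no effect.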

Clocked State Machine

A state machine that proceeds from state to state by having each state create the next action, which is then applied at the next independently determined, regular time interval. Not surprisingly, the pendulum clock is the prototypical mechanical clocked state machine. Hence the name “clock” in computers (which we haven’t gotten to yet).
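
A minimal sketch of the clocked version, assuming a two-state pendulum clock. The state and action names are invented, and the loop only stands in for a real, independent clock.

```python
# Each state creates the action that will be applied at the next clock interval.
NEXT_ACTION = {"tick": "swing_right", "tock": "swing_left"}
# Applying an action produces the next state.
APPLY = {"swing_right": "tock", "swing_left": "tick"}

def run(state, ticks):
    """Advance the machine one state per clock interval and record the history."""
    history = [state]
    for _ in range(ticks):          # stand-in for the regular, independent clock
        action = NEXT_ACTION[state]
        state = APPLY[action]
        history.append(state)
    return history

print(run("tick", 4))  # ['tick', 'tock', 'tick', 'tock', 'tick']
```

Unlike the see-saw, nobody has to supply the actions from outside: the machine keeps going for as long as the clock does.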

Electric State Machine

A state machine whose “position” is in fact the pattern of electrical charge. Even a lightbulb is an electric state machine. So is a bit of computer memory.

Computer

A clocked electric state machine. As we will see later, this definition is enough to make it a generic machine: one that can do almost anything people want it to do.

Digital Computer

A computer in which each part can take only one of N fixed integer values, so that every state of the computer is a combination of those values.

Binary Digital Computer

A computer where N is 2. Generally the two values are called 0 and 1. But of course the 0 and 1 don’t exist physically. They appear as two different charge patterns in the electrical parts of the computer.

Machine Language

A small set of binary numbers, with corresponding computer state-change responses. When a special part of the computer is forced to take on the state represented by one of these numbers (popularly called “loaded into memory”), at the next (one or a few) clock cycle(s), the computer will change to the corresponding new state. A machine language is defined by the manufacturer of the computer parts, and is supplied with them.

Assembly Language

A small set of letter combinations which map 1:1 to the machine language. Exist only because most people remember letter combinations better than number combinations.
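
The 1:1 mapping can be sketched with a toy instruction set. The four mnemonics and their opcode values below are invented and do not belong to any real processor.

```python
# Hypothetical 4-instruction machine: mnemonic -> 2-bit opcode.
OPCODES = {"LOAD": 0b00, "ADD": 0b01, "STORE": 0b10, "HALT": 0b11}
# Because the mapping is 1:1, it can be inverted without losing anything.
MNEMONICS = {code: name for name, code in OPCODES.items()}

def assemble(lines):
    """Translate mnemonics to opcodes, one for one."""
    return [OPCODES[m] for m in lines]

def disassemble(codes):
    """Translate opcodes back to mnemonics, one for one."""
    return [MNEMONICS[c] for c in codes]

program = ["LOAD", "ADD", "STORE", "HALT"]
machine_code = assemble(program)
print(machine_code)  # [0, 1, 2, 3]
```

Disassembling the result gives back the original list, which is the practical meaning of "1:1" here.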

Computer Program

A list of combinations of language elements (“statements”) that, when loaded into memory along with a “start” instruction, cause a computer to proceed automatically from state to state.
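
As an illustration only, here is a toy machine that proceeds automatically from state to state once a program is loaded. The instruction names and the accumulator design are assumptions made for the sketch, not a description of real hardware.

```python
def run(program):
    """Execute a list of (instruction, operand) pairs on a toy accumulator machine."""
    pc, acc = 0, 0                  # machine state: program counter and accumulator
    while True:                     # "start": from here on, no outside action is needed
        op, arg = program[pc]
        if op == "HALT":
            return acc
        if op == "LOAD":
            acc = arg
        elif op == "ADD":
            acc += arg
        else:
            raise ValueError(f"unknown instruction: {op}")
        pc += 1                     # each executed statement leads to the next state

print(run([("LOAD", 2), ("ADD", 3), ("ADD", 5), ("HALT", 0)]))  # 10
```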

High-level Language

A set of words, and rules for combining them, used to write a computer program that is then passed through another computer program to produce a machine language computer program. That other program is called a “compiler” if it processes the whole program at once, or an “interpreter” if it processes one statement at a time.

Software Design (activity)

Deciding how a computer program should be organized so that, once it is compiled or interpreted and running, the computer will do what was required. (See “Requirements”, immediately below.)

Requirements

A set of statements, in a human natural language, each one containing “shall” or “must”, which mostly describe how a computer should respond to actions applied to it.

Well, that was a lot, but much shorter than a college textbook!

Where Code Quality Fits In

Go back and read “high-level language”. That’s where code quality fits in, and why it’s a challenge. The code must both represent the design description, and meet the constraints of the particular high-level language. Further, that language may have been designed for the convenience of the compiler writer. Finally, a large part of the machine language that makes up the running program does not come from the developer’s high-level language program, but from third-party programs written by multiple hardware manufacturers. It’s like copying a painting while looking at the painting in a mirror, and looking at the canvas in another mirror.

No wonder that under those constraints, writing code that is both clear to people, and correct for the compiler, is difficult. But with the twin tools of code review and static analysis, it is possible.

Read Before Running

Debugging: How to Start (commonly accepted method)

Search for “debugging” on the internet and most entries will tell you the first step is to reproduce the problem (properly called “the software failure”) and the second step is to create the simplest configuration that also reproduces the problem.

I agree that it is essential to reproduce the problem, because you need to know you’re working on the right problem.

Fun to Run, or “the thrill of the chase”

But programmers, myself included, like to run things—to let the computer do the work—so there’s a great temptation to reproduce the problem, change something, run the software, add some “debugging statements”, and run it again. And there goes an hour of valuable time.

Does readability fit in here somewhere?

If the code is written clearly, and by that I mean that it’s readable, with meaningful variable names and the proper levels of abstraction, then the second step might be to actually read through the code.

A short debugging story

Today, I showed myself both the wrong way, and the right way, to respond to a “bug report”.

Where I work, I’ve turned out to be the owner of a small set of text-parsing scripts, written in awk. We use them to pull out compiler warning messages from the long build logs, so that developers can see them and address them. These parsing scripts are short—half a page of code including comments—and are tested only with the few build logs that I’ve had time to try. Given limited time (it is not my main job to write parsing scripts!), I have to hope that all the build logs for a given compiler are pretty similar, so whatever pattern I’ve identified will be followed in other build logs. It isn’t always so.

A developer noticed (first by eye, confirmed with a simple “grep”) that there were more compiler warnings in his build log than in the parsed csv (comma-separated-value) output file. Not good. I set about debugging!
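
The developer's check can be sketched as follows. The log format, the CSV layout, and the function names are all assumptions for illustration; the real build logs and scripts are not shown here.

```python
import csv
import io
import re

# Assumed marker: any line containing the word "warning" counts as a warning.
WARNING = re.compile(r"\bwarning\b", re.IGNORECASE)

def count_log_warnings(log_text):
    """Count warning lines in the raw build log (the "grep" side of the check)."""
    return sum(1 for line in log_text.splitlines() if WARNING.search(line))

def count_csv_rows(csv_text):
    """Count non-empty rows in the parsed output, assuming one row per warning."""
    return sum(1 for row in csv.reader(io.StringIO(csv_text)) if row)

# Invented sample data: two warnings in the log, only one in the parsed output.
build_log = (
    "cc -c main.c\n"
    "main.c:10: warning: unused variable 'x'\n"
    "cc -c util.c\n"
    "util.c:4: warning: implicit declaration of function 'foo'\n"
)
parsed_csv = "main.c,10,unused variable 'x'\n"

missing = count_log_warnings(build_log) - count_csv_rows(parsed_csv)
print(missing)  # 1
```

A nonzero difference is the failure being reported: the parser is dropping warnings somewhere.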

It took about 5 minutes to reproduce the problem. Really just gathering the sample input and bad output files, and setting up a copy of the relevant scripts. One run (takes about 2 seconds) showed that, indeed, some warnings were missing.

My next step, by the standard debugging technique, was to start printing all the partial results, run the script over and over again, and look to see where it was failing. Then some more time to adjust the failing script and retest.

Haven’t I seen this somewhere before?

After the first 5 minutes (reproducing the problem), I already had a pretty good idea of which script was failing. A sed one-liner that adds newlines after each compilation command line, and before the first warning message, so that the awk script would be able to separate the first warning message as a record and select it. What was odd was that when I opened that script (just one executable line—the rest is comments including change history), the last modification comment was about how I had fixed exactly the same problem before. Briefly I wondered about that, but rushed onward to more than half an hour of running parts of the script pipeline to prove to myself exactly what was failing and why.

I could have saved myself the time.

Debugging: How to Start (the code-reading method)

After reproducing the initial problem (again, just 5 minutes), and arriving at the suspect script with the worrisome comment, I should have stopped running the code, and started reading it. After all, if my last modification was to fix this same problem, then apparently that modification was either wrong, or insufficient.

It turned out, and this was visible by comparing the one-line script and the new build log on which it failed, that it was insufficient. The new build log had some extra spaces at the end of the compilation command line, and the sed one-liner, designed to identify such lines, was unprepared for the extra spaces.

It did take me another 15 minutes to verify that most new build logs had a random number of extra spaces, and thus to pick the right two-character adjustment to the regular expression. Regression testing took another half hour.
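
The actual sed expression is not reproduced here, so the following is only a guess at the shape of the problem, written in Python. It assumes compilation command lines start with "cc " and end with ".c", and that the fix is to allow optional trailing spaces before the end of the line.

```python
import re

# Hypothetical patterns: the strict one expects ".c" to be the last thing on the line.
STRICT = re.compile(r"^cc .*\.c$", re.MULTILINE)
# The tolerant one adds " *" so any number of trailing spaces is accepted.
TOLERANT = re.compile(r"^cc .*\.c *$", re.MULTILINE)

def split_after_commands(log_text, pattern):
    """Add an extra newline after every line matching pattern, like the sed step."""
    return pattern.sub(lambda m: m.group(0) + "\n", log_text)

# Invented log fragment with trailing spaces after the command line.
log = "cc -c main.c   \nmain.c:10: warning: unused variable\n"

# The strict pattern misses the command line, so the log comes back unchanged.
assert split_after_commands(log, STRICT) == log
# The tolerant pattern separates the command line from the first warning.
print(split_after_commands(log, TOLERANT))
```

Comparing the two patterns against one failing input line is exactly the kind of check that reading the one-line script next to the new build log would have suggested.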

But I could have saved myself almost an hour in the middle of “debugging-by-running” if I had applied a few minutes of “debugging by reading the code”.

Even better would have been reading it out loud, or reading and explaining it to someone else.

Written by Tom Harris

February 27, 2008 at 12:04 am

Looking Back, Looking Forward

Some return for your U.S. taxpayer dollars:

CrossTalk, The Journal of Defense Software Engineering is an approved Department of Defense journal. CrossTalk’s mission is to encourage the engineering development of software in order to improve the reliability, sustainability, and responsiveness of our warfighting capability and to inform and educate readers on up-to-date policy decisions and new software engineering technologies.

Whatever you may think about “warfighting”, CrossTalk sometimes has some really good articles, and even some great ones. Here are two, from the December 2007 issue:

Very good, about software maintenance:

Geriatric Issues of Aging Software

Great, about moving beyond “escaped defects” to defect prevention: 

Advancing Defect Containment to Quantitative Defect Management

Both are thorough and comprehensive—worth reading!

Written by Tom Harris

November 27, 2007 at 7:52 pm