Talk About Quality

Tom Harris

Archive for the ‘Static Analysis’ Category

Code Review for People

with 2 comments

Sometimes I get the feeling that when people consider and talk about code review, they’re talking about different things but they don’t know it. I see code review everywhere during coding.

Try this: put aside the image of code review as a meeting, or even code review as an independent activity. Limit your description of code review to: someone else is reading your code, you’re discussing it together, and there’s effective writing as a result. If you see those three things happening, you’re seeing code review.

Debugging

The great hope was that code review would be solely a preventive activity: do code review before execution, and there won’t be anything to debug. That’s not going to happen. Too much code is delivered under pressure, without code review. But all is not lost. Just review the code when debugging. Look again at the three steps — how does code review, as defined above, change debugging? First, don’t debug alone. Whatever you may think of pair programming for or against, pair debugging is a must. Especially if you wrote the code, having someone else look for the problem, or better, restate the problem, is essential. Second — discussion. Debugging that rushes to a fix often produces the wrong fix. And finally, effective writing. Sure there’s the writing which is fixing the code. But what about adding comments? And perhaps more important and effective, especially if it’s a quick fix: submit a new defect or change request to your tracking system for what really needs changing to reduce the chance of this failure recurring.

Static Analysis

Tools! If I have tools that can find problems, I won’t have to review code. Yes, you will. And you should want to. Static Analysis tools are great for finding simple errors, and potential defects. For narrowing the field of focus. But then, to find out what’s really wrong … code review. Additional benefit of the tool — since it has no feelings, it can be two of you against the reviewer: read the code with a partner, discuss why the tool thinks there’s a problem and whether you agree, and again, effective writing. Not just the fix, but why that fix, and also perhaps an adjustment to the tool so it will do better next time. High-end static analysis systems let you document that right inside the tool.

Testing

Testing, and its partner, code coverage, are often seen as the opposite of code review. Whatever I couldn’t catch by code review (and static analysis) is left for exercising the code under controlled conditions, causing failures, and fixing them. Actually code review is the most important part of developer testing. Not only reviewing the code under test in order to design good white-box tests. But also reviewing the code based on the test results and coverage results. Why does the software intermittently fail this test? Why is it so difficult to cover all the branches in this function? Again, read together, discuss, write. Here the effective writing is likely to be refactoring the code. If someone wants code review “documentation” — give them a diff and a one-sentence explanation of what you did and why.

Code Review is All-Purpose

There are many more activities in the developer’s daily life. Detailed design, branching and merging, delivery — all can benefit from in-line code review with this three-step model of reading, discussion, and effective results-writing. And by acknowledging this model, we see that code review, or almost code review, is happening all the time. Just have to complete the steps to get code review that real people will do.

Thinking about automated testing

leave a comment »

Google’s “Mr. Automated Testing”, Miško Hevery, talks about Software Testing Categorization. From the title, it doesn’t sound like much, but it prompted a lot of good discussion. Some excerpts on key points, edited for length and clarity:

Know what you’re talking about

You hear people talking about small/medium/large/unit/integration/functional/scenario tests but do most of us really know what is meant by that? — Miško Hevery

Need for speed

Let’s start with unit test. The best definition I can find is that it is a test which runs super-fast (under 1 ms) and when it fails you don’t need debugger to figure out what is wrong. — Miško Hevery

Make slow tests faster by running them in the background on spare machine cycles. Works for static analysis tools too. For example, Riverblade’s Visual Lint add-on for Visual Studio and Gimpel Software’s PC-lint. — Talk About Quality

I heard a senior test manager present at a conference who espoused the sophistication of his automated test regime (on a mainframe no less). Countless tests ran every evening, unattended. Lots of clapping in the audience. In the break I met a guy who worked for the presenter. He asked, “do you want to know why the tests run so fast? We took out all the comparison checks”. — Paul Gerrard

You’re right to have a healthy concern. Whether automated tests run in a millisecond or in several hours, they always have to be re-reviewed to see if they test something useful. Passing tests are especially dangerous. Everyone thinks green is great so they don’t look at the tests to discover that features have moved on and the tests don’t test anything. Automated tests are like automatic shift (does anyone still drive stick like I do?): very nice and we take it for granted, but when you press on the gas you still have to look where you’re going so you don’t cause an accident. — Talk About Quality

Test Planning applies to automated testing too

At my previous employer, the standard unit tests took upwards of an hour (for fewer than 2000 tests). This discouraged developers from running them, which resulted in broken builds for everyone. When we worked on fixing this, mock objects provided most of the salvation. However, a good portion of the tests were really integration tests and by correctly identifying them as such, we were able to remove them from the standard suite, and instead run them nightly. What Misko is missing from his post is how test classifications go hand in hand with scheduling tests. — TheseThingsNeedFixed

Written by Tom Harris

July 23, 2009 at 11:56 pm

Code Quality and the Machine

with one comment

I’m reading and excellent book Expert C Programming: Deep C Secrets, by Peter Van Der Linden. It’s the book all C programmers need, because it’s an explanation of why this ever-popular language works (or doesn’t work) the way it does. It also prompts me to review why “code quality” is necessary and what it is.

Code Quality

Ways of writing code that affect software maintenance time and correctness (the “people side”), and that affect computer execution performance and correctness (the “machine side”).

Naturally, it follows that good quality code is code which is written so that maintenance is easy and execution is fast, efficient, and correct.

Today, for a change, I’d like to talk about the “machine side” of things. Re-reading about the details of C, a language known for being high-level but “close to the machine”, made me want to review, from the bottom up, what a computer is, so that code, and code quality, can be placed in context.

I’m taking a big risk offering these definitions without looking them up (I may do that later), but here goes. I am trying to give only the essentials—the absolute minimum required to define the terms. Even though I am an electronics engineer, I have deliberately left out the word “electronic” as an unnecessary popularization of one application of electricity. At the same time, apologies in advance to physicists and chemists who will notice my skipping over their levels of mechanics and electricity. Keeping it simple here.

Machine

A thing which allows action at a distance. Generally has a defined input-output function: person does this to it, and it produces that response to the action.

Simple Machine

There’s a famous short list of them out there—here’s a fun example. To name just one, a lever: press this down over here, and over there, that goes up.

State Machine

A machine that has more than one state, or position of its parts, that it can be in. Specific actions take it from state to state. State machines can be mechanical. Even a see-saw is a state machine.

Clocked State Machine

A state machine that proceeds from state to state by having each state create the next action, which action is applied at the next independently-determined, regular time interval. Not suprisingly, the pendulum clock is the prototype, mechanical clocked state machine. Hence the name “clock” in computers (which we didn’t get to yet).

Electric State Machine

A state machine whose “position” is in fact the pattern of electrical charge. Even a lightbulb is an electric state machine. So is a bit of computer memory.

Computer

A clocked electrical state machine. As we will see later, this definition is enough to make it a generic machine—a machine that can do almost anything people want it to do.

Digital Computer

A computer where all the states are combinations of parts’ states which can only take N fixed integer values.

Binary Digital Computer

A computer where N is 2. Generally the two values are called 0 and 1. But of course the 0 and 1 don’t exist physically. They appear as two different charge patterns in the electrical parts of the computer.

Machine Language

A small set of binary numbers, with corresponding computer state-change responses. When a special part of the computer is forced to take on the state represented by one of these numbers (popularly called “loaded into memory”), at the next (one or a few) clock cycle(s), the computer will change to the corresponding new state. Also, a machine language is written by the computer parts manufacturer, and supplied with it.

Assembly Language

A small set of letter combinations which map 1:1 to the machine language. Exist only because most people remember letter combinations better than number combinations.

Computer Program

A list of combinations of language elements (“statements”) that, when loaded into memory along with a “start” instruction, cause a computer to proceed automatically from state to state.

High-level Language

A set of words, and rules for combining them, that, when used in a computer program, which is passed through another computer program (called a “compiler” if processed all at once, or an “interpreter” if processed one word at a time), produce a machine language computer program.

Software Design (activity)

Deciding how a computer program should be organized to best cause the running program to be compiled or interpreted so that the computer will do what was required. (See “Requirements”, immediately below.)

Requirements

A set of statements, in a human natural language, each one containing “shall” or “must”, which mostly describe how a computer should respond to actions applied to it.

Well, that was a lot, but much shorter than a college textbook!

Where Code Quality Fits In

Go back and read “high-level language”. That’s where code quality fits in, and why it’s a challenge. The code must both represent the design description, and meet the constraints of the particular high-level language. Further, that language may have been written for the convenience of the compiler writer. Finally, a large part of the machine language that makes up the running program does not come from the developer’s high-level language program, but from third-party programs written by multiple hardware manufacturers. It’s like copying a painting while looking at the painting in a mirror, and looking at the canvas in another mirror.

No wonder that under those constraints, writing code that is both clear to people, and correct for the compiler, is difficult. But with the twin tools of code review and static analysis, it is possible.

Focusing Code Review

leave a comment »

In an article Using Static Analysis to Find Bugs, in the September/October 2008 issue of “IEEE Software”, authors Nathaniel Ayewah et al conclude,

We need to develop procedures and best practices that make using static-analysis tools more effective than alternative uses of developer time, such as spending additional time performing manual code review or writing test cases.

In an otherwise fine article, just one key word–alternative–is misleading, promoting a common misconception. Static analysis tools are not an alternative to code review, nor can they replace it. Rather, static analysis tools are a guide to code review, helping the developer better use the time in “manual” code review. Of course, “manual” code review isn’t really “manual” — it’s not about the hands, but about the mind, understanding the code.

When using a static analysis tool, like the FindBugs of the article, or other types of tools, that measure and mark up the code for complexity or code coverage, the benefit is twofold. First, an automatic tool goes through a lot of code, and identifies which parts are most urgent to review. Second, the reports pose challenging questions for the code author or maintainer to answer by re-reading and explaining the code.

Where I work, on some projects at least, there is a formal policy for using static analysis tools. They are run automatically on all builds, and the results are tracked towards targets. But most important is the questions they generate: Why is this piece of code written this way? Is it what we meant? Why couldn’t the compiler, or lint tool, understand it without warning? Why does the code coverage tool think there are so many branches in these few lines of code? All these questions are generated by the tool, and mapped automatically to the code. But it still takes code review to answer the questions, and to adjust the code to make it better.

Written by Tom Harris

November 18, 2008 at 11:37 pm

Follow

Get every new post delivered to your Inbox.