Archive for the ‘Metrics’ Category
Software Developer
That’s right, not “development” but “developer”.
The latest issue of The Embedded Muse newsletter—#179 (pdf)—has a reference to an interesting and very different kind of book about C. It’s about the business pressures and the thought processes of C software developers. And a lot more. It’s also long—1600 pages! So download a copy and skim for what interests you:
The New C Standard: An Economic and Cultural Commentary
Some lines that caught my eye, just in the first 100 pages, skimming with the “Page Down” key:
“Software developer interaction with source code occurs over a variety of timescales [for example, those] whose timescale are measured in seconds.”
“The act of reading and writing software has an immediate personal cost.”
“Source code faults are nearly always clichés; that is, developers tend to repeat the mistakes of others and their own previous mistakes.”
“A list of possible deviations should be an integral part of any coding guideline.”
“The following are different ways of reading source code, as it might be applied during code reviews …”
Other related references from the book’s author Derek M. Jones (http://www.knosof.co.uk/):
- Blog: http://shape-of-code.coding-guidelines.com/
- Article: Coding Guidelines: Fact and Fiction
- Article: The 7±2 Urban Legend (small pdf)
And somehow related, from another author, Les Hatton (http://www.leshatton.org/):
Why We Write Software
“Isn’t it nice to know that, when all else fails us, we have an innate decision-making tool to fall back on?”
Robert L. Glass, “Intuition’s Role in Decision Making” (IEEE Software, January/February 2008)
Yes, Glass admits, estimates that come unbidden from a manager’s subconscious seem the very opposite of quantitative or rational. But most decision-making methods have a common theme: using historical data to decide what to do next. Quantitative estimation puts everything out in the open. Rational support for a desired deadline is at least based on the facts. Intuition, at the other extreme, pulls that data, informally known as “experience”, from the subconscious. And since a good manager has experience, intuition works.
Sort of.
What’s hidden by Glass’ essay are the unstated quality standards that justify each method. Only a complete, quantitative estimate, matched by continuous measurement and adjustment, can promise high quality by the target date. Intuition, on the other hand, even coming from an experienced manager, makes quality likely, but hardly guaranteed. His successes are what keep him in his job, and his few failures are forgiven. Rightly so. He is employed to deliver good enough software, quickly, to meet modern society’s ever-growing appetite for computerized life.
So should we favor intuitive, seat-of-the-pants estimating, with its benefit of early delivery, but its cost of uncertain quality?
Digging deeper, we find that there are some very different reasons why we write software, which strongly influence how we plan our work.
- Profit
- Functionality
- Beauty
Anyone who gets paid for writing commercial software must acknowledge that profit pays his or her salary. Profit leads eventually to pretty good quality, via competition. In the short run, though, we experience a lot of pressure, a lot of errors, and many customer-accepted (if sometimes societally unacceptable) failures. In that context, it seems, we should estimate quickly and intuitively, but admit that quality is expendable.
In a well-funded, non-profit organization (e.g. NASA, famed for error-free software — wholly separate from failure-free flights), software can be about complete functionality. Take as much time as needed to implement everything, perfectly. After all, when the spaceship flies past Jupiter, there’s no second chance.
Where, then, is guaranteed perfection, estimated correctly and delivered quickly? The only place we see that in life is artistic performance. A pianist plays, from memory, a piece that she only decided a few months ago to perform. An hour long, no mistakes. The event starts, and finishes, on time. The customer — the audience — rises to its feet in noisy appreciation. The enabler of perfection is seeking beauty, or doing a thing for its own sake.
Many software developers write software because it’s beautiful, fun, or spiritually rewarding. And that reason engenders the highest-quality work. A quality that delivers on time, with no costly rework. Functionality. Profit too.
Somehow, in the commercial context, that reason why we write software must be harnessed and encouraged by software managers. Otherwise, dismissing a developer’s data with an intuitive estimate is not a fall-back, but falling backwards.
Looking Back, Looking Forward
Some return for your U.S. taxpayer dollars:
CrossTalk, The Journal of Defense Software Engineering is an approved Department of Defense journal. CrossTalk’s mission is to encourage the engineering development of software in order to improve the reliability, sustainability, and responsiveness of our warfighting capability and to inform and educate readers on up-to-date policy decisions and new software engineering technologies.
Whatever you may think about “warfighting”, CrossTalk sometimes has some really good articles, and even some great ones. Here are two, from the December 2007 issue:
Very good, about software maintenance:
Geriatric Issues of Aging Software
Great, about moving beyond “escaped defects” to defect prevention:
Advancing Defect Containment to Quantitative Defect Management
Both are thorough and comprehensive—worth reading!
The Practicing Developer
I’ve often thought about the advantage that athletes and musicians have over software developers: time to practice, and lots of it. Most of their time is spent in exercises and rehearsals, to produce the best performance.
In software, everyone spends most of the time producing, with a bit of “time off” for learning new technologies. No wonder perfection seems so far off.
A friend pointed out that Dave Thomas has proposed CodeKata as a way of doing that practice. I read it, and while the ideas look good, I felt there was something missing. The “katas” (formal patterns in karate, and here, in design and coding) seemed somehow schoolwork-like and disconnected from real work. I could only imagine myself “practicing” on assignments where I needed the result, even if for something trivial like importing a bunch of e-mails into SharePoint (more about that another time).
I don’t have an answer today, but the question is much bigger than just finding time to practice in software development. So instead, have a look at these posts and presentations where people are discussing the issue, and see what you think.
Level 5 means never having to say you’re sorry (Jeff Atwood)
Big Macs vs. The Naked Chef (Joel Spolsky)
No Best Practices (James Bach)
Herding Racehorses and Racing Sheep (.ppt) (The Pragmatic Programmer)
Competence is a Habit (.ppt) (David Leach)
Continuous Code Review
Continuous code review is reviewing all the code, all the time.
That’s the theory. In real life, some questions come up:
How do we track which code has been reviewed?
All the code?
And when is all the time?
Question 1: How do we track which code has been reviewed?
Answer 1: One cannot, and need not, mark code as reviewed.
One cannot, because, for example, if one reviews a file when it has 100 lines of code, and then later it has 120 lines of code, one may have to re-read all 120 (probably more quickly since much is not new) in order to comment.
And code is not reviewed line by line, but idea by idea.
Use a code editor that maintains a symbol tree so that the reviewer can jump from use to definition and back.
One need not mark code as reviewed because all code should be reviewed.
Question 2: All the code?
Answer 2: Yes, all the code that has been included in continuous code review.
During start-up, one might mark components or modules as having been included in the continuous code review process.
In steady state, by definition, all code is reviewed.
Question 3: When is all the time? Pair programming? Or surely, before every check-in?
Answer 3: None of the above. Try once a week.
The idea is to direct a constant flow of intelligent, objective comments to the developer, as soon after coding as possible.
Code review is not testing, and it is not certification. It is a tool to help the developer fix (improve) the code s/he has just written, to improve the code s/he is about to write, and in general, for the developer to become better at detailed design and coding.
Pair programming is not code review because the “non-driver” is not objective—s/he is part of the pair.
And it does not matter whether code is ready for check-in or not.
All you need is enough newly-written code for a reviewer to understand what the developer is trying to accomplish.
Get started!
The Carbon Rush
Rarely does a topic stir up more controversy than code review. And global warming doesn’t either, though perhaps it should.
I was grateful for the opportunity to see Al Gore’s very slick hour-and-a-half documentary, An Inconvenient Truth, about global warming and carbon emissions. Whatever you may think of Gore, blatant promos for Apple laptops, or computer-generated tear-jerker sequences of drowning polar bears, it is worth seeing as a thought-provoking presentation.
Remember to think.
Some of the questions I was left with after recovering from the “mild thematic elements” (yes, that’s what gave it a PG rating) are:
- What does “carbon neutral” mean? And “carbon offset”?
- Are carbon subtraction programs working as planned?
- Is carbon emission reduction going to save us from a global warming disaster?
Nothing is as simple as it seems.
Carbon neutral refers to “calculating your total climate-damaging carbon emissions, reducing them where possible, and then balancing your remaining emissions, often by purchasing a carbon offset. (Related term: carbon negative.)”
(If anyone has a more formal source than a word popularity article, let me know.)
Hmm. What about “carbon negative”? It’s easy to get distracted on the web, but the only real answer is planting more trees. (All that stuff about buying carbon dioxide emission reductions from other organizations is nonsense. Not that it’s unhelpful, but it takes no carbon out of the atmosphere; it just pays someone else to put less in rather than reducing one’s own emissions.) It is a bit disturbing to find, then, that even Friends of the Earth has raised concerns about how carbon-reduction tree planting is being carried out. Humbling too: one more example of how we never quite know the effects of one man-made intervention carried out to mitigate another.
Finally, if one can entertain the thought that world climate change might be a bit more complicated than a 100-minute movie, there’s short, medium, and long reading to be found at JunkScience.com.
It seems to me, though, that the reductions now fashionably called carbon emission reductions are probably good for “old-fashioned” (1960s) reasons: reduce landfill, keep air clean for breathing, improve personal health through exercise, and increase spiritual health by slowing down.
Will all this get lost in the Carbon Rush?
It
8-year-old, to friend: You don’t know how to spell!
Friend: Oh yes I do!
First 8-year-old: The opposite of “enemy”: spell it!
Friend: That’s easy: f-r-i-e-n-d.
First 8-year-old: Nope!
Friend: Huh? OK smarty, how do you spell it?
First 8-year-old: i-t.
How often are you in a meeting, or a hallway conversation, where everyone jumps right in arguing about who’s doing it, or how to do it right, or how to measure it? If you catch yourself for a moment, and take a step back, you see that everyone is talking about something different. Even if there’s a conclusion or a decision, it didn’t really follow from the discussion.
Taking just one example, let’s talk about code review.
Part of being respectful when deploying code review, as with any process, is first checking whether anyone is doing it already, and learning from those examples. I spend some part of my time these days looking for those people. Every so often I find someone who says, “Code review? Sure we’re doing it.”
But then I start asking questions. I admit I have an opinion in mind. Some minimal attributes of code review. But listening means asking my question and then keeping quiet, so I do.
I ask, are the issues recorded? (Yes? Can I see them? No? Why not?)
Do you review all the code? (No? Why not?)
What does the code author do with the issues? (How do you know?)
When you do these code reviews, what is your goal? (Why? And are you meeting the goal?)
How often are these code reviews? (Haven’t done any in a while? When did it stop? Why?)
I consider myself fortunate when I actually get answers. I haven’t had any positive, consistent answers yet.
Does that matter?
Well, without these answers, I know nothing about it.
So, code review—are you doing it?
The Dangers of Counting
On Slow Leadership today they’re talking about Occam’s Razor. Choosing the simplest explanation. And the dangers of setting numerical targets.
But metrics are so easy to collect these days. The temptation to count things is great.
Let’s say you are deploying a new process, or tracking progress on an existing one. What harm could there be in counting, looking at the numbers, and making decisions based on them? Isn’t that sound management?
The problem is assuming that by counting something you learn more about it.
Actually all you learn is how many of something there were.
You don’t know anything about those somethings, nor why the count came out as it did.
When you count deployment events, there is no positive integer count that tells you much by itself.
Even zero doesn’t tell you unequivocally what happened. I can think of at least 3 possibilities about the events:
a) Didn’t happen
b) Happened, but went uncounted because of insubordination or fear among the counters
c) Happened, but went uncounted because of a misunderstanding about what to count
Things get worse if the numbers are positive. How do you know whether a number is large or small? More, or less?
For example, trying to measure software development, let’s say you counted 24 high-severity defects last month. Is that high or low? Is it more or less than the month before? Kind of depends, doesn’t it, on how much code you wrote, how much testing was done, how good the testing was, which defects were recorded, and so on.
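To make the “kind of depends” concrete, here is a minimal sketch that puts the same raw count of 24 next to two possible denominators. The months, the code volumes, and the testing hours are invented for illustration, and the two ratios are only assumptions about what one might divide by. They say nothing about how good the testing was or which defects were recorded.

```python
from dataclasses import dataclass


@dataclass
class MonthSample:
    """One month's raw numbers. All values used below are made up."""
    label: str
    high_severity_defects: int
    kloc_written: float  # thousands of lines of new or changed code
    test_hours: float    # hours of testing actually carried out


def defects_per_kloc(sample: MonthSample) -> float:
    """Raw count divided by the amount of code written."""
    if sample.kloc_written <= 0:
        raise ValueError("kloc_written must be positive")
    return sample.high_severity_defects / sample.kloc_written


def defects_per_test_hour(sample: MonthSample) -> float:
    """Raw count divided by the amount of testing done."""
    if sample.test_hours <= 0:
        raise ValueError("test_hours must be positive")
    return sample.high_severity_defects / sample.test_hours


# Two hypothetical months with the identical raw count of 24.
quiet = MonthSample("quiet month", 24, kloc_written=6.0, test_hours=40.0)
busy = MonthSample("busy month", 24, kloc_written=30.0, test_hours=200.0)

for month in (quiet, busy):
    print(
        f"{month.label}: {month.high_severity_defects} defects, "
        f"{defects_per_kloc(month):.1f} per KLOC, "
        f"{defects_per_test_hour(month):.2f} per test hour"
    )
```

In this invented data the count is the same in both months, yet the quiet month has five times as many defects per thousand lines. Neither ratio settles whether 24 is high or low; each only shows how much the answer moves with the context.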
Forget distributions — pie charts and the like. How do you know if people understood which category an event belonged in?
What does all this leave us with? No graphs? No counting? No decision-making based on numbers? Back to the Stone Age?
No. It just means something counter-intuitive in today’s modern world of computers: knowledge before counting.
First learn the qualitative details of your events. What is a satisfactory defect entry? A relevant test result? A productive code review? Make sure all participants can recognize them.
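One way to write down what “satisfactory” means is a small check that every participant can read and argue about. The sketch below is only an illustration: the field names, the severity labels, and the `problems_with` helper are hypothetical, and a real team would substitute its own agreed criteria.

```python
# Hypothetical criteria for a satisfactory defect entry; a real team
# would agree on its own list.
REQUIRED_FIELDS = ("summary", "steps_to_reproduce", "severity")
ALLOWED_SEVERITIES = {"low", "medium", "high"}


def problems_with(entry: dict) -> list:
    """Return the reasons an entry is not yet satisfactory (empty if fine)."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not str(entry.get(field, "")).strip():
            problems.append(f"missing {field}")
    severity = entry.get("severity")
    if severity and severity not in ALLOWED_SEVERITIES:
        problems.append(f"unknown severity: {severity}")
    return problems


# Two invented entries: one complete, one that needs a conversation first.
good = {
    "summary": "Crash on save",
    "steps_to_reproduce": "Open a file, edit it, press Save",
    "severity": "high",
}
vague = {"summary": "Crash", "severity": "urgent"}

for entry in (good, vague):
    print(entry["summary"], "->", problems_with(entry) or "satisfactory")
```

An entry that comes back with problems is a prompt to talk to whoever recorded it, not a number to add to a chart.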
Then you can start counting.
1, 2, 3 …
P.S. Want extra credit? Read just the first page of What is Mathematics? by Courant and Robbins. The section is called “The Natural Numbers: Introduction”.