Coupling and the Maintainability of the Linux Kernel ~ by Dr Stupid
Thursday, March 31 2005 @ 08:58 AM EST

Coupling and the Maintainability of the Linux Kernel
~ by Dr Stupid

A recently presented paper has the following abstract, something that would certainly gain the attention of anyone interested in Linux kernel development:

Categorization of Common Coupling and Its Application to the Maintainability of the Linux Kernel

Data coupling between modules, especially common coupling, has long been considered a source of concern in software design, but the issue is somewhat more complicated for products that are comprised of kernel modules together with optional nonkernel modules. This paper presents a refined categorization of common coupling based on definitions and uses between kernel and nonkernel modules and applies the categorization to a case study.

Common coupling is usually avoided when possible because of the potential for introducing risky dependencies among software modules. The relative risk of these dependencies is strongly related to the specific definition-use relationships. In a previous paper, we presented results from a longitudinal analysis of multiple versions of the open-source operating system Linux. This paper applies the new common coupling categorization to version 2.4.20 of Linux, counting the number of instances of common coupling between each of the 26 kernel modules and all the other nonkernel modules. We also categorize each coupling in terms of the definition-use relationships. Results show that the Linux kernel contains a large number of common couplings of all types, raising a concern about the long-term maintainability of Linux.

To anyone with a knowledge of software engineering terminology, whether gained through formal education or at the University of Life, the first 90% of the abstract is uneventful; this, though, only serves to maximize the impact of the final sentence. A "concern about the long-term maintainability of Linux," no less. Mr A. Linux Kernel once went to the trouble of writing that reports of his destruction had been exaggerated, yet now this paper has rumours circulating of a life-threatening illness.

The full paper is only available to subscribers (note, however, that one of the authors makes a copy available on his personal website here [PDF]), but we were fortunate to be able to discuss the paper with Andrew Morton, one of the lead kernel developers, in two contexts: first, in a general discussion about coupling and kernel maintainability, and then, after he had read the complete paper, in specific terms related to the thoughts expressed by the authors. As you will see, despite the worries expressed in the paper, the Linux kernel is alive and well.

The researchers, in designing a theoretical model to evaluate the coupling of Linux, have of necessity made certain assumptions to reduce complexity and make the problem amenable to a mathematical, quantitative approach. However, this can lead to inaccurate results: you may recall the possibly apocryphal tale of the mathematical demonstration that bumblebees can't fly. (As an aside, there is also a parallel here with studies showing operating system X to be "more secure" than operating system Y, when on closer inspection the definition of "more secure" turns out to be a narrow and potentially misleading, but easy to calculate, statistic.)

What is coupling?

"Coupling", a term which uses a visual metaphor of mechanical parts coupled together by a driveshaft, is used widely in software engineering to describe a link between two parts of a system that is not part of an abstracted interface. We make this distinction because the parts of a system have to be linked in some way -- otherwise there would be no system. For the benefit of Groklaw's less technical readers, I'll try to explain the concept in non-software terms (kernel developers may skip the next few paragraphs.)

Imagine that the steering wheel of a car was like the steering wheel one can buy for playing computer driving games -- that is to say, it merely generated an electrical signal that said "a little bit left," "hard to the right," etc. and that this signal was passed to a device under the bonnet that turned the front wheels. You could replace the steering wheel with a similarly wired joystick, or anything that generated an appropriate electrical signal, and you could still drive the car. We would call this an abstracted interface. The communication between the two parts (the steering wheel and the mechanism that turns the front wheels) has been reduced to its conceptual essence of "I want to go left" and "I want to go right."

In a typical car, though (especially one without power steering), the steering wheel is directly and mechanically linked to the front wheels. You could not easily replace the steering wheel with a joystick, because the whole mechanism depends on the wheel being turned left and right. Not only is the interface less abstracted, it is also highly coupled. You can feel bumps and vibrations coming back up from the wheels on the road. In other words, the coupled interface means that what happens to one part of the mechanism (going over a rock) has a knock-on effect on the other (giving you a pain in the wrists) that wasn't necessarily desired.

Going back to software terms, we would describe modules A and B as coupled if, to operate properly, A relies on B's internal workings to be a certain way, and vice versa. Just as a traditional steering wheel is sensitive to holes in the road, A becomes sensitive to changes inside B. That introduces a risk that when a bug is fixed in B, it may cause an unexpected problem in A. It is this "knock-on effect" result of coupling that makes software engineers -- especially when talking theoretically -- nervous of coupling. They invent approaches like "Model View Controller" to discipline themselves against thoughtless coupling.
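
For readers who prefer to see the same idea in code, here is a minimal, self-contained C sketch of the car analogy (all names are invented for illustration; this is not kernel code). In the coupled version, the "dashboard" module reaches directly into the "engine" module's internal representation, so any change to that representation silently breaks it; in the decoupled version it only calls an agreed accessor.

    #include <stdio.h>

    /* --- "Module B": the engine, which owns its own state ---------------- */

    /* Coupled design: B exposes its internal representation directly.
     * Anyone reading engine_rpm_x100 now depends on the detail that the
     * value is stored as RPM scaled by 100. */
    static int engine_rpm_x100 = 250000;     /* i.e. 2500.00 RPM */

    /* Decoupled design: B exposes only a small accessor.  If B later
     * changes how the value is stored, only this function changes. */
    static double engine_get_rpm(void)
    {
        return engine_rpm_x100 / 100.0;
    }

    /* --- "Module A": the dashboard ---------------------------------------- */

    static void dashboard_show_coupled(void)
    {
        /* A quietly assumes B's internal scaling.  If B ever stores plain
         * RPM instead, this line prints nonsense -- the knock-on effect. */
        printf("RPM (direct, coupled read): %d\n", engine_rpm_x100 / 100);
    }

    static void dashboard_show_decoupled(void)
    {
        /* A only knows the agreed interface: "tell me the RPM". */
        printf("RPM (via the interface):    %.2f\n", engine_get_rpm());
    }

    int main(void)
    {
        dashboard_show_coupled();
        dashboard_show_decoupled();
        return 0;
    }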

However, I hope that the above example also shows you the other side of the coin. The high-tech electronic steering wheel was less coupled, but more complex. There are more elements to go wrong, and a fault may be harder to find. Also, some drivers would like to "feel the road" via the steering wheel, and to give this feedback in the electronic system would require more complex circuitry still. Sometimes, the costs of eliminating coupling in a system outweigh the gains.

Back to the kernel

The paper focused on common coupling; roughly speaking, this is where two or more software parts all make direct use of the same shared data -- typically a global variable, i.e. the same area of computer memory. This can lead to situations where a particular part can have data changed "behind its back," as it were. The developer has to bear this in mind when writing the code, which isn't always easy.
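
Here is a hedged sketch of that "behind its back" effect, again with invented names and compressed into a single user-space C file: two notionally separate modules share one global, and one of them changes it between the other's two reads.

    #include <stdio.h>

    /* One global definition shared by two otherwise unrelated "modules".
     * This is common coupling: both depend on the same storage. */
    static int shared_timeout = 30;

    /* "Module B" tunes the value to suit its own needs. */
    static void module_b_adjust(void)
    {
        shared_timeout = 5;
    }

    /* "Module A" reads the value, does some work that happens to run
     * module B's code, then reads it again -- and finds it has changed
     * behind its back. */
    static void module_a_report(void)
    {
        int before = shared_timeout;
        module_b_adjust();           /* stands in for any intervening code */
        int after = shared_timeout;
        printf("module A saw the timeout change from %d to %d\n",
               before, after);
    }

    int main(void)
    {
        module_a_report();
        return 0;
    }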

We asked Frank Sorenson to read the paper and here is his comment:

Too many dependencies between modules can obviously be viewed as a bad thing. However, no coupling/dependencies leads to multiple copies of the same thing, which is obviously more difficult to maintain. For example, the Linux kernel contains a library of common functions that may be used in the various modules. A month or so ago, someone realized that 6 different modules all implemented a 'sort' function, all with the same interface to the module. This brought about a push to standardize them, and a single 'sort' function was put into the common function library.

We've already mentioned that the costs of decoupling aren't always justified -- this is a case in point. In this instance, increasing the use of common code -- while increasing the coupling -- reduced the maintenance requirements.
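
As a rough illustration of the kind of consolidation Frank describes (the function name and signature below are invented for the sketch and are not the kernel's actual common sort interface), a single comparator-driven sort in a shared library can replace several private copies:

    #include <stdio.h>
    #include <string.h>

    /* One generic sort in the "common library", driven by a caller-supplied
     * comparison function.  Every module that needs sorting calls this
     * instead of carrying its own private copy of the algorithm. */
    static void lib_sort(void *base, size_t num, size_t size,
                         int (*cmp)(const void *, const void *))
    {
        unsigned char *a = base;
        unsigned char tmp[64];           /* assumes elements of <= 64 bytes */

        for (size_t i = 1; i < num; i++) {   /* simple insertion sort */
            size_t j = i;
            memcpy(tmp, a + i * size, size);
            while (j > 0 && cmp(a + (j - 1) * size, tmp) > 0) {
                memcpy(a + j * size, a + (j - 1) * size, size);
                j--;
            }
            memcpy(a + j * size, tmp, size);
        }
    }

    static int cmp_int(const void *x, const void *y)
    {
        return *(const int *)x - *(const int *)y;
    }

    int main(void)
    {
        /* Two "modules" with their own data, sharing one implementation. */
        int device_ids[] = { 42, 7, 19, 3 };
        int priorities[] = { 2, 9, 1 };

        lib_sort(device_ids, 4, sizeof(int), cmp_int);
        lib_sort(priorities, 3, sizeof(int), cmp_int);

        for (int i = 0; i < 4; i++) printf("%d ", device_ids[i]);
        printf("\n");
        for (int i = 0; i < 3; i++) printf("%d ", priorities[i]);
        printf("\n");
        return 0;
    }

Every caller is now coupled to the shared helper, of course -- but one well-tested implementation is far easier to maintain than six private copies.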

Frank continues:

The article was submitted in July 2003. That's quite a while ago in Linux-kernel-time. A lot has changed since then, and 2.6.x is (in my opinion) more maintainable due to being well-engineered from the beginning. Do the authors have results for the 2.6.x kernel? How does the use of global variables change from 2.4.x to 2.6.x?

The kernel maintainers have pushed to make sure that the interface to kernel functions remains the same. For example, it would not be acceptable to change the way a common function behaves: copy_value(source, destination) should not ever change to copy_value(destination, source) (unless all references are fixed)

Linux modules are generally organized in an hierarchical fashion. This makes it much harder for a change in one area to affect other modules or portions of the kernel.

Obviously, what the authors discuss is a very real danger (not specifically to Linux, but to any sufficiently large project -- such as Longhorn!). The authors don't offer many valid suggestions on how to combat the problem. The fact that Linux is open allows them to do the research, however; the closed nature of Windows prevents people from seeing how Microsoft has addressed this problem (if at all.)

If Linux is too tightly coupled, how about Windows? Having your entire user interface dependent on a web browser -- now that's coupling!

My personal opinion is that the 2.6 kernel is much tidier and more organised than 2.4, which in turn was tidier than 2.2, and so on. The direction of the Linux kernel is towards a cleaner, less coupled architecture -- there is an active, ongoing effort to preserve maintainability. Indeed, patches are frequently rejected purely on the grounds that they harm maintainability and have to be re-engineered accordingly.

Andrew Morton's comments

However, you probably didn't read this far to hear Frank and me discussing the kernel, when we have Andrew Morton available. Here's his initial comment on the abstract:

They examined a kernel (2.4.20) which is unchanged in this regard from 2.4.15. We've done three and a half years of development since then! That being said, I wouldn't be surprised if their analysis showed that linux-2.6.11 also has a lot of coupling, even though we have done a lot of improvement work in that and other areas.

But that's OK -- we often do this on purpose, because, although we are careful about internal interfaces, the kernel is optimized for speed, and when it comes to trading off speed against maintenance cost, we will opt for speed. This is because the kernel has a truly massive amount of development and testing resources. We use it.

More philosophically, I wouldn't find such a study to be directly useful, really. It represents an attempt to predict the maintenance cost of a piece of software. But that's not a predictor of the quality! If you find that the maintenance cost is high, and the quality is also high, then you've just discovered that the product has had a large amount of development resources poured into it. And that is so. And it is increasing.

If someone wants to use this study to say that "Linux is likely to be buggy" then I'd say "OK, so show me the bugs". If they're using it to say "Linux kernel maintenance uses a lot of resources" then I'd say "Sure. Don't you wish you had such resources?".

Note that I'm not necessarily agreeing with the study. If they looked at the kernel core then sure, there's a lot of coupling. But that's a relatively small amount of code. If they were looking mainly at filesystem drivers and device drivers (the bulk of the kernel) then I'd say that the study is incorrect -- the interfaces into drivers is fairly lean, and is getting leaner.

Andrew then went on to read the paper in detail. His subsequent comments were rather different:

AAAARRRGGGGHHH! . . .The only thing they've done is look at the use of global variables and they've assumed that using a global variable is a "bad" coupling. And look at the naughty global variables which we've used:

jiffies: This is a variable which counts clock ticks. Of course it's global. Unless they know of a universe in which time advances at more than one speed at a time.

[Dr S: System time has to be global because time is universal throughout the system. We don't usually worry about Einstein in software development :) ]

And they fail to note that if we did want to "modularize" jiffies, we'd make a change in a single file:

#define jiffies some_function_which_returns_jiffies()

Other examples such as system_utsname, init_task, panic_timeout, stop_a_enabled, xtime and `current' are all by definition singleton objects.

'current' is especially bogus -- this refers to the task structure for the currently-running task. It's not a global variable at all, really. If this is bad, then using the variable 'this' in C++ is also bad.

Geeze. Who reviewed this?
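
To make Andrew's one-line point concrete, here is a hedged, stand-alone sketch (ordinary user-space C with invented names; the real kernel's timekeeping code is far more involved) of how a global like jiffies could be hidden behind a function without touching any of the code that uses it:

    #include <stdio.h>

    /* Illustrative stand-in for the kernel's tick counter; in the real
     * kernel, jiffies is a global incremented by the timer interrupt. */
    static unsigned long tick_count;

    /* The one-file change: hide the global behind a function, and let a
     * macro keep every existing use of "jiffies" compiling unchanged. */
    static unsigned long some_function_which_returns_jiffies(void)
    {
        return tick_count;
    }
    #define jiffies some_function_which_returns_jiffies()

    static void simulate_timer_tick(void)
    {
        tick_count++;
    }

    int main(void)
    {
        simulate_timer_tick();
        simulate_timer_tick();
        /* Reads like a plain variable access, but is now a function call. */
        printf("ticks so far: %lu\n", jiffies);
        return 0;
    }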

Theory vs Practice

Of course, one can engage in armchair debate endlessly; ultimately, what is needed is some empirical data against which a model or theory can be tested. Coupling, like cholesterol, comes in "good" and "bad" forms. The good form enables a system to work at peak performance, without introducing excessive maintenance costs. The bad form results in a system that is increasingly fragile and hard to scale. Which of these in practice has been uppermost in Linux kernel development?

This kernel mailing list thread from 2002 -- discussing a kernel of similar vintage to that covered by the study -- is of interest. Several people expressed a worry that the kernel would never effectively scale beyond 4 CPUs -- and coupling was one of the issues:

[2-CPU SMP] makes fine sense for any tightly coupled system, where the tight coupling is cost-efficient.

Three years later, have "long-term maintainability" issues in the Linux kernel held it back? Here's what Novell said last July [PDF] on the topic:

"More than 128 CPUs have been tested on available hardware, but theoretically, there is no limit on the number that will work."

This bumblebee continues to fly.

