I asked Greg Kroah-Hartman if he'd write an article explaining the Linux kernel development process. One of the most common FUD themes is to imply that unknown, untrusted parties are contributing heaven-knows-what to the Linux kernel. This is totally inaccurate; in fact, it's upside down from the truth. The truth is that every little piece is chronicled from the moment it is submitted. That isn't the only misconception about the Linux kernel development process.
As you know, The SCO Group, in its discovery requests in SCO v. IBM, asked for all non-public IBM contributions to Linux. Linux is developed in public, so when I read their request for nonpublic patches, I realized there is a need to explain the process.
Greg is the current Linux kernel maintainer for, as he puts it, "more driver subsystems than he wants to admit, along with the driver core, sysfs, kobject, kref, and debugfs code." He currently works for Novell's SuSE Labs, doing Linux kernel development-related things. He is also one of the authors of the best-selling book, "Linux Device Drivers."
I also asked Andrew Morton what would happen if someone did try to submit a patch privately, because Greg wrote that occasionally that happens if a company or an individual is new to Linux and doesn't realize that Linux is developed in public, that there is a public review process, and a right way to offer submissions. If that happens, then what? Here is Andrew's answer, which matches what Greg writes:
Occasionally people will send me a patch off-list. If the patch is trivial
I'll sometimes merge it into my tree and will later send it on to Linus.
But on most of those few occasions when I get an off-list patch I'll ask
the submitter to resend it with a Cc to the appropriate mailing list so
that it gets appropriate review.
But even if a patch is sent off-list to a subsystem maintainer, it is still
open to review in the -mm tree prior to being merged into Linus's tree.
And, ultimately, *all* patches which go into Linus's tree are
simultaneously sent to the `commits' mailing list for all interested
parties to review. All patches on the commits list have the full
attribution trail so we can see who was involved. Because of the commits
list it is simply not possible for anyone to slip a patch into the kernel
without a heck of a lot of developers knowing about it.
IBM, of course, knows the procedure for submitting patches to Linux. So while others who are newer to Linux might get confused and attempt to send directly to a maintainer, IBM is not likely to have ever done so. Even if they had tried, it would have been made public by the individual who received the email or someone further up the chain. The key point of Linux development is that there is a public review process, a review by many eyeballs. The quality is built into that development process. Bypassing that public review vitiates that power, so it is avoided. Note that Greg lists two references for those who wish to know how to properly submit a patch. Here's a third, a talk Greg gave in 2002 on proper Linux kernel coding style, one of the many interesting things on his Greg K-H's Linux Stuff web site.
***************************
How the Linux Kernel Development Process Works
~ by Greg Kroah-Hartman
There seems to be a lot of misunderstanding about how code actually
gets into the Linux kernel. People are claiming that code can just get
"slipped into" the main kernel tree without realizing where it really
came from, or without any sort of review process. Obviously they have
never actually tried to get a major kernel patch accepted, otherwise
they would not be making these kinds of claims :)
First, what do we mean when we speak of a "patch"? In order to get any
kind of change accepted into the kernel, a developer has to generate
something called a "patch" and send it to the maintainer of the code
they are changing (more on that process below.) To do this, they make
the changes needed to the specific part of the kernel that they wish to
modify, and then run a tool called 'diff'. This tool generates a human
readable file that shows exactly what lines of code were modified, and
what they were changed into. A very simple example of this can be seen
here:
--- a/drivers/usb/image/microtek.c
+++ b/drivers/usb/image/microtek.c
@@ -335,7 +335,7 @@ static int mts_scsi_abort (Scsi_Cmnd *sr
 	mts_urb_abort(desc);
-	return FAILURE;
+	return FAILED;
 }
 static int mts_scsi_host_reset (Scsi_Cmnd *srb)
This shows that the file, drivers/usb/image/microtek.c had one line of
code changed. From:
return FAILURE;
to:
return FAILED;
This bit of text can then be emailed to other people, who can instantly
see that yes, it only changes 1 line of code, and yes, this is probably
a correct thing. Then they run another program called 'patch' and give
it this bit of text. The patch program then modifies the specified file
in the specified way. Because the developer uses the program 'patch' to
apply this bit of text, the bits of text themselves have come to be
called 'patches'.
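As a rough illustration of what 'diff' produces, here is a minimal Python sketch that uses the standard library's difflib module to generate the same kind of unified diff. The file contents, the helper name make_patch, and the path drivers/example/example.c are all made up for this example; kernel developers use the real 'diff' and 'patch' tools, not a script like this.

```python
import difflib

# Hypothetical before/after contents of a small source file, modeled on
# the one-line change shown above. This is not the real microtek.c.
old_source = """\
static int example_abort(void)
{
\tstop_transfer();
\treturn FAILURE;
}
"""

new_source = old_source.replace("return FAILURE;", "return FAILED;")


def make_patch(old_text, new_text, path):
    """Return a unified diff of old_text versus new_text as one string."""
    diff_lines = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile="a/" + path,
        tofile="b/" + path,
    )
    return "".join(diff_lines)


# The path is an invented example, used only to label the diff headers.
patch_text = make_patch(old_source, new_source, "drivers/example/example.c")
print(patch_text)
```

The printed text has the same shape as the patch above: two header lines naming the file, a hunk header, unchanged context lines, and one removed line paired with one added line.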
All Linux kernel development is done by sending patches through publicly posted email.
If you take a look at the main Linux kernel development mailing list,
you will see hundreds of these patches being sent around, discussed,
critiqued, and even accepted into the main kernel tree. This is how
kernel development is done.
If you wish to know more about how to create a patch that is acceptable
to the kernel developers, please see the file
Documentation/SubmittingPatches for more information on
what needs to be specified in the patch and how to compose it.
Other good references are these files:
Andrew Morton's description of the "perfect patch":
http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt
Jeff Garzik's description of what to include in a patch to make it easy
for others to understand it:
http://linux.yyz.us/patch-format.html
Now, who is generating these patches, and who does anything with them?
The Linux kernel development group is a vast group of people that have
structured themselves in a pseudo-pyramid form. At the base of the
pyramid are the hundreds of developers who write anywhere from 1 to 2000
different patches. At last count, there were about 1,000 different
individual contributors to the 2.6 Linux kernel. These developers send
their patches on to the maintainer of the file or groups of files that
they have modified. These maintainers are spelled out in the file
MAINTAINERS that is in the main Linux kernel source tree. There are
about 300 different maintainers currently. If the maintainer feels that
the change is a proper one, and they agree with it, they then send these
changes off to the subsystem maintainer for the major part of the kernel
being modified. Subsystem maintainers exist for almost all parts
of the kernel; examples include networking, USB drivers, Virtual
File System, module core, driver core, Firewire drivers, network
drivers, and so on. These people are also listed in the MAINTAINERS
file, and all individual file and driver maintainers know to whom
they should send these changes. Then the subsystem maintainers,
if they agree with the change, submit the patches to Linus
Torvalds or Andrew Morton, depending on what they are used to doing, and
from there it makes it into the main kernel source tree. Note that
every person who touches the patch along this chain of submission adds
a "Signed-off-by:" line to the patch, which shows exactly where the
change came from and who approved it. A number of us kernel developers
call this the "trail of blame", meaning that if someone has a problem with the
change, we know exactly who to blame for the issue.
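To make the "Signed-off-by:" trail concrete, here is a small Python sketch of how such lines accumulate as a patch moves up the chain, and how the trail can be read back afterwards. The names, addresses, patch description, and the helper functions add_signoff and signoff_trail are all invented for this example; it is only a sketch of the idea, not a tool the kernel developers use.

```python
import re

# Matches one "Signed-off-by: Name <email>" line anywhere in the text.
SIGNOFF = re.compile(
    r"^Signed-off-by:[ \t]*(.+?)[ \t]*<([^>]+)>[ \t]*$", re.MULTILINE
)


def add_signoff(description, name, email):
    """Append one Signed-off-by line to a patch description."""
    return description + f"Signed-off-by: {name} <{email}>\n"


def signoff_trail(description):
    """Return (name, email) pairs in the order the sign-offs appear."""
    return SIGNOFF.findall(description)


# Hypothetical patch description, before anyone has signed off on it.
description = (
    "USB: return FAILED from the example abort handler\n"
    "\n"
    "The SCSI layer expects FAILED here, not FAILURE.\n"
    "\n"
)

# Hypothetical chain: author, file maintainer, subsystem maintainer.
chain = [
    ("Alice Author", "alice@example.com"),
    ("Bob Maintainer", "bob@example.org"),
    ("Carol Subsystem", "carol@example.net"),
]

for name, email in chain:
    description = add_signoff(description, name, email)

trail = signoff_trail(description)
for name, email in trail:
    print(name, email)
```

Because each person appends a line rather than replacing the previous ones, the order of the lines records the path the patch took on its way up.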
I originally stated that this is a "pseudo-pyramid" structure. I said
this because the full process of sending patches does not always flow in such a
neat way. Sometimes people short-circuit the maintainer of a subsystem,
and send a patch directly to Andrew or a mailing list. Other times, a
subsystem maintainer will modify code that is controlled by another
maintainer, and not specifically get their blessing before submitting it
on upward. Also, maintainers and subsystem maintainers are always
changing, as new people come into kernel development, and older ones
leave.
Sometimes a patch is submitted directly to a maintainer,
without being sent to a public mailing list. This usually happens with new developers who are not
used to the whole review process, and occasionally happens for "trivial"
patches that simply fix an obvious bug. For small 1-2 line
bugfixes, the maintainer might accept them directly, and then
accumulate them in their development trees (which are all
publicly available in Andrew Morton's -mm kernel releases.) But
for bigger patches, the maintainer usually asks the submitter to
resend them and CC: a public mailing list in order for other
developers to review them. If that never happens,
the patch goes nowhere.
How do the patches go from person to person?
All development is done through email. Developers send patches through
email to other developers by sending them to different mailing lists.
There is one main mailing list for all kernel development,
linux-kernel. This list gets about 200-300 emails a day, and
almost all aspects of the kernel are discussed on it. Because of the
high volume on it, almost all different subsections of the kernel have
formed their own mailing lists, in order to get work done and focus on a
specific area. Some examples of specific mailing lists are:
All of these mailing lists are archived by a wide range of different
archive sites, allowing people to go back in time and see what happened,
and search for specific things. Some examples of archive sites are
http://marc.theaimsgroup.com/
and
http://www.gmane.org.
So a patch is posted on a mailing list. Other developers then critique
the patch, and offer suggestions, again, copying the mailing list for
everyone to see. Eventually some kind of consensus is reached, and the
patch is accepted by the maintainer to submit on up the chain. All of
this is done in public, for everyone to see, and archived, in public,
again, for everyone to see.
As an example, recently someone submitted a small patch that added a new
function and changed a few others in order to support a new type of
hardware that is being created. That can be seen here:
http://thread.gmane.org/gmane.linux.kernel/297422
A number of different developers chimed in, and offered suggestions as
to how to make the patch better:
http://article.gmane.org/gmane.linux.kernel/297427
and:
http://article.gmane.org/gmane.linux.kernel/297463
The original author took those comments, and then created a new patch:
http://article.gmane.org/gmane.linux.kernel/297675
which was then commented on, and the development continued.
This is how kernel development usually works, in the open, with everyone
being able to see everything that happens. That is why when people
complain about not knowing everything that a specific company has done
for Linux, they are usually very misguided.
