Andrew Morton saw Groklaw's coverage of the "Linux is not forking like Unix" article, and he has now graciously provided his speaker's notes from SDForum, on the theme of "the interface between open source software development and the software-using business world." He says, "It's very close to what was
said." I know, knowing you like I do, that you will enjoy it much more than any third-party report about what he allegedly said. I found it fascinating reading, and I'm happy I can share it with you now. UPDATE: You might like to check out Lamlaw's November 22 article on this speech.
*********************************
SDForum, 16 Nov 2004.
Will today talk about several issues related to the interface between open
source software development and the software-using business world. These two
big and quite different communities have come together in recent years with
considerable success and surprisingly little friction. I'll be looking at
matters related to the development, or creation of open source software rather
than, say, the adoption of open source software.
I'll talk for maybe 20 minutes and will leave 15 minutes or so for a q-n-a
discussion. All this is just one guy's opinion and I can and do make
mistakes. So I'd really seriously ask that if people have disagreements with
what I say or if they perceive insufficiencies in it, please let's bring those
up in the discussion -- otherwise I'll just end up spouting the same garbage
next time I stand up in front of some long-suffering people such as
yourselves.
- software engineer, working on Linux kernel.
- Along with Linus Torvalds, have overall responsibility for development, delivery
and quality of the public Linux kernel, available from kernel.org. Mainly
do that by collecting, integrating and re-releasing the work of the many
members of the kernel development team.
The public kernel is an input to the kernel which is released by
Linux distributors such as Red Hat and Suse/Novell.
- It takes thousands of software packages to make up a distribution, and
the kernel is just one of them. But it is the single thing which defines
that distribution as being a "linux" distribution.
- My words are of course most applicable to the kernel project, but can
be generalised to many of the most important open source software
projects.
- It's interesting to note that the most important and successful open
source products implement what one could describe as "legacy
infrastructure". Let's look at those two words:
Legacy:
- These products are implementing something which has been done many
times before: operating system kernels, runtime software libraries, window
managers, http servers and their variously-tiered tools, mail servers,
various forms of file server, image manipulation programs, programming
language compilers and interpreters, word processors, spreadsheets,
database management, etc.
Many of the above are technologies thirty or more years old. Legacy
stuff which everyone knows how to implement. All the intellectual
property value has been wrung out of these technologies years ago and
anyone who ships such products commercially is, to a large extent,
providing to their customers a low-margin maintenance and support
function.
Why did I describe it as "infrastructure"?
- Many of these successful open source products are implementing
functions which other, higher-level software builds upon. The operating
system, the libraries, the low-level network servers, the database tools,
etc.
All of these provide basic infrastructure which will sit underneath
non-open-source software products which are developed and marketed in the
conventional commercial manner. ISVs are concentrating their investment
and their innovation on higher-level customer-facing products while open
source provides the legacy backend of the software stack.
- So the term "legacy infrastructure" places successful open source
software into its commercial, historical and IT engineering context.
The rule is not universally true, of course. There are some open
source products which are indeed state-of-the-art with research in their
fields and which are competitive with commercial products. Examples of
this would include projects such as valgrind (a form of software debugging
tool) and the Ogg Vorbis project, which continues to deliver world-class
media streaming codecs.
But such projects are the exception in the open-source world: frankly,
if an open-source team is working well together, developing and delivering
leading-edge software which others find valuable then that team should go
and form a company and take a shot at getting rich with it -- this is not
the space where open source licensing makes sense.
Let's look briefly at the resourcing for open source development.
- In the Linux kernel project, pretty well all of the main developers
are working fulltime for technology companies of one form or another. The
days of the bearded geek working in his basement purely for his own
satisfaction are long gone on such projects.
Companies pay staff engineers to work on open source software products
for several reasons.
- because they have a commercial interest in the quality of those products.
- so they have some leverage on the product's future feature set.
- so they have staff at hand who understand and can support the product.
- if they manufacture hardware: so the product supports that hardware well.
- Rather than directly hiring their own engineers, companies will
also fund open source development by entering into contractual
arrangements with other parties, mainly Linux distributors, for all of
the same reasons. The main distributors of Linux are employing a large
number of world-class engineers who work purely on open source products.
- Companies who contribute to open source projects place their engineers
into a peer-to-peer relationship with the rest of the development team,
thus gaining influence over the project and total visibility into the
development and planning processes.
All these good things do not come for free. They only really come
when the company (or, more specifically, individuals within that company)
become recognised contributors to the project.
It's a sad fact of life that if someone pops up out of nowhere with a
question or a contribution, they will have a hard time getting attention.
This is not because of spite. We're not saying "you haven't helped us
before so we're not talking to you". The reason why newcomers tend to
face barriers is derived from the open source trust model -- if a
contributor has a track record then we can take their changes with a
degree of comfort. But if we've never heard from the contributor before
then we basically need to go through their code line-by-line, and it's a
ton more work for already busy people.
One of the things which I do is to try to prevent such things from
simply falling on the floor. If someone is having procedural or process
problems with the kernel team then they should contact me directly and I
can generally offer advice, grease wheels, make things happen, kick heads
or whatever else needs doing.
- Companies contribute engineering resources to open source projects for
two strategic reasons:
- Firstly: resource pooling. Maintaining an entire OS is expensive,
but with open source you get to pool development resources with the
other users of the product while retaining many of the benefits of an
in-house development project.
- And the second main reason why companies contribute to open source
is to avoid vendor lock-in. One way to obtain your low-level software is
to simply license it from another IT vendor, and the cost of this could
well be similar to the cost of using and contributing to an open source
equivalent. But with open source you get full access to all the
technology, access to the product's key developers, full rights to
modify the product if you need to do so, and good visibility into the
product's roadmap. In fact, you can to some extent
control that roadmap if you're prepared to put appropriate resourcing
into it.
So the one-sentence summary of open source from a technology businessperson's
point of view would be: a source of legacy infrastructure software whose
development is cost-optimised via resource pooling and which naturally
provides protection against vendor lock-in.
How do new features find their way into "legacy infrastructure" open source
projects such as the Linux kernel? In other words, what is the requirements
analysis and planning process?
- First up, with a legacy project, the feature set tends to be well
understood.
We're implementing 30-year-old technology, so we're working to all that
prior understanding of how these things should traditionally operate. This
removes a lot of uncertainty from the design process.
And to a large extent we're strongly guided by well-established standards:
POSIX, IEEE, IETF, PCI, various hardware specs, etc. There's little room
for controversy here.
- Generally, new features are small (less than one person-year) and can be
handled by one or two individuals. This is especially true of the kernel
which, although a huge project, is really an agglomeration of thousands of
small projects. Linus has always been fanatical about maintaining the
quality and sanity of interfaces between subsystems, and this stands us in
good stead when adding new components.
This agglomeration of many small subsystems fits well into the
disconnected, distributed development team model which we use.
If the project were a large greenfield thing, such as, say, an integrated
security system for the whole of San Jose airport, then open source
development methodologies would, I suspect, simply come undone: the amount
of up-front planning and the team and schedule coordination to deliver such
a greenfield product is much higher than with "legacy infrastructure"
products.
The resourcing of projects in the open source "legacy infrastructure" world is
interesting. We find that the assignment of engineering resources to feature
work is very much self-levelling, in that if someone out there has sufficient
need for a new feature, then they will put the financial and engineering
resources into its development. And if nobody ends up contributing a
particular feature, well, it turns out that nobody really wanted the feature
anyway, so we don't want it in the kernel. The system is quite
self-correcting in that regard.
Of course, the same happens in conventional commercial software development:
if management keeps on putting engineers onto features which nobody actually
wants then they won't be in management for very long. One hopes. But in the
open source world we really do spend zero time being concerned with programmer
resource allocation issues -- the top-level kernel developers never sit around
a table deciding which features deserve our finite engineering resources for
the next financial year. Either features come at us or they do not. We just
don't get involved at that level.
And this works. Again, because of the nature of the product: a bundle of
well-specified and relatively decoupled features. If one day we decided that
we needed to undertake a massive rewrite of major subsystems which required 15
person years of effort then yes, we'd have a big management problem. But that
doesn't happen with "legacy infrastructure" projects.
Development processes and workflow
- All work is performed via email. Preferably on public mailing lists so a
record of discussions is available on the various web archives. I dislike
private design discussions because they cut people out of the loop, reduce
the opportunity for others to correct mistakes, and you just end up repeating
yourself when the end product of the discussion comes out.
- Internet messaging via the IRC system is used a little bit, but nothing
serious happens there -- for a start it's unarchived so for the previously
mentioned reasons I and others tend to chop IRC design discussions off and
ask that they be taken to email.
- We never ever use phone conferences.
- The emphasis upon email is, incidentally, a great leveller for people who
are not comfortable with English -- they can take as much time as they need
understanding and composing communications with the rest of the team.
- Contributors send their code submissions as source code patches to the
relevant mailing lists for review and testing by other developers. The
review process is very important. Especially to top-level maintainers such
as myself. I don't understand the whole kernel and I don't have the time or
expertise to go through every patch. But I very much like to see that
someone I trust has given a patch a good look-over.
- When a patch has passed the review process it will be merged into one of
the many kernel development trees out there. The USB tree, the SCSI tree,
the ia64 tree, the audio driver tree, etc. Each one of these trees has a
single top-level maintainer.
I run an uber-tree called the "mm kernels" which integrates the latest
version of Linus's tree with all the other top-level trees (32 at the last
count). On top of that I add all the patches which I've collected from
various other people or have written myself -- this ranges from 200 to 700
extra patches. I bundle the whole lot together and push it out for testing
maybe twice a week.
When we're confident that a particular set of patches has had sufficient
test and review we will push that down into Linus's tree, which is the core
public kernel, at kernel.org.
- Vendors such as Red Hat and Suse will occasionally take a kernel.org
kernel and will add various fixes and features. They will go through a
several month QA cycle and will then release the final output as their
production kernel.
- The preferred form of bug reports from testers is an email to the relevant
mailing list. We go through a diagnosis and resolution process on the
public list, hopefully resulting in a fix. This whole process follows a
many-to-many model: everyone gets to see the problem resolution in progress
and people can and do chip in with additional suggestions and insights.
This process turns out to be quite effective.
- We do have a formal web-based kernel bug reporting system, using bugzilla.
But the bugzilla process is one-to-one rather than many-to-many and ends up
being much less effective because of this. I screen all bugzilla entries
and usually I'll bounce them directly up to email if I think the problem
needs attention.
The mailing lists are high volume and it does take some time to follow them.
But if a company wishes their engineering staff to become as effective as
possible with the open source products, reading and contributing to the
development lists is an important function and engineer time should be set
aside for this.
The new kernel development model
People may have heard about this, and it does significantly affect consumers
of the public kernel.
In previous kernel release cycles over the past ten years we've followed a
model of a 2-, 3- or 4-year development cycle, after which the kernel is
declared "stable". Linus will hand that stable kernel off to a lieutenant and
Linus will then fork off a new unstable kernel for ongoing development. The
forked-off stable kernel is the prime source base for kernel consumers for the
next several years and we are very cautious and conservative about what
changes are made to it. This had the downside that the stable kernel tended
to lag in features and device support, so vendors ended up adding a lot of
their own patches before releasing that kernel in distributions. It had the
upside that the codebase was as reliable as we could make it.
In July of this year we tossed all that out the window, because we'd
discovered that we could keep a massive rate of change flowing into the tree
without destabilising it too much. This is partly because we've improved our
processes and tools (the adoption of the bitkeeper revision control system
helped here). It is also a reflection of the increasing number and skill of
the kernel development team. It is also a reflection of the increased quality
of the kernel itself: the need for massive destabilising rewrites which break
the whole world simply isn't there any more, so we don't need those long
development cycles.
So we're currently running on a roughly two-month development cycle. We pack
a whole lot of new features and enhancements into each release, but they're
months apart rather than years apart. So the kernel continues to make good
progress, but at a steadier rate.
This does mean that the production kernel is not as bug-free as it would be if
we were concentrating only upon stability, but we do continue to modify these
new processes and the kernel's quality does continue to improve even as we add
new features to it.