SCO's Chris Sontag said, at the Harvard appearance, that despite Linus' claiming authorship of ABI files such as errno.h and stating he didn't refer to UNIX when writing them, he, Sontag, still had issues with those files. So Groklaw member Warren Toomey from the Unix Heritage Society has done some work digging up a bit more on errno.h. I'm sure it will convince the reasonable folks at SCO that they are barking up the wrong tree. Or it won't. But the rest of you may read it and reach your own conclusions. This is only the first in what will be several articles on the ABI files by Dr. Toomey.
****************************************************
The ABI Files: Errno.h
~ by Warren Toomey, the Unix Heritage Society
Introduction
In SCO's REVISED SUPPLEMENTAL RESPONSE
TO DEFENDANT'S FIRST AND SECOND SET OF INTERROGATORIES, their supplemental
response to interrogatory 12 states:
SCO objects to this question as overly broad and unduly burdensome,
and on the basis that it seeks information neither relevant nor calculated
to reasonably lead to the discovery of admissible evidence insofar
as it requests the identity of source code and other material in Linux
contributed to Linux by parties other than IBM or Sequent. Subject
to and without waiving these objections, as it pertains to SCO's rights
involving IBM's contributions to Linux, SCO has set forth that information
in response to Interrogatories Nos. 1 and 9 and the corresponding
exhibits. As to others who have violated the terms of their Software
and Sublicensing Agreements, that information is contained in Exhibits
A through C. Specifically, in Exhibit A, it details the line-for-line
copying of UNIX System V code that improperly appears in Linux. Similarly,
in Exhibit B, SCO identifies the application binary interfaces ("ABIs")
that SCO has rights to that are improperly in Linux. Specifically,
in 1992, Unix Systems Laboratories (USL), SCO's predecessor in interest,
sued Berkeley Software Design, Inc. (BSD) for, among other things,
copyright infringement. One of the bases of that action was BSD's
copying and distributing some USL UNIX System V files without proper
permission or attribution. The confidential Settlement Agreement that
ended the Unix Systems Laboratories, Inc. v. Berkeley Software Design,
Inc., litigation required BSD to change the copyright information
in certain of these files, including the nine files listed in Exhibit
B. To SCO's knowledge, BSD complied with the terms of the Agreement,
and gave USL the proper attribution, as also set forth in Exhibit
B. At a later time, persons as yet unknown copied these files into
Linux, erasing the USL copyright attribution in the process. The files
in Linux that improperly use the ABIs are as follows [list omitted]:
SCO asserts that "line-for-line copying of UNIX System V code
. . . improperly appears in Linux" and that "persons as yet unknown
copied these files into Linux, erasing the USL copyright attribution
in the process". This report looks at SCO's assertion of direct
copying of System V code into Linux with copyright removal and compares
it with the assertion from Linus Torvalds that the code in question
came from another source. This report examines only the ABI file errno.h.
The errno.h file in all Unix and Unix-like systems (and in
many other non-Unix systems) is a list of possible errors that can
be returned to an application program when it asks the operating system
to perform a task, known as a "system call", and that task cannot
proceed normally. Some system calls fail for lack of permissions,
others for a temporary lack of resources, and others because the
application program gave the operating system an invalid request.
Many systems share a common list of errors, and this list of errors
is defined by the POSIX
standard and also the Single UNIX standard. As these are both open
standards, SCO cannot claim any copyright on the list of error names.
However, each error must have a unique number, so that the operating
system can communicate the error number back to the application program.
For example, the error "operation not permitted" (known as EPERM
in the POSIX standard) might be given the value 1 in a specific Unix
or Unix-like system. The actual value for each error is not defined
by the POSIX standard, but if systems do use a consistent error numbering
scheme, then executable binaries from one system can run on other
systems and understand the errors that the other systems report.
The choice of numbers and how they are allotted to the errors is
arbitrary and without 'expressive content', so the mere fact of which
number goes with which error cannot normally be copyrighted.
To have a valid assertion that "line-for-line copying of UNIX
System V code . . . improperly appears in Linux" for errno.h,
SCO needs to demonstrate that error names, their numeric values, and
any associated program comments were directly copied from System V
to Linux.
Errno.h in Linux 0.01 to 0.96c
Linus Torvalds released version 0.01 of the Linux kernel source around
the "middle of [19]91",
and this includes the kernel file linux/include/errno.h.
SCO asserts that this file was copied from System V source code as noted
above. Linus
and others, on the other hand, assert that the file "was taken from Minix".
Let's examine Linus' assertion and then SCO's assertion.
Linus believes that he used the error definitions in Minix to construct
the errno.h file in Linux 0.01. The Minix operating system,
version 1.1, was released by Andy Tanenbaum and Prentice-Hall around
1987 as a book and an accompanying set of floppy disks. Subsequent
releases quickly followed: 1.2 around 1988, 1.3 in 1988, 1.4 in January
1989, 1.5.0 in November 1989 and 1.5.10 in May 1990.
Minix 1.6 was developed in-house after 1.5, then released to beta-testers
in October 1992.
It was followed up by Minix 1.7.1 in November 1995.
If Linus did use Minix to construct the errno.h file, then
it would have been based on the file from Minix 1.5.10.
The early versions of Minix (1.1
to 1.4) had a very plain errno.h file: no copyright notice,
no comment header, no comment for each definition. Minix 1.5 was a
significant rewrite; although there is no copyright notice, the 1.5.10 errno.h
file contains a comment header and comments for each error definition.
More importantly, each definition's value is wrapped with a _SIGN
macro to convert from negative numbers in the Minix kernel to the
positive numbers used by the applications.
The earliest errno.h
from Linux 0.01 has this comment:
/*
* ok, as I hadn't got any other source of information about
* possible error numbers, I was forced to use the same numbers
* as minix.
* Hopefully these are posix or something. I wouldn't know (and posix
* isn't telling me - they want $$$ for their f***ing standard).
*
* We don't use the _SIGN cludge of minix, so kernel returns must
* see to the sign by themselves.
*
* NOTE! Remember to change strerror() if you change this file!
*/
Taking this comment at face value, it gives the impression that Linus
did in fact use the errno.h file from Minix to construct
the Linux 0.01 errno.h file. But do the error names and values
match up? By stripping the Minix _SIGN macro and the error comments
away using:
grep define Minix/1.5/errno.h | sed 's/(_SIGN//;s/).*//'
and comparing the results with the errno.h file from the Linux 0.01 kernel,
we see that every error definition has the same name and value. The
definition of the external variable errno is also identical
between the files:
extern int errno;
Thus there is significant evidence that Linus did refer to the Minix 1.5.10
errno.h file to produce Linux 0.01 errno.h. Let's now examine
SCO's assertion that the errno.h file from Linux originated
in System V.
Obtaining a copy of System V (binaries or otherwise) has proved to
be difficult. However, I have been able to obtain a copy of System
V errno.h with this copyright notice:
/* Copyright (c) 1984, 1986, 1987, 1988, 1989, 1990 AT&T */
/* All Rights Reserved */
/* THIS IS UNPUBLISHED PROPRIETARY SOURCE CODE OF AT&T */
/* The copyright notice above does not evidence any */
/* actual or intended publication of such source code. */
#ifndef _SYS_ERRNO_H
#define _SYS_ERRNO_H
#ident "@(#)/usr/include/sys/errno.h.sl 1.1 4.0 10/15/90 58840 AT&T-SF"
/*
* PROPRIETARY NOTICE (Combined)
*
* This source code is unpublished proprietary information
* constituting, or derived under license from AT&T's Unix(r) System V.
* In addition, portions of such source code were derived from Berkeley
* 4.3 BSD under license from the Regents of the University of
* California.
*
*
*
* Copyright Notice
*
* Notice of copyright on this source code product does not indicate
* publication.
*
* (c) 1986,1987,1988,1989 Sun Microsystems, Inc.
* (c) 1983,1984,1985,1986,1987,1988,1989 AT&T.
* All rights reserved.
*/
The file, with its October 15, 1990 ident stamp, is roughly
contemporaneous with the first release of Linux. It is interesting
that the file carries a combined proprietary notice that names both
AT&T and the Regents of the University of California.
The System V errno.h file is wrapped by the C-preprocessor
defines
#ifndef _SYS_ERRNO_H
#define _SYS_ERRNO_H
...
#endif /* _SYS_ERRNO_H */
which is similar to Linus' file, but which is also standard C practice
to prevent a header file from being included twice into a C program.
The System V errno.h file does not have a definition of the
errno variable, unlike the Linux and Minix files. Each error
definition also has a comment, as does the Minix file, but there are
several differences between the System V and Minix comments:
| Error | System V Comment | Minix 1.5.10 Comment |
| EPERM | Not super-user | operation not permitted |
| EBADF | Bad file number | bad file descriptor |
| ECHILD | No children | no child process |
| EAGAIN | No more processes | resource temporarily unavailable |
| ENOMEM | Not enough core | not enough space |
| ENOTBLK | Block device required | Extension: not a block special file |
| EBUSY | Mount device busy | resource busy |
| EXDEV | Cross-device link | improper link |
| ENFILE | File table overflow | too many open files in system |
| ENOTTY | Not a typewriter | inappropriate I/O control operation |
| ETXTBSY | Text file busy | no longer used |
There are several more examples of different comments. This indicates
that the Minix 1.5.10 errno.h file did not come directly
from System V, and the earlier versions of Minix errno.h
did not have comments.
Returning to the Linux 0.01 & System V comparison, the error names
and values are identical from EPERM up to ERANGE, but then the equivalence
breaks down:
| Error | System V Value | Minix 1.5.10 Value | Linux 0.01 Value |
| ERANGE | 34 | 34 | 34 |
| EDEADLK | 45 | no value | 35 |
| ENAMETOOLONG | 78 | no value | 36 |
| ENOLCK | 46 | no value | 37 |
| ENOSYS | 89 | no value | 38 |
| ENOTEMPTY | 93 | no value | 39 |
The simplest explanation here is that Linus borrowed error names and
values from Minix from EPERM up to ERANGE, but Minix did not define
errors 35 onwards. As new errors were required in Linux, these were
added on an as-required basis, and so the numbers 35 to 39 were allocated.
The difference in numbering between Linux 0.01 and System V supports
the assertion that Linux 0.01 errno.h came from Minix 1.5.10
and not from System V.
Errno.h in Linux 0.97 Onward
The errno.h file in Linux does not change substantially from
0.01 to 0.96c of the kernel. The definition of ERROR is removed, and
three new errors are defined: ELOOP as 40, ERESTARTSYS as 512 and
ERESTARTNOINTR as 513.
However, from Linux version 0.97 the file (timestamped July 26 1992)
changes significantly. In fact, this has been the only significant change to
errno.h, and it remains essentially unchanged from 0.97 through to
the 2.4.18 Linux kernel.
In the new 0.97 errno.h file, the header comment about Minix _SIGN
and the POSIX standard is removed, errors now have comments, and error numbers
go from 1 up to 121 (then 512 and 513):
#ifndef _LINUX_ERRNO_H
#define _LINUX_ERRNO_H
#define EPERM 1 /* Operation not permitted */
#define ENOENT 2 /* No such file or directory */
#define ESRCH 3 /* No such process */
#define EINTR 4 /* Interrupted system call */
#define EIO 5 /* I/O error */
#define ENXIO 6 /* No such device or address */
#define E2BIG 7 /* Arg list too long */
#define ENOEXEC 8 /* Exec format error */
#define EBADF 9 /* Bad file number */
#define ECHILD 10 /* No child processes */
...
#define ENAVAIL 119 /* No XENIX semaphores available */
#define EISNAM 120 /* Is a named type file */
#define EREMOTEIO 121 /* Remote I/O error */
/* Should never be seen by user programs */
#define ERESTARTSYS 512
#define ERESTARTNOINTR 513
#endif
The large number of new errors, and the fact that this predates
Minix 1.6, strongly suggests that this new file was not derived from
Minix. Was it directly derived from System V? Again, the evidence
does not suggest so. From error number 35 onwards, the System
V and the Linux 0.97 files use different numbers for the same error
names. Linux 0.97 has 121 errors; System V has 151 errors. While some
error comments are identical apart from letter case, many error comments
are different.
Where did the errno.h file for Linux 0.97 originate? The
members of the Linux Kernel Archive mailing list
searched for the origins of the file, and after some analysis, Linus
Torvalds came to the conclusion that the errno.h file was
automatically generated
from the release of the libc-2.2.2 library
that was part of the Gnu C compiler 2.2.2 for Linux (released on July 19 1992).
Linus shows that
"I can re-create _exactly_ the linux-0.97 "errno.h"
file by using the "sys_errlist[]" contents
from "libc-2.2.2". In particular, [a] trivial
[C program] will generate the exact (byte-for-byte) list that
is in the kernel". Importantly, the regularity of the spacing within
the 0.97 errno.h file strongly supports the idea that the
file was not written by hand.
The file string/errlist.c from the libc-2.2.2 library has
no copyright notice, and begins thus:
#include
#include
#include
/* This is a list of all known signal numbers. */
CONST char *CONST sys_errlist[] = {
"Unknown error", /* 0 */
"Operation not permitted", /* EPERM */
"No such file or directory", /* ENOENT */
"No such process", /* ESRCH */
"Interrupted system call", /* EINTR */
"I/O error", /* EIO */
"No such device or address", /* ENXIO */
"Arg list too long", /* E2BIG */
"Exec format error", /* ENOEXEC */
From all of this analysis, I conclude that the errno.h
file in Linux was not copied directly from UNIX System V. Early versions
of the file were derived from the Minix source code, and the version
of errno.h from Linux 0.97 onwards originated from a file
distributed with the Gnu C compiler 2.2.2 for Linux.
Regardless of the origins of the errno.h files in Minix and
the Gnu C compiler, it cannot be asserted, in my opinion, that Linus Torvalds or some
other person directly copied a System V ABI file into Linux. Nor can
it be asserted that Linus Torvalds or some other person removed a
copyright notice from a file when the Linux errno.h file
was constructed: it has been shown that neither the Minix errno.h
files nor the libc-2.2.2 errlist.c file contained copyright
notices.
I'll end with a short comment on this assertion by SCO in their supplemental
response:
In 1992, Unix Systems Laboratories (USL), SCO's predecessor in interest,
sued Berkeley Software Design, Inc. (BSD) for, among other things,
copyright infringement. One of the bases of that action was BSD's copying
and distributing some USL UNIX System V files without proper permission or
attribution.
Firstly, Berkeley Software Design, Inc. is known by the acronym BSDi; the
acronym BSD is reserved for the distributions of code that were released
by the University of California, Berkeley. Secondly, SCO is completely
wrong when they assert that infringing distribution of System V code was
one of the bases of the lawsuit. In fact, nowhere in the court
papers is System V even mentioned, except as a product that USL sells.
All mention of copyright infringement in the USL vs. BSDi lawsuit
relates to the 32V distribution from USL. For this reason, I consciously
decided not to discuss BSD code in this article; that's a whole topic
for later consideration.
Coming up next: a look at signal.h and the other ABI files.