Signal.h -- Part 2 of Warren Toomey's look at the ABI Files
Monday, March 01 2004 @ 05:30 AM EST

Here is UNIX Heritage Society's Warren Toomey's second article on the ABI files, as promised. The first, as you recall, looked at errno.h. Now it's time, he writes, "to turn our attention to SCO's assertion that signal.h was one of the files involved in the 'line-for-line copying of UNIX System V code'" that SCO alleges "improperly appears in Linux".

To determine if their accusation is well-founded or not, we need to understand what signal.h is, what's in it, and a bit of its history.

It's important to point out that there are two versions of signal.h in most versions of UNIX (/usr/include/signal.h and /usr/include/sys/signal.h), and as yet -- to the best of our collective knowledge -- SCO Group has not specified which, if either, is the file they claim has been improperly copied. The same is true of errno.h.

We have yet to see SCO list any "UNIX Derived Files" publicly, for that matter. The files SCO mentions in their Revised Supplemental Responses to IBM's 1st and 2nd Set of Interrogatories are all from AIX, Dynix and Linux, although on page 59 it references an Exhibit A that SCO says lists them. However, Exhibit A is not attached to the publicly available Revised Supplemental Responses, at least not yet. SCO has referenced UNIX files being attached to letters sent to their "dear Unix licensees" ("A complete listing of the UNIX Derived Files is attached"), but so far we have not heard of anyone actually getting the attachment with the letter. In Red Hat's most recent filing, they include the letter to Lehman Brothers, which also references the attachment, but again, there is no such attachment in the public court filing.

Has anyone who got a letter from SCO received this attachment listing "UNIX Derived Files"?

**************************************************

Signal.h
~ by Warren Toomey

Introduction

Following on from my report into errno.h in Linux, it's time to turn our attention to SCO's assertion that signal.h was one of the files involved in the "line-for-line copying of UNIX System V code [which] improperly appears in Linux" and that "persons as yet unknown copied these files into Linux, erasing the USL copyright attribution in the process".

In Unix and Unix-like systems, the underlying operating system can send a message to a running program to inform it of some exceptional event: a signal. The program's execution is diverted to a signal handler which deals with the event, before returning the program to what it was originally doing.

The sort of events that can occur are numerous: access to an 'out of bounds' area of memory, a divide by zero operation, a signal to stop executing from the user, etc. For (nearly) each signal type on the system, a running program can decide to ignore the signal, catch the signal and deal with it, or simply let the default Unix behaviour happen for that signal type. Most signals, if uncaught, result in the program being terminated, and the SIGKILL signal can never be caught: it is the "terminate with extreme prejudice" signal in Unix.

To have a valid assertion that "line-for-line copying of UNIX System V code . . . improperly appears in Linux" for signal.h, SCO needs to demonstrate that the signal names, their numeric values, any associated program comments and other function definitions could only have been directly copied from System V to Linux, and from nowhere else. Our job here is to track down the origins of signal.h in Linux.

What's in Signal.h?

What's in a typical signal.h file on most Unix or Unix-like systems? First of all, there is a set of defined signal names, their values, and possibly a C comment describing the signal. Systems which comply with the POSIX standard need to define about 28 signal names and associated numeric values; the values are not defined by the POSIX standard, but nearly every Unix and Unix-like system uses the same numbering scheme.

The earliest version of the signal name/numbering scheme still in existence is the nsys/param.h file from the 3rd Edition of UNIX in August 1973, with 12 defined signals. As Unix grew, so too did the number of signals, and by the 7th Edition of UNIX and the 32V distribution in 1979, the file now called signal.h had 15 signals.

By the end of the 1970s, there were already Unix clones like Idris and Coherent, and of course they also had to enumerate the set of signals. Not surprisingly, they followed the same numbering convention as Unix, as is shown by this file from Idris in 1978, where nearly all of the names and numbers are derived from 6th Edition UNIX.

This sort of code "cloning" is exactly the thing that seems to make SCO see red. However, when AT&T asked Dennis Ritchie (one of the developers of Unix) to visit Coherent's makers [first link] and determine whether the Mark Williams Company had relied on Unix code when they wrote Coherent, Dennis determined that he "couldn't find anything that was copied", and that "what they generated was [...] reproducible from the [Unix] manual". It must be remembered that the manual pages for Unix were published and publicly available; in fact, each new version of Unix was known by the edition of the printed manuals.

Dennis goes on to indicate that AT&T "backed off, possibly after other thinking and investigation [... and] so far as I know, after that MWC and Coherent were free to offer their system and allow it to succeed or fail in the market". This decision and others like it, together with the publicly available enumeration of the signal values, allowed the Unix signal numbers to be used in many Unix clones and non-Unix systems such as:

The list is probably endless; hyperlinks to other examples of the Unix numbering in non-Unix systems can be posted as replies to this article.[1]

We've digressed from the topic of "What's in signal.h?" to observing that the contents of the original Unix file were copied with AT&T's knowledge as early as 1978. Let's get back to what is in a typical signal.h file.

Along with the list of signal types, there is a list of operations that a running program can do when a signal arrives. Typically:

  • SIG_IGN (usually 1): ignore the signal
  • SIG_DFL (usually 0): use the default system behaviour, and
  • have the program handle the signal.
There is no numeric definition for the program handling the signal itself. Instead, signal.h defines a prototype for the signal() function. This system function takes two arguments: the signal number to catch, and the name of a program-specified function that will catch it. This program-specific function must receive an integer (the number of the signal that has arrived) but not return any value. These days, example definitions of the program-specific function and the signal() function might look like:

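   typedef void __sighandler_t __P((int));   /* a function taking an int, returning nothing */
   typedef __sighandler_t *sig_t;            /* a pointer to such a function */
   sig_t signal(int sig, sig_t func);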

Earlier versions of signal.h often rolled both definitions into one line, giving an unreadable definition like:

   void (* signal(int sig, void (*func)(int)))(int);

The behaviour of signals and their handlers in Unix has changed dramatically over time, and now the whole signal system is mind-bogglingly complex. The POSIX standard lists many, many more type definitions and function definitions that must be found in modern signal.h files.

Signal.h in Linux 0.01

Linus Torvalds released version 0.01 of the Linux kernel source around the "middle of [19]91", and this includes the kernel file linux/include/signal.h. We have:

  • the usual #ifndef _SIGNAL_H, #define _SIGNAL_H ... #endif /* _SIGNAL_H */ combination to stop this file from being loaded into the compiler more than once;
  • an #include to bring in definitions of common C types;
  • definitions of two new C types:

    typedef int sig_atomic_t;
    typedef unsigned int sigset_t;          /* 32 bits */
    
    
  • the list of 22 signal names and values, plus a definition that NSIG (the number of signals) equals 32;
  • definitions for SIG_IGN and SIG_DFL:

    #define SIG_DFL   ((void (*)(int))0)   /* default signal handling */
    #define SIG_IGN   ((void (*)(int))1)   /* ignore signal */
    
    
  • a definition of a structure called sigaction, which is used to set a handler for a specific signal:

    struct sigaction {
         void (*sa_handler)(int);
         sigset_t sa_mask;
         int sa_flags;
    };
    
    
  • the definition of the names and values that tell sigprocmask() how to change the set of blocked signals:

    #define SIG_BLOCK          0    /* for blocking signals */ 
    #define SIG_UNBLOCK        1    /* for unblocking signals */ 
    #define SIG_SETMASK        2    /* for setting the signal mask */
    
    
  • finally, the definition of a bunch of C library functions that perform signal-related operations:

    void (*signal(int _sig, void (*_func)(int)))(int);
    int raise(int sig);
    int kill(pid_t pid, int sig);
    int sigaddset(sigset_t *mask, int signo);
    int sigdelset(sigset_t *mask, int signo);
    int sigemptyset(sigset_t *mask);
    int sigfillset(sigset_t *mask);
    int sigismember(sigset_t *mask, int signo); /* 1 - is, 0 - not, -1 error */
    int sigpending(sigset_t *set);
    int sigprocmask(int how, sigset_t *set, sigset_t *oldset);
    int sigsuspend(sigset_t *sigmask);
    int sigaction(int sig, struct sigaction *act, struct sigaction *oldact);
    
    

Linux 0.01 vs Minix 1.5.10

If you're still awake at this point, then you are doing well. What sources of information did Linus use when he wrote this file? We saw that with errno.h, the most likely source of information was Minix 1.5. The evidence below suggests that Minix 1.5.10's signal.h was the source of inspiration for Linux 0.01 signal.h:

  • the same protective #ifndef ..., #define ..., #endif around the file;
  • the same definition of sig_atomic_t and nearly the same definition of sigset_t, except that the latter is promoted to 32 bits in size with a comment on this promotion in Linux 0.01;
  • the same definition of 22 signal names and numbers;
  • the same definition of SIG_DFL and SIG_IGN; and
  • the same definition of the sigaction structure.
There are some differences though. The Minix 1.5.10 file defines the signal functions differently to Linux 0.01; in particular, the parameter names are different (_set vs. set, _oset becomes oldset etc.). The parameter names are really for decoration here, and serve no purpose to the compiler, so perhaps Linus was not so keen on the Minix parameter names.

One important difference is the different definitions of the signal() function:

void (*signal()) (); in Minix 1.5.10

void (*signal(int _sig, void (*_func)(int)))(int); in Linux 0.01

One possible clue here is Linus' comment in the file that he is "trying to keep headers POSIX". The POSIX standard defines the signal() function thus:

void (*signal(int, void (*)(int)))(int);
and Linus has followed the POSIX standard and also decorated his definition with parameter names.

Linux 0.01 vs System V R4

Let's now compare the Linux 0.01 signal.h file to the corresponding file /usr/include/sys/signal.h from the 1990 version of System V R4.0 for i386:

  • the same protective #ifndef ..., #define ..., #endif around the file, although the macro used is _SYS_SIGNAL_H not _SIGNAL_H;
  • no #include'd files;
  • sigset_t is defined as a structure containing an array of 4 unsigned longs called sigbits;
  • sig_atomic_t is not defined here, but it is defined in the /usr/ucbinclude/sys/signal.h file: obviously AT&T got it from the BSD distributions;
  • there are 31 signals numbered 1 to 31, with some different names to Linux 0.01:

    Number  Linux      System V
    7       SIGUNUSED  SIGEMT
    10      SIGUSR1    SIGBUS
    12      SIGUSR2    SIGSYS
    16      SIGSTKFLT  SIGUSR1
    17      SIGCHLD    SIGUSR2
    18      SIGCONT    SIGCHLD
    19      SIGSTOP    SIGPWR
    20      SIGTSTP    SIGWINCH
    21      SIGTTIN    SIGURG
    22      SIGTTOU    SIGIO

  • SIG_IGN and SIG_DFL are defined as per Linux but with the outside parentheses missing and no comments;
  • sigaction has an extra field: int sa_resv[2];
  • SIG_BLOCK, SIG_UNBLOCK, SIG_SETMASK: same values, no comments;
  • most of the C functions defined in Linux are defined in System V as C macros:

    #define sigmask(n)              ((unsigned long)1 sigbits[0])
    #define sigktou(ks,us)          ((us)->sigbits[0] = *(ks),  \
                                     (us)->sigbits[1] = 0,  \
                                     (us)->sigbits[2] = 0,  \
                                     (us)->sigbits[3] = 0)
    #endif /* !defined(_POSIX_SOURCE) */ 
    
    
I think it's pretty obvious that Linus neither had access to nor used System V source code to generate his 0.01 signal.h file.

Since Linux 0.01, the signal.h file has changed and expanded somewhat, but even the signal.h file from the Linux 2.4.22 distribution still bears little resemblance to the System V signal.h file; even a cursory inspection shows that the Minix 1.5.10 signal numbers are still used here.

Postscript: errno.h Proliferates

At the beginning I mentioned that, as early as 1978, the signal names and values from AT&T's original signal.h file had been used in other systems. The same is true for errno.h. Here is an example list that I put together in about 30 minutes of searching on Google:

  • pe7sys.h from the port of C-Kermit to the Idris system on the Perkin-Elmer 7000. Copyright attribution to Whitesmiths Ltd. in 1978.
  • errno.h from the Microsoft Quick C compiler. Copyright attribution to Microsoft Corporation.
  • tclErrno.h from the Tcl source. This has copyright attributions to the Regents of the University of California and Sun Microsystems, Inc.
  • need_errno.h from a package called RandomDan. This has copyright attributions to Microsoft Corporation and may have been derived from a Microsoft C compiler.
  • errno.h from FreeDOS. Copyright attribution to Borland International.
  • errno.h from the LIVSIX package. Copyright attribution to Motorola and others.
  • arch.h from the lwIP package. Copyright attribution to the Swedish Institute of Computer Science.
AT&T did not put copyright notices on the "ABI files" from 3rd Edition UNIX in 1973 up to and including the first release of System V in 1983. It makes you wonder, if Whitesmiths were putting copyright notices on their files in 1978, who really can claim copyright on the content of these files?


[1] These links are merely to demonstrate that the signal names and numbers have been used elsewhere. Copyright notices in the linked files should be observed. Copyrighted materials may not be used without the permission of the author.




Groklaw © Copyright 2003-2013 Pamela Jones.
All trademarks and copyrights on this page are owned by their respective owners.
Comments are owned by the individual posters.

PJ's articles are licensed under a Creative Commons License. ( Details )