Data Mining, Spectral Analysis, and All that Jazz

Friday, August 29 2003 @ 01:45 AM EDT

We're
not born knowing what spectral analysis is. So when SCO said that spectral analysis is
one of the methods they used to find "infringing" code, I had no idea
what they were talking about. When Sontag compared it to finding a
needle in a pile of needles, I figured it wasn't much use. And it turns
out, on further investigation, that my intuitive conclusion may be about
right, at least when it comes to mining software code for infringement
in this case.

An alert reader noticed something
interesting. One of Canopy Group's companies is called DataCrystal.
Could that be at least one of the three groups SCO hired to try to sort
through the code of UNIX System V and the Linux kernel? DataCrystal
does "advanced pattern recognition" and "AI systems". They actually claim to do a
great deal more besides. One of the things listed on their what-we-do
page is data mining. Presumably, that's what SCO wanted to do. And a
look at their About page indicates that if you are the RIAA, you
probably would want to have a company like DataCrystal to hunt down
pirates for you.

Another reader noticed this page about a DataCrystal, and he wondered
if it might be the same company. It isn't, because this DataCrystal is
the name of a project at USC, not a company. While I don't know if the
Canopy Group company DataCrystal was hired by SCO, or whether there is
any connection between the company and the project, it did make me
start to wonder about the field in general. If you really wanted to
know whether two piles of code had identical or similar code, could data
mining find out? And would matches be reliable for use in the way SCO
apparently is using them? Judging by the SCOForum demo, we might think
no. And we might be right.

I asked a
Groklaw resource person, a man who worked for over a decade doing basic and exploratory
research for the US DoD and the Canadian Ministry of Defence on topics related
to secure communications and signals intelligence, including cryptology,
statistical processing of natural language, signal processing, and
computational learning, if he'd be willing to explain it in general and understandable
terms, so we can follow along. Very likely this subject is going to be
a very significant part of the case when it goes to trial.

Here is what he explained to me:

Data mining is looking for
patterns or similarities in large quantities of information. Google is a
good example of data mining-on-demand where the pattern is supplied by
the user and the large quantity of information is the entire set of
webpages on the internet. But data mining in general is potentially much
broader. For example, a typical data-mining task might be to take a
sample document and look for other documents in a database that might be
similar to it. But even beyond that, data mining can be applied to other
kinds of data -- pictures, for example, or sound
recordings.
There are lots of different ways to approach
problems like this. Beyond the most elementary, what all the techniques
have in common is that they rely on mathematical models and
transformations of the data. Part of the reason is efficiency, since
turning the problem into math usually means there's a computationally
clever way to do it. Another part of the reason is that, by transforming
the problem into math, you make it possible to find and grade a
continuum of approximate matches -- in short, to find ranges of
similarities rather than just identities. Note very well that
'similarity' here is completely dependent on the particular flavor of
math you've chosen as your technique. This is extremely
important.
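
To make the point about "flavors of math" concrete, here is a minimal Python sketch. It is an illustration only, not anything SCO or DataCrystal is known to use. It grades the same pair of C fragments with two different measures: one that ignores token order and one that respects it. The helper names (`tokens`, `jaccard_tokens`, `sequence_ratio`) and both fragments are made up for this example.

```python
import re
from difflib import SequenceMatcher


def tokens(text):
    """Split C-like text into identifiers and single non-space characters."""
    return re.findall(r"[A-Za-z_]\w*|\S", text)


def jaccard_tokens(a, b):
    """Order-blind grade: shared distinct tokens / all distinct tokens."""
    sa, sb = set(tokens(a)), set(tokens(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0


def sequence_ratio(a, b):
    """Order-sensitive grade from difflib: 2 * matched / total tokens."""
    return SequenceMatcher(None, tokens(a), tokens(b)).ratio()


# Two made-up fragments: the same pieces, in the opposite order.
snippet_a = "for (i = 0; i < n; i++) total += buf[i];"
snippet_b = "total += buf[i]; for (i = 0; i < n; i++)"

set_grade = jaccard_tokens(snippet_a, snippet_b)
order_grade = sequence_ratio(snippet_a, snippet_b)
print(f"token-set grade:      {set_grade:.2f}")
print(f"token-sequence grade: {order_grade:.2f}")
```

The order-blind measure calls the two fragments a perfect match, while the order-sensitive one does not. Neither number is "the" similarity; each only reports what its own math was built to notice.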
OK, so you've taken your document or picture or
whatever, and you've mined your database for similar items. Those items
will be graded for similarity to your original, just as some search
engines will rate their returned items in terms of probable pertinence.
The most sophisticated and respectable data-mining systems will be using
grades based on probabilities. This is because the underlying math will
be using probability models. Many times the grade will reflect not
merely the strength of a match in terms of probability, but also the
likelihood that such a match would be found at random searching any old
data at all. This also is extremely important, since 'any old data at
all' can be subject to a wide range of interpretations. (This could be
pertinent in the SCO case, since, if data-mining techniques are used,
it's a reasonable question whether any contamination discovered this way
is real, or whether it's spurious, i.e., capable of being found to the same
degree in other, unrelated data.)
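
One way to picture that question is to score the same query twice: once against the suspect body of text and once against unrelated text. The following Python sketch assumes a deliberately crude measure (the fraction of the query's three-word windows that reappear), and every name and fragment in it is hypothetical. It is not a reconstruction of any tool used in the case.

```python
def shingles(text, k=3):
    """Return the set of k-word windows ("shingles") in whitespace-split text."""
    words = text.split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def coverage(query, corpus, k=3):
    """Fraction of the query's shingles that also occur in the corpus."""
    q = shingles(query, k)
    if not q:
        return 0.0
    return len(q & shingles(corpus, k)) / len(q)


def score_with_baseline(query, target, backgrounds, k=3):
    """Return (score against target, best score against unrelated data)."""
    target_score = coverage(query, target, k)
    baseline = max((coverage(query, bg, k) for bg in backgrounds), default=0.0)
    return target_score, baseline


# Made-up, pre-tokenized fragments; none of this is real kernel code.
query = "if ( p == NULL ) return -1 ;"
target = "int f ( char * p ) { if ( p == NULL ) return -1 ; return 0 ; }"
backgrounds = [
    "void g ( int * p ) { if ( p == NULL ) return -1 ; * p = 1 ; }",
    "x = y + z ;",
]

target_score, baseline = score_with_baseline(query, target, backgrounds)
print(f"target: {target_score:.2f}  unrelated baseline: {baseline:.2f}")
```

Here the query matches the target completely, but it matches one of the unrelated fragments just as well, because a null-pointer check is an idiom everybody writes. A high score means little until it is compared with what the same test finds in data that has no connection to the original.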
Now the DataCrystal webpage
consists mostly of a laundry list of any and all of the subjects ever
associated with data mining, artificial intelligence, knowledge
discovery, or machine learning. But the .pdf white papers all focus on
using data-mining techniques for indexing and retrieving digital video
and audio. What's more, they're offering not just indexing and retrieval
services, but also housing, protecting, and distributing the data
itself. It outlines an enhanced technique for expanding
data-mining coverage. It's a technique for building patterns out of
patterns and data mining on the derived metapatterns in
turn.

Not being a rocket scientist, I wanted to be sure I'd understood, so I
wrote back and asked these followup questions, and got this reply:

Q: I have two questions to follow up: 1. ...the results would depend on
how you programmed the software? In other words... it can look for
similarities, but it can't evaluate them?

ANS: Absolutely correct.

Q: ...there might be in actuality no common code at all?

ANS: You know how Google sometimes matches all the words in
your query, but not necessarily conjointly or in the same order?
In the case of computer code, especially code written in C expressing
similar or common algorithms, it would be astounding if there weren't
pattern similarities at some level. If nothing else such things are
enforced by the design of the language and commonly-held notions about
good coding style.

Q: ...it simply would have to be the case that some of the code is
close enough that they might have a case?

ANS: Just the contrary. As with the 1st slide example,
the ancestry of that memory-management code is known to virtually
anybody who's studied C from Kernighan and Ritchie's book. A similarity
like that would stand out like Devil's Tower, but what it indicates is
exactly the opposite of what they contend: it shows that everybody knows
the pattern.

Q: And can they program the math to increase "matches"? Pls. explain a
bit more this part.

ANS: Here's an example. Suppose you came up with a hitherto-unknown
page of blank verse. The question is, was it written by Shakespeare or
not? Data mining your way through that problem, you'd get one level of
certainty if your database contained the Bible, Goethe, Racine, Pushkin,
and the New York Times. You'd get a different level of certainty if your
database were confined to Elizabethan dramatists. The scores for
putative Shakespeare against the mixed database would probably be huge
just for matching any English. The scores against Elizabethan dramatists
would probably be quite a bit weaker, but clearly more
conclusive. The mixed-database test -- the one with the Bible, Goethe,
etc. -- will probably say 'Shakespeare indeed!' but it's
expressing the idea that 'if it's English it's Shakespeare.' On
the other hand, the Elizabethan dramatist test might say
yes, might say no, but the answer will be based on such
things as a small number of very subtle differences between,
say, Shakespeare's and Marlowe's vocabulary. It expresses
perhaps the idea that 'in any 1000-line chunk of Shakespeare
and any 1000-line chunk of Marlowe, Shakespeare is likely to
use the word "ope" once and Marlowe not at all. This example
doesn't use "ope" at all, therefore it's probably Marlowe.' You
can see it's still a matter of interpretation and
probability, but the second test is simply more credible on
grounds that are external to the data-mining method itself.

Here's another point of view. How does a data-mining
search for SVr4 code look if you run it against all C programs? In all
likelihood you're going to find some matches. Are the matches
against Linux actually any stronger than matches against an arbitrary
body of C code? Against other Unix-like kernels? etc. These are
interpretive issues, but there are statistical grounds for
deciding them, and speaking strictly for myself, I seriously doubt
they've been fielded satisfactorily. For my money you couldn't even
start taking the matter seriously unless exactly the same tests were run
against every other body of kernel code, such as all the BSDs, and with
a chunk of the SVr4 kernel against the rest of that same kernel. And
even then, you've only generated the raw information to start the
business of verifying and refining the procedure.

Q: Also, what is spectral analysis? Is that what this is?

ANS: No. In
general, spectral analysis refers to breaking things down into component
frequencies -- sort of like how a prism breaks white light into colors,
and so on. In this case it refers to using the periodicities of
the individual characters of program text as frequencies to look for a
very specific set of 'colors' associated with a particular swatch of
program code. It's not determinative either. It may also refer to a kind
of computational trick using spectral-based techniques to look for
certain kinds of approximate matches very quickly.
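
As a rough illustration of "periodicities of individual characters", the Python sketch below marks where one character occurs in a piece of text, runs a naive discrete Fourier transform over those marks, and reports the spacing of the strongest repetition. This is only one possible reading of the description above. The function names are hypothetical, the sample text is artificial, and nothing here is claimed to be what SCO's consultants actually did.

```python
import cmath


def indicator(text, ch):
    """1.0 where the character occurs, 0.0 elsewhere."""
    return [1.0 if c == ch else 0.0 for c in text]


def dft_magnitudes(signal):
    """Magnitudes of a naive O(n^2) discrete Fourier transform, bins 0..n//2."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n // 2 + 1)
    ]


def dominant_period(text, ch):
    """Spacing (in characters) of the strongest repetition of ch, or None."""
    mags = dft_magnitudes(indicator(text, ch))
    if len(mags) < 2:
        return None
    peak = max(mags[1:])
    if peak < 1e-9:
        return None  # character absent, so there is nothing to measure
    # Take the lowest-frequency bin that reaches the peak, so harmonics
    # of the same repetition do not win on rounding noise.
    k = next(i for i in range(1, len(mags)) if mags[i] >= peak * (1 - 1e-9))
    return len(text) / k


# Artificial text in which "a" recurs every fourth character.
sample = "abcd" * 8
print(dominant_period(sample, "a"))
```

A set of such periods, one per character, is a kind of fingerprint or "color" for a swatch of text. As the answer above says, finding the same colors elsewhere is not determinative; it only tells you where to look.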

So, there you have it. At least now we know in general what they are
talking about. As the case goes forward, and more is revealed, no doubt
it'll be interesting to follow along with some real understanding.

His analogy to Google made it all come clearer to me. On top of all that he
wrote, I know with Google, input affects output. And input means humans,
imperfect humans. I certainly know that I get different results from
Google if I plug in the identical search terms, but in a different
order, for example. So I totally get how results could be skewed by
what you tell the software to do. For example, I get different results
if I search for "Dave Farber" and IP than I do if I search for IP and
"Dave Farber", and it's different still if I search for just IP or just
"Dave Farber" or just Farber or just Farber and IP. And that's using
the same pile of data. Input affects output. Obviously they would argue
that their methods are so refined, blah blah. But that human element
can't be removed, because humans write the software, no matter how
sophisticated. So how reliable are the matches? You use Google. What do
you think? Doesn't a human at some point have to interpret the value of
the results? "A continuum of approximate matches" does not
prove infringement, on its face. As he says, it's an interpretive
issue. And data mining seems to be a better match with something like
matching amino acid strings than figuring out if someone stole
somebody's code, which requires knowing who has or doesn't have a valid
copyright, which way matching code travelled, who had the code first, etc. If I've understood
what my friend has written, it means that if SCO swapped out Linux and
searched Windows 2000 code instead, it'd likely find instances that
looked like "infringing code" also. That's the same as saying that so far,
they are holding maybe nothing. It all reinforces in my
mind that, once again, nothing has been proven to date by their claims
of similarity, derivative or obfuscated code matches, and nothing can be
proven using data mining techniques, until this case goes to trial and
the experts speak, followed by a decision by a judge.

If you are interested,
here is a white paper, "Text Mining -- Automated Analysis of Natural
Language Texts" that explains the process of searching just for simple
text, and while it does the explaining, it also shows just how much
human input goes into structuring your search before you begin the
search and why the results still may not be what you want. It is hard
to see how such techniques could answer the question: "Is this
infringing code?" At best, they could show you where to begin to
investigate. And here are the DataCrystal project's white papers.

Oh, one
other thing I found out in my investigation. Guess where most of the
cutting-edge brains working on such data-mining techniques work? . . .
No, really. Guess. . . That's right: at IBM.

Authored by: Anonymous on Friday, August 29 2003 @ 11:17 AM EDT
SCO have been unusually quiet today, indeed most of the week.
In the last comment's section, I can't see comments after Wesley Parish •
8/29/03; 2:26:37 AM. This appeared when there were about 145 or so comments (I
think), so we're missing about 15. Recommend people post new comments in this
topic instead!
Recommended reading for the day:
http://www.threenorth.com/sco/cohen.html
http://www.pclinuxonline.com/modules.php?name=Forums&file=viewtopic&topic=2033&forum=46
http://www.pclinuxonline.com/modules.php?name=Forums&file=viewtopic&topic=2027&forum=46
On the "was it or wasn't it a DoS?" question: somebody posted a link to a weekly view of
uptime at biz.yahoo.com (can't find the link right now) - it really was regular
as clockwork.
Incidentally SCO migrated off SCO Unix to Linux in 2002. Check it out! Maybe
customers should follow their example.
http://uptime.netcraft.com/up/graph/?site=sco.com&mode_u=on&mode_w=on&avg_days=30&submit=Redisplay+Graph
quatermass - SCO delenda est

Authored by: Anonymous on Friday, August 29 2003 @ 11:22 AM EDT
More reading
http://action.eff.org/action/index.asp?step=2&item=2775
Oh, yes here is the pattern for the alleged DoS. Look how regular. Surely looks
to me like they're just turning it off during non-business hours?
http://uptime.netcraft.com/up/performance?explain=0&mode_p=on&mode_u=off&mode_w=off&by=collector&errors=0&site=www.sco.com&site1=&sample=2&submit=Examine&range=5d&maxy=0
quatermass - SCO delenda est

Authored by: Anonymous on Friday, August 29 2003 @ 11:25 AM EDT
If the glove don't fit you must _______?
I see how the logic above fits with the above information and I see how it all
connects. Yep, I really do... however, how many people from an eligible jury
pool will begin to understand this? Hmmm, or will they go for the raw power
brought to a court room by an appearance of the celebrity lawyer (Boies).
Hmmm, if a jury is involved then a simple phrase that catches the ears of the
jury just right might win the case over all of this expert IP data! If the
glove don't fit, then Unix is owned by SCO. Hmmm, the last I heard OJ is free
and still playing golf in Florida.
So, as an insurance policy against a SCO court room victory in Utah (Microsoft,
and maybe SUN, will make sure that they have enough money to pay for the years
of lawyers' bills), think seriously about making a complaint/filing concerning
SCO, by advising your state attorney general about SCO's FUD and/or actual abuse
of you as an innocent 3rd party consumer by attempting, as posted on their web
site and in public interviews, to threaten you, as a paid-up user of LINUX, into
paying again and again for your use of your already paid-for LINUX. And while
filing your concerns with the state AG, don't forget to document the date(s) of
your previous and current LINUX acquisitions and/or downloads, as evidence of
your date(s) of possession and use, today.
Why should I maybe elect to contact my AG and document my LINUX today?
Because THIS IS A DO or DIE period of time for SCO UNIX and for SCO/Canopy
group! Any bet that they will not sue users?
Here is one point that most are not considering when planning their LINUX
roadmap into the future!
1. In order for SCO to prevent the state AGs from protecting users (via the laws
of agency), SCO first has to go through a legal procedure where they notify
absolutely everyone in the world that the current LINUX agents are not acting
with SCO's IP authority - and that these agents cannot sell LINUX under the
terms of the perpetual use GNU GPL anymore (this perpetual use understanding
overflows to mean that all future upgrades are perpetual as well)! SCO has not
made moves to make this notification legal yet!
They will make this notification - they have to (if they don't, they will not be
able to collect any money from any LINUX user = their goal)! So, after SCO does
make legal notification to the public, any LINUX
acquisition or download that happens after that date could be seen by the courts
as being after the fact (after SCO has made the notification to users legal, as
seen by the court). Being seen as getting your LINUX after this SCO notification
date may put you in the way of SCO's harm.
The attorney general's office can document your complaint about SCO and also
document the date(s) when, previous to any SCO legal notification to stop actions of
agents, YOU had legal possession of the LINUX IP product (you are
then protected, as the terms and conditions of your acquisition of LINUX
predate any SCO action). So, your rights are truly perpetual... and however
the IBM suit, the Red Hat suit or any SCO vs LINUX user suits go... the laws of
agency and your attorney general should protect you!
Contacting and documenting your LINUX possession status at the AG's office is
not rabid dog crazy, but maybe crazy like a fox (a simple prudent move that one
can make to cover the bases now rather than later). Again, please remember, the
last I knew... a "jury of our IP knowledgeable peers" will be hearing the case in
UTAH and a certain someone is still living freely in Florida!
annon

Authored by: Anonymous on Friday, August 29 2003 @ 11:29 AM EDT
Final corrected version of the article. Like
quatermass, I can't read end of comments on previous
article.
HEY, LINUX! WORK HARD AND YOU WILL BE SCO'D
The "new SCO" (formerly Caldera) is not the only member of the Canopy
group (http://www.canopy.com) still
distributing the Linux kernel.
Canopy portfolio member Linux Networx, despite the name incorporating
alleged software pirate Linus Torvalds's notorious trademark, has
distributed the latest Linux kernel but one, even after SCO made
its allegations, aggregates the kernel with the work of others,
and continues distributing the source and binaries for
that derived work today.
Now you might think that as a member of the Canopy portfolio, Linux
Networx would respect the claimed intellectual property of another
Canopy Group member. However, not even the Canopy group itself takes
SCO/Caldera's claims seriously. As recently as May 1st, 2003, Linux
Networx was uploading Linux source and binaries to the FTP site,
ftp://ftp.lnxi.com. Linux is still distributed
from that venue today.
Their customers are referred to this site in the white paper for their
Linux BIOS product,
http://www.linuxnetworx.com/products/linuxbios_white_paper.pdf
There, under the linuxbios directory
(ftp://ftp.lnxi.com/pub/linuxbios/kernel) we find
linux-2.4.20.tgz 5/1/2003 3:24:00 PM
Surprisingly, however, this file is not actually the Linux 2.4.20
kernel.
4ef3a43d8fa4d8166a8bdcadd4285f80 *linux-2.4.20.tgz
It turns out to be based on linux-2.4.20.tar.gz, a pristine kernel as
downloaded from the kernel.org distribution site, with two patches
applied. Both patches are included in the toplevel directory of the
new aggregate distribution being distributed by Linux Networx. They
are:
patch-2.4.21-pre4
and
linux-2.4.21-pre4.mtd-thayne_rc1.patch
Both patches seem to be commonly available on the net.
In addition, it contains the vmlinux binary and many build artifacts,
mostly ".o" files.
Linux 2.4.21 pre4 puts this kernel on the development branch
immediately preceding today's stable Linux kernel, 2.4.22. In
addition, by aggregating this allegedly infringing kernel with two
other derivative works, created by others, Linux Networx is itself
creating and distributing a derivative work, both binary and source.
They are doing this, however, without any notation of the changes they
have made in so doing as required by the GPL--though it is easy enough
to infer from the included patch files. They are (1) calling their
aggregate distribution Linux, and (2) distributing it under the same
version as a commonly available Linux kernel.
Now, naming and distributing a Linux 2.4.21pre4 kernel with the title
"2.4.20" is a bit sloppy. It also violates the GPL provision that
your changes must be noted and clearly labelled. So, in addition to
using Linus Torvalds's trademark, violating the GPL, *and* trespassing
egregiously on SCO's alleged copyright claims, all at the same time,
Canopy group members are distributing falsely labelled kernels.
As a respected and active member of the Linux community, the Canopy
group should disavow all association with SCO's actions. Or, if they
prefer not to be respected, they should unlink alleged software pirates
like TrollTech and Linux Networx from their own homepage. Or maybe they
should just go and f^Hsue themselves.
John Goodwin

Authored by: Anonymous on Friday, August 29 2003 @ 11:34 AM EDT
The threat is back, or is it? So many double negatives.
I can't imagine this being anything other than an attempt to parry Red Hat.
http://www.theinquirer.net/?article=11273
Blake Stowell, director of public relations at SCO, told the INQUIRER late
today: "Just because we aren’t “planning” to sue Linux companies doesn’t mean we
won’t. We tried to avoid suing Red Hat, but they seemed to bring the litigation
upon us, not us upon them. Also, just because we are saying that we won’t sue
Linux companies doesn’t mean that we won’t sue Linux customers".
quatermass - SCO delenda est

Authored by: Anonymous on Friday, August 29 2003 @ 11:38 AM EDT
Oh, and has the SCO warning to Linux customers disappeared from their site or
not? I can't find it now.
quatermass - SCO delenda est

Authored by: Anonymous on Friday, August 29 2003 @ 11:44 AM EDT
Everybody in the open source movement is angry at SCO, and SCO seems to rely
heavily on GPL'd code in their commercial Unix products (the SCO FAQ
specifically suggests using gcc as a compiler, for example, and Samba is a very
necessary part of their new releases). Just had me wondering, and BTW, this is
just pure conjecture- what would happen if the various open source projects
decided to modify the GPL for new releases- make it illegal for open source
software to be used in any commercial Unix release? I know- sort of against the
spirit of open source, but SCO is begging for something like this. Could
something like this ever happen?
wild bill

Authored by: Anonymous on Friday, August 29 2003 @ 12:02 PM EDT
More sexy than Lawrence Livermore, LinuxNetworx
render farm is behind movie "The Core". I wonder if
SCO 0wnz Hollywood now.
http://www.linuxnetworx.com/news/4.2.2003.32-Linux_Networx_R.html
John Goodwin

Authored by: Anonymous on Friday, August 29 2003 @ 12:13 PM EDT
It doesn't really matter. Let's see, I could build a computer like HAL, with
terabytes of memory, and employ teams of specialists, scientists, and engineers to
scour the Linux kernel for similarities to anything in existence. And, you know
what? There probably would be similarities - to a degree.
No number of fancy algorithms, or MIT experts, can prove beyond a doubt that code
was stolen specifically from SCO and put into Linux. It used to be that they
claimed this had been done by IBM, but when that fizzled out, they expanded it
to "Linux" and the dark forces surrounding it. Similarities or not, they have to
PROVE their claims - and that would be extremely hard to do.
Stephen Henry

Authored by: Anonymous on Friday, August 29 2003 @ 12:23 PM EDT
PJ, again another great article, please keep up the good work. I completely
understand and agree with the sections referring to common algorithms and
similar coding structures. It can't be avoided, you're limited by education and
hardware. Comments are a different story, especially lengthy ones, but I'd even
expect to see some similar, and possibly identical, one-liners come from
multiple independent programmers.
Tazer

Authored by: Anonymous on Friday, August 29 2003 @ 12:26 PM EDT
Wild Bill,
"what would happen if the various open source projects decided to modify the GPL
for new releases- make it illegal for open source software to be used in any
commercial Unix release"
As I understand it, the FSF is the only one that can modify the GPL. And, as
you note, restrictive clauses are very much avoided by the GPL authors.
Tom Cranbrook

Authored by: Anonymous on Friday, August 29 2003 @ 12:30 PM EDT
I think the picture is pretty clear - you can build a fantastic method for
detecting similarities between SCO-owned code and Linux but
a) your machine's sensitivity to similarity is human-controlled
b) even if you find similarity it doesn't imply tainted code in Linux.
Perhaps PJ could comment on the proof required for copyright infringement to be
upheld. I remember hearing a radio programme about it (UK law this would be)
some time ago - they said that basically there were 3 defenses to copyright
infringement cases
1) The similarity is not substantial (in this setting I guess "substantial"
would mean not passing the abstraction-filtration-whateveritwas test)
2) I had never seen his (work of art) when I made mine, so mine was
independent
3) We both got this idea from a common source (in a non-infringing way).
Is this right? It seems that both 1) and 3) are viable defences for whichever
Unix licensee is supposed to have contributed the tainted code.
Dr Drake

Authored by: Anonymous on Friday, August 29 2003 @ 12:33 PM EDT
I'd like to see something like, "License to use GPL software can be removed if
the FSF finds that the user(s) in question is damaging the reputation of GPLed
software, or attacking the GPL via their publicity efforts or the court
system."
And could everyone PLEASE learn to give short URLs? The comments system does
allow the use of HTML tags.
Alex Roston

Authored by: Anonymous on Friday, August 29 2003 @ 12:37 PM EDT
To people asking about withdrawal of usage rights to GPL software from SCO -
it's a matter of principle. Free software is free, even for asshole companies to
use (although they'd better be a bit careful about redistribution if they don't
want to stick to the GPL). Even though we strongly disagree with SCO's actions,
we still grant them the right to use GPL software. We show ourselves to be
better than them by not lowering ourselves to their level.
Dr Drake

Authored by: Anonymous on Friday, August 29 2003 @ 12:39 PM EDT
Great job, pj.
Not only does IBM have the data mining talent, they have the SVR4 code.
If they have the data mining talent, do they have data mining patents?
How will data mining and spectral analysis play to a jury? About like
DNA in the OJ trial.
I wonder if IBM has any good decompilers? Might be useful for determining
if there is Linux code in SCO's LKP.
Greg T Hill

Authored by: Anonymous on Friday, August 29 2003 @ 12:44 PM EDT
I would doubt that anything like data mining or 'spectral analysis' would play
a meaningful part in a trial. Data mining is certainly a way to find similarities
in a large corpus of code. But the courts have long-standing tests and
precedents to guide them to arriving at a determination of infringement. Such
tests involve _reading_ the code. I think the case (if it ever happens) will
hinge on the provenance of such code as may be considered infringing.
td
Thomas Downing

Authored by: Anonymous on Friday, August 29 2003 @ 12:51 PM EDT
Data Mining is all fine and good, but it's the genealogy of the matches that
helps make the proof...
When I did code reviews years ago, the key piece of information that was always
required was the revision history.
With it, our teams were able to resolve problems that could not be solved by
other means - because we had the history
on how all the moving parts were created, who did it, and how they did it.
SCO can employ all the gee-whiz tech they want, but at the end of the day,
someone has to look at it and make a
decision whether the match is worth anything. Sounds like SCO has not done their
homework again.
Paul Penrod

Authored by: Anonymous on Friday, August 29 2003 @ 12:53 PM EDT
I agree with td. SCO may claim to be using data mining and spectral analysis to
find the so-called infringing code, but what you would hear in a court room is a
description of the code itself. How they found the code is irrelevant. What
would matter is if they could find copied code that violated a contract. If so,
what difference would it make if they found that code by stumbling upon it, from
a tip, through spectral analysis, or the back of a cereal box. Once found, its
existence is the key, not how it was found.
Nick

Authored by: Anonymous on Friday, August 29 2003 @ 01:07 PM EDT
I can see why they call it spectral analysis. Having worked with RF and SP off
and on, the general idea is how close a match you get to something big. In
radio that's like something that transmits at 800MHz. Well, it doesn't always transmit
at exactly 800MHz, so when you search for frequencies +/- around 800MHz you get
the classic frequency matching curve which represents the amount of energy
around 800MHz.
The data mining technique must be similar to this. Ever google with a long
string. Works very poorly. But if you do a spectral analysis approach on a
chunk of code you will get code matches that are close. Like looking at a diff
of two versions of code with just some changes. There could be percentage score
associated with such a match based simply on amount not changed.
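
A minimal sketch of that "percentage not changed" score, assuming a plain line-by-line diff rather than any spectral method. The function name `percent_unchanged` and both code fragments are made up for illustration; the matching itself comes from Python's standard `difflib` module.

```python
from difflib import SequenceMatcher


def percent_unchanged(old, new):
    """Percentage of old's non-blank lines that survive, in order, in new."""
    old_lines = [ln.strip() for ln in old.splitlines() if ln.strip()]
    new_lines = [ln.strip() for ln in new.splitlines() if ln.strip()]
    if not old_lines:
        return 0.0
    matcher = SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    kept = sum(block.size for block in matcher.get_matching_blocks())
    return 100.0 * kept / len(old_lines)


# Two made-up versions of one function; a single line differs.
old_version = """
int sum(int *buf, int n) {
    int i, total = 0;
    for (i = 0; i < n; i++)
        total += buf[i];
    return total;
}
"""
new_version = """
int sum(int *buf, int n) {
    int i, total = 0;
    for (i = 0; i < n; i++)
        total += 2 * buf[i];
    return total;
}
"""
print(f"{percent_unchanged(old_version, new_version):.0f}% of lines unchanged")
```

A score like this says how much text two files share. It says nothing about which file came first or whether the shared lines are common knowledge, which is where the three questions below come in.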
OK for getting started, hardly proof. Even if you had a 100% match there are 3
questions you would have to ask before you find the smoking gun.
1. where did the code come from FIRST? Looks like SCO just said AH HA a match
from SysV is from us!
2. Is it general knowledge? An example is the malloc code.
3. Is it big enough to matter? A small chunk here or there is not enough.
Obviously this highlights a few things here.
1. SCO has been at this for YEARS. They didn't just have a falling out with IBM
and go and sue them.
2. The BIG CONCERN at SCO is NOT Linux in general, but how Linux is moving into
SCO's traditional market. Linux is ok as long as it's at universities, with hobbyists,
even in the server room. Once it shows up in multiprocessor machines, clusters,
64 bit machines, etc., there is no market for SCO. At least that's what SCO
thinks.
This explains the weird comments about "look how fast Linux has come" and the
like. That's why they are after 2.4 not 2.2 or 2.0.
BUT THEY CAN'T SAY THIS!! Why? Because that market for high end multiprocessors
does not = $3 Billion. SOOO sue everyone.
The old business model for this kind of stuff was charge the customer a LARGE
up-front fee, plus some 4 or 5 digit fee for yearly support. The customer pays
because he can't develop it for cheaper and can't find the stuff anywhere else.
Now along comes the same thing for FREE. (oh shit). SCO didn't adapt. They
still think it works like VAX/VMS or IBM AS/400. Those days are very LONG gone.
This suit is a desperate attempt by SCO.
BubbaCode

Authored by: Anonymous on Friday, August 29 2003 @ 01:14 PM EDT
Here's a URL that proves another Canopy company is using
MySQL for its "culturegrams" product. I drilled into
look at what they say about Finland, then broke my query
to get the error message.
http://onlineedition.culturegrams.com/world/world_country.php?contid=5&wmn=Europe&cn=Finland
John Goodwin

Authored by: Anonymous on Friday, August 29 2003 @ 01:14 PM EDT
Wow, pj. This site is stunning proof that investigative journalism is no longer
done at newspapers, but by people like you.
My own personal take on data mining and pattern analysis is this: while it may
be a useful technique for _discovering_ copyright violations (especially when
you don't really know what you're looking for), at this point I think it's fair
to say that it's absolutely useless for _proving_ them. If such a tool spits out
a "similarity score" or some such, what does that mean? Not much, in my
opinion.
Perhaps the best analogy would be DNA sequence comparisons, which are now, after
an extensive period of debate and scientific work, being admitted as evidence in
trials. But the key here is that an expert can give a pretty good explanation
about what DNA results mean. A judge or jury can use evidence such as
"statistically, there is less than a one in a billion chance that the blood
under the victim's fingernails came from a different individual than the
suspect". That expert testimony would be backed up by real science, involving
serious critical debate about what these probability estimates mean. I'm sure
someone of pj's research skills will have no trouble digging up info on all
this, but here's a quick URL I found anyway:
http://www.nap.edu/books/0309053951/html/194.html
By contrast, the science of data mining is still in its infancy. Based on the
way software is produced, you'd _expect_ to find statistical similarities in
entirely separate codebases. For one, the basic algorithms are taught in
textbooks that everybody reads (or should read). If you do find a statistical
similarity, how can you separate out the truly creative contribution from the
standard application of cookbook recipes? What's the noise level, in other words
the chance that two random excerpts of code will trigger a "statistical
similarity" check?
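One rough way to put a number on that noise level, sketched in Python: score every pair in a pool of snippets you already know to be unrelated, and see how often they clear whatever threshold the tool uses. The similarity measure here (difflib's character-level ratio), the function names, and the idea of a hand-picked pool are all assumptions for illustration, not a description of any real tool.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Character-level similarity between two snippets, from 0.0 to 1.0."""
    return SequenceMatcher(None, a, b).ratio()


def noise_level(unrelated_snippets, threshold):
    """Fraction of known-unrelated pairs that would still trigger a match."""
    pairs = list(combinations(unrelated_snippets, 2))
    if not pairs:
        raise ValueError("need at least two snippets")
    hits = sum(similarity(a, b) >= threshold for a, b in pairs)
    return hits / len(pairs)


if __name__ == "__main__":
    # Hypothetical pool: idioms that independent authors write all the time.
    pool = [
        "for (i = 0; i < n; i++) sum += a[i];",
        "for (j = 0; j < len; j++) total += buf[j];",
        "while (*s) s++;",
        "if (p == NULL) return -1;",
    ]
    print(noise_level(pool, 0.7))
```

If a large share of unrelated pairs clear the threshold, a hit between two real codebases at that threshold carries very little weight on its own.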
Even if pattern analysis were able to perfectly identify whether copying took place,
there are all the other relevant questions. Who copied from whom? Was there a
legal right to do so? I think the Berkeley Packet Filter portion of the
SCOForum presentation is particularly telling here. Sontag's assertion that it
was primarily a demo of the pattern analysis that they're doing is probably
right on target.
I might be willing to accept SCO's pattern analysis evidence in one scenario:
that they allow it to be used to analyze their proprietary code base to
determine how much "copying" there has been from free software projects, and
agree to make good any damages found in proportionate terms to the damages
they're seeking from IBM. If their confidence in the tool is so high, as well as
their confidence in their own lily-white processes for making sure there is no
improper copying, then they should have no problem agreeing to this. I'll leave
it to the Groklaw community to estimate the likelihood of this happening :)
In a trial, I'm sure any half-competent lawyer with access to half-competent
expert advice would be able to demolish evidence from data mining. I think IBM
is a worst-case adversary for SCOX in this sense. But, again, all this seems to
fit into the pattern of trying this case in the media rather than the courts.
Gullible journalists, analysts, and so on, are quite likely to be taken in by a
snow-job with fancy-schmancy scientific lingo. The best way to counter this is
probably to continue insisting that SCOX make clear, verifiable claims.
Raph Levien
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 01:15 PM EDT |
I agree with Steve above. Basically you have to get the offending
Engineer/coder and get him to admit "I had SCO code when I wrote ______. I
copied the SCO code to make______. I used SCO code as a reference to write
______." If they don't have this their case is weak. Bet you lots of money
they have a team working this issue now.
BubbaCode
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 01:15 PM EDT |
Dr Drake
IANAL, but defense 4) You are using it under a license from the copyright holder.
If there is any SCO code in Linux, it's certainly going to be argued they
licensed its use themselves under the GPL.
quatermass - SCO delenda est
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 01:27 PM EDT |
Sorry if this was posted before. Article on a third-party company too close to SCO
in the same data center getting blasted by the DoS attack on SCO.
http://eletters.eweek.com/zd/cts?d=79-180-2-3-22359-23149-1
BubbaCode
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 01:33 PM EDT |
> Just had me wondering, and BTW, this is just pure conjecture- what would
happen if the various open source projects decided to modify the GPL for new
releases- make it illegal for open source software to be used in any commercial
Unix release?
If all the copyright holders of a project agreed, they could relicense it with a
GPL-like license that excluded, say, OpenServer, UnixWare, etc. I don't think
they could use the GPL itself.
Of course, SCO could continue to use the older GPL releases.
Alternatively, the software authors could simply gradually remove SCO-specific
workarounds, and write code that just happens to break on SCO's platforms.
Yes, SCO could fix it, because it's open source, but (a) it'd cost them money,
and (b) over time SCO will end up on a more and more deserted private side fork,
missing all the critical security enhancements and bug fixes. If the software
author doesn't mention these changes in their documentation, SCO could be in a
real pickle, just trying to figure out what is going on.
Check comments section on recent SCO stories at www.linux.com, and you'll see
this idea is being somewhat discussed... I bet there are more quietly thinking
about it, or discussing it on private forums or IRC.
Personally, I have a lot of sympathy with the idea. Why supply free code, free
support, and free products to sell, to somebody who is gunning for you?
quatermass - SCO delenda est
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 01:37 PM EDT |
I would like to return for a moment to the subject of the company "DataCrystal."
Anyone armed with even a cursory knowledge of the data mining field who visits
the company web site -- as opposed to the web site of the academic research
project with the Coincidentally I'm Sure Similar Name -- would dismiss this
company out of hand. It's a non-serious entity. The list of claimed expertise is
a dead give-away. There is no business reason why anyone would assemble a
collection of people with such disparate skill sets. Not even a huge DOD
contractor would lay claim to all those things. And certainly no one human being
possesses expertise in more than one or two of the listed fields. When we find
out that this business operates out of somebody's house, the claim becomes
preposterous.
Then there is the matter of their "white papers" on the technology page. This is
beam-me-up-Scotty stuff. What they are talking about is so far beyond the
current state of the art that Somebody Big would have to spend many years and
hundreds of millions of dollars to create the technology they are
describing.
Not to put too fine a point on it, but absolutely nothing on this company's web
site is consistent with an actual, real-world, data-mining or AI house. I
recently spent three years in such a company; it was full of Ph.D. text-mining
and neural-net gurus. I know where the state of the art is, and DataCrystal is
not on the same planet.
This company has all the earmarks of being a Canopy Group shell... something
kept around to be one of the things that the peas hide under as Canopy shuffles
its assets around in some kind of game. I'm not sure it would have to be
disclosed in the 10-Q, but as soon as SCO files it I'll be checking it to see if
SCO has been making large payments to DataCrystal for Darl's "pattern matching"
work.
Many people on the Yahoo SCOX board are speculating that the plan here is to
liquidate SCO, but to do so in as noisy a manner as possible so as to have one
last opportunity to extract cash from the investing public. Thus the insider
selling and the Vultus acquisition using newly-minted SCO shares. The next step
would be getting as much of SCO's cash as possible out of the company before
scuttling it... this would mean SCO "buying" services from fellow Canopy Group
entities -- and paying in cash -- to the maximum extent practical. They just
acquired another three months of delay before they have to produce any more
expensive legal work in the Red Hat or IBM lawsuits, and that's probably it for
them. By the time Canopy actually has to pay Boies & Co. for any serious legal
work, they could have moved most of SCO's cash out of the company. The last step
is to file a Chapter 7 bankruptcy, in which Canopy Group -- as the largest
creditor -- inherits whatever assets remain, such as the UNIX IP.
To prove this, we would have to somehow document the "purchases" made by SCO
during its noisy and expensive swan song. Presumably these could be obtained by
subpoena, but probably not until after SCO has declared bankruptcy, leaving
aggrieved shareholders with some standing to sue (by alleging fraud). I checked,
and DataCrystal itself appears to be a privately-held company with no reports on
file anywhere. So there is no data to be had that way.
Bob
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 01:38 PM EDT |
Bubba, did you look at the uptime graph?
This alleged hacker seems to keep very very very regular hours?
http://uptime.netcraft.com/up/performance?explain=0&mode_p=on&mode_u=off&mode_w=off&by=collector&errors=0&site=www.sco.com&site1=&sample=2&submit=Examine&range=5d&maxy=0
Maybe somebody should tell DiDio that it can't be a crunchie at fault. It's
hard to keep precise time when you're stoned out of your mind at the ashram.
P.S. Does SCO's ISP indemnify them?
quatermass - SCO delenda est
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 01:38 PM EDT |
Another Canopy company that trades in Linux:
http://www.center7.com/us/products/vm/
[[Vintela Manager is a secure, web-based, systems management solution that
reduces the cost of deploying and managing established versions of Linux and SCO
UNIX. ]]
Not anymore, methinks.
John Goodwin
|
- radiocomment - Authored by: Anonymous on Wednesday, November 05 2003 @ 07:14 PM EST
|
Authored by: Anonymous on Friday, August 29 2003 @ 01:40 PM EDT |
By the way, I wonder if that MySQL is the one
IBM used to say it owned.
John Goodwin
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 01:40 PM EDT |
When I read your Canadian expert's explanation of data mining, several
ideas began to come together. Please bear with me while I go through
some rather speculative reasoning.
1. Obfuscated Code
SCO claims that there is lots of obfuscated code. This implies that
the search algorithm makes approximate matches.
2. Spectral Analysis
This is the basis for a field called automatic target recognition
(ATR). In its simplest form an image of the desired object and the
unknown image that may contain the object are transformed using the
Fourier transform. The transformed images are multiplied together on
a point by point basis and the result is retransformed back to the
spatial domain. All of you EEs recognize this as the convolution
theorem. Convolution in the spatial domain is equivalent to
multiplication in the frequency domain.
Convolution is like doing many correlations between the desired object
and the unknown image. When the desired object is aligned with its
occurrence in the unknown image we get a bright spot. The brightness
of the spot tells us about the degree of match.
I speculate that SCO did something like this to find what they call
matching code. This would be done by applying some kind of spectral
transform, perhaps a wavelet transform, to the Linux code (the
unknown) and then matching that up with the transform of a piece of
SCO code (the desired object). One way to do this would be to use the
ASCII value for characters. This gives a one-dimensional data set
of numbers that can be transformed.
This has got to be an intensive computation and that is why they are
not through working on it.
3. Google Search
Google works great if you want to find snuff boxes and you type in
"snuff". If you type in "sniff" you may find one. If ATR were
looking for snuff and saw sniff you would get a nice hit.
I do not think SCO is using text search methods.
4. Mathematicians
While I am an engineer who does algorithmic development, I have been
referred to as a mathematician because some see the development and
application of computational methods as "mathematics." (IANAM)
5. Convergence
Enter "automatic target recognition" into Google; the first hit is MIT
AI Laboratory. I speculate that someone with past connections with
MIT AI Laboratory may be who is referred to as "MIT Mathematicians."
MIT Mathematics Department is a different place.
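The correlation idea in item 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: characters are turned into ASCII values as suggested above, the transform is a plain Fourier transform (a small pure-Python FFT, not wavelets), and the "brightness" is expressed as a squared-difference score built from the correlation, where lower means a closer match. Correlation, as opposed to convolution, uses the complex conjugate of one transform. All function names are hypothetical, and none of this is known to be what SCO did.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out  # the inverse is left unscaled; the caller divides by n


def sliding_correlation(signal, pattern):
    """corr[s] = sum_j signal[s + j] * pattern[j], via the frequency domain."""
    size = 1
    while size < len(signal) + len(pattern):
        size *= 2  # zero-pad so the circular result has no wraparound
    f_signal = fft([complex(v) for v in signal] + [0j] * (size - len(signal)))
    f_pattern = fft([complex(v) for v in pattern] + [0j] * (size - len(pattern)))
    # Point-by-point product; the conjugate turns convolution into correlation.
    product = [a * b.conjugate() for a, b in zip(f_signal, f_pattern)]
    raw = fft(product, inverse=True)
    shifts = len(signal) - len(pattern) + 1
    return [raw[s].real / size for s in range(shifts)]


def best_match(text, pattern):
    """Return (shift, squared ASCII distance) of the closest window in text.

    Assumes len(text) >= len(pattern).
    """
    t = [ord(c) for c in text]
    p = [ord(c) for c in pattern]
    corr = sliding_correlation(t, p)
    pattern_energy = sum(v * v for v in p)
    window_energy = sum(v * v for v in t[:len(p)])
    scores = []
    for s, c in enumerate(corr):
        # sum (t - p)^2 = sum t^2 - 2 * sum t*p + sum p^2
        scores.append(window_energy - 2 * c + pattern_energy)
        if s + len(p) < len(t):
            window_energy += t[s + len(p)] ** 2 - t[s] ** 2
    shift = min(range(len(scores)), key=scores.__getitem__)
    return shift, scores[shift]


if __name__ == "__main__":
    shift, score = best_match("a box of sniff", "snuff")
    print(shift, round(score))
```

Looking for "snuff" in "a box of sniff" lands on "sniff", which is the approximate-match behavior described in item 3. It is also easy to find unrelated words that score just as close, which is the weakness of treating text as a signal.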
Ron
Ron Michaels
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 01:42 PM EDT |
quatermass wrote:
Check comments section on recent SCO stories at www.linux.com, and you'll see
this idea is being somewhat discussed... I bet there are more quietly thinking
about it, or discussing it on private forums or IRC.
Personally, I have a lot of sympathy with the idea. Why supply free code, free
support, and free products to sell, to somebody who is gunning for you?
Thanks for the info- I will check out that discussion. An analogy to this
situation would be getting hit by the local bully in the schoolyard. Be
holier-than-thou and turn the other cheek and the S.O.B. will probably smack the
other cheek for you! I just don't see why programmers should continue to turn
out great programs under the GPL and be abused by a greedy commercial entity
over their work. Open Source is intellectual property too- just like SCO's Unix
code.
wild bill
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 01:42 PM EDT |
Bob, maybe you could get yourself to be a creditor somehow (or maybe
a shareholder??).
The idea is to get in on any possible future bankruptcy hearing.
quatermass - SCO delenda est
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 01:45 PM EDT |
wild bill - most practical approach would be for lots of projects to
simply skip security updates for SCO platform. Then they have to
go fix everything themselves. No need to cripple the platform--
just make it insecure and let others do that for you.
John Goodwin
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 02:04 PM EDT |
> Another Canopy company that trades in Linux:
Some or all of Vintela products used to be SCO's Volution products.
SCO sold them to Center 7 in some complicated stock deal... and got the right to
continue to sell the products to its customers in a complicated royalty
deal.
Check old news stories for how Volution was going to be the next big thing
according to SCO - which makes it even more surprising that they gave them away
for a relative pittance.
Personally, as SCO wrote them, and still sells them, I think IBM should check if
Vintela/Volution infringes any patents.
I think DiDio should ask SCO if Center 7 indemnifies SCO against patent
infringement.
quatermass - SCO delenda est
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 02:13 PM EDT |
Lesser Known Canopy group specialties:
http://www.clearstonehealth.com/index.php?gettopic=Products
Services&getsubtopic=OtherServices
# eLearning in Healthcare
# Bloodborne Pathogens
# Peripheral IV Therapy
# PICC
# Wound Care
# Pain Assessment
# Pain Management
# Needlestick Prevention
# Improving the Sales Process through eLearning
John Goodwin
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 02:18 PM EDT |
Yarro is also on the board of Canopy's Altiris, who also do some Linux kind of
thing (never clearly understood what).
Those board meetings must be interesting!
quatermass - SCO delenda est
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 02:26 PM EDT |
Why is Darl C. McBride listed as CEO here?
Their webpage says the Chairman is also CEO. Note
the deal with IBM for web services too.
http://www.asia-links.com/matrix/b2b/b2bcompdetail.asp?companyid=1435
John Goodwin
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 02:28 PM EDT |
On the topic of somehow poisoning open-source software for the SCOX
platform.
Many actual free software developers (including myself) are angry at SCOX and
are definitely thinking and talking about ways to fight back at their
aggression. That said, the idea that there are no restrictions on _who_ can use
the software is very central to the free software philosophy. In fact, the idea
of excluding particular wrong-doers has come up many times in the past, and the
consensus has always been that it's important to keep the principle. Indeed,
any such restriction would fail to meet the standards of the Open Source
Definition and the Debian Free Software Guidelines (from which the OSD was
derived).
The most common class of use-restricted but "almost free" licenses are those
that permit non-commercial use. It used to be fairly common for software to be
released to the academic community under such terms, but the practice is fading,
supplanted by real open source licenses.
I think there may be other effective strategies that don't raise these kinds of
issues. I like the idea of not accepting patches specific to the SCOX platform.
This puts the burden for applying such patches and distributing the patched
versions firmly on SCOX, which feels just to me. I'm not a big fan of putting in
explicit anti-SCOX "logic bombs", because I think that unfairly affects users.
On the other hand, I am in favor of adding text to platform-detection messages.
I'd go for something like this:
Platform detected: SCO UnixWare/OpenServer/whatever
NOTICE: While you have a legal right to use this software on this platform under
the terms of the GNU General Public License, the authors of this software
deplore the tactics of the SCO Group, and do not support this use. Patches
specific to SCO Group platforms will be rejected. Thus, running on this platform
may be less robust than other platforms. Please consider changing to a system
less hostile to the interests of the free software community. Thanks,
<project name> team.
The cool thing is that even if SCOX systematically tried to
remove all such notices from the versions they ship, it's likely that some would
slip through (they're not that smart, you see). In addition, software compiled
from source, or adapted from, say Red Hat binaries (running on the Linux Kernel
Personality module) would see such notices unchanged. Real users would probably
get a good chance to see these messages frequently.
I'm interested to see what kind of responses develop. This almost certainly
won't be the last time a company lashes out against the free software community.
It would be good to have a response that is ethical, morally justified, and
highly effective.
Raph Levien
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 02:38 PM EDT |
Alex, a horizontal-friendly version of this comment thread can be viewed here.
CSS2
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 03:04 PM EDT |
All this talk about "Spectral Analysis" and now the #1 buzzword, "Wavelets"...
The correct way, in my largely unprofessional opinion, to compare (parts of) two
codebases would be to first tokenize them, including compressing all kinds of
whitespace to a token WHITESPACE and so on, then build parse trees (basically
it's like a compiler up to this point) and then compare these trees against
each other, both topology and content, using some suitable scoring function
(there the magic lies).
When scoring you'd give positive points for such things as "same variable name",
"same order of non-dominating statements", etc and maybe negative for others.
Some idioms might be so common as to be useless in this type of comparison and
should be pruned from the tree altogether or not scored, or simply tokenized
into some low-score token.
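A much-simplified sketch of the tokenize-then-score idea, in Python. It stops at the token level and never builds a parse tree, it drops whitespace outright instead of emitting a WHITESPACE token, and it collapses every identifier to ID and every number to NUM, so renamed variables do not change the score (the opposite of rewarding identical names, and closer to what you would want against obfuscation). The tokenizer is a crude regular expression, not a real C lexer, and the scoring is a plain Levenshtein distance over tokens. All names here are hypothetical.

```python
import re

# A small, incomplete set of C keywords, enough for the illustration.
C_KEYWORDS = {
    "for", "while", "if", "else", "return", "int", "char", "void",
    "struct", "do", "switch", "case", "break", "continue", "sizeof",
}

# Identifiers, runs of digits, or any other single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def normalize(source):
    """Turn C-like source into a list of tokens with names and numbers collapsed."""
    tokens = []
    for tok in TOKEN_RE.findall(source):
        if tok in C_KEYWORDS:
            tokens.append(tok)
        elif tok[0].isalpha() or tok[0] == "_":
            tokens.append("ID")
        elif tok[0].isdigit():
            tokens.append("NUM")
        else:
            tokens.append(tok)
    return tokens


def edit_distance(a, b):
    """Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def token_distance(source_a, source_b):
    """Number of token edits needed to turn one snippet into the other."""
    return edit_distance(normalize(source_a), normalize(source_b))


if __name__ == "__main__":
    a = "for (i = 0; i < n; i++) total += buf[i];"
    b = "for (k = 0; k < len; k++)   sum += data[k];"
    c = "while (p) { p = p->next; count++; }"
    print(token_distance(a, b), token_distance(a, c))
```

The first pair comes out identical after normalization even though the two loops could easily have been written independently, which is why very common idioms need to be pruned or given a low score.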
Computing the edit distance for subsets of the parse tree(s) is probably a
useful scoring function in itself. For an introduction, see for instance
http://citeseer.nj.nec.com/navarro00indexing.html
eloj
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 03:09 PM EDT |
Bob,
I really think that SCO are going to liquidate. What we know is: SCO initiated
extremely dubious legal action against IBM; not only is the case extremely
flimsy, but it would be exceedingly hard to prove in the circumstances even if they were
correct. Furthermore, any such legal action would undoubtedly draw massive and
ultimately fatal retaliatory action, should IBM not settle. Instead, they have
mysteriously broadened their claims to all aspects of Linux by making
ridiculous, even borderline slanderous, statements - completely unsupported and
unproven. Their legal counsel, especially one as respected as Boies, would
demand that such egregious statements cease, since they would seriously
undermine their position, and even their case, in a court of law. Conversely,
the statements and press releases have increased and show a strong correlation
between prices and timing (the "fortune 500 company" statement was made when
their shares were down $2 off their opening value, a massive 20% of the total
value). Insiders have been insidiously selling their shares in the company,
making upwards of $50k, sometimes as high as $200k, at a time; in some cases, the
shares were sold at $15, a massive 15 times the value of their pre-IBM price.
Furthermore, there was a dubious purchase of a fellow Canopy company at inflated value,
despite the company's incompatibility with SCO's current product line - lacking
in compatibility for that matter. More importantly however, early in this
debacle a filing was made to the SEC (it's on the Yahoo board) which states
that SCO offers total indemnification for the actions of its executives, whereas
previously it hadn't (whether this would be retroactive, I don't know).
With the recent talk of SCO relaunching UnixWare, one has to wonder: if they - by
their own tongue - own the rights to Linux, why upgrade an inferior technology,
when they could already license Linux to a far greater profit? Even if this was
the case, SCO does not have the personnel, having sacked the majority of their
R&D staff, to compete with "legal" UNICES such as Solaris. None of the
pieces fit together.
I would agree that the liquidation of SCO is most likely the plan. After all, how
else would they profit most from this debacle? The question is, what would
happen?
Clearly, IBM would not accept such an outcome, and would no doubt proceed with the
legal action (IANAL) in spite (if possible). If SCO was found guilty of gross
misrepresentation, would the executives then become liable for their actions?
Though this may not be the case, it's a pretty sure bet that SCO cannot
indemnify its employees' actions, if the said actions were illegal. Furthermore,
wouldn't such a mysterious liquidation be EXTREMELY illegal; with the eyes of
the world on them, and the hearts of thousands of developers against them, they
would most likely fail to slide silently out of the limelight.
Or, it may be that Darl McBride _really_ is as dumb as he seems and the true
intention was to get bought out by IBM all along!
Stephen Henry
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 03:22 PM EDT |
Speaking as a mathematician & computer scientist (I teach the former, have a PhD
in the latter) I'm extremely skeptical about "spectral analysis". Spectral
analysis refers to analysis performed in the frequency domain. It's hard to see
what sort of frequency domain can apply to code, except for (as pj's expert
mentioned) literally the frequencies of fragments. Spectral methods work well in
signal processing usually because important signal properties show up in the
frequency domain, whereas the time domain is often littered with insignificant
garbage. That can't be the case for code - the important thing is the
functionality of the code. I would be surprised but not overwhelmed if frequency
analysis of programs could determine whether two fragments of code were *written
by the same person* (by picking up coding styles well in the frequency domain)
but that has no bearing here.
With my computer science hat on, I know very well how hard it is to build a
mathematical model of the functionality of a program. Indeed, for C programs
it's damn near impossible to do with accuracy.
Spectral analysis seems to me to be a red herring - one of those buzzwords
thrown around by people trying to portray themselves as experts. In this case
it's being used by SCO to hype the calibre of their "mathematicians", whom they
can't name (?!). On the other hand, there is plenty of literature on catching
plagiarised code (some universities have automated systems which screen for
prima facie cases of copying, later scrutinised by a human) which is not too bad
at seeing through obfuscation. If I were one of the universities or businesses
who have a licensed copy of the SysVr4 code, I would be sure to run that against
the Linux kernel and leak the results.
PS: Wavelets are another red herring. They are just a way of getting into a
somewhat different frequency domain and are suitable for signals where low
frequency does not demand good location. They can't be relevant to code.
PPS: A very important part of what pj's expert was saying hasn't been properly
publicised. A VITAL part of data mining is not only to find matches but to
answer the question "how likely was it for this match to occur by chance". It's
a difficult question to answer, but without an answer you can't quantify the
significance of a match. (With my statistician's hat on I'd point out that
"significance" is the technically correct word - the data mining program ought
to be able to approximate the significance level for the particular observation,
or something like it).
Dr Drake
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 03:24 PM EDT |
Raph Levien -- I don't think a lot of effort should go into poisoning
SCO platform--or supporting it anymore. Failure to provide security
updates has nice Fear value, because "package foo no longer works on
SCO" is one minor problem, but "package foo is insecure on SCO" is
a big problem. Simply stopping security updates (why bother?) should
be more effective. Also (speaking as a QA guy)--don't worry about poisoning
the code, just don't *test* it on that platform, or do bugfixes that are SCO-
specific. Trust me, it *will* break one day.
If you must poison SCO, setting the LANG=C variable somewhere in your
installation procedure and exporting to the compilation shell should
break that install and most downstream ./configure, make, make install's.
Lots of existing software works around the LANG variable for SCO, and
will compile wrong if it's set. In this day and age, packages install
other packages.... LANG=C should be like m4sugar in the gas tank.
John Goodwin
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 03:26 PM EDT |
I would like to hear from a lawyer on whether evidence produced by spectral
analysis and data mining would actually be allowed to be introduced in court. There are
many ways to check for authorship of text documents that have been found to be
very reliable. How about plagiarism? What kind of criteria are used for that?
As has been pointed out here and elsewhere, pattern matching algorithms are
bound to find many proximate matches in a structured language such as C and C++.
And when a program or algorithm is written to a published specification, then the
pattern matches are bound to show up even more strikingly, such as Jay
Schulist's clean room implementation of the Berkeley Packet Filter. There are
also bound to be pattern matches in totally unrelated areas.
I think that SCO has another huge hurdle to jump here just getting this type
of stuff admitted as evidence.
Glenn
Glenn Thigpen
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 03:32 PM EDT |
Hey what's up with our favorite FUD generator? I wonder if the legal team
has muzzled them. Perhaps whilst the stock is >14 they don't have anything
to "announce".
Morbo
Morbo
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 03:34 PM EDT |
SCO busted tracking SCOX Yahoo! messages
Yahoo! Message
MajorLeePissed
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 03:39 PM EDT |
http://www.macobserver.com/article/2003/08/29.11.shtml
Tinfoil hat theory. Not necessarily my opinion, but an opinion, pure conjecture.
If you think this is a silly idea - tell me why - please.
It started off about some libraries. Check the early SCOsource stuff. This is
probably where they spent the legal-advice money initially - and they haven't
spent much since.
Nobody was interested.
So in other words, they (SCO) blew it.
Boies and Heise don't know the difference between shared libraries, UNIX, Linux,
JFS, RCU, etc. They certainly don't know the meaning of whatever it was SCO
registered the copyrights for, or UNIX history. They don't know the meaning of
those slides, and they are trivial pieces of code in any case.
Their client (SCO) tells them IBM and others are ripping off these libraries and
putting them in Linux. Their client (SCO) tells them this is how Linux got JFS,
RCU, NUMA, etc.
SCO's complaint is basically a recitation of the "facts" according to Sontag
and McBride. Boies and Heise don't know what they're writing. This would explain
the curious section about shared libraries even in the revised complaint, which
seems to have little or nothing to do with the rest of the complaint.
Heise maybe even thinks the shared libraries are what they registered copyright
for. If these are being ripped off, he thinks he can sue.
Check out Heise and Boies comments - do they ever say anything about SCO's
general "case" against Linux??
Boies & Heise probably don't do email or web browser. They don't read computer
magazines. They are probably unaware of much of the press SCO is getting. If
they read the computer magazines, they might not understand the story being told
in any case.
The Linux IP license is probably McBride and Sontag's own work. Input from legal
counsel is minimal.
The 3 teams reviewing Linux code probably don't exist. After all, the other 3
teams supposedly finding Linux customers to sue can't exist, if SCO's statement
in The Age/SMH is true.
In summary, it started as a legitimate, or semi-legit, attempt to extract
revenue from some libraries SCO wrote. It didn't work. At that point, they went
on a different course. Boies & Heise never realized (perhaps until they read the
Red Hat complaint) that SCO was pursuing in the press an entirely different set
of issues to the shared library thing.
quatermass - SCO delenda est
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 03:49 PM EDT |
quatermass:
> Personally, I have a lot of sympathy with the idea. Why supply free code, free
> support, and free products to sell, to somebody who is gunning for you?
Standard IANAL bit first off: I share your sympathy with regard to Linux SCO
support; however, it's a petty thing to do and might even
be illegal. Even though the code is free, once it becomes a foundation that
businesses are built upon, I think it would be illegal to intentionally
discriminate against SCO.
I don't agree with any of SCO's actions, in the least. But I am not the
American legal system and it is within SCO's rights to defend its contracts and
IP. Of course, they'll have to prove their claims, otherwise face stiff
penalties, but it's their right to do so, as a business in the United States. I
think petty recriminations, by any software developer, are sad to say the least.
Even the SAMBA folks realize this, they provide software under the GPL, for end
users, regardless of their actions. If end users want SCO support, SCO support
will exist. If end users ignore SCO, then the SCO code will become legacy and
eventually be removed. Just imagine the SAMBA folks getting pissed at Microsoft
and altering their source to make it incompatible with Windows just because they
don't like Microsoft's business ethics. Sounds silly if you ask me.
This is the power of America, we as end users can speak a language that
businesses, for profit and not for profit alike, can understand, and that's
demand. All businesses want to at least survive, and you can't survive if no one
wants your product. The developers of SAMBA don't release SAMBA just for fun,
they ultimately do it for us. If I wanted software that was periodically
rewritten to break compatibility with another software package, then I'd install
some version of Windows on our company's domain controller, web server, 2
database servers, 2 routers, etc...
The OSI is about more than just writing cool software; it's about the ideals
that ESR laid out for us all. He wanted the DDoS to stop, because it was a
childish maneuver that accomplished absolutely nothing. If someone calls you a
bad name, does that give you the right to smash the windshield on their car? It
doesn't, and that's where the legal system comes into play. You can take the
issue to court and have it resolved in front of a judge and/or jury. Just
imagine if all of the energy put into DDoS attacks and source manipulation, went
into creating a coherent response to SCO's actions. I'd take the coherent
response over a DDoS and source manipulation any day. Tazer
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 03:52 PM EDT |
Some good keywords to search for might be "Program Differencing" and "Tree Differencing". I
found a TR on the latter here: http://www.qucis.queensu.ca/TechReports/Reports/95-372.ps
It's about structured documents, but the same principles can be applied to parse
trees. eloj
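To illustrate the parse-tree idea, here is a minimal sketch in Python. It is not the algorithm from the technical report linked above: it flattens each parse tree into a list of node-type names (so identifier names and literal values drop out) and compares the lists with the standard library's `difflib`, which is a sequence diff over a flattened tree rather than a true tree diff. The helper names `tree_shape` and `structural_similarity`, and the sample snippets, are made up for this example.

```python
import ast
from difflib import SequenceMatcher


def tree_shape(node):
    """Yield node type names in pre-order, ignoring identifiers and literals."""
    yield type(node).__name__
    for child in ast.iter_child_nodes(node):
        yield from tree_shape(child)


def structural_similarity(source_a, source_b):
    """Return a 0..1 similarity score between the parse-tree shapes of two sources."""
    shape_a = list(tree_shape(ast.parse(source_a)))
    shape_b = list(tree_shape(ast.parse(source_b)))
    return SequenceMatcher(None, shape_a, shape_b).ratio()


# Invented samples: same structure with every name changed, and unrelated code.
original = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
renamed = "def add_up(items):\n    acc = 0\n    for it in items:\n        acc += it\n    return acc\n"
unrelated = "def greet(name):\n    print('hello', name)\n"

print(structural_similarity(original, renamed))
print(structural_similarity(original, unrelated))
```

Renaming every variable leaves the first score at 1.0, which is the point of comparing trees instead of text. It is also why a match of this kind only flags a candidate: it says nothing about who copied from whom, or whether the structure is simply dictated by the task.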
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 03:57 PM EDT |
Please forgive me for being off topic, but Bob Toxen, writing for Net-security.org,
wrote an article entitled "SCO v. IBM" in which he reassures his readers that no
Linux user has anything to worry about when using Linux.
If someone else linked to this article earlier, my apologies, I shoulda read all
the posts first. PhilTR
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 04:15 PM EDT |
> Standard IANAL bit, first off, I endear the same sympathy as do you, with
> regards to Linux SCO support; however, it's a petty thing to do and might even
> be illegal
IANAL, but I guess it depends on what they do.
But I'm curious what you think might be illegal?
Most GPL software authors don't provide versions for
Windows/Mac/Plan9/BeOS/QNX/zillion-obscure-UNIX versions.
1. Are they under any obligation to do so?
2. Are they required to not depend on functionality that happens not to be
present in whatever operating systems?
3. Are they required to provide workarounds for bugs that appear on certain
platforms?
IANAL, but I am not aware of any legal reason why they would be required to do
any of the above. If you are, I'd be interested to know.
BTW, in case you are thinking of MS-DOS vs DR-DOS under Windows 3.1: I don't
think this is quite the same thing, as in that case we were talking about
anti-trust issues, which presumably don't apply to a typical GPL package. quatermass - SCO delenda est
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 04:19 PM EDT |
PJ: If you're out there, these two links tell just about all there
is to tell in the 10th Circuit concerning program infringement.
http://www.digital-law-online.com/lpdi1.0/treatise22.html
http://digital-law-online.info/misc/ogilvie.htm gumout
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 04:33 PM EDT |
> I really think that SCO are going to liquidate.
I once had the, umm, pleasure of being an observer during the last days of a
company that was being strip-mined by its owner, who was a professional at it.
On the day they filed Chapter 11 bankruptcy, the creditors got a radioactive
husk that had nothing left. (In this case the 'radioactivity' was an EPA
Superfund lawsuit involving the remains of decades-old metalworking facilities).
They were $40 million into the banks, losing $25 million a month, with sales
dropping like a rock. The creditors took that over and the previous owner
walked.
What were some of the final steps, so that we might watch for them in the SCO
case? Getting the owner's people out. They brought in a board member as a new
Chairman/CEO, paying the previous occupant a hefty separation package. Next to
go was the CFO... he too was "resigned" but given a generous package on his way
out. Then the president... ostensibly "let go" but he in fact walked out with
about a half-million of the final remaining cash as his "separation agreement."
About two weeks later they cratered the thing. All the "resigned" officers
surfaced later in exec positions in other companies owned by the same guys. It
was all a game... they extracted maximum cash from the public, the suppliers,
and they even burned the banks (big ones... Chemical took about a $30 million
write-down over this deal).
If Darl and his friends start disappearing, either not replaced at all or
replaced by gullible underlings who don't realize what is being done to them, we
can assume the end is near. If there are more quarterly payments coming in from
MSFT or Sun, they will probably wait for those to arrive, and pass the cash out
to other Canopy properties as fast as they can. When they're down to the last
million or so, they'll pass that out as separation money for departing
officers... and then crater it, leaving IBM and Red Hat with no one and nothing
to sue.
A good question for the legal brains among us is whether Canopy can in fact rely
on a bankruptcy court to award it the UNIX IP (Canopy will be the only serious
creditor at death), or whether IBM and Red Hat can somehow pursue their lawsuits and
perhaps acquire the IP from the bankruptcy court as their damage award. Bob
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 05:09 PM EDT |
"How about plagarism? What kind of criteria is used for that? "
It's whatever you can convince a jury it is. Generally you look for intact or
lightly edited passages, sudden changes of style, and vocabulary differences.
Usually you know what the probable source is, so you do a comparison of the
texts, and consider whether the author of "B" was in a position to steal from
"A". Where it gets tricky is when both authors are using a common source ...
you have to decide how much of the similarity is because of the ancestral texts
and how much was lifted from "A". That same problem holds true in technical
works: there are only a few ways that a USB port can be described, and if both
authors were working from the published specification, and both are skilled
writers, the text is going to be very similar because of the constraints of the
subject matter, and the language, and the expectations of the readers. Tsu Dho Nimh
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 05:11 PM EDT |
Quatermass,
re: Tinfoil hat theory
I dunno... that theory ascribes an incredible amount of stupidity to Boies &
company.
If I were a lawyer, I don't think I'd take the word of two executive types on
either software or IP issues; I'd get the opinion of an independent expert.
On the other hand, even though Boies has a reputation to protect (such as it
is), he's been strangely silent, letting his sidekick Heise make such legendary
statements as "... copyright law allows only one copy to be made ...". With
nitwits like McBride and Heise on his side, Boies probably decided to retire to
Aruba. Dick Gingras - SCO caro mortuum erit!
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 05:12 PM EDT |
Note "Altiris" logo on building in this picture...
http://www.smilereminder.com/index.html John Goodwin
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 05:14 PM EDT |
Sorry. You have to click on "About Us" to see the picture. John Goodwin
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 05:26 PM EDT |
willows.com used to be Canopy. Are they still around?
"Software Tools and Services Enabling your Windows® Applications to Run on
UNIX®, Macintosh® and Other Systems."
http://216.239.37.104/search?q=cache:waaUS0kVAh4J:www.willows.com/+%22willows+software%22+canopy&hl=en&ie=UTF-8
John Goodwin
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 05:31 PM EDT |
Name: willows.com
Address: 216.250.129.62
Name: sco.com
Address: 216.250.140.112
John Goodwin
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 05:49 PM EDT |
John,
You wrote: "Why is Darl C. McBride listed as CEO here?"
PointSource was D. McB's employer before he started working
at Caldera last year.
The page you found seems to be a tad out of date, like SCOG's unices... D.
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 05:50 PM EDT |
Noorda Family Trust (redirects to Canopy)
Name: nft.com
Address: 216.250.129.2
John Goodwin
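The three lookups above (willows.com, sco.com, nft.com) return addresses that look close together. A small sketch with Python's standard `ipaddress` module makes "close" concrete by finding the smallest CIDR block that contains all of them. The function name `smallest_common_block` is invented for this example, the addresses are simply the ones posted in the comments, and a shared block only hints at a shared provider or hosting arrangement; it proves nothing about ownership.

```python
import ipaddress

# Addresses as posted in the comments above (2003 data, not re-checked).
lookups = {
    "willows.com": "216.250.129.62",
    "sco.com": "216.250.140.112",
    "nft.com": "216.250.129.2",
}


def smallest_common_block(addresses):
    """Return the smallest IPv4 network that contains every address given."""
    values = [int(ipaddress.IPv4Address(a)) for a in addresses]
    prefix = 32
    # Shorten the prefix until all addresses agree on their leading bits.
    while prefix > 0 and len({v >> (32 - prefix) for v in values}) > 1:
        prefix -= 1
    return ipaddress.ip_network(f"{addresses[0]}/{prefix}", strict=False)


block = smallest_common_block(list(lookups.values()))
print(block, block.num_addresses)
```

For these three addresses the smallest common block is a /20, so they sit within a few thousand addresses of each other. Who that block was allocated to would still have to be checked against registry records.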
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 05:56 PM EDT |
> I dunno... that theory ascribes an incredible amount of stupidity to Boies &
> company.
Like I said it's just an opinion, 100% pure conjecture, and not
necessarily even my opinion.
I don't think it ascribes stupidity, just technical ignorance. You know, even
some people who might be more technically astute than Boies got fooled by the
slide show. Furthermore even if Boies eventually caught on, could he back out at
that point?
The easiest way to show it's a totally bogus theory, would be to find any
specific reference from Boies or Heise to the general Linux case - or - to find
a plausible logical explanation of why there's a big section about shared
libraries in both the original SCO complaint and the amended complaint.
Seeing as we are in tin-foil hat territory today, there is one other theory that
is worth mentioning. Again 100% pure opinion and conjecture, and not necessarily
even my opinion. I do not believe this theory is incompatible with the bankruptcy
or Boies/Heise=dupes theory.
Tinfoil Hat Theory 3:
SCOX = BRE-X
BRE-X if you remember was a struggling small town Canadian mining company.
Midland Walsh, one of the principals (founder?), was famous for suing a former
employer and getting a settlement for an undisclosed sum.
BRE-X suddenly said they found these incredibly huge gold deposits in a mine in
Indonesia.
BRE-X said they had their own secret teams of experts, whose identities they
couldn't reveal, supporting their claims (assaying of core samples for
gold).
Industry experts criticized the techniques for assaying, which were unorthodox
and didn't follow industry standard practices.
The company's reports (with incredible claims) were criticized by industry
experts for the same reasons.
The industry experts were ignored.
Despite this, media and stock analysts preferred the company's version to that of
the industry experts. Some analysts really pushed the stock hard.
As more and more discrepancies in the company's story came to light, the company
produced a series of increasingly unsatisfactory explanations, which were
debunked by industry experts too.
The stock price rose and rose on the Toronto Stock Exchange. Massive relatively
uncritical media coverage.
Insiders cashed out millions of stock. I think it was a tiny fraction of the
total company, but still a lot of money to them.
Eventually it turned out the samples from the mine had been faked. All was
revealed. The stock price crashed so badly in a single day that it broke the
software for the Toronto Stock Exchange.
Links to BRE-X story:
Short summary: http://www.goodreports.net/bregoo.htm
Long version of story: http://www.sbaer.uca.edu/Research/1999/SRIBR/99sri091.htm
The tech stuff: http://minerals.state.nv.us/programs/min_fraudami.htm#bre-x
Could this tin-foil hat theory be true?
For #1: Lots of people report difficulty (impossibility) of buying SCO Linux IP
licenses. They don't seem to be actively trying to actually sell their new
product - or actively pursue their riches-by-litigation strategy.
For #2: So many secrets - the code gold, the code analysts, the Linux IP
customer, etc.
How to disprove: SCO or some enterprising reporter would need to find and properly
verify any of the SCO secrets. quatermass - SCO delenda est
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 06:00 PM EDT |
Just a comment to "Tazer"
Not sure where you are coming from with the "might even be illegal..". If such
were actually the case, M$S would have had their pants sued off over not
supplying Office for all the other platforms, discontinuing support for software
& such. As for Samba, the only reason such exists is so that Linux can talk
to/be used instead of M$S Servers. Dropping "support" for M$S in this case
would pretty much mean there was no Samba. As for the Linux world, or for
that matter all other operating systems I've ever worked on (starting with a
Wang mainframe in grade school), there are much better ways for network shares
and such to be done (an example would be NFS). The SMB block is not actually
very good, but it's what M$S decided to use, so everyone else had to try to
figure out how to supply compatibility with it - no easy task as M$S keeps modifying
how it works (they extend & extinguish their own stuff too - the "forced"
upgrade...).
While I would agree that the programmers perhaps "ought" (in an ethical sense) to
leave the previous work in support of SCO's stuff in, continuing to fix &
update such is certainly not "required" of them in any sense - legal, ethical
or what have you. Changing a line in the compile instructions that ignores
issues with compiling on SCO's stuff is, IMO, also not an issue - it is a "no
longer supported platform", something everyone in the computer world has, for
lots of reasons, gotten somewhat used to.
Thomas LePage
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 06:38 PM EDT |
quatermass, I am admittedly naive on legal matters, but let me elaborate.
Please correct me where I'm wrong. I would think that any organization, for
profit or not for profit, would have to distribute its products and/or services
in a non-discriminatory manner. Otherwise, if an entire business was based upon
Linux, not that SCO is, then it would have to pander to the developers of the
Linux kernel, or otherwise face possible retaliatory measures, which is similar to
extortion(?).
Ultimately, some person or organization has to be accountable for Linux (http://www.osdl.org/about_osdl/):
"OSDL Mission
To be the recognized center of gravity for Linux; the central body dedicated to
accelerating the use of Linux for enterprise computing through:
Enterprise-class testing and other technical support for the Linux development
community.
Marshalling of Linux-industry resources to focus investment on areas of greatest
need thereby eliminating inhibitors to growth.
Practical guidance to our members - vendors and end users alike - on working
effectively with the Linux development community."
I am not clear on laws regarding non-profit organizations, but by allowing
blatant code manipulation against a specific company, wouldn't OSDL and/or Linus
Torvalds be held to the same ethical business standards that other companies
are? Wouldn't SCO be able to make some sort of legal argument that since OSDL
is a non-profit(charitable) organization, that by excluding a certain group,
intentionally, that they should lose their tax-exempt status? Wouldn't they be
creating an unfair advantage?
For instance, Oracle has probably spent a large sum of money on their Linux
port. If OSDL decided that they didn't like Oracle anymore and intentionally
created, or permitted to be included into the kernel, incompatibilities that
prevented Oracle from running on Linux, couldn't Oracle do anything about that?
If this is the case, I can't fathom why any business would build products for
Linux, especially if it's not possible to enforce certain restraints.
Sure, they could start their own distribution based on the last compatible
version of Linux, but that would require a significant investment and diversion
from their current market strategy. Every software manufacturer would
effectively have to be prepared to be an operating system manufacturer as
well.
These types of legal maneuvers may not be a huge blow to Linux, but if I'm a
beginner on law, and these are valid arguments, it would be reasonable that a
well equipped law firm could find many more issues than I.
Don't get me wrong, I'm a huge Linux advocate and have been known to Microsoft
bash on occasion (daily at 4:30PM in the IT manager's office), but do any of my
questions or points have any merit? Tazer
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 06:49 PM EDT |
Assuming that SCO were liquidated, and assuming that Canopy as the big creditor
were to receive the UNIX SystemV rights (such as they are), I should think that
Canopy would not be out of the woods as they would then have property which IBM
has already indicated infringes their patents. What would prevent IBM from then
pursuing Canopy over those same patents? The transfer of the rights could quite
possibly involve the recipient(s) in court, fighting the biggest patent team in
the USA. It would be entirely consistent for IBM to let the really big dogs
loose on Canopy, since Canopy is financially benefitting from the SCO price
run-up due to the trial-by-press-release.
It is much like the scene from "The Wizard of Oz," you cannot ignore the man
behind the curtain, even when he demands you do so.
In a way, this whole episode would seem to make toxic waste out of the SystemV
rights.
Marty
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 06:50 PM EDT |
Thomas, I guess I mean there is a significant difference in eliminating
compatibility and introducing compatibility. Granted, OSDL is a non-profit
organization, but as the *nix industry moves forward, I would expect to see
Linux proliferate in the market, essentially becoming the dominant standard
platform on which applications are built. Maybe I'm saying that Linux might
become a monopoly, and will have to follow a specific, legal, business
ethic (antitrust?). Isn't there a precedent that would force the compatibility
to be kept in? Maybe like a public service?
I truly am not a lawyer and am posting so that I might learn the answers to some
of these questions I have. Tazer
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 06:57 PM EDT |
Tazer, first off, I think we're talking about applications rather than operating
systems.
Second, I think there's a big difference between sabotage, and simply not
supporting something.
I am not aware of any legal reason that obligates any developer (except maybe
Microsoft who are bound by anti-trust issues, and even then there are limits) to
support everything.
And as a practical matter, they simply couldn't, even if they wanted to. Time
and cost limit all software development. What about QA - is it practical to
test on a million platforms?
If SCO's UNIX platform has bugs or peculiarities in it that happen to mean some
new piece of software doesn't work on it - that's SCO's problem, not the
developers'.
To give an analogy... there are bugs and limitations in Windows 95 that a
developer can work around. There are also bugs and limitations which are not
easily worked around. Similarly there are bugs/limitations in say WINE.
If a developer (say Adobe or Macromedia or Westwood or whoever) produces a new
or updated application, are you really saying that they are legally obligated to
ensure that their software works on all of Windows 95, 98, NT4, 2000, Me, XP,
2003, WINE, etc.? Mac too? QNX too? Plan9 too? HPUX too? And so on?
As far as I am aware, they are only obligated to support those platforms that
they want to.
Can I successfully sue AOL, because their latest clients are no longer
compatible with the Commodore 64 or Apple ][, even though years ago they used to
offer that?
If you really think so, please point to a law allowing me to do so, and a case
where such a suit succeeded. quatermass - SCO delenda est
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 07:03 PM EDT |
Tazer:
One thing about Open source is that anything you add to introduce
incompatibility can be removed, often with less effort than it took to add
it.
On the other hand, just neglecting platform-specific updates and bug-fixes puts
a significant burden on the people who use and maintain software on that
platform. That sounds reasonable and legal to me. IANAL. r.a.
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 07:12 PM EDT |
http://uptime.netcraft.com/perf/graph?site=www.sco.com
SCO is down again. Taking a web site doesn't sound like a usual ddos and it
also doesn't sound like usual server maintenance/upgrade.
Anyone with an idea what's going on? Has anyone ever seen anything like
this? r.a.
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 07:16 PM EDT |
r.a., I have no explanation.
Peter Williams of VNUNET seems to think it's a DOS
http://www.vnunet.com/News/1143283
Maybe somebody ought to tell him that the downtime is regular, real regular,
like a clock. quatermass - SCO delenda est
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 07:18 PM EDT |
It seems to me that someone with access to the relevant sources should be able
to tell if there has been major pilfering from Unix System V into Linux by:
1) using some kind of pattern matching to find suspicious lines (it should be
trigger happy: false positives OK, false negatives not OK). This is the
`suspicious population'.
2) taking a random sample of the suspicious lines, perhaps 50 in the first
instance, and doing the code philology on them carefully to work out what
percentage of them were illegitimately copied.
3) using basic stats to estimate what percentage of lines in the suspicious
population were illegitimately copied. The accuracy of the estimate depends on
the size of the sample and the percentage of lines found to be copied, *not* on
the size of the original population, and in relative terms it is better when
the percentage is near 50% than when it is very small. So if half of the sample
came up bodgy, then it would be clear that there is a tremendous problem, but if
only one or two did, then you'd need to extend the sample to get a respectable
estimate.
Well the guys at IBM and OSF know more about this kind of thing than I do so
maybe they're doing something like this, or better. Avery
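The third step can be sketched in a few lines of Python. This is only one way to do the "basic stats": it uses the Wilson score interval, which is my choice rather than anything named in the comment, and it assumes the sample is drawn at random from a suspicious population much larger than the sample. The function name `wilson_interval` and the counts in the loop are illustrative.

```python
import math


def wilson_interval(copied, sample_size, z=1.96):
    """Approximate 95% Wilson score interval for the copied fraction.

    Note that the size of the whole suspicious population is not an input.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    p = copied / sample_size
    z2 = z * z
    denom = 1 + z2 / sample_size
    centre = (p + z2 / (2 * sample_size)) / denom
    half = z * math.sqrt(p * (1 - p) / sample_size + z2 / (4 * sample_size ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical outcomes of examining a sample of 50 suspicious lines.
for copied in (25, 2, 1):
    low, high = wilson_interval(copied, 50)
    print(f"{copied}/50 copied -> roughly {low:.1%} to {high:.1%} of the suspicious population")
```

In absolute terms the interval is actually widest when about half the sample is copied. What gets worse for one or two hits in 50 is the relative uncertainty: the interval is several times wider than the estimate itself, which is why a larger sample would be needed in that case.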
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 07:24 PM EDT |
http://www.connect-utah.com/article.asp?r=139
"If IBM drags the case out into several years, we will consider seeking damages
from Linux customers," says McBride. quatermass - SCO delenda est
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 07:53 PM EDT |
http://www.connect-utah.com/article.asp?r=139
"If IBM drags the case out into several years, we will consider seeking
damages from Linux customers," says McBride.
Unfortunately, there is no law that says they can seek damages in a
trade secret leak from anyone except the leak source. And copyright
law is the same way - the infringer, the person who actually knowingly
ripped off the code, is the only person who they can take action against.
Suing users isn't mentioned in the remedies section. Tsu Dho Nimh
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 07:58 PM EDT |
Tsu, my post pertains to the idea that SCO is not suing anybody and never had
any plans to, as a possible SCO defense to Red Hat's suit, as discussed in the
comments section of the last article (at least in the bits I can see). quatermass - SCO delenda est
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 08:02 PM EDT |
Dr Drake, gumout, and Glenn: I think Computer Associates v Altai is spot on, but
for all the wrong reasons. It was both a copyright case and a trade secret case.
So far SCO has failed to make any copyright claims. Gumout is correct to mention
it though, it's full of good stuff. It was cited by BSDI and Berkeley in the USL
v BSDI case.
http://www.kentlaw.edu/e-Ukraine/copyright/cases/computer_v_altai.html
I'll refer to Computer Associates v Altai as "CA" from here on out.
Before you even begin to look for "substantial similarities" you first have to
prove that the author had access to the Unix System V source code. As I
understood the press reports, the author of the Berkeley Packet Filter
derivative shown in the Las Vegas slide show was not a SCO licensee. If you
can't prove that an author had access, you don't bother with any analysis. That
should hold true for any of the other Linux kernel copyright holders.
On appeal, there is some discussion in CA about how the lower court should have
first excluded the functional code required by external specifications (POSIX,
iBCS2, etc.) and also any code taken from the public domain from the
similarities test, i.e. abstraction, filtration, comparison. It's important to
remember that this high tech search method is just an alternative way of raising
the question of substantial similarity for the judge or jury. "Since the test
for illicit copying is based upon the response of ordinary lay observers, expert
testimony is thus "irrelevant" and not permitted. Id. at 468, 473. We have
subsequently described this method of inquiry as "merely an alternative way of
formulating the issue of substantial similarity." Ideal Toy Corp. v. Fab-Lu Ltd.
(Inc.), 360 F.2d 1021, 1023 n. 2 (2d Cir.1966)." That's a citation from CA.
There is another major difference. SCO readily admits that IBM and Sequent own
the copyrights and patents. They simply claim that they control their release or
distribution. This theory looks doomed on two counts. AT&T sent letters to all
of its licensees explaining almost the opposite. There were even sworn
depositions to that effect in USL v BSDI. CA has a nice discussion of Title 17
section 301 (copyright) pre-empting state trade secret laws. In CA the court
simply wanted to prevent "double dipping", i.e. charging copyright infringement
and trade secret misappropriation for the same act. That complicated things for
Computer Associates, but probably makes quite a few things very uncomplicated
for IBM and Sequent - they are being charged for distributing their own code in
accordance with their rights under a statute (17 USC section 301) that pre-empts
Utah's trade secret laws. ;-)
It's also a fact that AT&T's lawyers felt that copyright is or was incompatible
with trade secret protection. That is one of the reasons I've been given by
insiders to explain why AT&T started removing copyright notices from their Unix
32V code. It's also why they sent out the letters disowning any derivative that
didn't contain at least some of their code. Under the procedures of abstraction,
filtration, and comparison outlined in CA, it's possible that you might have to
remove any overlapping BSD or 32V code before doing the analysis. Harlan
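The "filter first, then compare" order described above can be illustrated with a toy Python sketch. It is only a line-level stand-in: the real test works at several levels of abstraction and is decided by a court, not a script. The function names and the sample snippets are invented for the example, and `excluded_sources` stands for whatever material (BSD, 32V, code dictated by a published specification) would have to be filtered out.

```python
def normalise(line):
    """Collapse whitespace so trivial formatting changes do not hide a match."""
    return " ".join(line.split())


def filtered_overlap(candidate_a, candidate_b, excluded_sources):
    """Return lines common to both candidates after filtering excluded material."""
    excluded = {
        normalise(line)
        for source in excluded_sources
        for line in source.splitlines()
    }

    def keep(text):
        lines = (normalise(line) for line in text.splitlines())
        return {line for line in lines if line and line not in excluded}

    return keep(candidate_a) & keep(candidate_b)


# Invented snippets: two lines come from a shared, freely usable ancestor.
bsd = "#include <sys/types.h>\nint bpf_filter(void);\n"
file_a = bsd + "static int helper(int x) { return x * 42; }\n"
file_b = "#include <sys/types.h>\nint  bpf_filter(void);\nstatic int helper(int x) { return x * 42; }\nint other(void);\n"

print(filtered_overlap(file_a, file_b, [bsd]))
print(filtered_overlap(file_a, file_b, []))
```

Without the filtering step all three shared lines show up as matches; with it, only the one line that cannot be traced to the excluded ancestor is left to argue about.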
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 08:17 PM EDT |
quatermass
Well, in retrospect, it seems you're more specifically referring to application
maintainers rather than kernel/OS maintainers. That was my misread. I agree
that it's not feasible to maintain an application on every platform/OS
configuration and it's up to a specific company/organization to determine which
platforms it will decide to support. My argument was primarily focused on the
actual kernel itself and is therefore deprecated.
I guess misreads are the price you pay for setting up Gentoo VPNs all night. =)
Seriously, I was trying to make a good argument, not pick a fight. Tazer
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 08:17 PM EDT |
gumout,
Those links you provided are an excellent source of the current legal thinking
on copyright infringement as applied to programs.
Should be required reading for everyone here! Dick Gingras - SCO caro mortuum erit!
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 08:20 PM EDT |
I'm out here, gumout. Take a look in SCO Archives for an article on How the
10th Circuit Defines Derivative Code and for the articles on copyright, that I
think are also on the Legal Links page. In the first, the 10th circuit
article, there is a link to a paper Dan Ravicher wrote that you will likely get
a lot out of. There's another on Patents and Copyright, showing what lawyers
and cases have said are the differences.
CSS2, you are too much! Thanks.
quatermass, I agree with you about Boies et al. It's not unusual for an attorney
to just go with what the client tells him in the beginning. pj
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 08:55 PM EDT |
pj, I'm not sure whether I agree with me about Boies.
However it does seem plausible (which ain't necessarily the same as true) that he
doesn't or didn't know the whole SCO story.
Another reason I think this plausible is the number of simple factual errors,
omissions, and lack of precision in the complaint. If Boies knew the whole
story, I would have thought that he would:
1. have described UNIX differently, right at the start. And justified SCO's
complaint in terms of SCO's rights to a particular (the AT&T original)
implementation of UNIX.
2. and also mentioned the difference between Old SCO and New SCO - and then
given a legal justification of why he thinks New SCO has the successor interest
in project Monterey.
Before getting behind any theory, we should be looking for testable predictions,
the scientific method if you like. This has not yet been applied, to my
knowledge. I do not believe any of the theories need be mutually exclusive.
Anyway, both the theories I posted are just theories, conjecture and possible
opinions. I know they have been discussed elsewhere in other forums. And yes,
they are pretty extreme; some people might even say wacko.
I do not currently endorse any of these theories. quatermass - SCO delenda est
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 09:12 PM EDT |
q'mass,
Bingo! I'll buy SCO == BRE-X, a classic pump 'n' dump scheme; the parallels with
SCO are notable.
The longer article's concluding sentence is apt: "Bre-X is a story about pure
human greed.". Dick Gingras - SCO caro mortuum erit!
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 09:16 PM EDT |
"SCO is down again. Taking a web site doesn't sound like a usual ddos and it
also doesn't sound like usual server maintenance/upgrade.
Anyone with an idea what's going on? Has anyone ever seen anything like this?"
Actually, yes. It's an exceedingly funny case of basic stupidity. This business
(which shall remain nameless) had a server in their store. Unfortunately for the
suits, some idiot plugged the server into a switched outlet. When the last
person went home and switched off the lights as they left, the server was
suddenly without power. When the first person arrived in the morning and turned
on the lights, the server booted up and was running just peachy by the time the
first tech-head arrived. No problem with our server... must be someone else's
fault!
:) J.F.
|
|
Authored by: Anonymous on Friday, August 29 2003 @ 10:31 PM EDT |
If SCO really has code by line number that is copied, they can see who copied
what.
SCO has to know names from the kernel logs and changes (if they really have any
code).
SCO is protecting the very people they say "ripped off their IP".
Can Redhat file a claim under the DMCA for SCO to be forced to reveal the names
of the "code stealers"?
I do not believe SCO has any real code they could show if the court case was
today.
Just my .02 worth. Am I dead wrong?
Satan Claims Opensource nm
|
|
Authored by: Anonymous on Saturday, August 30 2003 @ 01:11 AM EDT |
I repeat my suggestion that the terms of the GPL be changed so that if a GPL
licensee deliberately violates the GPL on one product, then he is barred from using
any GPL product. IMO, this change does not conflict with the
non-discrimination clause of the GPL. Frankly, I don't care about cosmetic
concerns such as being seen as better than the other guy or worry about sinking
to his level when he needs to be taken out. On the other hand, this is what I
care about: I believe that OSS has an obligation to protect the IP of the
thousands of developers who contribute their time and effort, and I believe that
the change I am proposing is a necessary step toward meeting that obligation.
blacklight
|
|
Authored by: Anonymous on Saturday, August 30 2003 @ 01:22 AM EDT |
nm:
This isn't really about code and barely about the court case. The code in
question has always been out in the open, and if at any time any party feels
code has been donated improperly, they have been and will be able to show that
they are the true owners, and the code will be removed immediately. (Try doing
that with Microsoft.)
SCO is taking advantage of the fact that as a "traditional" software company,
they get a presumption of reasonableness right now when they make
accusations against a decentralized community of programmers. Every week they
lose a little more credibility. There are many examples already of mainstream
press that doesn't take them seriously. A month ago that was not true.
Redhat, IBM and others have *very* good legal talent trying to resolve this
issue as quickly as possible. It is frustrating that SCO gets to keep talking.
In Germany, companies are not allowed to make accusations and SCO will be fined
if they make this kind of statement there. In the US, it seems companies get
more latitude.
SCO has been fully aware of IBM's, Sequent's and others' contributions to Linux from
the beginning. They've seen the code contributions added to them and
distributed them. Their accusations today are just bizarre. They probably have
succeeded in somewhat slowing the acceptance of Linux but as they lose their
presumption of reasonableness they matter less and less.
Here is a link from
Slashdot in June 2002 where some IBM developers talk about the legal
requirements for adding code to Linux.
"As Linux developers inside IBM, do you get to see the AIX source code? If you
do, are you allowed to "steal" some ideas from AIX and implement them in Linux?
If not, why not, and what's the IBM official line?
"IBM Kernel Hackers:
"First of all, before any of us were allowed to contribute to Linux, we were
required to take an "Open Source Developers" class. This class gives us the
guidelines we need to participate effectively in the open source community -
both IBM guidelines and lessons learned about open source from others in IBM.
"We are definitely not allowed to cut and paste proprietary code into any open
source projects (or vice versa!). There is an IBM committee who can and do
approve the release of IBM proprietary or patented technology, like RCU.
"That covers "stealing" code, but what about ideas? We might talk to an AIX
programmer and comment we're seeing performance issues in Linux in this area or
that area and she tells us they discovered that they really needed to profile
the network routines when they saw that. Having solved the problem once, our
non-Linux peers can help steer us without spelling it out for us, allowing us to
still develop solutions that can then be open sourced.
"It's a fine line to walk, especially as an engineer who just wants the answer
:) " r.a.
|
|
Authored by: Anonymous on Saturday, August 30 2003 @ 01:27 AM EDT |
J.F, I once came across a server that was going down every night. It was placed
in a cellar, and what had happened was that someone had connected it to a shared
outlet that had a breaker on a timer that for security reasons switched off
every night. eloj
|
|
Authored by: Anonymous on Saturday, August 30 2003 @ 01:42 AM EDT |
http://www.interesting-people.org/archives/interesting-people/200308/msg00243.html
Interesting. A flaw in a router caused what looked like an attack but
wasn't. pj
|
|
Authored by: Anonymous on Saturday, August 30 2003 @ 06:34 AM EDT |
pj, in my opinion it was an attack, caused by an incompetent programmer
without(?) malicious intent.
The full story is at http://www.cs.wisc.edu/~plonka/netgear-sntp/
It is impolite to query an NTP server more than once every minute. What has
happened to the Wisconsin NTP server is a kind of SlashDot effect, magnified by
a bug in the software that did retransmissions every second when a request
failed. Multiply that by 200,000 (domestic) routers sold and you get a DDoS. MathFox
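To put rough numbers on the retransmission bug described above, here is a back-of-the-envelope sketch. The router count and the one-second retry come from the comment; the worst case of every router failing at once, and the exponential backoff schedule shown for contrast, are assumptions added for illustration and not a description of what any vendor actually shipped.

```python
ROUTERS = 200_000            # figure quoted in the comment above
POLITE_INTERVAL_S = 60       # "no more than once every minute"
BUGGY_RETRY_INTERVAL_S = 1   # retransmission every second after a failure


def aggregate_rate(devices: int, interval_s: float) -> float:
    """Requests per second at the server if every device sends once per interval."""
    return devices / interval_s


def backoff_intervals(base_s: float, cap_s: float, attempts: int) -> list:
    """Exponential backoff: double the wait after each failed request, up to a cap."""
    intervals = []
    wait = base_s
    for _ in range(attempts):
        intervals.append(wait)
        wait = min(wait * 2, cap_s)
    return intervals


polite = aggregate_rate(ROUTERS, POLITE_INTERVAL_S)
buggy = aggregate_rate(ROUTERS, BUGGY_RETRY_INTERVAL_S)  # worst case: all routers retrying
print(f"polite load: {polite:.0f} req/s, buggy worst case: {buggy:.0f} req/s")
print("backoff schedule (s):", backoff_intervals(1, 64, 8))
```

Under these assumptions the fixed one-second retry multiplies the load by 60 compared with polite polling, whereas a client that backs off slows itself down precisely when the server is struggling.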
|
|
Authored by: Anonymous on Saturday, August 30 2003 @ 08:17 AM EDT |
Interview with McBride in Wired:
"Are you afraid of being remembered as the man who killed open source? --
People ask why we haven't sued Red Hat. We haven't sued Red Hat because then the
GPL [general public license] grinds to a screeching halt, and all shipping
distributions of Linux must stop. This whole process is going to make Linux and
open source stronger with respect to intellectual property. Today, there's no
vetting process to make sure the code that goes into open source is clear. We're
trying to work through issues in such a way that we get justice without putting
a hole in the head of the penguin."
http://www.wired.com/wired/archive/11.09/view.html?pg=3 pj
|
|
Authored by: Anonymous on Saturday, August 30 2003 @ 08:51 AM EDT |
MathFox, I know you know a great deal more than I do on this subject, but when I
went to read the article you linked to, I find this:
"Currently, based on our analysis we believe that the NETGEAR "Platinum"
products such as the RP614 and MR814 are the primary source of this
flood of traffic. They likely will need to have their code changed to
mitigate what is essentially an accidental Denial-of-Service flood
against our NTP infrastructure. "
There is also a link there to another instance in Australia of a misconfigured
router causing problems.
So, what am I missing? pj
|
|
Authored by: Anonymous on Saturday, August 30 2003 @ 09:06 AM EDT |
You're missing that we call it an attack even though it wasn't done
intentionally. This is typical in security lingo. Compare with the use of
"break" in cryptography. eloj
|
|
Authored by: Anonymous on Saturday, August 30 2003 @ 06:47 PM EDT |
Thank you, eloj. That makes sense in your lingo. In mine, it would more likely mean
somebody broke the law, a big difference. Thanks for the explanation.
pj
|
|
Authored by: Anonymous on Monday, September 01 2003 @ 03:17 AM EDT |
As I understand it, "spectral analysis" as applied to texts is a technique used
to establish the likelihood of authorship. i.e. given documents A,B and C known
to be written by X (and some control documents known not to be written by X)
what is the chance that a disputed document D was written by X? It's used, for
example, to detect material inserted into a statement/confession by someone
other than the author.
The significance is that it generates "hits" based on likely common authorship.
However, even discounting the fact that two writers of C code can very easily
have near-identical styles (much more so than prose), showing that two documents
A (owned by X) and B (a disputed document) are likely to have the same author
does not show that B infringes on A's copyright. It might show that there is
possibly an older version of B which was indeed written by X, but without proof
of this older version's provenance there is no case for copyright
infringement.
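A toy sketch of the kind of style comparison described above: build a token-frequency profile for each text and score two profiles by cosine similarity. This is a generic stylometry illustration chosen for brevity, not a reconstruction of whatever SCO or its contractors ran; the tokenizer and the scoring are assumptions.

```python
import math
import re
from collections import Counter


def profile(text: str) -> dict:
    """Relative frequency of each lowercase word-like token in the text."""
    tokens = re.findall(r"[a-z_]+", text.lower())
    total = len(tokens) or 1  # avoid dividing by zero on empty input
    return {tok: n / total for tok, n in Counter(tokens).items()}


def cosine(p: dict, q: dict) -> float:
    """Cosine similarity of two frequency profiles (0.0 if either is empty)."""
    dot = sum(p[t] * q[t] for t in p.keys() & q.keys())
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(
        sum(v * v for v in q.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical snippets, invented for illustration only.
known = "for (i = 0; i < n; i++) { sum += buf[i]; }"
disputed = "for (i = 0; i < len; i++) { total += buf[i]; }"
print(cosine(profile(known), profile(disputed)))
```

A high score only says the two texts use similar tokens in similar proportions. As the comment argues, that is at most a hint about common habits or authorship and says nothing about who copied what.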
An analogy would be if I wrote a book which (without actually using the same
story and characters) deliberately aped Salman Rushdie's prose style, but
published under my own name (i.e. not an attempt to fraudulently pass off a new
Rushdie work.) Even if I did such a good job that a casual reader who didn't
look at the book jacket thought it was by Rushdie, I would not be infringing on
his copyright.
However (and this is a nice point) a book reviewer would probably say (rightly)
that my book had "ripped off" his style and was very "derivative". In a similar
fashion, SCO's press statements talk of "derivative code" in this loose everyday
fashion, hoping that it will be confused with its legal meaning in the context
of copyright. (Their use of "IP" as if this had a distinct legal meaning is the
same gambit.) Dr Stupid
|
|
|
|
|