GROKLAW
Allen v. World - Reexamination Update
Monday, August 15 2011 @ 09:00 AM EDT

Another of the Interval Licensing patents, the '507 patent, subject to reexamination has received a first office action, and once again Interval has taken it on the chin. Of the four independent and 24 dependent claims subject to reexamination, the examiner has rejected all of them in a 152-page first office action [PDF].

With this office action (the third pertaining to the four Interval patents subject to reexamination), of the 13 independent and 46 dependent claims subject to reexamination, all but two of the independent claims have been rejected in first office actions. So there is clearly substance to what the reexamination requester put before the USPTO.

We will provide the examiner's detailed discussion of the independent claims below.

DO NOT CLICK 'READ MORE' IF YOU DO NOT WANT KNOWLEDGE OF THESE BASES FOR REJECTION.

Here is the updated reexamination table for Interval:

Interval Licensing vs. AOL et al
as of 2011-08-10

Each cell gives independent / dependent claim counts (Ind / Dep). A dash means no first office action has issued yet.

Patent No. | Claims | Claims Not Subject to Reexam | Claims Subject to Reexam | Claims Rejected | Claims Confirmed | Claims Surviving
6263507 | 15 / 114 | 11 / 90 | 4 / 24 | 4 / 24 | 0 / 0 | 11 / 90
6034652 | 9 / 9 | 5 / 4 | 4 / 5 | - / - | - / - | 9 / 9
6788314 | 6 / 9 | 0 / 0 | 6 / 9 | 6 / 9 | 0 / 0 | 0 / 0
6757682 | 3 / 17 | 0 / 4 | 3 / 13 | 1 / 13 | 2 / 0 | 2 / 4
Totals | 33 / 149 | 16 / 98 | 17 / 51 | 11 / 46 | 2 / 0 | 22 / 103
Percent of All Claims | 100.00% / 100.00% | 48.48% / 65.77% | 51.52% / 34.23% | 33.33% / 30.87% | 6.06% / 0.00% | 66.67% / 69.13%
Percent of Claims Reexamined | | | | 77.78% / 100.00% | |

So far Interval has, at least temporarily, lost 85% of the independent claims subject to reexam (the '652 patent not having received a first office action to date) and 100% of the dependent claims. In fairness to Interval, as noted in our earlier story, it has added 16 dependent claims to the '314 patent, but we will hold off adding those to the statistics until the examiner has opined on them.
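For readers who want to check the arithmetic, here is a minimal sketch that recomputes the percentages quoted above from the counts in the table. It assumes, as the paragraph says, that the '652 patent is left out because it has not yet received a first office action; the dictionary layout and the function name are ours, for illustration only.

```python
# Claims subject to reexam and claims rejected, per patent, copied from the table.
# The '652 patent (6034652) has no first office action yet, so it is left out.
reexamined = {
    "6263507": {"ind": 4, "dep": 24},
    "6788314": {"ind": 6, "dep": 9},
    "6757682": {"ind": 3, "dep": 13},
}
rejected = {
    "6263507": {"ind": 4, "dep": 24},
    "6788314": {"ind": 6, "dep": 9},
    "6757682": {"ind": 1, "dep": 13},
}


def percent_rejected(kind):
    """Share of reexamined claims of one kind ("ind" or "dep") rejected so far."""
    total = sum(counts[kind] for counts in reexamined.values())
    lost = sum(counts[kind] for counts in rejected.values())
    return 100.0 * lost / total


print(round(percent_rejected("ind"), 1))  # 11 of 13 -> 84.6
print(round(percent_rejected("dep"), 1))  # 46 of 46 -> 100.0
```

Rounded to a whole number, 11 of 13 independent claims is the 85% figure given above.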

In the first office action the examiner found all four of the independent claims to be both non-novel (on at least one ground) and obvious (on multiple grounds). In other words, the rejection is extensively documented at this first stage. Here is what the examiner had to say about independent claims numbered 20, 39, 63 and 82:

First, the references the examiner has relied upon in rejecting the claims:

B. References Cited in this Office Action

1. The references discussed herein are as follows:

a. "Network Plus", Walter Bender et al., January 12-13, 1988 ("Bender").
b. "Cluster-Based Text Categorization: A Comparison of Category Search Strategies", Makoto Iwayama, July 9-13, 1995 ("Iwayama").
c. "The Fishwrap Personalized News System", Pascal R. Chesnais et al., June 1995 ("Chesnais").
d. "Classifying News Stories using Memory Based Reasoning", Brij Masand, June 1992 ("Masand").
e. "WebWatcher: Machine Learning and Hypertext", Thorsten Joachims et al., May 29, 1995 ("Joachims").
f. JP Publication No. H07-114572 to Yuasa ("Yuasa").
g. "Wire Service Transmission Guidelines Number 84-2", Special Report / American Newspaper Publishers Association, ANPA June 14, 1984 ("WST Guidelines").
h. "The Associated Press Stylebook and Libel Manual", The Associated Press, 1994 ("AP Stylebook").

And then the application of those references to each of the independent claims. According to the examiner:

1. Independent claims 20 and 63 are anticipated by the Bender reference;
4. Independent claims 20 and 63 are obvious over Chesnais in view of AP Stylebook and further in view of Wire Service Transmission Guidelines;
6. Independent claims 20 and 63 are obvious over Chesnais in view of Bender;
8. Independent claims 20 and 63 are anticipated by Joachims;
11. Independent claims 39 and 82 are anticipated by Masand;
12. Independent claims 39 and 82 are anticipated by Iwayama; and
15. Independent claims 39 and 82 are anticipated by Yuasa.

For clarification, the use of the term "anticipated by" means that the claimed invention is not novel.

Ground #1 - Bender

RE: Claim 20

A method for acquiring and reviewing a body of information, wherein the body of information includes a plurality of segments, each segment representing a defined set of information in the body of information, the method comprising the steps of:

Bender discloses a method for acquiring and reviewing a body of information, wherein the body of information includes a plurality of segments, each segment representing a defined set of information in the body of information (e.g., "a news retrieval system where the news editor has been replaced by the personal computer. A variety of both local and remote databases which operate passively as well as interactively are accessed by 'reporters.' These 'reporters' are actually software interfaces, which are programmed to gather news"). Bender at pp. 81-82. Bender also discloses that news items in the closed captioned data are delimited with certain characters, such as ">>>." Bender at p. 82.

acquiring data representing the body of information;

Bender discloses acquiring data representing the body of information (e.g., "The embodiment of these media experiments is a news retrieval system where the news editor has been replaced by the personal computer. A variety of both local and remote databases which operate passively as well as interactively are accessed by 'reporters.' These 'reporters' are actually software interfaces, which are programmed to gather news. Ideally, they are 'broadcatching'; that is to say, watching all broadcast television channels, listening to all radio transmissions, and reading all newspapers, magazines, and journals"; "News articles are collected based on a summary of topical events compiled daily by the wire services, in anticipation of the items which will be reported during the evening news telecast.") Bender at pp. 81-82.

Thus, Bender discloses that the system acquires, among other information, broadcast news and the closed caption data associated with the broadcast, in addition to news wire stories. These are exactly the same types of data that the '507 patent describes in its preferred embodiment. '507 patent 9:61-10:16, 20:15-21, 28:5-23.

storing the acquired data;

Bender discloses storing the acquired data, such as news wire stories and broadcast data (e.g., "News articles are collected based on a summary of topical events compiled daily by the wire services, in anticipation of the items which will be reported during the evening news telecast.") Bender at pp. 81-82 and 85. Further, Bender explains that the Network Plus system uses software interfaces, called "reporters" that access "both local and remote databases" to perform their news editing and presentation functions (i.e., "data and processing are packaged locally.") Bender at pp. 81 and 84. Bender further explains that, with respect to data from the broadcast, Network Plus also stores acquired data from the broadcast (e.g., "The presentation is driven by a processor that scans the closed caption data transmitted along with the broadcast ....Selected frames drawn from the telecast and stored in local memory are also presented as well"). Bender at p. 81 and Fig. 2 (p. 86). Further, Bender discloses that a printed version of annotated broadcast can be provided after the broadcast, which necessarily requires storing the data in order to generate a printed version. Bender at pp. 81 & 84-85 (describing the postprocessing used to generate still images). Moreover, one skilled in the art would also understand that Bender's Network Plus system necessarily discloses storing the acquired data because Bender's disclosure of comparing data from the news wire stories and the broadcast via keyword searching would require storing the data so that the keyword searching, comparison and display described in Bender could be performed. Bender at pp. 85-86. In short, Bender discloses several different ways in which acquired data is stored.

generating a display of a first segment of the body of information from data that is part of the stored data;

Bender discloses generating a display of a first segment of the body of information from data that is part of the stored data (e.g., "The display is divided into three sections (figure 2). In the lower right quadrant, the news telecast is shown live, in its entirety.... A third section, the upper right quadrant is reserved for displaying video stills extracted from the broadcast.") Bender at FIGs. 1 and 2 and pp. 81-82. Thus, the display of the broadcast news (lower right quadrant of Fig. 2) is a display of a first segment from data that is part of the stored data. Alternatively, the video stills (upper right quadrant) may also be considered a first segment. Again, this is exactly the same type of display of broadcast news that is described not just in the claims, but in the preferred embodiment of the '507 patent at 10:14-16 ("Additionally when the user is observing a particular news story in an audiovisual news program, the invention can identify and display a related text news story or stories.")

comparing data representing a segment of the body of information to data representing a different segment of the body of information to determine whether, according to one or more predetermined criteria, the compared segments are related; and

Bender discloses comparing data representing a segment of the body of information to data representing a different segment of the body of information to determine whether, according to one or more predetermined criteria, the compared segments are related because, for example, Bender compares closed caption data representing the news broadcast (one segment) to news wire text stories (a different segment) via keyword matching to determine, whether according to predetermined criteria (e.g., a threshold number of matched keywords), the segments are related. See e.g., Bender at pp. 82-83 (describing keyword matching process)("Network Plus is comprised of two procedural components. One gathers information prior to the broadcast. The other matches stories during the broadcast."(emphasis added); "The primary function of Network Plus is to correlate news wire stories and live broadcasts ....A keyword matching scheme was chosen, based upon empirical evidence that there exists a sufficient correspondence between words found in the transcript and words found in the wire service stories. If the number of words common to both the transcript and a trial story exceeded some threshold, the two were designated as related ....A threshold of four words worked well in this experiment...")(emphasis added). Bender further provides a specific example illustrating the process for comparing a news wire story about the nuclear accident at Chernobyl to a television broadcast on "ABC Nightly News" to determine they were related. Id. Thus, Bender discloses at least comparing the closed caption data for the news broadcast with the news wire text via keyword matching to determine whether according to a predetermined threshold for keyword matching (e.g., four common words), the broadcast and the news wire story are related. 
Once again, Bender discloses the exact same type of comparison between closed caption data and news wire text that is described not just in the claim, but in the preferred embodiment of the '507 patent wherein closed caption data for the news broadcast is compared to news wire text to determine if they are related by "any appropriate method." '507 patent at 28:5-23, 36-38.
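To make the keyword matching concrete, here is a minimal sketch of a threshold test of the kind the quoted passage describes: count the words two texts have in common and call them related if the count exceeds a threshold. This is our illustration, not Bender's code. The stop-word list, the tokenizing rule, the function names and the sample sentences are all assumptions; only the idea of a common-word count and the threshold of four come from the quotation.

```python
import re

THRESHOLD = 4  # the quoted passage says a threshold of four words worked well

# Hypothetical stop list: the quoted passages do not say how common words were handled.
STOP_WORDS = {"the", "a", "an", "of", "in", "to", "and", "at", "on", "is", "was", "by"}


def keywords(text):
    """Return the set of lowercase words in text, minus the assumed stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if w not in STOP_WORDS}


def related(transcript, story, threshold=THRESHOLD):
    """Two texts are related if the number of shared keywords exceeds the threshold."""
    common = keywords(transcript) & keywords(story)
    return len(common) > threshold


# Invented sample texts, for illustration only.
caption = "Soviet officials report a nuclear accident at the Chernobyl power plant"
wire = "A nuclear accident at the Chernobyl power plant was reported by Soviet officials"
other = "The city council voted to repair the bridge"

print(related(caption, wire))   # True: seven keywords in common
print(related(caption, other))  # False: no keywords in common
```

The point of the sketch is how little is needed: a stored copy of both texts, a fixed criterion set in advance, and a comparison.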

generating a display of a portion of, or a representation of, a second segment of the body of information from data that is part of the stored data, wherein the display of the portion or representation of the second segment is generated in response to the display of a first segment to which the second segment is related.

Bender discloses generating a display of a portion of, or a representation of, a second segment of the body of information from data that is part of the stored data, wherein the display of the portion or representation of the second segment is generated in response to the display of a first segment to which the second segment is related because, for example, Bender discloses displaying the news wire text that has been determined to be related to the television news broadcast, in response to and along with the television news. See e.g., Bender Figs. 1 (p.85) ("Locally Packaged Television. On the top is the original broadcast... On the right, the map is replaced with text from the news wire services") and Fig. 2 (p. 86) (The live broadcast is in the lower right quadrant ....Text from the wire services is on the left); Bender at p. 81 ("Network Plus annotates the television news with articles drawn from a local copy of wire service new material selected and presented along with the video in real time"); Bender at pp. 81-82 ("The display is divided into three sections (figure 2). In the lower right quadrant, the news telecast is shown live, in its entirety. The left half of the screen is used to display related news wire stories..")(emphasis added). Once again, Bender discloses the same type of display described not just in the claim, but in the preferred embodiment of the '507 patent - the second segment (the news wire text) is displayed in response to and along with the news broadcast and stills. Compare '507 patent FIG. 2B with Bender Figs. 1 and 2; see also '507 patent at 14:64-15:3, 18:52-67.


RE: Claim 63

A computer readable medium encoded with one or more computer programs for enabling acquisition and review of a body of information, wherein the body of information includes a plurality of segments, each segment representing a defined set of information in the body of information, comprising:

Bender discloses a computer readable medium encoded with one or more computer programs for enabling acquisition and review of a body of information, wherein the body of information includes a plurality of segments, each segment representing a defined set of information in the body of information (e.g., "a news retrieval system where the news editor has been replaced by the personal computer. A variety of both local and remote databases which operate passively as well as interactively are accessed by 'reporters.' These 'reporters' are actually software interfaces, which are programmed to gather news"). Bender at pp. 81-82. Bender also discloses that news items in the closed captioned data are delimited with certain characters, such as ">>>." Bender at p. 82.

instructions for acquiring data representing the body of information;

Bender discloses instructions for acquiring data representing the body of information (e.g., "The embodiment of these media experiments is a news retrieval system where the news editor has been replaced by the personal computer. A variety of both local and remote databases which operate passively as well as interactively are accessed by 'reporters.' These 'reporters' are actually software interfaces, which are programmed to gather news. Ideally, they are 'broadcatching'; that is to say, watching all broadcast television channels, listening to all radio transmissions, and reading all newspapers, magazines, and journals"; "News articles are collected based on a summary of topical events compiled daily by the wire services, in anticipation of the items which will be reported during the evening news telecast.") Bender at pp. 81-82. Thus, Bender discloses software for acquiring, among other information, broadcast news and the closed caption data associated with the broadcast, in addition to news wire stories. These are exactly the same types of data that the '507 patent describes in its preferred embodiment. '507 patent 9:61-10:16, 20:15-21, 28:5-23.

instructions for storing the acquired data;

Bender discloses instructions for storing the acquired data, such as news wire stories and broadcast data (e.g., "News articles are collected based on a summary of topical events compiled daily by the wire services, in anticipation of the items which will be reported during the evening news telecast.") Bender at pp. 81-82 and 85. Further, Bender explains that the Network Plus system uses software interfaces, called "reporters" that access "both local and remote databases" to perform their news editing and presentation functions (i.e., "data and processing are packaged locally.") Bender at pp. 81 and 84. Bender further explains that, with respect to data from the broadcast, Network Plus also stores acquired data from the broadcast (e.g., "The presentation is driven by a processor that scans the closed caption data transmitted along with the broadcast ....Selected frames drawn from the telecast and stored in local memory are also presented as well"). Bender at p. 81 and Fig. 2 (p. 86). Further, Bender discloses that a printed version of annotated broadcast can be provided after the broadcast, which necessarily requires storing the data in order to generate a printed version. Bender at pp. 81 & 84-85 (describing the postprocessing used to generate still images). Moreover, one skilled in the art would also understand that Bender's Network Plus system necessarily discloses storing the acquired data because Bender's disclosure of comparing data from the news wire stories and the broadcast via keyword searching would require storing the data so that the keyword searching, comparison and display described in Bender could be performed. Bender at pp. 85-86. In short, Bender discloses several different ways in which acquired data is stored.

instructions for generating a display of a first segment of the body of information from data that is part of the stored data;

Bender discloses instructions for generating a display of a first segment of the body of information from data that is part of the stored data (e.g., "The display is divided into three sections (figure 2). In the lower right quadrant, the news telecast is shown live, in its entirety. A third section, the upper right quadrant is reserved for displaying video stills extracted from the broadcast.") Bender at FIGs. 1 and 2 and pp. 81-82. Thus, the display of the broadcast news (lower right quadrant of Fig. 2) is a display of a first segment from data that is part of the stored data. Alternatively, the video stills (upper right quadrant) may also be considered a first segment. Again, this is exactly the same type of display of broadcast news that is described not just in the claims, but in the preferred embodiment of the '507 patent at 10:14-16 ("Additionally when the user is observing a particular news story in an audiovisual news program, the invention can identify and display a related text news story or stories.")

instructions for comparing data representing a segment of the body of information to data representing a different segment of the body of information to determine whether, according to one or more predetermined criteria, the compared segments are related; and

Bender discloses instructions for comparing data representing a segment of the body of information to data representing a different segment of the body of information to determine whether, according to one or more predetermined criteria, the compared segments are related because, for example, Bender compares closed caption data representing the news broadcast (one segment) to news wire text stories (a different segment) via keyword matching to determine, whether according to predetermined criteria (e.g., a threshold number of matched keywords), the segments are related. See e.g., Bender at pp. 82-83 (describing keyword matching process)("Network Plus is comprised of two procedural components. One gathers information prior to the broadcast. The other matches stories during the broadcast", "The primary function of Network Plus is to correlate news wire stories and live broadcasts ... A keyword matching scheme was chosen, based upon empirical evidence that there exists a sufficient correspondence between words found in the transcript and words found in the wire service stories. If the number of words common to both the transcript and a trial story exceeded some threshold, the two were designated as related. A threshold of four words worked well in this experiment.. .")(emphasis added). Bender further provides a specific example illustrating the process for comparing a news wire story about the nuclear accident at Chernobyl to a television broadcast on "ABC Nightly News" to determine they were related. Id. Thus, Bender discloses at least comparing the closed caption data for the news broadcast with the news wire text via keyword matching to determine whether according to a predetermined threshold for keyword matching (e.g., four common words), the broadcast and the news wire story are related. 
Once again, Bender discloses the exact same type of comparison between closed caption data and news wire text that is described not just in the claim, but in the preferred embodiment of the '507 patent wherein closed caption data for the news broadcast is compared to news wire text to determine if they are related by "any appropriate method." '507 patent at 28:5-23, 36-38.

instructions for generating a display of a portion of, or a representation of, a second segment of the body of information from data that is part of the stored data, wherein the display of the portion or representation of the second segment is generated in response to the display of a first segment to which the second segment is related.

Bender discloses instructions for generating a display of a portion of, or a representation of, a second segment of the body of information from data that is part of the stored data, wherein the display of the portion or representation of the second segment is generated in response to the display of a first segment to which the second segment is related because, for example, Bender discloses displaying the news wire text that has been determined to be related to the television news broadcast, in response to and along with the television news. See e.g., Bender Figs. 1 (p.85) ("Locally Packaged Television. On the top is the original broadcast... On the right, the map is replaced with text from the news wire services") and Fig. 2 (p. 86) (The live broadcast is in the lower right quadrant ....Text from the wire services is on the left); Bender at p. 81 ("Network Plus annotates the television news with articles drawn from a local copy of wire service new material selected and presented along with the video in real time."); Bender at pp. 81-82 ("The display is divided into three sections (figure 2). In the lower right quadrant, the news telecast is shown live, in its entirety. The left half of the screen is used to display related news wire stories..")(emphasis added). Once again, Bender discloses the same type of display described not just in the claim, but in the preferred embodiment of the '507 patent - the second segment (the news wire text) is displayed in response to and along with the news broadcast and stills. Compare '507 patent FIG. 2B with Bender Figs. 1 and 2; see also '507 patent at 14:64-15:3, 18:52-67.


Ground #4 - Chesnais, AP Stylebook and Wire Service Transmission

RE: Claim 20

A method for acquiring and reviewing a body of information, wherein the body of information includes a plurality of segments, each segment representing a defined set of information in the body of information, the method comprising the steps of:

Chesnais discloses a method for acquiring and reviewing a body of information, wherein the body of information includes a plurality of segments, each segment representing a defined set of information in the body of information (e.g., "Fishwrap is an experimental electronic newspaper system available at MIT." (p. 275); "The Fishwrap design readily accepts traditional news wire stories and direct contributions from the community." (Id.); "All items coming into the system are analyzed for geographic and topical relevancy." (Id.) (emphasis added); "Access to Fishwrap's personalized news system appears as a World Wide Web (WWW) hypertext link" (Id.); "[W]ithin Fishwrap an article begins when it appears on any incoming data stream. Each data stream has its own supplier program which monitors incoming traffic. Articles come to Fishwrap in many formats: over satellite, radio frequencies, email, and phone line. Each supplier program does three things: First it translates all news items into an internal, wire-independent representation using Dtype [3] expandable data structure. Second the supplier adds a signature to each item. The signature represents an inference made from the data. Finally each article is supplied to the Fishwrap news database server." (p. 277) (emphasis added);

Note: in the quotation cited above, Chesnais uses the word "article" in some aspects to refer to "all news items," not just news items that are articles. Further, references in quotes to Chesnais in the form of [number] appear this way in the Chesnais publication and refer to the references listed at the end of the article.

"A Fishwrap reader starts with their edition's table of contents, then focuses on a particular news topic and, ultimately, articles that are illustrated with graphics and audio." (p. 276).) Further as shown in Fig. 6 the "News Server" receives many different types of data, including news wire feeds, evening news stills and video, and audio files. (Fig. 6, at 278). As described above, each of these different data items represent distinct segments that Fishwrap analyzes and creates a "signature" for.

acquiring data representing the body of information;

Chesnais discloses acquiring data representing the body of information (e.g., "The Fishwrap design readily accepts traditional news wire stories and direct contributions from the community." (p. 275); "[W]ithin Fishwrap an article begins when it appears on any incoming data stream. Each data stream has its own supplier program which monitors incoming traffic. Articles come to Fishwrap in many formats: over satellite, radio frequencies, email, and phone line." (p. 277); "Suppliers and Servers - Fishwrap receives news from a variety of sources and formats. The traditional news wires (Associated Press, Reuters, Knight-Ridder/Tribune, and BPI Entertainment all are providing their news feeds to Fishwrap) come in ANPA [7] format. Fishwrap also receives submissions via electronic mail and a number of 'homebrew' formats" (p. 278)). Further, as shown above with respect to Fig. 6, the "NEWS SERVER" acquires information from a variety of sources, including text, video, images and audio. (Id.). Chesnais also explains that "[o]ur current Fishwrap news server uses a media-independent representation that allows it to accept items with graphics, audio, text, and motion pictures. It is up to the presentation application to determine the appropriate medium to provide." (p. 279.) As exemplified by the above citations, Chesnais discloses acquiring a variety of different types of data that make up a body of information.

storing the acquired data;

Chesnais discloses storing the acquired data, including for example news wire stories, photos and audio files in databases. See e.g., Chesnais at p. 277 ("[W]ithin Fishwrap an article begins when it appears on any incoming data stream. Each data stream has its own supplier program which monitors incoming traffic ....Finally each article is supplied to the Fishwrap news database server [4] where it will remain for the next 48 hours."); and Chesnais at p. 278 ("Fishwrap also checks its photo and audio databases to see if there are photos and sound recordings that match the story."); see also id. at Fig. 6. Thus, Chesnais describes that it stores all incoming items.

generating a display of a first segment of the body of information from data that is part of the stored data;

Chesnais discloses generating a display of a first segment of the body of information from data that is part of the stored data because, for example, it discloses generating a display of an article. See e.g., Chesnais at 277 ("When a reader generates a newspaper through Fishwrap, an article is retrieved if it matches one of the reader's global topics of interest .... [The] article is then rendered by the frontend application"); see also Figs. 2 and 13. Chesnais further explains that it uses a web browser to provide the display. See e.g., Chesnais at p. 275 ("World Wide Web browser access allows for easy traversing of the information space (see Figure 2)."). Chesnais further explains how the user navigates to display an article: "[a] Fishwrap reader starts with their edition's table of contents, then focuses on a particular news topic and, ultimately, articles that are illustrated with graphics and audio." Chesnais at p. 276. Figs. 2 and 13 further illustrate how a user of Fishwrap can navigate to a particular news item, such as the article "New Evidence About Bombing Suspect Emerges," which represents an example of the display of a first segment generated from the stored data.

comparing data representing a segment of the body of information to data representing a different segment of the body of information to determine whether, according to one or more predetermined criteria, the compared segments are related; and

Chesnais discloses comparing data representing a segment of the body of information to data representing a different segment of the body of information to determine whether, according to one or more predetermined criteria, the compared segments are related. As explained below, Chesnais discloses that all incoming items are provided with a signature which is used for searching, and that when an article is rendered Fishwrap also searches the photo and audio databases for items that "match the story" (i.e., related items). See e.g., Chesnais at 277 ("When a reader generates a newspaper through Fishwrap ....The article is then rendered by the frontend application with hints given by the signatures. Fishwrap also checks its photo and audio databases to see if there are photos and sound recordings that match the story.") (emphasis added); and Chesnais at p. 281 ("One blind student appreciated the ... audio segments for illustrations."). As discussed in more detail below, Chesnais discloses the "comparing" as identifying "photos and sound recordings that match the story." Chesnais makes this possible because, as addressed immediately below, the Fishwrap system stores the incoming items (e.g., stories, audio files, and photos) with "signatures" ("data representing" a segment).

For example, Chesnais explains that the "signatures," which are derived from the incoming data, are applied to all items coming into the system. Chesnais at p. 275 ("All items coming into the system are analyzed for geographic and topical relevancy."). These signatures are created along with a particular data structure ("Dtype") and provide "inferences" about the item:

within Fishwrap an article begins when it appears on any incoming data stream. Each data stream has its own supplier program which monitors incoming traffic... ...First it translates all news items into an internal, wire-independent representation using Dtype [3] expandable data structure. Second the supplier adds a signature to each item. The signature represents an inference made from the data.

Chesnais at p. 277.

Further, as shown in Fig. 9 of Chesnais, the signature process (which adds the content labeled 1 and 3 to the item) provides additional data representing the item (i.e., an inference made from the data), such as a headline ("Survivors of Crash Victim Sue USAir"), a "slugword," and a "summary."

Note that the Dtype data structure is described in Chesnais by example, but also by citation to reference [3] Abramson, Nathan S. The dtype library or, how to write a server in less time that it takes to read this manual, Technical Report, Electronic Publishing Group, MIT Media Laboratory, Cambridge, MA, 1992.

See Chesnais at p. 279.

Chesnais further explains that the signatures are used in searches ("because they significantly speed up the searches") used to build a paper to present to the user, which presentation, as described above and shown for example by the photo thumbnails in Figs. 2 and 13, also includes "photos and sound recordings that match the story." See e.g., Chesnais at Figs. 2 and 13 and p. 277 (matching) & 279 (using signatures to "significantly speed up the searches."). Thus, the signatures, which include data representing the segments (i.e., a headline and a summary like those shown in the third portion of Fig. 9), include predetermined criteria used to determine whether particular segments are related.

Chesnais discloses "comparing data representing a segment of the body of information to data representing a different segment of the body of information to determine whether, according to one or more predetermined criteria, the compared segments are related." However, even if the Examiner determines that Chesnais did not expressly disclose comparing signatures of two items to determine if they are related, it would have been obvious to one of ordinary skill in the art to perform the recited comparison step based on Chesnais's disclosure in view of the WST Guidelines and AP Stylebook.

Chesnais references the Wire Service Transmission Guidelines Special Report No. 84-2, from the American Newspaper Publishers Association (ANPA). Chesnais at p. 278 & 282. These guidelines specify the content and format of headers applied to newswire items, including a field for keywords. WST Guidelines at 1 & 2. The Associated Press ("AP") used these headers. Id. at 1. The AP Stylebook indicates that stories, photos, and graphics follow the same coding requirements for wire transmission. AP Stylebook at p. 297-299. "Every news item in the AP report has a keyword slug line." Id. at 299. Further, AP photos had associated text captions. Id. at p. 293-296. Chesnais states that the signature added to an item is "derived from the ANPA format coding." Chesnais, p. 279. As shown in Fig. 9, the signature of an item included, for example, a "slugword" field with keywords. In short, the ANPA format coding for stories and photo captions from the AP provided the same type of information. Thus, to the extent it is not inherently disclosed, it would have been obvious to one of ordinary skill in the art that Chesnais's disclosure of "signatures" (described above) and checking Fishwrap's databases for "photos . . . that match the story" would include comparing the signature for a news story with the signatures for photos, including the text captions, (or audio files) to identify photos that are related to the news story using predetermined criteria, such as matching one or more fields from the signatures (e.g., a slugword, headline, or summary). In fact, this is one of the well-known functions that databases are designed to perform, using coded fields to make identification of information stored in the database easier. Comparing signatures of items to determine whether two items are related is applying a known technique to a known method to yield predictable results.
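The field-matching comparison argued above can be illustrated with a minimal sketch. It is not drawn from Chesnais or the office action: the `Signature` class, the tokenization, the slugword values, and the "share at least one slugword keyword" criterion are all assumptions made for illustration. Only the field names (headline, slugword, summary) and the sample headline follow the description of Fig. 9.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signature:
    """Hypothetical stand-in for a signature; field names follow Fig. 9 as described."""
    headline: str
    slugword: str
    summary: str


def keywords(text: str) -> set:
    # Illustrative tokenization only: lowercase, treat hyphens as separators.
    return set(text.lower().replace("-", " ").split())


def related(a: Signature, b: Signature) -> bool:
    # Assumed predetermined criterion: the slugword fields share a keyword.
    return bool(keywords(a.slugword) & keywords(b.slugword))


# Slugword and summary values below are invented for the example.
story = Signature("Survivors of Crash Victim Sue USAir", "usair-crash-suit", "illustrative summary")
photo = Signature("Wreckage at crash site", "usair-crash", "illustrative caption")
other = Signature("City council vote", "council-budget", "illustrative caption")

print(related(story, photo))  # the story and the photo share slugword keywords
print(related(story, other))  # no shared slugword keywords
```

Any other coded field, such as the headline or summary, could be substituted for the slugword in `related` without changing the shape of the comparison.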

generating a display of a portion of, or a representation of, a second segment of the body of information from data that is part of the stored data, wherein the display of the portion or representation of the second segment is generated in response to the display of a first segment to which the second segment is related.

Chesnais discloses generating a display of a portion of, or a representation of, a second segment of the body of information from data that is part of the stored data, wherein the display of the portion or representation of the second segment is generated in response to the display of a first segment to which the second segment is related. Specifically, Chesnais discloses that the Fishwrap system, in the following order, (1) renders an article, (2) then checks for photos or audio that match the article, and (3) then displays the related photos or audio. See e.g., Chesnais at p. 277 ("The article is then rendered by the front end application with hints given by the signatures. Fishwrap also checks its photo and audio databases to see if there are photos and sound recordings that match the story. For most Fishwrap readers, articles are rendered in hypertext markup language (HTML) for a WWW browser."); and Chesnais at p. 281 ("One blind student appreciated the ... audio segments for illustrations."). Chesnais further explains "On Demand Publishing: Fishwrap's use of the WWW is different from existing servers. Rather than be an archive of documents, Fishwrap contructs [sic] its personalized news documents on the fly. Building documents on demand allows Fishwrap to provide the most recent news." (Id. at 280). Finally, as shown in Figs. 2 and 13, Fishwrap presents a user with photos (thumbnails shown below) and audio (display of a portion or representation of a second segment) that "match" or are related to the article being displayed (the first segment).

Reasons to Combine Chesnais with AP Stylebook and WST Guidelines

Chesnais is directed toward an electronic newspaper that builds a presentation on the fly and combines for users a variety of data types (e.g., newswire stories, photos and audio, video etc.) based on their similarity. Chesnais, p. 275. For example, Chesnais explains that "[w]hen a reader generates a newspaper through Fishwrap ....The article is then rendered by the front end application with hints given by the signatures. Fishwrap also checks its photo and audio databases to see if there are photos and sound recordings that match the story." Chesnais at p. 277 (emphasis added).

Chesnais also discloses receiving news feeds from the Associated Press ("AP"). Chesnais, p. 278; Fig. 6. Further, one skilled in the art would understand that news wire services have long provided photographs by wire service, since at least 1935 when the AP introduced its Wirephoto Network (see e.g., http://www.ap.org/pages/history/photos.htm) (describing the development of AP's news wire photo service). Chesnais references the Wire Service Transmission Guidelines Special Report No. 84-2, from the American Newspaper Publishers Association (ANPA). Chesnais at p. 278 & 282. These guidelines specify the content and format of headers applied to newswire items, including a field for keywords. WST Guidelines at 1 & 2. The AP used these headers. Id. at 1. The AP Stylebook indicates that stories, photos, and graphics follow the same coding requirements for wire transmission. AP Stylebook at p. 297-299. "Every news item in the AP report has a keyword slug line." Id. at 299. Further, AP photos had associated text captions. Id. at p. 293-296. Chesnais states that the signature added to an item is "derived from the ANPA format coding." Chesnais, p. 279. A person of ordinary skill in the art, looking for a method of determining similarities between two information sources such as the articles and other content disclosed in Chesnais would have been motivated to compare the signatures for the news stories and photos (or sound recordings). Because Chesnais discloses that each item in the system is assigned a "signature" that includes keywords and discloses identifying photos and audio that "match" a news article, and the AP Stylebook discloses that all news items transmitted over the news wire have a slugword containing keywords, one of skill in the art would have been motivated to combine the teachings of the WST Guidelines and the AP Stylebook regarding the slugword keywords with the disclosure of Chesnais to identify matching photos and sound recordings. 
Because Chesnais explicitly discloses receiving news wire items from the AP, it would have been obvious to use the keyword slugline of an AP news item as a basis to compare information in Chesnais because the "signatures" contain keywords. Moreover, the combination of Chesnais, the WST Guidelines, and the AP Stylebook yields a predictable result, and one of ordinary skill in the art would clearly be capable of combining these systems to achieve the expected result of determining similarities between two information sources.


RE: Claim 63

A computer readable medium encoded with one or more computer programs for enabling acquisition and review of a body of information, wherein the body of information includes a plurality of segments, each segment representing a defined set of information in the body of information, comprising:

Chesnais discloses a computer readable medium encoded with one or more computer programs for enabling acquisition and review of a body of information, wherein the body of information includes a plurality of segments, each segment representing a defined set of information in the body of information (e.g., "Fishwrap is an experimental electronic newspaper system available at MIT." (p. 275); "The Fishwrap design readily accepts traditional news wire stories and direct contributions from the community." (Id.); "All items coming into the system are analyzed for geographic and topical relevancy." (Id.) (emphasis added); "Access to Fishwrap's personalized news system appears as a World Wide Web (WWW) hypertext link" (Id.); "[W]ithin Fishwrap an article begins when it appears on any incoming data stream. Each data stream has its own supplier program which monitors incoming traffic. Articles come to Fishwrap in many formats: over satellite, radio frequencies, email, and phone line. Each supplier program does three things: First it translates all news items into an internal, wire-independent representation using Dtype [3] expandable data structure. Second the supplier adds a signature to each item. The signature represents an inference made from the data. Finally each article is supplied to the Fishwrap news database server." (p. 277) (emphasis added);

Note: in this quotation, Chesnais uses the word "article" in some aspects to refer to "all news items," not just news items that are articles. Further, references in quotes to Chesnais in the form of [number] appear this way in the Chesnais publication and refer to the references listed at the end of the article.

"A Fishwrap reader starts with their edition's table of contents, then focuses on a particular news topic and, ultimately, articles that are illustrated with graphics and audio." (p. 276).) Further as shown in Fig. 6 the "News Server" receives many different types of data, including news wire feeds, evening news stills and video, and audio files. (Fig. 6, at 278). As described above, each of these different data items represent distinct segments that Fishwrap analyzes and creates a "signature" for.

The Fishwrap electronic newspaper system includes multiple servers that contain computer readable media comprising instructions for performing the functions disclosed by Chesnais (e.g., "Glue provides a standard 'plug and play' set of tools for servers, knowledge representations modules, user profiling systems, and presentation modules." (p. 278)). Further, Chesnais also describes multiple modules interacting as part of Glue, including the News Server acquiring the news items (pp. 278-79), the supplier programs adding signatures (pp. 277 & 278) and the Front End Application rendering the presentation to a user (p. 277). Certain module names are shown in boldface in Fig. 7 (p. 278).
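The three things each supplier program is quoted as doing (translate to a wire-independent representation, add a signature, supply the item to the news database) can be sketched roughly as follows. This is an illustration under stated assumptions, not the Fishwrap code: the `supply` function, the dictionary layout standing in for a Dtype, and the "first line is the headline" inference are all hypothetical.

```python
def supply(raw_item: dict, database: list) -> dict:
    """Hypothetical supplier step; names and data layout are assumptions."""
    # First: translate the incoming item into a wire-independent form
    # (a plain dict stands in for the Dtype structure here).
    item = {
        "body": raw_item.get("text", ""),
        "media": raw_item.get("media", "text"),
    }
    # Second: add a signature, i.e. an inference made from the data.
    # The inference used here (first line as headline) is illustrative only.
    lines = item["body"].splitlines()
    item["signature"] = {"headline": lines[0] if lines else ""}
    # Finally: supply the item to the news database (a list in this sketch).
    database.append(item)
    return item


news_db = []
stored = supply({"text": "Survivors of Crash Victim Sue USAir\nBody text"}, news_db)
print(stored["signature"]["headline"])
```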

instructions for acquiring data representing the body of information;

Chesnais discloses instructions for acquiring data representing the body of information (e.g., "The Fishwrap design readily accepts traditional news wire stories and direct contributions from the community." (p. 275); "[W]ithin Fishwrap an article begins when it appears on any incoming data stream. Each data stream has its own supplier program which monitors incoming traffic. Articles come to Fishwrap in many formats: over satellite, radio frequencies, email, and phone line." (p. 277); "Suppliers and Servers - Fishwrap receives news from a variety of sources and formats. The traditional news wires (Associated Press, Reuters, Knight-Ridder/Tribune, and BPI Entertainment all are providing their news feeds to Fishwrap) come in ANPA [7] format. Fishwrap also receives submissions via electronic mail and a number of 'homebrew' formats" (p. 278)). Further, as shown above with respect to Fig. 6, the "NEWS SERVER" acquires information from a variety of sources, including text, video, images and audio. (Id.). Chesnais also explains that "[o]ur current Fishwrap news server uses a media-independent representation, that allows it to accept items with graphics, audio, text, and motion pictures. It is up to the presentation application to determine the appropriate medium to provide." (p. 279.) As exemplified by the above citations, Chesnais discloses acquiring a variety of different types of data that make up a body of information.

instructions for storing the acquired data;

Chesnais discloses instructions for storing the acquired data, including for example news wire stories, photos and audio files in databases. See e.g., Chesnais at p. 277 ("[W]ithin Fishwrap an article begins when it appears on any incoming data stream. Each data stream has its own supplier program which monitors incoming traffic ....Finally each article is supplied to the Fishwrap news database server [4] where it will remain for the next 48 hours."); and Chesnais at p. 278 ("Fishwrap also checks its photo and audio databases to see if there are photos and sound recordings that match the story."); see also id. at Fig. 6. Thus, Chesnais describes that it stores all incoming items.
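The quoted 48-hour retention in the news database server can be pictured with a small sketch. It is an assumption-laden illustration rather than a description of the actual server: the `NewsStore` class and its methods are hypothetical, and treating an item as expired once exactly 48 hours have elapsed is a choice made here, not something the quotation specifies.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(hours=48)  # retention period quoted from Chesnais at p. 277


class NewsStore:
    """Hypothetical in-memory store that drops items older than RETENTION."""

    def __init__(self):
        self._items = []  # list of (arrival_time, item) pairs

    def add(self, item, now: datetime) -> None:
        self._items.append((now, item))

    def current(self, now: datetime) -> list:
        # Keep only items that arrived less than 48 hours before `now`.
        self._items = [(t, i) for t, i in self._items if now - t < RETENTION]
        return [i for _, i in self._items]


store = NewsStore()
arrival = datetime(1995, 1, 1, 12, 0)  # arbitrary example timestamp
store.add("wire story", arrival)
print(store.current(arrival + timedelta(hours=47)))
print(store.current(arrival + timedelta(hours=49)))
```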

instructions for generating a display of a first segment of the body of information from data that is part of the stored data;

Chesnais discloses instructions for generating a display of a first segment of the body of information from data that is part of the stored data. For example, it discloses generating a display of an article. See e.g., Chesnais at 277 (e.g., "When a reader generates a newspaper through Fishwrap, an article is retrieved if it matches one of the reader's global topics of interest...." [and an] "article is then rendered by the front end application"); see also Figs. 2 and 13. Chesnais further explains that it uses a web browser to provide the display. See e.g., Chesnais at p. 275 ("World Wide Web browser access allows for easy traversing of the information space (see Figure 2)."). Chesnais further explains how the user navigates to display an article: "[a] Fishwrap reader starts with their edition's table of contents, then focuses on a particular news topic and, ultimately, articles that are illustrated with graphics and audio." Chesnais at p. 276. Figs. 2 and 13 further illustrate how a user of Fishwrap can navigate to a particular news item, such as the article "New Evidence About Bombing Suspect Emerges," which represents an example of the display of a first segment generated from the stored data. Further, Fig. 7 shows the "appRender" module that renders the articles (p. 278). Thus, Chesnais discloses that Fishwrap has instructions to display the aforementioned first segment. See Chesnais at FIGS. 2 and 13.

instructions for comparing data representing a segment of the body of information to data representing a different segment of the body of information to determine whether, according to one or more predetermined criteria, the compared segments are related; and

Chesnais discloses instructions for comparing data representing a segment of the body of information to data representing a different segment of the body of information to determine whether, according to one or more predetermined criteria, the compared segments are related. As explained below, Chesnais discloses that all incoming items are provided with a signature, which is used for searching, and that when an article is rendered Fishwrap also searches the photo and audio databases for items that "match the story" (i.e., related items). See e.g., Chesnais at 277 ("When a reader generates a newspaper through Fishwrap ....The article is then rendered by the front end application with hints given by the signatures. Fishwrap also checks its photo and audio databases to see if there are photos and sound recordings that match the story.") (emphasis added); and Chesnais at p. 281 ("One blind student appreciated the ... audio segments for illustrations."). As discussed in more detail below, Chesnais discloses the "comparing" as identifying "photos and sound recordings that match the story." Chesnais makes this possible because, as addressed immediately below, the Fishwrap system stores the incoming items (e.g., stories, audio files, and photos) with "signatures" ("data representing" a segment).

For example, Chesnais explains that the "signatures," which are derived from the incoming data, are applied to all items coming into the system. Chesnais at p. 275 ("All items coming into the system are analyzed for geographic and topical relevancy."). These signatures are created along with a particular data structure ("Dtype") and provide "inferences" about the item:

within Fishwrap an article begins when it appears on any incoming data stream. Each data stream has its own supplier program which monitors incoming traffic. First it translates all news items into an internal, wire-independent representation using Dtype [3] expandable data structure. Second the supplier adds a signature to each item. The signature represents an inference made from the data. Chesnais at p. 277.

Further, as shown in Fig. 9 of Chesnais, the signature process (which adds the content labeled 1 and 3 to the item) provides additional data representing the item (i.e., an inference made from the data), such as a headline ("Survivors of Crash Victim Sue USAir"), a "slugword," and a "summary."

Note: the Dtype data structure is described in Chesnais by example, but also by citation to reference [3] Abramson, Nathan S. The dtype library or, how to write a server in less time that it takes to read this manual, Technical Report, Electronic Publishing Group, MIT Media Laboratory, Cambridge, MA, 1992.

See Chesnais at p. 279.

Chesnais further explains that the signatures are used in searches ("because they significantly speed up the searches") used to build a paper to present to the user, which presentation, as described above and shown for example by the photo thumbnails in Figs. 2 and 13, also includes "photos and sound recordings that match the story." See e.g., Chesnais at Figs. 2 and 13 and p. 277 (matching) & 279 (using signatures to "significantly speed up the searches."). Thus, the signatures, which include data representing the segments (i.e., a headline and a summary like those shown in the third portion of Fig. 9), include predetermined criteria used to determine whether particular segments are related.

Chesnais discloses "comparing data representing a segment of the body of information to data representing a different segment of the body of information to determine whether, according to one or more predetermined criteria, the compared segments are related." However, even if the Examiner determines that Chesnais did not expressly disclose comparing signatures of two items to determine if they are related, it would have been obvious to one of ordinary skill in the art to perform the recited comparison step based on Chesnais's disclosure in view of the AP Stylebook.

Chesnais references the Wire Service Transmission Guidelines Special Report No. 84-2, from the American Newspaper Publishers Association (ANPA). Chesnais at p. 278 & 282. Chesnais states that the signature added to an item is "derived from the ANPA format coding." Chesnais, p. 279. As shown in Fig. 9, the signature of an item included, for example, a "slugword" field with keywords. In short, the ANPA format coding for stories and photo captions from the AP provided the same type of information. Thus, to the extent it is not inherently disclosed, it would have been obvious to one of ordinary skill in the art that Chesnais's disclosure of "signatures" (described above) and checking Fishwrap's databases for "photos... that match the story" would include comparing the signature for a news story with the signatures for photos, including the text captions, (or audio files) to identify photos that are related to the news story using predetermined criteria, such as matching one or more fields from the signatures (e.g., a slugword, headline, or summary). In fact, this is one of the well-known functions that databases are designed to perform, using coded fields to make identification of information stored in the database easier. Comparing signatures of items to determine whether two items are related is applying a known technique to a known method to yield predictable results.

instructions for generating a display of a portion of, or a representation of, a second segment of the body of information from data that is part of the stored data, wherein the display of the portion or representation of the second segment is generated in response to the display of a first segment to which the second segment is related.

Chesnais discloses instructions for generating a display of a portion of, or a representation of, a second segment of the body of information from data that is part of the stored data, wherein the display of the portion or representation of the second segment is generated in response to the display of a first segment to which the second segment is related. Specifically, Chesnais discloses that the Fishwrap system, in the following order, (1) renders an article, (2) then checks for photos or audio that match the article, and (3) then displays the related photos or audio. See e.g., Chesnais at p. 277 ("The article is then rendered by the front end application with hints given by the signatures. Fishwrap also checks its photo and audio databases to see if there are photos and sound recordings that match the story. For most Fishwrap readers, articles are rendered in hypertext markup language (HTML) for a WWW browser."); and Chesnais at p. 281 ("One blind student appreciated the ... audio segments for illustrations."). Chesnais further explains "On Demand Publishing: Fishwrap's use of the WWW is different from existing servers. Rather than be an archive of documents, Fishwrap contructs [sic] its personalized news documents on the fly. Building documents on demand allows Fishwrap to provide the most recent news." (Id. at 280). Finally, as shown in Figs. 2 and 13, Fishwrap presents a user with photos (thumbnails shown below) and audio (display of a portion or representation of a second segment) that "match" or are related to the article being displayed (the first segment).

See Fishwrap at Figs. 2 and 13.
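The ordering relied on above, (1) render the article, (2) check the photo and audio databases for matches, (3) display the related items, can be sketched as a single page-building step. The function name, the dictionary fields, and the equality test on a `slug` field are all hypothetical; Chesnais as quoted does not specify how a "match" is computed.

```python
def build_page(article: dict, media_items: list, matches) -> list:
    """Hypothetical page builder; returns display elements in order."""
    # (1) Render the article (the first segment).
    page = [("article", article["headline"])]
    # (2) Check the photo and audio items for ones that match the article.
    found = [m for m in media_items if matches(article, m)]
    # (3) Display representations of the related items (second segments).
    page.extend((m["kind"], m["caption"]) for m in found)
    return page


article = {"headline": "New Evidence About Bombing Suspect Emerges", "slug": "bombing"}
media = [
    {"kind": "photo", "caption": "illustrative thumbnail", "slug": "bombing"},
    {"kind": "audio", "caption": "unrelated clip", "slug": "weather"},
]
# Assumed match criterion for the example: identical slug values.
print(build_page(article, media, lambda a, m: a["slug"] == m["slug"]))
```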


Ground #6 - Chesnais and Bender

RE: Claim 20

A method for acquiring and reviewing a body of information, wherein the body of information includes a plurality of segments, each segment representing a defined set of information in the body of information, the method comprising the steps of:

Chesnais discloses a method for acquiring and reviewing a body of information, wherein the body of information includes a plurality of segments, each segment representing a defined set of information in the body of information (e.g., "Fishwrap is an experimental electronic newspaper system available at MIT." (p. 275); "The Fishwrap design readily accepts traditional news wire stories and direct contributions from the community." (Id.); "All items coming into the system are analyzed for geographic and topical relevancy." (Id.) (emphasis added); "Access to Fishwrap's personalized news system appears as a World Wide Web (WWW) hypertext link" (Id.); "[W]ithin Fishwrap an article begins when it appears on any incoming data stream. Each data stream has its own supplier program which monitors incoming traffic. Articles come to Fishwrap in many formats: over satellite, radio frequencies, email, and phone line. Each supplier program does three things: First it translates all news items into an internal, wire-independent representation using Dtype [3] expandable data structure. Second the supplier adds a signature to each item. The signature represents an inference made from the data. Finally each article is supplied to the Fishwrap news database server." (p. 277) (emphasis added);

Note: in this quotation, Chesnais uses the word "article" in some aspects to refer to "all news items," not just news items that are articles. Further, references in quotes to Chesnais in the form of [number] appear this way in the Chesnais publication and refer to the references listed at the end of the article.

"A Fishwrap reader starts with their edition's table of contents, then focuses on a particular news topic and, ultimately, articles that are illustrated with graphics and audio." (p. 276).) Further as shown in Fig. 6 the "News Server" receives many different types of data, including news wire feeds, evening news stills and video, and audio files. (Fig. 6, at 278). As described above, each of these different data items represent distinct segments that Fishwrap analyzes and creates a "signature" for.

Reasons to Combine Chesnais and Bender

Chesnais is directed toward an electronic newspaper that builds a presentation on the fly and combines for users a variety of data types (e.g., newswire stories, photos and audio, video etc.) based on their similarity. Chesnais, p. 275. Similarly, Bender is directed to presenting news broadcasts and related news articles to users. Bender, p. 81. Both the Network Plus system of Bender and the Fishwrap system of Chesnais were developed at the MIT Media Laboratory, and Dr. Chesnais is a co-author of both publications. A person of ordinary skill in the art, looking for a method of determining similarities between two information sources such as the articles and other content disclosed in Chesnais would have been motivated to use the keyword matching scheme of Bender. Because Chesnais discloses that each item in the system is assigned a "signature" that includes keywords and discloses identifying photos and audio that "match" a news article, and Bender discloses using a keyword matching scheme to "match" news stories to a broadcast, one of skill in the art would have been motivated to combine Bender's keyword matching scheme with the disclosure of Chesnais to identify matching photos and sound recordings. Bender discloses that the Network Plus system's use of a threshold of four matching keywords to identify related items was "computationally inexpensive" and "worked well." Bender, p. 82. Thus, it would have been obvious to use the keyword matching scheme of Bender to compare information in Chesnais because the "signatures" contain keywords and the keyword matching scheme of Bender was "computationally inexpensive" yet also "worked well." Moreover, the combination of Chesnais and Bender yields a predictable result, and one of ordinary skill in the art would clearly be capable of combining these systems to achieve the expected result of determining similarities between two information sources.
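The keyword matching scheme attributed to Bender, a threshold of four matching keywords, can be sketched in a few lines. This is a minimal illustration, not Bender's implementation: the function name, the use of set intersection, and the reading of the threshold as "at least four" are assumptions, and the keyword sets are invented.

```python
MATCH_THRESHOLD = 4  # threshold of four matching keywords, per Bender at p. 82


def is_related(keywords_a, keywords_b, threshold: int = MATCH_THRESHOLD) -> bool:
    # Assumed interpretation: two items are related when they share
    # at least `threshold` distinct keywords.
    shared = set(keywords_a) & set(keywords_b)
    return len(shared) >= threshold


# Invented keyword sets for illustration only.
story = {"usair", "crash", "suit", "survivors", "court"}
broadcast = {"usair", "crash", "suit", "survivors", "weather"}
unrelated = {"usair", "crash", "budget", "council"}

print(is_related(story, broadcast))  # four shared keywords
print(is_related(story, unrelated))  # only two shared keywords
```

A set intersection is a single pass over the smaller keyword set, which is consistent with the "computationally inexpensive" characterization quoted above.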

acquiring data representing the body of information;

Chesnais discloses acquiring data representing the body of information (e.g., "The Fishwrap design readily accepts traditional news wire stories and direct contributions from the community." (p. 275); "[W]ithin Fishwrap an article begins when it appears on any incoming data stream. Each data stream has its own supplier program which monitors incoming traffic. Articles come to Fishwrap in many formats: over satellite, radio frequencies, email, and phone line." (p. 277); "Suppliers and Servers - Fishwrap receives news from a variety of sources and formats. The traditional news wires (Associated Press, Reuters, Knight-Ridder/Tribune, and BPI Entertainment all are providing their news feeds to Fishwrap) come in ANPA [7] format. Fishwrap also receives submissions via electronic mail and a number of 'homebrew' formats" (p. 278)). Further, as shown above with respect to Fig. 6, the "NEWS SERVER" acquires information from a variety of sources, including text, video, images and audio. (Id.). Chesnais also explains that "[o]ur current Fishwrap news server uses a media-independent representation, that allows it to accept items with graphics, audio, text, and motion pictures. It is up to the presentation application to determine the appropriate medium to provide." (p. 279.) As exemplified by the above citations, Chesnais discloses acquiring a variety of different types of data that make up a body of information.

storing the acquired data;

Chesnais discloses storing the acquired data, including for example news wire stories, photos and audio files in databases. See e.g., Chesnais at p. 277 ("[W]ithin Fishwrap an article begins when it appears on any incoming data stream. Each data stream has its own supplier program which monitors incoming traffic ....Finally each article is supplied to the Fishwrap news database server [4] where it will remain for the next 48 hours."); and Chesnais at p. 278 ("Fishwrap also checks its photo and audio databases to see if there are photos and sound recordings that match the story."); see also id. at Fig. 6. Thus, Chesnais describes that it stores all incoming items.

generating a display of a first segment of the body of information from data that is part of the stored data;

Chesnais discloses generating a display of a first segment of the body of information from data that is part of the stored data. For example, it discloses generating a display of an article. See e.g., Chesnais at 277 (e.g., "When a reader generates a newspaper through Fishwrap, an article is retrieved if it matches one of the reader's global topics of interest...." [and an] "article is then rendered by the front end application"); see also Figs. 2 and 13. Chesnais further explains that it uses a web browser to provide the display. See e.g., Chesnais at p. 275 ("World Wide Web browser access allows for easy traversing of the information space (see Figure 2)."). Chesnais further explains how the user navigates to display an article: "[a] Fishwrap reader starts with their edition's table of contents, then focuses on a particular news topic and, ultimately, articles that are illustrated with graphics and audio." Chesnais at p. 276. Figs. 2 and 13 further illustrate how a user of Fishwrap can navigate to a particular news item, such as the article "New Evidence About Bombing Suspect Emerges," which represents an example of the display of a first segment generated from the stored data. See Chesnais at FIGS. 2 and 13.

comparing data representing a segment of the body of information to data representing a different segment of the body of information to determine whether, according to one or more predetermined criteria, the compared segments are related; and

Chesnais discloses comparing data representing a segment of the body of information to data representing a different segment of the body of information to determine whether, according to one or more predetermined criteria, the compared segments are related. As explained below, Chesnais discloses that all incoming items are provided with a signature, which is used for searching, and that when an article is rendered Fishwrap also searches the photo and audio databases for items that "match the story" (i.e., related items). See e.g., Chesnais at 277 ("When a reader generates a newspaper through Fishwrap ....The article is then rendered by the front end application with hints given by the signatures. Fishwrap also checks its photo and audio databases to see if there are photos and sound recordings that match the story.") (emphasis added); and Chesnais at p. 281 ("One blind student appreciated the ... audio segments for illustrations."). As discussed in more detail below, Chesnais discloses the "comparing" as identifying "photos and sound recordings that match the story." Chesnais makes this possible because, as addressed immediately below, the Fishwrap system stores the incoming items (e.g., stories, audio files, and photos) with "signatures" ("data representing" a segment).

For example, Chesnais explains that the "signatures," which are derived from the incoming data, are applied to all items coming into the system. Chesnais at p. 275 ("All items coming into the system are analyzed for geographic and topical relevancy."). These signatures are created along with a particular data structure ("Dtype") and provide "inferences" about the item:

within Fishwrap an article begins when it appears on any incoming data stream. Each data stream has its own supplier program which monitors incoming traffic. First it translates all news items into an internal, wire-independent representation using Dtype [3] expandable data structure. Second the supplier adds a signature to each item. The signature represents an inference made from the data.
Chesnais at p. 277.

Note: the Dtype data structure is described in Chesnais by example, but also by citation to reference [3] Abramson, Nathan S. The dtype library or, how to write a server in less time that it takes to read this manual, Technical Report, Electronic Publishing Group, MIT Media Laboratory, Cambridge, MA, 1992.

Further, as shown in Fig. 9 of Chesnais, the signature process (which adds the content labeled 1 and 3 to the item) provides additional data representing the item (i.e., an inference made from the data), such as a headline ("Survivors of Crash Victim Sue USAir"), a "slugword," and a "summary."

See Chesnais at p. 279.

Chesnais further explains that the signatures are used in searches ("because they significantly speed up the searches") used to build a paper to present to the user, which presentation, as described above and shown for example by the photo thumbnails in Figs. 2 and 13, also includes "photos and sound recordings that match the story." See e.g., Chesnais at Figs. 2 and 13 and p. 277 (matching) & 279 (using signatures to "significantly speed up the searches."). Thus, the signatures, which include data representing the segments (i.e., a headline and a summary like those shown in the third portion of Fig. 9), include predetermined criteria used to determine whether particular segments are related.

Chesnais references the Wire Service Transmission Guidelines Special Report No. 84-2, from the American Newspaper Publishers Association (ANPA). Chesnais at p. 278 & 282. Chesnais states that the signature added to an item is "derived from the ANPA format coding." Chesnais, p. 279. As shown in Fig. 9, the signature of an item included a "slugword" field with keywords. Therefore, one of ordinary skill in the art would understand that the signature for photos stored in the Fishwrap database included a slugword field containing keywords associated with the photos.
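The signature described above can be pictured with a small data-structure sketch. This is only an illustration: Chesnais names a headline, a "slugword," and a "summary" in Fig. 9, but the class names, the `add_signature` helper, the keyword-splitting rule, and the sample slugword value below are all hypothetical and are not taken from Fishwrap or the Dtype library.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Signature:
    """Hypothetical stand-in for the signature fields shown in Fig. 9."""
    headline: str
    slugword: List[str]  # keywords, assumed here to be split from a slug string
    summary: str


@dataclass
class NewsItem:
    """Hypothetical wire-independent item (text, photo, audio, ...)."""
    media_type: str
    body: str
    signature: Optional[Signature] = None


def add_signature(item: NewsItem, headline: str, slug: str, summary: str) -> NewsItem:
    """Attach a signature to an item, as a supplier program might (assumed logic)."""
    keywords = [word.lower() for word in slug.replace("-", " ").split()]
    item.signature = Signature(headline, keywords, summary)
    return item


# The headline is the one quoted from Fig. 9; the slug and summary are made up.
story = add_signature(
    NewsItem("text", "..."),
    headline="Survivors of Crash Victim Sue USAir",
    slug="USAir-Crash",
    summary="(summary text)",
)
```

Under this sketch, a photo and a story would each carry a `slugword` keyword list, which is the kind of "data representing" a segment that the comparison step operates on.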

Chesnais discloses "comparing data representing a segment of the body of information to data representing a different segment of the body of information to determine whether, according to one or more predetermined criteria, the compared segments are related." However, even if the Examiner determined that Chesnais did not expressly or inherently disclose comparing signatures of two items to determine if they are related, it would have been obvious to one of ordinary skill in the art to perform the recited comparison step on Chesnais's signatures using the comparison technique disclosed in Bender. Bender discloses comparing data representing a segment of the body of information to data representing a different segment of the body of information to determine whether, according to one or more predetermined criteria, the compared segments are related. For example, Bender compares closed caption data representing a news broadcast (one segment) to news wire text stories (different segments) via keyword matching to determine whether, according to predetermined criteria (e.g., a threshold number of matched keywords), the segments are related. See e.g., Bender at pp. 82-83 (describing keyword matching process) ("Network Plus is comprised of two procedural components. One gathers information prior to the broadcast. The other matches stories during the broadcast." (emphasis added); "The primary function of Network Plus is to correlate news wire stories and live broadcasts .... A keyword matching scheme was chosen, based upon empirical evidence that there exists a sufficient correspondence between words found in the transcript and words found in the wire service stories. If the number of words common to both the transcript and a trial story exceeded some threshold, the two were designated as related .... A threshold of four words worked well in this experiment...") (emphasis added).
Bender further provides a specific example illustrating the process for comparing a news wire story about the nuclear accident at Chernobyl to a television broadcast on "ABC Nightly News" to determine they were related. Id. Thus, Bender discloses at least comparing the closed caption data for the news broadcast with the news wire text via keyword matching to determine whether, according to a predetermined threshold for keyword matching (e.g., four common words), the broadcast and the news wire story are related. One of skill in the art would have been motivated to combine Bender's keyword matching scheme with the disclosure of Chesnais to identify matching photos and sound recordings at least because Chesnais discloses that all items in the Fishwrap system are assigned a "signature" that includes keywords, and further discloses identifying photos and audio files in its database that "match" a news article, and Bender discloses using a keyword matching scheme to "match" news stories to a broadcast. Thus, to the extent that Chesnais does not expressly or inherently disclose using predetermined criteria to "compar[e] data representing a segment of the body of information to data representing a different segment of the body of information to determine whether ... the compared segments are related," using a predefined threshold for a number of keywords that match as disclosed in Bender would have been obvious to one of ordinary skill in the art based upon Chesnais in view of Bender.
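The keyword matching scheme quoted from Bender can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Bender's implementation: the tokenizer, the function names, and the two sample sentences are invented here, and no stop-word filtering is applied. Only the rule itself (two texts are related if the number of common words exceeds a threshold, with four reported as working well) comes from the quoted passage.

```python
import re

# Bender reports that a threshold of four words worked well in the experiment.
THRESHOLD = 4


def keywords(text: str) -> set:
    """Lowercase the text and return its set of alphabetic words (assumed tokenizer)."""
    return set(re.findall(r"[a-z]+", text.lower()))


def common_word_count(first: str, second: str) -> int:
    """Count the distinct words that appear in both texts."""
    return len(keywords(first) & keywords(second))


def are_related(first: str, second: str, threshold: int = THRESHOLD) -> bool:
    """Designate two texts as related if their common-word count exceeds the threshold."""
    return common_word_count(first, second) > threshold


# Made-up sample texts standing in for a broadcast transcript and wire stories.
transcript = "nuclear accident at the chernobyl reactor in the soviet union"
wire_story = "Soviet officials report accident at Chernobyl nuclear reactor"
other_story = "Stock markets rallied on strong earnings"

print(are_related(transcript, wire_story))   # six common words, above the threshold
print(are_related(transcript, other_story))  # no common words
```

The threshold is the "predetermined criterion" in this sketch: it is fixed before any comparison is made, and the comparison itself is a set intersection over the two word lists.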

generating a display of a portion of, or a representation of, a second segment of the body of information from data that is part of the stored data, wherein the display of the portion or representation of the second segment is generated in response to the display of a first segment to which the second segment is related.

Chesnais discloses generating a display of a portion of, or a representation of, a second segment of the body of information from data that is part of the stored data, wherein the display of the portion or representation of the second segment is generated in response to the display of a first segment to which the second segment is related. Specifically, Chesnais discloses that the Fishwrap system, in the following order, (1) renders an article, (2) then checks for photos or audio that match the article, and (3) then displays the related photos or audio. See e.g., Chesnais at p. 277 ("The article is then rendered by the front end application with hints given by the signatures. Fishwrap also checks its photo and audio databases to see if there are photos and sound recordings that match the story. For most Fishwrap readers, articles are rendered in hypertext markup language (HTML) for a WWW browser."); and Chesnais at p. 281 ("One blind student appreciated the ... audio segments for illustrations."). Chesnais further explains "On Demand Publishing: Fishwrap's use of the WWW is different from existing servers. Rather than be an archive of documents, Fishwrap contructs [sic] its personalized news documents on the fly. Building documents on demand allows Fishwrap to provide the most recent news." (Id. at 280). Finally, as shown in Figs. 2 and 13, Fishwrap presents a user with photos (thumbnails shown below) and audio (display of a portion or representation of a second segment) that "match" or are related to the article being displayed (the first segment). See Figs. 2 and 13.


RE: Claim 63

A computer readable medium encoded with one or more computer programs for enabling acquisition and review of a body of information, wherein the body of information includes a plurality of segments, each segment representing a defined set of information in the body of information, comprising:

Chesnais discloses a computer readable medium encoded with one or more computer programs for enabling acquisition and review of a body of information, wherein the body of information includes a plurality of segments, each segment representing a defined set of information in the body of information (e.g., "Fishwrap is an experimental electronic newspaper system available at MIT." (p. 275); "The Fishwrap design readily accepts traditional news wire stories and direct contributions from the community." (Id.); "All items coming into the system are analyzed for geographic and topical relevancy." (Id.) (emphasis added); "Access to Fishwrap's personalized news system appears as a World Wide Web (WWW) hypertext link" (Id.); "[Within Fishwrap an article begins when it appears on any incoming data stream. Each data stream has its own supplier program which monitors incoming traffic. Articles come to Fishwrap in many formats: over satellite, radio frequencies, email, and phone line. Each supplier program does three things: First it translates all news items into an internal, wire-independent representation using Dtype [3] expandable data structure. Second the supplier adds a signature to each item. The signature represents an inference made from the data. Finally each article is supplied to the Fishwrap news database server." (p. 277) (emphasis added); "A Fishwrap reader starts with their edition's table of contents, then focuses on a particular news topic and, ultimately, articles that are illustrated with graphics and audio." (p. 276).)

Note: as shown in this quotation, Chesnais uses the word "article" in some aspects to refer to "all news items," not just news items that are articles. Further, references in quotes to Chesnais in the form of [number] appear this way in the Chesnais publication and refer to the references listed at the end of the article.

Further as shown in Fig. 6 the "News Server" receives many different types of data, including news wire feeds, evening news stills and video, and audio files. (Fig. 6, at 278). As described above, each of these different data items represent distinct segments that Fishwrap analyzes and creates a "signature" for.

The Fishwrap electronic newspaper system includes multiple servers that contain computer readable media comprising instructions for performing the functions disclosed by Chesnais (e.g., "Glue provides a standard 'plug and play' set of tools for servers, knowledge representations modules, user profiling systems, and presentation modules." (p. 278)). Further, Chesnais also describes multiple modules interacting as part of Glue, including the News Server acquiring the news items (pp. 278-79), the supplier programs adding signatures (pp. 277 & 278) and the Front End Application rendering presentation to a user (p. 277). Certain module names are shown in boldface in Fig. 7 (p. 278).

instructions for acquiring data representing the body of information;

Chesnais discloses instructions for acquiring data representing the body of information (e.g., "The Fishwrap design readily accepts traditional news wire stories and direct contributions from the community." (p. 275); "[Within Fishwrap an article begins when it appears on any incoming data stream. Each data stream has its own supplier program which monitors incoming traffic. Articles come to Fishwrap in many formats: over satellite, radio frequencies, email, and phone line." (p. 277); "Suppliers and Servers - Fishwrap receives news from a variety of sources and formats. The traditional news wires (Associated Press, Reuters, Knight-Ridder/Tribune, and BPI Entertainment all are providing their news feeds to Fishwrap) come in ANPA [7] format. Fishwrap also receives submissions via electronic mail and a number of 'homebrew' formats" (p. 278)). Further, as shown above with respect to Fig. 6, the "NEWS SERVER" acquires information from a variety of sources, including text, video, images and audio. (Id.). Chesnais also explains that "[o]ur current Fishwrap news server uses a media-independent representation, that allows it to accept items with graphics, audio, text, and motion pictures. It is up to the presentation application to determine the appropriate medium to provide." (p. 279.) As exemplified by the above citations, Chesnais discloses that Fishwrap includes instructions for acquiring a variety of different types of data that would make up a body of information.

instructions for storing the acquired data;

Chesnais discloses instructions for storing the acquired data, including for example news wire stories, photos and audio files in databases. See e.g., Chesnais at p. 277 ("[Within Fishwrap an article begins when it appears on any incoming data stream. Each data stream has its own supplier program which monitors incoming traffic ....Finally each article is supplied to the Fishwrap news database server [4] where it will remain for the next 48 hours."); and Chesnais at p. 278 ("Fishwrap also checks its photo and audio databases to see if there are photos and sound recordings that match the story."); see also id. at Fig. 6. Thus, Chesnais describes that it stores all incoming items.
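The storage behavior quoted above (each item is supplied to the news database server, "where it will remain for the next 48 hours") can be illustrated with a short sketch. Everything here other than the 48-hour figure is an assumption made for illustration: the `NewsStore` class, its method names, and the in-memory dictionary are hypothetical and say nothing about how the Fishwrap database server was actually built.

```python
from datetime import datetime, timedelta

# The 48-hour retention period is the only detail taken from the quoted passage.
RETENTION = timedelta(hours=48)


class NewsStore:
    """Hypothetical in-memory store that keeps each item for a fixed retention period."""

    def __init__(self):
        self._items = {}  # item_id -> (received_at, item)

    def add(self, item_id, item, received_at):
        """Store an incoming item together with the time it was received."""
        self._items[item_id] = (received_at, item)

    def current_items(self, now):
        """Return the items received less than RETENTION before `now`."""
        return {
            item_id: item
            for item_id, (received_at, item) in self._items.items()
            if now - received_at < RETENTION
        }


store = NewsStore()
received = datetime(1995, 4, 20, 9, 0)  # arbitrary illustrative timestamp
store.add("story-1", {"media_type": "text"}, received)
store.add("photo-1", {"media_type": "photo"}, received + timedelta(hours=10))
```

Querying `store.current_items(received + timedelta(hours=49))` in this sketch would return only `photo-1`, since `story-1` has aged past the retention window.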

instructions for generating a display of a first segment of the body of information from data that is part of the stored data;

Chesnais discloses instructions for generating a display of a first segment of the body of information from data that is part of the stored data. For example, it discloses generating a display of an article. See e.g., Chesnais at 277 (e.g., "When a reader generates a newspaper through Fishwrap, an article is retrieved if it matches one of the reader's global topics of interest.... [and an "article is then rendered by the front end application"); see also Figs. 2 and 13. Chesnais further explains that it uses a web browser to provide the display. See e.g., Chesnais at p. 275 ("World Wide Web browser access allows for easy traversing of the information space (see Figure 2)."). Chesnais further explains how the user navigates to display an article: "[a] Fishwrap reader starts with their edition's table of contents, then focuses on a particular news topic and, ultimately, articles that are illustrated with graphics and audio." Chesnais at p. 276. Figs. 2 and 13 further illustrate how a user of Fishwrap can navigate to a particular news item, such as the article "New Evidence About Bombing Suspect Emerges," which represents an example of the display of a first segment generated from the stored data. Further, Fig. 7 shows the "appRender" module that renders the articles (p. 278). Thus, Chesnais discloses that Fishwrap has instructions to display the aforementioned first segment. See Chesnais at FIGS. 2 and 13.

instructions for comparing data representing a segment of the body of information to data representing a different segment of the body of information to determine whether, according to one or more predetermined criteria, the compared segments are related; and

Chesnais discloses instructions for comparing data representing a segment of the body of information to data representing a different segment of the body of information to determine whether, according to one or more predetermined criteria, the compared segments are related. As explained below, Chesnais discloses that all incoming items are provided with a signature, which is used for searching, and that when an article is rendered Fishwrap also searches the photo and audio databases for items that "match the story" (i.e., related items). See e.g., Chesnais at 277 ("When a reader generates a newspaper through Fishwrap ....The article is then rendered by the front end application with hints given by the signatures. Fishwrap also checks its photo and audio databases to see if there are photos and sound recordings that match the story.") (emphasis added); and Chesnais at p. 281 ("One blind student appreciated the ... audio segments for illustrations."). As discussed in more detail below, Chesnais discloses the "comparing" as identifying "photos and sound recordings that match the story." Chesnais makes this possible because, as addressed immediately below, the Fishwrap system stores the incoming items (e.g., stories, audio files, and photos) with "signatures" ("data representing" a segment).

For example, Chesnais explains that the "signatures," which are derived from the incoming data, are applied to all items coming into the system. Chesnais at p. 275 ("All items coming into the system are analyzed for geographic and topical relevancy."). These signatures are created along with a particular data structure ("Dtype") and provide "inferences" about the item:

within Fishwrap an article begins when it appears on any incoming data stream. Each data stream has its own supplier program which monitors incoming traffic. First it translates all news items into an internal, wire-independent representation using Dtype [3] expandable data structure. Second the supplier adds a signature to each item. The signature represents an inference made from the data. Chesnais at p. 277.

Note: as shown in this quotation, Chesnais uses the word "article" in some aspects to refer to "all news items," not just news items that are articles. Further, references in quotes to Chesnais in the form of [number] appear this way in the Chesnais publication and refer to the references listed at the end of the article.

Further, as shown in Fig. 9 of Chesnais, the signature process (which adds the content labeled 1 and 3 to the item) provides additional data representing the item (i.e., an inference made from the data), such as a headline ("Survivors of Crash Victim Sue USAir"), a "slugword," and a "summary." See Chesnais at p. 279.

Chesnais further explains that the signatures are used in searches ("because they significantly speed up the searches") used to build a paper to present to the user, which presentation, as described above and shown for example by the photo thumbnails in Figs. 2 and 13, also includes "photos and sound recordings that match the story." See e.g., Chesnais at Figs. 2 and 13 and p. 277 (matching) & 279 (using signatures to "significantly speed up the searches."). Thus, the signatures, which include data representing the segments (i.e., a headline and a summary like those shown in the third portion of Fig. 9), include predetermined criteria used to determine whether particular segments are related.

Chesnais references the Wire Service Transmission Guidelines Special Report No. 84-2, from the American Newspaper Publishers Association (ANPA). Chesnais at p. 278 & 282. Chesnais states that the signature added to an item is "derived from the ANPA format coding." Chesnais, p. 279. As shown in Fig. 9, the signature of an item included a "slugword" field with keywords. Therefore, one of ordinary skill in the art would understand that the signature for photos stored in the Fishwrap database included a slugword field containing keywords associated with the photos.

Chesnais discloses "comparing data representing a segment of the body of information to data representing a different segment of the body of information to determine whether, according to one or more predetermined criteria, the compared segments are related." However, even if the Examiner determined that Chesnais did not expressly or inherently disclose comparing signatures of two items to determine if they are related, it would have been obvious to one of ordinary skill in the art to perform the recited comparison step on Chesnais's signatures using the comparison technique disclosed in Bender. Bender discloses comparing data representing a segment of the body of information to data representing a different segment of the body of information to determine whether, according to one or more predetermined criteria, the compared segments are related. For example, Bender compares closed caption data representing a news broadcast (one segment) to news wire text stories (different segments) via keyword matching to determine whether, according to predetermined criteria (e.g., a threshold number of matched keywords), the segments are related. See e.g., Bender at pp. 82-83 (describing keyword matching process) ("Network Plus is comprised of two procedural components. One gathers information prior to the broadcast. The other matches stories during the broadcast." (emphasis added); "The primary function of Network Plus is to correlate news wire stories and live broadcasts .... A keyword matching scheme was chosen, based upon empirical evidence that there exists a sufficient correspondence between words found in the transcript and words found in the wire service stories. If the number of words common to both the transcript and a trial story exceeded some threshold, the two were designated as related .... A threshold of four words worked well in this experiment...") (emphasis added).
Bender further provides a specific example illustrating the process for comparing a news wire story about the nuclear accident at Chernobyl to a television broadcast on "ABC Nightly News" to determine they were related. Id. Thus, Bender discloses at least comparing the closed caption data for the news broadcast with the news wire text via keyword matching to determine whether according to a predetermined threshold for keyword matching (e.g., four common words), the broadcast and the news wire story are related.

One of skill in the art would have been motivated to combine Bender's keyword matching scheme with the disclosure of Chesnais to identify matching photos and sound recordings at least because Chesnais discloses that all items in the Fishwrap system are assigned a "signature" that includes keywords, and further discloses identifying photos and audio files in its database that "match" a news article, and Bender discloses using a keyword matching scheme to "match" news stories to a broadcast. Thus, to the extent that Chesnais does not expressly or inherently disclose using predetermined criteria to "compar[e] data representing a segment of the body of information to data representing a different segment of the body of information to determine whether ... the compared segments are related," using a predefined threshold for a number of keywords that match as disclosed in Bender would have been obvious to one of ordinary skill in the art based upon Chesnais in view of Bender.

instructions for generating a display of a portion of, or a representation of, a second segment of the body of information from data that is part of the stored data, wherein the display of the portion or representation of the second segment is generated in response to the display of a first segment to which the second segment is related.

Chesnais discloses instructions for generating a display of a portion of, or a representation of, a second segment of the body of information from data that is part of the stored data, wherein the display of the portion or representation of the second segment is generated in response to the display of a first segment to which the second segment is related. Specifically, Chesnais discloses that the Fishwrap system, in the following order, (1) renders an article, (2) then checks for photos or audio that match the article, and (3) then displays the related photos or audio. See e.g., Chesnais at p. 277 ("The article is then rendered by the front end application with hints given by the signatures. Fishwrap also checks its photo and audio databases to see if there are photos and sound recordings that match the story. For most Fishwrap readers, articles are rendered in hypertext markup language (HTML) for a WWW browser."); and Chesnais at p. 281 ("One blind student appreciated the ... audio segments for illustrations."). Chesnais further explains "On Demand Publishing: Fishwrap's use of the WWW is different from existing servers. Rather than be an archive of documents, Fishwrap contructs [sic] its personalized news documents on the fly. Building documents on demand allows Fishwrap to provide the most recent news." (Id. at 280). Finally, as shown in Figs. 2 and 13, Fishwrap presents a user with photos (thumbnails shown below) and audio (display of a portion or representation of a second segment) that "match" or are related to the article being displayed (the first segment).

See Figs. 2 and 13.


Ground #8 - Joachims

RE: Claim 20

A method for acquiring and reviewing a body of information, wherein the body of information includes a plurality of segments, each segment representing a defined set of information in the body of information, the method comprising the steps of:

Joachims discloses a method for acquiring and reviewing a body of information, wherein the body of information includes a plurality of segments, each segment representing a defined set of information in the body of information (e.g., "WebWatcher [], an agent which assists users in locating information on the WWW or searches autonomously on their behalf.") Joachims at p. 1. Thus, Joachims discloses a method for helping a user review and find [acquire] information, such as webpages [segments] on the World Wide Web [a body of information], that is determined to be of interest to the user or that is related to a webpage the user is currently viewing. Joachims at Abstract and p. 1.

acquiring data representing the body of information;

Joachims discloses acquiring data representing the body of information (e.g., "Figures 1 to 5 illustrate the sequence of web pages a user visits in a typical example."; "an algorithm which identifies pages that are related to a given page using only hypertext structure.") Joachims at p. 1 and Abstract. One skilled in the art would understand that Joachims' WebWatcher system necessarily discloses acquiring webpage data [data representing the body of information] because acquiring the data would be a necessary step before the data can be displayed and analyzed. See Joachims at p. 1-3. Thus, Joachims discloses acquiring webpage data [data representing the body of information].

storing the acquired data;

Joachims discloses storing the acquired data (e.g., "Figures 1 to 5 illustrate the sequence of web pages a user visits in a typical example."; "an algorithm which identifies pages that are related to a given page using only hypertext structure.") Joachims at p. 1 and Abstract. One skilled in the art would understand that Joachims' WebWatcher system necessarily discloses storing the webpage data [acquired data] because storing the data would be a necessary step before the data can be displayed and analyzed. See Joachims at p. 1-3. Thus, Joachims discloses storing webpage data [data representing the body of information].

generating a display of a first segment of the body of information from data that is part of the stored data;

Joachims discloses generating a display of a first segment of the body of information from data that is part of the stored data (e.g., "Figures 1 to 5 illustrate the sequence of web pages a user visits in a typical example."; "In our example the user follows WebWatcher's advice and takes the "ILPNET" hyperlink. She arrives at the page shown in figure 4."; "In our scenario the user is particularly interested in the "ILPNet" page. So she clicks on the button "Mark this page as interesting" in the menu bar. WebWatcher stores this information and returns a list of 10 pages which WebWatcher estimates to be closely related (figure 5)") Joachims at pp. 1-3 and FIGS. 3-5.

Thus, Joachims discloses generating a display of the "ILPNET" page [first segment]. See Joachims at FIG. 5.

comparing data representing a segment of the body of information to data representing a different segment of the body of information to determine whether, according to one or more predetermined criteria, the compared segments are related; and

There are two ways that Joachims discloses comparing data representing a segment of the body of information to data representing a different segment of the body of information to determine whether, according to one or more predetermined criteria, the compared segments are related.

First, Joachims discloses: (1) two hyperlinks - "data representing" two different "segment[s] of the body of information," i.e., two separate webpages; (2) comparing the hyperlinks to see if they both have a particular attribute such as "appears on webpage X" (the predetermined criterion); and (3) if so, concluding that the linked-to webpages are related, i.e., "of similar interest." See Joachims at p. 3. Specifically, Joachims discloses that "two webpages are of similar interest if some third page points to them both." Joachims at p. 3.
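The rule quoted from Joachims ("two webpages are of similar interest if some third page points to them both") can be sketched as a simple check over a link table. This is an illustrative reading of that one sentence, not WebWatcher's code: the `links` dictionary, the function name, and the page labels are all hypothetical.

```python
def co_cited_pages(links, page):
    """Return the pages that share at least one linking page with `page`.

    `links` maps each source page to the set of pages it points to
    (a hypothetical representation chosen for this sketch).
    """
    related = set()
    for source, targets in links.items():
        if page in targets:
            # Every other page linked from the same source is co-cited with `page`.
            related |= targets - {page}
    return related


# Made-up link structure: X points to A and B, Y points to B and C, Z points to D.
links = {
    "X": {"A", "B"},
    "Y": {"B", "C"},
    "Z": {"D"},
}

print(co_cited_pages(links, "A"))
print(co_cited_pages(links, "B"))
```

In this sketch the attribute "appears on webpage X" is the predetermined criterion, and sharing it is what marks two linked-to pages as related.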

Second, Joachims discloses using the "nearest neighbor" method to generate a matrix showing the relationship between webpages, where the columns of the matrix could correspond to "data representing a segment of the body of information" - each column, after all, provides a "fingerprint" of a given webpage in that it identifies where hyperlinks to that given webpage are located. Those columns are then compared to the column for a webpage of interest (e.g., the WWatcher page) to "find the ones most similar to the [of-interest webpage's] column." To the extent some number n of so-related webpages are returned by the grouping (the predetermined criteria), the webpage is considered to be related. Joachims at FIG. 6 and pp. 3 and 4.
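The column comparison described above can be sketched as follows. This is a rough illustration only: the office action does not say which similarity measure Joachims uses, so the count of shared linking pages below is an assumption, as are the function names, the `links` representation, and the sample data.

```python
def column(links, target, sources):
    """Build the 0/1 column for `target`: one entry per source page that may link to it."""
    return [1 if target in links[source] else 0 for source in sources]


def most_similar(links, page, n):
    """Return up to n pages whose columns overlap most with the column for `page`.

    Similarity is the number of source pages linking to both (an assumed measure).
    """
    sources = sorted(links)
    targets = sorted(set().union(*links.values()))
    reference = column(links, page, sources)
    scored = []
    for target in targets:
        if target == page:
            continue
        candidate = column(links, target, sources)
        overlap = sum(a & b for a, b in zip(reference, candidate))
        scored.append((overlap, target))
    # Highest overlap first; ties broken alphabetically for a stable result.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [target for overlap, target in scored[:n] if overlap > 0]


# Made-up link structure: rows are linking pages, columns are linked-to pages.
links = {
    "P1": {"A", "B", "C"},
    "P2": {"A", "B"},
    "P3": {"A", "C"},
    "P4": {"D"},
}

print(most_similar(links, "A", 2))
```

Here the fixed number n of nearest columns plays the role of the predetermined criterion: pages are reported as related because their columns rank among the n closest to the column of the page of interest.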

generating a display of a portion of, or a representation of, a second segment of the body of information from data that is part of the stored data, wherein the display of the portion or representation of the second segment is generated in response to the display of a first segment to which the second segment is related.

Joachims discloses generating a display of a portion of, or a representation of, a second segment of the body of information from data that is part of the stored data, wherein the display of the portion or representation of the second segment is generated in response to the display of a first segment to which the second segment is related (e.g., "Figures 1 to 5 illustrate the sequence of web pages a user visits in a typical example."; "In our example the user follows WebWatcher's advice and takes the "ILPNET" hyperlink. She arrives at the page shown in figure 4."; "In our scenario the user is particularly interested in the "ILPNet" page. So she clicks on the button "Mark this page as interesting" in the menu bar. WebWatcher stores this information and returns a list of 10 pages which WebWatcher estimates to be closely related (figure 5)"). Joachims at pp. 1-3 and FIGS. 3-5.

Similar to the '507 patent's description of a text story being displayed "in response to" a television news story ("... a representation of the related information can be displayed in response to... the original information display. For instance ... one or more text news stories... that are related ... to a television news story being displayed can be automatically identified and a portion of the related text news story or stories displayed so that the story or stories can be reviewed for additional information ...." The '507 patent at column 3:45-54.), Joachims describes identifying and displaying "a list of 10 pages which WebWatcher estimates to be closely related [to the ILPNET webpage]" together with the ILPNET webpage. Joachims at p. 3. Thus, as described above and shown in Figure 5, Joachims discloses displaying "a list of 10 pages which WebWatcher estimates to be closely related [to the ILPNET webpage]" [second segment] in response to the "ILPNET" webpage [first segment] being displayed. Joachims at p. 3.


RE: Claim 63

A computer readable medium encoded with one or more computer programs for enabling acquisition and review of a body of information, wherein the body of information includes a plurality of segments, each segment representing a defined set of information in the body of information, comprising:

Joachims discloses a computer readable medium encoded with one or more computer programs for enabling acquisition and review of a body of information, wherein the body of information includes a plurality of segments, each segment representing a defined set of information in the body of information (e.g., "WebWatcher [], an agent which assists users in locating information on the WWW or searches autonomously on their behalf.") Joachims at p. 1. Thus, Joachims discloses a method for helping a user review and find [acquire] information, such as webpages [segments] on the World Wide Web [a body of information], that is determined to be of interest to the user or that is related to a webpage the user is currently viewing. Joachims at Abstract and p. 1.

The WebWatcher program disclosed by Joachims is something that facilitates the gathering of information, such as webpages, from the Internet [network]. See Joachims at Abstract. Thus, it is inherent that computer-readable media implemented on a computer is used.

instructions for acquiring data representing the body of information;

Joachims discloses instructions for acquiring data representing the body of information (e.g., "Figures 1 to 5 illustrate the sequence of web pages a user visits in a typical example."; "an algorithm which identifies pages that are related to a given page using only hypertext structure.") Joachims at p. 1 and Abstract. One skilled in the art would understand that Joachims' WebWatcher system necessarily discloses acquiring webpage data [data representing the body of information] because acquiring the data would be a necessary step before the data can be displayed and analyzed. See Joachims at p. 1-3. Thus, Joachims discloses instructions for acquiring webpage data [data representing the body of information].

instructions for storing the acquired data;

Joachims discloses instructions for storing the acquired data (e.g., "Figures 1 to 5 illustrate the sequence of web pages a user visits in a typical example."; "an algorithm which identifies pages that are related to a given page using only hypertext structure.") Joachims at p. 1 and Abstract. One skilled in the art would understand that Joachims' WebWatcher system necessarily discloses storing the webpage data [acquired data] because storing the data would be a necessary step before the data can be displayed and analyzed. See Joachims at p. 1-3. Thus, Joachims discloses instructions for storing webpage data [data representing the body of information].

instructions for generating a display of a first segment of the body of information from data that is part of the stored data;

Joachims discloses instructions for generating a display of a first segment of the body of information from data that is part of the stored data (e.g., "Figures 1 to 5 illustrate the sequence of web pages a user visits in a typical example."; "In our example the user follows WebWatcher's advice and takes the "ILPNET" hyperlink. She arrives at the page shown in figure 4."; "In our scenario the user is particularly interested in the "ILPNet" page. So she clicks on the button "Mark this page as interesting" in the menu bar. WebWatcher stores this information and returns a list of 10 pages which WebWatcher estimates to be closely related (figure 5)") Joachims at pp. 1-3 and FIGS. 3-5.

Thus, Joachims discloses generating a display of the "ILPNet" page [first segment]. See Joachims at FIG. 5.

instructions for comparing data representing a segment of the body of information to data representing a different segment of the body of information to determine whether, according to one or more predetermined criteria, the compared segments are related; and

There are two ways that Joachims discloses instructions for comparing data representing a segment of the body of information to data representing a different segment of the body of information to determine whether, according to one or more predetermined criteria, the compared segments are related.

First, Joachims discloses: (1) two hyperlinks - "data representing" two different "segment[s] of the body of information," i.e., two separate webpages; (2) comparing the hyperlinks to see if they both have a particular attribute such as "appears on webpage X" (the predetermined criterion); and (3) if so, concluding that the linked-to webpages are related, i.e., "of similar interest." See Joachims at p. 3. Specifically, Joachims discloses that "two webpages are of similar interest if some third page points to them both." Joachims at p. 3.

Second, Joachims discloses using the "nearest neighbor" method to generate a matrix showing the relationship between webpages, where the columns of the matrix could correspond to "data representing a segment of the body of information" - each column, after all, provides a "fingerprint" of a given webpage in that it identifies where hyperlinks to that given webpage are located. Those columns are then compared to the column for a webpage of interest (e.g., the WebWatcher page) to "find the ones most similar to the [of-interest webpage's] column." To the extent some number n of so-related webpages are returned by the grouping (the predetermined criteria), the webpage is considered to be related. Joachims at FIG. 6 and pp. 3 and 4.
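
As a rough illustration of the comparison the examiner describes, the Python sketch below builds a "column" for each page (the set of pages that link to it) and ranks other pages by how many referring pages they share with a page of interest. It is not WebWatcher's code: the page names, the link data, and the use of a shared-referrer count as the similarity measure are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical link data: source page -> pages it links to.
links = {
    "hub1": ["ilpnet", "mlnet", "webwatcher"],
    "hub2": ["ilpnet", "mlnet"],
    "hub3": ["webwatcher", "cooking"],
}

def referrers(links):
    """Map each linked-to page to the set of pages pointing at it (its 'column')."""
    cols = defaultdict(set)
    for src, targets in links.items():
        for target in targets:
            cols[target].add(src)
    return dict(cols)

def related_pages(page, links, n=10):
    """Return up to n other pages, ranked by the number of shared referring pages."""
    cols = referrers(links)
    mine = cols.get(page, set())
    scored = [(len(mine & col), other) for other, col in cols.items() if other != page]
    # Keep only pages that share at least one referrer ("some third page points to them both").
    scored = [(score, other) for score, other in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:n]]

print(related_pages("ilpnet", links))
```

Here the cutoff n plays the role of the "list of 10 pages" in the quoted passage; any page sharing no referrer with the page of interest is never reported as related.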

instructions for generating a display of a portion of, or a representation of, a second segment of the body of information from data that is part of the stored data, wherein the display of the portion or representation of the second segment is generated in response to the display of a first segment to which the second segment is related.

Joachims discloses instructions for generating a display of a portion of, or a representation of, a second segment of the body of information from data that is part of the stored data, wherein the display of the portion or representation of the second segment is generated in response to the display of a first segment to which the second segment is related (e.g., "Figures 1 to 5 illustrate the sequence of web pages a user visits in a typical example."; "In our example the user follows WebWatcher's advice and takes the "ILPNET" hyperlink. She arrives at the page shown in figure 4."; "In our scenario the user is particularly interested in the "ILPNet" page. So she clicks on the button "Mark this page as interesting" in the menu bar. WebWatcher stores this information and returns a list of 10 pages which WebWatcher estimates to be closely related (figure 5)"). Joachims at pp. 1-3 and FIGS. 3-5.

Similar to the '507 patent's description of a text story being displayed "in response to" a television news story ("... a representation of the related information can be displayed in response to... the original information display. For instance..., one or more text news stories... that are related... to a television news story being displayed can be automatically identified and a portion of the related text news story or stories displayed so that the story or stories can be reviewed for additional information ...." The '507 patent at column 3:45-54.), Joachims describes identifying and displaying "a list of 10 pages which WebWatcher estimates to be closely related [to the ILPNET webpage]" together with the ILPNET webpage. Joachims at p. 3.

Thus, as described above and shown in Figure 5, Joachims discloses instructions for displaying "a list of 10 pages which WebWatcher estimates to be closely related [to the ILPNET webpage]" [second segment] in response to the "ILPNET" webpage [first segment] being displayed. Joachims at p. 3.


Ground #11 - Masand

RE: Claim 39

A method for categorizing according to subject matter an uncategorized segment of a body of information that includes a plurality of segments, each segment representing a defined set of information in the body of information, one or more segments of the body of information having previously been categorized by identifying each of the one or more segments with one or more subject matter categories, the method comprising the steps of:

Masand teaches a method for categorizing according to subject matter an uncategorized segment of a body of information that includes a plurality of segments, each segment representing a defined set of information in the body of information, one or more segments of the body of information having previously been categorized by identifying each of the one or more segments with one or more subject matter categories.

For example, Masand discloses a body of information comprising a plurality of segments, such as news articles from the Dow Jones Press Release News Wire, that includes uncategorized documents ("[e]ach day editors at Dow Jones assign codes to hundreds of stories originating from diverse sources such as newspapers, magazines, newswires, and press releases") and previously categorized documents that have been assigned to one or more of 350 category codes ("Using an already coded training database of about 50,000 stories from the Dow Jones Press Release News Wire ...." (Masand at Abstract); "The coding task consists of assigning one or more codes to a text document, from a possible set of about 350 codes." (Masand at p. 59)).

Masand further discloses a method for categorizing the uncategorized stories by subject matter by assigning to each story "distinct codes, grouped into seven [sic] categories: industry, market sector, product, subject, government agency, and region." Masand at p. 59 (emphasis added). The category codes are assigned based on codes of related previously categorized documents ("Using an already coded training database of about 50,000 stories from the Dow Jones Press Release News Wire, and SEEKER [Stanfill] (a text retrieval system that supports relevance feedback) as the underlying match engine, codes are assigned to new, unseen stories ...") Masand at p. 59.

Thus, Masand discloses categorizing by subject matter the uncategorized news stories of a body of information based on category codes assigned to previously categorized news stories of the body of information.

determining the degree of similarity between the subject matter content of the uncategorized segment and the subject matter content of each of the previously categorized segments;

Masand discloses determining the degree of similarity between the subject matter content of the uncategorized segment and the subject matter content of each of the previously categorized segments.

For example, Masand discloses "a method for classifying news stories using Memory Based Reasoning (MBR) (a k-nearest neighbor method)." Masand at p. 59. The MBR method includes "find[ing] the near matches for each document to be classified. This is done by constructing a relevance feedback query out of the text of the document, including both words and capitalized pairs. This query returns a weighted list of near matches (see Fig. 4)." Masand at p. 61. Thus, Masand discloses using a relevance feedback query constructed from the text of a new document to search against the documents contained in the database of previously categorized stories. Id. at 61.

Masand further discloses determining similarity scores (i.e., a degree of similarity) between the new story and each of the previously categorized stories. Masand at p. 61 ("[c]odes are assigned weights by summing similarity scores from the near matches.") (emphasis added). Fig. 4 shows the determined degree of similarity ("score") between an uncategorized news story and each of the eleven "nearest neighbors" in the previously categorized documents.

See Masand at p. 61.

Thus, Masand discloses determining similarity scores between the subject matter of an uncategorized document (e.g., news story) and the subject matter of each document of a set of previously categorized documents (e.g., previously categorized news stories) based on the contents of the documents ("constructing a relevance feedback query out of the text of the document, including both words and capitalized pairs") Masand at p. 61.
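
To make the "weighted list of near matches" concrete, here is a minimal Python sketch that scores a new story against each previously coded story and keeps the k best. It is an illustration only: Masand used the SEEKER relevance feedback engine, whose scoring is not described in the passages quoted here, so plain cosine similarity over word counts is swapped in as a stand-in, and the function names and sample stories are hypothetical.

```python
import math
from collections import Counter

def terms(text):
    """Bag of lowercase words (a stand-in for the words and capitalized pairs in the query)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def near_matches(new_story, coded_stories, k=11):
    """Score the new story against every coded story and return the k highest scores.

    coded_stories is a list of (text, codes) pairs; the result is a list of
    (score, codes) pairs sorted from most to least similar.
    """
    query = terms(new_story)
    scored = [(cosine(query, terms(text)), codes) for text, codes in coded_stories]
    scored.sort(key=lambda pair: -pair[0])
    return scored[:k]

# Hypothetical training stories with hypothetical codes.
stories = [
    ("oil prices rise", ["petroleum"]),
    ("bank merger announced", ["banking"]),
    ("oil exports fall", ["petroleum", "trade"]),
]
for score, codes in near_matches("oil prices rise", stories, k=2):
    print(round(score, 3), codes)
```

The default k=11 mirrors the "up to 11 nearest neighbors" mentioned in the quoted text; everything else about the scoring is an assumption.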

identifying one or more of the previously categorized segments as relevant to the uncategorized segment based upon the determined degrees of similarity of subject matter content between the uncategorized segment and the previously categorized segments; and

Masand discloses identifying one or more of the previously categorized segments as relevant to the uncategorized segment based upon the determined degrees of similarity of subject matter content between the uncategorized segment and the previously categorized segments.

For example, as discussed above, Masand discloses determining a degree of similarity ("score") between an uncategorized news story and each of the previously categorized documents by "constructing a relevance feedback query out of the text of the document, including both words and capitalized pairs. This query returns a weighted list of near matches (see Fig. 4)." Masand at p. 61 (emphasis added). Additionally, "Fig. 4 shows the headlines and the normalized scores for the example used in Fig. 2 and the first few near matches from the relevance feedback search." Masand at p. 61 (emphasis added); see also, Fig. 4 (which shows an uncategorized news story and the eleven "nearest neighbors" in the previously categorized documents).

Based on the results of the relevance feedback query, Masand discloses identifying the k-nearest matches and "assign[ing] codes to the unknown document by combining the codes assigned to the k nearest matches; for these experiments, we used up to 11 nearest neighbors." Masand at p. 61 (emphasis added).

Thus, Masand discloses identifying k previously categorized documents as being relevant to the uncategorized document based on the determined similarity scores between the uncategorized document and the previously categorized documents.

selecting one or more subject matter categories with which to identify the uncategorized segment based upon the subject matter categories used to identify the relevant previously categorized segments.

Masand teaches selecting one or more subject matter categories with which to identify the uncategorized segment based upon the subject matter categories used to identify the relevant previously categorized segments.

For example, as discussed above, Masand discloses determining a degree of similarity ("score") between an uncategorized news story and each of the previously categorized documents by "constructing a relevance feedback query out of the text of the document, including both words and capitalized pairs. This query returns a weighted list of near matches (see Fig. 4)." Masand at p. 61. Masand further discloses that the "[c]odes are assigned weights by summing similarity scores from the near matches. Finally we choose the best codes based on a score threshold. Fig. 4 shows the headlines and the normalized scores for the example used in Fig. 2 and the first few near matches from the relevance feedback search." Masand at p. 61 (emphasis added); see also, Fig. 4. In one particular example, Masand discloses "assign[ing] codes to the unknown document by combining the codes assigned to the k nearest matches: for these experiments, we used up to 11 nearest neighbors." Masand at p. 61 (emphasis added). The codes may be "grouped into seven [sic] categories: industry, market sector, product, subject, government agency, and region." Masand at p. 59 (emphasis added).

Thus, Masand discloses selecting one or more subject matter category codes for an uncategorized document based on the category codes assigned to the K-nearest (i.e., relevant) documents.
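
The code-selection step quoted above ("codes are assigned weights by summing similarity scores from the near matches ... we choose the best codes based on a score threshold") can be sketched in a few lines of Python. This is a sketch under assumptions, not Masand's implementation: the function name, the sample scores, and the threshold values are hypothetical, since the quoted passages do not give a threshold.

```python
from collections import defaultdict

def assign_codes(near_matches, threshold):
    """Sum the similarity scores of the near matches per code, then keep codes at or above threshold.

    near_matches is a list of (score, codes) pairs for the k nearest coded stories.
    """
    weights = defaultdict(float)
    for score, codes in near_matches:
        for code in codes:
            weights[code] += score
    return sorted(code for code, weight in weights.items() if weight >= threshold)

# Hypothetical near matches for one new story.
matches = [
    (0.9, ["petroleum"]),
    (0.6, ["petroleum", "trade"]),
    (0.2, ["banking"]),
]
print(assign_codes(matches, threshold=1.0))
```

A code carried by several near matches accumulates weight from each of them, so a code shared by two moderately similar stories can outrank a code carried by a single closer one.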


RE: Claim 82

A computer readable medium encoded with one or more computer programs for enabling categorization according to subject matter of an uncategorized segment of a body of information that includes a plurality of segments, each segment representing a defined set of information in the body of information, one or more segments having previously been categorized by identifying each of the one or more segments with one or more subject matter categories, comprising:

Masand discloses a computer readable medium encoded with one or more computer programs for enabling categorization according to subject matter of an uncategorized segment of a body of information that includes a plurality of segments, each segment representing a defined set of information in the body of information, one or more segments having previously been categorized by identifying each of the one or more segments with one or more subject matter categories.

For example, Masand discloses a body of information comprising a plurality of segments, such as news articles from the Dow Jones Press Release News Wire, that includes uncategorized documents ("[e]ach day editors at Dow Jones assign codes to hundreds of stories originating from diverse sources such as newspapers, magazines, newswires, and press releases") and previously categorized documents that have been assigned to one or more of 350 category codes ("Using an already coded training database of about 50,000 stories from the Dow Jones Press Release News Wire ...." (Masand at Abstract); "The coding task consists of assigning one or more codes to a text document, from a possible set of about 350 codes." (Masand at p. 59)).

Masand further discloses a method for categorizing the uncategorized stories by subject matter by assigning to each story "distinct codes, grouped into seven [sic] categories: industry, market sector, product, subject, government agency, and region." Masand at p. 59 (emphasis added.) The category codes are assigned based on codes of related previously categorized documents ("Using an already coded training database of about 50,000 stories from the Dow Jones Press Release News Wire, and SEEKER [Stanfill] (a text retrieval system that supports relevance feedback) as the underlying match engine, codes are assigned to new, unseen stories...") Masand at p. 59.

With respect to being embodied as a computer program stored on a computer readable medium, Masand discloses that the "method for classifying news stories using Memory Based Reasoning (MBR) (a k-nearest neighbor method)[] does not require manual topic definitions." Masand at Abstract (emphasis added). Masand further discloses that the SEEKER text retrieval system that was used as the underlying match engine was executed on a "4k CM-2 Connection Machine System." Masand at p. 62. As such, it is inherent that the method disclosed by Masand is embodied as a computer program stored on a computer readable medium.

Thus, Masand discloses a computer readable medium encoded with one or more computer programs for enabling categorization according to subject matter the uncategorized news stories of a body of information based on category codes assigned to previously categorized news stories of the body of information.

instructions for determining the degree of similarity between the subject matter content of the uncategorized segment and the subject matter content of each of the previously categorized segments;

Masand discloses instructions for determining the degree of similarity between the subject matter content of the uncategorized segment and the subject matter content of each of the previously categorized segments.

For example, Masand discloses "a method for classifying news stories using Memory Based Reasoning (MBR) (a k-nearest neighbor method)." Masand at p. 59. The MBR method includes "find[ing] the near matches for each document to be classified. This is done by constructing a relevance feedback query out of the text of the document, including both words and capitalized pairs. This query returns a weighted list of near matches (see Fig. 4)." Masand at p. 61. Thus, Masand discloses using a relevance feedback query constructed from the text of a new document to search against the documents contained in the database of previously categorized stories. Id. at 61.

Masand further discloses determining similarity scores (i.e., a degree of similarity) between the new story and each of the previously categorized stories. Masand at p. 61 ("[c)odes are assigned weights by summing similarity scores from the near matches.") (emphasis added). Fig. 4 shows the determined degree of similarity ("score") between an uncategorized news story and each of the eleven "nearest neighbors" in the previously categorized documents.

See Masand at p. 61.

Thus, Masand discloses instructions for determining similarity scores between the subject matter of an uncategorized document (e.g., news story) and the subject matter of each document of a set of previously categorized documents (e.g., previously categorized news stories) by "constructing a relevance feedback query out of the text of the document, including both words and capitalized pairs." Masand at p. 61.

instructions for identifying one or more of the previously categorized segments as relevant to the uncategorized segment based upon the determined degrees of similarity of subject matter content between the uncategorized segment and the previously categorized segments; and

Masand discloses instructions for identifying one or more of the previously categorized segments as relevant to the uncategorized segment based upon the determined degrees of similarity of subject matter content between the uncategorized segment and the previously categorized segments.

For example, as discussed above, Masand discloses determining a degree of similarity ("score") between an uncategorized news story and each of the previously categorized documents by "constructing a relevance feedback query out of the text of the document, including both words and capitalized pairs. This query returns a weighted list of near matches (see Fig. 4)." Masand at p. 61 (emphasis added). Additionally, "Fig. 4 shows the headlines and the normalized scores for the example used in Fig. 2 and the first few near matches from the relevance feedback search." Masand at p. 61 (emphasis added); see also, Fig. 4 (which shows an uncategorized news story and the eleven "nearest neighbors" in the previously categorized documents).

Based on the results of the relevance feedback query, Masand discloses identifying the k-nearest matches and "assign[ing] codes to the unknown document by combining the codes assigned to the k nearest matches; for these experiments, we used up to 11 nearest neighbors." Masand at p. 61 (emphasis added).

Thus, Masand discloses instructions for identifying k previously categorized documents as being relevant to the uncategorized document based on the determined similarity score between the uncategorized document and the previously categorized documents.

instructions for selecting one or more subject matter categories with which to identify the uncategorized segment based upon the subject matter categories used to identify the relevant previously categorized segments.

Masand discloses instructions for selecting one or more subject matter categories with which to identify the uncategorized segment based upon the subject matter categories used to identify the relevant previously categorized segments.

For example, as discussed above, Masand discloses determining a degree of similarity ("score") between an uncategorized news story and each of the previously categorized documents by "constructing a relevance feedback query out of the text of the document, including both words and capitalized pairs. This query returns a weighted list of near matches (see Fig. 4)." Masand at p. 61. Masand further discloses that the "[c]odes are assigned weights by summing similarity scores from the near matches. Finally we choose the best codes based on a score threshold. Fig. 4 shows the headlines and the normalized scores for the example used in Fig. 2 and the first few near matches from the relevance feedback search." Masand at p. 61 (emphasis added); see also, Fig. 4. In one particular example, Masand discloses "assign[ing] codes to the unknown document by combining the codes assigned to the k nearest matches: for these experiments, we used up to 11 nearest neighbors." Masand at p. 61 (emphasis added). The codes may be "grouped into seven [sic] categories: industry, market sector, product, subject, government agency, and region." Masand at p. 59 (emphasis added).

Thus, Masand discloses instructions for selecting one or more subject matter category codes for assigning to an uncategorized document based on the category codes assigned to the K-nearest (i.e., relevant) documents.


Ground #12 - Iwayama

RE: Claim 39

A method for categorizing according to subject matter an uncategorized segment of a body of information that includes a plurality of segments, each segment representing a defined set of information in the body of information, one or more segments of the body of information having previously been categorized by identifying each of the one or more segments with one or more subject matter categories, the method comprising the steps of:

Iwayama discloses a method for categorizing according to subject matter an uncategorized segment of a body of information that includes a plurality of segments, each segment representing a defined set of information in the body of information, one or more segments of the body of information having previously been categorized by identifying each of the one or more segments with one or more subject matter categories.

For example, Iwayama discloses a body of information comprising a plurality of segments, such as a collection of Wall Street Journal articles, that includes uncategorized documents ("For WSJ,... all stories from '89/10/2 to '89/11/2 went into a test set of 3,087 documents") and previously categorized documents that have been assigned to one or more of 78 categories ("For WSJ, all stories that appeared from '89/7/25 to '89/9/29 went into a training set of 5,820 documents" (Iwayama at p. 276.); "Each of the articles is assigned some of 78 categories." (Iwayama at p. 275.)).

Iwayama further discloses assigning subject matter categories to the uncategorized documents based on categories of similar previously categorized documents ("one or more categories for a test document are searched for by using given training documents with known categories.") Iwayama at Abstract. Specifically, Iwayama discloses a categorization method comprising four steps: "1. Construct clusters C ... 2. Calculate the posterior probability P(Cj|dtest) [i.e., degree of similarity] for a test document dtest and every cluster Cj ... 3. Sort the posterior probabilities and extract the K-nearest training documents ... 4. Assign to the test document categories based on the extracted K-nearest documents." Iwayama at p. 273. In one particular embodiment disclosed by Iwayama, the method may be used to perform a full search, such as "MBR (Memory Based Reasoning) ... for calculating a measure of similarity between a test document and every training document." Iwayama at p. 273.

Note: Iwayama discloses multiple embodiments. A second embodiment, not addressed herein, uses clusters of documents having similar categories and works in much the same way as the embodiment discussed herein because, as noted by Iwayama, clusters could be single documents and the methods, except for the clustering step, would be the same. In such case, "each training document belongs to a singleton cluster whose only member is the document itself." Iwayama at pp. 273-74. The first method and system, which is addressed herein, is referred to as the "full search" in Iwayama.

In this example, "each training document belongs to a singleton cluster whose only member is the document itself." Iwayama at p. 274. Thus, the method categorizes the uncategorized documents (i.e., test documents) according to subject matter and involves "calculating a measure of similarity between a test document and every training document." Iwayama at p. 273.

Thus, Iwayama discloses categorizing the uncategorized test documents of a body of information based on subject matter categories assigned to previously categorized training documents of the body of information.

determining the degree of similarity between the subject matter content of the uncategorized segment and the subject matter content of each of the previously categorized segments;

Iwayama discloses determining the degree of similarity between the subject matter content of the uncategorized segment and the subject matter content of each of the previously categorized segments.

For example, Iwayama discloses "MBR (Memory Based Reasoning)... for calculating a measure of similarity between a test document and every training document." Iwayama at p. 273 (emphasis added). This method involves "search[ing] the K-nearest training documents to the test document and us[ing] the categories assigned to those training documents." Iwayama at p. 273. To determine the K-nearest training documents, Iwayama discloses "2. [c]alculat[ing] the posterior probability P(Cj|dtest) [i.e., degree of similarity] for a test document dtest and every cluster Cj." Iwayama at p. 273. The posterior probability is the measure of similarity calculated based on the contents [i.e., subject matter] of the documents (e.g., using the "relative frequency of a term t in a test document," "relative frequency of a term t in a cluster," and "relative frequency of a term t in the entire set of training documents"). Iwayama at p. 274. Iwayama further discloses that "[f]or full search (MBR or K-NN), no clustering algorithm is used here. It follows that each training document belongs to a singleton cluster whose only member is the document itself." Iwayama at pp. 273-274. Thus, Iwayama discloses determining the posterior probabilities [i.e., degree of similarity] between a test document and each of the previously categorized documents.

Thus, Iwayama discloses determining a measure of similarity between the subject matter of an uncategorized test document and each document of a set of previously categorized training documents.

identifying one or more of the previously categorized segments as relevant to the uncategorized segment based upon the determined degrees of similarity of subject matter content between the uncategorized segment and the previously categorized segments; and

Iwayama discloses identifying one or more of the previously categorized segments ("training documents") as relevant to the uncategorized segment ("test document") based upon the determined degrees of similarity of subject matter content between the uncategorized segment and the previously categorized segments.

For example, Iwayama discloses "3. [s]ort[ing] the posterior probabilities and extract[ing] the K-nearest training documents." Iwayama at p. 273. As discussed above, the degree of similarity ("posterior probability") between the uncategorized document ("test document") and each of the previously categorized documents ("training document") is determined by the MBR method. See Iwayama at pp. 273-275. "The training documents in the nearest clusters [which comprise single documents under the MBR method] become the nearest training documents." Iwayama at p. 274.

Thus, Iwayama discloses identifying K previously categorized training documents as being relevant to the uncategorized test document based on the determined measures of similarity between the uncategorized test document and the previously categorized training documents.

selecting one or more subject matter categories with which to identify the uncategorized segment based upon the subject matter categories used to identify the relevant previously categorized segments.

Iwayama discloses selecting one or more subject matter categories with which to identify the uncategorized segment based upon the subject matter categories used to identify the relevant previously categorized segments.

For example, Iwayama discloses "4. [a]ssign[ing] to the test document categories based on the extracted K-nearest documents." Iwayama at p. 273. Iwayama further discloses that this step includes generating a "category ranking for each test document.... According to the category ranking, one or more categories are assigned to each test document using one of the following category assignment strategies, [k-per-doc]... [probability threshold]... [proportional assignment]." Iwayama at p. 274.

Thus, Iwayama discloses selecting one or more categories for a test document [uncategorized segment] based on the categories assigned to the K-nearest training documents [previously categorized segments].
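The four steps the examiner attributes to Iwayama's full search can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the sample documents, and in particular the toy term-overlap (Jaccard) similarity, which stands in for the posterior probability that Iwayama is said to use.

```python
from collections import Counter

def jaccard(a, b):
    """Toy similarity: shared terms over all terms (an assumption, not Iwayama's measure)."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def knn_categorize(test_doc, training, k=2, top_categories=1, similarity=jaccard):
    """training is a list of (text, [categories]) pairs."""
    # Step 1 (clustering) is a no-op in full search: each training
    # document is treated as its own singleton cluster.
    # Step 2: score the test document against every training document.
    scored = [(similarity(test_doc, text), cats) for text, cats in training]
    # Step 3: sort by similarity and keep the K nearest.
    nearest = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    # Step 4: rank categories by the summed similarity of the neighbors carrying them.
    ranking = Counter()
    for score, cats in nearest:
        for cat in cats:
            ranking[cat] += score
    return [cat for cat, _ in ranking.most_common(top_categories)]

# Hypothetical training set with known categories.
training = [
    ("oil prices rise as opec cuts output", ["energy"]),
    ("crude oil futures climb on supply fears", ["energy", "markets"]),
    ("bank profits fall after loan losses", ["banking"]),
]
result = knn_categorize("oil output cuts lift crude prices", training)
```

Here the two nearest training documents both carry the "energy" category, so that is the category selected for the test document.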


RE: Claim 82

A computer readable medium encoded with one or more computer programs for enabling categorization according to subject matter of an uncategorized segment of a body of information that includes a plurality of segments, each segment representing a defined set of information in the body of information, one or more segments having previously been categorized by identifying each of the one or more segments with one or more subject matter categories, comprising:

Iwayama discloses a computer readable medium encoded with one or more computer programs for enabling categorization according to subject matter of an uncategorized segment of a body of information that includes a plurality of segments, each segment representing a defined set of information in the body of information, one or more segments having previously been categorized by identifying each of the one or more segments with one or more subject matter categories.

For example, Iwayama discloses a body of information comprising a plurality of segments, such as a collection of Wall Street Journal articles, that includes uncategorized documents ("For WSJ,... all stories from '89/10/2 to '89/11/2 went into a test set of 3,087 documents") and previously categorized documents that have been assigned to one or more of 78 categories ("For WSJ, all stories that appeared from '89/7/25 to '89/9/29 went into a training set of 5,820 documents" (Iwayama at p. 276.); "Each of the articles is assigned some of 78 categories." (Iwayama at p. 275.)).

Iwayama further discloses assigning subject matter categories to the uncategorized documents based on categories of similar previously categorized documents ("one or more categories for a test document are searched for by using given training documents with known categories.") Iwayama at Abstract. Specifically, Iwayama discloses a categorization method comprising four steps: "1. Construct clusters C ... 2. Calculate the posterior probability P(Cj/dtest) [i.e., degree of similarity] for a test document dtest and every cluster Cj ... 3. Sort the posterior probabilities and extract the K-nearest training documents ... 4. Assign to the test document categories based on the extracted K-nearest documents." Iwayama at p. 273.

In one particular embodiment disclosed by Iwayama, the method may be used to perform a full search, such as "MBR (Memory Based Reasoning)... for calculating a measure of similarity between a test document and every training document." Iwayama at p. 273.

Note: Iwayama discloses multiple embodiments. A second embodiment, not addressed herein, uses clusters of documents having similar categories and works in much the same way as the embodiment discussed herein because, as noted by Iwayama, clusters could be single documents and the methods, except for the clustering step, would be the same. In such case, "each training document belongs to a singleton cluster whose only member is the document itself." Iwayama at pp. 273-74. The first method and system, which is addressed herein, is referred to as the "full search" in Iwayama.

In this example, "each training document belongs to a singleton cluster whose only member is the document itself." Iwayama at p. 274. Thus, the method categorizes the uncategorized documents (i.e., test documents) according to subject matter and involves "calculating a measure of similarity between a test document and every training document." Iwayama at p. 273.

With respect to being embodied as a computer program stored on a computer readable medium, Iwayama describes the categorization as being performed by a "program search[ing] for one or more categories that a test document is assumed to have." Iwayama at p. 273 (emphasis added.) See also, program instructions on pp. 279-280. The use of a "program" implicates the use of a computer, and accordingly, instructions encoded on a computer readable medium.

Thus, Iwayama discloses instructions for categorizing the uncategorized test documents of a body of information based on categories assigned to previously categorized training documents of the body of information.

instructions for determining the degree of similarity between the subject matter content of the uncategorized segment and the subject matter content of each of the previously categorized segments;

Iwayama discloses instructions for determining the degree of similarity between the subject matter content of the uncategorized segment and the subject matter content of each of the previously categorized segments.

For example, Iwayama discloses "MBR (Memory Based Reasoning)... for calculating a measure of similarity between a test document and every training document." Iwayama at p. 273 (emphasis added.) This method involves "search[ing] the K-nearest training documents to the test document and us[ing] the categories assigned to those training documents." Iwayama at p. 273. To determine the K-nearest training documents, Iwayama discloses "2. [c]alculat[ing] the posterior probability P(Cj/dtest) [i.e., degree of similarity] for a test document dtest and every cluster Cj." Iwayama at p. 273. The posterior probability is the measure of similarity calculated based on the contents [i.e., subject matter] of the documents (e.g., using the "relative frequency of a term t in a test document," "relative frequency of a term t in a cluster," and "relative frequency of a term t in the entire set of training documents"). Iwayama at p. 274. Iwayama further discloses that "[f]or full search (MBR or K-NN), no clustering algorithm is used here. It follows that each training document belongs to a singleton cluster whose only member is the document itself." Iwayama at pp. 273-274. Thus, Iwayama discloses determining the posterior probabilities [i.e., degree of similarity] between a test document and each of the previously categorized documents.

Thus, Iwayama discloses instructions for determining a measure of similarity between the contents (subject matter) of an uncategorized test document and each document of a set of previously categorized training documents.
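The discussion above names three quantities that feed the similarity measure: the relative frequency of a term in the test document, in a cluster, and in the entire training set. The Python sketch below shows one plausible way to combine them, summing P(t|cluster) * P(t|test) / P(t) over shared terms and scaling by a uniform cluster prior. That combination, the uniform prior, and all function names are assumptions for illustration; the office action does not reproduce Iwayama's formula.

```python
from collections import Counter

def rel_freq(tokens):
    """Relative frequency of each term in a list of tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}

def similarity_scores(test_tokens, training_docs):
    """Score a test document against each training document.

    training_docs is a non-empty list of token lists; in full search each
    one is a singleton cluster.
    """
    p_t_test = rel_freq(test_tokens)
    # Relative frequency of each term across the whole training set.
    p_t_all = rel_freq([t for doc in training_docs for t in doc])
    prior = 1 / len(training_docs)  # assumed uniform prior over clusters
    scores = []
    for doc in training_docs:
        p_t_doc = rel_freq(doc)
        # Only terms shared by the test document and this cluster contribute.
        s = sum(p_t_doc[t] * p_t_test[t] / p_t_all[t]
                for t in p_t_test if t in p_t_doc)
        scores.append(prior * s)
    return scores

# Hypothetical tokenized documents.
training_docs = [["oil", "prices", "rise"], ["bank", "loan", "losses"]]
scores = similarity_scores(["oil", "prices", "fall"], training_docs)
```

Dividing by the corpus-wide frequency gives more weight to terms that are rare across the training set, so a shared uncommon term counts for more than a shared common one.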

instructions for identifying one or more of the previously categorized segments as relevant to the uncategorized segment based upon the determined degrees of similarity of subject matter content between the uncategorized segment and the previously categorized segments; and

Iwayama discloses instructions for identifying one or more of the previously categorized segments ("training documents") as relevant to the uncategorized segment ("test document") based upon the determined degrees of similarity of subject matter content between the uncategorized segment and the previously categorized segments.

For example, Iwayama discloses "3. [s]ort[ing] the posterior probabilities and extract[ing] the K-nearest training documents." Iwayama at p. 273. As discussed above, the degree of similarity ("posterior probability") between the uncategorized document ("test document") and each of the previously categorized documents ("training document") is determined by the MBR method. See Iwayama at pp. 273-275. "The training documents in the nearest clusters [which comprise single documents under the MBR method] become the nearest training documents." Iwayama at p. 274.

Thus, Iwayama discloses instructions for identifying K previously categorized training documents as being relevant to the uncategorized test document based on the determined measures of similarity between the uncategorized test document and the previously categorized training documents.

instructions for selecting one or more subject matter categories with which to identify the uncategorized segment based upon the subject matter categories used to identify the relevant previously categorized segments.

Iwayama discloses instructions for selecting one or more subject matter categories with which to identify the uncategorized segment based upon the subject matter categories used to identify the relevant previously categorized segments.

For example, Iwayama discloses "4. [a]ssign[ing] to the test document categories based on the extracted K-nearest documents." Iwayama at p. 273. Iwayama further discloses that this step includes generating a "category ranking for each test document... According to the category ranking, one or more categories are assigned to each test document using one of the following category assignment strategies, [k-per-doc] ... [probability threshold] ... [proportional assignment]." Iwayama at p. 274.

Thus, Iwayama discloses instructions for selecting one or more categories for a test document [uncategorized segment] based on the categories assigned to the K-nearest training documents [previously categorized segments].
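Two of the category assignment strategies named in the quotation can be sketched in Python as follows. The office action gives only the strategy names, so the behavior shown here (take the top k categories, or take every category whose score meets a cutoff) is an assumed reading, and the function names and scores are hypothetical. Proportional assignment is left out because the quoted text does not describe it.

```python
def _ordered(ranking):
    """Categories sorted from highest to lowest score."""
    return sorted(ranking.items(), key=lambda item: item[1], reverse=True)

def k_per_doc(ranking, k):
    """Assumed reading: assign the k highest-ranked categories to the document."""
    return [cat for cat, _ in _ordered(ranking)[:k]]

def probability_threshold(ranking, threshold):
    """Assumed reading: assign every category whose score meets the threshold."""
    return [cat for cat, score in _ordered(ranking) if score >= threshold]

# Hypothetical category ranking for one test document.
ranking = {"energy": 0.62, "markets": 0.25, "banking": 0.13}
top_two = k_per_doc(ranking, 2)
above_half = probability_threshold(ranking, 0.5)
```

Under either strategy the result is "one or more" categories, which is the point the examiner relies on for this claim element.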


Ground #15 - Yuasa

RE: Claim 39

A method for categorizing according to subject matter an uncategorized segment of a body of information that includes a plurality of segments, each segment representing a defined set of information in the body of information, one or more segments of the body of information having previously been categorized by identifying each of the one or more segments with one or more subject matter categories, the method comprising the steps of:

Yuasa discloses a method for categorizing according to subject matter an uncategorized segment of a body of information that includes a plurality of segments, each segment representing a defined set of information in the body of information, one or more segments of the body of information having previously been categorized by identifying each of the one or more segments with one or more subject matter categories.

For example, Yuasa discloses a system that performs a method of automatically classifying large volume documents. (Yuasa at [0001], [0008].) The documents are a body of information and each document is a segment of information. Yuasa discloses that one or more of the documents (i.e., segments) have been previously categorized. (Id. at [0017]-[0018].) The categories include subject matter categories, such as "politics", "Diet", and "international". (Id. at [0058].)

determining the degree of similarity between the subject matter content of the uncategorized segment and the subject matter content of each of the previously categorized segments;

Yuasa discloses determining the degree of similarity between the subject matter content of the uncategorized segment and the subject matter content of each of the previously categorized segments (e.g., "a classifier for classifying documents using degrees of similarity between characteristic vectors of documents"). Yuasa at ¶¶ [0005], [0009], [0011], [0013], [0018], [0030], [0032], [0046], [0048], [0055], and [0058]-[0060].

Yuasa describes an exemplary process by which a sentence is categorized according to a plurality of predetermined classification groups. Id. at [0031]-[0046]. The classification groups include subject matter categories, such as "politics", "Diet", and "international". (Id. at [0058].) The classification groups are determined from previously categorized documents, and a representative vector is generated for each classification group. In one example, a representative document is chosen for each classification group, and a document characteristic vector is created for each representative document. Id. at [0018]. In another example, a clustering technique is used in which "documents for which the distances between document characteristics are close [are placed] in the same field [i.e. classification]". Id. at [0017]. Yuasa determines similarity by comparing the characteristic vector of the classification group to the characteristic vector of the sample sentence. Id. at [0031]-[0046]. "[T]he inner products of both [the characteristic vector of the sample sentence and the characteristic vector of the classification groups] are computed, and that producing the highest value is assumed to exhibit the highest degree of similarity..." Id. at [0032].

identifying one or more of the previously categorized segments as relevant to the uncategorized segment based upon the determined degrees of similarity of subject matter content between the uncategorized segment and the previously categorized segments; and

Yuasa discloses identifying one or more of the previously categorized segments as relevant to the uncategorized segment based upon the determined degrees of similarity of subject matter content between the uncategorized segment and the previously categorized segments (e.g., "a classifier for classifying documents using degrees of similarity between characteristic vectors of documents" and "it will be possible to classify a document read in from the document memory 301 in a classification group corresponding to the representative vector that most resembles the characteristic vector(s) for that document"). Yuasa at ¶¶ [0005], [0009], [0011], [0013], [0018], [0030], [0032], [0046], [0048], [0055], and [0058]-[0060].

For example, the Yuasa system measures the similarity between the example sentence and the previously determined classification groups by computing an inner product of the characteristic vector of the example sentence and the characteristic vector of each of the classification groups. (Id. at [0031]-[0046].) "[T]he inner products of both [the characteristic vector of the sample sentence and the characteristic vector of the classification groups] are computed, and that producing the highest value is assumed to exhibit the highest degree of similarity..." (Id. at [0032].)

selecting one or more subject matter categories with which to identify the uncategorized segment based upon the subject matter categories used to identify the relevant previously categorized segments.

Yuasa discloses selecting one or more subject matter categories with which to identify the uncategorized segment based upon the subject matter categories used to identify the relevant previously categorized segments (e.g., "it is seen that the characteristic vector for example sentence C is closest to the representative vector for classification group 3, so example sentence C is classified in classification group 3.") Yuasa at ¶¶ [0011], [0018], [0046] and [0058]-[0060].
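The inner-product comparison described for Yuasa can be illustrated with a short Python sketch: compute the inner product of the sentence's characteristic vector with each group's representative vector and take the group with the highest value. The vector values, group labels, and function names below are invented for illustration; only the use of the inner product and the "highest value" rule come from the quoted passages.

```python
def inner_product(u, v):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def classify(doc_vector, representative_vectors):
    """Return the group whose representative vector gives the highest inner product."""
    return max(
        representative_vectors,
        key=lambda group: inner_product(doc_vector, representative_vectors[group]),
    )

# Hypothetical representative vectors, one per classification group.
groups = {
    "group 1": [0.9, 0.1, 0.0],
    "group 2": [0.1, 0.8, 0.1],
    "group 3": [0.0, 0.2, 0.9],
}
# Hypothetical characteristic vector for an example sentence.
sentence_c = [0.1, 0.1, 0.8]
chosen = classify(sentence_c, groups)
```

With these made-up numbers the inner products are roughly 0.10, 0.17, and 0.74, so the sentence lands in group 3, mirroring the outcome quoted for example sentence C.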


RE: Claim 82

A computer readable medium encoded with one or more computer programs for enabling categorization according to subject matter of an uncategorized segment of a body of information that includes a plurality of segments, each segment representing a defined set of information in the body of information, one or more segments having previously been categorized by identifying each of the one or more segments with one or more subject matter categories, comprising:

Yuasa discloses a computer readable medium encoded with one or more computer programs for enabling categorization according to subject matter of an uncategorized segment of a body of information that includes a plurality of segments, each segment representing a defined set of information in the body of information, one or more segments having previously been categorized by identifying each of the one or more segments with one or more subject matter categories.

For example, Yuasa discloses a system that performs a method of automatically classifying large volume documents. (Yuasa at [0001], [0008].) The documents are a body of information and each document is a segment of information. Yuasa discloses that one or more of the documents (i.e., segments) have been previously categorized. (Id. at [0017]-[0018].) The categories include subject matter categories, such as "politics", "Diet", and "international". (Id. at [0058].) The system is "for use in an automatic classifying machine, word processor, or filing system or the like which stores and/or automatically classifies documents." (Id. at [0001].) The system is also used to classify electronic mail and/or news. (Id. at [0061].) It is inherent that such systems would require computer programs, instructions, and/or code encoded on a computer readable medium to perform such a task.

instructions for determining the degree of similarity between the subject matter content of the uncategorized segment and the subject matter content of each of the previously categorized segments;

Yuasa discloses instructions for determining the degree of similarity between the subject matter content of the uncategorized segment and the subject matter content of each of the previously categorized segments (e.g., "a classifier for classifying documents using degrees of similarity between characteristic vectors of documents"). Yuasa at ¶¶ [0005], [0009], [0011], [0013], [0018], [0030], [0032], [0046], [0048], [0055], and [0058]-[0060].

For example, Yuasa describes a process by which an example sentence is categorized according to a plurality of predetermined classification groups. (Id. at [0031]-[0046].) The classification groups include subject matter categories, such as "politics", "Diet", and "international". (Id. at [0058].) The classification groups are determined from previously categorized documents, and a representative vector is generated for each classification group. In one example, a representative document is chosen for each classification group, and a document characteristic vector is created for each representative document. (Id. at [0018].) In another example, a clustering technique is used in which "documents for which the distances between document characteristics are close [are placed] in the same field [i.e. category]". (Id. at [0017].) Yuasa determines similarity by comparing the characteristic vector of the classification group to the characteristic vector of the sample sentence. (Id. at [0031]-[0046].) "[T]he inner products of both [the characteristic vector of the sample sentence and the characteristic vector of the classification groups] are computed, and that producing the highest value is assumed to exhibit the highest degree of similarity..." (Id. at [0032].)

instructions for identifying one or more of the previously categorized segments as relevant to the uncategorized segment based upon the determined degrees of similarity of subject matter content between the uncategorized segment and the previously categorized segments; and

Yuasa discloses instructions for identifying one or more of the previously categorized segments as relevant to the uncategorized segment based upon the determined degrees of similarity of subject matter content between the uncategorized segment and the previously categorized segments (e.g., "a classifier for classifying documents using degrees of similarity between characteristic vectors of documents" and "it will be possible to classify a document read in from the document memory 301 in a classification group corresponding to the representative vector that most resembles the characteristic vector(s) for that document"). Yuasa at ¶¶ [0005], [0009], [0011], [0013], [0018], [0030], [0032], [0046], [0048], [0055], and [0058]-[0060].

For example, the Yuasa system measures the similarity between the example sentence and the previously determined classification groups by computing an inner product of the characteristic vector of the example sentence and the characteristic vector of each of the classification groups. (Id. at [0031]-[0046].) "[T]he inner products of both [the characteristic vector of the sample sentence and the characteristic vector of the classification groups] are computed, and that producing the highest value is assumed to exhibit the highest degree of similarity..." (Id. at [0032].)

instructions for selecting one or more subject matter categories with which to identify the uncategorized segment based upon the subject matter categories used to identify the relevant previously categorized segments.

Yuasa discloses instructions for selecting one or more subject matter categories with which to identify the uncategorized segment based upon the subject matter categories used to identify the relevant previously categorized segments (e.g., "it is seen that the characteristic vector for example sentence C is closest to the representative vector for classification group 3, so example sentence C is classified in classification group 3.") Yuasa at ¶¶ [0011], [0018], [0046] and [0058]-[0060].



Groklaw © Copyright 2003-2013 Pamela Jones.
All trademarks and copyrights on this page are owned by their respective owners.
Comments are owned by the individual posters.

PJ's articles are licensed under a Creative Commons License. ( Details )