GROKLAW
When you want to know more...
Java VM versus Dalvik VM | 400 comments
Java VMs do NOT store symbolic references IN the instructions (long post)
Authored by: Anonymous on Sunday, May 13 2012 @ 12:10 PM EDT
Here is the latest version of the VM Specification document.

All of the same stuff is still in it, e.g.:
4.4. The Constant Pool
5.4.3. Resolution
5.4.3.2. Field Resolution
6. The Java Virtual Machine Instruction Set

"putstatic" bytecode from section 6.5
"invokevirtual" bytecode from section 6.5

I did not check for any differences in the language of those sections. But since they have to be 100% backward compatible, there won't be any major surprises.


Java VM versus Dalvik VM
Authored by: Anonymous on Sunday, May 13 2012 @ 12:51 PM EDT
In the parent post, I demonstrate that Java bytecodes don't contain symbolic references; they just refer to symbolic references which exist somewhere else (in the constant pool of the Java .class file). But Oracle's patent claim is against Dalvik, so in this court case, it's the Dalvik bytecodes that matter. A lot of the patent testimony from Google witnesses was about showing that the Dalvik bytecodes don't contain symbolic references. They used demonstratives showing the structure of .dex files, samples of Dalvik bytecodes, etc.


In Andrew McFadden's testimony, he described how Dalvik bytecodes don't contain symbolic references either (probably for similar reasons -- simplicity and interpreter efficiency). They keep the symbols elsewhere in the .dex file, and they can share the same symbol between multiple Dalvik bytecode instructions (just like a constant pool entry in a Java .class file can be shared by multiple Java bytecode instructions).

During cross, Oracle tried to elicit some sort of statement from him that they could twist to mean the Dalvik bytecode contained a symbolic reference:

excerpt from Cross Examination of Mr. McFadden, by Oracle:

Oracle: So the role of the iget instruction is to obtain actual field data from an object and store it in a Dalvik register, true, sir?

Mr. McFadden: Yes.

Oracle: iget with 01 as the field index, it doesn't store the number 01 in a Dalvik register, right?

Mr. McFadden: Right.

Oracle: It doesn't obtain 2 or 76 and store those in a Dalvik register, does it?

It doesn't obtain the name "byte", does it?

[five or six more questions along these lines, getting more and more excited]

Mr. McFadden: Yes.

Oracle: The actual data is what it stores, right?

Mr. McFadden: Yes.

Oracle: True or false: the Dalvik iget instruction never contains the actual memory location of the actual field it's supposed to get, true?

Mr. McFadden: True.

Oracle: True or false: the va operand is not the memory location of the actual field?

Mr. McFadden: True.

Oracle: The vb and field@CCCC are not the memory location of the actual field?

Mr. McFadden: True.

[By this point the Oracle lawyer looks like he's just triumphantly unveiled the murder weapon. I have no idea what his questions were supposed to mean, though, and the witness sounded like he didn't either. I didn't see the jury's response.]


The second half of the questions Google asked him on redirect establishes that the Dalvik instructions only contain a numeric index, and that this is NOT the same as containing a symbolic reference.

excerpt from Redirect Examination of Mr. McFadden, by Google:

Google: Why did you use the term "data" here?

Mr. McFadden: To contrast with the instruction stream.

Google: Is the instruction stream separate from the data?

Oracle: Leading.

Judge: Try not to lead.

Google: To what extent are the instructions and tables separate in the dex file?

Mr. McFadden: A lot of things are interleaved, but the instructions for a given method are distinct. They occur in a solid block inside the dex file, so you can find a chunk which is the instructions for a given method.

Google: Page 17, "insns", is that the chunk you were referring to?

Mr. McFadden: Yes.

Google: Slide 21. Mr. Jacobs asked if "1" in the instruction stream corresponds to "fun", remember that?

Mr. McFadden: I think so.

Google: What does "1" tell you?

Mr. McFadden: Index into the field IDs table.

Google: Does it tell you a location?

Mr. McFadden: Yes.

Google: What happens when you get to location 1 in the field table?

Mr. McFadden: You read the data there, and chase that to the next location.

Google: What happens when you reach the string data table?

Mr. McFadden: At that point, you're no longer working with numeric values; you've got string data, and you have to use those to find a matching field.

[More leading questions, shut down by the judge again]

Google: What's the difference between a reference where you use a symbol like this, and reference using numeric values?

Mr. McFadden: Numeric references take you directly to the next place you need to be. Symbolic references like "fun" don't give you an address, you have to take them and compare them against something else. You need to find the field that matches them. You can't just go straight there, you have to search for it.
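
[To make the distinction in that last answer concrete, here is a minimal sketch in Python. The `slots` and `fields` tables, the helper names, and all values are made up for illustration; nothing here is taken from the Dalvik source. A numeric reference is used directly as a location, while a symbolic one has to be compared against candidate names until a match turns up.]

```python
# Hypothetical storage for one object's fields: slot number -> value.
slots = [10, 20, 30]

# Hypothetical field metadata: each entry pairs a name with its slot.
fields = [("byte", 0), ("fun", 1), ("size", 2)]


def get_by_index(slot):
    """Numeric reference: the operand already is the location."""
    return slots[slot]


def get_by_name(name):
    """Symbolic reference: search for the field whose name matches."""
    for field_name, slot in fields:
        if field_name == name:
            return slots[slot]
    raise KeyError(name)


print(get_by_index(1))
print(get_by_name("fun"))
```

[Both calls reach the same value, but only the second one has to search for it.]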


A lot of David August's testimony was also about demonstrating that the Dalvik instructions don't contain a symbolic reference:

excerpt from David August Testimony, Google's Witness:

Google: Did you see Mitchell's report that Android's resolve.c infringes [claim numbers] of the '104 patent?

Mr. August: Yes. I believe it doesn't infringe [claims].

Resolve.c operates on instructions that contain references. The '104 patent requires that those references be symbolic references, but resolve.c doesn't do that.

I believe dexopt does not infringe claims 27 or 49 [?] of the '104 patent. They require that the instructions contain symbolic references, but dexopt doesn't operate on instructions that contain symbolic references. Also, dexopt is a static process, and the claims require that this occurs dynamically instead of statically.

excerpt from David August Testimony, Google's Witness:

Google: Okay, let's talk about the '104 patent.

Mr. August: It describes a way of executing instructions that contain symbolic references.

Figure 8 shows how to handle a symbolic reference inside a "load" instruction, and how to improve its performance. The steps involve resolving the reference "y", finding its location (in this case, slot 2), and remembering its location by overwriting the instruction "load 'y'" with "load 2".

The claims require that the symbolic references be inside the instruction.

Claim 11: "instructions containing one or more symbolic references".
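
[A rough sketch of the Figure 8 idea as Mr. August describes it, in Python. The instruction layout, the `symbol_table`, and the slot contents are assumptions for illustration only; the one thing taken from the testimony is that "load 'y'" resolves to slot 2 and the instruction is overwritten with "load 2".]

```python
# Hypothetical name -> slot mapping and slot contents.
symbol_table = {"x": 1, "y": 2}
slots = [0, 17, 23]

# Instruction stream as mutable [opcode, operand] pairs.
program = [["load", "y"], ["load", "y"]]


def execute(instr):
    """Run one load, rewriting a symbolic operand in place after resolving it."""
    op, operand = instr
    if op != "load":
        raise ValueError(op)
    if isinstance(operand, str):
        # Symbolic operand: resolve the name to a slot number, then
        # overwrite the instruction so the lookup is not repeated.
        operand = symbol_table[operand]
        instr[1] = operand
    return slots[operand]


first = execute(program[0])   # resolves "y", rewrites to ["load", 2]
again = execute(program[0])   # uses the numeric operand directly
print(first, again, program)
```

[Note that the symbol "y" starts out inside the instruction itself, which is the requirement the testimony keeps returning to.]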

Google: To what extent is that requirement also reflected in figure 8?

Mr. August: We see it throughout the patent. It's in the figures, it's in the claims, etc.

Google: Are you familiar with the Android code accused of infringing?

Mr. August: Yes, I'm very familiar.

Google: How much time have you spent viewing the code?

Mr. August: In relation to this case, more than 50 hours.

Google: Are you familiar with the format of dex files? What is it?

Mr. August: It's a Dalvik VM program file, containing both instructions and data which instruct the program how to perform an operation.

They contain symbolic references, but they're not inside instructions. You'll find plenty of symbolic references outside of instructions, but not inside instructions.

Mr. McFadden did a great job of explaining how resolve.c works at a high level.

...

I looked at the court's claim construction order: "a symbolic reference is a reference that identifies data by a name other than the numeric memory location of the data, and that is resolved dynamically rather than statically".

The Dalvik VM does not operate on instructions that contain symbolic references.

[Points out the instruction stream in McFadden's demonstrative slides, which doesn't contain any symbols]

Google: Where are the symbolic references in this figure?

Mr. August: You see some here in the string data table. "fun" was the example that was covered most heavily.

Google: To what extent does the 1 in the instruction stream represent the symbol "fun"?

Mr. August: It doesn't represent it; it gives its location. The 1 indicates that in the field ID table we'll find other data, in location 1. In this case, the other data is two pieces: there's a 02, which is a numeric reference (the location of more data), and the 76, which is another numeric reference. Those are known as indexes. It's another way of saying its location in the table.

The 02 is referring to another index, in the string ID table, which gives you 08, yet another index. This one's in the string data table, which gives the actual symbol.
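
[The chain of lookups described in this answer can be sketched as follows. The numbers 1, 02, 76, 08 and the string "fun" come from the testimony; the Python table layout and names are my own assumption, and I have not tried to reproduce the real dex file structures or guess what the 76 points to.]

```python
# Each table maps a location to the data stored there.
# Only the entries mentioned in the testimony are filled in.
field_ids = {1: (2, 76)}     # location 1 holds two numeric references
string_ids = {2: 8}          # location 2 holds the index 08
string_data = {8: "fun"}     # location 8 holds the actual symbol


def field_name_for(field_index):
    """Follow numeric references until the string data table is reached."""
    name_ref, _other_ref = field_ids[field_index]
    data_location = string_ids[name_ref]
    return string_data[data_location]


print(field_name_for(1))  # fun
```

[Every step before the last one is a plain numeric lookup; the symbol only shows up at the end of the chain, outside the instruction stream.]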

Google: What's the difference between an index and an offset?

Mr. August: They're often used interchangeably. They're both numeric memory locations. In Dalvik, you might use "offset" to mean an offset from the beginning of the dex file, and "index" to refer to a location in the constant pool.

Google: Do they both refer to locations in memory?

Mr. August: They're both numeric memory locations.

excerpt from David August Testimony, Google's Witness:

Judge: So it sounds to me like the key point that you make (and believe me, I'm not saying that you're right or wrong, just trying to understand), that in that bottom box, those instructions never contain an "x" or a "y", they always contain a number?

David August: They always contain a number that refers to a memory location. You'll never find a sequence of characters like "x" or "y" there. I think there's no disagreement there.

Judge: So what is the disagreement?

Google: So, what is the disagreement? [laughter]

David August: I think they must be imagining some sort of transitive property, where you follow a numeric reference to a numeric reference to a symbolic reference and it somehow makes the whole thing symbolic. Or maybe they're saying that the "instruction" contains all this data, not just the opcode and the operands.

Google: Three issues: first, Dr. Mitchell says that an index is a symbolic reference, correct?

David August: Sometimes.

Google: Is there a disagreement as to whether the instruction stream contains indexes?

David August: No, there's no disagreement. They contain indexes.

Google: Why would you move "y" out of the instruction stream and put it into the string table?

David August: It's about efficiency of executing the instructions. If you have an instruction like "52" (get), you'd like it to be a fixed size. CCCC is four digits -- well, it's hex digits, but close enough -- it's a fixed length. Now if you have a symbolic reference like "y" or, sorry, "fun", it's much more complicated. Now you have to figure out how long it is. You have to figure out when to stop, so you don't run into the other instructions. All of that is much less efficient than just knowing that an instruction that gets data is just two numbers: 52, 01, regardless of the length of the identifier.
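
[A small Python sketch of the efficiency point. It follows the simplified picture in the testimony (an instruction is "just two numbers: 52, 01") and is not the real Dalvik encoding. The inline-name format is entirely hypothetical: I assume a zero terminator after each name so the decoder knows where to stop.]

```python
def decode_fixed(stream):
    """Fixed size: every instruction is exactly two numbers, opcode then index."""
    return [(stream[i], stream[i + 1]) for i in range(0, len(stream), 2)]


def decode_inline(stream):
    """Hypothetical variable-length format: opcode, name bytes, then a 0."""
    decoded = []
    i = 0
    while i < len(stream):
        opcode = stream[i]
        # The decoder must scan ahead to find where the name stops.
        end = stream.index(0, i + 1)
        name = bytes(stream[i + 1:end]).decode("ascii")
        decoded.append((opcode, name))
        i = end + 1
    return decoded


fixed = [0x52, 0x01, 0x52, 0x01]
inline = [0x52] + list(b"fun") + [0] + [0x52] + list(b"y") + [0]

print(decode_fixed(fixed))
print(decode_inline(inline))
```

[With the fixed format the decoder can step through the stream two numbers at a time, whatever the length of the identifier; with inline names it has to measure each one before it can find the next instruction.]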


Short form response - It's both simpler and more complicated
Authored by: bugstomper on Sunday, May 13 2012 @ 02:45 PM EDT
While it is true that the Java VM bytecode instructions do not themselves
contain symbolic references, that does not necessarily mean that the Java VM
does not practice the '104 patent.

If you interpret the claims so that an instruction with a symbolic reference is
an instruction whose target refers to a name that must be looked up and mapped
to a memory location value, then an index into a constant pool that contains a
name is still a symbolic reference. By that construction, Java VM does practice
the patent and Dalvik VM still does not practice it. David August's testimony is
still correct in regard to Dalvik even though he was using a more literal
interpretation of the language of the patent which would not encompass the Java
VM.

One interesting consequence is that the re-examination document at the PTO did
not accept Google's submission of (author's name escapes me at the moment -
Steele?) prior art from Lisp on the basis that the Lisp text talked about link
smashing, which is symbol resolution that is memo-ized by replacing a link to a
symbol with a direct link to the data being referenced. The examiner said that
the patent claims talked about replacing the reference in the instruction. Thus
the symbol resolution done by the Java VM would not practice the patent
according to that limitation, and if it did then Lisp link smashing as disclosed
by (Steele?) is prior art after all.

On the other hand, the re-examiner did accept the Gries text as being prior art,
as it does talk about instructions in a way that includes the possibility of the
instruction containing an actual symbolic reference that is replaced. So if you
take the re-examination into account the claims have been rejected anyway.


Groklaw © Copyright 2003-2013 Pamela Jones.
All trademarks and copyrights on this page are owned by their respective owners.
Comments are owned by the individual posters.

PJ's articles are licensed under a Creative Commons License. ( Details )