
JavaExtensionModule

Updates

07.05.2009 Progress of this project is now documented more closely at http://www.open-technology.de/index.php?/pages/jem.html and in the respective blog category. We still hope to document major steps in our work here too. One such major achievement is our success in implementing multithreading support for JEM; see the blog for more details.

28.12.2008 We haven't updated this page for a long time. Now we are glad to announce that, as of tag jem-0.1 of the git tree at http://repo.or.cz/w/jamvm-avr32-jem.git, the JEM port of JamVM passes the "Hello, World!" test. This was our first goal and now we have reached it. Of course, there is still a lot of work to be done, including but not limited to optimisations, new opcodes, and garbage collection. We are also planning to attend the Free Java Meeting at FOSDEM 2009, so please feel free to talk to us there.

Understanding the problem

Attention: this page is used by developers trying to implement JEM support under AVR32Linux. There's no guarantee whatsoever that this project will ever produce any results, so don't hold your breath :-) OTOH, new developers are welcome!

AVR32 SoCs have Java Extension Modules (JEM), capable of running Java bytecode on hardware. This page is dedicated to an attempt to implement support for JEM in the Linux kernel and the JamVM Java virtual machine implementation, currently the only JVM ported to AVR32.

We start by studying the Java (R) Technical Reference, a document from Atmel describing JEM. The document is freely available for download at http://www.atmel.com/dyn/resources/prod_documents/doc32049.pdf.

  1. JamVM has to enter the JEM, which is done by executing the RETJ instruction. Before calling it a few preparatory operations have to be performed, like creating a Java frame, saving R0-R7, R11, R12, etc. A few JamVM internal structures have to be modified or extended for use with JEM. For instance, JamVM pre-parses the class and converts the Constant Pool to an internal representation, whereas JEM needs the Constant Pool in its original form. Another significant difference is the direction in which the Operand Stack is grown. JamVM pushes new elements onto the Operand Stack at increasing addresses, whereas JEM grows the stack downwards in RAM. This means elements have to be exchanged between the JEM OS and the JamVM OS one at a time and memcpy-like copying cannot be used.
  2. Entering JEM has to be done in a JEM-entry function. The JamVM interpreter engine should prepare the current method frame, save registers, put the address of the next bytecode to execute into LR, and then execute RETJ. When control returns from JEM, it should again monitor the VM state, handle any exceptions or errors thrown from the traps, prepare the next method invocation, and clean up before exiting the VM.
  3. If RETJ is executed in supervisor mode, the GM bit in SR will be cleared, thus re-enabling all interrupts. Attention! this is what the AVR32 Architecture Manual says on page 299 when describing the RETJ instruction and on page 13 in the GM-bit description. However, the JEM document on page 6 in section 2.1.1 "Entering Java mode" says: "If executed in supervisor mode, the GM flag in the status register is set." The same was suggested in a reply from AVR support. We still hope the former is right...
  4. To make it easier for the kernel to switch to a thread that was last preempted in JEM mode, one can set the J-bit manually in advance; then, when RETE or RETS is executed, execution in user-space continues in JEM mode.
  5. All Java method invocations are trapped, and have to be resolved in JVM. Notice two method invocation types: static and dynamic.
  6. Invoking a method is briefly described on page 8 of the Technical Reference.
  7. All method invocations in Java are implemented with different versions of the INVOKE opcode and cause different TRAP codes (see below).
  8. Leaving JEM is done by executing an instruction that is not supported by JEM, i.e., also through a trap.
  9. For trap processing, see chapter 2.5 of the Technical Reference.
  10. Trap handlers also run in application mode (the description in the document is vague; see the last sentence of chapter 2.5.1). The statement in paragraph 2 on page 2 of the JEM reference is thus wrong. (Note: the mtsr/mfsr instructions can be used to retrieve JOSP (java operand stack) and JECR (trap cause) in unprivileged mode.)
  11. If a CPU exception, for example an interrupt, occurs in Java mode, the J-bit in SR is cleared and the CPU jumps to the exception handler. The SR register is saved in the respective RSR_* register; this can then be used to detect that the exception happened in Java mode and to set the J-bit again before calling RETE or RETS.
  12. The following situations are possible:
    1. JEM caused a trap that has to be implemented in software. The handler function has to be executed, and then RETJ called;
    2. Java thread execution has been preempted by an interrupt while in JEM mode, and the next thread that gets scheduled is the preempted one: just set the J-bit (with the minimum necessary context restore) and proceed to RETE or RETS as usual;
    3. as above, but another thread is scheduled in that was last in JEM mode: save the current and restore the scheduled JEM context, set the J-bit;
    4. as above, but the other thread was not in JEM mode: save the current context, proceed as normal;
    5. PROBLEM: what if, after preempting a non-Java task, the scheduler decides to schedule a Java thread that was last preempted in JEM mode?
    6. 1st solution: teach the scheduler to switch in a special way to tasks that were last preempted in non-native mode;
    7. 2nd solution: ThumbEE uses a thread notifier, registered with thread_register_notifier(). There one can set the J-bit and just wait for RETE / RETS.
    8. necessary fields can be added to struct thread_info in include/asm-avr32/thread_info.h. See arch/arm/kernel/thumbee.c.
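
The stack-direction mismatch from item 1 above can be illustrated with a minimal sketch. It assumes the JamVM operand stack is an array of uintptr_t with the oldest entry at the lowest address, and that the JEM side is a 32-bit array filled from the highest address downwards. The function names and the exact pointer convention (down_base pointing one past the highest slot) are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Copy `depth` entries from an upward-growing stack (JamVM style: oldest
 * entry at up[0], top at up[depth - 1]) into a downward-growing one (JEM
 * style: oldest entry at the highest address). `down_base` points one past
 * the highest usable slot. Returns the new top of the downward stack.
 */
uint32_t *ostack_up_to_down(const uintptr_t *up, size_t depth,
                            uint32_t *down_base)
{
    uint32_t *top = down_base;

    assert(up != NULL && down_base != NULL);
    for (size_t i = 0; i < depth; i++)
        *--top = (uint32_t)up[i];   /* oldest entry lands highest */

    return top;
}

/* The reverse direction, for when control comes back from JEM. */
void ostack_down_to_up(const uint32_t *down_base, size_t depth,
                       uintptr_t *up)
{
    assert(up != NULL && down_base != NULL);
    for (size_t i = 0; i < depth; i++)
        up[i] = down_base[-1 - (ptrdiff_t)i];
}
```

Because the element order is reversed in memory, each entry has to be moved individually; a single memcpy() would leave the stack upside down.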

Java OPs that always cause TRAPs, except INVOKEs (some other OPs cause TRAPs only under certain conditions, see "Conditional TRAPs" below)

  1. 9: CHECKCAST_QUICK
  2. 10: INSTANCEOF_QUICK
  3. 11: GETSTATIC
  4. 12: PUTSTATIC
  5. 13: NEW_QUICK
  6. 19: ANEWARRAY_QUICK, MULTIANEWARRAY, MULTIANEWARRAY_QUICK, NEWARRAY
  7. 20: LCMP, LDIV, LMUL, LNEG, LREM, LSHL, LSHR, LUSHR
  8. 21: F2D, F2I, F2L, FADD, FCMP[GL], FDIV, FMUL, FNEG, FREM, FSUB, I2F, L2F
  9. 22: D2F, D2I, D2L, DADD, DCMP[GL], DDIV, DMUL, DNEG, DREM, DSUB, I2D, L2D
  10. 23: ANEWARRAY, ATHROW, CHECKCAST, DUP_X2, DUP2_X1, DUP2_X2, GETFIELD, GOTO_W, INSTANCEOF, LDC, LDC_W, LDC2_W, LOOKUPSWITCH, MONITORENTER, MONITOREXIT, NEW, PUTFIELD, TABLESWITCH, WIDE

Conditional TRAPs

  1. 8: IINC
  2. 0: AASTORE
  3. (incomplete)

INVOKE, JSR and RETURN OPs

  1. 5: ARETURN, FRETURN, IRETURN
  2. 6: RETURN
  3. 7: DRETURN, LRETURN
  4. 14: INVOKESTATIC
  5. 15: INVOKEINTERFACE_QUICK
  6. 16: INVOKEVIRTUAL_QUICK
  7. 17: INVOKESTATIC_QUICK
  8. 18: INVOKENONVIRTUAL_QUICK
  9. 23: INVOKEINTERFACE, INVOKESPECIAL, INVOKESUPER_QUICK, INVOKEVIRTUAL, INVOKEVIRTUAL_QUICK_W, INVOKEVIRTUALOBJECT_QUICK, JSR_W
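
For a JEM-abort reason decoder, the trap numbers from the three lists above could be put into a lookup table. The following is only a sketch: the function name jem_trap_group() is made up, the labels for traps 19 to 23 are our own summaries of the lists, and trap numbers not mentioned on this page simply map to NULL.

```c
#include <stddef.h>

/* Trap numbers and opcode groups as listed on this page; numbers that
 * the page does not mention map to NULL. */
const char *jem_trap_group(unsigned trap)
{
    static const char *const groups[24] = {
        [0]  = "AASTORE (conditional)",
        [5]  = "ARETURN, FRETURN, IRETURN",
        [6]  = "RETURN",
        [7]  = "DRETURN, LRETURN",
        [8]  = "IINC (conditional)",
        [9]  = "CHECKCAST_QUICK",
        [10] = "INSTANCEOF_QUICK",
        [11] = "GETSTATIC",
        [12] = "PUTSTATIC",
        [13] = "NEW_QUICK",
        [14] = "INVOKESTATIC",
        [15] = "INVOKEINTERFACE_QUICK",
        [16] = "INVOKEVIRTUAL_QUICK",
        [17] = "INVOKESTATIC_QUICK",
        [18] = "INVOKENONVIRTUAL_QUICK",
        [19] = "array allocation (NEWARRAY, ANEWARRAY_QUICK, MULTIANEWARRAY*)",
        [20] = "long arithmetic (LCMP, LDIV, LMUL, ...)",
        [21] = "float arithmetic and conversions",
        [22] = "double arithmetic and conversions",
        [23] = "remaining trapped opcodes (NEW, LDC, GETFIELD, INVOKEVIRTUAL, ...)",
    };

    return trap < 24 ? groups[trap] : NULL;
}
```

A trap handler dispatcher could index a table of function pointers in the same way.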

JamVM internals

At the same time we study the JamVM source code to identify places that have to be modified.

  1. architecture-specific code:
    1. src/arch/avr32.h
    2. src/os/linux/avr32
  2. ./configure invocation parameters:
     ./configure --target=avr32-linux --host=avr32-linux --build=i686-linux --without-x --without-alsa --disable-gtk-peer
     Note: without --enable-int-inlining, INLINING is not defined.
  3. Call-trace:
    1. jam.c::main()
    2. src/jam.h::executeStaticMethod()
    3. src/execute.c::executeMethodArgs()
    4. src/execute.c::executeMethodVaList()
    5. src/interp/engine/interp.c::executeJava()
  4. Functions that have to be modified to run on JEM:
    1. src/os/linux/avr32.bak/init.c::initialisePlatform() (if any global initialization is needed)
    2. executeJava() - a completely new alternative implementation for JEM
    3. initialiseJavaStack() (?)
  5. Prepare a Java frame: the information to put in the new frame shall be extracted from the compiler output (.class). This includes Incoming Arguments, space for Local Variables, Invoker's Method Context, Operand Stack. See parsing of the "Code" attribute in src/class.c::defineClass(). If the interpreter in src/interp/engine/interp.c detects a dynamic method invocation opcode, like INVOKEVIRTUAL, it calls resolveMethod(), which should, according to The Java Virtual Machine Specification, load all required classes recursively and initialize them. This also involves reading out the compiler-provided Operand Stack and Local Variable storage sizes.
  • JamVM interpreter engine
    • INTRO: "Rob's talk mostly concentrated on JamVM's implementation approach. It is what I would describe as a state of the art interpreter. It is direct threaded, it does stack caching (it has 2 cache registers and hence 3 versions of each opcode -- except on x86 where this is not very useful), and does dispatch prefetching. Apparently the stack cache yielded a 20-50% improvement, depending on the benchmark, especially in combination with the dispatch prefetching (also called interpreter pipelining or something like that). "
    • Dispatched interpreter design
      • switch dispatch
        In this method the VM interpreter contains a giant switch statement, with one case for each VM instruction. The VM instruction opcodes are represented by integers (e.g., produced by an enum) in the VM code, and dispatch occurs by loading the next opcode, switching on it, and continuing at the appropriate case; after executing the VM instruction, the VM interpreter jumps back to the dispatch code. (See VmGen document)
      • threaded code
        This method represents a VM instruction opcode by the address of the start of the machine code fragment for executing the VM instruction. Dispatch consists of loading this address, jumping to it, and incrementing the VM instruction pointer. Typically the threaded-code dispatch code is appended directly to the code for executing the VM instruction. Threaded code cannot be implemented in ANSI C, but it can be implemented using the GNU C compiler's labels-as-values extension (see VmGen document).
    • threaded interpreter
      • configure parameter : --enable-int-threading
      • CONTROL MACRO: THREADED
      • Implementation (TODOs):
        • L(opcode,level,label)
        • D(opcode,level,label)
        • I(opcode,level,label)
        • DEF_HANDLE_TABLES
        • DEF_HANDLE_TABLE
        • DEF_OPC*
    • direct threaded interpreter
      • configure parameter : --enable-int-threading --enable-int-direct
      • CONTROL MACRO: THREADED, DIRECT
    • traditional "switch-clause" interpreter (Trivial test)
      • configure parameter : --disable-int-threading --disable-int-direct

  • Hardware GC support
    • Object handler
    • JamVM's GC

  • JNI/JamVM's optimized native method invocation
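
To make the difference between switch dispatch and threaded dispatch described above concrete, here is a toy interpreter with both variants. It is not JamVM code: the three opcodes and the function names are invented for this sketch, there are no stack bounds checks, and run_threaded() needs the GNU C labels-as-values extension (GCC or Clang). The threaded variant looks handler addresses up in a table indexed by opcode; a direct-threaded interpreter would instead store the handler addresses in the code stream itself.

```c
#include <stddef.h>

enum { OP_PUSH, OP_ADD, OP_HALT };

/* Switch dispatch: one case per VM instruction, loop back after each. */
int run_switch(const int *code)
{
    int stack[16], *sp = stack;

    for (;;) {
        switch (*code++) {
        case OP_PUSH: *sp++ = *code++;       break;
        case OP_ADD:  sp--; sp[-1] += sp[0]; break;
        case OP_HALT: return sp[-1];
        default:      return -1;             /* unknown opcode */
        }
    }
}

/* Threaded dispatch: every handler ends by jumping straight to the
 * handler of the next instruction, without returning to a central loop. */
int run_threaded(const int *code)
{
    static void *const handlers[] = { &&do_push, &&do_add, &&do_halt };
    int stack[16], *sp = stack;

#define DISPATCH() goto *handlers[*code++]
    DISPATCH();
do_push: *sp++ = *code++;       DISPATCH();
do_add:  sp--; sp[-1] += sp[0]; DISPATCH();
do_halt: return sp[-1];
#undef DISPATCH
}
```

For the program { OP_PUSH, 2, OP_PUSH, 40, OP_ADD, OP_HALT } both functions return 42.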

Roadmap

The work can be broken into the following parts:

  1. General interface design - Java frame structure, status save / restore, etc.
  2. Kernel modifications:
    1. add a field to the struct thread_info,
    2. write a function, called on all entry paths, to check whether the kernel was entered from JEM mode and, if so, to set a flag in thread_info and save the context
    3. write a notifier (see above) to check the JEM status of the new thread, restore its context and set the J-bit if needed
  3. Kernel additions (if trap handlers execute in supervisor mode):
    1. write JEM trap handlers (this will slow down the interpreter, since each trap requires a context switch):
      • save context,
      • re-enable interrupts
      • collect necessary info for user-space trap-handlers
      • resume the trapping thread for trap-processing
  4. JamVM
    1. write a new executeJava()
    2. write a JEM-abort reason decoder
    3. write trap handlers
    4. implement JNI for JEM
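
The kernel part of the roadmap (flag in thread_info plus notifier) boils down to a small decision at every thread switch. The sketch below models only that decision as plain C, so that it can be tried in user space. The structure, the flag name preempted_in_jem and the action constants are all hypothetical; nothing here is existing kernel API.

```c
#include <stdbool.h>

/* Hypothetical per-thread bookkeeping; the roadmap proposes adding such a
 * field to struct thread_info, the name is made up for this sketch. */
struct jem_thread_state {
    bool preempted_in_jem;   /* set on kernel entry if the saved SR had J */
};

enum jem_switch_action {
    JEM_NOTHING      = 0,
    JEM_SAVE_PREV    = 1 << 0,  /* save the outgoing thread's JEM context    */
    JEM_RESTORE_NEXT = 1 << 1,  /* restore the incoming thread's JEM context */
    JEM_SET_J_BIT    = 1 << 2,  /* set J so that RETE / RETS resumes in JEM  */
};

/* Decide what has to happen when switching from `prev` to `next`. */
unsigned jem_switch_actions(const struct jem_thread_state *prev,
                            const struct jem_thread_state *next)
{
    unsigned actions = JEM_NOTHING;

    if (prev == next)   /* the preempted thread itself is resumed */
        return prev->preempted_in_jem ? JEM_SET_J_BIT : JEM_NOTHING;

    if (prev->preempted_in_jem)
        actions |= JEM_SAVE_PREV;
    if (next->preempted_in_jem)
        actions |= JEM_RESTORE_NEXT | JEM_SET_J_BIT;

    return actions;
}
```

The case of a non-Java thread being replaced by a thread last preempted in JEM mode yields JEM_RESTORE_NEXT | JEM_SET_J_BIT, which is the work a thread notifier would have to do.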

Entering JEM

/* Prepare a Java frame. The information to put in the new frame shall be
 * extracted from the compiler output (.class). This includes Incoming
 * Arguments, space for Local Variables, Invoker's Method Context, Operand
 * Stack. See parsing of the "Code" attribute in src/class.c::defineClass(). If
 * the interpreter in src/interp/engine/interp.c detects a dynamic method
 * invocation opcode, like INVOKEVIRTUAL, it calls resolveMethod(), which,
 * should, according to The Java Virtual Machine Specification, load all
 * required classes recursively, and initialize them. This also involves reading
 * out compiler provided Operand Stack and Local Variable storage sizes.
 */

u32 regs[12];

regs[0] = R0;
regs[1] = R1;
regs[2] = R2;
regs[3] = R3;
regs[4] = R4;
regs[5] = R5;
regs[6] = R6;
regs[7] = R7;
regs[8] = R8;
regs[9] = R9;
/* R10 seems to be preserved and not overwritten */
regs[10] = R11;
regs[11] = R12;

/* R8 must point to the class constant pool. However, JamVM ATM doesn't preserve
 * a pointer to the original constant pool, it just parses it into an internal
 * representation. The in-RAM copy of the class is then freed. We will have to
 * copy the pool into RAM. Most Java Opcodes, working with the Constant Pool,
 * are implemented in JEM using Traps. If this was the case with all such
 * operations, one wouldn't necessarily have to support the Constant Pool in R8,
 * however, some instructions are implemented natively, namely GETSTATIC_QUICK,
 * GETSTATIC2_QUICK, LDC_QUICK, LDC_W_QUICK, LDC2_W_QUICK, PUTSTATIC_QUICK,
 * PUTSTATIC2_QUICK */
R8 = &class->ConstantPool;

/* R9 shall point at the first local variable in the frame (JEM p7). The frame
 * pointer in Figure 2-1 on page 7 of JEM looks suspicious: it says, that if,
 * for example, the local variable block occupies memory from 0 to 0xff, then
 * LV0 is located at 0xfc (if 32 bit long), and r9 should contain 0xfc. Let's
 * just keep this in mind, but presume this is correct for now. 8-, 16-, and
 * 32-bit variables are stored in 32-bit Local Variable entries, whereas 64-bit
 * variables are stored in two consecutive such entries. Since all 64-bit
 * operations are trapped, we can just always set R9 to point to the last
 * 32-bit entry, just remember this when processing traps. */
R9 = &vars[mb->max_locals - 1];

LR = java_entry;

RETJ;
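
If R9 is set to &vars[mb->max_locals - 1] as above, trap handlers have to translate JEM local variable numbers into JamVM array indices. The helper below is a sketch under the assumption, drawn from the figure discussed in the comment and not verified on hardware, that JEM addresses local variable n at R9 - 4 * n. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Index into JamVM's vars[] array at which JEM would look for local
 * variable n, ASSUMING R9 = &vars[max_locals - 1] and that JEM local n
 * lives at R9 - 4 * n (32-bit entries, decreasing addresses). */
size_t jem_lvar_slot(size_t max_locals, size_t n)
{
    assert(n < max_locals);
    return max_locals - 1 - n;
}
```

With max_locals equal to 4, JEM local variable 0 would be vars[3] and local variable 3 would be vars[0].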

Current JamVM Frame / Stack layout (src/interp/engine/interp.c line 2222)

lvars
   uintptr_t vars[mb->max_locals];
Frame
   CodePntr last_pc;
   uintptr_t *lvars;
   uintptr_t *ostack;
   MethodBlock *mb;
   struct frame *prev;
ostack
   uintptr_t stack[mb->max_stack];
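
The layout above places the local variables, the Frame structure and the operand stack back to back. A minimal sketch of carving such a frame out of one block of memory follows. MethodBlock and CodePntr are reduced stand-ins for the real JamVM types, alloc_frame() is a hypothetical helper, and malloc() replaces JamVM's own Java stack only to keep the example self-contained.

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-ins for JamVM's types; only max_locals / max_stack matter here. */
typedef struct method_block { uint16_t max_locals, max_stack; } MethodBlock;
typedef const unsigned char *CodePntr;

typedef struct frame {
    CodePntr last_pc;
    uintptr_t *lvars;
    uintptr_t *ostack;
    MethodBlock *mb;
    struct frame *prev;
} Frame;

/* Lay out lvars, Frame and ostack in one contiguous allocation.
 * Returns NULL on allocation failure; release with free(frame->lvars). */
Frame *alloc_frame(MethodBlock *mb, Frame *prev)
{
    size_t bytes = mb->max_locals * sizeof(uintptr_t) + sizeof(Frame)
                 + mb->max_stack * sizeof(uintptr_t);
    uintptr_t *lvars = malloc(bytes);

    if (lvars == NULL)
        return NULL;

    Frame *frame = (Frame *)(lvars + mb->max_locals); /* right after lvars */
    frame->last_pc = NULL;
    frame->lvars   = lvars;
    frame->ostack  = (uintptr_t *)(frame + 1);        /* right after Frame */
    frame->mb      = mb;
    frame->prev    = prev;
    return frame;
}
```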

Java Operand Stack

JEM uses r0-r7 for the Operand Stack in Java mode, and the Top of Stack is always in r7. This means that when the first value is pushed onto the stack, it is written into r7. When the second value is pushed onto the stack, the stack is shifted, as when issuing the "incjosp 1" AVR32 instruction: the old value lands in r6, and the new one is put in r7 again, etc. When the eighth value is pushed onto the Operand Stack, the oldest value ends up in r0, and the newest one again in r7. The ninth value causes a Stack Overflow exception.
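
The register behaviour described above can be modelled in a few lines of C. This is a software model for illustration only: struct jem_regs and jem_push() are hypothetical names, and josp here simply counts the valid entries.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

struct jem_regs {
    uint32_t r[8];   /* r[0]..r[7] model registers r0..r7 */
    unsigned josp;   /* number of valid entries, 0..8 */
};

/* Push as described above: existing entries shift one register down
 * (r7 -> r6, r6 -> r5, ...) and the new value goes into r7.
 * Returns false where the hardware would raise a stack overflow. */
bool jem_push(struct jem_regs *s, uint32_t value)
{
    if (s->josp == 8)
        return false;

    memmove(&s->r[0], &s->r[1], 7 * sizeof s->r[0]);
    s->r[7] = value;
    s->josp++;
    return true;
}
```

After pushing the values 1 to 8, r7 holds 8 and r0 holds 1; a ninth push is rejected.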

Due to a bug in the AVR32 CPU, after every Java trap the Java Stack has to be cleared by issuing a suitable number of "incjosp -1" instructions. The Java enter / leave procedure can be sketched as follows:

   {r7..r0} = saved_JOS;
   while (saved_josp--)
      incjosp 1;
   retj;
   saved_josp = JOSP;
   while (JOSP--)
      incjosp -1;
   saved_JOS = {r7..r0};

If you have to push more elements on the stack during trap processing, for example when implementing the NEW and NEW_QUICK opcodes, you have to differentiate between two cases: there is space on the stack for one more element, or there isn't. In the former case you have to

  1. shift the saved stack in saved_JOS, e.g., using memmove
  2. increment saved_josp
In the latter case you have to
  1. free some space on the stack, e.g., a half, by copying values corresponding to registers r0, r1, r2, and r3 (in case of 4 stack positions to be freed) to the stack on the frame
  2. shift saved_JOS one element down
  3. put the new value in the freed element
  4. set saved_josp to 5

A procedure similar to the second case above has to be applied when processing stack overflow, and the reverse when processing stack underflow.
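
Both cases of pushing a value during trap processing can be sketched as one function. The structure below is hypothetical: reg[i] stands for the saved value of register r<i>, and spill stands for the in-memory operand stack area of the frame, filled oldest entry first. How the real frame stack is addressed is not modelled here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define JOS_REGS 8

struct saved_jos {
    uint32_t reg[JOS_REGS]; /* reg[i] is the saved r<i>; r7 is top of stack */
    unsigned josp;          /* number of valid register entries             */
    uint32_t *spill;        /* hypothetical in-memory stack in the frame    */
    size_t spilled;         /* entries already spilled, oldest first        */
};

void saved_jos_push(struct saved_jos *s, uint32_t value)
{
    if (s->josp == JOS_REGS) {
        /* No room left: move the four oldest entries (r0..r3) to memory. */
        memcpy(s->spill + s->spilled, &s->reg[0], 4 * sizeof s->reg[0]);
        s->spilled += 4;
        s->josp = 4;
    }

    /* Shift everything one register down and put the new value in r7. */
    memmove(&s->reg[0], &s->reg[1], (JOS_REGS - 1) * sizeof s->reg[0]);
    s->reg[JOS_REGS - 1] = value;
    s->josp++;              /* 5 after a spill, as in the list above */
}
```

The caller has to make sure that spill has room for four more entries before pushing onto a full register stack.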

JamVM objects vs. JEM objects

As probably all other Java VMs do, JamVM uses private headers to store internal information about all Java objects, and stores pointers to those headers on the stack, in local variables and on the heap. JEM, however, operates on Java objects directly and, logically, requires pointers to those in all the data it handles, including objects on the heap. At first we tried to preserve the original JamVM structure as much as possible, and attempted to convert object references between "header pointers" and "instance pointers" every time control passed from JamVM to JEM and back. This, however, proved to be very inefficient and error-prone. Therefore a decision has been made to convert JamVM to also use instance pointers everywhere. The affected code has been identified and converted and, as of 28.12.2008 (git tag jem-0.1), the JEM-based JamVM passes the "Hello, World!" test with this change. However, it is still possible that some further places in the code have not yet been identified and will need to be converted.

Attachment: 20080811-avr-kernel-jem_trap.gitdiff (2.5 K, 2008-12-28 21:47, GuennadiLiakhovetski): a minimal patch to the Linux kernel to set the Java Trap Base Address
r20 - 2009-05-07 - 10:10:38 - GuennadiLiakhovetski