The Java Accelerator
Java is becoming an increasingly important standard in the embedded and mobile systems space. The main benefits of using Java in this application area are:
- Binary portability.
- Small binary size - Useful for distribution over a network (wireless / low bandwidth).
- Well defined security model.
A problem with all Java implementations is achieving satisfactory run-time performance. Depending on the application domain, this can be measured as a weighted difference in execution speed and memory requirements relative to natively compiled code. Penalties in these areas arise from the translation of Java byte-code to native code that must take place during execution. Typically, techniques such as just-in-time (JIT) compilation improve execution speed at the expense of using much more memory than simpler interpreters.
Research in the APT group aims to improve the execution speed of Java programs in embedded systems with low memory requirements by adding hardware support for Java execution. The scope for hardware Java support is currently being investigated at the processor pipeline level. The aim is to reduce the memory footprint and execution time needed to run a given piece of Java code, using a relatively small amount of extra hardware that translates Java binaries to native code. Reducing memory accesses and processor cycles will also reduce power requirements, which suits the specific needs of embedded Java-enabled systems.
Although processor pipeline level Java accelerators already exist in the marketplace, work in Manchester is looking at novel translation mechanisms and the possibility of exploiting asynchronous design. The central idea is that a single pipeline stage can perform mostly simple translation steps interspersed with more time-consuming optimisation steps. Because an asynchronous design style delivers average-case performance, this can be achieved within a clean and simple (single-step) architecture, and should therefore result in a small, low-power design.
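The average-case argument can be illustrated with a little arithmetic. The sketch below is not a model of the actual design: the step latencies are invented, and it assumes the synchronous alternative is a single-step stage whose clock period must cover the slowest step (a multi-cycle synchronous stage would behave differently).

```python
def synchronous_time(latencies):
    # Assumption: one clock period per step, sized for the slowest step.
    return len(latencies) * max(latencies)

def asynchronous_time(latencies):
    # Each step completes as soon as its own work is done.
    return sum(latencies)

# Hypothetical stream: mostly simple translations (1 time unit each)
# with one slower optimisation step (5 time units).
steps = [1, 1, 1, 5, 1, 1, 1, 1]

print(synchronous_time(steps), asynchronous_time(steps))
```

Under these assumptions the occasional slow step costs the asynchronous stage only its own extra latency, whereas the single-step synchronous stage pays for it on every step.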
JASPA - Java Aware SPA
JASPA is a Java-enabled version of the SPA processor developed within the APT group. The Java decoder is a simple prototype, based on translating frequently used single Java byte-codes into ARM instructions. Byte-codes that the hardware does not translate are handled in software. The hardware also takes the interpreter loop out of software and classifies all byte-codes, saving further time over a software-only implementation. Around 80 simple byte-codes are handled directly in hardware, improving speed by a factor of 4-8 over a software interpreter. Because the Java instruction set is stack based, 4 registers are used as a stack cache to facilitate translation; local variables, however, were left uncached.
Like the SPA processor, the Java decoder was described in Balsa, as an initial feasibility study. The architecture is shown in the diagram below. The synthesised circuit comprised around 40,000 transistors in single-rail bundled-data technology and around 80,000 when synthesised in dual-rail QDI.
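The translation scheme described above can be sketched in software as follows. This is an illustrative model only, not the Balsa description: the opcode values are the standard JVM ones, but the register names r0-r3, the `sw_handler_` label, and the choice to fall back to software when the 4-entry stack cache overflows or underflows are all assumptions made for the example.

```python
# Standard JVM opcode values for a few simple byte-codes.
ICONST_0, ICONST_5 = 0x03, 0x08
ALU_OPS = {0x60: "ADD", 0x64: "SUB", 0x7E: "AND", 0x80: "ORR", 0x82: "EOR"}

CACHE_SIZE = 4  # four registers act as the stack cache (r0-r3 assumed)

def translate(bytecodes):
    """Map a sequence of simple byte-codes to ARM-style instruction strings."""
    depth = 0  # number of operand-stack entries currently held in registers
    out = []
    for op in bytecodes:
        if ICONST_0 <= op <= ICONST_5 and depth < CACHE_SIZE:
            # Push a small constant into the next free cache register.
            out.append(f"MOV r{depth}, #{op - ICONST_0}")
            depth += 1
        elif op in ALU_OPS and depth >= 2:
            # Combine the top two entries; the result replaces the lower one.
            dst, src = depth - 2, depth - 1
            out.append(f"{ALU_OPS[op]} r{dst}, r{dst}, r{src}")
            depth -= 1
        else:
            # Hypothetical fallback: branch to a software handler.
            # How the handler resynchronises the cache is not modelled here.
            out.append(f"BL sw_handler_{op:#04x}")
    return out

# iconst_1, iconst_2, iadd
print(translate([0x04, 0x05, 0x60]))
```

Each handled byte-code becomes a single register-to-register instruction, which is the effect the stack cache is meant to achieve; a byte-code such as `new` (0xbb) falls through to the software path.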
Current Research
Current research concerns novel hardware support techniques for binary translation. The work focuses on simulating different processor pipelines and Java translator designs, and on evaluating different strategies for partitioning the JVM between software and hardware.