pdk/docs/guide/dalvik.jd

page.title=Dalvik
pdk.version=1.0
@jd:body

<div id="qv-wrapper">
<div id="qv">
<h2>In this document</h2>
<a name="toc"/>
<ul>
<li><a href="#dalvikCoreLibraries">Core Libraries</a></li>
<li><a href="#dalvikJNICallBridge">JNI Call Bridge</a></li>
<li><a href="#dalvikInterpreter">Interpreter</a></li>
</ul>
</div>
</div>

<p>
The Dalvik virtual machine is intended to run on a variety of platforms.
The baseline system is expected to be a variant of UNIX (Linux, BSD, Mac
OS X) running the GNU C compiler.  Little-endian CPUs have been exercised
the most heavily, but big-endian systems are explicitly supported.
</p><p>
There are two general categories of work: porting to a Linux system
with a previously unseen CPU architecture, and porting to a different
operating system.  This document covers the former.
</p>


<a name="dalvikCoreLibraries"></a><h3>Core Libraries</h3>

<p>
The native code in the core libraries (chiefly <code>dalvik/libcore</code>,
but also <code>dalvik/vm/native</code>) is written in C/C++ and is expected
to work without modification in a Linux environment.  Much of the code
comes directly from the Apache Harmony project.
</p><p>
The core libraries pull in code from many other projects, including
OpenSSL, zlib, and ICU.  These will also need to be ported before the VM
can be used.
</p>


<a name="dalvikJNICallBridge"></a><h3>JNI Call Bridge</h3>

<p>
Most of the Dalvik VM runtime is written in portable C.  The one
non-portable component of the runtime is the JNI call bridge.  Simply put,
this converts an array of integers into function arguments of various
types, and calls a function.  This must be done according to the C calling
conventions for the platform.  The task could be as simple as pushing all
of the arguments onto the stack, or involve complex rules for register
assignment and stack alignment.
</p><p>
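To make the idea concrete, here is a minimal C sketch of such a conversion.
It assumes every argument is a 32-bit integer, supports at most three of
them, and leaves the platform's calling conventions to the C compiler; a real
bridge must also cope with floating-point and 64-bit values and the
platform's register and stack rules.  The names <code>toyInvoke</code>,
<code>ToyAnyFunc</code>, and <code>toyAdd3</code> are made up for this
example and are not part of the VM.

```c
#include <stdint.h>

/* Hypothetical function types for targets taking 0 to 3 int arguments. */
typedef int32_t (*ToyFunc0)(void);
typedef int32_t (*ToyFunc1)(int32_t);
typedef int32_t (*ToyFunc2)(int32_t, int32_t);
typedef int32_t (*ToyFunc3)(int32_t, int32_t, int32_t);

/* Generic function pointer used to carry the target function around. */
typedef void (*ToyAnyFunc)(void);

/*
 * Toy bridge: turn argc words from argv into real C arguments, call func,
 * and store the result.  Returns 0 on success, -1 for an unsupported argc.
 */
int toyInvoke(ToyAnyFunc func, int argc, const uint32_t* argv,
    int32_t* pReturn)
{
    switch (argc) {
    case 0:
        *pReturn = ((ToyFunc0) func)();
        return 0;
    case 1:
        *pReturn = ((ToyFunc1) func)((int32_t) argv[0]);
        return 0;
    case 2:
        *pReturn = ((ToyFunc2) func)((int32_t) argv[0], (int32_t) argv[1]);
        return 0;
    case 3:
        *pReturn = ((ToyFunc3) func)((int32_t) argv[0], (int32_t) argv[1],
                (int32_t) argv[2]);
        return 0;
    default:
        return -1;
    }
}

/* Example target: an ordinary C function with three int parameters. */
int32_t toyAdd3(int32_t a, int32_t b, int32_t c)
{
    return a + b + c;
}
```

With <code>argv</code> holding 1, 2, and 3, calling
<code>toyInvoke((ToyAnyFunc) toyAdd3, 3, argv, &amp;result)</code> stores 6
in <code>result</code>.
</p><p>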
To ease porting to new platforms, the <a href="http://sourceware.org/libffi/">
open-source FFI library</a> (Foreign Function Interface) is used when a
custom bridge is unavailable.  FFI is not as fast as a native implementation,
and the optional performance improvements it does offer are not used, so
writing a replacement is a good first step.
</p><p>
The code lives in <code>dalvik/vm/arch/*</code>, with the FFI-based version
in the "generic" directory.  There are two source files for each architecture.
One defines the call bridge itself:
</p><p><blockquote>
<code>void dvmPlatformInvoke(void* pEnv, ClassObject* clazz, int argInfo,
int argc, const u4* argv, const char* signature, void* func,
JValue* pReturn)</code>
</blockquote></p><p>
This will invoke a C/C++ function declared:
</p><p><blockquote>
    <code>return_type func(JNIEnv* pEnv, Object* this [, <i>args</i>])<br></code>
</blockquote>or (for a "static" method):<blockquote>
    <code>return_type func(JNIEnv* pEnv, ClassObject* clazz [, <i>args</i>])</code>
</blockquote></p><p>
The role of <code>dvmPlatformInvoke</code> is to pass the values in
<code>argv</code> according to the C calling conventions, call the method,
and then place the return value into <code>pReturn</code> (a union that holds
all of the basic JNI types).  The code may use the method signature
(a DEX "shorty" signature, with one character for the return type and one
per argument) to determine how to handle the values.
</p><p>
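As an illustration of the last step, the following C sketch stores a raw
result into a union slot selected by the first character of a shorty
signature.  <code>ToyJValue</code> and <code>toyStoreReturn</code> are
hypothetical names, the union is only a stand-in for the VM's
<code>JValue</code>, and floating-point and reference returns are left out
for brevity.

```c
#include <stdint.h>

/* Hypothetical stand-in for the VM's JValue union (integer types only). */
typedef union ToyJValue {
    uint8_t  z;     /* boolean */
    int8_t   b;     /* byte */
    uint16_t c;     /* char */
    int16_t  s;     /* short */
    int32_t  i;     /* int */
    int64_t  j;     /* long */
} ToyJValue;

/*
 * Store raw into the slot named by the shorty's return character, which is
 * the first character of the signature.  Returns 0 on success, -1 for a
 * return type this sketch does not handle.
 */
int toyStoreReturn(const char* shorty, int64_t raw, ToyJValue* pReturn)
{
    switch (shorty[0]) {
    case 'V':                                   /* void: nothing to store */
        return 0;
    case 'Z':
        pReturn->z = (uint8_t) (raw != 0);
        return 0;
    case 'B':
        pReturn->b = (int8_t) raw;
        return 0;
    case 'C':
        pReturn->c = (uint16_t) raw;
        return 0;
    case 'S':
        pReturn->s = (int16_t) raw;
        return 0;
    case 'I':
        pReturn->i = (int32_t) raw;
        return 0;
    case 'J':
        pReturn->j = raw;
        return 0;
    default:                                    /* F, D, L: not handled here */
        return -1;
    }
}
```

For the shorty <code>"IJL"</code> (an int return with a long and a
reference argument), the value lands in the <code>i</code> member.
</p><p>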
The other source file involved here defines a 32-bit "hint".  The hint
is computed when the method's class is loaded, and passed in as the
"argInfo" argument.  The hint can be used to avoid scanning the ASCII
method signature for things like the return value, total argument size,
or inter-argument 64-bit alignment restrictions.
</p>
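<p>
One way to picture such a hint is as a packed word computed once from the
shorty signature.  The bit layout below is invented for this sketch and is
not the layout used by any existing port; <code>toyComputeHint</code> and the
<code>TOY_*</code> names are likewise hypothetical.  The sketch counts
<code>J</code> (long) and <code>D</code> (double) as two 32-bit words and
everything else as one, and expects a non-empty shorty.

```c
#include <stdint.h>

/*
 * Hypothetical hint layout, chosen only for this sketch:
 *   bits 31..28  return type class
 *   bits 15..0   total argument size in 32-bit words
 */
enum {
    TOY_RETURN_VOID = 0,
    TOY_RETURN_32   = 1,
    TOY_RETURN_64   = 2
};

#define TOY_HINT_RETURN_SHIFT 28
#define TOY_HINT_ARG_MASK     0xffffu

/* Scan the shorty once and pack what the bridge needs into one word. */
uint32_t toyComputeHint(const char* shorty)
{
    uint32_t returnClass;
    uint32_t argWords = 0;
    const char* p;

    switch (shorty[0]) {
    case 'V':
        returnClass = TOY_RETURN_VOID;
        break;
    case 'J':
    case 'D':
        returnClass = TOY_RETURN_64;
        break;
    default:
        returnClass = TOY_RETURN_32;
        break;
    }

    /* Arguments start after the return character. */
    for (p = shorty + 1; *p != '\0'; p++) {
        argWords += (*p == 'J' || *p == 'D') ? 2 : 1;
    }

    return (returnClass << TOY_HINT_RETURN_SHIFT)
            | (argWords & TOY_HINT_ARG_MASK);
}

/* Accessors the bridge could use instead of rescanning the signature. */
uint32_t toyHintReturnClass(uint32_t hint)
{
    return hint >> TOY_HINT_RETURN_SHIFT;
}

uint32_t toyHintArgWords(uint32_t hint)
{
    return hint & TOY_HINT_ARG_MASK;
}
```

For the shorty <code>"JID"</code>, the hint reports a 64-bit return and
three argument words.
</p>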

<a name="dalvikInterpreter"></a><h3>Interpreter</h3>

<p>
The Dalvik runtime includes two interpreters, labeled "portable" and "fast".
The portable interpreter is largely contained within a single C function,
and should compile on any system that supports gcc.  (If you don't have gcc,
you may need to disable the "threaded" execution model, which relies on
gcc's "goto table" implementation; look for the THREADED_INTERP define.)
</p><p>
The fast interpreter uses hand-coded assembly fragments.  If none are
available for the current architecture, the build system will create an
interpreter out of C "stubs".  The resulting "all stubs" interpreter is
quite a bit slower than the portable interpreter, making "fast" something
of a misnomer.
</p><p>
The fast interpreter is enabled by default.  On platforms without native
support, you may want to switch to the portable interpreter.  This can
be controlled with the <code>dalvik.vm.execution-mode</code> system
property.  For example, if you run:
</p><p><blockquote>
<code>adb shell "echo dalvik.vm.execution-mode = int:portable >> /data/local.prop"</code>
</blockquote></p><p>
and reboot, the Android app framework will start the VM with the portable
interpreter enabled.
</p>


<h3>Mterp Interpreter Structure</h3>

<p>
There may be significant performance advantages to rewriting the
interpreter core in assembly language, using architecture-specific
optimizations.  In Dalvik this can be done one instruction at a time.
</p><p>
The simplest way to implement an interpreter is to have a large "switch"
statement.  After each instruction is handled, the interpreter returns to
the top of the loop, fetches the next instruction, and jumps to the
appropriate label.
</p><p>
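A switch-based interpreter can be sketched in a few lines of C.  The
four-opcode instruction set below is invented for illustration and has
nothing to do with Dalvik bytecode; <code>toySwitchInterp</code> and the
<code>TOY_OP_*</code> names are hypothetical.

```c
#include <stdint.h>

/* Invented opcodes for a toy accumulator machine. */
enum {
    TOY_OP_NOP    = 0,
    TOY_OP_INC    = 1,
    TOY_OP_DOUBLE = 2,
    TOY_OP_RETURN = 3
};

/* Run code until TOY_OP_RETURN; returns the accumulator, or -1 on a bad opcode. */
int32_t toySwitchInterp(const uint8_t* code, int32_t acc)
{
    const uint8_t* pc = code;

    for (;;) {
        /* Fetch the next opcode at the top of the loop, then dispatch. */
        switch (*pc++) {
        case TOY_OP_NOP:
            break;
        case TOY_OP_INC:
            acc += 1;
            break;
        case TOY_OP_DOUBLE:
            acc *= 2;
            break;
        case TOY_OP_RETURN:
            return acc;
        default:
            return -1;
        }
        /* Every handler ends by branching back to the top of the loop. */
    }
}
```

Running the program INC, NOP, DOUBLE, INC, RETURN with a starting
accumulator of 3 returns 9.
</p><p>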
An improvement on this is called "threaded" execution.  The instruction
fetch and dispatch are included at the end of every instruction handler.
This makes the interpreter a little larger overall, but you get to avoid
the (potentially expensive) branch back to the top of the switch statement.
</p><p>
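The same idea can be sketched with gcc's "labels as values" extension, which
provides the goto table.  The sketch assumes a compiler that supports that
extension (gcc does, and clang accepts it too), uses an invented four-opcode
instruction set, and performs no bounds check on the opcode;
<code>toyThreadedInterp</code> and <code>TOY_DISPATCH</code> are hypothetical
names.

```c
#include <stdint.h>

/* Toy opcodes: 0 = nop, 1 = increment, 2 = double, 3 = return. */
int32_t toyThreadedInterp(const uint8_t* code, int32_t acc)
{
    /* The goto table: one handler address per opcode (GNU C extension). */
    static void* const handlers[] = {
        &&op_nop, &&op_inc, &&op_double, &&op_return
    };
    const uint8_t* pc = code;

/* Fetch and dispatch, repeated at the end of every handler. */
#define TOY_DISPATCH() goto *handlers[*pc++]

    TOY_DISPATCH();

op_nop:
    TOY_DISPATCH();

op_inc:
    acc += 1;
    TOY_DISPATCH();

op_double:
    acc *= 2;
    TOY_DISPATCH();

op_return:
    return acc;

#undef TOY_DISPATCH
}
```

There is no loop: each handler jumps straight to the next one through the
table.
</p><p>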
Dalvik mterp goes one step further, using a computed goto instead of a goto
table.  Instead of looking up the address in a table, which requires an
extra memory fetch on every instruction, mterp multiplies the opcode number
by a fixed value.  By default, each handler is allowed 64 bytes of space.
</p><p>
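The address arithmetic amounts to a multiply and an add.  A small C sketch,
with the hypothetical names <code>toyHandlerAddress</code> and
<code>TOY_HANDLER_SIZE</code>, and with the base address treated as a plain
integer:

```c
#include <stdint.h>

/* Default handler size described in the text. */
enum { TOY_HANDLER_SIZE = 64 };

/* Address of the handler for opcode, given the address of handler 0. */
uintptr_t toyHandlerAddress(uintptr_t base, uint8_t opcode)
{
    return base + (uintptr_t) opcode * TOY_HANDLER_SIZE;
}
```

With a base of <code>0x1000</code>, opcode 2 maps to <code>0x1080</code>, and
256 handlers occupy 16 KB in total.  Because 64 is a power of two, the
multiply can be done with a shift.
</p><p>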
Not all handlers fit in 64 bytes.  Those that don't can have subroutines
or simply continue on to additional code outside the basic space.  Some of
this is handled automatically by Dalvik, but there's no portable way to detect
overflow of a 64-byte handler until the VM starts executing.
</p><p>
The choice of 64 bytes is somewhat arbitrary, but has worked out well for
ARM and x86.
</p><p>
In the course of development it's useful to have C and assembly
implementations of each handler, and be able to flip back and forth
between them when hunting problems down.  In mterp this is relatively
straightforward.  You can always see the files being fed to the compiler
and assembler for your platform by looking in the
<code>dalvik/vm/mterp/out</code> directory.
</p><p>
The interpreter sources live in <code>dalvik/vm/mterp</code>.  If you
haven't yet, you should read <code>dalvik/vm/mterp/README.txt</code> now.
</p>


<h3>Getting Started With Mterp</h3>

<p>
Getting started:
<ol>
<li>Decide on the name of your architecture.  For the sake of discussion,
let's call it <code>myarch</code>.
<li>Make a copy of <code>dalvik/vm/mterp/config-allstubs</code> to
<code>dalvik/vm/mterp/config-myarch</code>.
<li>Create a <code>dalvik/vm/mterp/myarch</code> directory to hold your
source files.
<li>Add <code>myarch</code> to the list in
<code>dalvik/vm/mterp/rebuild.sh</code>.
<li>Make sure <code>dalvik/vm/Android.mk</code> will find the files for
your architecture.  If <code>$(TARGET_ARCH)</code> is configured this
will happen automatically.
</ol>
</p><p>
You now have the basic framework in place.  Whenever you make a change, you
need to perform two steps: regenerate the mterp output, and build the
core VM library.  (It's two steps because we didn't want the build system
to require Python 2.5, which you will nonetheless need to have.)
<ol>
<li>In the <code>dalvik/vm/mterp</code> directory, regenerate the contents
of the files in <code>dalvik/vm/mterp/out</code> by executing
<code>./rebuild.sh</code>.  Note there are two files, one in C and one
in assembly.
<li>In the <code>dalvik</code> directory, regenerate the
<code>libdvm.so</code> library with <code>mm</code>.  You can also use
<code>make libdvm</code> from the top of the tree.
</ol>
</p><p>
This will leave you with an updated libdvm.so, which can be pushed out to
a device with <code>adb sync</code> or <code>adb push</code>.  If you're
using the emulator, you need to add <code>make snod</code> (System image,
NO Dependency check) to rebuild the system image file.  You should not
need to do a top-level "make" and rebuild the dependent binaries.
</p><p>
At this point you have an "all stubs" interpreter.  You can see how it
works by examining <code>dalvik/vm/mterp/cstubs/entry.c</code>.  The
code runs in a loop, pulling out the next opcode, and invoking the
handler through a function pointer.  Each handler takes a "glue" argument
that contains all of the useful state.
</p><p>
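The shape of such a loop can be sketched as follows.  This is not the
contents of <code>entry.c</code>; the <code>ToyGlue</code> structure, the
three-opcode instruction set, and all function names are invented for
illustration, and the opcode is not range-checked.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical "glue": all interpreter state a handler needs. */
typedef struct ToyGlue {
    const uint8_t* pc;      /* next instruction */
    int32_t acc;            /* toy accumulator */
    bool done;              /* set by the return handler */
} ToyGlue;

typedef void (*ToyHandler)(ToyGlue* glue);

/* Each handler updates the glue state and returns to the loop. */
void toyOpNop(ToyGlue* glue)
{
    glue->pc += 1;
}

void toyOpInc(ToyGlue* glue)
{
    glue->acc += 1;
    glue->pc += 1;
}

void toyOpReturn(ToyGlue* glue)
{
    glue->done = true;
}

/* Handler table indexed by opcode: 0 = nop, 1 = increment, 2 = return. */
static const ToyHandler toyHandlers[] = { toyOpNop, toyOpInc, toyOpReturn };

/* Pull out the next opcode and invoke its handler through a function pointer. */
int32_t toyStubInterp(const uint8_t* code, int32_t acc)
{
    ToyGlue glue = { code, acc, false };

    while (!glue.done) {
        uint8_t opcode = *glue.pc;
        toyHandlers[opcode](&glue);
    }
    return glue.acc;
}
```

Every instruction pays for a call, a return, and a trip around the loop,
which is part of why an all-stubs build is slow.
</p><p>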
Your goal is to replace the entry method, exit method, and each individual
instruction with custom implementations.  The first thing you need to do
is create an entry function that calls the handler for the first instruction.
After that, the instructions chain together, so you don't need a loop.
(Look at the ARM or x86 implementation to see how they work.)
</p><p>
Once you have that, you need something to jump to.  You can't branch
directly to the C stub because it's expecting to be called with a "glue"
argument and then return.  We need a C stub "wrapper" that does the
setup and jumps directly to the next handler.  We write this in assembly
and then add it to the config file definition.
</p><p>
To see how this works, create a file called
<code>dalvik/vm/mterp/myarch/stub.S</code> that contains one line:
<pre>
/* stub for ${opcode} */
</pre>
Then, in <code>dalvik/vm/mterp/config-myarch</code>, add this below the
<code>handler-size</code> directive:
<pre>
# source for the instruction table stub
asm-stub myarch/stub.S
</pre>
</p><p>
Regenerate the sources with <code>./rebuild.sh</code>, and take a look
inside <code>dalvik/vm/mterp/out/InterpAsm-myarch.S</code>.  You should
see 256 copies of the stub function in a single large block after the
<code>dvmAsmInstructionStart</code> label.  The <code>stub.S</code>
code will be used anywhere you don't provide an assembly implementation.
</p><p>
Note that each block begins with a <code>.balign 64</code> directive.
This is what pads each handler out to 64 bytes.  Note also that the
<code>${opcode}</code> text changed into an opcode name, which should
be used to call the C implementation (<code>dvmMterp_${opcode}</code>).
</p><p>
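The <code>${opcode}</code> replacement is plain text substitution.  The
sketch below mimics it in C purely for illustration; the real substitution
is performed by the generation step run from <code>rebuild.sh</code>, and
<code>toyExpandStub</code> is a hypothetical helper, not part of that
tooling.

```c
#include <string.h>

/*
 * Copy tmpl to out, replacing every "${opcode}" with the given opcode name.
 * Returns 0 on success, -1 if the result does not fit in outSize bytes.
 */
int toyExpandStub(const char* tmpl, const char* opcode, char* out,
    size_t outSize)
{
    static const char kToken[] = "${opcode}";
    const size_t tokenLen = sizeof(kToken) - 1;
    const size_t opLen = strlen(opcode);
    size_t used = 0;

    while (*tmpl != '\0') {
        const char* src;
        size_t len;

        if (strncmp(tmpl, kToken, tokenLen) == 0) {
            /* Found the placeholder: emit the opcode name instead. */
            src = opcode;
            len = opLen;
            tmpl += tokenLen;
        } else {
            /* Ordinary character: copy it through unchanged. */
            src = tmpl;
            len = 1;
            tmpl += 1;
        }

        /* Always leave room for the terminating NUL. */
        if (used + len >= outSize) {
            return -1;
        }
        memcpy(out + used, src, len);
        used += len;
    }

    if (outSize == 0) {
        return -1;
    }
    out[used] = '\0';
    return 0;
}
```

Expanding the one-line <code>stub.S</code> shown above with the opcode name
<code>OP_NOP</code> yields <code>/* stub for OP_NOP */</code>.
</p><p>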
The actual contents of <code>stub.S</code> are up to you to define.
See <code>entry.S</code> and <code>stub.S</code> in the <code>armv5te</code>
or <code>x86</code> directories for working examples.
</p><p>
If you're working on a variation of an existing architecture, you may be
able to use most of the existing code and just provide replacements for
a few instructions.  Look at the <code>armv4t</code> implementation as
an example.
</p>


<h3>Replacing Stubs</h3>

<p>
There are roughly 230 Dalvik opcodes, including some that are inserted by
<a href="dexopt.html">dexopt</a> and aren't described in the
<a href="dalvik-bytecode.html">Dalvik bytecode</a> documentation.  Each
one must perform the appropriate actions, fetch the next opcode, and
branch to the next handler.  The actions performed by the assembly version
must exactly match those performed by the C version (in
<code>dalvik/vm/mterp/c/OP_*</code>).
</p><p>
You can customize the set of "optimized" instructions for your platform,
because optimized DEX files are not expected to work on multiple devices.
Adding, removing, or redefining instructions is beyond the scope of this
document, and for simplicity it's best to stick with the basic set defined
by the portable interpreter.
</p><p>
Once you have written a handler that looks like it should work, add
it to the config file.  For example, suppose we have a working version
of <code>OP_NOP</code>.  For demonstration purposes, fake it for now by
putting this into <code>dalvik/vm/mterp/myarch/OP_NOP.S</code>:
<pre>
/* This is my NOP handler */
</pre>
</p><p>
Then, in the <code>op-start</code> section of <code>config-myarch</code>, add:
<pre>
    op OP_NOP myarch
</pre>
</p><p>
This tells the generation script to use the assembly version from the
<code>myarch</code> directory instead of the C version from the <code>c</code>
directory.
</p><p>
Execute <code>./rebuild.sh</code>.  Look at <code>InterpAsm-myarch.S</code>
and <code>InterpC-myarch.c</code> in the <code>out</code> directory.  You
will see that the <code>OP_NOP</code> stub wrapper has been replaced with our
new code in the assembly file, and the C stub implementation is no longer
included.
</p><p>
As you implement instructions, the C version and corresponding stub wrapper
will disappear from the output files.  Eventually you will have a 100%
assembly interpreter.
</p>


<h3>Interpreter Switching</h3>

<p>
The Dalvik VM actually includes a third interpreter implementation: the debug
interpreter.  This is a variation of the portable interpreter that includes
support for debugging and profiling.
</p><p>
When a debugger attaches, or a profiling feature is enabled, the VM
will switch interpreters at a convenient point.  This is done at the
same time as the GC safe point check: on a backward branch, a method
return, or an exception throw.  Similarly, when the debugger detaches
or profiling is discontinued, execution transfers back to the "fast" or
"portable" interpreter.
</p><p>
Your entry function needs to test the "entryPoint" value in the "glue"
pointer to determine where execution should begin.  Your exit function
will need to return a boolean that indicates whether the interpreter is
exiting (because we reached the "bottom" of a thread stack) or wants to
switch to the other implementation.
</p><p>
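The control flow around that boolean can be sketched in C.  Everything here
is hypothetical: the type and function names are invented, treating
<code>true</code> as "switch" and <code>false</code> as "exit" is an
assumption of the sketch, and the toy interpreters only record an entry
point where a real entry function would branch on it.

```c
#include <stdbool.h>

/* Hypothetical resume points, matching the three switch opportunities. */
typedef enum ToyEntryPoint {
    TOY_ENTRY_INSTR,        /* resume at an instruction */
    TOY_ENTRY_RETURN,       /* resume at a method return */
    TOY_ENTRY_THROW         /* resume at an exception throw */
} ToyEntryPoint;

typedef struct ToySwitchGlue {
    ToyEntryPoint entryPoint;   /* where the next interpreter should begin */
    bool debuggerAttached;
} ToySwitchGlue;

/* Returns true to request a switch, false when the interpreter is exiting. */
typedef bool (*ToyInterp)(ToySwitchGlue* glue);

/* Toy "fast" interpreter: hands off whenever a debugger is attached. */
bool toyFastInterp(ToySwitchGlue* glue)
{
    if (glue->debuggerAttached) {
        glue->entryPoint = TOY_ENTRY_INSTR;
        return true;
    }
    return false;
}

/* Toy "debug" interpreter: pretends the debugger detaches, then hands back. */
bool toyDebugInterp(ToySwitchGlue* glue)
{
    glue->debuggerAttached = false;
    glue->entryPoint = TOY_ENTRY_RETURN;
    return true;
}

/* Alternate between the two until one of them exits; count the switches. */
int toyRunUntilExit(ToyInterp fast, ToyInterp debug, ToySwitchGlue* glue)
{
    ToyInterp current = fast;
    int switches = 0;

    while (current(glue)) {
        current = (current == fast) ? debug : fast;
        switches++;
    }
    return switches;
}
```

Starting with a debugger attached, the sketch switches twice (fast to debug,
then back to fast) before exiting.
</p><p>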
See the <code>entry.S</code> file in <code>x86</code> or <code>armv5te</code>
for examples.
</p>


<h3>Testing</h3>

<p>
A number of VM tests can be found in <code>dalvik/tests</code>.  The most
useful during interpreter development is <code>003-omnibus-opcodes</code>,
which tests many different instructions.
</p><p>
The basic invocation is:
<pre>
$ cd dalvik/tests
$ ./run-test 003
</pre>
</p><p>
This will run test 003 on an attached device or emulator.  If you suspect
the test may be faulty, you can run it against your desktop VM by
specifying <code>--reference</code>.  You can also use
<code>--portable</code> and <code>--fast</code> to explicitly specify
one Dalvik interpreter or the other.
</p><p>
Some instructions are replaced by <code>dexopt</code>, notably when
"quickening" field accesses and method invocations.  To ensure
that you are testing the basic form of the instruction, add the
<code>--no-optimize</code> option.
</p><p>
There is no built-in instruction tracing mechanism.  If you want
to know for sure that your implementation of an opcode handler
is being used, the easiest approach is to insert a "printf"
call.  For an example, look at <code>common_squeak</code> in
<code>dalvik/vm/mterp/armv5te/footer.S</code>.
</p><p>
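On the C side, a trace call of this kind might look like the sketch below.
<code>toyTraceHandler</code> is a hypothetical helper, not something the VM
provides; it writes to <code>stderr</code> and keeps a hit count so that you
can tell how often a handler ran.

```c
#include <stdio.h>

/* Hypothetical trace helper: report that a handler ran and count the hits. */
int toyTraceHandler(const char* opcodeName, int* pHits)
{
    *pHits += 1;
    fprintf(stderr, "mterp: %s handler hit %d time(s)\n", opcodeName, *pHits);
    return *pHits;
}
```

An assembly handler would need a small wrapper that saves any live registers
before calling into C.
</p><p>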
At some point you need to ensure that debuggers and profiling work with
your interpreter.  The easiest way to do this is to simply connect a
debugger or toggle profiling.  (A future test suite may include some
tests for this.)
</p>


 353
 354