Introduction
============
-TCG (Tiny Code Generator) began as a generic backend for a C
-compiler. It was simplified to be used in QEMU. It also has its roots
-in the QOP code generator written by Paul Brook.
+TCG (Tiny Code Generator) began as a generic backend for a C compiler.
+It was simplified to be used in QEMU. It also has its roots in the
+QOP code generator written by Paul Brook.
Definitions
===========
-TCG receives RISC-like *TCG ops* and performs some optimizations on them,
-including liveness analysis and trivial constant expression
-evaluation. TCG ops are then implemented in the host CPU back end,
-also known as the TCG target.
-
-The TCG *target* is the architecture for which we generate the
-code. It is of course not the same as the "target" of QEMU which is
-the emulated architecture. As TCG started as a generic C backend used
-for cross compiling, it is assumed that the TCG target is different
-from the host, although it is never the case for QEMU.
+The TCG *target* is the architecture for which we generate the code.
+It is of course not the same as the "target" of QEMU, which is the
+emulated architecture. As TCG started as a generic C backend used
+for cross compiling, the assumption was that the TCG target might be
+different from the host, although this is never the case for QEMU.
In this document, we use *guest* to specify what architecture we are
emulating; *target* always means the TCG target, the machine on which
we are running QEMU.
-A TCG *function* corresponds to a QEMU Translated Block (TB).
-
-A TCG *temporary* is a variable only live in a basic block. Temporaries are allocated explicitly in each function.
-
-A TCG *local temporary* is a variable only live in a function. Local temporaries are allocated explicitly in each function.
-
-A TCG *global* is a variable which is live in all the functions
-(equivalent of a C global variable). They are defined before the
-functions defined. A TCG global can be a memory location (e.g. a QEMU
-CPU register), a fixed host register (e.g. the QEMU CPU state pointer)
-or a memory location which is stored in a register outside QEMU TBs
-(not implemented yet).
-
-A TCG *basic block* corresponds to a list of instructions terminated
-by a branch instruction.
-
An operation with *undefined behavior* may result in a crash.
An operation with *unspecified behavior* shall not crash. However,
the result may be one of several possibilities so may be considered
an *undefined result*.
-Intermediate representation
-===========================
+Basic Blocks
+============
-Introduction
-------------
+A TCG *basic block* is a single entry, multiple exit region which
+corresponds to a list of instructions terminated by a label, or
+any branch instruction.
-TCG instructions operate on variables which are temporaries, local
-temporaries or globals. TCG instructions and variables are strongly
-typed. Two types are supported: 32 bit integers and 64 bit
-integers. Pointers are defined as an alias to 32 bit or 64 bit
-integers depending on the TCG target word size.
+A TCG *extended basic block* is a single entry, multiple exit region
+which corresponds to a list of instructions terminated by a label or
+an unconditional branch. Specifically, an extended basic block is
+a sequence of basic blocks connected by the fall-through paths of
+zero or more conditional branch instructions.
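
As a rough illustration of the two definitions above, the following C sketch
counts basic blocks and extended basic blocks in a toy instruction list.
The ``ToyOp`` enum, the ``ToyBlockCount`` struct and ``toy_count_blocks``
are hypothetical names invented for this example and are not part of TCG;
the sketch only assumes that a label ends both kinds of block, that a
conditional branch ends a basic block, and that an unconditional branch
ends both.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical op kinds for this sketch; not TCG's real opcode set. */
typedef enum {
    TOY_OP_PLAIN,   /* ordinary op, e.g. an add */
    TOY_OP_LABEL,   /* branch target */
    TOY_OP_BRCOND,  /* conditional branch: execution may fall through */
    TOY_OP_BR       /* unconditional branch: no fall-through */
} ToyOp;

typedef struct {
    size_t basic_blocks;
    size_t extended_basic_blocks;
} ToyBlockCount;

static ToyBlockCount toy_count_blocks(const ToyOp *ops, size_t n)
{
    ToyBlockCount c = { 0, 0 };
    int bb_open = 0;   /* currently inside a basic block? */
    int ebb_open = 0;  /* currently inside an extended basic block? */

    assert(ops != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (ops[i] == TOY_OP_LABEL) {
            /* A label terminates whatever precedes it (single entry). */
            bb_open = 0;
            ebb_open = 0;
        }
        if (!bb_open) {
            c.basic_blocks++;
            bb_open = 1;
        }
        if (!ebb_open) {
            c.extended_basic_blocks++;
            ebb_open = 1;
        }
        if (ops[i] == TOY_OP_BRCOND) {
            /* Ends the basic block; the EBB continues on the fall-through. */
            bb_open = 0;
        } else if (ops[i] == TOY_OP_BR) {
            /* Ends both the basic block and the extended basic block. */
            bb_open = 0;
            ebb_open = 0;
        }
    }
    return c;
}
```

For the sequence plain, brcond, plain, br, label, plain, this model yields
three basic blocks but only two extended basic blocks, because the
conditional branch does not end the first extended basic block.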
-Each instruction has a fixed number of output variable operands, input
-variable operands and always constant operands.
+Operations
+==========
-The notable exception is the call instruction which has a variable
-number of outputs and inputs.
+TCG instructions or *ops* operate on TCG *variables*, both of which
+are strongly typed. Each instruction has a fixed number of output
+variable operands, input variable operands and constant operands.
+Vector instructions have a field specifying the element size within
+the vector. The notable exception is the call instruction which has
+a variable number of outputs and inputs.
In the textual form, output operands usually come first, followed by
input operands, followed by constant operands. The output type is
add_i32 t0, t1, t2 /* (t0 <- t1 + t2) */
+Variables
+=========
-Assumptions
------------
+* ``TEMP_FIXED``
-Basic blocks
-^^^^^^^^^^^^
+ There is one TCG *fixed global* variable, ``cpu_env``, which is
+ live in all translation blocks, and holds a pointer to ``CPUArchState``.
+ This variable is held in a host CPU register at all times in all
+ translation blocks.
-* Basic blocks end after branches (e.g. brcond_i32 instruction),
- goto_tb and exit_tb instructions.
+* ``TEMP_GLOBAL``
-* Basic blocks start after the end of a previous basic block, or at a
- set_label instruction.
+ A TCG *global* is a variable which is live in all translation blocks,
+ and corresponds to a memory location that is within ``CPUArchState``.
+ These may be specified as an offset from ``cpu_env``, in which case
+ they are called *direct globals*, or may be specified as an offset
+ from a direct global, in which case they are called *indirect globals*.
+ Even indirect globals should still reference memory within
+ ``CPUArchState``. All TCG globals are defined during
+ ``TCGCPUOps.initialize``, before any translation blocks are generated.
-After the end of a basic block, the content of temporaries is
-destroyed, but local temporaries and globals are preserved.
+* ``TEMP_CONST``
-Floating point types
-^^^^^^^^^^^^^^^^^^^^
+ A TCG *constant* is a variable which is live throughout the entire
+ translation block, and contains a constant value. These variables
+ are allocated on demand during translation and are hashed so that
+ there is exactly one variable holding a given value.
-* Floating point types are not supported yet
+* ``TEMP_TB``
-Pointers
-^^^^^^^^
+ A TCG *translation block temporary* is a variable which is live
+ throughout the entire translation block, but dies on any exit.
+ These temporaries are allocated explicitly during translation.
-* Depending on the TCG target, pointer size is 32 bit or 64
- bit. The type ``TCG_TYPE_PTR`` is an alias to ``TCG_TYPE_I32`` or
- ``TCG_TYPE_I64``.
+* ``TEMP_EBB``
-Helpers
-^^^^^^^
+ A TCG *extended basic block temporary* is a variable which is live
+ throughout an extended basic block, but dies on any exit.
+ These temporaries are allocated explicitly during translation.
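
The ``TEMP_CONST`` entry above says that constants are hashed so that each
value is held by exactly one variable. A minimal sketch of that interning
idea follows. It uses a small fixed-size open-addressing table, which is an
assumption made for this example and is not a description of TCG's actual
data structure; ``ToyConstPool``, ``toy_pool_init`` and ``toy_constant`` are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_TABLE_SIZE 64  /* hypothetical capacity; must stay at 64 slots */

typedef struct {
    int64_t value[TOY_TABLE_SIZE];
    int temp_index[TOY_TABLE_SIZE];  /* -1 marks an empty slot */
    int next_temp;                   /* index handed to the next new constant */
    int used;                        /* number of occupied slots */
} ToyConstPool;

static void toy_pool_init(ToyConstPool *p)
{
    for (size_t i = 0; i < TOY_TABLE_SIZE; i++) {
        p->temp_index[i] = -1;
    }
    p->next_temp = 0;
    p->used = 0;
}

/* Return the index of the single toy variable holding 'val'. */
static int toy_constant(ToyConstPool *p, int64_t val)
{
    /* Multiplicative hash; the top 6 bits select one of the 64 slots. */
    size_t slot = (size_t)(((uint64_t)val * 0x9E3779B97F4A7C15ull) >> 58);

    while (p->temp_index[slot] != -1) {
        if (p->value[slot] == val) {
            return p->temp_index[slot];  /* already interned: reuse it */
        }
        slot = (slot + 1) & (TOY_TABLE_SIZE - 1);  /* linear probing */
    }

    /* Keep one slot free so that the probe loop always terminates. */
    assert(p->used < TOY_TABLE_SIZE - 1);
    p->value[slot] = val;
    p->temp_index[slot] = p->next_temp++;
    p->used++;
    return p->temp_index[slot];
}
```

Requesting the same value twice returns the same index, so code that
compares variable identities can treat equal constants as the same variable.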
+
+Types
+=====
+
+* ``TCG_TYPE_I32``
+
+ A 32-bit integer.
+
+* ``TCG_TYPE_I64``
+
+ A 64-bit integer. For 32-bit hosts, such variables are split into a pair
+ of variables with ``type=TCG_TYPE_I32`` and ``base_type=TCG_TYPE_I64``.
+ The ``temp_subindex`` for each indicates where it falls within the
+ host-endian representation.
-* Using the tcg_gen_helper_x_y it is possible to call any function
- taking i32, i64 or pointer types. By default, before calling a helper,
- all globals are stored at their canonical location and it is assumed
- that the function can modify them. By default, the helper is allowed to
- modify the CPU state or raise an exception.
+* ``TCG_TYPE_PTR``
- This can be overridden using the following function modifiers:
+ An alias for ``TCG_TYPE_I32`` or ``TCG_TYPE_I64``, depending on the size
+ of a pointer for the host.
- - ``TCG_CALL_NO_READ_GLOBALS`` means that the helper does not read globals,
- either directly or via an exception. They will not be saved to their
- canonical locations before calling the helper.
+* ``TCG_TYPE_REG``
- - ``TCG_CALL_NO_WRITE_GLOBALS`` means that the helper does not modify any globals.
- They will only be saved to their canonical location before calling helpers,
- but they won't be reloaded afterwards.
+ An alias for ``TCG_TYPE_I32`` or ``TCG_TYPE_I64``, depending on the size
+ of the integer registers for the host. This may be larger
+ than ``TCG_TYPE_PTR`` depending on the host ABI.
- - ``TCG_CALL_NO_SIDE_EFFECTS`` means that the call to the function is removed if
- the return value is not used.
+* ``TCG_TYPE_I128``
- Note that ``TCG_CALL_NO_READ_GLOBALS`` implies ``TCG_CALL_NO_WRITE_GLOBALS``.
+ A 128-bit integer. For all hosts, such variables are split into a number
+ of variables with ``type=TCG_TYPE_REG`` and ``base_type=TCG_TYPE_I128``.
+ The ``temp_subindex`` for each indicates where it falls within the
+ host-endian representation.
- On some TCG targets (e.g. x86), several calling conventions are
- supported.
+* ``TCG_TYPE_V64``
-Branches
-^^^^^^^^
+ A 64-bit vector. This type is valid only if the TCG target
+ sets ``TCG_TARGET_HAS_v64``.
-* Use the instruction 'br' to jump to a label.
+* ``TCG_TYPE_V128``
+
+ A 128-bit vector. This type is valid only if the TCG target
+ sets ``TCG_TARGET_HAS_v128``.
+
+* ``TCG_TYPE_V256``
+
+ A 256-bit vector. This type is valid only if the TCG target
+ sets ``TCG_TARGET_HAS_v256``.
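
The ``TCG_TYPE_I64`` and ``TCG_TYPE_I128`` entries above describe wide values
being split into register-sized pieces ordered by their position in the
host-endian representation. The sketch below illustrates this for a 64-bit
value on a hypothetical 32-bit host. It assumes that the piece with
subindex ``k`` is the ``k``-th 32-bit word of the value as laid out in host
memory; ``ToySplitI64``, ``toy_split_i64`` and ``toy_low_subindex`` are
names invented for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model: part[k] plays the role of the piece whose
 * subindex is k, i.e. the k-th 32-bit word in host memory order.
 */
typedef struct {
    uint32_t part[2];
} ToySplitI64;

static ToySplitI64 toy_split_i64(uint64_t v)
{
    ToySplitI64 s;

    assert(sizeof(s.part) == sizeof(v));
    /* Copying the bytes preserves the host-endian layout by construction. */
    memcpy(s.part, &v, sizeof(v));
    return s;
}

/* Detect host byte order at run time without relying on compiler macros. */
static int toy_host_is_big_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first_byte;

    memcpy(&first_byte, &probe, 1);
    return first_byte == 0;
}

/* Subindex of the piece that holds the low 32 bits of the value. */
static int toy_low_subindex(void)
{
    return toy_host_is_big_endian() ? 1 : 0;
}
```

On a little-endian host the low half has subindex 0; on a big-endian host it
has subindex 1. Code that needs "the low part" therefore has to pick the
subindex according to the host byte order, as ``toy_low_subindex`` does here.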
+
+Helpers
+=======
+
+Helpers are registered in a guest-specific ``helper.h``,
+which is processed to generate ``tcg_gen_helper_*`` functions.
+With these functions it is possible to call a function taking
+i32, i64, i128 or pointer types.
+
+By default, before calling a helper, all globals are stored at their
+canonical location. The helper is also allowed, by default, to modify
+the CPU state (including the state represented by TCG globals) or to
+raise an exception. These defaults can be overridden using the
+following function modifiers:
+
+* ``TCG_CALL_NO_WRITE_GLOBALS``
+
+ The helper does not modify any globals, but may read them.
+ Globals will be saved to their canonical location before calling the
+ helper, but need not be reloaded afterwards.
+
+* ``TCG_CALL_NO_READ_GLOBALS``
+
+ The helper does not read globals, either directly or via an exception.
+ They will not be saved to their canonical locations before calling
+ the helper. This implies ``TCG_CALL_NO_WRITE_GLOBALS``.
+
+* ``TCG_CALL_NO_SIDE_EFFECTS``
+
+ The call to the helper function may be removed if the return value is
+ not used. This means that the helper must not modify any CPU state,
+ nor may it raise an exception.
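
The effect of the three modifiers can be summarised as a small decision
function. The sketch below is a toy model of the rules stated above, not
TCG's implementation: the ``TOY_*`` bit values, ``ToyCallPlan`` and
``toy_plan_call`` are hypothetical, and the real flag values should be taken
from the TCG headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values, chosen only for this sketch. */
enum {
    TOY_NO_READ_GLOBALS  = 1 << 0,
    TOY_NO_WRITE_GLOBALS = 1 << 1,
    TOY_NO_SIDE_EFFECTS  = 1 << 2
};

typedef struct {
    bool save_globals_before;   /* store globals to canonical location */
    bool reload_globals_after;  /* assume the helper may have changed them */
    bool removable_if_unused;   /* call may be dropped if result is dead */
} ToyCallPlan;

static ToyCallPlan toy_plan_call(unsigned flags)
{
    ToyCallPlan plan;
    const unsigned known = TOY_NO_READ_GLOBALS | TOY_NO_WRITE_GLOBALS
                         | TOY_NO_SIDE_EFFECTS;

    assert((flags & ~known) == 0);

    /* A helper that cannot read globals cannot write them either. */
    if (flags & TOY_NO_READ_GLOBALS) {
        flags |= TOY_NO_WRITE_GLOBALS;
    }

    plan.save_globals_before = (flags & TOY_NO_READ_GLOBALS) == 0;
    plan.reload_globals_after = (flags & TOY_NO_WRITE_GLOBALS) == 0;
    plan.removable_if_unused = (flags & TOY_NO_SIDE_EFFECTS) != 0;
    return plan;
}
```

With no flags set the model saves and reloads globals and keeps the call.
Setting only the no-read flag suppresses both the save and the reload, which
reflects the implication noted in the list above.
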
Code Optimizations
-------------------
+==================
When generating instructions, you can count on at least the following
optimizations:
often modified, e.g. the integer registers and the condition
codes. TCG will be able to use host registers to store them.
-- Avoid globals stored in fixed registers. They must be used only to
- store the pointer to the CPU state and possibly to store a pointer
- to a register window.
-
-- Use temporaries. Use local temporaries only when really needed,
- e.g. when you need to use a value after a jump. Local temporaries
- introduce a performance hit in the current TCG implementation: their
- content is saved to memory at end of each basic block.
-
-- Free temporaries and local temporaries when they are no longer used
- (tcg_temp_free). Since tcg_const_x() also creates a temporary, you
- should free it after it is used. Freeing temporaries does not yield
- a better generated code, but it reduces the memory usage of TCG and
- the speed of the translation.
+- Free temporaries when they are no longer used (``tcg_temp_free``).
+ Since ``tcg_const_x`` also creates a temporary, you should free it
+ after it is used.
- Don't hesitate to use helpers for complicated or seldom used guest
instructions. There is little performance advantage in using TCG to
the instruction is mostly doing loads and stores, and in those cases
inline TCG may still be faster for longer sequences.
-- The hard limit on the number of TCG instructions you can generate
- per guest instruction is set by ``MAX_OP_PER_INSTR`` in ``exec-all.h`` --
- you cannot exceed this without risking a buffer overrun.
-
- Use the 'discard' instruction if you know that TCG won't be able to
prove that a given global is "dead" at a given program point. The
x86 guest uses it to improve the condition codes optimisation.