docs/GlobalISel.rst

   1 ============================
   2 Global Instruction Selection
   3 ============================
   4
   5 .. contents::
   6    :local:
   7    :depth: 1
   8
   9 .. warning::
  10    This document is a work in progress.  It reflects the current state of the
  11    implementation, as well as open design and implementation issues.
  12
  13 Introduction
  14 ============
  15
  16 GlobalISel is a framework that provides a set of reusable passes and utilities
  17 for instruction selection --- translation from LLVM IR to target-specific
  18 Machine IR (MIR).
  19
  20 GlobalISel is intended to be a replacement for SelectionDAG and FastISel, to
  21 solve three major problems:
  22
  23 * **Performance** --- SelectionDAG introduces a dedicated intermediate
  24   representation, which has a compile-time cost.
  25
  26   GlobalISel directly operates on the post-isel representation used by the
  27   rest of the code generator, MIR.
  28   It does require extensions to that representation to support arbitrary
  29   incoming IR: :ref:`gmir`.
  30
  31 * **Granularity** --- SelectionDAG and FastISel operate on individual basic
  32   blocks, losing some global optimization opportunities.
  33
  34   GlobalISel operates on the whole function.
  35
  36 * **Modularity** --- SelectionDAG and FastISel are radically different and share
  37   very little code.
  38
  39   GlobalISel is built in a way that enables code reuse. For instance, both the
  40   optimized and fast selectors share the :ref:`pipeline`, and targets can
  41   configure that pipeline to better suit their needs.
  42
  43
  44 .. _gmir:
  45
  46 Generic Machine IR
  47 ==================
  48
  49 Machine IR operates on physical registers, register classes, and (mostly)
  50 target-specific instructions.
  51
  52 To bridge the gap with LLVM IR, GlobalISel introduces "generic" extensions to
  53 Machine IR:
  54
  55 .. contents::
  56    :local:
  57
  58 ``NOTE``:
  59 The generic MIR (GMIR) representation still contains references to IR
  60 constructs (such as ``GlobalValue``).  Removing those should let us write more
  61 accurate tests, or delete IR after building the initial MIR.  However, it is
  62 not part of the GlobalISel effort.
  63
  64 .. _gmir-instructions:
  65
  66 Generic Instructions
  67 --------------------
  68
  69 The main addition is support for pre-isel generic machine instructions (e.g.,
  70 ``G_ADD``).  Like other target-independent instructions (e.g., ``COPY`` or
  71 ``PHI``), these are available on all targets.
  72
  73 ``TODO``:
  74 While we're progressively adding instructions, one kind in particular exposes
  75 interesting problems: compares and how to represent condition codes.
  76 Some targets (x86, ARM) have generic comparisons setting multiple flags,
  77 which are then used by predicated variants.
  78 Others (IR) specify the predicate in the comparison and users just get a single
  79 bit.  SelectionDAG uses SETCC/CONDBR vs BR_CC (and similar for select) to
  80 represent this.
  81
  82 The ``MachineIRBuilder`` class wraps the ``MachineInstrBuilder`` and provides
  83 a convenient way to create these generic instructions.
  84
  85 .. _gmir-gvregs:
  86
  87 Generic Virtual Registers
  88 -------------------------
  89
  90 Generic instructions operate on a new kind of register: "generic" virtual
  91 registers.  As opposed to non-generic vregs, they are not assigned a Register
  92 Class.  Instead, generic vregs have a :ref:`gmir-llt`, and can be assigned
  93 a :ref:`gmir-regbank`.
  94
  95 ``MachineRegisterInfo`` tracks the same information that it does for
  96 non-generic vregs (e.g., use-def chains).  Additionally, it also tracks the
  97 :ref:`gmir-llt` of the register, and, instead of the ``TargetRegisterClass``,
  98 its :ref:`gmir-regbank`, if any.
  99
 100 For simplicity, most generic instructions only accept generic vregs:
 101
 102 * instead of immediates, they use a gvreg defined by an instruction
 103   materializing the immediate value (see :ref:`irtranslator-constants`).
 104 * instead of physical registers, they use a gvreg defined by a ``COPY``.
 105
 106 ``NOTE``:
 107 We started with an alternative representation, where MRI tracks a size for
 108 each gvreg, and instructions have lists of types.
 109 That had two flaws: the type and size are redundant, and there was no generic
 110 way of getting a given operand's type (as there was no 1:1 mapping between
 111 instruction types and operands).
 112 We considered putting the type in some variant of MCInstrDesc instead (see
 113 `PR26576 <http://llvm.org/PR26576>`_: [GlobalISel] Generic MachineInstrs
 114 need a type), but this increases the memory footprint of the related objects.
 115
 116 .. _gmir-regbank:
 117
 118 Register Bank
 119 -------------
 120
 121 A Register Bank is a set of register classes defined by the target.
 122 A bank has a size, which is the maximum store size of all covered classes.
 123
 124 In general, cross-class copies inside a bank are expected to be cheaper than
 125 copies across banks.  They are also coalesceable by the register coalescer,
 126 whereas cross-bank copies are not.
 127
 128 Also, equivalent operations can be performed on different banks using different
 129 instructions.
 130
 131 For example, X86 can be seen as having 3 main banks: general-purpose, x87, and
 132 vector (which could be further split into a bank per domain for single vs
 133 double precision instructions).
 134
 135 Register banks are described by a target-provided API,
 136 :ref:`RegisterBankInfo <api-registerbankinfo>`.
 137
 138 .. _gmir-llt:
 139
 140 Low Level Type
 141 --------------
 142
 143 Additionally, every generic virtual register has a type, represented by an
 144 instance of the ``LLT`` class.
 145
 146 Like ``EVT``/``MVT``/``Type``, it has no distinction between unsigned and signed
 147 integer types.  Furthermore, it also has no distinction between integer and
 148 floating-point types: it mainly conveys absolutely necessary information, such
 149 as size and number of vector lanes:
 150
 151 * ``sN`` for scalars
 152 * ``pN`` for pointers
 153 * ``<N x sM>`` for vectors
 154 * ``unsized`` for labels, etc.
 155
 156 ``LLT`` is intended to replace the usage of ``EVT`` in SelectionDAG.
 157
 158 Here are some LLT examples and their ``EVT`` and ``Type`` equivalents:
 159
 160    =============  =========  ======================================
 161    LLT            EVT        IR Type
 162    =============  =========  ======================================
 163    ``s1``         ``i1``     ``i1``
 164    ``s8``         ``i8``     ``i8``
 165    ``s32``        ``i32``    ``i32``
 166    ``s32``        ``f32``    ``float``
 167    ``s17``        ``i17``    ``i17``
 168    ``s16``        N/A        ``{i8, i8}``
 169    ``s32``        N/A        ``[4 x i8]``
 170    ``p0``         ``iPTR``   ``i8*``, ``i32*``, ``%opaque*``
 171    ``p2``         ``iPTR``   ``i8 addrspace(2)*``
 172    ``<4 x s32>``  ``v4f32``  ``<4 x float>``
 173    ``s64``        ``v1f64``  ``<1 x double>``
 174    ``<3 x s32>``  ``v3i32``  ``<3 x i32>``
 175    ``unsized``    ``Other``  ``label``
 176    =============  =========  ======================================
 177
 178
 179 Rationale: instructions already encode a specific interpretation of types
 180 (e.g., ``add`` vs. ``fadd``, or ``sdiv`` vs. ``udiv``).  Also encoding that
 181 information in the type system requires introducing bitcast with no real
 182 advantage for the selector.
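
     As a rough illustration of this rationale, the following self-contained
     sketch models a low-level type with a hypothetical ``ToyLLT`` struct.  It is
     not the real ``LLT`` class and uses no LLVM headers; it only records a kind,
     a size, and a lane count, so ``i32`` and ``float`` both collapse to ``s32``.

       .. code-block:: c++

         #include <cassert>
         #include <string>

         // Hypothetical stand-in for LLT: records only kind, size, and lanes.
         struct ToyLLT {
           enum Kind { Scalar, Pointer, Vector } K;
           unsigned SizeOrAddrSpace; // size in bits, or the address space
           unsigned NumElts;         // number of lanes; 1 for non-vectors

           static ToyLLT scalar(unsigned Bits) { return {Scalar, Bits, 1}; }
           static ToyLLT pointer(unsigned AS) { return {Pointer, AS, 1}; }
           // As in the table above, a 1-element vector becomes its scalar.
           static ToyLLT vector(unsigned Elts, unsigned Bits) {
             return Elts == 1 ? scalar(Bits) : ToyLLT{Vector, Bits, Elts};
           }

           bool operator==(const ToyLLT &O) const {
             return K == O.K && SizeOrAddrSpace == O.SizeOrAddrSpace &&
                    NumElts == O.NumElts;
           }

           std::string str() const {
             switch (K) {
             case Scalar:
               return "s" + std::to_string(SizeOrAddrSpace);
             case Pointer:
               return "p" + std::to_string(SizeOrAddrSpace);
             case Vector:
               return "<" + std::to_string(NumElts) + " x s" +
                      std::to_string(SizeOrAddrSpace) + ">";
             }
             return "unsized"; // not reached for the three kinds above
           }
         };

     In this model, both ``i32`` and ``float`` would be ``ToyLLT::scalar(32)``,
     and ``<1 x double>`` would be ``ToyLLT::vector(1, 64)``, which compares equal
     to ``ToyLLT::scalar(64)``.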
 183
 184 Pointer types are distinguished by address space.  This matches IR, as opposed
 185 to SelectionDAG where address space is an attribute on operations.
 186 This representation better supports pointers having different sizes depending
 187 on their address space.
 188
 189 ``NOTE``:
 190 Currently, LLT requires at least 2 elements in vectors, but some targets have
 191 the concept of a '1-element vector'.  Representing them as their underlying
 192 scalar type is a nice simplification.
 193
 194 ``TODO``:
 195 Currently, non-generic virtual registers, defined by non-pre-isel-generic
 196 instructions, cannot have a type, and thus cannot be used by a pre-isel generic
 197 instruction.  Instead, they are given a type using a COPY.  We could relax that
 198 and allow types on all vregs: this would reduce the number of MI required when
 199 emitting target-specific MIR early in the pipeline.  This should purely be
 200 a compile-time optimization.
 201
 202 .. _pipeline:
 203
 204 Core Pipeline
 205 =============
 206
 207 There are four required passes, regardless of the optimization mode:
 208
 209 .. contents::
 210    :local:
 211
 212 Additional passes can then be inserted at higher optimization levels or for
 213 specific targets. For example, to match the current SelectionDAG set of
 214 transformations: MachineCSE and a better MachineCombiner between every pass.
 215
 216 ``NOTE``:
 217 In theory, not all passes are always necessary.
 218 As an additional compile-time optimization, we could skip some of the passes by
 219 setting the relevant MachineFunction properties.  For instance, if the
 220 IRTranslator did not encounter any illegal instruction, it would set the
 221 ``legalized`` property to avoid running the :ref:`milegalizer`.
 222 Similarly, we considered specializing the IRTranslator per-target to directly
 223 emit target-specific MI.
 224 However, we instead decided to keep the core pipeline simple, and focus on
 225 minimizing the overhead of the passes in the no-op cases.
 226
 227
 228 .. _irtranslator:
 229
 230 IRTranslator
 231 ------------
 232
 233 This pass translates the input LLVM IR ``Function`` to a GMIR
 234 ``MachineFunction``.
 235
 236 ``TODO``:
 237 This currently doesn't support the more complex instructions, in particular
 238 those involving control flow (``switch``, ``invoke``, ...).
 239 For ``switch`` in particular, we can initially use the ``LowerSwitch`` pass.
 240
 241 .. _api-calllowering:
 242
 243 API: CallLowering
 244 ^^^^^^^^^^^^^^^^^
 245
 246 The ``IRTranslator`` (using the ``CallLowering`` target-provided utility) also
 247 implements the ABI's calling convention by lowering calls, returns, and
 248 arguments to the appropriate physical register usage and instruction sequences.
 249
 250 .. _irtranslator-aggregates:
 251
 252 Aggregates
 253 ^^^^^^^^^^
 254
 255 Aggregates are lowered to a single scalar vreg.
 256 This differs from SelectionDAG's multiple vregs via ``GetValueVTs``.
 257
 258 ``TODO``:
 259 As some of the bits are undef (padding), we should consider augmenting the
 260 representation with additional metadata (in effect, caching computeKnownBits
 261 information on vregs).
 262 See `PR26161 <http://llvm.org/PR26161>`_: [GlobalISel] Value to vreg during
 263 IR to MachineInstr translation for aggregate type
 264
 265 .. _irtranslator-constants:
 266
 267 Constant Lowering
 268 ^^^^^^^^^^^^^^^^^
 269
 270 The ``IRTranslator`` lowers ``Constant`` operands into uses of gvregs defined
 271 by ``G_CONSTANT`` or ``G_FCONSTANT`` instructions.
 272 Currently, these instructions are always emitted in the entry basic block.
 273 In a ``MachineFunction``, each ``Constant`` is materialized by a single gvreg.
 274
 275 This is beneficial as it allows us to fold constants into immediate operands
 276 during :ref:`instructionselect`, while still avoiding redundant materializations
 277 for expensive non-foldable constants.
 278 However, this can lead to unnecessary spills and reloads in an -O0 pipeline, as
 279 these vregs can have long live ranges.
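
     The "one gvreg per ``Constant``" behavior can be pictured as a small cache.
     The sketch below is a toy model under our own assumptions: ``ToyConstantCache``
     and its members are hypothetical names, and constants are plain integers
     rather than IR ``Constant`` objects.

       .. code-block:: c++

         #include <cassert>
         #include <cstddef>
         #include <cstdint>
         #include <map>
         #include <vector>

         class ToyConstantCache {
           std::map<int64_t, unsigned> VRegForConstant;
           std::vector<int64_t> EntryBlock; // constants materialized so far
           unsigned NextVReg = 0;

         public:
           // Returns the vreg holding Value, emitting a new definition in the
           // (modelled) entry block only on first use.
           unsigned getOrCreateVReg(int64_t Value) {
             auto It = VRegForConstant.find(Value);
             if (It != VRegForConstant.end())
               return It->second;
             EntryBlock.push_back(Value);
             unsigned VReg = NextVReg++;
             VRegForConstant[Value] = VReg;
             return VReg;
           }
           std::size_t numMaterialized() const { return EntryBlock.size(); }
         };

         // Translates uses of the constants 7, 42, and 7 again, and reports
         // how many definitions were emitted.
         std::size_t exampleMaterializations() {
           ToyConstantCache Cache;
           unsigned A = Cache.getOrCreateVReg(7);
           unsigned B = Cache.getOrCreateVReg(42);
           unsigned C = Cache.getOrCreateVReg(7); // reuses the vreg of A
           assert(A == C && A != B);
           return Cache.numMaterialized();
         }

     The second use of ``7`` reuses the first definition, which is what keeps
     materializations unique but also what stretches the live range.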
 280
 281 ``TODO``:
 282 We're investigating better placement of these instructions, in fast and
 283 optimized modes.
 284
 285
 286 .. _milegalizer:
 287
 288 Legalizer
 289 ---------
 290
 291 This pass transforms the generic machine instructions such that they are legal.
 292
 293 A legal instruction is defined as:
 294
 295 * **selectable** --- the target will later be able to select it to a
 296   target-specific (non-generic) instruction.
 297
 298 * operating on **vregs that can be loaded and stored** --- if necessary, the
 299   target can select a ``G_LOAD``/``G_STORE`` of each gvreg operand.
 300
 301 As opposed to SelectionDAG, there are no legalization phases.  In particular,
 302 'type' and 'operation' legalization are not separate.
 303
 304 Legalization is iterative, and all state is contained in GMIR.  To maintain the
 305 validity of the intermediate code, instructions are introduced:
 306
 307 * ``G_MERGE_VALUES`` --- concatenate multiple registers of the same
 308   size into a single wider register.
 309
 310 * ``G_UNMERGE_VALUES`` --- extract multiple registers of the same size
 311   from a single wider register.
 312
 313 * ``G_EXTRACT`` --- extract a simple register (as contiguous sequences of bits)
 314   from a single wider register.
 315
 316 As they are expected to be temporary byproducts of the legalization process,
 317 they are combined at the end of the :ref:`milegalizer` pass.
 318 If any remain, they are expected to always be selectable, using loads and stores
 319 if necessary.
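
     To illustrate how these instructions keep the intermediate code valid, the
     sketch below models narrowing a 64-bit add into 32-bit operations on plain
     integers.  The helper names (``unmerge64``, ``merge32``, ``narrowedAdd64``)
     are hypothetical, and the carry handling is our assumption of what such an
     expansion could look like, not the legalizer's actual output.

       .. code-block:: c++

         #include <cassert>
         #include <cstdint>
         #include <utility>

         // Models G_UNMERGE_VALUES: split an s64 into (low, high) s32 halves.
         std::pair<uint32_t, uint32_t> unmerge64(uint64_t V) {
           return {static_cast<uint32_t>(V), static_cast<uint32_t>(V >> 32)};
         }

         // Models G_MERGE_VALUES: concatenate two s32 halves into one s64.
         uint64_t merge32(uint32_t Lo, uint32_t Hi) {
           return (static_cast<uint64_t>(Hi) << 32) | Lo;
         }

         // A 64-bit add expressed only with 32-bit arithmetic.
         uint64_t narrowedAdd64(uint64_t A, uint64_t B) {
           std::pair<uint32_t, uint32_t> AParts = unmerge64(A);
           std::pair<uint32_t, uint32_t> BParts = unmerge64(B);
           uint32_t Lo = AParts.first + BParts.first;
           // Unsigned wrap-around of the low half signals a carry.
           uint32_t Carry = Lo < AParts.first ? 1u : 0u;
           uint32_t Hi = AParts.second + BParts.second + Carry;
           return merge32(Lo, Hi);
         }

     For example, ``narrowedAdd64(0xFFFFFFFF, 1)`` carries out of the low half and
     yields ``0x100000000``, matching the wide addition.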
 320
 321 .. _api-legalizerinfo:
 322
 323 API: LegalizerInfo
 324 ^^^^^^^^^^^^^^^^^^
 325
 326 Currently the API is broadly similar to SelectionDAG/TargetLowering, but
 327 extended in two ways:
 328
 329 * The set of available actions is wider, avoiding the currently very
 330   overloaded ``Expand`` (which can cover everything from libcalls to
 331   scalarization depending on the node's opcode).
 332
 333 * Since there's no separate type legalization, independently varying
 334   types on an instruction can have independent actions. For example, a
 335   ``G_ICMP`` has 2 independent types: the result and the inputs; we need
 336   to be able to say that comparing 2 s32s is OK, but the s1 result
 337   must be dealt with in another way.
 338
 339 As such, the primary key when deciding what to do is the ``InstrAspect``,
 340 essentially a tuple consisting of ``(Opcode, TypeIdx, Type)`` and mapping to a
 341 suggested course of action.
 342
 343 An example use might be:
 344
 345   .. code-block:: c++
 346
 347     // The CPU can't deal with an s1 result, do something about it.
 348     setAction({G_ICMP, 0, s1}, WidenScalar);
 349     // An s32 input (the second type) is fine though.
 350     setAction({G_ICMP, 1, s32}, Legal);
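
     The lookup behind such calls can be modelled as a map keyed by the aspect
     tuple.  The following is a self-contained toy, not the ``LegalizerInfo``
     implementation: ``ToyLegalizerInfo``, ``ToyAspect``, the string-typed types,
     and the ``Unsupported`` default are all assumptions made for illustration.

       .. code-block:: c++

         #include <cassert>
         #include <map>
         #include <string>
         #include <tuple>

         enum ToyOpcode { G_ADD, G_ICMP };
         enum ToyAction { Legal, WidenScalar, NarrowScalar, Unsupported };

         // Hypothetical model of an InstrAspect: (Opcode, TypeIdx, Type).
         using ToyAspect = std::tuple<ToyOpcode, unsigned, std::string>;

         class ToyLegalizerInfo {
           std::map<ToyAspect, ToyAction> Actions;

         public:
           void setAction(const ToyAspect &Aspect, ToyAction Action) {
             Actions[Aspect] = Action;
           }
           // Aspects with no recorded action are reported as Unsupported;
           // that default is an assumption of this sketch.
           ToyAction getAction(const ToyAspect &Aspect) const {
             auto It = Actions.find(Aspect);
             return It == Actions.end() ? Unsupported : It->second;
           }
         };

         // Builds the table from the example above.
         ToyLegalizerInfo makeExampleInfo() {
           ToyLegalizerInfo LI;
           LI.setAction(ToyAspect(G_ICMP, 0u, "s1"), WidenScalar);
           LI.setAction(ToyAspect(G_ICMP, 1u, "s32"), Legal);
           return LI;
         }

     Because the type index is part of the key, the result type and the input
     type of the same ``G_ICMP`` resolve to different actions.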
 351
 352
 353 ``TODO``:
 354 An alternative worth investigating is to generalize the API to represent
 355 actions using ``std::function`` that implements the action, instead of explicit
 356 enum tokens (``Legal``, ``WidenScalar``, ...).
 357
 358 ``TODO``:
 359 Moreover, we could use TableGen to initially infer legality of operation from
 360 existing patterns (as any pattern we can select is by definition legal).
 361 Expanding that to describe legalization actions is a much larger but
 362 potentially useful project.
 363
 364 .. _milegalizer-non-power-of-2:
 365
 366 Non-power of 2 types
 367 ^^^^^^^^^^^^^^^^^^^^
 368
 369 ``TODO``:
 370 Types which have a size that isn't a power of 2 aren't currently supported.
 371 The setAction API will probably require changes to support them.
 372 Even explicitly specified actions are only suggestions, such as "Widen"
 373 or "Narrow".  The eventual type is left unspecified, and is found by a
 374 search that repeatedly doubles or halves the type's
 375 size.
 376 This is incorrect for types that aren't a power of 2.  It's reasonable to
 377 expect we could construct an efficient set of side-tables for more general
 378 lookups though, encoding a map from the integers (i.e. the size of the current
 379 type) to types (the legal size).
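
     The difference between the two lookups can be sketched as follows.  This is
     a toy model: the set of legal sizes (32 and 64 bits) and the function names
     are hypothetical, chosen only to show where doubling goes wrong.

       .. code-block:: c++

         #include <algorithm>
         #include <cassert>
         #include <vector>

         // Sorted scalar sizes (in bits) that a hypothetical target supports.
         static const std::vector<unsigned> LegalSizes = {32, 64};

         static bool isLegalSize(unsigned Bits) {
           return std::binary_search(LegalSizes.begin(), LegalSizes.end(),
                                     Bits);
         }

         // Search by repeated doubling; returns 0 if no legal size is hit.
         unsigned widenByDoubling(unsigned Bits) {
           for (unsigned S = Bits; S != 0 && S <= LegalSizes.back(); S *= 2)
             if (isLegalSize(S))
               return S;
           return 0;
         }

         // Side-table lookup: the smallest legal size that can hold Bits,
         // or 0 if there is none.
         unsigned widenByTable(unsigned Bits) {
           auto It = std::lower_bound(LegalSizes.begin(), LegalSizes.end(),
                                      Bits);
           return It == LegalSizes.end() ? 0 : *It;
         }

     Starting from ``s8``, doubling reaches ``s32``.  Starting from ``s17``, it
     visits 34 and 68 and never lands on a legal size, whereas the table lookup
     returns 32.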
 380
 381 .. _milegalizer-vector:
 382
 383 Vector types
 384 ^^^^^^^^^^^^
 385
 386 Vectors first get their element type legalized: ``<A x sB>`` becomes
 387 ``<A x sC>`` such that at least one operation is legal with ``sC``.
 388
 389 This is currently specified by the function ``setScalarInVectorAction``, called
 390 for example as:
 391
       .. code-block:: c++

 392     setScalarInVectorAction(G_ICMP, s1, WidenScalar);
 393
 394 Next the number of elements is chosen so that the entire operation is
 395 legal. This aspect is not controllable at the moment, but probably
 396 should be (you could imagine disagreements on whether a ``<2 x s8>``
 397 operation should be scalarized or extended to ``<8 x s8>``).
 398
 399
 400 .. _regbankselect:
 401
 402 RegBankSelect
 403 -------------
 404
 405 This pass constrains the :ref:`gmir-gvregs` operands of generic
 406 instructions to some :ref:`gmir-regbank`.
 407
 408 It iteratively maps instructions to a set of per-operand bank assignments.
 409 The possible mappings are determined by the target-provided
 410 :ref:`RegisterBankInfo <api-registerbankinfo>`.
 411 The mapping is then applied, possibly introducing ``COPY`` instructions if
 412 necessary.
 413
 414 It traverses the ``MachineFunction`` top down so that all operands are already
 415 mapped when analyzing an instruction.
 416
 417 This pass could also remap target-specific instructions when beneficial.
 418 In the future, this could replace the ExeDepsFix pass, as we can directly
 419 select the best variant for an instruction that's available on multiple banks.
 420
 421 .. _api-registerbankinfo:
 422
 423 API: RegisterBankInfo
 424 ^^^^^^^^^^^^^^^^^^^^^
 425
 426 The ``RegisterBankInfo`` class describes multiple aspects of register banks.
 427
 428 * **Banks**: ``addRegBankCoverage`` --- which register bank covers each
 429   register class.
 430
 431 * **Cross-Bank Copies**: ``copyCost`` --- the cost of a ``COPY`` from one bank
 432   to another.
 433
 434 * **Default Mapping**: ``getInstrMapping`` --- the default bank assignments for
 435   a given instruction.
 436
 437 * **Alternative Mapping**: ``getInstrAlternativeMapping`` --- the other
 438   possible bank assignments for a given instruction.
 439
 440 ``TODO``:
 441 All this information should eventually be static and generated by TableGen,
 442 mostly using existing information augmented by bank descriptions.
 443
 444 ``TODO``:
 445 ``getInstrMapping`` is currently separate from ``getInstrAlternativeMapping``
 446 because the latter is more expensive: as we move to static mapping info,
 447 both methods should be free, and we should merge them.
 448
 449 .. _regbankselect-modes:
 450
 451 RegBankSelect Modes
 452 ^^^^^^^^^^^^^^^^^^^
 453
 454 ``RegBankSelect`` currently has two modes:
 455
 456 * **Fast** --- For each instruction, pick a target-provided "default" bank
 457   assignment.  This is the default at -O0.
 458
 459 * **Greedy** --- For each instruction, pick the cheapest of several
 460   target-provided bank assignment alternatives.
 461
 462 We intend to eventually introduce an additional optimizing mode:
 463
 464 * **Global** --- Across multiple instructions, pick the cheapest combination of
 465   bank assignments.
 466
 467 ``NOTE``:
 468 On AArch64, we are considering using the Greedy mode even at -O0 (or perhaps at
 469 backend -O1):  because :ref:`gmir-llt` doesn't distinguish floating point from
 470 integer scalars, the default assignment for loads and stores is the integer
 471 bank, introducing cross-bank copies on most floating point operations.
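
     The trade-off between the two existing modes can be sketched with a toy cost
     model.  Everything below is hypothetical: the bank names, the cost of 4 for a
     cross-bank ``COPY``, and the ``pickFast``/``pickGreedy`` helpers are
     assumptions for illustration, not the ``RegBankSelect`` implementation.

       .. code-block:: c++

         #include <cassert>
         #include <vector>

         enum ToyBank { GPR, FPR };

         // One candidate mapping: the bank the instruction would operate on,
         // and the cost of the instruction itself on that bank.
         struct ToyMapping {
           ToyBank Bank;
           unsigned Cost;
         };

         // Hypothetical cost of a COPY between banks (free within a bank).
         unsigned copyCost(ToyBank From, ToyBank To) {
           return From == To ? 0 : 4;
         }

         // Fast mode: always take the first (default) mapping.
         ToyBank pickFast(const std::vector<ToyMapping> &Alternatives) {
           return Alternatives.front().Bank;
         }

         // Greedy mode: take the mapping that is cheapest once the copy from
         // the bank of the already-mapped operand is included.
         ToyBank pickGreedy(const std::vector<ToyMapping> &Alternatives,
                            ToyBank OperandBank) {
           const ToyMapping *Best = &Alternatives.front();
           for (const ToyMapping &M : Alternatives)
             if (M.Cost + copyCost(OperandBank, M.Bank) <
                 Best->Cost + copyCost(OperandBank, Best->Bank))
               Best = &M;
           return Best->Bank;
         }

         // A store that can be performed from either bank, defaulting to GPR.
         std::vector<ToyMapping> storeAlternatives() {
           return {{GPR, 1}, {FPR, 1}};
         }

     For a store whose value was defined on the floating-point bank, the fast
     choice is the integer bank and needs a cross-bank copy, while the greedy
     choice stays on the floating-point bank.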
 472
 473
 474 .. _instructionselect:
 475
 476 InstructionSelect
 477 -----------------
 478
 479 This pass transforms generic machine instructions into equivalent
 480 target-specific instructions.  It traverses the ``MachineFunction`` bottom-up,
 481 selecting uses before definitions, enabling trivial dead code elimination.
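
     The following sketch shows why a bottom-up walk makes dead code elimination
     trivial.  It is a toy model assuming a single basic block of straight-line
     code; ``ToyInstr`` and ``selectBottomUp`` are hypothetical names, and
     "selecting" an instruction is reduced to recording its name.

       .. code-block:: c++

         #include <cassert>
         #include <set>
         #include <string>
         #include <vector>

         struct ToyInstr {
           std::string Name;
           int Def;               // vreg defined, or -1 if none
           std::vector<int> Uses; // vregs read
           bool HasSideEffects;   // e.g., a store or a return
         };

         // Walks the instructions bottom-up and returns the names of those
         // that would be selected, in program order.
         std::vector<std::string>
         selectBottomUp(const std::vector<ToyInstr> &Instrs) {
           std::set<int> LiveVRegs;
           std::vector<std::string> Selected;
           for (auto It = Instrs.rbegin(); It != Instrs.rend(); ++It) {
             bool IsDead =
                 !It->HasSideEffects && LiveVRegs.count(It->Def) == 0;
             if (IsDead)
               continue; // all users were visited; none needs this def
             LiveVRegs.insert(It->Uses.begin(), It->Uses.end());
             Selected.insert(Selected.begin(), It->Name);
           }
           return Selected;
         }

         // %0 = const; %1 = add %0, %0 (unused); %2 = mul %0, %0; store %2
         std::vector<ToyInstr> exampleBlock() {
           return {{"const", 0, {}, false},
                   {"add", 1, {0, 0}, false},
                   {"mul", 2, {0, 0}, false},
                   {"store", -1, {2}, true}};
         }

     When the walk reaches ``add``, every possible user has already been visited
     and none of them reads ``%1``, so the instruction is simply skipped.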
 482
 483 .. _api-instructionselector:
 484
 485 API: InstructionSelector
 486 ^^^^^^^^^^^^^^^^^^^^^^^^
 487
 488 The target implements the ``InstructionSelector`` class, containing the
 489 target-specific selection logic proper.
 490
 491 The instance is provided by the subtarget, so that it can specialize the
 492 selector by subtarget feature (with, e.g., a vector selector overriding parts
 493 of a general-purpose common selector).
 494 We might also want to parameterize it by MachineFunction, to enable selector
 495 variants based on function attributes like optsize.
 496
 497 The simple API consists of:
 498
 499   .. code-block:: c++
 500
 501     virtual bool select(MachineInstr &MI)
 502
 503 This target-provided method is responsible for mutating (or replacing) a
 504 possibly-generic MI into a fully target-specific equivalent.
 505 It is also responsible for doing the necessary constraining of gvregs into the
 506 appropriate register classes as well as passing through COPY instructions to
 507 the register allocator.
 508
 509 The ``InstructionSelector`` can fold other instructions into the selected MI,
 510 by walking the use-def chain of the vreg operands.
 511 As GlobalISel is Global, this folding can occur across basic blocks.
 512
 513 SelectionDAG Rule Imports
 514 ^^^^^^^^^^^^^^^^^^^^^^^^^
 515
 516 TableGen will import SelectionDAG rules and provide the following function to
 517 execute them:
 518
 519   .. code-block:: c++
 520
 521     bool selectImpl(MachineInstr &MI)
 522
 523 The ``--stats`` option can be used to determine what proportion of rules were
 524 successfully imported. The easiest way to use this is to copy the
 525 ``-gen-globalisel`` tablegen command from ``ninja -v`` and modify it.
 526
 527 Similarly, the ``--warn-on-skipped-patterns`` option can be used to obtain the
 528 reasons that rules weren't imported. This can be used to focus on the most
 529 important rejection reasons.
 530
 531 PatLeaf Predicates
 532 ^^^^^^^^^^^^^^^^^^
 533
 534 PatLeafs cannot be imported because their C++ is implemented in terms of
 535 ``SDNode`` objects. PatLeafs that handle immediate predicates should be
 536 replaced by ``ImmLeaf``, ``IntImmLeaf``, or ``FPImmLeaf`` as appropriate.
 537
 538 There's no standard answer for other PatLeafs. Some standard predicates have
 539 been baked into TableGen but this should not generally be done.
 540
 541 Custom SDNodes
 542 ^^^^^^^^^^^^^^
 543
 544 Custom SDNodes should be mapped to Target Pseudos using ``GINodeEquiv``. This
 545 will cause the instruction selector to import them but you will also need to
 546 ensure the target pseudo is introduced to the MIR before the instruction
 547 selector. Any preceding pass is suitable but the legalizer will be a
 548 particularly common choice.
 549
 550 ComplexPatterns
 551 ^^^^^^^^^^^^^^^
 552
 553 ComplexPatterns cannot be imported because their C++ is implemented in terms of
 554 ``SDNode`` objects. GlobalISel versions should be defined with
 555 ``GIComplexOperandMatcher`` and mapped to ComplexPattern with
 556 ``GIComplexPatternEquiv``.
 557
 558 The following predicates are useful for porting ComplexPatterns:
 559
 560 * isBaseWithConstantOffset() - Check for base+offset structures
 561 * isOperandImmEqual() - Check for a particular constant
 562 * isObviouslySafeToFold() - Check for reasons an instruction can't be sunk and folded into another.
 563
 564 There are some important points for the C++ implementation:
 565
 566 * Don't modify MIR in the predicate
 567 * Renderer lambdas should capture by value to avoid use-after-free. They will be used after the predicate returns.
 568 * Only create instructions in a renderer lambda. GlobalISel won't clean up things you create but don't use.
 569
 570
 571 .. _maintainability:
 572
 573 Maintainability
 574 ===============
 575
 576 .. _maintainability-iterative:
 577
 578 Iterative Transformations
 579 -------------------------
 580
 581 Passes are split into small, iterative transformations, with all state
 582 represented in the MIR.
 583
 584 This differs from SelectionDAG (in particular, the legalizer) using various
 585 in-memory side-tables.
 586
 587
 588 .. _maintainability-mir:
 589
 590 MIR Serialization
 591 -----------------
 592
 593 .. FIXME: Update the MIRLangRef to include GMI additions.
 594
 595 :ref:`gmir` is serializable (see :doc:`MIRLangRef`).
 596 Combined with :ref:`maintainability-iterative`, this enables much finer-grained
 597 testing, rather than requiring large and fragile IR-to-assembly tests.
 598
 599 The current "stage" in the :ref:`pipeline` is represented by a set of
 600 ``MachineFunctionProperties``:
 601
 602 * ``legalized``
 603 * ``regBankSelected``
 604 * ``selected``
 605
 606
 607 .. _maintainability-verifier:
 608
 609 MachineVerifier
 610 ---------------
 611
 612 The pass approach lets us use the ``MachineVerifier`` to enforce invariants.
 613 For instance, a ``regBankSelected`` function may not have gvregs without
 614 a bank.
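
     A check of this kind could look like the sketch below.  It is a toy model,
     not ``MachineVerifier`` code: ``ToyFunction``, ``ToyBankID``, and
     ``verifyRegBanks`` are hypothetical names.

       .. code-block:: c++

         #include <cassert>
         #include <vector>

         enum ToyBankID { NoBank, GPRBank, FPRBank };

         struct ToyFunction {
           bool RegBankSelected;              // models the function property
           std::vector<ToyBankID> GVRegBanks; // bank of each generic vreg
         };

         // Returns true if the function satisfies the invariant.
         bool verifyRegBanks(const ToyFunction &MF) {
           if (!MF.RegBankSelected)
             return true; // nothing to enforce before RegBankSelect has run
           for (ToyBankID Bank : MF.GVRegBanks)
             if (Bank == NoBank)
               return false;
           return true;
         }

     The property acts as a gate: a bankless gvreg is acceptable before
     RegBankSelect has run and is an error afterwards.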
 615
 616 ``TODO``:
 617 The ``MachineVerifier`` being monolithic, some of the checks we want to do
 618 can't be integrated into it:  GlobalISel is a separate library, so we can't
 619 directly reference it from CodeGen.  For instance, legality checks are
 620 currently done in RegBankSelect/InstructionSelect proper.  We could #ifdef out
 621 the checks, or we could add some sort of verifier API.
 622
 623
 624 .. _progress:
 625
 626 Progress and Future Work
 627 ========================
 628
 629 The initial goal is to replace FastISel on AArch64.  The next step will be to
 630 replace SelectionDAG as the optimized ISel.
 631
 632 ``NOTE``:
 633 While we iterate on GlobalISel, we strive to avoid affecting the performance of
 634 SelectionDAG, FastISel, or the other MIR passes.  For instance, the types of
 635 :ref:`gmir-gvregs` are stored in a separate table in ``MachineRegisterInfo``,
 636 that is destroyed after :ref:`instructionselect`.
 637
 638 .. _progress-fastisel:
 639
 640 FastISel Replacement
 641 --------------------
 642
 643 For the initial FastISel replacement, we intend to fall back to SelectionDAG on
 644 selection failures.
 645
 646 Currently, compile-time of the fast pipeline is within 1.5x of FastISel.
 647 We're optimistic we can get to within 1.1/1.2x, but beating FastISel will be
 648 challenging given the multi-pass approach.
 649 Still, supporting all IR (via a complete legalizer) and avoiding the fallback
 650 to SelectionDAG in the worst case should enable better amortized performance
 651 than SelectionDAG+FastISel.
 652
 653 ``NOTE``:
 654 We considered never having a fallback to SelectionDAG, instead deciding early
 655 whether a given function is supported by GlobalISel or not.  The decision would
 656 be based on :ref:`milegalizer` queries.
 657 We abandoned that for two reasons:
 658 a) on IR inputs, we'd need to basically simulate the :ref:`irtranslator`;
 659 b) to be robust against unforeseen failures and to enable iterative
 660 improvements.
 661
 662 .. _progress-targets:
 663
 664 Support For Other Targets
 665 -------------------------
 666
 667 In parallel, we're investigating adding support for other - ideally quite
 668 different - targets.  For instance, there is some initial AMDGPU support.
 669
 670
 671 .. _porting:
 672
 673 Porting GlobalISel to A New Target
 674 ==================================
 675
 676 There are four major classes to implement by the target:
 677
 678 * :ref:`CallLowering <api-calllowering>` --- lower calls, returns, and arguments
 679   according to the ABI.
 680 * :ref:`RegisterBankInfo <api-registerbankinfo>` --- describe
 681   :ref:`gmir-regbank` coverage, cross-bank copy cost, and the mapping of
 682   operands onto banks for each instruction.
 683 * :ref:`LegalizerInfo <api-legalizerinfo>` --- describe what is legal, and how
 684   to legalize what isn't.
 685 * :ref:`InstructionSelector <api-instructionselector>` --- select generic MIR
 686   to target-specific MIR.
 687
 688 Additionally:
 689
 690 * ``TargetPassConfig`` --- create the passes constituting the pipeline,
 691   including additional passes not included in the :ref:`pipeline`.
 692
 693 .. _other_resources:
 694
 695 Resources
 696 =========
 697
 698 * `Global Instruction Selection - A Proposal by Quentin Colombet @LLVMDevMeeting 2015 <https://www.youtube.com/watch?v=F6GGbYtae3g>`_
 699 * `Global Instruction Selection - Status by Quentin Colombet, Ahmed Bougacha, and Tim Northover @LLVMDevMeeting 2016 <https://www.youtube.com/watch?v=6tfb344A7w8>`_
 700 * `GlobalISel - LLVM's Latest Instruction Selection Framework by Diana Picus @FOSDEM17 <https://www.youtube.com/watch?v=d6dF6E4BPeU>`_
 701 * GlobalISel: Past, Present, and Future by Quentin Colombet and Ahmed Bougacha @LLVMDevMeeting 2017
 702 * Head First into GlobalISel by Daniel Sanders, Aditya Nandakumar, and Justin Bogner @LLVMDevMeeting 2017