cgen/doc/rtl.texi

   1 @c Copyright (C) 2000 Red Hat, Inc.
   2 @c This file is part of the CGEN manual.
   3 @c For copying conditions, see the file cgen.texi.
   4
   5 @node RTL
   6 @chapter CGEN's Register Transfer Language
   7 @cindex RTL
   8 @cindex Register Transfer Language
   9
  10 CGEN uses a variant of GCC's Register Transfer Language as the basis for
  11 its CPU description language.
  12
  13 @menu
  14 * RTL Introduction::            Introduction to CGEN's RTL
  15 * Trade-offs::                  Various trade-offs in the design
  16 * Rules and notes::             Rules and notes common to all entries
  17 * Definitions::                 Definitions in the description file
  18 * Attributes::                  Random data associated with any entry
  19 * Architecture variants::       Specifying variations of a CPU
  20 * Model variants::              Specifying variations of a CPU's implementation
  21 * Hardware elements::           Elements of a CPU
  22 * Instruction fields::          Fields of an instruction
  23 * Enumerated constants::        Assigning useful names to important numbers
  24 * Instruction operands::
  25 * Derived operands::            Operands for CISC-like architectures
  26 * Instructions::
  27 * Macro-instructions::
  28 * Modes::
  29 * Expressions::
  30 * Macro-expressions::
  31 @end menu
  32
  33 @node RTL Introduction
  34 @section RTL Introduction
  35
  36 The description language, or RTL
  37 @footnote{While RTL stands for Register Transfer Language, it is also used
  38 to denote the CPU description language as a whole.}, needs to support the
  39 definition of all the
  40 architectural and implementation features of a CPU, as well as enough
  41 information for all intended applications.  At present this is just the
  42 opcodes table and an ISA level simulator, but it is not intended that
  43 applications be restricted to these two areas.  The goal is having an
  44 application independent description of the CPU.  In the end that's a lot to
  45 ask for from one language.  Certainly gate level specification of a CPU
  46 is not attempted!
  47
  48 The syntax of the language is inspired by GCC's RTL and by the Scheme
  49 programming language, theoretically taking the best of both.  To what
  50 extent that is true, and to what extent that is sufficient inspiration
  51 is certainly open to discussion.  In actuality, there isn't much difference
  52 here from GCC's RTL that is attributable to being Scheme-ish.  One
  53 important Scheme-derived concept is arbitrary precision of constants.
  54 Sign or zero extension of constants in GCC has always been a source of
  55 problems.  In CGEN's RTL constants have modes and there are both signed
  56 and unsigned modes.
  57
  58 Here is a graphical layout of the hierarchy of elements of a @file{.cpu}
  59 file.
  60
  61 @example
  62                            architecture
  63                           /            \
  64                     cpu-family1        cpu-family2  ...
  65                       /     \            /      \
  66                 machine1   machine2  machine3   ...
  67                  /   \
  68              model1  model2  ...
  69 @end example
  70
  71 Each of these elements is explained in more detail below.  The
  72 @emph{architecture} is one of @samp{sparc}, @samp{m32r}, etc.  Within
  73 the @samp{sparc} architecture, @emph{cpu-family} might be
  74 @samp{sparc32}, @samp{sparc64}, etc.  Within the @samp{sparc32} CPU
  75 family, the @emph{machine} might be @samp{sparc-v8}, @samp{sparclite},
  76 etc.  Within the @samp{sparc-v8} machine classification, @emph{model}
  77 might be @samp{hypersparc}, @samp{supersparc}, etc.
  78
  79 Instructions form their own hierarchy as each instruction may be supported
  80 by more than one machine.  Also, some architectures can handle more than
  81 one instruction set on one chip (e.g. ARM).
  82
  83 @example
  84                      isa
  85                       |
  86                  instruction
  87                     /   \
  88              operand1  operand2  ...
  89                 |         |
  90          hw1+ifield1   hw2+ifield2  ...
  91 @end example
  92
  93 Each of these elements is explained in more detail below.
  94
  95 @node Trade-offs
  96 @section Trade-offs
  97
  98 While CGEN is written in Scheme, this is not a requirement.  The
  99 description language should be considered independent of any particular
 100 implementation, though certainly some things were done to simplify
 101 reading @file{.cpu} files with Scheme.  Scheme related choices have been
 102 made in areas that have no serious impact on the usefulness of the CPU
 103 description language.  Places where that is not the case need to be
 104 revisited, though there currently are no known ones.
 105
 106 One place where the Scheme implementation influenced the design of
 107 CGEN's RTL is in the handling of modes.  The Scheme implementation was
 108 simplified by treating modes as an explicit argument, rather than as an
 109 optional suffix of the operation name.  For example, compare @code{(add
 110 SI dr sr)} in CGEN versus @code{(add:SI dr sr)} in GCC RTL.  The mode is
 111 treated as optional so a shorthand form of @code{(add dr sr)} works.
 112
 113 @node Rules and notes
 114 @section Rules and notes
 115
 116 A few basic guidelines for all entries:
 117
 118 @itemize @bullet
 119 @item names must be valid Scheme symbols.
 120 @item comments are used, for example, to comment the generated C code
 121 @footnote{It is possible to produce a reference manual from
 122 @file{.cpu} files and such an application wouldn't be a bad idea.}.
 123 @item comments may be any number of lines, though generally succinct comments
 124 are preferable@footnote{It would be reasonable to have a short form
 125 and a long form of comment, either as two entries or as one entry with
 126 the short form separated from the long form via some delimiter (say the
 127 first newline).}.
 128 @item everything is case sensitive.@footnote{??? This is true in RTL,
 129 though some apps add symbols and convert case that can cause collisions.}
 130 @item while "_" is a valid character to use in symbols, "-" is preferred
 131 @item except for the @samp{comment} and @samp{attrs} fields and unless
 132 otherwise specified all fields must be present.
 133 @end itemize
 134
 135 @subsection Symbols and strings
 136
 137 Symbols in CGEN are the same as in Scheme.
 138 Symbols can be used anywhere a string can be used.
 139 The reverse is not true, and in general strings can't be used in place
 140 of symbols.
 141
 142 @node Definitions
 143 @section Definitions
 144 @cindex Definitions
 145
 146 Each entry has the same format: @code{(define-foo arg1 arg2 ...)}, where
 147 @samp{foo} designates the type of entry (e.g. @code{define-insn}).  In
 148 the general case each argument is a name/value pair expressed as
 149 @code{(name value)}.
 150 (*note: Another style in common use is `:name value' and doesn't require
 151 parentheses.  Maybe that would be a better way to go here.  The current
 152 style is easier to construct from macros though.)
 153
 154 While the general case is flexible, it also is excessively verbose in
 155 the normal case.  To reduce this verbosity, a second version of most
 156 define-foo's exists that takes positional arguments.  To further reduce
 157 this verbosity, preprocessor macros can be written to simplify things
 158 further for the normal case.  See sections titled ``Simplification
 159 macros'' below.
 160
 161 @node Attributes
 162 @section Attributes
 163 @cindex Attributes
 164
 165 Attributes are used throughout for specifying various properties.
 166 For portability reasons attributes can only have 32 bit integral values
 167 (signed or unsigned).
 168 @c How about an example?
 169
 170 There are four kinds of attributes: boolean, integer, enumerated, and bitset.
 171 Boolean attributes can be achieved via others, but they occur frequently
 172 enough that they are special cased (and one bit can be used to record them).
 173 Bitset attributes are a useful simplification when one wants to indicate an
 174 object can be in several states at once (e.g. an instruction may be supported
 175 by multiple machines).
 176
 177 String attributes might be a useful addition.
 178 Another useful addition might be functional attributes (the attribute
 179 is computed at run-time - currently all attributes are computed at
 180 compile time).  One way to implement functional attributes would be to
 181 record the attributes as byte-code and lazily evaluate them, caching the
 182 results as appropriate.  The syntax has been carefully done to not
 183 preclude either as an upward compatible extension.
 184
 185 Attributes must be defined before they can be used.
 186 There are several predefined attributes for entry types that need them
 187 (instruction field, hardware, operand, and instruction).  Predefined
 188 attributes are documented in each relevant section below.
 189
 190 In C applications an enum is created that defines all the attributes.
 191 Applications that wish to be architecture independent need the attribute
 192 to have the same value across all architectures.  This is achieved by
 193 giving the attribute the INDEX attribute, which specifies the enum value
 194 must be fixed across all architectures.
 195 @c FIXME: Give an example here.
 196 @c FIXME: Need a better name than `INDEX'.
 197
 198 By convention, attribute names consist of uppercase letters, numbers,
 199 "-", and "_", and must begin with a letter.
 200 To be consistent with Scheme, "-" is preferred over "_".
 201
 202 @subsection Boolean Attributes
 203 @cindex Attributes, boolean
 204
 205 Boolean attributes are defined with:
 206
 207 @example
 208 (define-attribute
 209   (type boolean)
 210   (for user-list)
 211   (name attribute-name)
 212   (comment "attribute comment")
 213   (attrs attribute-attributes)
 214 )
 215 @end example
 216
 217 The default value of boolean attributes is always false.  This can be
 218 relaxed, but it's one extra complication that is currently unnecessary.
 219 Boolean attributes are specified in one of three forms: (NAME expr),
 220 NAME, and !NAME.  The first form is the canonical form.  The latter two
 221 are shorthand versions.  `NAME' means "true" and `!NAME' means "false".
 222 @samp{expr} is an expression that evaluates to 0 for false and non-zero
 223 for true@footnote{The details of @code{expr} are still undecided.}.
 224
 225 @code{user-list} is a space separated list of entry types that will use
 226 the attribute.  Possible values are: @samp{attr}, @samp{enum},
 227 @samp{cpu}, @samp{mach}, @samp{model}, @samp{ifield}, @samp{hardware},
 228 @samp{operand}, @samp{insn} and @samp{macro-insn}.  If omitted all are
 229 considered users of the attribute.
 230
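As an illustration only, a boolean attribute for instructions might be
defined and used as sketched below.  The name @code{NEEDS-FPU} and its
comment are hypothetical, and the sketch assumes a plain integer is
accepted as @samp{expr}; only the field layout follows the syntax above.

@example
; Hypothetical attribute: true for instructions that need an FPU.
(define-attribute
  (type boolean)
  (for insn macro-insn)
  (name NEEDS-FPU)
  (comment "insn requires a floating point unit")
)

; Inside an entry's attrs field:
;   (attrs (NEEDS-FPU 1))   canonical form, true
;   (attrs NEEDS-FPU)       shorthand for true
;   (attrs !NEEDS-FPU)      shorthand for false
@end example
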
 231 @subsection Integer Attributes
 232 @cindex Attributes, integer
 233
 234 Integer attributes are defined with:
 235
 236 @example
 237 (define-attribute
 238   (type integer)
 239   (for user-list)
 240   (name attribute-name)
 241   (comment "attribute comment")
 242   (attrs attribute-attributes)
 243   (default expr)
 244 )
 245 @end example
 246
 247 If omitted, the default is 0.
 248
 249 (*note: The details of `expr' are still undecided.  For now it must be
 250 an integer.)
 251
 252 Integer attributes are specified with (NAME expr).
 253
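A minimal sketch of an integer attribute follows.  The name
@code{MAX-STALL} and its meaning are invented for illustration; the
explicit default could be omitted since it is 0 anyway.

@example
; Hypothetical attribute: worst-case stall cycles of an instruction.
(define-attribute
  (type integer)
  (for insn)
  (name MAX-STALL)
  (comment "maximum number of stall cycles")
  (default 0)
)

; Inside an entry's attrs field:
;   (attrs (MAX-STALL 2))
@end example
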
 254 @subsection Enumerated Attributes
 255 @cindex Attributes, enumerated
 256
 257 Enumerated attributes are the same as integer attributes except the
 258 range of possible values is restricted and each value has a name.
 259 Enumerated attributes are defined with
 260
 261 @example
 262 (define-attribute
 263   (type enum)
 264   (for user-list)
 265   (name attribute-name)
 266   (comment "attribute comment")
 267   (attrs attribute-attributes)
 268   (values enum-value1 enum-value2 ...)
 269   (default expr)
 270 )
 271 @end example
 272
 273 If omitted, the default is the first specified value.
 274
 275 (*note: The details of `expr' are still undecided.  For now it must be the
 276 name of one of the specified values.)
 277
 278 Enum attributes are specified with (NAME expr).
 279
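For example, an enumerated attribute could look like the sketch below.
The attribute name @code{ISSUE-SLOT} and its value names are
hypothetical.

@example
; Hypothetical attribute: which issue slot an instruction may occupy.
(define-attribute
  (type enum)
  (for insn)
  (name ISSUE-SLOT)
  (comment "issue slot restriction")
  (values ANY FIRST SECOND)
  (default ANY)    ; same as omitting it: ANY is the first value
)

; Inside an entry's attrs field:
;   (attrs (ISSUE-SLOT FIRST))
@end example
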
 280 @subsection Bitset Attributes
 281 @cindex Attributes, bitset
 282
 283 Bitset attributes are for situations where you want to indicate something
 284 is a subset of a small set of possibilities.  The MACH attribute uses this
 285 for example to allow specifying which of the various machines support a
 286 particular insn.
 287 (*note: At present the maximum number of possibilities is 32.
 288 This is an implementation restriction which can be relaxed, but there's
 289 currently no rush.)
 290
 291 Bitset attributes are defined with:
 292
 293 @example
 294 (define-attribute
 295   (type bitset)
 296   (for user-list)
 297   (name attribute-name)
 298   (comment "attribute comment")
 299   (attrs attribute-attributes)
 300   (values enum-value1 enum-value2 ...)
 301   (default default-name)
 302 )
 303 @end example
 304
 305 @samp{default-name} must be the name of one of the specified values.  If
 306 omitted, it is the first value.
 307
 308 Bitset attributes are specified with @code{(NAME val1,val2,...)}.  There
 309 must be no spaces in ``@code{val1,val2,...}'' and each value must be a
 310 valid Scheme symbol.
 311
 312 (*note: it's not clear whether allowing arbitrary expressions will be
 313 useful here, but doing so is not precluded.  For now each value must be
 314 the name of one of the specified values.)
 315
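The following sketch shows a bitset attribute and a use of it that
selects two values at once.  The name @code{FEATURES} and its values are
hypothetical.

@example
; Hypothetical attribute: optional features an instruction depends on.
(define-attribute
  (type bitset)
  (for insn)
  (name FEATURES)
  (comment "optional features required by the insn")
  (values mul div fpu)
  (default mul)
)

; Inside an entry's attrs field (note: no spaces around the comma):
;   (attrs (FEATURES mul,div))
@end example
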
 316 @node Architecture variants
 317 @section Architecture Variants
 318 @cindex Architecture variants
 319
 320 The base architecture and its variants are described in four parts:
 321 @code{define-arch}, @code{define-isa}, @code{define-cpu}, and
 322 @code{define-mach}.
 323
 324 @menu
 325 * define-arch::
 326 * define-isa::
 327 * define-cpu::
 328 * define-mach::
 329 @end menu
 330
 331 @node define-arch
 332 @subsection define-arch
 333 @cindex define-arch
 334
 335 @code{define-arch} describes the overall architecture, and must be
 336 present.
 337
 338 The syntax of @code{define-arch} is:
 339
 340 @example
 341 (define-arch
 342   (name architecture-name) ; e.g. m32r
 343   (comment "description")  ; e.g. "Mitsubishi M32R"
 344   (attrs attribute-list)
 345   (default-alignment aligned|unaligned|forced)
 346   (insn-lsb0? #f|#t)
 347   (machs mach-name-list)
 348   (isas isa-name-list)
 349 )
 350 @end example
 351
 352 @subsubsection default-alignment
 353
 354 Specify the default alignment to use when fetching data (and
 355 instructions) from memory.  At present this can't be overridden, but
 356 support can be added if necessary.  The default is @code{aligned}.
 357
 358 @subsubsection insn-lsb0?
 359 @cindex insn-lsb0?
 360
 361 Specifies whether bit number 0 in a word is the least significant bit
 362 (@code{#t}) or the most significant bit (@code{#f}).  Generally this
 363 should conform to the convention in the architecture manual.  It is
 364 independent of endianness and applies architecture wide; there is no
 365 support for different bit numbering conventions within an architecture.
 366 @c Not that such support can't be added of course.
 367
 368 Instruction fields are always numbered beginning with the most
 369 significant bit.  That is, the `start' of a field is always its most
 370 significant bit.  For example, a 4 bit field in the uppermost bits of a
 371 32 bit instruction would have a start/length of (31 4) when insn-lsb0? =
 372 @code{#t}, and (0 4) when insn-lsb0? = @code{#f}.
 373
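To make the arithmetic concrete, here are a few more start/length pairs
for a 32 bit instruction, worked out from the rule above.  Because the
start is always the field's most significant bit, a start of @var{s}
under @code{#f} corresponds to a start of 31 - @var{s} under @code{#t}.

@example
; 32 bit instruction, fields written as (start length)
;
; field                      insn-lsb0? = #f    insn-lsb0? = #t
; uppermost 4 bits           (0 4)              (31 4)
; next 4 bits below those    (4 4)              (27 4)
; lowermost 16 bits          (16 16)            (15 16)
@end example
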
 374 @subsubsection mach-name-list
 375
 376 The list of names of machines in the architecture.
 377 There should be one entry for each @code{define-mach}.
 378
 379 @subsubsection isa-name-list
 380
 381 The list of names of instruction sets in the architecture.
 382 There must be one for each @code{define-isa}.
 383 An example of an architecture with more than one is the ARM which
 384 has a 32 bit instruction set and a 16 bit "Thumb" instruction set
 385 (the sizes here refer to instruction size).
 386
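Putting the fields together, a complete entry might look like the
following sketch.  The architecture @code{toy}, its machines and its ISA
are hypothetical, and the name lists are assumed to be space separated.

@example
(define-arch
  (name toy)
  (comment "hypothetical toy architecture")
  (default-alignment aligned)
  (insn-lsb0? #f)       ; bit 0 is the most significant bit
  (machs toy1 toy2)     ; one name per define-mach
  (isas toy)            ; one name per define-isa
)
@end example
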
 387 @node define-isa
 388 @subsection define-isa
 389 @cindex define-isa
 390
 391 @code{define-isa} describes aspects of the instruction set.
 392 A minimum of one ISA must be defined.
 393
 394 The syntax of @code{define-isa} is:
 395
 396 @example
 397 (define-isa
 398   (name isa-name)
 399   (comment "description")
 400   (attrs attribute-list)
 401   (default-insn-word-bitsize n)
 402   (default-insn-bitsize n)
 403   (base-insn-bitsize n)
 404   ; (decode-assist (b0 b1 b2 ...)) ; generally unnecessary
 405   (liw-insns n)
 406   (parallel-insns n)
 407   (condition ifield-name expr)
 408   (setup-semantics expr)
 409   ; (decode-splits decode-split-list) ; support temporarily disabled
 410   ; ??? missing here are fetch/execute specs
 411 )
 412 @end example
 413
 414 @subsubsection default-insn-word-bitsize
 415
 416 Specifies the default size of an instruction word in bits.
 417 This affects the numbering of field bits in words beyond the
 418 base instruction.
 419 @xref{Instruction fields}, for more information.
 420
 421 ??? There is currently no explicit way to specify a different instruction
 422 word bitsize for particular instructions; it is derived from the instruction
 423 field specs.
 424
 425 @subsubsection default-insn-bitsize
 426
 427 The default size of an instruction in bits. It is generally the size of
 428 the smallest instruction. It is used when parsing instruction fields.
 429 It is also used by the disassembler to know how many bytes to skip for
 430 unrecognized instructions.
 431
 432 @subsubsection base-insn-bitsize
 433
 434 The minimum size of an instruction, in bits, to fetch during execution.
 435 If the architecture has a variable length instruction set, this is the
 436 size of the initial word to fetch.  There is no need to specify the
 437 maximum length of an instruction; that can be computed from the
 438 instructions.  Examples:
 439
 440 @table @asis
 441 @item i386
 442 8
 443 @item M68k
 444 16
 445 @item SPARC
 446 32
 447 @item M32R
 448 32
 449 @end table
 450
 451 The M32R case is interesting because instructions can be 16 or 32 bits.
 452 However instructions on 32 bit boundaries can always be fetched 32 bits
 453 at a time as 16 bit instructions always come in pairs.
 454
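As a sketch, the simplest case is an ISA in which every instruction is
exactly 32 bits, so all three sizes coincide.  The ISA @code{toy} is
hypothetical, and the sketch assumes that @code{condition} and
@code{setup-semantics} may be left out when they are not needed.

@example
(define-isa
  (name toy)
  (comment "hypothetical toy instruction set")
  (default-insn-word-bitsize 32)
  (default-insn-bitsize 32)
  (base-insn-bitsize 32)   ; size of the initial fetch
  (liw-insns 1)            ; the documented default
  (parallel-insns 1)       ; the documented default
)
@end example
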
 455 @subsubsection decode-assist
 456 @cindex decode-assist
 457
 458 Override CGEN's heuristics about which bits to initially use to decode
 459 instructions in a simulator.  For example on the SPARC these are bits:
 460 31 30 24 23 22 21 20 19.  The entire decoder can be machine generated,
 461 so this field is entirely optional.  Since the heuristics are quite
 462 good, you should only use this field if you have evidence that you
 463 can pick a better set, in which case the CGEN developers would like to
 464 hear from you!
 465
 466 ??? It might be useful to provide greater control, but this is sufficient
 467 for now.
 468
 469 It is okay if the opcode bits are over-specified for some instructions.
 470 It is also okay if the opcode bits are under-specified for some instructions.
 471 The machine generated decoder will properly handle both these situations.
 472 Just pick a useful number of bits that distinguishes most instructions.
 473 It is usually best to not pick more than 8 bits to keep the size of the
 474 initial decode table down.
 475
 476 Bit numbering is defined by the @code{insn-lsb0?} field.
 477
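Using the SPARC bits listed above, the field would be written inside the
@code{define-isa} entry roughly as follows.

@example
; Initial decode on 8 bits, numbered according to insn-lsb0?.
(decode-assist (31 30 24 23 22 21 20 19))
@end example
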
 478 @subsubsection liw-insns
 479 @cindex liw-insns
 480
 481 The number of instructions the CPU always fetches at once.  This is
 482 intended for architectures like the M32R, and does not refer to a CPU's
 483 ability to pre-fetch instructions.  The default is 1.
 484
 485 @subsubsection parallel-insns
 486 @cindex parallel-insns
 487
 488 The maximum number of instructions the CPU can execute in parallel.  The
 489 default is 1.
 490
 491 ??? Rename this to @code{max-parallel-insns}?
 492
 493 @subsubsection condition
 494
 495 Some architectures like ARM and ARC conditionally execute every instruction
 496 based on the condition specified by one instruction field.
 497 The @code{condition} spec exists to support these architectures.
 498 @code{ifield-name} is the name of the instruction field denoting the
 499 condition and @code{expr} is an RTL expression that returns
 500 the value of the condition (false=zero, true=non-zero).
 501
 502 @subsubsection setup-semantics
 503
 504 Specify a statement to be performed prior to executing particular instructions.
 505 This is used, for example, on the ARM where the value of the program counter
 506 (general register 15) is a function of the instruction (it is either
 507 pc+8 or pc+12, depending on the instruction).
 508
 509 @subsubsection decode-splits
 510
 511 Specify a list of field names and values to split instructions up by.
 512 This is used, for example, on the ARM where the behavior of some instructions
 513 is quite different when the destination register is r15 (the pc).
 514
 515 The syntax is:
 516
 517 @example
 518 (decode-splits
 519   (ifield1-name
 520    constraints
 521    ((split1-name (value1 value2 ...)) (split2-name ...)))
 522   (ifield2-name
 523    ...)
 524 )
 525 @end example
 526
 527 @code{constraints} is work-in-progress and should be @code{()} for now.
 528
 529 One copy of each instruction satisfying @code{constraints} is made
 530 for each specified split.  The semantics of each copy are then
 531 simplified based on the known values of the specified instruction field.
 532
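As a sketch of the ARM case mentioned above, instructions could be split
on whether the destination register is r15.  The field name @code{f-rd}
and the split names are hypothetical; only the shape follows the syntax
above.

@example
(decode-splits
  (f-rd          ; hypothetical destination register field
   ()            ; constraints: empty for now
   ((rd-pc (15))
    (rd-not-pc (0 1 2 3 4 5 6 7 8 9 10 11 12 13 14))))
)
@end example
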
 533 @node define-cpu
 534 @subsection define-cpu
 535 @cindex define-cpu
 536
 537 @code{define-cpu} defines a ``CPU family'' which is a programmer
 538 specified collection of related machines.  What constitutes a family is
 539 work-in-progress; however, it is intended to distinguish things like
 540 sparc32 vs sparc64.  Machines in a family are sufficiently similar that
 541 the simulator semantic code can handle any differences at run time.  At
 542 least that's the current idea.  A minimum of one CPU family must be
 543 defined.
 544 @footnote{FIXME: Using "cpu" in "cpu-family" here is confusing.
 545 Need a better name.  Maybe just "family"?}
 546
 547 The syntax of @code{define-cpu} is:
 548
 549 @example
 550 (define-cpu
 551   (name cpu-name)
 552   (comment "description")
 553   (attrs attribute-list)
 554   (endian big|little|either)
 555   (insn-endian big|little|either)
 556   (data-endian big|little|either)
 557   (float-endian big|little|either)
 558   (word-bitsize n)
 559   (insn-chunk-bitsize n)
 560   (parallel-insns n)
 561   (file-transform transformation)
 562 )
 563 @end example
 564
 565 @subsubsection endian
 566
 567 The endianness of the architecture is one of three values: @code{big},
 568 @code{little} and @code{either}.
 569
 570 An architecture may have multiple endiannesses, including one for each
 571 of: instructions, integers, and floats (not that that's intended to be the
 572 complete list).  These are specified with @code{insn-endian},
 573 @code{data-endian}, and @code{float-endian} respectively.
 574
 575 Possible values for @code{insn-endian} are: @code{big}, @code{little},
 576 and @code{either}.  If missing, the value is taken from @code{endian}.
 577
 578 Possible values for @code{data-endian} and @code{float-endian} are: @code{big},
 579 @code{big-words}, @code{little}, @code{little-words} and @code{either}.
 580 If @code{big-words} then words are ordered big-endian and each word is little-endian.
 581 If @code{little-words} then words are ordered little-endian and each word is big-endian.
 582 If missing, the value is taken from @code{endian}.
 583
 584 ??? Support for these is work-in-progress.  All forms are recognized
 585 by the @file{.cpu} file reader, but not all are supported internally.
 586
 587 @subsubsection word-bitsize
 588
 589 The number of bits in a word.  In GCC, this is @code{BITS_PER_WORD}.
 590
 591 @subsubsection insn-chunk-bitsize
 592
 593 The number of bits in an instruction word chunk, for purposes of
 594 per-chunk endianness conversion.  The default is zero, meaning
 595 no chunking is required.
 596
 597 @subsubsection parallel-insns
 598
 599 This is the same as the @code{parallel-insns} spec of @code{define-isa}.
 600 It allows a CPU family to override the value.
 601
 602 @subsubsection file-transform
 603
 604 Specify the file name transformation of generated code.
 605
 606 Each generated file has a name related to the ISA or CPU family.
 607 Sometimes generated code needs to know the name of another generated
 608 file (e.g. #include's).
 609 At present @code{file-transform} specifies the suffix.
 610
 611 For example, M32R/x generated files have an `x' suffix, as in @file{cpux.h}
 612 for the @file{cpu.h} header.  This is indicated with
 613 @code{(file-transform "x")}.
 614
 615 ??? Ideally generated code wouldn't need to know anything about file names.
 616 This breaks down for #include's.  It can be fixed with symlinks or other
 617 means.
 618
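A CPU family entry using these fields might look like the sketch below.
The family @code{toyxf} is hypothetical; the endian variants and other
fields with documented defaults are left out.

@example
(define-cpu
  (name toyxf)
  (comment "hypothetical extended toy family")
  (endian big)           ; also used for insn, data and float endianness
  (word-bitsize 32)
  (file-transform "x")   ; generated files get an `x' suffix, e.g. cpux.h
)
@end example
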
 619 @node define-mach
 620 @subsection define-mach
 621 @cindex define-mach
 622
 623 @code{define-mach} defines a distinct variant of a CPU.  It currently
 624 has a one-to-one correspondence with BFD's "mach number".  A minimum of
 625 one mach must be defined.
 626
 627 The syntax of @code{define-mach} is:
 628
 629 @example
 630 (define-mach
 631   (name mach-name)
 632   (comment "description")
 633   (attrs attribute-list)
 634   (cpu cpu-family-name)
 635   (bfd-name "bfd-name")
 636   (isas isa-name-list)
 637 )
 638 @end example
 639
 640 @subsubsection bfd-name
 641 @cindex bfd-name
 642
 643 The name of the mach as used by BFD.  If not specified the name of the
 644 mach is used.
 645
 646 @subsubsection isas
 647
 648 List of names of the ISAs the machine supports.
 649
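For illustration, a machine belonging to a hypothetical family
@code{toyxf} and supporting a hypothetical ISA @code{toy} could be
described as follows.  Since @code{bfd-name} is omitted, BFD would know
this mach as @samp{toy2}.

@example
(define-mach
  (name toy2)
  (comment "hypothetical second-generation toy machine")
  (cpu toyxf)
  (isas toy)
)
@end example
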
 650 @node Model variants
 651 @section Model Variants
 652
 653 For each `machine', as defined here, there are one or more `models'.
 654 There must be at least one model for each machine.
 655 (*note: There could be a default, but requiring one doesn't involve that much
 656 extra typing and forces the programmer to at least think about such things.)
 657
 658 @example
 659 (define-model
 660   (name model-name)
 661   (comment "description")
 662   (attrs attribute-list)
 663   (mach machine-name)
 664   (state (variable-name-1 variable-mode-1) ...)
 665   (unit name "comment" (attributes)
 666         issue done state inputs outputs profile)
 667 )
 668 @end example
 669
 670 @subsection mach
 671
 672 The name of the machine the model is an implementation of.
 673
 674 @subsection state
 675
 676 A list of variable-name/mode pairs for recording global function unit
 677 state.  For example on the M32R the value is @code{(state (h-gr UINT))}
 678 and is a bitmask of which register(s) are the targets of loads and thus
 679 subject to load stalls.
 680
 681 @subsection unit
 682
 683 Specifies a function unit.  Any number of function units may be specified.
 684 The @code{u-exec} unit must be specified as it is the default.
 685
 686 The syntax is:
 687
 688 @example
 689   (unit name "comment" (attributes)
 690      issue done state inputs outputs profile)
 691 @end example
 692
 693 @samp{issue} is the number of operations that may be in progress.
 694 It originates from GCC's function unit specification.  In general the
 695 value should be 1.
 696
 697 @samp{done} is the latency of the unit.  The value is the number of cycles
 698 until the result is ready.
 699
 700 @samp{state} has the same syntax as the global model `state' and is a list of
 701 variable-name/mode pairs.
 702
 703 @samp{inputs} is a list of inputs to the function unit.
 704 Each element is @code{(operand-name mode default-value)}.
 705
 706 @samp{outputs} is a list of outputs of the function unit.
 707 Each element is @code{(operand-name mode default-value)}.
 708
 709 @samp{profile} is an rtl-code sequence that performs function unit
 710 modeling.  At present the only possible value is @code{()} meaning
 711 invoke a user supplied function named @code{<cpu>_model_<mach>_<unit>}.
 712
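As a sketch of how the fields line up, here is a hypothetical unit with
a latency of three cycles.  The unit name @code{u-mul} and the operand
names are placeholders.

@example
(unit u-mul "Multiply Unit" ()
   1 3 ; issue done: one operation in progress, result ready after 3 cycles
   () ; state
   ((sr INT -1) (sr2 INT -1)) ; inputs
   ((dr INT -1)) ; outputs
   () ; profile action (default)
)
@end example
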
 713 The current function unit specification is a first pass in order to
 714 achieve something that moderately works for the intended purpose (cycle
 715 counting on the simulator).  Something more elaborate is on the todo list
 716 but there is currently no schedule for it.  The new specification must
 717 try to be application independent.  Some known applications are:
 718 cycle counting in the simulator, code scheduling in a compiler, and code
 719 scheduling in a JIT simulator (where speed of analysis can be more
 720 important than getting an optimum schedule).
 721
 722 The inputs/outputs fields are how elements in the semantic code are mapped
 723 to function units.  Each input and output has a name that corresponds
 724 with the name of the operand in the semantics.  Where there is no
 725 correspondence, a mapping can be made in the unit specification of the
 726 instruction (see the subsection titled ``Timing'').
 727
 728 Another way to achieve the correspondence is to create separate function
 729 units that contain the desired input/output names.  For example on the
 730 M32R the u-exec unit is defined as:
 731
 732 @example
 733 (unit u-exec "Execution Unit" ()
 734    1 1 ; issue done
 735    () ; state
 736    ((sr INT -1) (sr2 INT -1)) ; inputs
 737    ((dr INT -1)) ; outputs
 738    () ; profile action (default)
 739 )
 740 @end example
 741
 742 This handles instructions that use sr, sr2 and dr as operands.  A second
 743 function unit called @samp{u-cmp} is defined as:
 744
 745 @example
 746 (unit u-cmp "Compare Unit" ()
 747    1 1 ; issue done
 748    () ; state
 749    ((src1 INT -1) (src2 INT -1)) ; inputs
 750    () ; outputs
 751    () ; profile action (default)
 752 )
 753 @end example
 754
 755 This handles instructions that use src1 and src2 as operands.  The
 756 organization of units is arbitrary.  On the M32R, src1/src2 instructions
 757 are typically compare instructions so a separate function unit was
 758 created for them.
 759
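Assembled into one entry, the pieces shown in this section would look
roughly like this.  The model name and machine name are assumptions; the
state and the two units are the ones quoted above.

@example
(define-model
  (name m32r/d)             ; assumed model name
  (comment "m32r/d")
  (mach m32r)               ; assumed machine name
  (state (h-gr UINT))       ; registers that are targets of pending loads
  (unit u-exec "Execution Unit" ()
        1 1 ; issue done
        () ; state
        ((sr INT -1) (sr2 INT -1)) ; inputs
        ((dr INT -1)) ; outputs
        () ; profile action (default)
  )
  (unit u-cmp "Compare Unit" ()
        1 1 ; issue done
        () ; state
        ((src1 INT -1) (src2 INT -1)) ; inputs
        () ; outputs
        () ; profile action (default)
  )
)
@end example
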
 760 @node Hardware elements
 761 @section Hardware Elements
 762
 763 The elements of hardware that make up a CPU are defined with
 764 @code{define-hardware}.  Examples of hardware elements include
 765 registers, condition bits, immediate constants and memory.
 766
 767 Instruction fields that provide numerical values (``immediate
 768 constants'') aren't really elements of the hardware, but it simplifies
 769 things to think of them this way.  Think of them as @emph{constant
 770 generators}@footnote{A term borrowed from the book on the Bulldog
 771 compiler and perhaps other sources.}.
 772
 773 Hardware elements are defined with:
 774
 775 @example
 776 (define-hardware
 777   (name hardware-name)
 778   (comment "description")
 779   (attrs attribute-list)
 780   (semantic-name hardware-semantic-name)
 781   (type type-name type-arg1 type-arg2 ...)
 782   (indices index-type index-arg1 index-arg2 ...)
 783   (values values-type values-arg1 values-arg2 ...)
 784   (handlers handler1 handler2 ...)
 785   (get (args) (expression))
 786   (set (args) (expression))
 787 )
 788 @end example
 789
 790 The only required members are @samp{name} and @samp{type}. Convention
 791 requires @samp{hardware-name} begin with @samp{h-}.
 792
 793 @subsection attrs
 794
 795 List of attributes. There are several predefined hardware attributes:
 796
 797 @itemize @minus
 798 @item MACH
 799
 800 A bitset attribute used to specify which machines have this hardware element.
 801 Do not specify the MACH attribute if the value is "all machs".
 802
 803 Usage: @code{(MACH mach1,mach2,...)}
 804 There must be no spaces in ``@code{mach1,mach2,...}''.
 805
 806 @item CACHE-ADDR
 807
 808 A hint to the simulator semantic code generator that it can record the
 809 address of a selected register in an array of registers.  This speeds up
 810 simulation by moving the array computation to extraction time.
 811 This attribute is only useful for register arrays and cannot be specified
 812 with @code{VIRTUAL} (??? revisit).
 813
 814 @item PROFILE
 815
 816 Ignore.  This is a work-in-progress to define how to profile references
 817 to hardware elements.
 818
 819 @item VIRTUAL
 820
 821 The hardware element doesn't require any storage.
 822 This is used when you want a value that is derived from some other value.
 823 If @code{VIRTUAL} is specified, @code{get} and @code{set} specs must be
 824 provided.
 825 @end itemize
 826
 827 @subsection type
 828
 829 This is the type of hardware.  Current values are: @samp{register},
 830 @samp{memory}, and @samp{immediate}.
 831
 832 For registers the syntax is one of:
 833
 834 @example
 835 @code{(register mode [(number)])}
 836 @code{(register (mode bits) [(number)])}
 837 @end example
 838
 839 where @samp{(number)} is the number of registers and is optional. If
 840 omitted, the default is @samp{(1)}.
 841 The second form is useful for describing registers with an odd (as in
 842 unusual) number of bits.
 843 @code{mode} for the second form must be one of @samp{INT} or @samp{UINT}.
 844 Since these two modes don't have an implicit size, they cannot be used for
 845 the first form.
 846
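The two register forms might be used as follows.  This is only a sketch:
@samp{h-fr} and @samp{h-flags} are hypothetical names, and @code{SF} is
assumed to be an available mode.

@example
; First form: an array of 8 registers, size implied by the mode.
(define-hardware
  (name h-fr)
  (comment "hypothetical floating point registers")
  (type register SF (8))
)

; Second form: a single 5 bit register, so INT or UINT with an
; explicit bit count is required.
(define-hardware
  (name h-flags)
  (comment "hypothetical 5 bit flags register")
  (type register (UINT 5))
)
@end example
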
 847 @c ??? Might wish to remove the mode here and just specify number of bits.
 848
 849 For memory the syntax is:
 850
 851 @example
 852 @code{(memory mode (size))}
 853 @end example
 854
 855 where @samp{(size)} is the size of the memory in @samp{mode} units.
 856 In general @samp{mode} should be @code{QI}.
 857
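For instance, a 256 byte scratch memory could be sketched as below.
The name @samp{h-scratch} and the size are invented for illustration.

@example
; Hypothetical memory element: 256 units of mode QI.
(define-hardware
  (name h-scratch)
  (comment "hypothetical scratch memory")
  (type memory QI (256))
)
@end example
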
 858 For immediates the syntax is one of
 859
 860 @example
 861 @code{(immediate mode)}
 862 @code{(immediate (mode bits))}
 863 @end example
 864
 865 The second form is for values for which a mode of that size doesn't exist.
 866 @samp{mode} for the second form must be one of @code{INT} or @code{UINT}.
 867 Since these two modes don't have an implicit size, they cannot be used
 868 for the first form.
 869
 870 ??? There's no real reason why a mode like SI can't be used
 871 for odd-sized immediate values.  The @samp{bits} field indicates the size
 872 and the @samp{mode} field indicates the mode in which the value will be used,
 873 as well as its signedness.  This would allow removing INT/UINT for this
 874 purpose.  On the other hand, a non-width specific mode allows applications
 875 to choose one (a simulator might prefer to store immediates in an `int'
 876 rather than, say, char if the specified mode was @code{QI}).
 877
 878 @subsection indices
 879
 880 Specify names for individual elements with the @code{indices} spec.
 881 It is only valid for registers with more than one element.
 882
 883 The syntax is:
 884
 885 @example
 886 @code{(indices index-type arg1 arg2 ...)}
 887 @end example
 888
 889 where @samp{index-type} specifies the kind of index and @samp{arg1 arg2 ...}
 890 are arguments to @samp{index-type}.
 891
There are two supported values for @samp{index-type}: @code{keyword}
 893 and @code{extern-keyword}.  The difference is that indices defined with
 894 @code{keyword} are kept internal to the hardware element's definition
 895 and are not usable elsewhere, whereas @code{extern-keyword} specifies
 896 a set of indices defined elsewhere.
 897
 898 @subsubsection keyword
 899
 900 @example
 901 @code{(indices keyword "prefix" ((name1 value1) (name2 value2) ...))}
 902 @end example
 903
 904 @samp{prefix} is the common prefix for each of the index names.
 905 For example, SPARC registers usually begin with @samp{"%"}.
 906
Each @samp{(name value)} pair maps a name to an index number.
 908 An index can be specified multiple times, for example, when a register
 909 has multiple names.
 910
 911 Example from Thumb:
 912
 913 @example
 914 (define-hardware
 915   (name h-gr-t)
 916   (comment "Thumb's general purpose registers")
 917   (attrs (ISA thumb) VIRTUAL) ; ??? CACHE-ADDR should be doable
 918   (type register WI (8))
 919   (indices keyword ""
 920            ((r0 0) (r1 1) (r2 2) (r3 3) (r4 4) (r5 5) (r6 6) (r7 7)))
 921   (get (regno) (reg h-gr regno))
 922   (set (regno newval) (set (reg h-gr regno) newval))
 923 )
 924 @end example
 925
 926 @subsubsection extern-keyword
 927
 928 @example
 929 @code{(indices extern-keyword keyword-name)}
 930 @end example
 931
 932 Example from M32R:
 933
 934 @example
 935 (define-keyword
 936   (name gr-names)
 937   (print-name h-gr)
 938   (prefix "")
 939   (values (fp 13) (lr 14) (sp 15)
 940           (r0 0) (r1 1) (r2 2) (r3 3) (r4 4) (r5 5) (r6 6) (r7 7)
 941           (r8 8) (r9 9) (r10 10) (r11 11) (r12 12) (r13 13) (r14 14) (r15 15))
 942 )
 943
 944 (define-hardware
 945   (name h-gr)
 946   (comment "general registers")
 947   (attrs PROFILE CACHE-ADDR)
 948   (type register WI (16))
 949   (indices extern-keyword gr-names)
 950 )
 951 @end example
 952
 953 @subsection values
 954
 955 Specify a list of valid values with the @code{values} spec.
 956 @c Clumsy wording.
 957
 958 The syntax is identical to the syntax for @code{indices}.
 959 It is only valid for immediates.
 960
 961 Example from sparc64:
 962
 963 @example
 964 (define-hardware
 965   (name h-p)
 966   (comment "prediction bit")
 967   (attrs (MACH64))
 968   (type immediate (UINT 1))
 969   (values keyword "" (("" 0) (",pf" 0) (",pt" 1)))
 970 )
 971 @end example
 972
 973 @subsection handlers
 974
 975 The @code{handlers} spec is an escape hatch for indicating when a
 976 programmer supplied routine must be called to perform a function.
 977
 978 The syntax is:
 979
 980 @example
 981 @samp{(handlers (handler-name1 "function_name1")
 982                 (handler-name2 "function_name2")
 983                 ...)}
 984 @end example
 985
 986 @samp{handler-name} must be one of @code{parse} or @code{print}.
 987 How @samp{function_name} is used is application specific, but in
 988 general it is the name of a function to call.  The only application
 989 that uses this at present is Opcodes.  See the Opcodes documentation for
 990 a description of each function's expected prototype.
 991
 992 @subsection get
 993
 994 Specify special processing to be performed when a value is read
 995 with the @code{get} spec.
 996
 997 The syntax for scalar registers is:
 998
 999 @example
1000 @samp{(get () (expression))}
1001 @end example
1002
1003 The syntax for vector registers is:
1004
1005 @example
1006 @samp{(get (index) (expression))}
1007 @end example
1008
1009 @code{expression} is an RTL expression that computes the value to return.
1010 The mode of the result must be the mode of the register.
1011
1012 @code{index} is the name of the index as it appears in @code{expression}.
1013
1014 At present, @code{sequence}, @code{parallel}, and @code{case} expressions
1015 are not allowed here.
1016
1017 @subsection set
1018
1019 Specify special processing to be performed when a value is written
1020 with the @code{set} spec.
1021
1022 The syntax for scalar registers is:
1023
1024 @example
1025 @samp{(set (newval) (expression))}
1026 @end example
1027
1028 The syntax for vector registers is:
1029
1030 @example
1031 @samp{(set (index newval) (expression))}
1032 @end example
1033
1034 @code{expression} is an RTL expression that stores @code{newval}
1035 in the register.  This may involve storing values in other registers as well.
1036 @code{expression} must be one of @code{set}, @code{if}, @code{sequence}, or
1037 @code{case}.
1038
1039 @code{index} is the name of the index as it appears in @code{expression}.
1040
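A scalar counterpart to the Thumb example shown earlier might look like
the following.  It is a sketch, not taken from any port: @samp{h-sp} is a
hypothetical name, and @samp{h-gr} is assumed to be the 16 element
@code{WI} register array of the M32R example.  Since the element is
@code{VIRTUAL}, both @code{get} and @code{set} are supplied.

@example
; Hypothetical scalar alias for general register 15.
(define-hardware
  (name h-sp)
  (comment "hypothetical alias for h-gr register 15")
  (attrs VIRTUAL)
  (type register WI)
  (get () (reg h-gr 15))                     ; scalar: no index argument
  (set (newval) (set (reg h-gr 15) newval))  ; scalar: newval only
)
@end example
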
1041 @subsection Predefined hardware elements
1042
1043 Several hardware types are predefined:
1044
1045 @table @code
1046 @item h-uint
1047 unsigned integer
1048 @item h-sint
1049 signed integer
1050 @item h-memory
1051 main memory, where ``main'' is loosely defined
1052 @item h-addr
1053 data address (data only)
1054 @item h-iaddr
1055 instruction address (instructions only)
1056 @end table
1057
1058 @subsection Program counter
1059
1060 The program counter must be defined and is not a builtin.
1061 If get/set specs are not required, define it as:
1062
1063 @example
1064 (dnh h-pc "program counter" (PC) (pc) () () ())
1065 @end example
1066
1067 If get/set specs are required, define it as:
1068
1069 @example
1070 (define-hardware
1071   (name h-pc)
1072   (comment "<ARCH> program counter")
1073   (attrs PC)
1074   (type pc)
1075   (get () <insert get code here>)
1076   (set (newval) <insert set code here>)
1077 )
1078 @end example
1079
1080 If the architecture has multiple instruction sets, all must be specified.
If they're not, the default is the first one, which is not what you want.
1082 Here's an example from @file{arm.cpu}:
1083
1084 @example
1085 (define-hardware
1086   (name h-pc)
1087   (comment "ARM program counter (h-gr reg 15)")
1088   (attrs PC (ISA arm,thumb))
1089   (type pc)
1090   (set (newval)
1091        (if (reg h-tbit)
1092            (set (raw-reg SI h-pc) (and newval -2))
1093            (set (raw-reg SI h-pc) (and newval -4))))
1094 )
1095 @end example
1096
1097 @subsection Simplification macros
1098
1099 To simplify @file{.cpu} files, the @code{dnh}
1100 (@code{define-normal-hardware}) macro exists that takes a fixed set of
1101 positional arguments for the typical hardware element.  The syntax of
1102 @code{dnh} is:
1103
1104 @code{(dnh name comment attributes type indices values handlers)}
1105
1106 Example:
1107
1108 @example
1109 (dnh h-gr "general registers"
1110      () ; attributes
1111      (register WI (16))
1112      (keyword "" ((fp 13) (sp 15) (lr 14)
1113                   (r0 0) (r1 1) (r2 2) (r3 3)
1114                   (r4 4) (r5 5) (r6 6) (r7 7)
1115                   (r8 8) (r9 9) (r10 10) (r11 11)
1116                   (r12 12) (r13 13) (r14 14) (r15 15)))
1117      () ()
1118 )
1119 @end example
1120
1121 This defines an array of 16 registers of mode @code{WI} ("word int").
1122 The names of the registers are @code{r0...r15}, and registers 13, 14 and
1123 15 also have the names @code{fp}, @code{lr} and @code{sp} respectively.
1124
1125 Scalar registers with no special requirements occur frequently.
Macro @code{dsh} (@code{define-simple-hardware}) is identical to
@code{dnh} except that it does not include the @code{indices}, @code{values},
or @code{handlers} specs.
1129
1130 @example
1131 (dsh h-ibit "interrupt enable bit" () (register BI))
1132 @end example
1133
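Going by the argument list of @code{dnh} given above, the @code{dsh}
example should be equivalent to the following @code{dnh} call, with the
three omitted specs left empty.  This equivalence is inferred from the
description here rather than taken from the macro's definition.

@example
; Assumed dnh equivalent of the dsh example.
(dnh h-ibit "interrupt enable bit" () (register BI) () () ())
@end example
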
1134 @node Instruction fields
1135 @section Instruction Fields
1136 @cindex Fields, instruction
1137
1138 Instruction fields define the raw bitfields of each instruction.
1139 Minimal semantic meaning is attributed to them.  Support is provided for
1140 mapping to and from the raw bit pattern and the usable contents, and
1141 other simple manipulations.
1142
1143 The syntax for defining instruction fields is:
1144
1145 @example
1146 (define-ifield
1147   (name field-name)
1148   (comment "description")
1149   (attrs attribute-list)
1150   (start starting-bit-number)
1151   (length number-of-bits)
1152   (follows ifield-name)
1153   (mode mode-name)
1154   (encode (value pc) (rtx to describe encoding))
1155   (decode (value pc) (rtx to describe decoding))
1156 )
1157 @end example
1158
1159 (*note: Whether to also provide a way to specify instruction formats is not yet
1160 clear.  Currently they are computed from the instructions, so there's no
current *need* to provide them.  However, providing the ability as an
1162 option may simplify other tools CGEN is used to generate.  This
1163 simplification would come in the form of giving known names to the formats
1164 which CPU reference manuals often do.  Pre-specified instruction formats
1165 may also simplify expression of more complicated instruction sets.)
1166
1167 (*note: Positional specification simplifies instruction description somewhat
1168 in that there is no required order of fields, and a disjunct set of fields can
1169 be referred to as one.  On the other hand it can require knowledge of the length
1170 of the instruction which is inappropriate in cases like the M32R where
1171 the main fields have the same name and "position" regardless of the length
1172 of the instruction.  Moving positional specification into instruction formats,
1173 whether machine generated or programmer specified, may be done.)
1174
1175 Convention requires @samp{field-name} begin with @samp{f-}.
1176
1177 @subsection attrs
1178
1179 There are several predefined instruction field attributes:
1180
1181 @table @code
1182 @item PCREL-ADDR
1183 The field contains a PC relative address.  Various CPUs have various
1184 offsets from the PC from which the address is calculated.  This is
1185 specified in the encode and decode sections.
1186
1187 @item ABS-ADDR
1188 The field contains an absolute address.
1189
1190 @item SIGN-OPT
The field has an optional sign.  It is sign-extended during
extraction.  Allowable values are -2^(n-1) to (2^n)-1, where n is the
length of the field in bits.  For example, an 8 bit field accepts
-128 through 255.
1193
1194 @item RESERVED
1195 The field is marked as ``reserved'' by the architecture.
1196 This is an informational attribute.  Tools may use it
1197 to validate programs, either statically or dynamically.
1198
1199 @item VIRTUAL
1200 The field does not directly contribute to the instruction's value.  This
is used to simplify semantic or assembler descriptions where a field's
value is based on other values.  Multi-ifields are always virtual.
1203 @end table
1204
1205 @subsection start
1206 The bit number of the field's most significant bit in the instruction.
1207 Bit numbering is determined by the @code{insn-lsb0?} field of
1208 @code{define-arch}.
1209
1210 @subsection length
1211 The number of bits in the field.  The field must be contiguous.  For
1212 non-contiguous instruction fields use "multi-ifields"
1213 (@pxref{Instruction fields}).
1214
1215 @subsection follows
1216 Optional.  Experimental.
1217 This should not be used for the specification of RISC-like architectures.
1218 It is an experiment in supporting CISC-like architectures.
1219 The argument is the name of the ifield or operand that immediately precedes
1220 this one.  In general the argument is an "anyof" operand.  The @code{follows}
1221 spec allows subsequent ifields to ``float''.
1222
1223 @subsection mode
1224 The mode the value is to be interpreted in.
1225 Usually this is @code{INT} or @code{UINT}.
1226
??? There's no real reason why modes like SI can't be used here.
1228 The @samp{length} field specifies the number of bits in the field,
1229 and the @samp{mode} field indicates the mode in which the value will be used,
1230 as well as its signedness.  This would allow removing INT/UINT for this
1231 purpose.  On the other hand, a non-width specific mode allows applications
1232 to choose one (a simulator might prefer to store immediates in an `int'
1233 rather than, say, char if the specified mode was @code{QI}).
1234
1235 @subsection encode
1236 An expression to apply to convert from usable values to raw field
1237 values.  The syntax is @code{(encode (value pc) expression)} or more
1238 specifically @code{(encode ((<mode1> value) (IAI pc)) <expression>)},
where @code{<mode1>} is the mode of the ``incoming'' value, and
1240 @code{<expression>} is an rtx to convert @code{value} to something that
1241 can be stored in the field.
1242
1243 Example:
1244
1245 @example
1246 (encode ((SF value) (IAI pc))
1247         (cond WI
1248               ((eq value (const SF 1.0)) (const 0))
1249               ((eq value (const SF 0.5)) (const 1))
1250               ((eq value (const SF -1.0)) (const 2))
1251               ((eq value (const SF 2.0)) (const 3))
1252               (else (error "invalid floating point value for field foo"))))
1253 @end example
1254
1255 In this example four floating point immediate values are represented in a
1256 field of two bits.  The above might be expanded to a series of `if' statements
1257 or the generator could determine a `switch' statement is more appropriate.
1258
1259 @subsection decode
1260
1261 An expression to apply to convert from raw field values to usable
1262 values.  The syntax is @code{(decode (value pc) expression)} or more
1263 specifically @code{(decode ((WI value) (IAI pc)) <expression>)}, where
1264 @code{<expression>} is an rtx to convert @code{value} to something
1265 usable.
1266
1267 Example:
1268
1269 @example
1270 (decode ((WI value) (IAI pc))
1271         (cond SF
1272               ((eq value 0) (const SF 1.0))
1273               ((eq value 1) (const SF 0.5))
1274               ((eq value 2) (const SF -1.0))
1275               ((eq value 3) (const SF 2.0))))
1276 @end example
1277
1278 There's no need to provide an error case as presumably @code{value}
1279 would never have an invalid value, though certainly one could provide an
1280 error case if one wanted to.
1281
1282 @subsection Non-contiguous fields
1283 @cindex Fields, non-contiguous
1284
1285 Non-contiguous fields (e.g. sparc64's 16 bit displacement field) are
1286 built on top of support for contiguous fields.  The syntax for defining
1287 such fields is:
1288
1289 @example
1290 (define-multi-ifield
1291   (name field-name)
1292   (comment "description")
1293   (attrs attribute-list)
1294   (mode mode-name)
1295   (subfields field1-name field2-name ...)
1296   (insert (code to set each subfield))
1297   (extract (code to set field from subfields))
1298 )
1299 @end example
1300
1301 (*note: insert/extract are analogous to encode/decode so maybe these
1302 fields are misnamed.  The operations are subtly different though.)
1303
1304 Example:
1305
1306 @example
1307 (define-multi-ifield
1308   (name f-i20)
1309   (comment "20 bit unsigned")
1310   (attrs)
1311   (mode UINT)
1312   (subfields f-i20-4 f-i20-16)
1313   (insert (sequence ()
1314                     (set (ifield f-i20-4)  (srl (ifield f-i20) (const 16)))
1315                     (set (ifield f-i20-16) (and (ifield f-i20) (const #xffff)))
1316                     ))
1317   (extract (sequence ()
1318                      (set (ifield f-i20) (or (sll (ifield f-i20-4) (const 16))
1319                                              (ifield f-i20-16)))
1320                      ))
1321 )
1322 @end example
1323
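To make the two directions concrete, here is the arithmetic for one
arbitrarily chosen value, @code{#xabcde}:

@example
insert:   f-i20-4  = #xabcde srl 16      = #xa
          f-i20-16 = #xabcde and #xffff  = #xbcde
extract:  f-i20    = (#xa sll 16) or #xbcde
                   = #xa0000 or #xbcde   = #xabcde
@end example
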
1324 @subsection subfields
1325 The names of the already defined fields that make up the multi-ifield.
1326
1327 @subsection insert
1328 Code to set the subfields from the multi-ifield. All fields are referred
1329 to with @code{(ifield <name>)}.
1330
1331 @subsection extract
1332 Code to set the multi-ifield from the subfields. All fields are referred
1333 to with @code{(ifield <name>)}.
1334
1335 @subsection Simplification macros
1336 To simplify @file{.cpu} files, the @code{dnf}, @code{df} and @code{dnmf}
1337 macros have been created. Each takes a fixed set of positional arguments
1338 for the typical instruction field.  @code{dnf} is short for
1339 @code{define-normal-field}, @code{df} is short for @code{define-field},
1340 and @code{dnmf} is short for @code{define-normal-multi-ifield}.
1341
1342 The syntax of @code{dnf} is:
1343
1344 @code{(dnf name comment attributes start length)}
1345
1346 Example:
1347
1348 @code{(dnf f-r1 "register r1" () 4 4)}
1349
1350 This defines a field called @samp{f-r1} that is an unsigned field of 4
1351 bits beginning at bit 4.  All fields defined with @code{dnf} are unsigned.
1352
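Written out in full, the @code{dnf} example would presumably correspond
to the @code{define-ifield} form below.  The @code{UINT} mode is an
assumption, based on the statement that @code{dnf} fields are unsigned.

@example
; Assumed long form of (dnf f-r1 "register r1" () 4 4).
(define-ifield
  (name f-r1)
  (comment "register r1")
  (attrs)
  (start 4)
  (length 4)
  (mode UINT)
)
@end example
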
1353 The syntax of @code{df} is:
1354
1355 @code{(df name comment attributes start length mode encode decode)}
1356
1357 Example:
1358
1359 @example
1360 (df f-disp8
1361     "disp8, slot unknown" (PCREL-ADDR)
1362     8 8 INT
1363     ((value pc) (sra WI (sub WI value (and WI pc (const -4))) (const 2)))
1364     ((value pc) (add WI (sll WI value (const 2)) (and WI pc (const -4)))))
1365 @end example
1366
1367 This defines a field called @samp{f-disp8} that is a signed PC-relative
1368 address beginning at bit 8 of size 8 bits that is left shifted by 2.
1369
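As a worked example with made-up addresses, take a @code{pc} of
@code{#x1002} and a branch target of @code{#x1040}:

@example
encode:  #x1002 and -4      = #x1000
         #x1040 sub #x1000  = #x40
         #x40 sra 2         = #x10      ; raw value stored in the field
decode:  #x10 sll 2         = #x40
         #x40 add #x1000    = #x1040    ; the original target
@end example
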
1370 The syntax of @code{dnmf} is:
1371
1372 @code{(dnmf name comment attributes mode subfields insert extract)}
1373
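Using this syntax, the @samp{f-i20} multi-ifield shown earlier might be
written as below.  This is a sketch: it assumes the subfield names are
passed as a parenthesized list and that @samp{f-i20-4} and
@samp{f-i20-16} have already been defined.

@example
; Assumed dnmf form of the f-i20 define-multi-ifield example.
(dnmf f-i20 "20 bit unsigned" () UINT
      (f-i20-4 f-i20-16)
      (sequence ()
                (set (ifield f-i20-4)  (srl (ifield f-i20) (const 16)))
                (set (ifield f-i20-16) (and (ifield f-i20) (const #xffff))))
      (sequence ()
                (set (ifield f-i20) (or (sll (ifield f-i20-4) (const 16))
                                        (ifield f-i20-16))))
)
@end example
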
1374 @node Enumerated constants
1375 @section Enumerated constants
1376 @cindex Enumerated constants
1377 @cindex Enumerations
1378
1379 Enumerated constants (@emph{enums}) are important enough in instruction
1380 set descriptions that they are given special treatment. Enums are
1381 defined with:
1382
1383 @example
1384 (define-enum
1385   (name enum-name)
1386   (comment "description")
1387   (attrs attribute-list)
1388   (prefix prefix)
1389   (values val1 val2 ...)
1390 )
1391 @end example
1392
1393 Enums in opcode fields are further enhanced by specifying the opcode
1394 field they are used in.  This allows the enum's name to be specified
1395 in an instruction's @code{format} entry.
1396
1397 @example
1398 (define-insn-enum
1399   (name enum-name)
1400   (comment "description")
1401   (attrs (attribute list))
1402   (prefix prefix)
1403   (ifield instruction-field-name)
1404   (values val1 val2 ...)
1405 )
1406 @end example
1407
1408 (*note: @code{define-insn-enum} isn't implemented yet: use
1409 @code{define-normal-insn-enum})
1410
1411 Example:
1412
1413 @example
1414 (define-insn-enum
1415   (name insn-op1)
1416   (comment "op1 field values")
1417   (prefix OP1_)
1418   (ifield f-op1)
1419   (values "0" "1" "2" "3" "4" "5" "6" "7"
1420           "8" "9" "10" "11" "12" "13" "14" "15")
1421 )
1422 @end example
1423
1424 @subsection prefix
1425 Convention requires each enum value to be prefixed with the same text.
1426 Rather than specifying the prefix in each entry, it is specified once, here.
1427 Convention requires @samp{prefix} not contain any lowercase characters.
1428
1429 @subsection ifield
1430 The name of the instruction field that the enum is intended for.
1431
1432 @subsection values
1433 A list of possible values.  Each element has one of the following forms:
1434
1435 @itemize @bullet
1436 @item @code{name}
1437 @item @code{(name)}
1438 @item @code{(name value)}
1439 @item @code{(name - (attribute-list))}
1440 @item @code{(name value (attribute-list))}
1441 @end itemize
1442
1443 The syntax for numbers is Scheme's, so hex numbers are @code{#xnnnn}.
1444 A value of @code{-} means use the next value (previous value plus 1).
1445
1446 Example:
1447
1448 @example
1449 (values "a" ("b") ("c" #x12)
1450         ("d" - (sanitize foo)) ("e" #x1234 (sanitize bar)))
1451 @end example
1452
1453 @subsection Simplification macros
1454
1455 @code{(define-normal-enum name comment attrs prefix vals)}
1456
1457 @code{(define-normal-insn-enum name comment attrs prefix ifield vals)}
1458
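With the positional form, the @samp{insn-op1} example above might be
written as follows.  The sketch assumes that @samp{vals} is passed as a
single parenthesized list and that an empty attribute list is written
@code{()}.

@example
; Assumed positional form of the insn-op1 example.
(define-normal-insn-enum insn-op1 "op1 field values" () OP1_ f-op1
  ("0" "1" "2" "3" "4" "5" "6" "7"
   "8" "9" "10" "11" "12" "13" "14" "15")
)
@end example
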
1459 @node Instruction operands
1460 @section Instruction Operands
1461 @cindex Operands, instruction
1462
1463 Instruction operands provide:
1464
1465 @itemize @bullet
1466 @item a layer between the assembler and the raw hardware description
1467 @item the main means of manipulating instruction fields in the semantic code
1468 @c More?
1469 @end itemize
1470
1471 The syntax is:
1472
1473 @example
1474 (define-operand
1475   (name operand-name)
1476   (comment "description")
1477   (attrs attribute-list)
1478   (type hardware-element)
1479   (index instruction-field)
1480   (asm asm-spec)
1481 )
1482 @end example
1483
1484 @subsection name
1485
1486 This is the name of the operand as a Scheme symbol.
1487 The name choice is fairly important as it is used in instruction
1488 syntax entries, instruction format entries, and semantic expressions.
1489 It can't collide with symbols used in semantic expressions
1490 (e.g. @code{and}, @code{set}, etc).
1491
1492 The convention is that operands have no prefix (whereas ifields begin
1493 with @samp{f-} and hardware elements begin with @samp{h-}).  A prefix
1494 like @samp{o-} would avoid collisions with other semantic elements, but
1495 operands are used often enough that any prefix is a hassle.
1496
1497 @subsection attrs
1498
1499 A list of attributes. In addition to attributes defined for the operand,
1500 an operand inherits the attributes of its instruction field. There are
1501 several predefined operand attributes:
1502
1503 @table @code
1504 @item NEGATIVE
The operand contains negative values (not used yet so definition is
still nebulous).
1507
1508 @item RELAX
1509 This operand contains the changeable field (usually a branch address) of
1510 a relaxable instruction.
1511
1512 @item SEM-ONLY
1513 Use the SEM-ONLY attribute for cases where the operand will only be used
1514 in semantic specification, and not assembly code specification.  A
1515 typical example is condition codes.
1516 @end table
1517
1518 To refer to a hardware element in semantic code one must either use an
1519 operand or one of reg/mem/const.  Operands generally exist to map
1520 instruction fields to the selected hardware element and are easier to
1521 use in semantic code than referring to the hardware element directly
1522 (e.g. @code{sr} is easier to type and read than @code{(reg h-gr
1523 <index>)}). Example:
1524
1525 @example
1526   (dnop condbit "condition bit" (SEM-ONLY) h-cond f-nil)
1527 @end example
1528
@code{f-nil} is the value to use when there is no instruction field.
1530
1531 @c There might be some language cleanup to be done here regarding f-nil.
1532 @c It is kind of extraneous.
1533
1534 @subsection type
1535 The hardware element this operand applies to. This must be the name of a
1536 hardware element.
1537
1538 @subsection index
1539 The index of the hardware element. This is used to mate the hardware
1540 element with the instruction field that selects it, and must be the name
1541 of an ifield entry. (*note: The index may be other things besides
1542 ifields in the future.)
1543
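Putting @code{name}, @code{type} and @code{index} together, a register
operand could be sketched as below.  @samp{sr} and @samp{h-gr} are the
names used earlier in this chapter.  @samp{f-r2} is a hypothetical
instruction field assumed to be defined elsewhere.

@example
; Hypothetical operand: f-r2 selects one element of h-gr.
(define-operand
  (name sr)
  (comment "source register")
  (attrs)
  (type h-gr)
  (index f-r2)
)
@end example
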
1544 @subsection asm
1545 Sometimes it's necessary to escape to C to parse assembler, or print
1546 a value.  This field is an escape hatch to implement this.
1547 The current syntax is:
1548
1549 @code{(asm asm-spec)}
1550
1551 where @code{asm-spec} is one or more of:
1552
1553 @code{(parse "function_suffix")} -- a call to function
1554 @code{parse_<function_suffix>} is generated.
1555
1556 @code{(print "function_suffix")} -- a call to function
1557 @code{print_<function_suffix>} is generated.
1558
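Following the syntax just described, an operand with both handlers might
look like this sketch.  The names @samp{uimm8} and @samp{f-uimm8} are
hypothetical.  The generated code would then call @code{parse_uimm8} and
@code{print_uimm8}.

@example
; Hypothetical operand with user supplied parse and print routines.
(define-operand
  (name uimm8)
  (comment "hypothetical 8 bit unsigned immediate")
  (attrs)
  (type h-uint)
  (index f-uimm8)
  (asm (parse "uimm8") (print "uimm8"))
)
@end example
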
1559 These functions are intended to be provided in a separate @file{.opc}
1560 file.  The prototype of a parse function depends on the hardware type.
1561 See @file{cgen/*.opc} for examples.
1562
1563 @c FIXME: The following needs review.
1564
For integers it is:
1566
1567 @example
1568 static const char *
1569 parse_foo (CGEN_CPU_DESC cd,
1570            const char **strp,
1571            int opindex,
1572            unsigned long *valuep);
1573 @end example
1574
1575 @code{cd} is the result of @code{<arch>_cgen_cpu_open}.
1576 @code{strp} is a pointer to a pointer to the assembler and is updated by
1577 the function.
1578 @c FIXME
1579 @code{opindex} is ???.
1580 @code{valuep} is a pointer to where to record the parsed value.
1581 @c FIXME
1582 If a relocation is needed, it is queued with a call to ???. Queued
1583 relocations are processed after the instruction has been parsed.
1584
1585 The result is an error message or NULL if successful.
1586
1587 The prototype of a print function depends on the hardware type.  See
1588 @file{cgen/*.opc} for examples. For integers it is:
1589
1590 @example
1591 void print_foo (CGEN_CPU_DESC cd,
1592                 PTR dis_info,
1593                 long value,
1594                 unsigned int attrs,
1595                 bfd_vma pc,
1596                 int length);
1597 @end example
1598
1599 @samp{cd} is the result of @code{<arch>_cgen_cpu_open}.
@samp{dis_info} is the `info' argument to @code{print_insn_<arch>}.
1601 @samp{value} is the value to be printed.
1602 @samp{attrs} is the set of boolean attributes.
1603 @samp{pc} is the PC value of the instruction.
1604 @samp{length} is the length of the instruction.
1605
1606 Actual printing is done by calling @code{((disassemble_info *)
1607 dis_info)->fprintf_func}.
1608
1609 @node Derived operands
1610 @section Derived Operands
1611 @cindex Derived operands
1612 @cindex Operands, instruction
1613 @cindex Operands, derived
1614
1615 Derived operands are an experiment in supporting the addressing modes of
1616 CISC-like architectures.  Addressing modes are difficult to support as
1617 they essentially increase the number of instructions in the architecture
1618 by an order of magnitude.  Defining all the variants requires something
1619 in addition to the RISC-like architecture support.  The theory is that
1620 since CISC-like instructions are basically "normal" instructions with
1621 complex operands the place to add the necessary support is in the
1622 operands.
1623
1624 Two kinds of operands exist to support CISC-like cpus, and they work
1625 together.  "derived-operands" describe one variant of a complex
1626 argument, and "anyof" operands group them together.
1627
1628 The syntax for defining derived operands is:
1629
1630 @example
1631 (define-derived-operand
1632   (name operand-name)
1633   (comment "description")
1634   (attrs attribute-list)
1635   (mode mode-name)
1636   (args arg1-operand-name arg2-operand-name ...)
1637   (syntax "syntax")
1638   (base-ifield ifield-name)
1639   (encoding (+ arg1-operand-name arg2-operand-name ...))
1640   (ifield-assertion expression)
1641   (getter expression)
1642   (setter expression)
1643 )
1644 @end example
1645
1646 @cindex anyof operands
1647 @cindex Operands, anyof
1648
1649 The syntax for defining anyof operands is:
1650
1651 @example
1652 (define-anyof-operand
1653   (name operand-name)
1654   (comment "description")
1655   (attrs attribute-list)
1656   (mode mode-name)
1657   (base-ifield ifield-name)
1658   (choices derived-operand1-name derived-operand2-name ...)
1659 )
1660 @end example
1661
1662 @subsection mode
1663
1664 The name of the mode of the operand.
1665
1666 @subsection args
1667
1668 List of names of operands the derived operand uses.
1669 The operands must already be defined.
1670 The argument operands can be any kind of operand: normal, derived, anyof.
1671
1672 @subsection syntax
1673
1674 Assembler syntax of the operand.
1675
1676 ??? This part needs more work.  Addressing mode specification in assembler
1677 needn't be localized to the vicinity of the operand.
1678
1679 @subsection base-ifield
1680
1681 The name of the instruction field common to all related derived operands.
1682 Here related means "used by the same `anyof' operand".
1683
1684 @subsection encoding
1685
1686 The machine encoding of the operand.
1687
1688 @subsection ifield-assertion
1689
1690 An assertion of what values any instruction fields will or will not have
1691 in the containing instruction.
1692
1693 ??? A better name for this might be "constraint".
1694
1695 @subsection getter
1696
1697 RTL expression to get the value of the operand.
All operands referred to must be specified in @code{args}.
1699
1700 @subsection setter
1701
1702 RTL expression to set the value of the operand.
All operands referred to must be specified in @code{args}.
1704 Use @code{newval} to refer to the value to be set.
1705
1706 @subsection choices
1707
1708 For anyof operands, the names of the derived operands.
1709 The operand may be "any of" the specified choices.
1710
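An anyof operand grouping two addressing modes might be sketched as
follows.  Every name here is hypothetical: @samp{ea-reg-direct} and
@samp{ea-reg-indirect} stand for derived operands assumed to be defined
already, both sharing the base ifield @samp{f-ea-mode}.

@example
; Hypothetical anyof operand built from the syntax given above.
(define-anyof-operand
  (name src-ea)
  (comment "hypothetical source effective address")
  (attrs)
  (mode SI)
  (base-ifield f-ea-mode)
  (choices ea-reg-direct ea-reg-indirect)
)
@end example
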
1711 @node Instructions
1712 @section Instructions
1713 @cindex Instructions
1714
1715 Each instruction in the instruction set has an entry in the description
1716 file.  For complicated instruction sets this is a lot of typing.  However,
macros can reduce a lot of that typing.  The real question is, given the
amount of information that must be expressed, how succinctly can one express
1719 it and still be clean and usable?  I'm open to opinions on how to improve
1720 this, but such improvements must take everything CGEN wishes to be into
1721 account.
1722 (*note: Of course no claim is made that the current design is the
1723 be-all and end-all or that there is one be-all and end-all.)
1724
1725 The syntax for defining an instruction is:
1726
1727 @example
1728 (define-insn
1729   (name insn-name)
1730   (comment "description")
1731   (attrs attribute-list)
1732   (syntax "assembler syntax")
1733   (format (+ field-list))
1734   (semantics (semantic-expression))
1735   (timing timing-data)
1736 )
1737 @end example
1738
1739 Instructions specific to a particular cpu variant are denoted as such with
1740 the MACH attribute.
1741
1742 Possible additions for the future:
1743
1744 @itemize @bullet
1745 @item a field to describe a final constraint for determining a match
1746 @item choosing the output from a set of choices
1747 @end itemize
1748
1749 @subsection attrs
1750
1751 A list of attributes, for which there are several predefined instruction
1752 attributes:
1753
1754 @table @code
1755 @item MACH
A bitset attribute used to specify which machines have this
instruction.  Do not specify the MACH attribute if the instruction is
present on all machines.
1759
1760 Usage: @code{(MACH mach1,mach2,...)}
1761
1762 There must be no spaces in ``@code{mach1,mach2,...}''.
1763
1764 @item UNCOND-CTI
1765 The instruction is an unconditional ``control transfer instruction''.
1766
1767 (*note: This attribute is derived from the semantic code. However if the
1768 computed value is wrong (dunno if it ever will be) the value can be
1769 overridden by explicitly mentioning it.)
1770
1771 @item COND-CTI
The instruction is a conditional ``control transfer instruction''.
1773
1774 (*note: This attribute is derived from the semantic code. However if the
1775 computed value is wrong (dunno if it ever will be) the value can be
1776 overridden by explicitly mentioning it.)
1777
1778 @item SKIP-CTI
1779 The instruction can cause one or more insns to be skipped. This is
1780 derived from the semantic code.
1781
1782 @item DELAY-SLOT
1783 The instruction has one or more delay slots. This is derived from the
1784 semantic code.
1785
1786 @item RELAXABLE
1787 The instruction has one or more identical variants.  The assembler tries
this one first and then the relaxation phase switches to larger ones as
1789 necessary.
1790
1791 @item RELAX
1792 The instruction is a non-minimal variant of a relaxable instruction.  It
1793 is avoided by the assembler in the first pass.
1794
1795 @item ALIAS
1796 Internal attribute set for macro-instructions that are an alias for one
1797 real insn.
1798
1799 @item NO-DIS
1800 For macro-instructions, don't use during disassembly.
1801 @end table
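
As a rough sketch of how these attributes are written, an attribute list
can mix bare boolean attribute names with @code{(name value)} pairs.
The machine names @code{mach1} and @code{mach2} are placeholders taken
from the usage line above, not machines defined by any real port, and
writing boolean attributes by bare name is an assumption.

@example
(attrs (MACH mach1,mach2) RELAXABLE)
@end example

Read this way, the instruction exists only on the two named machines and
is the variant of a relaxable instruction that the assembler tries first.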
1802
1803 @subsection syntax
1804
1805 This is a character string consisting of raw characters and operands.
1806 Fields are denoted by @code{$operand} or
1807 @code{$@{operand@}}@footnote{Support for @code{$@{operand@}} is
1808 work-in-progress.}.  If a @samp{$} is required in the syntax, it is
1809 specified with @samp{\$}.  At most one white-space character may be
1810 present and it must be a blank separating the instruction mnemonic from
the operands.  This doesn't restrict the user's assembler; this is
1812 @c Is this reasonable?
1813 just a description file restriction to separate the mnemonic from the
1814 operands@footnote{The restriction can be relaxed by saying the first
1815 blank is the one that separates the mnemonic from its operands.}.
1816 The assembly language accepted by the generated assembler does not
1817 have to take exactly the same form as the syntax described in this
1818 field--additional whitespace may be present in the input file.
1819
1820 Operands can refer to registers, constants, and whatever else is necessary.
1821
1822 Instruction mnemonics can take operands.  For example, on the SPARC a
1823 branch instruction can take @code{,a} as an argument to indicate the
1824 instruction is being annulled (e.g. @code{bge$a $disp22}).
1825
1826 @subsection format
1827
1828 This is a complete list of fields that specify the instruction.  At
1829 present it must be prefaced with @code{+} to allow for future additions.
Reserved bits must also be specified; gaps are not allowed.
1831 @c Well, actually I think they are and it could certainly be allowed.
1832 @c Question: should they be allowed?
1833 The ordering of the fields is not important.
1834
1835 Format elements can be any of:
1836
1837 @itemize @bullet
1838 @item instruction field specifiers with a value (e.g. @code{(f-r1 14)})
1839 @item an instruction field enum, as in @code{OP1_4}
1840 @item an operand
1841 @end itemize
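
For illustration, the three kinds of format element might be combined as
follows.  @code{OP1_4} and @code{(f-r1 14)} come from the list above; the
operands @code{dr} and @code{simm8} and the field @code{f-r2} are assumed
to be defined elsewhere in the description file, and each list is assumed
to cover every bit of the instruction.

@example
;; an instruction field enum followed by two operands
(format (+ OP1_4 dr simm8))

;; an instruction field enum and two fields with fixed values
(format (+ OP1_4 (f-r1 14) (f-r2 0)))
@end example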
1842
1843 @subsection semantics
1844 @cindex Semantics
1845
1846 This field provides a mathematical description of what the instruction
1847 does.  Its syntax is GCC RTL-like on purpose since GCC's RTL is well
1848 known by the intended audience.  However, it is not intended that it be
1849 precisely GCC RTL.
1850
1851 Obviously there are some instructions that are difficult if not
1852 impossible to provide a description for (e.g. I/O instructions).  Rather
1853 than create a new semantic function for each quirky operation, escape
1854 hatches to C are provided to handle all such cases.  The @code{c-code},
1855 @code{c-call} and @code{c-raw-call} semantic functions provide an
1856 escape-hatch to invoke C code to perform the
1857 operation. @xref{Expressions}.
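
A minimal sketch of the escape hatch in use is shown below.  The function
name @code{do_trap} and the operand @code{uimm4} are hypothetical, and
writing the function name as a string is an assumption about how the
symbol is spelled in a description file.

@example
;; hand the whole operation to a C function (hypothetical names)
(semantics (c-call VOID "do_trap" uimm4))
@end example

Following the description of @code{c-call}, the generated code would call
@code{do_trap} with @code{current_cpu} as an implicit first argument,
followed by the value of @code{uimm4}.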
1858
1859 @subsection timing
1860 @cindex Timing
1861
1862 A list of entries for each function unit the instruction uses on each machine
1863 that supports the instruction.  The default function unit is the u-exec unit.
1864
1865 The syntax is:
1866
1867 @example
1868 (mach-name (unit name (unit-var-name1 insn-operand-name1)
1869                       (unit-var-name2 insn-operand-name2)
1870                       ...
                      (cycles cycle-count)))
1872 @end example
1873
1874 unit-var-name/insn-operand-name mappings are optional.
1875 They map unit inputs/outputs to semantic elements.
1876
1877 @code{cycles} overrides the @code{done} value (latency) of the function
1878 unit and is optional.
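
As a sketch, timing data for an instruction supported on two machines
might look like this.  The machine names @code{mach1} and @code{mach2}
are placeholders, and the second entry assumes the instruction takes two
cycles on that machine.

@example
((mach1 (unit u-exec))
 (mach2 (unit u-exec (cycles 2))))
@end example

The first entry uses the unit's own @code{done} value; the second
overrides it with @code{cycles}.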
1879
1880 @subsection Simplification macros
1881
1882 To simplify @file{.cpu} files, the @code{dni} macro has been created.
1883 It takes a fixed set of positional arguments for the typical instruction
1884 field.  @code{dni} is short for @code{define-normal-insn}.
1885
1886 The syntax of @code{dni} is:
1887
1888 @code{(dni name comment attrs syntax format semantics timing)}
1889
1890 Example:
1891
1892 @example
1893 (dni addi "add 8 bit signed immediate"
1894      ()
1895      "addi $dr,$simm8"
1896      (+ OP1_4 dr simm8)
1897      (set dr (add dr simm8))
1898      ()
1899 )
1900 @end example
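
For comparison, the same instruction could be spelled out with
@code{define-insn}.  This is a sketch that assumes @code{dni} simply maps
its positional arguments onto the named fields in order; @code{attrs} and
@code{timing} are left out on the assumption that empty values may be
omitted.

@example
(define-insn
  (name addi)
  (comment "add 8 bit signed immediate")
  (syntax "addi $dr,$simm8")
  (format (+ OP1_4 dr simm8))
  (semantics (set dr (add dr simm8)))
)
@end example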
1901
1902 @node Macro-instructions
1903 @section Macro-instructions
1904 @cindex Macro-instructions
1905 @cindex Instructions, macro
1906
1907 Macro-instructions are for the assembler side of things and are not used
1908 by the simulator. The syntax for defining a macro-instruction is:
1909
1910 @example
1911 (define-macro-insn
1912   (name macro-insn-name)
1913   (comment "description")
1914   (attrs attribute-list)
1915   (syntax "assembler syntax")
1916   (expansions expansion-spec)
1917 )
1918 @end example
1919
1920 @subsection syntax
1921
Syntax of the macro-instruction. This has the same form as the
1923 @code{syntax} field in @code{define-insn}.
1924
1925 @subsection expansions
1926
1927 An expression to emit code for the instruction.  This is intended to be
1928 general in nature, allowing tests to be done at runtime that choose the
1929 form of the expansion.  Currently the only supported form is:
1930
1931 @code{(emit insn arg1 arg2 ...)}
1932
1933 where @code{insn} is the name of an instruction defined with
1934 @code{define-insn} and @emph{argn} is the set of operands to
1935 @code{insn}'s syntax.  Each argument is mapped in order to one operand
1936 in @code{insn}'s syntax and may be any of:
1937
1938 @itemize @bullet
1939 @item operand specified in @code{syntax}
1940 @item @code{(operand value)}
1941 @end itemize
1942
1943 Example:
1944
1945 @example
1946 (dni st-minus "st-" ()
     "st $src1,@@-$src2"
1948      (+ OP1_2 OP2_7 src1 src2)
1949      (sequence ((WI new-src2))
1950                (set new-src2 (sub src2 (const 4)))
1951                (set (mem WI new-src2) src1)
1952                (set src2 new-src2))
1953      ()
1954 )
1955 @end example
1956
1957 @example
1958 (dnmi push "push" ()
1959   "push $src1"
  (emit st-minus src1 (src2 15)) ; "st %0,@@-sp"
1961 )
1962 @end example
1963
1964 In this example, the @code{st-minus} instruction is a general
1965 store-and-decrement instruction and @code{push} is a specialized version
1966 of it that uses the stack pointer.
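
The @code{push} macro-instruction could also be written in the long form
given at the start of this section.  This sketch assumes @code{dnmi} is
the positional shorthand for @code{define-macro-insn}, taking name,
comment, attrs, syntax and expansions in that order, and that an empty
@code{attrs} field may be omitted.

@example
(define-macro-insn
  (name push)
  (comment "push")
  (syntax "push $src1")
  (expansions (emit st-minus src1 (src2 15)))
)
@end example

Here @code{src1} is passed through from the macro's own syntax, while
@code{(src2 15)} fixes the second operand to the value 15.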
1967
1968 @node Modes
1969 @section Modes
1970 @cindex Modes
1971
1972 Modes provide a simple and succinct way of specifying data types.
1973
(*note: Should more complex types be needed (e.g. structs? unions?),
1975 these can be handled by extending the definition of a mode to encompass them.)
1976
1977 Modes are similar to their usage in GCC, but there are some differences:
1978
1979 @itemize @bullet
1980 @item modes for boolean values (i.e. bits) are also supported as they are
1981 useful
1982 @item integer modes exist in signed and unsigned versions
1983 @item constants have modes
1984 @end itemize
1985
1986 Currently supported modes are:
1987
1988 @table @code
1989 @item VOID
1990 VOIDmode in GCC.
1991
1992 @item DFLT
1993 Indicate the default mode is wanted, the value of which depends on context.
1994 This is a pseudo-mode and never appears in generated code.
1995
1996 @item BI
1997 Boolean zero/one
1998
1999 @item QI,HI,SI,DI
2000 Same as GCC.
2001
2002 QI is an 8 bit quantity ("quarter int").
2003 HI is a 16 bit quantity ("half int").
2004 SI is a 32 bit quantity ("single int").
2005 DI is a 64 bit quantity ("double int").
2006
2007 In cases where signedness matters, these modes are signed.
2008
2009 @item UQI,UHI,USI,UDI
2010 Unsigned versions of QI,HI,SI,DI.
2011
2012 These modes do not appear in semantic RTL.  Instead, the RTL function
2013 specifies the signedness of its operands where necessary.
2014
2015 ??? I'm not entirely sure these unsigned modes are needed.
2016 They are useful in removing any ambiguity in how to sign extend constants
2017 which has been a source of problems in GCC.
2018
2019 ??? Some existing ports use these modes.
2020
2021 @item WI,UWI
word int, unsigned word int (word_mode in GCC).
2023 These are aliases for the real mode, typically either @code{SI} or @code{DI}.
2024
2025 @item SF,DF,XF,TF
2026 Same as GCC.
2027
2028 SF is a 32 bit IEEE float ("single float").
2029 DF is a 64 bit IEEE float ("double float").
2030 XF is either an 80 or 96 bit IEEE float ("extended float").
2031 (*note: XF values on m68k and i386 are different so may
2032 wish to give them different names).
2033 TF is a 128 bit IEEE float ("??? float").
2034
2035 @item AI
2036 Address integer
2037
2038 @item IAI
2039 Instruction address integer
2040
2041 @item INT,UINT
2042 Varying width int/unsigned-int.  The width is specified by context,
2043 usually in an instruction field definition.
2044
2045 @end table
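
A few hedged examples of modes in use follow, written with the semantic
functions described under Expressions (@pxref{Expressions}).  The operand
name @code{addr} is hypothetical.

@example
(const SI 4)               ; the constant 4 as a 32 bit quantity
(mem QI addr)              ; an 8 bit memory reference
(ext SI (mem QI addr))     ; that byte sign-extended to 32 bits
(zext SI (mem QI addr))    ; that byte zero-extended to 32 bits
@end example

Note that signedness is chosen by the function (@code{ext} versus
@code{zext}) rather than by an unsigned mode.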
2046
2047 @node Expressions
2048 @section Expressions
2049 @cindex Expressions
2050
2051 The syntax of CGEN's RTL expressions (or @emph{rtx}) basically follows that of
2052 GCC's RTL.
2053
2054 The handling of modes is different to simplify the implementation.
2055 Implementation shouldn't necessarily drive design, but it was a useful
2056 simplification.  Still, it needs to be reviewed.  The difference is that
2057 in GCC @code{(function:MODE arg1 ...)} is written in CGEN as
2058 @code{(function MODE arg1 ...)}.  Note the space after @samp{function}.
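
As a sketch of the difference, here is an addition of a register and a
constant in both notations.  The hardware element name @code{h-gr} is an
assumption, standing for whatever register array the port defines.

@example
;; GCC RTL
(plus:SI (reg:SI 1) (const_int 4))

;; roughly corresponding CGEN RTL
(add SI (reg SI h-gr 1) (const SI 4))
@end example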
2059
2060 GCC RTL allows flags to be recorded with RTL (e.g. MEM_VOLATILE_P).
2061 This is supported in CGEN RTL by prefixing each RTL function's arguments
2062 with an optional list of modifiers:
2063 @code{(function (:mod1 :mod2) MODE arg1 ...)}.
2064 The list is a set of modifier names prefixed with ':'.  They can take
2065 arguments.
2066 ??? Modifiers are supported by the RTL traversing code, but no use is
2067 made of them yet.
2068
2069 The currently defined semantic functions are:
2070
2071 @table @code
2072 @item (set mode destination source)
Assign @samp{source} to @samp{destination} referenced in mode @samp{mode}.
2074
2075 @item (set-quiet mode destination source)
2076 Assign @samp{source} to @samp{destination} referenced in mode
2077 @samp{mode}, but do not print any tracing message.
2078
2079 @item (reg mode hw-name [index])
2080 Return an `operand' of hardware element @samp{hw-name} in mode @samp{mode}.
2081 If @samp{hw-name} is an array, @samp{index} selects which register.
2082
2083 @item (raw-reg mode hw-name [index])
2084 Return an `operand' of hardware element @samp{hw-name} in mode @samp{mode},
2085 bypassing any @code{get} or @code{set} specs of the register.
2086 If @samp{hw-name} is an array, @samp{index} selects which register.
2087 This cannot be used with virtual registers (those specified with the
2088 @samp{VIRTUAL} attribute).
2089
@code{raw-reg} is most often used in @code{get} and @code{set} specs
of a register: without it, read and write operations would recurse
infinitely.
2093
2094 @item (mem mode address)
2095 Return an `operand' of memory referenced at @samp{address} in mode
2096 @samp{mode}.
2097
2098 @item (const mode value)
2099 Return an `operand' of constant @samp{value} in mode @samp{mode}.
2100
2101 @item (enum mode value-name)
2102 Return an `operand' of constant @samp{value-name} in mode @samp{mode}.
2103 The value must be from a previously defined enum.
2104
2105 @item (subword mode value word-num)
2106 Return part of @samp{value}.  Which part is determined by @samp{mode} and
2107 @samp{word-num}.  There are three cases.
2108
2109 If @samp{mode} is the same size as the mode of @samp{value}, @samp{word-num}
2110 must be @samp{0} and the result is @samp{value} recast in the new mode.
2111 There is no change in the bits of @samp{value}, they're just interpreted in a
2112 possibly different mode.  This is most often used to interpret an integer
2113 value as a float and vice versa.
2114
2115 If @samp{mode} is smaller, @samp{value} is divided into N pieces and
2116 @samp{word-num} picks which piece.  All pieces have the size of @samp{mode}
2117 except possibly the last.  If the last piece has a different size,
2118 it cannot be referenced.
This follows GCC and is byte order dependent@footnote{To be
revisited.}.
2121 Word number 0 is the most significant word if big-endian-words.
2122 Word number 0 is the least significant word if little-endian-words.
2123
2124 If @samp{mode} is larger, @samp{value} is interpreted in the larger mode
2125 with the upper most significant bits treated as garbage (their value is
2126 assumed to be unimportant to the context in which the value will be used).
2127 @samp{word-num} must be @samp{0}.
2128 This case is byte order independent.
2129
2130 @item (join out-mode in-mode arg1 . arg-rest)
2131 Concatenate @samp{arg1[,arg2[,...]]} to create a value of mode @samp{out-mode}.
2132 @samp{arg1} becomes the most significant part of the result.
2133 Each argument is interpreted in mode @samp{in-mode}.
2134 @samp{in-mode} must evenly divide @samp{out-mode}.
2135 ??? Endianness issues have yet to be decided.
2136
2137 @item (sequence mode ((mode1 local1) ...) expr1 expr2 ...)
2138 Execute @samp{expr1}, @samp{expr2}, etc. sequentially. @samp{mode} is the
2139 mode of the result, which is defined to be that of the last expression.
2140 `@code{((mode1 local1) ...)}' is a set of local variables.
2141
2142 @item (parallel mode empty expr1 ...)
2143 Execute @samp{expr1}, @samp{expr2}, etc. in parallel. All inputs are
2144 read before any output is written.  @samp{empty} must be @samp{()} and
2145 is present for consistency with @samp{sequence}. @samp{mode} must be
@samp{VOID} (void mode).
2148
2149 @item (unop mode operand)
2150 Perform a unary arithmetic operation. @samp{unop} is one of @code{neg},
2151 @code{abs}, @code{inv}, @code{not}, @code{zflag}, @code{nflag}.
2152 @code{zflag} returns a bit indicating if @samp{operand} is
2153 zero. @code{nflag} returns a bit indicating if @samp{operand} is
2154 negative. @code{inv} returns the bitwise complement of @samp{operand},
2155 whereas @code{not} returns its logical negation.
2156
2157 @item (binop mode operand1 operand2)
2158 Perform a binary arithmetic operation. @samp{binop} is one of
2159 @code{add}, @code{sub}, @code{and}, @code{or}, @code{xor}, @code{mul},
2160 @code{div}, @code{udiv}, @code{mod}, @code{umod}.
2161
2162 @item (binop-with-bit mode operand1 operand2 operand3)
2163 Same as @samp{binop}, except taking 3 operands. The third operand is
2164 always a single bit. @samp{binop-with-bit} is one of @code{addc},
2165 @code{add-cflag}, @code{add-oflag}, @code{subc}, @code{sub-cflag},
2166 @code{sub-oflag}.
2167
2168 @item (shiftop mode operand1 operand2)
2169 Perform a shift operation. @samp{shiftop} is one of @code{sll},
2170 @code{srl}, @code{sra}, @code{ror}, @code{rol}.
2171
2172 @item (boolifop mode operand1 operand2)
2173 Perform a sequential boolean operation. @samp{operand2} is not processed
2174 if @samp{operand1} ``fails''. @samp{boolifop} is one of @code{andif},
2175 @code{orif}.
2176
2177 @item (convop mode operand)
2178 Perform a mode->mode conversion operation. @samp{convop} is one of
2179 @code{ext}, @code{zext}, @code{trunc}, @code{float}, @code{ufloat},
2180 @code{fix}, @code{ufix}.
2181
2182 @item (cmpop mode operand1 operand2)
2183 Perform a comparison. @samp{cmpop} is one of @code{eq}, @code{ne},
2184 @code{lt}, @code{le}, @code{gt}, @code{ge}, @code{ltu}, @code{leu},
2185 @code{gtu}, @code{geu}.
2186
2187 @item (mathop mode operand)
2188 Perform a mathematical operation. @samp{mathop} is one of @code{sqrt},
2189 @code{cos}, @code{sin}.
2190
2191 @item (if mode condition then [else])
2192 Standard @code{if} statement.
2193
2194 @samp{condition} is any arithmetic expression.
2195 If the value is non-zero the @samp{then} part is executed.
2196 Otherwise, the @samp{else} part is executed (if present).
2197
2198 @samp{mode} is the mode of the result, not of @samp{condition}.
2199 If @samp{mode} is not @code{VOID} (void mode), @samp{else} must be present.
2200
2201 @item (cond mode (condition1 expr1a ...) (...) [(else exprNa...)])
2202 From Scheme: keep testing conditions until one succeeds, and then
2203 process the associated expressions.
2204
2205 @item (case mode test ((case1 ..) expr1a ..) (..) [(else exprNa ..)])
2206 From Scheme: Compare @samp{test} with @samp{case1}, @samp{case2},
2207 etc. and process the associated expressions.
2208
2209 @item (c-code mode "C expression")
An escape hook to insert arbitrary C code. @samp{mode} must be
compatible with the result of ``C expression''.
2212
2213 @item (c-call mode symbol operand1 operand2 ...)
2214 An escape hook to emit a subroutine call to function named @samp{symbol}
2215 passing operands @samp{operand1}, @samp{operand2}, etc.  An implicit
2216 first argument of @code{current_cpu} is passed to @samp{symbol}.
2217 @samp{mode} is the mode of the result.  Be aware that @samp{symbol} will
be restricted by reserved words in the C programming language and by
2219 existing symbols in the generated code.
2220
2221 @item (c-raw-call mode symbol operand1 operand2 ...)
Same as @code{c-call}, except there is no implicit @code{current_cpu}
2223 first argument.
2224 @samp{mode} is the mode of the result.
2225
2226 @item (clobber mode object)
2227 Indicate that @samp{object} is written in mode @samp{mode}, without
2228 saying how. This could be useful in conjunction with the C escape hooks.
2229
2230 @item (delay mode num expr)
2231 Indicate that there are @samp{num} delay slots in the processing of
2232 @samp{expr}.  When using this rtx in instruction semantics, CGEN will
2233 infer that the instruction has the DELAY-SLOT attribute.
2234
2235 @item (annul yes?)
2236 @c FIXME: put annul into the glossary.
2237 Annul the following instruction if @samp{yes?} is non-zero. This rtx is
2238 an experiment and will probably change.
2239
2240 @item (skip yes?)
2241 Skip the next instruction if @samp{yes?} is non-zero. This rtx is
2242 an experiment and will probably change.
2243
2244 @item (attr mode kind attr-name)
2245 Return the value of attribute @samp{attr-name} in mode
2246 @samp{mode}. @samp{kind} must currently be @samp{insn}: the current
2247 instruction.
2248
2249 @item (symbol name)
2250 Return a symbol with value @samp{name}, for use in attribute
2251 processing. This is equivalent to @samp{quote} in Scheme but
2252 @samp{quote} sounds too jargonish.
2253
2254 @item (eq-attr mode attr-name value)
2255 Return non-zero if the value of attribute @samp{attr-name} is
2256 @samp{value}. If @samp{value} is a list return ``true'' if
2257 @samp{attr-name} is any of the listed values.
2258
2259 @item (index-of operand)
2260 Return the index of @samp{operand}. For registers this is the register number.
2261
2262 @item (regno operand)
Same as @code{index-of}, but improves readability for registers.
2264
2265 @item (error mode message)
Emit an error message from CGEN RTL.  The error message is specified by
@samp{message}.
2267
2268 @item (nop)
2269 A no-op.
2270
2271 @item (ifield field-name)
2272 Return the value of field @samp{field-name}. @samp{field-name} must be a
2273 field in the instruction. Operands can be any of:
2274 @c ???
2275
2276 @itemize @bullet
2277 @item an operand defined in the description file
@item a register reference, created with (reg mode hw-name [index])
2279 @item a memory reference, created with (mem mode address)
2280 @item a constant, created with (const mode value)
2281 @item a `sequence' local variable
2282 @item another expression
2283 @end itemize
2284
2285 The @samp{symbol} in a @code{c-call} or @code{c-raw-call} function is
2286 currently the name of a C function or macro that is invoked by the
2287 generated semantic code.
2288 @end table
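
The following sketch shows a few of these functions together.  It uses
the mode-omitted shorthand seen in the @code{dni} examples elsewhere in
this chapter, except where a mode is needed to make the point.  The
operand names @code{ra}, @code{rb} and @code{acc} are hypothetical;
@code{acc} is assumed to be a @code{DI} value, and the last example
assumes big-endian-words.

@example
;; Swap two registers.  All inputs are read before any output is
;; written, so no temporary is needed.
(parallel ()
          (set ra rb)
          (set rb ra))

;; The same swap done sequentially, which needs a local variable.
(sequence ((SI tmp))
          (set tmp ra)
          (set ra rb)
          (set rb tmp))

;; Set ra to 1 if rb is zero, otherwise to 0.
(if (eq rb (const 0))
    (set ra (const 1))
    (set ra (const 0)))

;; Split a DI value into its two SI words and rebuild it.
;; Word 0 is the most significant word, matching arg1 of join.
(join DI SI (subword SI acc 0) (subword SI acc 1))
@end example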
2289
2290 @node Macro-expressions
2291 @section Macro-expressions
2292 @cindex Macro-expressions
2293
Macro RTL expressions grew out of the wish to avoid specifying a mode
for every expression (and sub-expression thereof).  The formal way to
specify, say, an add is @code{(add SI arg1 arg2)}, but if SI is the
default mode of @samp{arg1} this can be written simply as
@code{(add arg1 arg2)}.  This gets expanded to
@code{(add DFLT arg1 arg2)} where @code{DFLT} means ``default mode''.
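
Applied to a whole semantic expression, the expansion might look as
follows.  This is a sketch: the operands @code{dr} and @code{simm8} are
borrowed from the @code{addi} example, and it is an assumption that
@code{set} receives @code{DFLT} in the same way as @code{add}.

@example
(set dr (add dr simm8))              ; as written in the .cpu file
(set DFLT dr (add DFLT dr simm8))    ; after expansion
@end example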
2300
2301 It might be possible to replace macro expressions with preprocessor macros,
2302 however for the nonce there is no plan to do this.