cgen/doc/porting.texi

   1 @c Copyright (C) 2000, 2009 Red Hat, Inc.
   2 @c This file is part of the CGEN manual.
   3 @c For copying conditions, see the file cgen.texi.
   4
   5 @node Porting
   6 @chapter Porting
   7 @cindex Porting
   8
   9 This chapter describes how to do a CGEN port.
  10 It focuses on doing binutils and simulator ports, but the
  11 procedure should be generally applicable.
  12
  13 @menu
  14 * Introduction to porting::
  15 * Supported Guile versions::
  16 * Running configure::
  17 * Writing a CPU description file::
  18 * Doing an opcodes port::
  19 * Doing a GAS port::
  20 * Building a GAS test suite::
  21 * Doing a simulator port::
  22 * Building a simulator test suite::
  23 @end menu
  24
  25 @node Introduction to porting
  26 @section Introduction to porting
  27
  28 Doing a GNU tools port for a new processor consists of porting the
  29 following components, more or less in the order listed.  The order can
  30 be changed, of course, but the order given is reasonable.  Certainly things
  31 like BFD and opcodes need to be finished earlier than the others.  Bugs in
  32 earlier pieces are often not found until testing later pieces, so no
  33 piece is necessarily finished until they all are.
  34
  35 @itemize @bullet
  36 @item DejaGNU
  37 @item BFD
  38 @item CGEN
  39 @item Opcodes
  40 @item GAS
  41 @item Binutils
  42 @item Linker (@code{ld})
  43 @item newlib
  44 @item libgloss
  45 @item simulator
  46 @item GCC
  47 @item GDB
  48 @end itemize
  49
  50 The use of CGEN affects the opcodes, GAS, and simulator portions only.
  51 As always, the M32R port is a good reference base.
  52
  53 One goal of CGEN is to describe the CPU in an application independent manner
  54 so that program generators can do all the repetitive work of generating
  55 code and tables for each CPU that is ported.
  56
  57 For opcodes, several files are generated.  No additional code need be
  58 written in the opcodes directory, although as an escape hatch the user
  59 can add target-specific code to the file @file{<arch>.opc} in the CGEN
  60 cpu source directory.  Functions defined there are included in the
  61 relevant generated files.  An @file{<arch>.opc} file is needed, for
  62 example, when special pseudo-ops must be parsed, such as the
  63 high/shigh pseudo-ops of the M32R.
  64 @xref{Doing an opcodes port}.
  65
  66 For GAS, no files are generated (except test cases!) so the port is done
  67 more or less like the other GAS ports except that the assembler uses the
  68 CGEN-built opcode table plus @file{toplevel/gas/cgen.[ch]}.
  69
  70 For the simulator, several files are built, and other support files need
  71 to be written.  @xref{Doing a simulator port}.
  72
  73 @node Supported Guile versions
  74 @section Supported Guile versions
  75
  76 In order to avoid suffering from the bug of the day when using
  77 snapshots, CGEN development has been confined to Guile releases only.
  78 CGEN has been tested with Guile versions @code{1.4.1}, @code{1.6.8}
  79 and @code{1.8.5}.
  80 As time passes older versions of Guile will no longer be supported.
  81
  82 @node Running configure
  83 @section Running @code{configure}
  84
  85 When doing porting or maintenance activity with CGEN, it's a good idea
  86 to configure the build tree with the @code{--enable-cgen-maint} option.
  87 This adds the necessary dependencies to the @file{toplevel/opcodes} and
  88 @file{toplevel/sim} directories so that when the @file{.cpu} file is
  89 changed the makefiles will regenerate the corresponding sources.
  90
  91 CGEN uses Guile, so Guile must be installed.
  92
  93 @node Writing a CPU description file
  94 @section Writing a CPU description file
  95
  96 The first step in doing a CGEN port is writing a CPU description file.
  97 The best way to do that is to take an existing file (such as the M32R)
  98 and use it as a template.
  99
 100 Writing a CPU description file generally involves writing each of the
 101 following types of entries, in order.  @xref{RTL}, for detailed
 102 descriptions of each type of entry that appears in the description file.
 103
 104 @menu
 105 * Conventions::                      Programming style conventions
 106 * simplify.inc::                     Simplifying writing @file{.cpu} files
 107 * Writing define-arch::              Architecture wide specs
 108 * Writing define-isa::               Instruction set characteristics
 109 * Writing define-cpu::               CPU families
 110 * Writing define-mach::              Machine variants
 111 * Writing define-model::             Models of each machine variant
 112 * Writing define-hardware::          Hardware elements
 113 * Writing define-ifield::            Instruction fields
 114 * Writing define-normal-insn-enum::  Instruction enums
 115 * Writing define-operand::           Instruction operands
 116 * Writing define-insn::              Instructions
 117 * Writing define-macro-insn::        Macro instructions
 118 * Using define-pmacro::              Preprocessor macros
 119 * Splicing list arguments::          List arguments in macros
 120 * Interactive development::          Useful things to do in a Guile shell
 121 @end menu
 122
 123 @node Conventions
 124 @subsection Conventions
 125
 126 First a digression on conventions and programming style.
 127
 128 @itemize @bullet
 129
 130 @item @code{define-foo} vs. @code{define-normal-foo}
 131
 132 Each CPU description @code{define-} entry generally provides two forms:
 133 the normal form and the general form.  The normal form has a simple,
 134 fixed-argument syntax that allows one to specify the most popular
 135 elements.  When one needs to specify more obscure elements of the
 136 entry one uses the general form, which is a list of name/value pairs.  The
 137 naming convention is to call the normal form @code{define-normal-foo}
 138 and the general form @code{define-foo}.
 139
 140 @item Parentheses placement
 141
 142 Consider:
 143
 144 @example
 145 (define-normal-insn-enum
 146   insn-op1 "insn format enums" () f-op1 OP1_
 147   (ADD ADDC SUB SUBC
 148    AND OR   XOR INV)
 149 )
 150 @end example
 151
 152 All Lisp/Scheme code I've read puts the trailing parenthesis on the
 153 previous line.  CGEN programming style says the last trailing
 154 parenthesis goes on a line by itself.  If someone wants to put forth an
 155 argument of why this should change, please do.  I like putting the
 156 very last parenthesis on a line by itself in column 1 because it makes
 157 it easier to traverse the file with a parenthesis matching keystroke.
 158
 159 @item @code{StudlyCaps} vs. @code{_} vs. @code{-}
 160
 161 The convention is to have most things lowercase with words separated by
 162 @samp{-}.  Things that are uppercase are fixed and well defined: enum
 163 values and mode names.
 164 @c FIXME: Seems to me there's a few others.
 165 This convention must be followed.
 166
 167 @item Integers
 168
 169 There are two things to keep in mind regarding integers in CGEN.
 170
 171 @enumerate
 172
 173 @item Unspecified width
 174
 175 Integers in CGEN generally don't specify a width.
 176 The width is imposed by context.
 177
 178 @item RTL canonicalization
 179
 180 Integers in RTL may simply be written as a number,
 181 or in the full canonical form as
 182 @samp{(const [<option-list>] [<mode>] <value>)}.
 183
 184 The ``option list'', if specified, must be @samp{()}
 185 as there are currently no options for constants.
 186 It is optional and is generally elided when written.
 187
 188 The ``mode'' of the number specifies the precision.
 189 The default mode is @samp{INT} meaning arbitrary precision.
 190
 191 In RTL, whether to write just the number, e.g. @samp{24},
 192 or the full canonical form, e.g., @samp{(const () INT 24)},
 193 or anything in between is a matter of style.
 194
 195 @end enumerate
 196
 197 @end itemize
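
As an illustration of the integer conventions above, here is a sketch of
one constant written three ways.  The enclosing @code{set} and the operand
name @samp{dr} are placeholders chosen for the sketch, not taken from any
particular port.

@example
(set dr 24)                 ; bare number, width imposed by context
(set dr (const INT 24))     ; mode given, option list elided
(set dr (const () INT 24))  ; full canonical form
@end example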
 198
 199 @node simplify.inc
 200 @subsection simplify.inc
 201 @cindex simplify.inc
 202
 203 The file @file{simplify.inc} provides several pmacros that help simplify
 204 writing @file{.cpu} files.
 205
 206 To use it add the following to your @file{.cpu} file.
 207
 208 @smallexample
 209 (include "simplify.inc")
 210 @end smallexample
 211
 212 @file{simplify.inc} provides the following pmacros:
 213
 214 @itemize @bullet
 215
 216 @item define-normal-enum
 217 (@pxref{a-define-normal-enum, define-normal-enum})
 218
 219 @item define-normal-insn-enum
 220 (@pxref{a-define-normal-insn-enum, define-normal-insn-enum})
 221
 222 @c ??? Would have been nice to have called this define-simple-ifield.
 223 @item define-normal-ifield
 224 (@pxref{a-define-normal-ifield, define-normal-ifield})
 225
 226 @item df
 227 (@pxref{a-df, df})
 228
 229 @item dnf
 230 (@pxref{a-dnf, dnf})
 231
 232 @item define-normal-multi-ifield
 233 (@pxref{a-define-normal-multi-ifield, define-normal-multi-ifield})
 234
 235 @item dnmf
 236 (@pxref{a-dnmf, dnmf})
 237
 238 @item dsmf
 239 (@pxref{a-dsmf, dsmf})
 240
 241 @item define-normal-hardware
 242 (@pxref{a-define-normal-hardware, define-normal-hardware})
 243
 244 @item dnh
 245 (@pxref{a-dnh, dnh})
 246
 247 @item define-simple-hardware
 248 (@pxref{a-define-simple-hardware, define-simple-hardware})
 249
 250 @item dsh
 251 (@pxref{a-dsh, dsh})
 252
 253 @item define-normal-operand
 254 (@pxref{a-define-normal-operand, define-normal-operand})
 255
 256 @item dno
 257 (@pxref{a-dno, dno})
 258
 259 @item dnop
 260 (@pxref{a-dnop, dnop})
 261
 262 @item dndo
 263 @c (@pxref{a-dndo, dndo})
 264
 265 @item define-normal-insn
 266 (@pxref{a-define-normal-insn, define-normal-insn})
 267
 268 @item dni
 269 (@pxref{a-dni, dni})
 270
 271 @item define-normal-macro-insn
 272 (@pxref{a-define-normal-macro-insn, define-normal-macro-insn})
 273
 274 @item dnmi
 275 (@pxref{a-dnmi, dnmi})
 276
 277 @end itemize
 278
 279 @node Writing define-arch
 280 @subsection Writing define-arch
 281
 282 Various simple and architecture-wide common things like the name of the
 283 processor must be defined somewhere, so all of this stuff is put under
 284 @code{define-arch}.
 285
 286 This must be the first entry in the description file.
 287
 288 @xref{Architecture variants}, for details.
 289
 290 Here's an example from @file{m32r.cpu}:
 291
 292 @example
 293 (define-arch
 294   (name m32r) ; name of cpu family
 295   (comment "Renesas M32R")
 296   (default-alignment aligned)
 297   (insn-lsb0? #f)
 298   (machs m32r m32rx m32r2)
 299   (isas m32r)
 300 )
 301 @end example
 302
 303 @node Writing define-isa
 304 @subsection Writing define-isa
 305
 306 There are two purposes to @code{define-isa}.
 307 The first is to specify parameters needed to decode instructions.
 308
 309 The second is to give the instruction set a name.  This is important for
 310 architectures like the ARM where one CPU can execute multiple
 311 instruction sets.
 312
 313 @xref{Architecture variants}, for details.
 314
 315 Here's an example from @file{arm.cpu}:
 316
 317 @example
 318 (define-isa
 319   (name thumb)
 320   (comment "ARM Thumb instruction set (16 bit insns)")
 321   (base-insn-bitsize 16)
 322   (decode-assist (15 14 13 12 11 10 9 8))
 323   (setup-semantics (set-quiet (reg h-gr 15) (add pc 4)))
 324 )
 325 @end example
 326
 327 @node Writing define-cpu
 328 @subsection Writing define-cpu
 329
 330 CPU families are an internal and artificial classification designed to
 331 collect processor variants that are sufficiently similar together under
 332 one roof for the simulator.  What is ``sufficiently similar'' is up to
 333 the programmer.  For example, if the only difference between two
 334 processor variants is that one has a few extra instructions, there's no
 335 point in treating them separately in the simulator.
 336
 337 When simulating the variant without the extra instructions, said
 338 instructions are marked as ``invalid''.  On the other hand, putting 32
 339 and 64 bit variants of an architecture under one roof is problematic
 340 since the word size is different.  What ``under one roof'' means is left
 341 fuzzy for now, but basically the simulator engine has a collection of
 342 structures defining internal state, and ``CPU families'' minimize the
 343 number of copies of generated code that manipulate this state.
 344
 345 @xref{Architecture variants}, for details.
 346
 347 Here's an example from @file{openrisc.cpu}:
 348
 349 @example
 350 (define-cpu
 351   ; CPU names must be distinct from the architecture name and machine names.
 352   ; The "b" suffix stands for "base" and is the convention.
 353   ; The "f" suffix stands for "family" and is the convention.
 354   (name openriscbf)
 355   (comment "OpenRISC base family")
 356   (endian big)
 357   (word-bitsize 32)
 358 )
 359 @end example
 360
 361 @node Writing define-mach
 362 @subsection Writing define-mach
 363
 364 CGEN uses ``mach'' in the same sense that BFD uses ``mach''.
 365 ``Mach'', which is short for `machine', defines a variant of
 366 the architecture.
 367 @c There may be a need for a many-to-one correspondence between CGEN
 368 @c machs and BFD machs.
 369
 370 @xref{Architecture variants}, for details.
 371
 372 Here's an example from @file{m32r.cpu}:
 373
 374 @example
 375 (define-mach
 376   (name m32rx)
 377   (comment "M32RX cpu")
 378   (cpu m32rxf)
 379 )
 380 @end example
 381
 382 @node Writing define-model
 383 @subsection Writing define-model
 384
 385 When describing a CPU, in any context, there is ``architecture'' and
 386 there is ``implementation''.  In CGEN parlance, a ``model'' is an
 387 implementation of a ``mach''.  Models specify pipeline and other
 388 performance related characteristics of the implementation.
 389
 390 Some architectures bring pipeline details up into the architecture
 391 (rather than making them an implementation detail).  It's not clear
 392 yet how to handle all the various possibilities so at present this is
 393 done on a case-by-case basis.  Maybe a straightforward solution will
 394 emerge.
 395
 396 @xref{Model variants}, for details.
 397
 398 Here's an example from @file{arm.cpu}:
 399 @c A poor example.  Later.
 400
 401 @example
 402 (define-model
 403   (name arm710)
 404   (comment "ARM 710 microprocessor")
 405   (mach arm7tdmi)
 406   (unit u-exec "Execution Unit" ()
 407         1 1 ; issue done
 408         () () () ())
 409 )
 410 @end example
 411
 412 @node Writing define-hardware
 413 @subsection Writing define-hardware
 414
 415 The registers of the processor are specified with
 416 @code{define-hardware}.  Also, immediate constants and addresses are
 417 defined to be ``hardware''.  By convention, all hardware element names
 418 are prefixed with @samp{h-}.  This convention must be followed.
 419
 420 Pre-defined hardware elements are:
 421
 422 @table @code
 423 @item h-memory
 424 Normal CPU memory@footnote{A temporary simplifying assumption is to treat all
 425 memory identically.  Being able to specify various kinds of memory
 426 (e.g. on-chip RAM, ROM) is work in progress.}
 427 @item h-sint
 428 signed integer
 429 @item h-uint
 430 unsigned integer
 431 @item h-addr
 432 an address
 433 @item h-iaddr
 434 an instruction address
 435 @end table
 436
 437 Where are floats, you ask?  They'll be defined when the need arises.
 438
 439 The program counter is named @samp{h-pc} and must be specified.
 440 It is not a builtin element as sometimes architectures need to
 441 modify its behaviour (in the get/set specs).
 442
 443 @xref{Hardware elements}, for details.
 444
 445 Here's an example from @file{arm.cpu}:
 446
 447 @example
 448 (define-hardware
 449   (name h-gr)
 450   (comment "general registers")
 451   (attrs PROFILE CACHE-ADDR)
 452   (type register WI (16))
 453   (indices extern-keyword gr-names)
 454 )
 455 @end example
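
Since @samp{h-pc} must always be specified, here is a sketch of a minimal
declaration using the @code{dnh} shorthand from @file{simplify.inc}.  It is
modeled on the M32R port.  The argument order (name, comment, attributes,
type, indices, values, handlers) and the @code{PC} attribute are assumptions
here and should be checked against @ref{Hardware elements}.

@example
; program counter, no indices, values, or get/set handlers
(dnh h-pc "program counter" (PC PROFILE) (pc) () () ())
@end example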
 456
 457 @node Writing define-ifield
 458 @subsection Writing define-ifield
 459
 460 Writing instruction field entries involves analyzing the instruction set
 461 and creating an entry for each field.  If a field has multiple purposes,
 462 one can create separate entries for each intended purpose.  The names
 463 should generally follow the names used by the architecture reference manual.
 464
 465 By convention, all instruction field names are prefixed with @samp{f-}.  This
 466 convention must be followed.
 467
 468 CGEN tries to allow the use of the bit numbering as found in the architecture
 469 reference manual.  This minimizes transcription errors both when writing the
 470 @samp{.cpu} file and later when communicating field info to people.
 471
 472 There are two key pieces of data that CGEN uses to organize field
 473 specification: the default insn word size (in bits), and whether bit number
 474 0 is the LSB (least significant bit) or the MSB (most significant bit).
 475
 476 In the general case, fields are described with 4 numbers: word-offset,
 477 word-length, start, and length.
 478 All instruction fields live in exactly one word and must
 479 be contiguous.@footnote{This doesn't include fields like multi-ifields.}
 480 Non-contiguous fields are specified with ``multi-ifields'' which are fields
 481 built up out of several smaller typically disjoint fields.
 482 The size of the word depends on the context.  @samp{word-offset} specifies
 483 the offset in bits from the start of the insn to the word containing the field;
 484 it must be a multiple of 8.
 485 @samp{word-length} specifies the size in bits of the word containing the field;
 486 it also must be a multiple of 8.
 487 @samp{start} specifies the position of the MSB of the field in the word.
 488 @samp{length} specifies the size in bits of the field.
 489
 490 @xref{Instruction fields}, for details.
 491
 492 Example.
 493
 494 Suppose an ISA has instructions that are normally 16 bits,
 495 but has instructions that may take an additional 32 bit immediate
 496 and optionally an additional 16 bit immediate after that.
 497 Also suppose the ISA numbers the bits starting from the LSB.
 498
 499 default-insn-word-bitsize = 16, lsb0? = #t
 500
 501 An instruction with four 4 bit fields, one 32 bit immediate
 502 and one 16 bit immediate might be:
 503
 504 @example
 505
 506   +-----+-----+----+----+--------+--------+
 507   | op1 | op2 | r1 | r2 | simm32 | simm16 |
 508   +-----+-----+----+----+--------+--------+
 509
 510             word-offset  word-length  start  length
 511 f-op1:           0            16        15      4
 512 f-op2:           0            16        11      4
 513 f-r1:            0            16         7      4
 514 f-r2:            0            16         3      4
 515 f-simm32:       16            32        31     32
 516 f-simm16:       48            16        15     16
 517
 518 @end example
 519
 520 If lsb0? = #f, then the example becomes:
 521
 522 @example
 523
 524             word-offset  word-length  start  length
 525 f-op1:           0            16         0      4
 526 f-op2:           0            16         4      4
 527 f-r1:            0            16         8      4
 528 f-r2:            0            16        12      4
 529 f-simm32:       16            32         0     32
 530 f-simm16:       48            16         0     16
 531
 532 @end example
 533
 534 Endianness for the purposes of this example is irrelevant.
 535 In the word containing op1,op2,r1,r2, op1 is in the most significant nibble
 536 and r2 is in the least significant nibble.
 537
 538 For a large number of cases specifying all four numbers is excessive.
 539 With careful redefinition of the starting bit number, one can get away with
 540 only specifying start,length.
 541 Imagine several words of the default insn word size laid out from the start of
 542 the insn.  On top of that lay the field.  Now pick the minimal set of words
 543 that are required to contain the field.  That is the ``word'' we use.
 544 The @samp{start} value is basically computed by adding the offset of the first
 545 containing word to the starting bit of the field in the word.  It's slightly
 546 more complicated than that because lsb0? and the word's size must be taken
 547 into account.  This is best illustrated by rewriting the above example:
 548
 549 @example
 550
 551 lsb0? = #t
 552
 553             start  length
 554 f-op1:        15      4
 555 f-op2:        11      4
 556 f-r1:          7      4
 557 f-r2:          3      4
 558 f-simm32:     47     32
 559 f-simm16:     63     16
 560
 561 lsb0? = #f
 562
 563             start  length
 564 f-op1:         0      4
 565 f-op2:         4      4
 566 f-r1:          8      4
 567 f-r2:         12      4
 568 f-simm32:     16     32
 569 f-simm16:     48     16
 570
 571 @end example
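
For the fields in this example, the simplified @samp{start} works out to
the word offset plus the start within the word.  The worked figures below
only restate the tables above.  They are not a general rule, since lsb0?
and the word's size can complicate the computation in other layouts.

@example
start = word-offset + start-in-word

lsb0? = #t
f-r2:       0 +  3 =  3
f-simm32:  16 + 31 = 47
f-simm16:  48 + 15 = 63

lsb0? = #f
f-r2:       0 + 12 = 12
f-simm32:  16 +  0 = 16
f-simm16:  48 +  0 = 48
@end example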
 572
 573 Note: This simpler definition doesn't work in all cases.  Where it doesn't,
 574 the full-blown definition must be used.
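
Where the simpler definition does suffice, the @code{dnf} and @code{df}
pmacros from @file{simplify.inc} can be used.  The following sketch
transcribes the lsb0? = #f table above.  It assumes @code{dnf} takes name,
comment, attributes, start and length, and that @code{df} additionally takes
mode, encode and decode, as in the M32R port.  The comment strings and the
@code{INT} mode are illustrative.

@example
(dnf f-op1    "op1"    ()  0  4)
(dnf f-op2    "op2"    ()  4  4)
(dnf f-r1     "r1"     ()  8  4)
(dnf f-r2     "r2"     () 12  4)
(df  f-simm32 "simm32" () 16 32 INT #f #f)
(df  f-simm16 "simm16" () 48 16 INT #f #f)
@end example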
 575
 576 There are currently no shorthand macros for specifying the full-blown
 577 definition.  It is recommended that if you have to use one that you write
 578 a macro to reduce typing.
 579
 580 Written out the full-blown way, the f-op1 field would be specified as:
 581
 582 @example
 583
 584 (define-ifield
 585   (name f-op1)
 586   (comment "f-op1")
 587   (attrs) ; no attributes, could be elided if one wants
 588   (word-offset 0)
 589   (word-length 16)
 590   (start 15)
 591   (length 4)
 592   (mode UINT)
 593   (encode #f) ; no special encoding, could be elided if one wants
 594   (decode #f) ; no special decoding, could be elided if one wants
 595 )
 596
 597 @end example
 598
 599 A macro to simplify that could be written as:
 600
 601 @example
 602
 603 ; dwf: define-word-field (??? pick a better name)
 604
 605 (define-pmacro (dwf x-name x-comment x-attrs
 606                     x-word-offset x-word-length x-start x-length
 607                     x-mode x-encode x-decode)
 608   "Define a field including its containing word."
 609   (define-ifield
 610     (name x-name)
 611     (comment x-comment)
 612     (.splice attrs (.unsplice x-attrs))
 613     (word-offset x-word-offset)
 614     (word-length x-word-length)
 615     (start x-start)
 616     (length x-length)
 617     (mode x-mode)
 618     (.splice encode (.unsplice x-encode))
 619     (.splice decode (.unsplice x-decode))
 620     )
 621 )
 622
 623 @end example
 624
 625 The @samp{.splice} is necessary because @samp{attrs}, @samp{encode},
 626 and @samp{decode} take a list as an argument.
 627
 628 One would then write f-op1 as:
 629
 630 @example
 631
 632 (dwf f-op1 "f-op1" () 0 16 15 4 UINT #f #f)
 633
 634 @end example
 635
 636 @node Writing define-normal-insn-enum
 637 @subsection Writing define-normal-insn-enum
 638
 639 Writing instruction enum entries involves analyzing the instruction set
 640 and attaching names to the opcode fields.  For example, if a field named
 641 @samp{op1} selects among the add, addc, sub, subc, and, or,
 642 xor, and inv instructions, one could write something like the following:
 643
 644 @example
 645 (define-normal-insn-enum
 646   insn-op1 "insn format enums" () f-op1 OP1_
 647   (ADD ADDC SUB SUBC
 648    AND OR   XOR INV)
 649 )
 650 @end example
 651
 652 These entries simplify instruction definitions by giving a name to a
 653 particular value for a particular instruction field.  By convention,
 654 enum names are uppercase.  This convention must be followed.
 655
 656 @xref{Enumerated constants}, for details.
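
As a sketch of what the entry above provides: each listed name is prefixed
with @samp{OP1_}, giving @code{OP1_ADD}, @code{OP1_ADDC} and so on, and
those names can stand for the corresponding @samp{f-op1} value in an
instruction's format.  The instruction below is hypothetical.  It assumes an
ISA whose instruction word holds only @samp{f-op1} and two register fields,
with operands @samp{dr} and @samp{sr} already defined.

@example
(dni add "add reg/reg" ()
     "add $dr,$sr"
     (+ OP1_ADD dr sr)       ; OP1_ADD names a value of f-op1
     (set dr (add dr sr))
     ()
)
@end example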
 657
 658 @node Writing define-operand
 659 @subsection Writing define-operand
 660
 661 Operands are what instruction semantics use to refer to hardware
 662 elements.  The typical use of an operand is to map instruction fields to
 663 hardware.  For example, if field @samp{f-r2} is used to specify one of
 664 the registers defined by the @code{h-gr} hardware entry, one could write
 665 something like the following:
 666
 667 @code{(dnop sr "source register" () h-gr f-r2)}
 668
 669 @code{dnop} is short for ``define normal operand'' @footnote{A profound
 670 aversion to typing causes me to often provide brief names of things that
 671 get typed a lot.}.
 672
 673 @xref{Instruction operands}, for more information.
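
Following the @code{define-foo} vs. @code{define-normal-foo} convention,
the same operand could also be written in the general form.  This is a
sketch.  The @code{type} and @code{index} element names are assumptions
here and should be checked against @ref{Instruction operands}.

@example
(define-operand
  (name sr)
  (comment "source register")
  (attrs)        ; no attributes
  (type h-gr)    ; hardware element
  (index f-r2)   ; instruction field selecting the register
)
@end example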
 674
 675 @node Writing define-insn
 676 @subsection Writing define-insn
 677
 678 A large part of writing a @file{.cpu} file is going through the CPU manual
 679 and writing an entry for each instruction.
 680 Instructions specific to a particular machine variant are
 681 marked with the `MACH' attribute.  Example:
 682
 683 @example
 684 (define-normal-insn
 685   add "add instruction"
 686   ((MACH mach1)) ; or (MACH mach1,mach2,...) for multiple variants
 687   ...
 688 )
 689 @end example
 690
 691 The `base' machine is a predefined machine variant that includes
 692 instructions available to all variants, and is the default if no
 693 `MACH' attribute is specified.
 694
 695 @xref{Instructions}, for details.
 696
 697 @c Seems like this part belongs elsewhere.
 698 When the @file{.cpu} file is processed, CGEN will analyze the semantics
 699 to determine:
 700
 701 @itemize @bullet
 702 @item input operands
 703
 704 The list of hardware elements read by the instruction.
 705
 706 @item output operands
 707
 708 The list of hardware elements written by the instruction.
 709
 710 @item attributes
 711
 712 Instruction attributes that can be computed from the semantics.
 713
 714 CTI: control transfer instruction, generally a branch.
 715
 716 @itemize @bullet
 717 @item UNCOND-CTI
 718
 719 The instruction unconditionally sets pc.
 720
 721 @item COND-CTI
 722
 723 The instruction conditionally sets pc.
 724
 725 @item SKIP-CTI
 726
 727 NB. This is an experimental attribute.  Its usage needs to evolve.
 728
 729 @item DELAY-SLOT
 730
 731 NB. This is an experimental attribute.  Its usage needs to evolve.
 732 @end itemize
 733
 734 @end itemize
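
To illustrate the computed attributes, here is a sketch of two branch
semantics and the attribute CGEN would be expected to derive for each.
The operand names @samp{disp8} and @samp{condbit} are borrowed from the
M32R port for illustration.

@example
; pc is always set: UNCOND-CTI
(set pc disp8)

; pc is set only when condbit is true: COND-CTI
(if condbit (set pc disp8))
@end example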
 735
 736 CGEN will also try to simplify the semantics as much as possible:
 737
 738 @itemize @bullet
 739 @item Constant folding
 740
 741 Expressions involving constants are simplified and any resulting
 742 non-taken paths of conditional expressions are discarded.
 743 @end itemize
 744
 745 @node Writing define-macro-insn
 746 @subsection Writing define-macro-insn
 747
 748 Some instructions are really aliases for other instructions, maybe even
 749 a sequence of them.  For example, an architecture that has a general
 750 decrement-then-store instruction might have a specialized version of
 751 this instruction called @code{push} supported by the assembler.  These
 752 are handled with ``macro instructions''.
 753
 754 @xref{Macro-instructions}, for details.
 755
 756 Macro instructions are used by the assembler/disassembler only.
 757 They are not used by the simulator.
 758
 759 For example, if this was the real instruction:
 760
 761 @example
 762 (dni st-minus "st-" ()
 763      "st $src1,@@-$src2"
 764      (+ OP1_2 OP2_7 src1 src2)
 765      (sequence ((WI new-src2))
 766                (set new-src2 (sub src2 (const 4)))
 767                (set (mem WI new-src2) src1)
 768                (set src2 new-src2))
 769      ()
 770 )
 771 @end example
 772
 773 One could write a @code{push} variant with:
 774
 775 @example
 776 (dnmi push "push" ()
 777   "push $src1"
 778   (emit st-minus src1 (src2 15)) ; "st %0,@-sp"
 779 )
 780 @end example
 781
 782 @node Using define-pmacro
 783 @subsection Using define-pmacro
 784
 785 When a group of entries, say instructions, share similar information, a
 786 macro (in the C preprocessor sense) can be used to simplify the
 787 description.  This can be used to save a lot of typing, which can also
 788 improve readability since often one page of code is easier to understand
 789 than four.
 790
 791 @xref{Preprocessor macros}, for details.
 792
 793 Here is an example from the M32R port.
 794
 795 @example
 796 (define-pmacro (bin-op mnemonic op2-op sem-op imm-prefix imm)
 797   (begin
 798      (dni mnemonic
 799           (.str mnemonic " reg/reg")
 800           ()
 801           (.str mnemonic " $dr,$sr")
 802           (+ OP1_0 op2-op dr sr)
 803           (set dr (sem-op dr sr))
 804           ()
 805      )
 806      (dni (.sym mnemonic "3")
 807           (.str mnemonic " reg/" imm)
 808           ()
 809           (.str mnemonic "3 $dr,$sr," imm-prefix "$" imm)
 810           (+ OP1_8 op2-op dr sr imm)
 811           (set dr (sem-op sr imm))
 812           ()
 813      )
 814    )
 815 )
 816 (bin-op add OP2_10 add "$hash" slo16)
 817 (bin-op and OP2_12 and ""      uimm16)
 818 (bin-op or  OP2_14 or  "$hash" ulo16)
 819 (bin-op xor OP2_13 xor ""      uimm16)
 820 @end example
 821
 822 @code{.sym} and @code{.str} are short for Scheme's @code{symbol-append} and
 823 @code{string-append} operations and are conceptually the same as the C
 824 preprocessor's @code{##} concatenation operator.  @xref{Symbol
 825 concatenation}, and @ref{String concatenation}, for details.
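
Expanding the first invocation by hand shows what the pmacro produces.
This is a sketch of the result of
@code{(bin-op add OP2_10 add "$hash" slo16)}, worked out from the
definition above rather than copied from CGEN output.  It assumes
@code{.str} converts its symbol arguments to strings.

@example
(dni add
     "add reg/reg"
     ()
     "add $dr,$sr"
     (+ OP1_0 OP2_10 dr sr)
     (set dr (add dr sr))
     ()
)
(dni add3
     "add reg/slo16"
     ()
     "add3 $dr,$sr,$hash$slo16"
     (+ OP1_8 OP2_10 dr sr slo16)
     (set dr (add sr slo16))
     ()
)
@end example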
 826
 827 @node Splicing list arguments
 828 @subsection Splicing list arguments
 829
 830 Several CPU description elements take a list as an argument (as opposed
 831 to a scalar).
 832 When constructing a call to @code{define-*} in a pmacro, these elements must have
 833 their arguments spliced in to achieve the proper syntax.
 834
 835 This is best explained with an example.
 836 Here's a simplifying macro for writing ifield definitions with every
 837 element specified.
 838
 839 @xref{List splicing}, for details.
 840
 841 @example
 842
 843 ; dwf: define-word-field
 844
 845 (define-pmacro (dwf x-name x-comment x-attrs
 846                     x-word-offset x-word-length x-start x-length
 847                     x-mode x-encode x-decode)
 848   "Define a field including its containing word."
 849   (define-ifield
 850     (name x-name)
 851     (comment x-comment)
 852     (.splice attrs (.unsplice x-attrs))
 853     (word-offset x-word-offset)
 854     (word-length x-word-length)
 855     (start x-start)
 856     (length x-length)
 857     (mode x-mode)
 858     (.splice encode (.unsplice x-encode))
 859     (.splice decode (.unsplice x-decode))
 860     )
 861 )
 862
 863 @end example
 864
 865 The @samp{.splice} is necessary because @samp{attrs}, @samp{encode},
 866 and @samp{decode} take a list as an argument.
 867
 868 One would then write f-op1 as:
 869
 870 @example
 871
 872 (dwf f-op1 "f-op1" () 0 16 15 4 UINT #f #f)
 873
 874 @end example
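
To see why the splice is needed, consider a call whose attribute list is
non-empty.  The sketch below expands only the @code{attrs} element by hand.
The @code{PCREL-ADDR} attribute is chosen only for illustration.

@example
; x-attrs is bound to the list (PCREL-ADDR)

(attrs x-attrs)                         ; without splicing
  => (attrs (PCREL-ADDR))               ; nested list, wrong syntax

(.splice attrs (.unsplice x-attrs))     ; with splicing
  => (attrs PCREL-ADDR)                 ; list members inlined

; with x-attrs bound to (), the spliced form yields (attrs)
@end example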
 875
 876 @node Interactive development
 877 @subsection Interactive development
 878
 879 The normal way@footnote{Normal for some anyway, certainly each person will have
 880 their own preference.} of writing a CPU description file involves starting Guile
 881 and developing the @file{.cpu} file interactively.  The basic steps are:
 882
 883 @enumerate
 884 @item Run @code{guile}.
 885 @item @code{(load "dev.scm")}
 886 @item Load application, e.g. @code{(load-opc)} or @code{(load-sim)}
 887 @item Load CPU description file, e.g. @code{(cload #:arch "cpu/m32r.cpu")}
 888 @item Run generators until output looks reasonable, e.g. @code{(cgen-opc.c)}
 889 @end enumerate
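
Put together, a session following these steps might look like the sketch
below.  It assumes Guile is started in the CGEN source directory, so that
@file{dev.scm} and @file{cpu/m32r.cpu} are found, and it shows the
@samp{guile>} prompt of the Guile 1.x releases.

@example
$ guile
guile> (load "dev.scm")
guile> (load-opc)
guile> (cload #:arch "cpu/m32r.cpu")
guile> (cgen-opc.c)
@end example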
 890
 891 To assist in the development process and to cut down on some typing,
 892 @file{dev.scm} looks for @file{$HOME/.cgenrc} and, if present, loads it.
 893 Typical things that @file{.cgenrc} contains are definitions of procedures
 894 that combine steps 3 and 4 above.
 895
 896 Example:
 897
 898 @example
 899 (define (m32r-opc)
 900   (load-opc)
 901   (cload #:arch "cpu/m32r.cpu")
 902 )
 903 (define (m32r-sim)
 904   (load-sim)
 905   (cload #:arch "cpu/m32r.cpu" #:options "with-scache with-profile=fn")
 906 )
 907 (define (m32rbf-sim)
 908   (load-sim)
 909   (cload #:arch "cpu/m32r.cpu" #:machs "m32r" #:options "with-scache with-profile=fn")
 910 )
 911 (define (m32rxf-sim)
 912   (load-sim)
 913   (cload #:arch "cpu/m32r.cpu" #:machs "m32rx" #:options "with-scache with-profile=fn")
 914 )
 915 @end example
 916
 917 CPU description files are loaded into an interactive Guile session with
 918 @code{cload}.  The syntax is:
 919
 920 @example
 921 (cload #:arch "cpu-file-path"
 922        [#:machs "mach-list"]
 923        [#:isas "isa-list"]
 924        [#:options "option-list"])
 925 @end example
 926
 927 Only the @code{#:arch} argument is mandatory.
 928
@samp{cpu-file-path} is the path to the @file{.cpu} file.

@samp{mach-list} is a comma separated string of machines to keep.

@samp{isa-list} is a comma separated string of ISAs to keep.

@samp{option-list} is a space separated string of options for the application.
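
As a sketch of how these arguments combine, the calls below reuse the
M32R file from the example above.  The ISA name @samp{m32r} is an
assumption made for illustration; check the ISA definitions in your own
@file{.cpu} file for the names that actually apply.

@example
; Keep every machine and ISA in the file; only #:arch is required.
(cload #:arch "cpu/m32r.cpu")

; Keep two machines.  The list is a single comma separated string.
(cload #:arch "cpu/m32r.cpu" #:machs "m32r,m32rx")

; Keep one ISA (name assumed) and pass two space separated
; simulator options, as in the .cgenrc example.
(cload #:arch "cpu/m32r.cpu"
       #:isas "m32r"
       #:options "with-scache with-profile=fn")
@end example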
 936
 937 @node Doing an opcodes port
 938 @section Doing an opcodes port
 939
 940 The best way to begin a port is to take an existing one (preferably one
 941 that is similar to the new port) and use it as a template.
 942
 943 @enumerate
 944 @item Run @code{guile}.
 945 @item @code{(load "dev.scm")}. This loads in a set of interactive
 946 development routines.
 947 @item @code{(load-opc)}. Load the opcodes support.
 948 @item Edit your @file{cpu/<arch>.cpu} and @file{cpu/<arch>.opc} files.
 949         @itemize @bullet
 950         @item The @file{.cpu} file is the main description file.
 951         @item The @file{.opc} file provides additional C support code.
 952         @end itemize
 953 @item @code{(cload #:arch "cpu/<arch>.cpu")}
 954 @item Run each of:
 955         @itemize @bullet
 956         @item @code{(cgen-desc.h)}
 957         @item @code{(cgen-desc.c)}
 958         @item @code{(cgen-opc.h)}
 959         @item @code{(cgen-opc.c)}
 960         @item @code{(cgen-ibld.in)}
 961         @item @code{(cgen-asm.in)}
 962         @item @code{(cgen-dis.in)}
 963         @item @code{(cgen-opinst.c)} -- [optional]
 964         @end itemize
 965 @item Repeat steps 4, 5 and 6 until the output looks reasonable.
 966 @item Add dependencies to @file{opcodes/Makefile.am} to generate the
 967 eight opcodes files (use the M32R port as an example).
 968 @item Run @code{make dep} from the @file{opcodes} build directory.
 969 @item Run @code{make all-opcodes} from the top level build directory.
 970 @end enumerate
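
Steps 3, 5 and 6 above are repeated often, so they are natural
candidates for a @file{.cgenrc} procedure.  The following is a minimal
sketch, not part of CGEN: the procedure name @code{myarch-opc-regen},
the architecture name @samp{myarch}, and the @file{tmp-*} output file
names are all hypothetical.  It also assumes that each generator writes
to the current output port, so that Guile's @code{with-output-to-file}
can capture the result in a scratch file for inspection.

@example
(define (myarch-opc-regen)
  (load-opc)                               ; step 3
  (cload #:arch "cpu/myarch.cpu")          ; step 5
  ; Step 6: run each generator, saving its output in a scratch file.
  ; Each generator is passed as a thunk (assumed to print its result).
  (with-output-to-file "tmp-desc.h"   cgen-desc.h)
  (with-output-to-file "tmp-desc.c"   cgen-desc.c)
  (with-output-to-file "tmp-opc.h"    cgen-opc.h)
  (with-output-to-file "tmp-opc.c"    cgen-opc.c)
  (with-output-to-file "tmp-ibld.in"  cgen-ibld.in)
  (with-output-to-file "tmp-asm.in"   cgen-asm.in)
  (with-output-to-file "tmp-dis.in"   cgen-dis.in)
  ; Optional: uncomment if operand instance tables are wanted.
  ; (with-output-to-file "tmp-opinst.c" cgen-opinst.c)
)
@end example

With something like this in place, step 7 reduces to editing the
description files and evaluating @code{(myarch-opc-regen)} again.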
 971
 972 @node Doing a GAS port
 973 @section Doing a GAS port
 974
A GAS CGEN port is essentially no different from a normal port except
 976 that the CGEN opcode table is used, and there are extra supporting
 977 routines available in @file{gas/cgen.[ch]}.  As always, a good way to
 978 get started is to take the M32R port as a template and go from there.
 979
 980 The important CGEN-specific things to keep in mind are:
 981 @c to be expanded on as time permits
 982
 983 @itemize @bullet
@item Several support routines are provided by @file{gas/cgen.c}.  Some
must be used; the others are optional, but in general they should be
used unless it's not possible.
 987
 988         @itemize @bullet
 989         @item @code{gas_cgen_init_parse}
 990                 @itemize @minus
 991                 @item Call from @code{md_assemble} before doing anything
 992                         else.
 993                 @item Must be used.
 994                 @end itemize
 995         @item @code{gas_cgen_record_fixup}
 996                 @itemize @minus
 997                 @item Cover function to @code{fix_new}.
 998                 @end itemize
 999         @item @code{gas_cgen_record_fixup_exp}
1000                 @itemize @minus
1001                 @item Cover function to @code{fix_new_exp}.
1002                 @end itemize
1003         @item @code{gas_cgen_parse_operand}
1004                 @itemize @minus
1005                 @item Callback for opcode table based parser, set in
1006                         @code{md_begin}.
1007                 @end itemize
1008         @item @code{gas_cgen_finish_insn}
1009                 @itemize @minus
1010                 @item After parsing an instruction, call this to add the
1011                         instruction to the frag and queue any fixups.
1012                 @end itemize
1013         @item @code{gas_cgen_md_apply_fix}
1014                 @itemize @minus
1015                 @item Provides basic @code{md_apply_fix} support.
1016                 @item @code{#define md_apply_fix
1017                         gas_cgen_md_apply_fix} if you're able to use
1018                         it.
1019                 @end itemize
1020         @item @code{gas_cgen_tc_gen_reloc}
1021                 @itemize @minus
                @item Provides basic @code{tc_gen_reloc} support.
1023                 @item @code{#define tc_gen_reloc gas_cgen_tc_gen_reloc}
1024                         if you're able to use it.
1025                 @end itemize
1026         @end itemize
1027
1028 @item @code{md_begin} should contain the following (plus anything else you
1029 want of course):
1030
1031 @example
1032   /* Set the machine number and endianness.  */
1033   gas_cgen_opcode_desc =
1034     <arch>_cgen_cpu_open (CGEN_CPU_OPEN_MACHS,
1035                           0 /* mach number */,
1036                           CGEN_CPU_OPEN_ENDIAN,
1037                           (target_big_endian
1038                             ? CGEN_ENDIAN_BIG
1039                             : CGEN_ENDIAN_LITTLE),
1040                           CGEN_CPU_OPEN_END);
1041
1042   <arch>_cgen_init_asm (gas_cgen_opcode_desc);
1043
1044   /* This is a callback from cgen to gas to parse operands.  */
1045   cgen_set_parse_operand_fn (gas_cgen_opcode_desc, gas_cgen_parse_operand);
1046 @end example
1047
1048 @item @code{md_assemble} should contain the following basic framework:
1049
1050 @example
1051 @{
1052   const CGEN_INSN *insn;
1053   char *errmsg;
1054   CGEN_FIELDS fields;
1055 #if CGEN_INT_INSN_P
  CGEN_INSN_INT buffer[CGEN_MAX_INSN_SIZE / sizeof (CGEN_INSN_INT)];
1057 #else
1058   char buffer[CGEN_MAX_INSN_SIZE];
1059 #endif
1060
1061   gas_cgen_init_parse ();
1062
  insn = <arch>_cgen_assemble_insn (gas_cgen_opcode_desc, str,
                                    &fields, buffer, &errmsg);
1065
1066   if (! insn)
1067     @{
      as_bad ("%s", errmsg);
1069       return;
1070     @}
1071
1072   gas_cgen_finish_insn (insn, buffer, CGEN_FIELDS_BITSIZE (&fields),
1073      relax_p, /* non-zero to allow relaxable insns */
1074      result); /* non-null if results needed for later */
1075 @}
1076 @end example
1077
1078 @end itemize
1079
1080 @node Building a GAS test suite
1081 @section Building a GAS test suite
1082
1083 CGEN can also build the template for test cases for all instructions.  In
1084 some cases it can also generate the actual instructions.  The result is
1085 then assembled, disassembled, verified, and checked into CVS.  Further
1086 changes are usually done by hand as it's easier.  The goal here is to
1087 save the enormous amount of initial typing that is required.
1088
1089 @enumerate
1090 @item @code{cd} to the CGEN build directory
1091 @item @code{make gas-test}
1092
1093 At this point two files have been created in the CGEN build directory:
1094 @file{gas-allinsn.exp} and @file{gas-build.sh}.  The @file{gas-build.sh}
1095 script normally requires one command line argument: the location of your
1096 @file{gas} build directory.  If this argument is omitted, the script
1097 searches in @file{../gas} automatically.
1098
1099 @item Copy @file{gas-allinsn.exp} to @file{toplevel/gas/testsuite/gas/<arch>/allinsn.exp}.
1100 @item @code{sh gas-build.sh}
1101
At this point the directory @file{tmpdir} contains two files: @file{allinsn.s} and
@file{allinsn.d}.  File @file{allinsn.d} usually needs a bit of massaging.
1104
1105 @item Copy @file{tmpdir/allinsn.[sd]} to @file{toplevel/gas/testsuite/gas/<arch>}
1106 @item Run @code{make check} in the @file{gas} build directory and
1107 massage things until you're satisfied the files are correct.
1108 @item Check files into CVS.
1109 @end enumerate
1110
1111 At this point further additions/modifications are usually done by hand.
1112
1113 @node Doing a simulator port
1114 @section Doing a simulator port
1115
1116 The same basic procedure for opcodes porting applies here.
1117
1118 @enumerate
1119 @item Run @code{guile}.
1120 @item @code{(load "dev.scm")}
1121 @item @code{(load-sim)}
1122 @item Edit your @file{cpu/<arch>.cpu} file.
1123 @item @code{(cload #:arch "cpu/<arch>.cpu")}
1124 @item Run each of:
1125         @itemize @bullet
1126         @item @code{(cgen-arch.h)}
1127         @item @code{(cgen-arch.c)}
1128         @item @code{(cgen-cpuall.h)}
1129         @end itemize
@item Repeat steps 4, 5 and 6 until the output looks reasonable.
@item Edit your @file{cpu/<arch>.cpu} file.
1132 @item @code{(cload #:arch "cpu/<arch>.cpu" #:machs "mach1[,mach2[,...]]")}
1133 @item Run each of:
1134         @itemize @bullet
1135         @item @code{(cgen-cpu.h)}
1136         @item @code{(cgen-cpu.c)}
1137         @item @code{(cgen-decode.h)}
1138         @item @code{(cgen-decode.c)}
1139         @item @code{(cgen-semantics.c)}
1140         @item @code{(cgen-sem-switch.c)} -- only if using a switch()
1141                 version of semantics.
1142         @item @code{(cgen-model.c)}
1143         @end itemize
1144 @item Repeat steps 8, 9 and 10 until the output looks reasonable.
1145 @end enumerate
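
The two edit/load/generate loops above can each be wrapped in a
@file{.cgenrc} procedure.  The following is a minimal sketch, not part
of CGEN: the procedure names, the architecture name @samp{myarch}, and
the @file{tmp-*} output file names are all hypothetical.  It assumes
that each generator writes to the current output port, so that Guile's
@code{with-output-to-file} can capture the result in a scratch file.

@example
; Steps 3, 5 and 6: architecture-wide files.
(define (myarch-sim-arch)
  (load-sim)
  (cload #:arch "cpu/myarch.cpu")
  (with-output-to-file "tmp-arch.h"   cgen-arch.h)
  (with-output-to-file "tmp-arch.c"   cgen-arch.c)
  (with-output-to-file "tmp-cpuall.h" cgen-cpuall.h)
)

; Steps 9 and 10: files for the selected machines.
; MACHS is a comma separated string, e.g. "mach1,mach2".
(define (myarch-sim-cpu machs)
  (load-sim)
  (cload #:arch "cpu/myarch.cpu" #:machs machs)
  (with-output-to-file "tmp-cpu.h"     cgen-cpu.h)
  (with-output-to-file "tmp-cpu.c"     cgen-cpu.c)
  (with-output-to-file "tmp-decode.h"  cgen-decode.h)
  (with-output-to-file "tmp-decode.c"  cgen-decode.c)
  (with-output-to-file "tmp-sem.c"     cgen-semantics.c)
  (with-output-to-file "tmp-model.c"   cgen-model.c)
  ; Only if using a switch() version of semantics:
  ; (with-output-to-file "tmp-sem-switch.c" cgen-sem-switch.c)
)
@end example

For example, @code{(myarch-sim-cpu "mach1")} would regenerate the
CPU-specific files for a single machine after each edit.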
1146
1147 The following additional files are also needed. These live in the
1148 @file{sim/<arch>} directory. Administrivia files like
1149 @file{configure.in} and @file{Makefile.in} are omitted.
1150
1151 @itemize @bullet
1152 @item @file{sim-main.h}
1153
1154 Main include file required by the ``common'' (@file{sim/common})
1155 support, and by each target's @file{.c} file.
1156 This file includes the relevant other headers.
1157 The order is fairly important.
1158 @file{m32r/sim-main.h} is a good starting point.
1159
1160 @file{sim-main.h} also defines several types:
1161
1162 @itemize @minus
1163 @item @code{_sim_cpu} -- a struct containing all state for a
1164 particular CPU.
1165 @item @code{sim_state} -- contains all state of the simulator.
1166 A @code{SIM_DESC} (which is the result of sim_open and is akin
1167 to a file descriptor) points to one of these.
1168 @item @code{sim_cia} -- type of an instruction address.  For
1169 CGEN this is generally ``word mode'', in GCC parlance.
1170 @end itemize
1171
1172 @file{sim-main.h} also defines several macros:
1173
1174 @itemize @minus
1175 @item @code{CIA_GET(cpu)} -- return ``cia'' of the CPU
1176 @item @code{CIA_SET(cpu,cia)} -- set the ``cia'' of the CPU
1177 @end itemize
1178
``cia'' is short for ``current instruction address''.
1180
The definition of @code{sim_state} is fairly simple; just copy the M32R
case.  The definition of @code{_sim_cpu} is more involved.  The
complexity comes from creating a ``derived class'' of @code{sim_cpu} for
each CPU family.  Each CPU family's set of files sees a different
version of @code{sim_cpu}, and every version begins with the same
``base class'' leading part, which is the only part used by
non-CPU-family specific files.  This is done by defining
@code{WANT_CPU_<CPU-FAMILY-NAME>} at the top of CPU family specific
files.  The definition of @code{_sim_cpu} is then:
1191
1192 @example
        struct _sim_cpu @{
          /* sim/common CPU base.  */
          sim_cpu_base base;
          /* Static parts of CGEN.  */
          CGEN_CPU cgen_cpu;
          /* CPU family specific parts, selected by WANT_CPU_<family>.  */
        #if defined (WANT_CPU_CPUFAM1)
          CPUFAM1_CPU_DATA cpu_data;
        #elif defined (WANT_CPU_CPUFAM2)
          CPUFAM2_CPU_DATA cpu_data;
        #endif
        @};
1204 @end example
1205
1206 @item @file{tconfig.in}
1207
1208 This file predates @file{sim-main.h} and was/is intended to contain
1209 macros that configure the simulator sources.
1210
1211 @itemize @bullet
1212 @item @code{SIM_HAVE_MODEL} -- enable @file{common/sim-model.[ch]}
1213 support.
1214 @item @code{SIM_HANDLES_LMA} -- makes @file{sim-hload.c} do the right
1215 thing.
1216 @item @code{WITH_SCACHE_PBB} -- define this to 1 if using pbb scaching.
1217 @end itemize
1218
1219 @item @file{<arch>-sim.h}
1220
1221 This file predates @file{sim-main.h} and contains miscellaneous macros
1222 and definitions used by the simulator.
1223
1224 @item @file{mloop.in}
1225
1226 This file contains code to implement the fetch/execute process.  There
1227 are various ways to do this, and several are supported.  Which one to
1228 choose depends on the environment in which the CPU will be used.  For
1229 example when executing a program in a single-CPU environment without
1230 devices, most or all available cycles can be devoted to simulation of the
target CPU.  However, in an environment with devices or multiple CPUs, one
may wish the CPU to execute one instruction then relinquish control so a
device operation may be done or an instruction can be simulated on a
second CPU.  Efficient techniques for the former aren't necessarily the best
1235 for the latter.
1236
1237 Three versions are currently supported:
1238
1239 @enumerate
1240 @item simple -- fetch/decode/execute one insn
1241 @item scache -- same as simple but results of decoding are cached
@item pbb -- same as scache but several insns are handled each iteration.
``pbb'' stands for pseudo basic block.
1244 @end enumerate
1245
1246 This file is processed by @file{common/genmloop.sh} at build time. The
1247 result is two files: @file{mloop.c} and @file{eng.h}.
1248
1249 @item @file{sim-if.c}
1250
1251 By convention this file contains @code{sim_open}, @code{sim_close},
1252 @code{sim_create_inferior}, @code{sim_do_command}.  These functions can
1253 live in any file of course.  They're here because they're the parts of
1254 the @code{remote-sim.h} interface that aren't provided by the common
1255 directory.
1256
1257 @item @file{<cpufam>.c}
1258
1259 By convention this file contains register access and model support
functions for a CPU family (this file is misnamed in the
M32R case).  The register access functions implement the
1262 @code{sim_fetch_register} and @code{sim_store_register} interface
1263 functions (named @code{<cpufam>_@{fetch,store@}_register}), and support
1264 code for register get/set rtl.  The model support functions implement the
1265 before/after handlers (functions that handle tracing/profiling) and
1266 timing for each function unit.
1267
1268 @item Other files
1269
1270 The M32R port has two other handwritten files: @file{devices.c} and
1271 @file{traps.c}.  How you wish to organize this is up to you.
1272 @end itemize
1273
1274 @node Building a simulator test suite
1275 @section Building a simulator test suite
1276
1277 CGEN can also build the template for test cases for all instructions.  In
some cases it can also generate the actual instructions@footnote{Although
this hasn't been implemented yet.}.  The result is
1280 then verified and checked into CVS.  Further changes are usually done by
1281 hand as it's easier.  The goal here is to save the enormous amount of
1282 initial typing that is required.
1283
1284 @enumerate
1285 @item @code{cd} to the CGEN build directory
1286 @item @code{make sim-test ISA=<arch>}
1287
1288 At this point two files have been created in the CGEN build directory:
1289 @file{sim-allinsn.exp} and @file{sim-build.sh}.
1290
1291 @item Copy @file{sim-allinsn.exp} to
1292 @file{toplevel/sim/testsuite/sim/<arch>/allinsn.exp}.
1293 @item @code{sh sim-build.sh}
1294
1295 At this point a new subdirectory called @file{tmpdir} will be created
1296 and will contain one test case for each instruction.  The framework has
1297 been filled in but not the actual test case.  It's handy to write an
1298 ``include file'' containing assembler macros that simplify writing test
1299 cases.  See @file{toplevel/sim/testsuite/sim/m32r/testutils.inc} for an
1300 example.
1301
@item Write @file{testutils.inc}.
@item Finish each test case.
@item Copy @file{tmpdir/*.cgs} to @file{toplevel/sim/testsuite/sim/<arch>}.
@item Run @code{make check} in the @file{sim} build directory and
massage things until you're satisfied the files are correct.
1306 @item Check files into CVS.
1307 @end enumerate
1308
1309 @noindent At this point further additions/modifications are usually done
1310 by hand.