1 @c Copyright (C) 2000, 2009 Red Hat, Inc.
2 @c This file is part of the CGEN manual.
3 @c For copying conditions, see the file cgen.texi.
9 This chapter describes how to do a CGEN port.
10 It focuses on doing binutils and simulator ports, but the
11 procedure should be generally applicable.
14 * Introduction to porting::
15 * Supported Guile versions::
17 * Writing a CPU description file::
18 * Doing an opcodes port::
20 * Building a GAS test suite::
21 * Doing a simulator port::
22 * Building a simulator test suite::
25 @node Introduction to porting
26 @section Introduction to porting
28 Doing a GNU tools port for a new processor basically consists of porting the
29 following components more or less in order. The order can be changed,
30 of course, but the following order is reasonable. Certainly things like
31 BFD and opcodes need to be finished earlier than others. Bugs in
32 earlier pieces are often not found until testing later pieces so each
33 piece isn't necessarily finished until they all are.
42 @item Linker (@code{ld})
50 The use of CGEN affects the opcodes, GAS, and simulator portions only.
51 As always, the M32R port is a good reference base.
53 One goal of CGEN is to describe the CPU in an application independent manner
54 so that program generators can do all the repetitive work of generating
55 code and tables for each CPU that is ported.
For opcodes, several files are generated. No additional code need be
written in the opcodes directory, although as an escape hatch the user
can add target-specific code to the file @file{<arch>.opc} in the CGEN cpu
source directory. Functions defined there are included in the relevant
generated files. An @file{<arch>.opc} file is needed, for example, when
special pseudo-ops must be parsed, such as the high/shigh pseudo-ops of
the M32R.
64 @xref{Doing an opcodes port}.
66 For GAS, no files are generated (except test cases!) so the port is done
67 more or less like the other GAS ports except that the assembler uses the
68 CGEN-built opcode table plus @file{toplevel/gas/cgen.[ch]}.
70 For the simulator, several files are built, and other support files need
71 to be written. @xref{Doing a simulator port}.
73 @node Supported Guile versions
74 @section Supported Guile versions
76 In order to avoid suffering from the bug of the day when using
77 snapshots, CGEN development has been confined to Guile releases only.
78 CGEN has been tested with Guile versions @code{1.4.1}, @code{1.6.8}
80 As time passes older versions of Guile will no longer be supported.
82 @node Running configure
83 @section Running @code{configure}
85 When doing porting or maintenance activity with CGEN, it's a good idea
86 to configure the build tree with the @code{--enable-cgen-maint} option.
87 This adds the necessary dependencies to the @file{toplevel/opcodes} and
88 @file{toplevel/sim} directories so that when the @file{.cpu} file is
changed the makefiles will regenerate the corresponding sources.
91 CGEN uses Guile so it must be installed.
93 @node Writing a CPU description file
94 @section Writing a CPU description file
96 The first step in doing a CGEN port is writing a CPU description file.
97 The best way to do that is to take an existing file (such as the M32R)
98 and use it as a template.
100 Writing a CPU description file generally involves writing each of the
101 following types of entries, in order. @xref{RTL}, for detailed
102 descriptions of each type of entry that appears in the description file.
105 * Conventions:: Programming style conventions
106 * simplify.inc:: Simplifying writing @file{.cpu} files
107 * Writing define-arch:: Architecture wide specs
108 * Writing define-isa:: Instruction set characteristics
109 * Writing define-cpu:: CPU families
110 * Writing define-mach:: Machine variants
111 * Writing define-model:: Models of each machine variant
112 * Writing define-hardware:: Hardware elements
113 * Writing define-ifield:: Instruction fields
114 * Writing define-normal-insn-enum:: Instruction enums
115 * Writing define-operand:: Instruction operands
116 * Writing define-insn:: Instructions
117 * Writing define-macro-insn:: Macro instructions
118 * Using define-pmacro:: Preprocessor macros
119 * Splicing list arguments:: List arguments in macros
120 * Interactive development:: Useful things to do in a Guile shell
124 @subsection Conventions
126 First a digression on conventions and programming style.
130 @item @code{define-foo} vs. @code{define-normal-foo}
Each CPU description @code{define-} entry generally provides two forms:
the normal form and the general form. The normal form has a simple,
fixed-argument syntax that allows one to specify the most popular
elements. When one needs to specify more obscure elements of the
entry one uses the general form, which is a list of name/value pairs. The
naming convention is to call the normal form @code{define-normal-foo}
and the general form @code{define-foo}.
140 @item Parentheses placement
145 (define-normal-insn-enum
146 insn-op1 "insn format enums" () f-op1 OP1_
152 All Lisp/Scheme code I've read puts the trailing parenthesis on the
153 previous line. CGEN programming style says the last trailing
154 parenthesis goes on a line by itself. If someone wants to put forth an
155 argument of why this should change, please do. I like putting the
156 very last parenthesis on a line by itself in column 1 because it makes
157 it easier to traverse the file with a parenthesis matching keystroke.
159 @item @code{StudlyCaps} vs. @code{_} vs. @code{-}
161 The convention is to have most things lowercase with words separated by
162 @samp{-}. Things that are uppercase are fixed and well defined: enum
163 values and mode names.
164 @c FIXME: Seems to me there's a few others.
165 This convention must be followed.
169 There are two things to keep in mind regarding integers in CGEN.
173 @item Unspecified width
175 Integers in CGEN generally don't specify a width.
176 The width is imposed by context.
178 @item RTL canonicalization
180 Integers in RTL may simply be written as a number,
181 or in the full canonical form as
182 @samp{(const [<option-list>] [<mode>] <value>)}.
184 The ``option list'', if specified, must be @samp{()}
185 as there are currently no options for constants.
186 It is optional and is generally elided when written.
188 The ``mode'' of the number specifies the precision.
189 The default mode is @samp{INT} meaning arbitrary precision.
191 In RTL, whether to write just the number, e.g. @samp{24},
192 or the full canonical form, e.g., @samp{(const () INT 24)},
193 or anything in between is a matter of style.
200 @subsection simplify.inc
203 The file @file{simplify.inc} provides several pmacros that help simplify
204 writing @file{.cpu} files.
206 To use it add the following to your @file{.cpu} file.
209 (include "simplify.inc")
212 @file{simplify.inc} provides the following pmacros:
216 @item define-normal-enum
217 (@pxref{a-define-normal-enum, define-normal-enum})
219 @item define-normal-insn-enum
220 (@pxref{a-define-normal-insn-enum, define-normal-insn-enum})
222 @c ??? Would have been nice to have called this define-simple-ifield.
223 @item define-normal-ifield
224 (@pxref{a-define-normal-ifield, define-normal-ifield})
232 @item define-normal-multi-ifield
233 (@pxref{a-define-normal-multi-ifield, define-normal-multi-ifield})
236 (@pxref{a-dnmf, dnmf})
239 (@pxref{a-dsmf, dsmf})
241 @item define-normal-hardware
242 (@pxref{a-define-normal-hardware, define-normal-hardware})
247 @item define-simple-hardware
248 (@pxref{a-define-simple-hardware, define-simple-hardware})
253 @item define-normal-operand
254 (@pxref{a-define-normal-operand, define-normal-operand})
260 (@pxref{a-dnop, dnop})
263 @c (@pxref{a-dndo, dndo})
265 @item define-normal-insn
266 (@pxref{a-define-normal-insn, define-normal-insn})
271 @item define-normal-macro-insn
272 (@pxref{a-define-normal-macro-insn, define-normal-macro-insn})
275 (@pxref{a-dnmi, dnmi})
279 @node Writing define-arch
280 @subsection Writing define-arch
282 Various simple and architecture-wide common things like the name of the
283 processor must be defined somewhere, so all of this stuff is put under
286 This must be the first entry in the description file.
288 @xref{Architecture variants}, for details.
290 Here's an example from @file{m32r.cpu}:
294 (name m32r) ; name of cpu family
295 (comment "Renesas M32R")
296 (default-alignment aligned)
298 (machs m32r m32rx m32r2)
303 @node Writing define-isa
304 @subsection Writing define-isa
306 There are two purposes to @code{define-isa}.
307 The first is to specify parameters needed to decode instructions.
309 The second is to give the instruction set a name. This is important for
310 architectures like the ARM where one CPU can execute multiple
313 @xref{Architecture variants}, for details.
315 Here's an example from @file{arm.cpu}:
320 (comment "ARM Thumb instruction set (16 bit insns)")
321 (base-insn-bitsize 16)
322 (decode-assist (15 14 13 12 11 10 9 8))
323 (setup-semantics (set-quiet (reg h-gr 15) (add pc 4)))
327 @node Writing define-cpu
328 @subsection Writing define-cpu
330 CPU families are an internal and artificial classification designed to
331 collect processor variants that are sufficiently similar together under
332 one roof for the simulator. What is ``sufficiently similar'' is up to
333 the programmer. For example, if the only difference between two
334 processor variants is that one has a few extra instructions, there's no
335 point in treating them separately in the simulator.
337 When simulating the variant without the extra instructions, said
338 instructions are marked as ``invalid''. On the other hand, putting 32
339 and 64 bit variants of an architecture under one roof is problematic
340 since the word size is different. What ``under one roof'' means is left
341 fuzzy for now, but basically the simulator engine has a collection of
342 structures defining internal state, and ``CPU families'' minimize the
343 number of copies of generated code that manipulate this state.
345 @xref{Architecture variants}, for details.
347 Here's an example from @file{openrisc.cpu}:
351 ; CPU names must be distinct from the architecture name and machine names.
352 ; The "b" suffix stands for "base" and is the convention.
353 ; The "f" suffix stands for "family" and is the convention.
355 (comment "OpenRISC base family")
361 @node Writing define-mach
362 @subsection Writing define-mach
364 CGEN uses ``mach'' in the same sense that BFD uses ``mach''.
365 ``Mach'', which is short for `machine', defines a variant of
367 @c There may be a need for a many-to-one correspondence between CGEN
368 @c machs and BFD machs.
370 @xref{Architecture variants}, for details.
372 Here's an example from @file{m32r.cpu}:
377 (comment "M32RX cpu")
382 @node Writing define-model
383 @subsection Writing define-model
385 When describing a CPU, in any context, there is ``architecture'' and
386 there is ``implementation''. In CGEN parlance, a ``model'' is an
387 implementation of a ``mach''. Models specify pipeline and other
388 performance related characteristics of the implementation.
390 Some architectures bring pipeline details up into the architecture
391 (rather than making them an implementation detail). It's not clear
392 yet how to handle all the various possibilities so at present this is
393 done on a case-by-case basis. Maybe a straightforward solution will
396 @xref{Model variants}, for details.
398 Here's an example from @file{arm.cpu}:
399 @c A poor example. Later.
404 (comment "ARM 710 microprocessor")
406 (unit u-exec "Execution Unit" ()
412 @node Writing define-hardware
413 @subsection Writing define-hardware
415 The registers of the processor are specified with
416 @code{define-hardware}. Also, immediate constants and addresses are
defined to be ``hardware''. By convention, all hardware element names
418 are prefaced with @samp{h-}. This convention must be followed.
420 Pre-defined hardware elements are:
424 Normal CPU memory@footnote{A temporary simplifying assumption is to treat all
425 memory identically. Being able to specify various kinds of memory
(e.g., on-chip RAM or ROM) is a work in progress.}
434 an instruction address
437 Where are floats you ask? They'll be defined when the need arises.
439 The program counter is named @samp{h-pc} and must be specified.
440 It is not a builtin element as sometimes architectures need to
441 modify its behaviour (in the get/set specs).
443 @xref{Hardware elements}, for details.
445 Here's an example from @file{arm.cpu}:
450 (comment "general registers")
451 (attrs PROFILE CACHE-ADDR)
452 (type register WI (16))
453 (indices extern-keyword gr-names)
457 @node Writing define-ifield
458 @subsection Writing define-ifield
460 Writing instruction field entries involves analyzing the instruction set
461 and creating an entry for each field. If a field has multiple purposes,
462 one can create separate entries for each intended purpose. The names
463 should generally follow the names used by the architecture reference manual.
465 By convention, all instruction field names are prefaced with @samp{f-}. This
466 convention must be followed.
468 CGEN tries to allow the use of the bit numbering as found in the architecture
469 reference manual. This minimizes transcription errors both when writing the
470 @samp{.cpu} file and later when communicating field info to people.
472 There are two key pieces of data that CGEN uses to organize field
473 specification: the default insn word size (in bits), and whether bit number
474 0 is the LSB (least significant bit) or the MSB (most significant bit).
In the general case, fields are described with four numbers: word-offset,
word-length, start, and length.
All instruction fields live in exactly one word and must
be contiguous.@footnote{This doesn't include fields like multi-ifields.}
Non-contiguous fields are specified with ``multi-ifields'', which are fields
built up out of several smaller, typically disjoint, fields.
The size of the word depends on the context. @samp{word-offset} specifies
the offset in bits from the start of the insn to the word containing the
field; it must be a multiple of 8.
@samp{word-length} specifies the size in bits of the word containing the
field; it also must be a multiple of 8.
@samp{start} specifies the position of the MSB of the field in the word.
@samp{length} specifies the size in bits of the field.
490 @xref{Instruction fields}, for details.
494 Suppose an ISA has instructions that are normally 16 bits,
495 but has instructions that may take an additional 32 bit immediate
496 and optionally an additional 16 bit immediate after that.
497 Also suppose the ISA numbers the bits starting from the LSB.
499 default-insn-word-bitsize = 16, lsb0? = #t
501 An instruction with four 4 bit fields, one 32 bit immediate
502 and one 16 bit immediate might be:
506 +-----+-----+----+----+--------+--------+
507 | op1 | op2 | r1 | r2 | simm32 | simm16 |
508 +-----+-----+----+----+--------+--------+
510 word-offset word-length start length
515 f-simm32: 16 32 31 32
516 f-simm16: 48 16 15 16
520 If lsb0? = #f, then the example becomes:
524 word-offset word-length start length
534 Endianness for the purposes of this example is irrelevant.
535 In the word containing op1,op2,r1,r2, op1 is in the most significant nibble
536 and r2 is in the least significant nibble.
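
As a rough illustration of how the four numbers locate a field, here is a
minimal C sketch that pulls a field out of its containing word. It is not
CGEN's generated code: the function name @code{extract_field} and its
signature are made up for this example, and it assumes the containing word
has already been fetched into a host integer (so @samp{word-offset} has
already been applied) and is at most 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Extract a field from the word that contains it.
   START is the position of the field's MSB, numbered according to LSB0
   (non-zero: bit 0 is the LSB; zero: bit 0 is the MSB).
   Hypothetical helper for illustration only.  */
uint32_t extract_field (uint32_t word, unsigned word_length,
                        unsigned start, unsigned length, int lsb0)
{
  assert (word_length >= 1 && word_length <= 32);
  assert (start < word_length);
  assert (length >= 1 && length <= word_length);

  /* Normalise START to LSB-0 numbering.  */
  unsigned msb = lsb0 ? start : word_length - 1 - start;
  assert (msb + 1 >= length);

  /* Shift the field down to bit 0 and mask off everything above it.  */
  unsigned shift = msb + 1 - length;
  uint32_t mask = length == 32 ? 0xffffffffu : (1u << length) - 1u;
  return (word >> shift) & mask;
}
```

With the 16 bit word @code{0x1234} and the layout above, op1 is the most
significant nibble, so start 15 with lsb0? = #t and start 0 with
lsb0? = #f both name the same four bits.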
For a large number of cases specifying all four numbers is excessive.
With careful redefinition of the starting bit number, one can get away with
specifying only start and length.
541 Imagine several words of the default insn word size laid out from the start of
542 the insn. On top of that lay the field. Now pick the minimal set of words
543 that are required to contain the field. That is the ``word'' we use.
544 The @samp{start} value is basically computed by adding the offset of the first
545 containing word to the starting bit of the field in the word. It's slightly
546 more complicated than that because lsb0? and the word's size must be taken
547 into account. This is best illustrated by rewriting the above example:
573 Note: This simpler definition doesn't work in all cases. Where it doesn't
574 the full-blown definition must be used.
There are currently no shorthand macros for specifying the full-blown
definition. If you need to use it, consider writing a macro to reduce
typing.
Written out in full, the @code{f-op1} field would be specified as:
587 (attrs) ; no attributes, could be elided if one wants
593 (encode #f) ; no special encoding, could be elided if one wants
  (decode #f) ; no special decoding, could be elided if one wants
599 A macro to simplify that could be written as:
603 ; dwf: define-word-field (??? pick a better name)
605 (define-pmacro (dwf x-name x-comment x-attrs
606 x-word-offset x-word-length x-start x-length
607 x-mode x-encode x-decode)
608 "Define a field including its containing word."
612 (.splice attrs (.unsplice x-attrs))
613 (word-offset x-word-offset)
614 (word-length x-word-length)
618 (.splice encode (.unsplice x-encode))
619 (.splice decode (.unsplice x-decode))
625 The @samp{.splice} is necessary because @samp{attrs}, @samp{encode},
626 and @samp{decode} take a list as an argument.
628 One would then write f-op1 as:
632 (dwf f-op1 "f-op1" () 0 16 15 4 UINT #f #f)
636 @node Writing define-normal-insn-enum
637 @subsection Writing define-normal-insn-enum
639 Writing instruction enum entries involves analyzing the instruction set
640 and attaching names to the opcode fields. For example, if a field named
641 @samp{op1} is used to select which of add, addc, sub, subc, and, or,
642 xor, and inv instructions, one could write something like the following:
645 (define-normal-insn-enum
646 insn-op1 "insn format enums" () f-op1 OP1_
652 These entries simplify instruction definitions by giving a name to a
653 particular value for a particular instruction field. By convention,
654 enum names are uppercase. This convention must be followed.
656 @xref{Enumerated constants}, for details.
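
To give a feel for what such an entry amounts to on the C side, here is a
hand-written sketch, not actual CGEN output. It assumes the enum's values
are the eight mnemonics named above, in that order, numbered from zero as
in a C enum, each with the @samp{OP1_} prefix attached. The helper
@code{is_add} is purely illustrative.

```c
#include <assert.h>

/* Hand-written approximation of the enum described by insn-op1.
   Assumption: values are numbered sequentially from zero.  */
typedef enum insn_op1 {
  OP1_ADD, OP1_ADDC, OP1_SUB, OP1_SUBC,
  OP1_AND, OP1_OR, OP1_XOR, OP1_INV
} INSN_OP1;

/* Illustrative helper: does the 4 bit op1 field select "add"?  */
int is_add (unsigned f_op1)
{
  assert (f_op1 < 16);
  return f_op1 == (unsigned) OP1_ADD;
}
```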
658 @node Writing define-operand
659 @subsection Writing define-operand
661 Operands are what instruction semantics use to refer to hardware
662 elements. The typical use of an operand is to map instruction fields to
663 hardware. For example, if field @samp{f-r2} is used to specify one of
664 the registers defined by the @code{h-gr} hardware entry, one could write
665 something like the following:
667 @code{(dnop sr "source register" () h-gr f-r2)}
669 @code{dnop} is short for ``define normal operand'' @footnote{A profound
670 aversion to typing causes me to often provide brief names of things that
673 @xref{Instruction operands}, for more information.
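
Conceptually, an operand such as @code{sr} is the hardware element selected
by the value of its instruction field. The following C sketch illustrates
just that idea. @code{struct toy_hw}, the 16-entry register array, and
@code{operand_sr} are assumptions made for this example and are not what
CGEN generates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { NUM_GR = 16 };

/* Toy stand-in for the h-gr hardware element.  */
struct toy_hw {
  int32_t h_gr[NUM_GR];
};

/* The "sr" operand: the h-gr register indexed by field f-r2.
   Returning a pointer lets semantics both read and write it.  */
int32_t *operand_sr (struct toy_hw *hw, unsigned f_r2)
{
  assert (hw != NULL && f_r2 < NUM_GR);
  return &hw->h_gr[f_r2];
}
```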
675 @node Writing define-insn
676 @subsection Writing define-insn
678 A large part of writing a @file{.cpu} file is going through the CPU manual
679 and writing an entry for each instruction.
Instructions specific to a particular machine variant are
marked as such with the `MACH' attribute. Example:
685 add "add instruction"
686 ((MACH mach1)) ; or (MACH mach1,mach2,...) for multiple variants
691 The `base' machine is a predefined machine variant that includes
692 instructions available to all variants, and is the default if no
693 `MACH' attribute is specified.
695 @xref{Instructions}, for details.
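
One way to picture the effect of the `MACH' attribute is as a set of
machine variants attached to each instruction, tested against the variant
currently selected. The C sketch below models that with a bitmask. All
names (@code{MACH_BASE}, @code{insn_available_p}, and so on) are invented
for this illustration and are not CGEN's generated interface.

```c
#include <assert.h>

/* Hypothetical machine variants; the bit positions are arbitrary.  */
enum mach { MACH_BASE, MACH_MACH1, MACH_MACH2 };

#define MACH_BIT(m) (1u << (m))

/* INSN_MACHS is the set of variants named by the insn's MACH attribute.
   An insn with no MACH attribute defaults to `base', which is available
   on every variant.  */
int insn_available_p (unsigned insn_machs, enum mach current)
{
  assert (insn_machs != 0);
  return (insn_machs & (MACH_BIT (MACH_BASE) | MACH_BIT (current))) != 0;
}
```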
697 @c Seems like this part belongs elsewhere.
698 When the @file{.cpu} file is processed, CGEN will analyze the semantics
704 The list of hardware elements read by the instruction.
706 @item output operands
708 The list of hardware elements written by the instruction.
712 Instruction attributes that can be computed from the semantics.
714 CTI: control transfer instruction, generally a branch.
719 The instruction unconditionally sets pc.
723 The instruction conditionally sets pc.
727 NB. This is an experimental attribute. Its usage needs to evolve.
731 NB. This is an experimental attribute. Its usage needs to evolve.
736 CGEN will also try to simplify the semantics as much as possible:
739 @item Constant folding
741 Expressions involving constants are simplified and any resulting
742 non-taken paths of conditional expressions are discarded.
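
For readers unfamiliar with the technique, the following C sketch shows
constant folding on a tiny expression tree. It is a generic illustration
and not CGEN's implementation: the node kinds, @code{struct expr}, and
@code{fold} are all invented here, and only addition and a conditional are
handled.

```c
#include <assert.h>
#include <stddef.h>

enum kind { K_CONST, K_OPERAND, K_ADD, K_IF };

struct expr {
  enum kind kind;
  long value;               /* valid for K_CONST only */
  struct expr *a, *b, *c;   /* K_ADD: a + b;  K_IF: if a then b else c */
};

/* Return a simplified version of E, rewriting nodes in place.  */
struct expr *fold (struct expr *e)
{
  assert (e != NULL);
  switch (e->kind)
    {
    case K_CONST:
    case K_OPERAND:
      break;
    case K_ADD:
      e->a = fold (e->a);
      e->b = fold (e->b);
      if (e->a->kind == K_CONST && e->b->kind == K_CONST)
        {
          /* Both inputs are known: replace the add with its result.  */
          e->value = e->a->value + e->b->value;
          e->kind = K_CONST;
        }
      break;
    case K_IF:
      e->a = fold (e->a);
      if (e->a->kind == K_CONST)
        /* Condition is known: keep only the taken path.  */
        return fold (e->a->value != 0 ? e->b : e->c);
      e->b = fold (e->b);
      e->c = fold (e->c);
      break;
    }
  return e;
}

/* Example: (if (add 1 -1) operand (add 2 3)).  The condition folds to 0,
   so the "then" path is discarded and the "else" path folds to 5.  */
long fold_example (void)
{
  struct expr one = { K_CONST, 1, NULL, NULL, NULL };
  struct expr minus_one = { K_CONST, -1, NULL, NULL, NULL };
  struct expr cond = { K_ADD, 0, &one, &minus_one, NULL };
  struct expr operand = { K_OPERAND, 0, NULL, NULL, NULL };
  struct expr two = { K_CONST, 2, NULL, NULL, NULL };
  struct expr three = { K_CONST, 3, NULL, NULL, NULL };
  struct expr sum = { K_ADD, 0, &two, &three, NULL };
  struct expr top = { K_IF, 0, &cond, &operand, &sum };
  struct expr *r = fold (&top);
  assert (r->kind == K_CONST);
  return r->value;
}
```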
745 @node Writing define-macro-insn
746 @subsection Writing define-macro-insn
748 Some instructions are really aliases for other instructions, maybe even
749 a sequence of them. For example, an architecture that has a general
750 decrement-then-store instruction might have a specialized version of
751 this instruction called @code{push} supported by the assembler. These
752 are handled with ``macro instructions''.
754 @xref{Macro-instructions}, for details.
756 Macro instructions are used by the assembler/disassembler only.
757 They are not used by the simulator.
For example, if this were the real instruction:
762 (dni st-minus "st-" ()
764 (+ OP1_2 OP2_7 src1 src2)
765 (sequence ((WI new-src2))
766 (set new-src2 (sub src2 (const 4)))
767 (set (mem WI new-src2) src1)
773 One could write a @code{push} variant with:
778 (emit st-minus src1 (src2 15)) ; "st %0,@-sp"
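
The relationship between the two can be pictured at the encoding level: the
macro instruction contributes no encoding of its own, it only fills in
operands of the real one. The C sketch below shows this under a loudly
stated assumption, namely a 16 bit word laid out as op1, src1, op2, src2
from the most significant nibble down, which the text above does not
specify. The function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

enum { OP1_2 = 2, OP2_7 = 7, SP_REGNO = 15 };

/* Encode "st-minus".
   ASSUMED layout, most significant nibble first: op1 | src1 | op2 | src2.  */
uint16_t encode_st_minus (unsigned src1, unsigned src2)
{
  assert (src1 < 16 && src2 < 16);
  return (uint16_t) ((OP1_2 << 12) | (src1 << 8) | (OP2_7 << 4) | src2);
}

/* "push" is st-minus with src2 fixed to register 15.  */
uint16_t encode_push (unsigned src1)
{
  return encode_st_minus (src1, SP_REGNO);
}
```

Because the alias exists only in the assembler and disassembler, a
simulator working from these bits sees nothing but @code{st-minus}.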
782 @node Using define-pmacro
783 @subsection Using define-pmacro
785 When a group of entries, say instructions, share similar information, a
786 macro (in the C preprocessor sense) can be used to simplify the
787 description. This can be used to save a lot of typing, which can also
788 improve readability since often one page of code is easier to understand
791 @xref{Preprocessor macros}, for details.
793 Here is an example from the M32R port.
796 (define-pmacro (bin-op mnemonic op2-op sem-op imm-prefix imm)
799 (.str mnemonic " reg/reg")
801 (.str mnemonic " $dr,$sr")
802 (+ OP1_0 op2-op dr sr)
803 (set dr (sem-op dr sr))
806 (dni (.sym mnemonic "3")
807 (.str mnemonic " reg/" imm)
809 (.str mnemonic "3 $dr,$sr," imm-prefix "$" imm)
810 (+ OP1_8 op2-op dr sr imm)
811 (set dr (sem-op sr imm))
816 (bin-op add OP2_10 add "$hash" slo16)
817 (bin-op and OP2_12 and "" uimm16)
818 (bin-op or OP2_14 or "$hash" ulo16)
819 (bin-op xor OP2_13 xor "" uimm16)
@code{.sym} and @code{.str} are short for Scheme's @code{symbol-append} and
@code{string-append} operations, respectively, and are conceptually the same
as the C preprocessor's @code{##} concatenation operator.
@xref{Symbol concatenation}, and @ref{String concatenation}, for details.
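
Since the text compares pmacros to the C preprocessor, here is the
analogous trick written in C, as a loose analogy only. @code{BIN_OP} and
the @code{sem_} and @code{syntax_} prefixes are invented for this sketch.
It mirrors how @code{bin-op} stamps out a register form and a
@samp{3}-suffixed immediate form from one template, using @code{##} where
the pmacro uses @code{.sym}, and stringizing plus adjacent string literals
where it uses @code{.str}.

```c
#include <assert.h>
#include <stdint.h>

/* One template yields two functions and one syntax string per mnemonic.  */
#define BIN_OP(mnemonic, op)                               \
  uint32_t sem_##mnemonic (uint32_t dr, uint32_t sr)       \
  { return dr op sr; }                                     \
  uint32_t sem_##mnemonic##3 (uint32_t sr, uint32_t imm)   \
  { return sr op imm; }                                    \
  const char *syntax_##mnemonic (void)                     \
  { return #mnemonic " $dr,$sr"; }

BIN_OP (add, +)
BIN_OP (and, &)
BIN_OP (or, |)
BIN_OP (xor, ^)

/* Quick self-check of the generated functions.  */
void bin_op_selfcheck (void)
{
  assert (sem_add (2, 3) == 5);
  assert (sem_add3 (2, 3) == 5);
  assert (sem_xor (6, 3) == 5);
  assert (syntax_add ()[0] == 'a');
}
```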
827 @node Splicing list arguments
828 @subsection Splicing arguments
830 Several cpu description elements take a list as an argument (as opposed
832 When constructing a call to define-* in a pmacro, these elements must have
833 their arguments spliced in to achieve the proper syntax.
835 This is best explained with an example.
836 Here's a simplifying macro for writing ifield definitions with every
839 @xref{List splicing}, for details.
843 ; dwf: define-word-field
845 (define-pmacro (dwf x-name x-comment x-attrs
846 x-word-offset x-word-length x-start x-length
847 x-mode x-encode x-decode)
848 "Define a field including its containing word."
852 (.splice attrs (.unsplice x-attrs))
853 (word-offset x-word-offset)
854 (word-length x-word-length)
858 (.splice encode (.unsplice x-encode))
859 (.splice decode (.unsplice x-decode))
865 The @samp{.splice} is necessary because @samp{attrs}, @samp{encode},
866 and @samp{decode} take a list as an argument.
868 One would then write f-op1 as:
872 (dwf f-op1 "f-op1" () 0 16 15 4 UINT #f #f)
876 @node Interactive development
877 @subsection Interactive development
879 The normal way@footnote{Normal for some anyway, certainly each person will have
880 their own preference.} of writing a CPU description file involves starting Guile
and developing the @file{.cpu} file interactively. The basic steps are:
884 @item Run @code{guile}.
885 @item @code{(load "dev.scm")}
886 @item Load application, e.g. @code{(load-opc)} or @code{(load-sim)}
887 @item Load CPU description file, e.g. @code{(cload #:arch "cpu/m32r.cpu")}
888 @item Run generators until output looks reasonable, e.g. @code{(cgen-opc.c)}
891 To assist in the development process and to cut down on some typing,
892 @file{dev.scm} looks for @file{$HOME/.cgenrc} and, if present, loads it.
893 Typical things that @file{.cgenrc} contains are definitions of procedures
894 that combine steps 3 and 4 above.
901 (cload #:arch "cpu/m32r.cpu")
905 (cload #:arch "cpu/m32r.cpu" #:options "with-scache with-profile=fn")
909 (cload #:arch "cpu/m32r.cpu" #:machs "m32r" #:options "with-scache with-profile=fn")
913 (cload #:arch "cpu/m32r.cpu" #:machs "m32rx" #:options "with-scache with-profile=fn")
917 CPU description files are loaded into an interactive guile session with
918 @code{cload}. The syntax is:
921 (cload #:arch "cpu-file-path"
922 [#:machs "mach-list"]
924 [#:options "option-list"])
927 Only the @code{#:arch} argument is mandatory.
@samp{cpu-file-path} is the path to the @file{.cpu} file.

@samp{mach-list} is a comma-separated string of machines to keep.

@samp{isa-list} is a comma-separated string of isas to keep.

@samp{option-list} is a space-separated string of options for the application.
937 @node Doing an opcodes port
938 @section Doing an opcodes port
940 The best way to begin a port is to take an existing one (preferably one
941 that is similar to the new port) and use it as a template.
944 @item Run @code{guile}.
945 @item @code{(load "dev.scm")}. This loads in a set of interactive
946 development routines.
947 @item @code{(load-opc)}. Load the opcodes support.
948 @item Edit your @file{cpu/<arch>.cpu} and @file{cpu/<arch>.opc} files.
950 @item The @file{.cpu} file is the main description file.
951 @item The @file{.opc} file provides additional C support code.
953 @item @code{(cload #:arch "cpu/<arch>.cpu")}
956 @item @code{(cgen-desc.h)}
957 @item @code{(cgen-desc.c)}
958 @item @code{(cgen-opc.h)}
959 @item @code{(cgen-opc.c)}
960 @item @code{(cgen-ibld.in)}
961 @item @code{(cgen-asm.in)}
962 @item @code{(cgen-dis.in)}
963 @item @code{(cgen-opinst.c)} -- [optional]
965 @item Repeat steps 4, 5 and 6 until the output looks reasonable.
966 @item Add dependencies to @file{opcodes/Makefile.am} to generate the
967 eight opcodes files (use the M32R port as an example).
968 @item Run @code{make dep} from the @file{opcodes} build directory.
969 @item Run @code{make all-opcodes} from the top level build directory.
972 @node Doing a GAS port
973 @section Doing a GAS port
A GAS CGEN port is essentially no different from a normal port except
976 that the CGEN opcode table is used, and there are extra supporting
977 routines available in @file{gas/cgen.[ch]}. As always, a good way to
978 get started is to take the M32R port as a template and go from there.
980 The important CGEN-specific things to keep in mind are:
981 @c to be expanded on as time permits
984 @item Several support routines are provided by @file{gas/cgen.c}. Some
985 must be used, others are available to use if you want to (in general
986 they should be used unless it's not possible).
989 @item @code{gas_cgen_init_parse}
991 @item Call from @code{md_assemble} before doing anything
995 @item @code{gas_cgen_record_fixup}
997 @item Cover function to @code{fix_new}.
999 @item @code{gas_cgen_record_fixup_exp}
1001 @item Cover function to @code{fix_new_exp}.
1003 @item @code{gas_cgen_parse_operand}
1005 @item Callback for opcode table based parser, set in
1008 @item @code{gas_cgen_finish_insn}
1010 @item After parsing an instruction, call this to add the
1011 instruction to the frag and queue any fixups.
1013 @item @code{gas_cgen_md_apply_fix}
1015 @item Provides basic @code{md_apply_fix} support.
1016 @item @code{#define md_apply_fix
1017 gas_cgen_md_apply_fix} if you're able to use
1020 @item @code{gas_cgen_tc_gen_reloc}
@item Provides basic @code{tc_gen_reloc} support.
1023 @item @code{#define tc_gen_reloc gas_cgen_tc_gen_reloc}
1024 if you're able to use it.
1028 @item @code{md_begin} should contain the following (plus anything else you
1032 /* Set the machine number and endianness. */
1033 gas_cgen_opcode_desc =
1034 <arch>_cgen_cpu_open (CGEN_CPU_OPEN_MACHS,
1035 0 /* mach number */,
1036 CGEN_CPU_OPEN_ENDIAN,
1039 : CGEN_ENDIAN_LITTLE),
1042 <arch>_cgen_init_asm (gas_cgen_opcode_desc);
1044 /* This is a callback from cgen to gas to parse operands. */
1045 cgen_set_parse_operand_fn (gas_cgen_opcode_desc, gas_cgen_parse_operand);
1048 @item @code{md_assemble} should contain the following basic framework:
1052 const CGEN_INSN *insn;
1056 cgen_insn_t buffer[CGEN_MAX_INSN_SIZE / sizeof (CGEN_INSN_INT)];
1058 char buffer[CGEN_MAX_INSN_SIZE];
1061 gas_cgen_init_parse ();
  insn = <arch>_cgen_assemble_insn (gas_cgen_opcode_desc, str,
1064 &fields, buffer, &errmsg);
1072 gas_cgen_finish_insn (insn, buffer, CGEN_FIELDS_BITSIZE (&fields),
1073 relax_p, /* non-zero to allow relaxable insns */
1074 result); /* non-null if results needed for later */
1080 @node Building a GAS test suite
1081 @section Building a GAS test suite
1083 CGEN can also build the template for test cases for all instructions. In
1084 some cases it can also generate the actual instructions. The result is
1085 then assembled, disassembled, verified, and checked into CVS. Further
1086 changes are usually done by hand as it's easier. The goal here is to
1087 save the enormous amount of initial typing that is required.
1090 @item @code{cd} to the CGEN build directory
1091 @item @code{make gas-test}
1093 At this point two files have been created in the CGEN build directory:
1094 @file{gas-allinsn.exp} and @file{gas-build.sh}. The @file{gas-build.sh}
1095 script normally requires one command line argument: the location of your
1096 @file{gas} build directory. If this argument is omitted, the script
1097 searches in @file{../gas} automatically.
1099 @item Copy @file{gas-allinsn.exp} to @file{toplevel/gas/testsuite/gas/<arch>/allinsn.exp}.
1100 @item @code{sh gas-build.sh}
At this point the directory @file{tmpdir} contains two files:
@file{allinsn.s} and @file{allinsn.d}. The file @file{allinsn.d} usually
needs a bit of massaging.
1105 @item Copy @file{tmpdir/allinsn.[sd]} to @file{toplevel/gas/testsuite/gas/<arch>}
1106 @item Run @code{make check} in the @file{gas} build directory and
1107 massage things until you're satisfied the files are correct.
1108 @item Check files into CVS.
1111 At this point further additions/modifications are usually done by hand.
1113 @node Doing a simulator port
1114 @section Doing a simulator port
1116 The same basic procedure for opcodes porting applies here.
1119 @item Run @code{guile}.
1120 @item @code{(load "dev.scm")}
1121 @item @code{(load-sim)}
1122 @item Edit your @file{cpu/<arch>.cpu} file.
1123 @item @code{(cload #:arch "cpu/<arch>.cpu")}
1126 @item @code{(cgen-arch.h)}
1127 @item @code{(cgen-arch.c)}
1128 @item @code{(cgen-cpuall.h)}
@item Repeat steps 4, 5 and 6 until the output looks reasonable.
@item Edit your @file{cpu/<arch>.cpu} file.
1132 @item @code{(cload #:arch "cpu/<arch>.cpu" #:machs "mach1[,mach2[,...]]")}
1135 @item @code{(cgen-cpu.h)}
1136 @item @code{(cgen-cpu.c)}
1137 @item @code{(cgen-decode.h)}
1138 @item @code{(cgen-decode.c)}
1139 @item @code{(cgen-semantics.c)}
1140 @item @code{(cgen-sem-switch.c)} -- only if using a switch()
1141 version of semantics.
1142 @item @code{(cgen-model.c)}
1144 @item Repeat steps 8, 9 and 10 until the output looks reasonable.
1147 The following additional files are also needed. These live in the
1148 @file{sim/<arch>} directory. Administrivia files like
1149 @file{configure.in} and @file{Makefile.in} are omitted.
1152 @item @file{sim-main.h}
1154 Main include file required by the ``common'' (@file{sim/common})
1155 support, and by each target's @file{.c} file.
1156 This file includes the relevant other headers.
1157 The order is fairly important.
1158 @file{m32r/sim-main.h} is a good starting point.
1160 @file{sim-main.h} also defines several types:
1163 @item @code{_sim_cpu} -- a struct containing all state for a
1165 @item @code{sim_state} -- contains all state of the simulator.
1166 A @code{SIM_DESC} (which is the result of @code{sim_open} and is akin
1167 to a file descriptor) points to one of these.
1168 @item @code{sim_cia} -- type of an instruction address. For
1169 CGEN this is generally ``word mode'', in GCC parlance.
1172 @file{sim-main.h} also defines several macros:
1175 @item @code{CIA_GET(cpu)} -- return ``cia'' of the CPU
1176 @item @code{CIA_SET(cpu,cia)} -- set the ``cia'' of the CPU
1179 ``cia'' is short for ``current instruction address''.
1181 The definition of @code{sim_state} is fairly simple. Just copy the M32R
1182 case. The definition of @code{_sim_cpu} is more involved, so pay
1183 attention. The complexity comes from trying to create a ``derived
1184 class'' of @code{sim_cpu} for each CPU family. Each CPU family's set of
1185 files sees a different version of @code{sim_cpu}, and every version
1186 begins with the same ``base class'' leading part, which is what
1187 non-CPU-family-specific files use. The family-specific version is
1188 selected by defining @code{WANT_CPU_<CPU-FAMILY-NAME>} at the
1189 top of CPU-family-specific files. The definition of @code{_sim_cpu} is
1194 /* sim/common CPU base */
1196 /* Static parts of CGEN. */
1198 #if defined (WANT_CPU_CPUFAM1)
1199 CPUFAM1_CPU_DATA CPU_data;
1200 #elif defined (WANT_CPU_CPUFAM2)
1201 CPUFAM2_CPU_DATA CPU_data;
1206 @item @file{tconfig.in}
1208 This file predates @file{sim-main.h} and was/is intended to contain
1209 macros that configure the simulator sources.
1212 @item @code{SIM_HAVE_MODEL} -- enable @file{common/sim-model.[ch]}
1214 @item @code{SIM_HANDLES_LMA} -- makes @file{sim-hload.c} do the right
1216 @item @code{WITH_SCACHE_PBB} -- define this to 1 if using pbb scaching.
1219 @item @file{<arch>-sim.h}
1221 This file predates @file{sim-main.h} and contains miscellaneous macros
1222 and definitions used by the simulator.
1224 @item @file{mloop.in}
1226 This file contains code to implement the fetch/execute process. There
1227 are various ways to do this, and several are supported. Which one to
1228 choose depends on the environment in which the CPU will be used. For
1229 example, when executing a program in a single-CPU environment without
1230 devices, most or all available cycles can be devoted to simulation of the
1231 target CPU. However, in an environment with devices or multiple CPUs, one
1232 may wish the CPU to execute one instruction then relinquish control so a
1233 device operation may be done or an instruction can be simulated on a
1234 second CPU. Efficient techniques for the former aren't necessarily the best
1237 Three versions are currently supported:
1240 @item simple -- fetch/decode/execute one insn
1241 @item scache -- same as simple but results of decoding are cached
1242 @item pbb -- same as scache but several insns are handled each iteration;
1243 ``pbb'' stands for pseudo basic block.
1246 This file is processed by @file{common/genmloop.sh} at build time. The
1247 result is two files: @file{mloop.c} and @file{eng.h}.
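
As a rough illustration of the difference between the first two
versions, here is a minimal, self-contained C sketch. It is not the
code that @file{common/genmloop.sh} produces. The toy instruction
encoding and the names @code{toy_cpu}, @code{decoded_insn},
@code{run_simple} and @code{run_scache} are assumptions made up for this
example; it only shows the idea of caching decode results by
instruction address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy ISA (not a real target): the top 8 bits of each
   word are the opcode, the low 24 bits are an operand.  */
enum { OP_ADD = 0, OP_JUMP_IF_LT = 1, OP_HALT = 2 };

typedef struct {
  int valid;            /* nonzero once this entry holds a decoded insn */
  uint32_t pc;          /* tag: address the entry was decoded from */
  int opcode;
  int32_t operand;
} decoded_insn;

typedef struct {
  uint32_t pc;
  int32_t acc;
  int halted;
  unsigned decode_count;   /* how many times decode() ran */
} toy_cpu;

#define SCACHE_SIZE 16
static decoded_insn scache[SCACHE_SIZE];   /* direct-mapped decode cache */

decoded_insn decode (toy_cpu *cpu, const uint32_t *mem, size_t mem_words,
                     uint32_t pc)
{
  decoded_insn d;
  assert (pc < mem_words);
  d.valid = 1;
  d.pc = pc;
  d.opcode = (int) (mem[pc] >> 24);
  d.operand = (int32_t) (mem[pc] & 0xffffffu);
  cpu->decode_count++;
  return d;
}

void execute (toy_cpu *cpu, const decoded_insn *d)
{
  switch (d->opcode)
    {
    case OP_ADD:
      cpu->acc += d->operand;
      cpu->pc++;
      break;
    case OP_JUMP_IF_LT:   /* branch to address 0 while acc < operand */
      cpu->pc = (cpu->acc < d->operand) ? 0 : cpu->pc + 1;
      break;
    default:              /* OP_HALT and anything unknown stop the CPU */
      cpu->halted = 1;
      break;
    }
}

/* "simple": fetch, decode and execute one insn per iteration.  */
void run_simple (toy_cpu *cpu, const uint32_t *mem, size_t mem_words)
{
  while (!cpu->halted)
    {
      decoded_insn d = decode (cpu, mem, mem_words, cpu->pc);
      execute (cpu, &d);
    }
}

/* "scache": same loop, but a decoded insn is reused if its address
   is already in the cache.  */
void run_scache (toy_cpu *cpu, const uint32_t *mem, size_t mem_words)
{
  while (!cpu->halted)
    {
      decoded_insn *slot = &scache[cpu->pc % SCACHE_SIZE];
      if (!slot->valid || slot->pc != cpu->pc)
        *slot = decode (cpu, mem, mem_words, cpu->pc);
      execute (cpu, slot);
    }
}
```

For a three-insn program that adds 1, loops back while the accumulator
is below 10, and then halts, @code{run_simple} decodes 21 times while
@code{run_scache} decodes only 3 times. The pbb version is not shown.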
1249 @item @file{sim-if.c}
1251 By convention this file contains @code{sim_open}, @code{sim_close},
1252 @code{sim_create_inferior}, and @code{sim_do_command}. These functions can
1253 live in any file, of course. They're here because they're the parts of
1254 the @file{remote-sim.h} interface that aren't provided by the common
1257 @item @file{<cpufam>.c}
1259 By convention this file contains register access and model support
1260 functions for a CPU family (this file is misnamed in the
1261 M32R case). The register access functions implement the
1262 @code{sim_fetch_register} and @code{sim_store_register} interface
1263 functions (named @code{<cpufam>_@{fetch,store@}_register}), and support
1264 code for register get/set RTL. The model support functions implement the
1265 before/after handlers (functions that handle tracing/profiling) and
1266 timing for each function unit.
1270 The M32R port has two other handwritten files: @file{devices.c} and
1271 @file{traps.c}. How you wish to organize this is up to you.
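
The ``derived class'' arrangement described for @file{sim-main.h} can
be tried out in isolation with the following self-contained C sketch.
The member types here (@code{sim_cpu_base}, @code{CGEN_CPU} and the two
@code{CPUFAMn_CPU_DATA} structs) are hypothetical stand-ins with made-up
contents, and the @code{CIA_GET}/@code{CIA_SET} definitions and the
helpers @code{advance_cia} and @code{base_is_leading_part} are
assumptions for illustration only. Copy the real definitions from the
M32R port rather than from here.

```c
#include <assert.h>
#include <stddef.h>

/* A CPU-family-specific file would define this before including
   sim-main.h.  Here we pretend to be family 1.  */
#define WANT_CPU_CPUFAM1

/* Hypothetical stand-ins for types the real headers provide.  */
typedef unsigned long sim_cia;
typedef struct { sim_cia cia; } sim_cpu_base;       /* shared leading part */
typedef struct { int unused; } CGEN_CPU;            /* static parts of CGEN */
typedef struct { int gpr[16]; } CPUFAM1_CPU_DATA;   /* family 1 state */
typedef struct { double fpr[8]; } CPUFAM2_CPU_DATA; /* family 2 state */

struct _sim_cpu {
  /* sim/common CPU base */
  sim_cpu_base base;

  /* Static parts of CGEN.  */
  CGEN_CPU cgen_cpu;

  /* Only the member for the selected family is visible.  */
#if defined (WANT_CPU_CPUFAM1)
  CPUFAM1_CPU_DATA CPU_data;
#elif defined (WANT_CPU_CPUFAM2)
  CPUFAM2_CPU_DATA CPU_data;
#endif
};
typedef struct _sim_cpu sim_cpu;

/* Assumed definitions: keep the cia in the shared leading part.  */
#define CIA_GET(cpu) ((cpu)->base.cia)
#define CIA_SET(cpu, val) ((cpu)->base.cia = (val))

/* Family-independent code touches only the leading part, so it works
   whichever WANT_CPU_* macro is defined.  */
sim_cia advance_cia (sim_cpu *cpu, sim_cia insn_size)
{
  assert (cpu != NULL);
  CIA_SET (cpu, CIA_GET (cpu) + insn_size);
  return CIA_GET (cpu);
}

/* Nonzero if the shared part really sits at the start of the struct.  */
int base_is_leading_part (void)
{
  return offsetof (struct _sim_cpu, base) == 0;
}
```

Because every family's version starts with the same members, code that
is compiled without any @code{WANT_CPU_*} macro can still use the
leading part safely.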
1274 @node Building a simulator test suite
1275 @section Building a simulator test suite
1277 CGEN can also build the template for test cases for all instructions. In
1278 some cases it can also generate the actual instructions
1279 @footnote{Although this hasn't been implemented yet.}. The result is
1280 then verified and checked into CVS. Further changes are usually done by
1281 hand as it's easier. The goal here is to save the enormous amount of
1282 initial typing that is required.
1285 @item @code{cd} to the CGEN build directory
1286 @item @code{make sim-test ISA=<arch>}
1288 At this point two files have been created in the CGEN build directory:
1289 @file{sim-allinsn.exp} and @file{sim-build.sh}.
1291 @item Copy @file{sim-allinsn.exp} to
1292 @file{toplevel/sim/testsuite/sim/<arch>/allinsn.exp}.
1293 @item @code{sh sim-build.sh}
1295 At this point a new subdirectory called @file{tmpdir} will be created
1296 and will contain one test case for each instruction. The framework has
1297 been filled in but not the actual test case. It's handy to write an
1298 ``include file'' containing assembler macros that simplify writing test
1299 cases. See @file{toplevel/sim/testsuite/sim/m32r/testutils.inc} for an
1302 @item Write @file{testutils.inc}.
1303 @item Finish each test case.
1304 @item Copy @file{tmpdir/*.cgs} to @file{toplevel/sim/testsuite/sim/<arch>}.
1305 @item Run @code{make check} in the @file{sim} build directory and massage things until you're satisfied the files are correct.
1306 @item Check files into CVS.
1309 @noindent At this point further additions/modifications are usually done