From 38ee6ffeb42b629f47686ad81c77cc7686929bed Mon Sep 17 00:00:00 2001 From: devans Date: Sun, 14 Jun 2009 18:32:27 +0000 Subject: [PATCH] * doc/cgenint.texi: Renamed from internals.texi. Several cleanups. * doc/app.texi: Cleanup pass. * doc/cgen.texi: Cleanup pass. * doc/glossary.texi: Add entries for ifield, iformat, sformat, insn. * doc/intro.texi: Cleanup pass. * doc/mdate-sh: New file. * doc/opcodes.texi: Cleanup pass. * doc/pmacros.texi: Cleanup pass. * doc/porting.texi: Cleanup pass. * doc/rtl.texi: Cleanup pass. * doc/running.texi: Cleanup pass. Document more runtime options. * doc/stamp-vti: Update. * doc/version.texi: Update. --- cgen/ChangeLog | 14 ++ cgen/doc/app.texi | 40 +++-- cgen/doc/cgen.texi | 9 +- cgen/doc/cgenint.texi | 281 ++++++++++++++++++++++++++++++++ cgen/doc/glossary.texi | 28 +++- cgen/doc/internals.texi | 385 -------------------------------------------- cgen/doc/intro.texi | 141 ++++++---------- cgen/doc/mdate-sh | 201 +++++++++++++++++++++++ cgen/doc/notes.texi | 2 +- cgen/doc/opcodes.texi | 4 +- cgen/doc/pmacros.texi | 15 +- cgen/doc/porting.texi | 224 ++++++++++++++++++++------ cgen/doc/rtl.texi | 172 +++++++++++--------- cgen/doc/running.texi | 418 +++++++++++++++++++++++++++++++++++++++++++++++- cgen/doc/sim.texi | 2 +- cgen/doc/stamp-vti | 5 +- cgen/doc/version.texi | 5 +- 17 files changed, 1294 insertions(+), 652 deletions(-) create mode 100644 cgen/doc/cgenint.texi delete mode 100644 cgen/doc/internals.texi create mode 100755 cgen/doc/mdate-sh diff --git a/cgen/ChangeLog b/cgen/ChangeLog index ebc87888ee..feca628b43 100644 --- a/cgen/ChangeLog +++ b/cgen/ChangeLog @@ -1,5 +1,19 @@ 2009-06-14 Doug Evans + * doc/cgenint.texi: Renamed from internals.texi. Several cleanups. + * doc/app.texi: Cleanup pass. + * doc/cgen.texi: Cleanup pass. + * doc/glossary.texi: Add entries for ifield, iformat, sformat, insn. + * doc/intro.texi: Cleanup pass. + * doc/mdate-sh: New file. + * doc/opcodes.texi: Cleanup pass. 
+ * doc/pmacros.texi: Cleanup pass. + * doc/porting.texi: Cleanup pass. + * doc/rtl.texi: Cleanup pass. + * doc/running.texi: Cleanup pass. Document more runtime options. + * doc/stamp-vti: Update. + * doc/version.texi: Update. + * Makefile.am (AUTOMAKE_OPTIONS): Add 1.9 (GUILE): Fix definition. * Makefile.in: Regenerate with automake 1.9.6. diff --git a/cgen/doc/app.texi b/cgen/doc/app.texi index 0a2bd6c2cb..a32a978fbc 100644 --- a/cgen/doc/app.texi +++ b/cgen/doc/app.texi @@ -1,4 +1,4 @@ -@c Copyright (C) 2000 Red Hat, Inc. +@c Copyright (C) 2000, 2009 Red Hat, Inc. @c This file is part of the CGEN manual. @c For copying conditions, see the file cgen.texi. @@ -14,7 +14,7 @@ CGEN application. * File Generation Process:: Workflow in cgen * Coding Conventions:: Coding conventions * Accessing Loaded Data:: Reading data from loaded .cpu files -* Name References:: Architecture names in generated code +* Arch Name References:: Architecture names in generated code * String Building:: Building long strings and writing them out * COS:: Cgen's Object System @end menu @@ -27,12 +27,18 @@ number of source files grows the entire layout may be changed, but until then this is how things are.} It makes it easy to find things. @itemize @bullet + @item top level file is cgen-.scm + The best way to create this file is to copy an existing application's file (e.g. cgen-opc.scm) and modify to suit. + @item file .scm contains general app-specific utilities + @item other files are -foo.scm + @item add entry to dev.scm (load-) + @end itemize @node File Generation Process @@ -48,10 +54,15 @@ options @item source code is loaded @itemize @minus + @item application independent code is loaded if not compiled in + @item application specific code is loaded Currently app-specific code is never compiled in. +@footnote{This dates from a time when CGEN supported being compiled with +Hobbit. 
That support is gone, though something else may take its place.} + @itemize @minus @item doesn't affect speed as much as application independent stuff @item subject to more frequent changes @@ -126,7 +137,7 @@ a line by itself beginning in column one @item definitions internal to a source file begin with '-' @item global state variables are named *foo-bar* [FIXME: current code needs updating] -@item avoid uppercase (except for ???) +@item avoid uppercase, except for constants (e.g. @code{*UNSPECIFIED*}) @item procedures that return a boolean result end in '?' @item procedures that modify something end in '!' @item classes are named @@ -138,7 +149,7 @@ a line by itself beginning in column one Each kind of description file entry (defined with `define-foo') is recorded in an object of class .@footnote{not true for but will be RSN} All the data is collected together in an object of class -.@footnote{got a better name?} +. @footnote{modes aren't recorded here, should they be?} Data for the currently selected architecture is obtained with several @@ -169,8 +180,8 @@ access functions. (current-arch-isa-name-list) - return a list of names (as symbols) of all isas in the architecture - - for most of the remaining elements, there are three main accessors - [foo is sometimes abbreviated] + For most of the remaining elements, there are three main accessors: + [foo is sometimes abbreviated] - current-foo-list - returns list of objects in the architecture - current-foo-add! - add a object to the architecture - current-foo-lookup - lookup the object based on its name @@ -244,8 +255,8 @@ access functions. [there are a few more to be documented, not sure they'll remain as is] @end smallexample -@node Name References -@section Name References +@node Arch Name References +@section Arch Name References To simplify writing code generators, system names can be specified with fixed strings rather than having to compute them. @@ -305,20 +316,23 @@ as arguments. 
For small arguments it's just as well to use @code{string-append}. This is a standard Scheme procedure. The output is also easier to read when developing interactively. And some subroutines are used in multiple -contexts including some where strings are required. +contexts including some where strings are required, so sometimes you +have to use @code{string-append}. @end itemize @node COS @section COS -COS is Cgen's Object System. It's a simple OO system for Guile that +COS is CGEN's Object System. It's a simple OO system for Guile that was written to provide something useful until Guile had its own. -COS will be replaced with GOOPs if the Scheme implementation of cgen is kept. +COS will be replaced with GOOPs if the Scheme implementation of CGEN is kept. The pure Scheme implementation of COS uses vectors to record objects and -classes. The C implementation uses smobs (though classes are still -implemented with vectors). +classes. +@c There no longer is a C implementation, but keep this for awhile. +@c The C implementation uses smobs (though classes are still +@c implemented with vectors). A complete list of user-visible functions is at the top of @file{cos.scm}. diff --git a/cgen/doc/cgen.texi b/cgen/doc/cgen.texi index 1114fbcb47..c351f08c85 100644 --- a/cgen/doc/cgen.texi +++ b/cgen/doc/cgen.texi @@ -12,7 +12,7 @@ END-INFO-DIR-ENTRY @end ifinfo @copying -Copyright @copyright{} 2000, 2007 Red Hat, Inc. +Copyright @copyright{} 2000, 2007, 2009 Red Hat, Inc. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice @@ -31,7 +31,7 @@ into another language, under the above conditions for modified versions. @c @c This file documents the Cpu tools GENerator, CGEN. @c -@c Copyright (C) 2000 Red Hat, Inc. +@c Copyright (C) 2000, 2009 Red Hat, Inc. @c @setchapternewpage odd @@ -43,7 +43,6 @@ into another language, under the above conditions for modified versions. 
@sp 1 @subtitle @value{UPDATED} @author Douglas J. Evans -@author Red Hat, Inc. @page @tex @@ -52,7 +51,7 @@ into another language, under the above conditions for modified versions. @end tex @vskip 0pt plus 1filll -Copyright @copyright{} 2000 Red Hat, Inc. +Copyright @copyright{} 2000, 2009 Red Hat, Inc. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice @@ -71,7 +70,7 @@ into another language, under the above conditions for modified versions. @top Introduction @cindex version -This brief manual contains preliminary documentation for the CGEN program, +This manual documents the CGEN program, version @value{VERSION}. @menu diff --git a/cgen/doc/cgenint.texi b/cgen/doc/cgenint.texi new file mode 100644 index 0000000000..5fa875b2c3 --- /dev/null +++ b/cgen/doc/cgenint.texi @@ -0,0 +1,281 @@ +\input texinfo @c -*- Texinfo -*- +@setfilename cgenint.info + +@c This file is work in progress. + +@c Automake requires this to have a different name than version.texi +@c since we already use version.texi in cgen.texi. +@c But there's no real point to having a version file here. +@c @include versionint.texi + +@copying +Copyright @copyright{} 2000, 2007, 2009 Red Hat, Inc. + +Permission is granted to make and distribute verbatim copies of +this manual provided the copyright notice and this permission notice +are preserved on all copies. + +Permission is granted to copy and distribute modified versions of this +manual under the conditions for verbatim copying, provided also that +the entire resulting derived work is distributed under the terms of a +permission notice identical to this one. + +Permission is granted to copy and distribute translations of this manual +into another language, under the above conditions for modified versions. +@end copying + +@synindex ky cp +@c +@c This file documents the internals of the Cpu tools GENerator, CGEN. +@c +@c Copyright (C) 2000 Red Hat, Inc. 
+@c + +@setchapternewpage odd +@settitle CGEN +@titlepage +@finalout +@title The Cpu tools GENerator, CGEN. +@c @subtitle Version @value{VERSION} +@sp 1 +@subtitle @value{UPDATED} +@author Ben Elliston +@author Red Hat, Inc. +@page + +@tex +{\parskip=0pt \hfill Red Hat, Inc.\par \hfill +\TeX{}info \texinfoversion\par } +@end tex + +@vskip 0pt plus 1filll +Copyright @copyright{} 2000 Red Hat, Inc. + +Permission is granted to make and distribute verbatim copies of +this manual provided the copyright notice and this permission notice +are preserved on all copies. + +Permission is granted to copy and distribute modified versions of this +manual under the conditions for verbatim copying, provided also that +the entire resulting derived work is distributed under the terms of a +permission notice identical to this one. + +Permission is granted to copy and distribute translations of this manual +into another language, under the above conditions for modified versions. +@end titlepage + +@node Top +@top Introduction + +@cindex version +This manual documents the internals of CGEN. +@c version @value{VERSION}. + +@menu +* Introduction:: Introduction +* Conventions:: Coding conventions +* Applications:: Applications of CGEN +* Source file overview:: Introduction to each source file +* Parsing:: Parsing of .cpu files +* Debugging:: Debugging applications +* Version numbering:: CGEN version numbering +* Glossary:: Glossary +* Index:: Index +@end menu + +@node Introduction +@chapter Introduction + +This document details the implementation and internals of CGEN, the +``Cpu tools GENerator''. It focuses on theory of operation and concepts +rather than extensive details of the implementation -- these details +date too quickly. + +@node Conventions +@chapter Conventions + +There are a number of conventions used in the CGEN source code. If you +take the time to absorb these now, the code will be much easier to +understand. 
+ +@itemize @bullet + +@item Procedures and variables local to a file are named @code{-foo}. + +@item Only routines that emit application code begin with @code{gen-}. + +@item Symbols beginning with @code{c-} are either variables containing C code +or procedures that generate C code, similarly for C++ and @code{c++-}. + +@item Variables containing C code begin with @code{c-}. + +@item Only routines that emit an entire file begin with @code{cgen-}. + +@item All @file{.cpu} file elements shall have @code{-foo-parse} and +@code{-foo-read} procedures. These procedures all follow the same +basic style for processing entries. + +@item Global variables containing class definitions shall be named +@code{}. + +@item Procedures related to a particular class shall be named +@code{class-name-proc-name}, where @code{class-name} may be abbreviated. + +@item Procedures that test whether something is an object of a +particular class shall be named @code{class-name?}. + +@item In keeping with Scheme conventions, predicates shall have a +@code{?} suffix. + +@item In keeping with Scheme conventions, methods and procedures that +modify an argument or have other side effects shall have a +@code{!} suffix, usually these procs return @code{*UNSPECIFIED*}. + +@item All @code{-foo-parse}, @code{parse-foo} procs shall have @code{context} +as the first argument. [FIXME: not all such procs have been converted] + +@end itemize + +@node Applications +@chapter Applications + +Applications of CGEN generate source code for various cpu related tools. +@footnote{One of the neglected concepts in CGEN is that it is not just +an assembler/disassembler or simulator generator. Those have just been +the ones immediately needed and are straightforward to do.} + +When you want to run the CGEN framework, an application-specific source +file is loaded into the Guile interpreter to get CGEN running. 
The main +job of this source file is to load in any other source files it needs and +then, ultimately, call the @code{cgen} procedure. + +Here's an example of the invocation of @code{cgen} from @file{cgen-sim.scm}. + +@example + (cgen #:argv argv + #:app-name "sim" + #:arg-spec sim-arguments + #:init sim-init! + #:finish sim-finish! + #:analyze sim-analyze!)) +@end example + +This gets the whole framework started by: + +@enumerate +@item processing argv +@item loading the @file{.cpu} file(s) +@item analyzing the instruction set +@item generating the source code for the app +@end enumerate + +@node Source file overview +@chapter Source file overview + +This table is a list of noteworthy files in CGEN. + +@table @file + +@item *.cpu, *.opc, *.sim +Files belonging to each CPU description. +@file{.opc} and @file{.sim} files are automatically +included if they are defined for the given architecture. + +@item doc/*.texi +Texinfo documentation for CGEN. + +@item slib/*.scm +Third-party libraries written in Scheme. For example, sort.scm is a +collection of procedures to sort lists. + +@item cgen-gas.scm +Top-level for GAS testsuite generation. + +@item cgen-opc.scm +Top-level for opcodes generation. + +@item cgen-sid.scm +Top-level for SID simulator generation. + +@item cgen-sim.scm +Top-level for older simulator generation. + +@item cgen-stest.scm +Top-level for simulator testsuite generation. + +@item cos.scm +CGEN object system. Adds object oriented features to the Scheme +language. See the top of @file{cos.scm} for the user-visible +procedures. Note that this was written before goops. +Switching to goops is not out of the question, it's just a question +of prioritization. + +@item read.scm +Read and parse @file{.cpu} files. @code{maybe-load} is used to load in files +for required symbols if they are not already present in the environment +(say, because it was compiled). + +This file contains @code{cgen}, is the main entry point called by +application file generators. 
+It just calls @code{-cgen}, but it does so wrapped inside a +@code{catch-with-backtrace} procedure to make debugging easier. + +@item simplify.inc +Preprocessor macros to simplify CPU description files. This file is not +loaded by the Scheme interpreter, but is instead included by each +@file{.cpu} file. + +@end table + +@node Version numbering +@chapter Version numbering + +There are two version numbers: the version number of CGEN itself and a +version number for the description language it accepts. These are kept +in the symbols @code{-CGEN-VERSION} and @code{-CGEN-LANG-VERSION} in +@file{read.scm}. + +@node Parsing +@chapter Parsing + +Parsing of @file{.cpu} files is very consistent. +Each element of the cpu description is handled in the same way. + +There are two forms for each cpu description element: + +@enumerate +@item key/value pairs +@item fixed order arguments +@end enumerate + +The key/value parser is named @code{--read}. +For example, see @code{-arch-read} in @file{mach.scm}. +It sets up default values for all elements of the object, parses the parameters +that have been provided, and then calls the fixed-order parser. + +The fixed order parser is named @code{--parse}. +For example, see @code{-arch-parse} in @file{mach.scm}. +It validates the parameters and then builds the requested object. + +@node Debugging +@chapter Debugging + +The best way to debug your application @emph{at this point} is to +use the @code{logit} function to get a log of what cgen is doing. +Or if you need a backtrace at a certain point then insert @code{error} +function calls at select places to cause the interpreter to output a +stack backtrace. This can be useful for answering the +``How did I get here?'' question. + +Guile 1.8 provides better debugging facilities than previous versions. +These need to be investigated and documented here. 
+ +@node Index +@unnumbered Index + +@include glossary.texi + +@printindex cp + +@contents +@bye diff --git a/cgen/doc/glossary.texi b/cgen/doc/glossary.texi index 932efec1ee..5170417d44 100644 --- a/cgen/doc/glossary.texi +++ b/cgen/doc/glossary.texi @@ -1,4 +1,4 @@ -@c Copyright (C) 2000 Red Hat, Inc. +@c Copyright (C) 2000, 2009 Red Hat, Inc. @c This file is part of the CGEN manual. @c For copying conditions, see the file cgen.texi. @@ -6,6 +6,7 @@ @chapter Glossary @table @asis + @item arch This is the overall architecture. It is the same as BFD's use of @emph{arch}. @@ -14,16 +15,33 @@ This is the overall architecture. It is the same as BFD's use of Acronym for Instruction Set Architecture. @item mach -This is a variant of the architecture, short for machine. It is +This is a variant of the architecture, short for machine. It is essentially the same as BFD's use of @emph{mach}. -@item CPU family -A group of related mach's. Simulator support is organized along ``CPU -family'' lines to keep related mach's together under one roof to +@item cpu family +A group of related mach's. Simulator support is organized along ``cpu +family'' lines to keep related machs together under one roof to simplify things. The organization is semi-arbitrary and is up to the programmer. @item model An implementation of a mach. It is essentially akin to the argument to @code{-mtune=} in SPARC GCC (and other GCC ports). + +@item ifield +An instruction field. + +@item iformat +An instruction format, as specified by the instruction's fields. + +@item sformat +An instruction semantic format. +This is different from @code{iformat}. +For example, if an operand is referred to in one mode by +one instruction and in a different mode by another instruction, then these +two insns would have different sformats even if they have the same iformat. + +@item insn +An instruction. 
+ @end table diff --git a/cgen/doc/internals.texi b/cgen/doc/internals.texi deleted file mode 100644 index ed25bcd61c..0000000000 --- a/cgen/doc/internals.texi +++ /dev/null @@ -1,385 +0,0 @@ -\input texinfo @c -*- Texinfo -*- - -@c This file is work in progress. -@c Don't expect it to go through texinfo just yet. --bje - -@include version.texi - -@copying -Copyright @copyright{} 2000, 2007 Red Hat, Inc. - -Permission is granted to make and distribute verbatim copies of -this manual provided the copyright notice and this permission notice -are preserved on all copies. - -Permission is granted to copy and distribute modified versions of this -manual under the conditions for verbatim copying, provided also that -the entire resulting derived work is distributed under the terms of a -permission notice identical to this one. - -Permission is granted to copy and distribute translations of this manual -into another language, under the above conditions for modified versions. -@end copying - -@synindex ky cp -@c -@c This file documents the internals of the Cpu tools GENerator, CGEN. -@c -@c Copyright (C) 2000 Red Hat, Inc. -@c - -@setchapternewpage odd -@settitle CGEN -@titlepage -@finalout -@title The Cpu tools GENerator, CGEN. -@subtitle Version @value{VERSION} -@sp 1 -@subtitle @value{UPDATED} -@author Ben Elliston -@author Red Hat, Inc. -@page - -@tex -{\parskip=0pt \hfill Red Hat, Inc.\par \hfill -\TeX{}info \texinfoversion\par } -@end tex - -@vskip 0pt plus 1filll -Copyright @copyright{} 2000 Red Hat, Inc. - -Permission is granted to make and distribute verbatim copies of -this manual provided the copyright notice and this permission notice -are preserved on all copies. - -Permission is granted to copy and distribute modified versions of this -manual under the conditions for verbatim copying, provided also that -the entire resulting derived work is distributed under the terms of a -permission notice identical to this one. 
- -Permission is granted to copy and distribute translations of this manual -into another language, under the above conditions for modified versions. -@end titlepage - -@node Top -@top Introduction - -@cindex version -This manual documents the internals of CGEN, version @value{VERSION}. - -@menu -* Introduction:: Introduction -* Guile:: -* Conventions:: Coding conventions -* Applications:: -* Source file overview:: -* Option processing:: -* Parsing:: -* Debugging:: Debugging applications -* Version numbering:: -* Glossary:: Glossary -* Index:: Index -@end menu - -@node Introduction -@chapter Introduction - -This document details the implementation and internals of CGEN, the -``Cpu tools GENerator''. It focuses on theory of operation and concepts -rather than extensive details of the implementation--these details -date too quickly. - -@node Conventions -@chapter Conventions - -There are a number of conventions used in the cgen source code. If you -take the time to absorb these now, the code will be much easier to -understand. - -@itemize @bullet -@item Procedures and variables local to a file are named @code{-foo}. -@item Only routines that emit application code begin with @code{gen-}. -@item Symbols beginning with @code{c-} are either variables containing C code - or procedures that generate C code, similarly for C++ and @code{c++-}. -@item Variables containing C code begin with @code{c-}. -@item Only routines that emit an entire file begin with @code{cgen-}. -@item All @file{.cpu} file elements shall have @code{-foo-parse} and - @code{-foo-read} procedures. -@item Global variables containing class definitions shall be named - @code{}. -@item Procedures related to a particular class shall be named - @code{class-name-proc-name}, where @code{class-name} may be abbreviated. -@item Procedures that test whether something is an object of a - particular class shall be named @code{class-name?}. 
-@item In keeping with Scheme conventions, predicates shall have a - @code{?} suffix. -@item In keeping with Scheme conventions, methods and procedures that - modify an argument or have other side effects shall have a - @code{!} suffix, usually these procs return @code{*UNSPECIFIED*}. -@item All @code{-foo-parse}, @code{parse-foo} procs shall have @code{context} - as the first argument. [FIXME: not all such procs have been - converted] -@end itemize - -@node Applications -@chapter Applications - -One of the most importance concepts to grasp with CGEN is that it is not -a simulator generator. It's a generic tool generator--it can be used to -generate a simulator, an assembler, a disassembler and so on. These -``applications'' can then produce different outputs from the same CPU -description. - -When you want to run the cgen framework, an application-specific source -file is loaded into the Guile interpreter to get cgen running. This -source file loads in any other source files it needs and then, for -example, calls: - -@example - (cgen #:argv argv - #:app-name "sim" - #:arg-spec sim-arguments - #:init sim-init! - #:finish sim-finish! - #:analyze sim-analyze!) - ) -@end example - -This gets the whole framework started, in an application-specific way. - -node Source file overview -@chapter Source file overview - -@table @file - -@item *.cpu, *.opc, *.sim -Files belonging to each CPU description. .sim files are automatically -included if they are defined for the given architecture. - -@item doc/*.texi -Texinfo documentation for cgen. - -@item slib/*.scm -Third-party libraries written in Scheme. For example, sort.scm is a -collection of procedures to sort lists. - -@item Makefile.am -automake Makefile for cgen. - -@item NEWS -News about cgen. - -@item README -Notes to read abot cgen. - -@item attr.scm -Handling of cgen attributes. - -@item cgen-gas.scm -Top-level for GAS testsuite generation. - -@item cgen-opc.scm -Top-level for opcodes generation. 
- -@item cgen-sid.scm -Top-level for SID simulator generation. - -@item cgen-sim.scm -Top-level for older simulator generation. - -@item cgen-stest.scm -Top-level for simulator testsuite generation. - -@item configure.in -Template for `configure'--process with autoconf. - -@item cos.scm -cgen object system. Adds object oriented features to the Scheme -language. See the top of @file{cos.scm} for the user-visible -procedures. - -@item decode.scm -Generic decoder routines. - -@item desc-cpu.scm -??? - -@item desc.scm -??? - -@item dev.scm -Debugging support. - -@item enum.scm -Enumerations. - -@item fixup.scm -Some procedure definitions to patch up possible differences between -older and newer versions of Guile: - - * define a (load..) procedure that uses - primitive-load-path if load-from-path is not known. - - * define =? and >=? if they aren't already known. - - * define %stat, reverse! and debug-enable in terms of - older equivalent procedures, if they aren't already - known. - -@item gas-test.scm -GAS testsuite generator. - -@item hardware.scm -Hardware description routines. - -@item ifield.scm -Instruction fields. - -@item insn.scm -Instruction definitions. - -@item mach.scm -Architecture description routines. - -@item minsn.scm -Macro instructions. - -@item mode.scm -Modes. - -@item model.scm -Model specification. - -@item opc-asmdis.scm -For the opcodes applications. - -@item opc-ibld.scm -Ditto. - -@item opc-itab.scm -Ditto. - -@item opc-opinst.scm -Ditto. - -@item opcodes.scm -Ditto. - -@item operand.scm -Operands. - -@item pgmr-tools.scm -Programmer tools--debugging tools, mainly. - -@item pmacros.scm -Preprocessor macros. - -@item profile.scm -Unused? - -@item read.scm -Read and parse .cpu files. @code{maybe_load} is used to load in files -for required symbols if they are not already present in the environment -(say, because it was compiled). - -@item rtl-c.scm -RTL to C translation. - -@item rtl.scm -RTL support. - -@item rtx-funcs.scm -RTXs. 
- -@item sem-frags.scm -Semantic fragments. - -@item semantics.scm -Semantic analysis for the CPU descriptions. - -@item sid-cpu.scm -For the SID application. - -@item sid-decode.scm -Ditto. - -@item sid-model.scm -Ditto. - -@item sid.scm -Ditto. - -@item sim-arch.scm -For the simulator application. - -@item sim-cpu.scm -Ditto. - -@item sim-decode.scm -Ditto. - -@item sim-model.scm -Ditto. - -@item sim-test.scm -For the simulator testsuite application. - -@item sim.scm -For the simulator application. - -@item simplify.inc -Preprocessor macros to simplify CPU description files. This file is not -loaded by the Scheme interpreter, but is instead included by the .cpu -file. - -@item types.scm -Low-level types. - -@item utils-cgen.scm -cgen-specific utilities. - -@item utils-gen.scm -Code generation specific utilities. - -@item utils-sim.scm -Simulator specific utilities. - -@item utils.scm -Miscellaneous utilities. - -@end table - -@code{cgen} is the main entry point called by application file -generators. It just calls @code{-cgen}, but it does so wrapped inside a -@code{catch-with-backtrace} procedure to make debugging easier. - -@node Version numbering -@chapter Version numbering - -There are two version numbers: the version number of cgen itself and a -version number for the description language it accepts. These are kept -in the symbols @code{-CGEN-VERSION} and @code{-CGEN-LANG-VERSION} in -@file{read.scm}. - -@node Debugging -@chapter Debugging - -Debugging can be difficult in Guile. Guile 1.4 (configured with the ---enable-guile-debug option) seems unable to produce a stack backtrace -when errors are triggered in Scheme code. You should use Guile 1.3 in -the meantime. So far, the best way to debug your application is to -insert (error) function applications at select places to cause the -interpreter to output a stack backtrace. This can be useful for -answering the ``How did I get here?'' question. 
- -CGEN includes a (logit) function which logs error messages at different -diagnostic levels. If you want to produce debugging output, use -(logit). - -@node Index -@unnumbered Index - -@printindex cp - -@contents -@bye diff --git a/cgen/doc/intro.texi b/cgen/doc/intro.texi index 9e7f75c77e..fa980871de 100644 --- a/cgen/doc/intro.texi +++ b/cgen/doc/intro.texi @@ -1,4 +1,4 @@ -@c Copyright (C) 2000 Red Hat, Inc. +@c Copyright (C) 2000, 2009 Red Hat, Inc. @c This file is part of the CGEN manual. @c For copying conditions, see the file cgen.texi. @@ -12,7 +12,6 @@ * Opcodes support:: * Simulator support:: * Testing support:: -* Implementation language:: @end menu @node Overview @@ -25,7 +24,7 @@ CGEN is a project to provide a framework and toolkit for writing cpu tools. * Why do it?:: * Maybe it should not be done?:: * How ambitious is CGEN?:: -* What is missing that should be there soon?:: +* What is missing that should be there someday?:: @end menu @node Goal @@ -48,6 +47,8 @@ The description language itself is thus also left to evolve over time! Achieving the goal also involves having a toolkit, libcgen, that contains a compiled form of the cpu description plus a suite of routines for working with the data. +@footnote{@file{libcgen} currently doesn't exist, but that was the +original plan.} CGEN is not a new idea. Some GNU ports have done something like this -- for example, the SH port in its early days. However, the idea never really @@ -240,19 +241,22 @@ Related to profiling tools are static program analysis tools. By this I mean taking machine code as input and analyzing it in some way. Except for symbolic information (which could come from BFD or elsewhere), CGEN provides enough information to analyze machine code, both the -the raw instructions *and* their semantics. Libcgen should contain +raw instructions *and* their semantics. Libcgen should contain all the basic tools for doing this. 
+@footnote{Today this is libopcodes to some degree.} @node ABI description @subsubsection ABI description Several tools need knowledge of not only a cpu's ISA but also of the ABI -in use. I believe it makes sense to apply the same goals that went into +in use. I think(!) it makes sense to apply the same goals that went into CGEN's architecture description language to an ABI description language: specify the ABI in an application independent way and then have a basic -toolkit/library that uses that data and allow the writing of program -generators for applications that want more than what the toolkit/library -provides. +toolkit/library that provides ways of using that data. +It might be useful to also allow the writing of program generators +for applications that want more than what the toolkit/library provides. +Perhaps not, but the basic toolkit/library should, again I think, +be useful. Part of what an ABI defines is the file format and relocations. This is something that BFD is built for. I think a BFD rewrite @@ -323,13 +327,15 @@ essence most of what is contained in the System V ABI documentation. That leaves the "miscellaneous" part. Essentially this is a catchall for whatever else is needed. This would include things like -include file directory locations, ???. There's probably no need to -add these to the CGEN description language. +include file directory locations, port-specific language features, ???. +There's not much need to include this info in CGEN, it's pretty +esoteric and generally useful to only a few applications. One can even envision a day when GCC emits object files directly. The instruction description contains enough information to build the instructions and the ABI support would provide enough information on relocations and object file formats. + Debugging information should be treated as an orthogonal concept. At present it is outside the scope of CGEN, though clearly the same reasoning behind CGEN applies to debugging support as well. 
@@ -344,17 +350,13 @@ be created that would assist hw/sw codesign. Another related application is to have a feedback mechanism from the compilation system that helps improve the architecture description (both CGEN and HDL). -For example, the compiler could determine what instructions would have -made a significant benefit for a particular application. CGEN descriptions -for these instructions could be generated, resulting in a new set of -compilation tools from which the hypothesis of adding the new instructions -could then be validated. Note that adding these new instructions only -required writing CGEN descriptions of them (setting aside HDL concerns). -Once done, all relevant tools would be automagically updated to support -the new instructions. - -@node What is missing that should be there soon? -@subsection What's missing that should be there soon? +CGEN descriptions for experimental instructions could be added, +and a new set of compilation tools quickly regenerated. +Then experiments could be run analyzing the effectiveness of the +new instructions. + +@node What is missing that should be there someday? +@subsection What's missing that should be there someday? @itemize @bullet @item Support for complex ISA's (i386, m68k). @@ -386,10 +388,11 @@ libbfd would be a user of this library. Instruction semantics should also be recorded in libcgen, probably in bytecode form. Operand usage tables, needed for example by the -m32r assembler, can be lazily computed at runtime. +m32r assembler, could be lazily computed at runtime. +Operand usage tables are also useful to gdb's reverse-execution support. Applications can either make use of libcgen or given the application -independence of the description language they can write their won code +independence of the description language they can write their own code generators to tailor the output as needed. @end itemize @@ -414,8 +417,9 @@ this is using RTL to describe instruction semantics rather than, say, C. 
The assembler can also make use of the instruction semantics. It doesn't make use of the semantics, per se, but what it does use is the input and output operand information that is machine generated from the semantics. -Grokking operand usage from C is possible I guess, but a lot harder. -So by writing the semantics in RTL multiple applications can make use if it. +Grokking operand usage from C is possible, but harder. +@footnote{By this I mean analyzing the C and understanding what it's doing.} +So by writing the semantics in RTL multiple applications can make use of it. One can also generate from the RTL code in languages other than C. @menu @@ -436,7 +440,7 @@ The CPU description file needs to provide at least the following: @item semantic specification in a way that is amenable to being understood and manipulated @item performance measurement parameters -@item support for multiple ISA variants +@item support for multiple architecture and implementation variants @item assembler syntax of the instruction set @item how that syntax maps to the bits of the instruction word, and back @item support for generating test files @@ -451,7 +455,8 @@ for obvious reasons. @item file format @item relocations @item function calling conventions -@item ??? +@item structure layout +@item ... and all the other usual stuff @end itemize Some architectures require knowledge of the pipeline in order to do @@ -460,7 +465,8 @@ interlocks) so that will be required as well, as opposed to being solely for performance measurement. Pipeline knowledge is also needed in order to achieve accurate profiling information. However, I haven't spent much time on this yet. The current design/implementation is a first -pass in order to get something working, and will be revisited. +pass in order to get something reasonable, and will be revisited +as necessary. Support for generating test files is not complete. Currently the GAS test suite generator gets by (barely) without them. 
The simulator test @@ -491,7 +497,7 @@ language used by CGEN. Extensibility is achieved by specifying everything as name/value pairs. This allows new elements to be added and even CPU specific elements to be added without complicating the language or requiring a new element in -a @code{define_insn} type entry to be added to each existing port. +a @code{define_insn}-like entry to be added to each existing port. Macros can be used to eliminate the verbosity of repetitively specifying the ``name'' part, so one can have it both ways. Imagine GCC's @file{.md} file elements specified as name/value pairs with macro's @@ -548,7 +554,7 @@ Each of these elements is explained in more detail in @ref{RTL}. There are at least two potential problem areas in the language's design. The first problem is variation in assembly language syntax. Examples of -this are Intel vs AT&T i386 syntax, and Motorola vs MIT M68k syntax. +this are Intel vs AT&T i386 syntax, and Motorola vs MIT m68k syntax. I think there isn't a sufficient number of important cases to warrant handling this efficiently. One could either ignore the issue for situations where divergence is sufficient to dissuade one from handling @@ -571,7 +577,7 @@ should it prove reasonable to do so. The CPU description file won't change, which is the important thing.}, so if one wanted to implement the disassembler/assembler via other means one can. -The other potential problem area is relocations. Clearly part of +The second problem area is relocations. Clearly part of processing assembly code is dealing with the relocations involved (e.g. GOT table specification). Relocation support necessarily requires BFD and GAS support, both of which need cleanup in this area. Rewriting @@ -634,7 +640,7 @@ data was being used to generate both the program and the test cases. An error might be propagated to both and thus nullify the test. 
For example if an opcode field was supposed to have the value 1 and the description file had the value 2, then this error wouldn't be caught. -However, this assumes test cases are generated during the testing run! +However, this assumes test cases are always generated during the testing run! And it ignores the profound amount of typing that is saved by machine generating test cases! (I discount the argument that this kind of exhaustive testing is unnecessary). @@ -643,10 +649,11 @@ One solution to the above problem is to not generate the test cases during the testing run (which was implicit in the proposal, but perhaps should have been explicit). Another solution is to generate the test cases during the test run but first verify them by some external -means before actually using them in any test. The latter solution is -only mentioned for completeness sake; its implementation is problematic -as any external means would necessarily be computer driven and the level -of confidence in the result isn't 100%. +means before actually using them in any test. Another solution is +to have some trust in the generated tests. Yes, some bugs may be missed, +but given the quantity of testing that can be done, some bugs may still +be caught that would otherwise have been missed. Plus it's all +machine-driven, minimal human interaction is required. So how are machine generated test cases verified? By machine, by hand, and by time. The test cases are checked into CVS and are not regenerated @@ -673,8 +680,8 @@ This is no different than the non-machine generated case again except in the perceived difference in quantity of test cases. Note that no claim is made that manually generated test cases aren't -needed. Clearly there will be some cases that the description file -doesn't describe and thus can't machine generate. +useful or needed. The goal here is to enhance existing forms of testing, +not replace them. 
@node Simulator testing @subsection Simulator testing @@ -690,8 +697,7 @@ there will still be a large percentage of instructions amenable to having test cases machine generated for them. Such test cases can certainly be hand generated, but it is believed that this is a large amount of unnecessary typing that typically won't be done due to the -amount. Again, I discount the argument that this kind of exhaustive -testing isn't necessary. +amount. An example is the simple arithmetic instructions. These take zero, one, or more arguments and produce a result. The description file contains @@ -704,56 +710,5 @@ Certainly at the very least all the administrivia for each test case can be machine generated (i.e. a template file can be generated for each instruction, leaving the programmer to fill in the details). -The strategy used for assembler/disassembler test cases is also used here. -Test cases are kept in CVS and are not regenerated without care. - -@node Implementation language -@section Implementation language - -The chosen implementation language is Scheme. The reasons for this are: - -@itemize @bullet -@item Parsing RTL in Scheme is real easy, though I did make some albeit -minor changes to make it easier. While it doesn't take more than a few -dozen lines of C to parse RTL, it doesn't take any lines of Scheme - -the parser is built into the interpreter. - -@item An interactive environment is a better environment to work in, -especially in the early stages of an ambitious project like this. - -@item Guile is developing as an embeddable interpreter. -I wanted room for growth in many dimensions, and having the implementation -language be an embeddable interpreter supports this. - -@item I wanted to learn Scheme (Yes, not a technical reason, blah blah blah). - -@item Numbers in Scheme can have arbitrary precision so representing 64 -bit (or higher) numbers on a 32 bit host is well defined. 
- -@item It seemed useful to have an implementation language similar to the -CPU description language. The Scheme implementation seems simpler -than a C implementation would be. -@end itemize - -One issue that arises with the use of Scheme as the implementation -language is whether to generate files in the source tree, with the -issues that involves, or generate the files in the build tree (and thus -require Guile to build Binutils and the issues that involves). Trying -to develop something like this is easier in an interactive environment, -so Scheme as the first implementation language is, to me, a better -choice than C or C++. In such a big project it also helps to have a -more expressive language so relatively complex code and be written with -fewer lines of code. - -One consequence is maintenance is more difficult in that the -generated files (e.g. @file{opcodes/m32r-*.[ch]}) are checked into CVS -at Red Hat, and a change to a CPU description requires rebuilding the -generated files and checking them in as well. And a change that affects -each port requires each port to be regenerated and checked in. -This is more palatable for maintainer tools such as @code{bison}, -@code{flex}, @code{autoconf} and @code{automake}, as their input files -don't change as often. - - -Whether to continue with Scheme, convert the code to a compiled -language, or have both is an important, open issue. +The strategies mentioned for assembler/disassembler machine-generated +test cases also apply here. diff --git a/cgen/doc/mdate-sh b/cgen/doc/mdate-sh new file mode 100755 index 0000000000..cd916c0a34 --- /dev/null +++ b/cgen/doc/mdate-sh @@ -0,0 +1,201 @@ +#!/bin/sh +# Get modification time of a file or directory and pretty-print it. + +scriptversion=2005-06-29.22 + +# Copyright (C) 1995, 1996, 1997, 2003, 2004, 2005 Free Software +# Foundation, Inc. 
+# written by Ulrich Drepper , June 1995 +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2, or (at your option) +# any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write to the Free Software Foundation, +# Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. + +# As a special exception to the GNU General Public License, if you +# distribute this file as part of a program that contains a +# configuration script generated by Autoconf, you may include it under +# the same distribution terms that you use for the rest of that program. + +# This file is maintained in Automake, please report +# bugs to or send patches to +# . + +case $1 in + '') + echo "$0: No file. Try \`$0 --help' for more information." 1>&2 + exit 1; + ;; + -h | --h*) + cat <<\EOF +Usage: mdate-sh [--help] [--version] FILE + +Pretty-print the modification time of FILE. + +Report bugs to . +EOF + exit $? + ;; + -v | --v*) + echo "mdate-sh $scriptversion" + exit $? + ;; +esac + +# Prevent date giving response in another language. +LANG=C +export LANG +LC_ALL=C +export LC_ALL +LC_TIME=C +export LC_TIME + +# GNU ls changes its time format in response to the TIME_STYLE +# variable. Since we cannot assume `unset' works, revert this +# variable to its documented default. +if test "${TIME_STYLE+set}" = set; then + TIME_STYLE=posix-long-iso + export TIME_STYLE +fi + +save_arg1=$1 + +# Find out how to get the extended ls output of a file or directory. 
+if ls -L /dev/null 1>/dev/null 2>&1; then + ls_command='ls -L -l -d' +else + ls_command='ls -l -d' +fi + +# A `ls -l' line looks as follows on OS/2. +# drwxrwx--- 0 Aug 11 2001 foo +# This differs from Unix, which adds ownership information. +# drwxrwx--- 2 root root 4096 Aug 11 2001 foo +# +# To find the date, we split the line on spaces and iterate on words +# until we find a month. This cannot work with files whose owner is a +# user named `Jan', or `Feb', etc. However, it's unlikely that `/' +# will be owned by a user whose name is a month. So we first look at +# the extended ls output of the root directory to decide how many +# words should be skipped to get the date. + +# On HPUX /bin/sh, "set" interprets "-rw-r--r--" as options, so the "x" below. +set x`ls -l -d /` + +# Find which argument is the month. +month= +command= +until test $month +do + shift + # Add another shift to the command. + command="$command shift;" + case $1 in + Jan) month=January; nummonth=1;; + Feb) month=February; nummonth=2;; + Mar) month=March; nummonth=3;; + Apr) month=April; nummonth=4;; + May) month=May; nummonth=5;; + Jun) month=June; nummonth=6;; + Jul) month=July; nummonth=7;; + Aug) month=August; nummonth=8;; + Sep) month=September; nummonth=9;; + Oct) month=October; nummonth=10;; + Nov) month=November; nummonth=11;; + Dec) month=December; nummonth=12;; + esac +done + +# Get the extended ls output of the file or directory. +set dummy x`eval "$ls_command \"\$save_arg1\""` + +# Remove all preceding arguments +eval $command + +# Because of the dummy argument above, month is in $2. +# +# On a POSIX system, we should have +# +# $# = 5 +# $1 = file size +# $2 = month +# $3 = day +# $4 = year or time +# $5 = filename +# +# On Darwin 7.7.0 and 7.6.0, we have +# +# $# = 4 +# $1 = day +# $2 = month +# $3 = year or time +# $4 = filename + +# Get the month. 
+case $2 in + Jan) month=January; nummonth=1;; + Feb) month=February; nummonth=2;; + Mar) month=March; nummonth=3;; + Apr) month=April; nummonth=4;; + May) month=May; nummonth=5;; + Jun) month=June; nummonth=6;; + Jul) month=July; nummonth=7;; + Aug) month=August; nummonth=8;; + Sep) month=September; nummonth=9;; + Oct) month=October; nummonth=10;; + Nov) month=November; nummonth=11;; + Dec) month=December; nummonth=12;; +esac + +case $3 in + ???*) day=$1;; + *) day=$3; shift;; +esac + +# Here we have to deal with the problem that the ls output gives either +# the time of day or the year. +case $3 in + *:*) set `date`; eval year=\$$# + case $2 in + Jan) nummonthtod=1;; + Feb) nummonthtod=2;; + Mar) nummonthtod=3;; + Apr) nummonthtod=4;; + May) nummonthtod=5;; + Jun) nummonthtod=6;; + Jul) nummonthtod=7;; + Aug) nummonthtod=8;; + Sep) nummonthtod=9;; + Oct) nummonthtod=10;; + Nov) nummonthtod=11;; + Dec) nummonthtod=12;; + esac + # For the first six month of the year the time notation can also + # be used for files modified in the last year. + if (expr $nummonth \> $nummonthtod) > /dev/null; + then + year=`expr $year - 1` + fi;; + *) year=$3;; +esac + +# The result. +echo $day $month $year + +# Local Variables: +# mode: shell-script +# sh-indentation: 2 +# eval: (add-hook 'write-file-hooks 'time-stamp) +# time-stamp-start: "scriptversion=" +# time-stamp-format: "%:y-%02m-%02d.%02H" +# time-stamp-end: "$" +# End: diff --git a/cgen/doc/notes.texi b/cgen/doc/notes.texi index 8ccf5954ff..d16d6fbe94 100644 --- a/cgen/doc/notes.texi +++ b/cgen/doc/notes.texi @@ -1,4 +1,4 @@ -@c Copyright (C) 2000 Red Hat, Inc. +@c Copyright (C) 2000, 2009 Red Hat, Inc. @c This file is part of the CGEN manual. @c For copying conditions, see the file cgen.texi. diff --git a/cgen/doc/opcodes.texi b/cgen/doc/opcodes.texi index 4085aa2058..7b2485f6a0 100644 --- a/cgen/doc/opcodes.texi +++ b/cgen/doc/opcodes.texi @@ -1,4 +1,4 @@ -@c Copyright (C) 2000 Red Hat, Inc. 
+@c Copyright (C) 2000, 2009 Red Hat, Inc. @c This file is part of the CGEN manual. @c For copying conditions, see the file cgen.texi. @@ -12,7 +12,7 @@ well as supporting routines. @menu * Generated files:: List of generated files * The .opc file:: Target specific C code -* Special assembler parsing needs:: +* Special assembler parsing needs:: Support for unusual syntax @end menu @node Generated files diff --git a/cgen/doc/pmacros.texi b/cgen/doc/pmacros.texi index d74f9a8408..0cd45199c2 100644 --- a/cgen/doc/pmacros.texi +++ b/cgen/doc/pmacros.texi @@ -1,4 +1,4 @@ -@c Copyright (C) 2000 Red Hat, Inc. +@c Copyright (C) 2000, 2009 Red Hat, Inc. @c This file is part of the CGEN manual. @c For copying conditions, see the file cgen.texi. @@ -12,7 +12,7 @@ Preprocessor macros provide a way of simplifying the writing of @menu * Defining a preprocessor macro:: @code{define-pmacro} -* Using preprocessor macros:: +* Using preprocessor macros:: Using preprocessor macros * Macro expansion:: The @code{pmacro-expand} procedure * Default argument values:: Specifying default values of arguments * Multiple output expressions:: Using @code{begin} @@ -77,7 +77,7 @@ and arguments by name. @node Macro expansion @section Macro expansion -At the implementation level, pmacros are expand with the +At the implementation level, pmacros are expanded with the @code{pmacro-expand} Scheme procedure. The following is executed from a Guile shell, as opposed to @@ -420,18 +420,15 @@ This is only supported when passing macros as arguments to other macros. (load-op uh OP2_11 HI (.pmacro (mode expr) (zext: mode expr))) @end smallexample -Currently, .pmacro's don't bind the way Scheme lambda expressions do. -For example, arg2 in the second pmacro is not bound to the arg2 argument -of the first pmacro. +.pmacro's don't bind the way Scheme lambda expressions do. +In the following example, arg2 in the second pmacro is not bound +to the arg2 argument of the first pmacro. 
@smallexample (define-pmacro (foo arg1 arg2) ((.pmacro (bar) (+ arg2 bar)) arg1)) (foo 3 4) ==> (+ arg2 3) @end smallexample -One can make an argument either way. I'm not sure what the right thing -to do here is (leave things as is, or have lexical binding like Scheme). - @node Passing macros as arguments @section Passing macros as arguments diff --git a/cgen/doc/porting.texi b/cgen/doc/porting.texi index 2cd3d24df1..e5633251ec 100644 --- a/cgen/doc/porting.texi +++ b/cgen/doc/porting.texi @@ -1,4 +1,4 @@ -@c Copyright (C) 2000 Red Hat, Inc. +@c Copyright (C) 2000, 2009 Red Hat, Inc. @c This file is part of the CGEN manual. @c For copying conditions, see the file cgen.texi. @@ -7,7 +7,7 @@ @cindex Porting This chapter describes how to do a CGEN port. -It focuses on doing binutils and simulator ports, but the general +It focuses on doing binutils and simulator ports, but the procedure should be generally applicable. @menu @@ -75,26 +75,20 @@ to be written. @xref{Doing a simulator port}. In order to avoid suffering from the bug of the day when using snapshots, CGEN development has been confined to Guile releases only. -As of this writing (1999-04-26) only Guile 1.2 and 1.3 are supported. -At some point in the future older versions of Guile will no longer be -supported. - -If using Guile 1.2, configure it with @code{--enable-guile-debug ---enable-dynamic-linking} to work around an unknown bug in this version -of Guile. I ran into this on Solaris 2.6. +CGEN has been tested with Guile versions @code{1.4.1}, @code{1.6.8} +and @code{1.8.5}. +As time passes older versions of Guile will no longer be supported. @node Running configure @section Running @code{configure} -When doing porting or maintenance activity with CGEN, the build tree -must be configured with the @code{--enable-cgen-maint} option. This -adds the necessary dependencies to the @file{toplevel/opcodes} and -@file{toplevel/sim} directories. 
+When doing porting or maintenance activity with CGEN, it's a good idea +to configure the build tree with the @code{--enable-cgen-maint} option. +This adds the necessary dependencies to the @file{toplevel/opcodes} and +@file{toplevel/sim} directories so that when the @file{.cpu} file is +changed the makefiles will regenerate the corresponding sources. -CGEN uses Guile so it must be installed. At present the CGEN configury -requires that if Guile isn't installed in @file{/usr/local} then the -@code{--with-guile=/guile/install/dir} option must be passed to -@file{configure} to specify where Guile is installed. +CGEN uses Guile so it must be installed. @node Writing a CPU description file @section Writing a CPU description file @@ -130,7 +124,7 @@ descriptions of each type of entry that appears in the description file. First a digression on conventions and programming style. -@enumerate 1 +@enumerate @item @code{define-foo} vs. @code{define-normal-foo} Each CPU description @code{define-} entry generally provides two forms: @@ -178,6 +172,21 @@ processor must be defined somewhere, so all of this stuff is put under This must be the first entry in the description file. +@xref{Architecture variants}, for details. + +Here's an example from @file{m32r.cpu}: + +@example +(define-arch + (name m32r) ; name of cpu family + (comment "Renesas M32R") + (default-alignment aligned) + (insn-lsb0? #f) + (machs m32r m32rx m32r2) + (isas m32r) +) +@end example + @node Writing define-isa @subsection Writing define-isa @@ -188,6 +197,20 @@ The second is to give the instruction set a name. This is important for architectures like the ARM where one CPU can execute multiple instruction sets. +@xref{Architecture variants}, for details. 
+ +Here's an example from @file{arm.cpu}: + +@example +(define-isa + (name thumb) + (comment "ARM Thumb instruction set (16 bit insns)") + (base-insn-bitsize 16) + (decode-assist (15 14 13 12 11 10 9 8)) + (setup-semantics (set-quiet (reg h-gr 15) (add pc 4))) +) +@end example + @node Writing define-cpu @subsection Writing define-cpu @@ -206,21 +229,48 @@ fuzzy for now, but basically the simulator engine has a collection of structures defining internal state, and ``CPU families'' minimize the number of copies of generated code that manipulate this state. +@xref{Architecture variants}, for details. + +Here's an example from @file{openrisc.cpu}: + +@example +(define-cpu + ; CPU names must be distinct from the architecture name and machine names. + ; The "b" suffix stands for "base" and is the convention. + ; The "f" suffix stands for "family" and is the convention. + (name openriscbf) + (comment "OpenRISC base family") + (endian big) + (word-bitsize 32) +) +@end example + @node Writing define-mach @subsection Writing define-mach CGEN uses ``mach'' in the same sense that BFD uses ``mach''. ``Mach'', which is short for `machine', defines a variant of the architecture. - @c There may be a need for a many-to-one correspondence between CGEN @c machs and BFD machs. +@xref{Architecture variants}, for details. + +Here's an example from @file{m32r.cpu}: + +@example +(define-mach + (name m32rx) + (comment "M32RX cpu") + (cpu m32rxf) +) +@end example + @node Writing define-model @subsection Writing define-model When describing a CPU, in any context, there is ``architecture'' and -there is ``implementation''. In CGEN parlance a ``model'' is an +there is ``implementation''. In CGEN parlance, a ``model'' is an implementation of a ``mach''. Models specify pipeline and other performance related characteristics of the implementation. @@ -230,6 +280,22 @@ yet how to handle all the various possibilities so at present this is done on a case-by-case basis. 
Maybe a straightforward solution will emerge. +@xref{Model variants}, for details. + +Here's an example from @file{arm.cpu}: +@c A poor example. Later. + +@example +(define-model + (name arm710) + (comment "ARM 710 microprocessor") + (mach arm7tdmi) + (unit u-exec "Execution Unit" () + 1 1 ; issue done + () () () ()) +) +@end example + @node Writing define-hardware @subsection Writing define-hardware @@ -261,6 +327,20 @@ The program counter is named @samp{h-pc} and must be specified. It is not a builtin element as sometimes architectures need to modify its behaviour (in the get/set specs). +@xref{Hardware elements}, for details. + +Here's an example from @file{arm.cpu}: + +@example +(define-hardware + (name h-gr) + (comment "general registers") + (attrs PROFILE CACHE-ADDR) + (type register WI (16)) + (indices extern-keyword gr-names) +) +@end example + @node Writing define-ifield @subsection Writing define-ifield @@ -282,15 +362,20 @@ specification: the default insn word size (in bits), and whether bit number In the general case, fields are described with 4 numbers: word-offset, word-length, start, and length. -All instruction fields (*) live in exactly one word and must be contiguous. +All instruction fields live in exactly one word and must +be contiguous.@footnote{This doesn't include fields like multi-ifields.} Non-contiguous fields are specified with ``multi-ifields'' which are fields built up out of several smaller typically disjoint fields. The size of the word depends on the context. @samp{word-offset} specifies -the offset in bits from the start of the insn to the word containing the field. -@samp{word-length} specifies the size in bits of the word containing the field. +the offset in bits from the start of the insn to the word containing the field, +it must be a multiple of 8. +@samp{word-length} specifies the size in bits of the word containing the field, +it also must be a multiple of 8. 
@samp{start} specifies the position of the MSB of the field in the word. @samp{length} specifies the size in bits of the field. +@xref{Instruction fields}, for details. + Example. Suppose an ISA has instructions that are normally 16 bits, @@ -300,7 +385,8 @@ Also suppose the ISA numbers the bits starting from the LSB. default-insn-word-bitsize = 16, lsb0? = #t -An instruction with four 4 bit fields and one 32 bit immediate might be: +An instruction with four 4 bit fields, one 32 bit immediate +and one 16 bit immediate might be: @example @@ -336,7 +422,7 @@ Endianness for the purposes of this example is irrelevant. In the word containing op1,op2,r1,r2, op1 is in the most significant nibble and r2 is in the least significant nibble. -For a large number of cases specifying all 4 numbers is excessive. +For a large number of cases specifying all four numbers is excessive. With careful redefinition of the starting bit number, one can get away with only specifying start,length. Imagine several words of the default insn word size laid out from the start of @@ -434,15 +520,13 @@ One would then write f-op1 as: @end example -(*) This doesn't include fields like multi-ifields. - @node Writing define-normal-insn-enum @subsection Writing define-normal-insn-enum Writing instruction enum entries involves analyzing the instruction set and attaching names to the opcode fields. For example, if a field named @samp{op1} is used to select which of add, addc, sub, subc, and, or, -xor, and inv instructions, one would write something like the following: +xor, and inv instructions, one could write something like the following: @example (define-normal-insn-enum @@ -456,26 +540,31 @@ These entries simplify instruction definitions by giving a name to a particular value for a particular instruction field. By convention, enum names are uppercase. This convention must be followed. +@xref{Enumerated constants}, for details. 
+ @node Writing define-operand @subsection Writing define-operand Operands are what instruction semantics use to refer to hardware elements. The typical use of an operand is to map instruction fields to hardware. For example, if field @samp{f-r2} is used to specify one of -the registers defined by the @code{h-gr} hardware entry, one would -write: +the registers defined by the @code{h-gr} hardware entry, one could write +something like the following: @code{(dnop sr "source register" () h-gr f-r2)} @code{dnop} is short for ``define normal operand'' @footnote{A profound aversion to typing causes me to often provide brief names of things that -get typed a lot.}. @xref{RTL}, for more information. +get typed a lot.}. + +@xref{Instruction operands}, for more information. @node Writing define-insn @subsection Writing define-insn -This involves going through the CPU manual and writing an entry for each -instruction. Instructions specific to a particular machine variant are +A large part of writing a @file{.cpu} file is going through the CPU manual +and writing an entry for each instruction. +Instructions specific to a particular machine variant are indicated so with the `MACH' attribute. Example: @example @@ -490,6 +579,9 @@ The `base' machine is a predefined machine variant that includes instructions available to all variants, and is the default if no `MACH' attribute is specified. +@xref{Instructions}, for details. + +@c Seems like this part belongs elsewhere. When the @file{.cpu} file is processed, CGEN will analyze the semantics to determine: @@ -544,17 +636,46 @@ Some instructions are really aliases for other instructions, maybe even a sequence of them. For example, an architecture that has a general decrement-then-store instruction might have a specialized version of this instruction called @code{push} supported by the assembler. These -are handled with ``macro instructions''. Macro instructions are used by -the assembler/disassembler only. 
They are not used by the simulator. +are handled with ``macro instructions''. + +@xref{Macro-instructions}, for details. + +Macro instructions are used by the assembler/disassembler only. +They are not used by the simulator. + +For example, if this was the real instruction: + +@example +(dni st-minus "st-" () + "st $src1,@-$src2" + (+ OP1_2 OP2_7 src1 src2) + (sequence ((WI new-src2)) + (set new-src2 (sub src2 (const 4))) + (set (mem WI new-src2) src1) + (set src2 new-src2)) + () +) +@end example + +One could write a @code{push} variant with: + +@example +(dnmi push "push" () + "push $src1" + (emit st-minus src1 (src2 15)) ; "st %0,@-sp" +) +@end example @node Using define-pmacro @subsection Using define-pmacro When a group of entries, say instructions, share similar information, a macro (in the C preprocessor sense) can be used to simplify the -description. This can be used to save a lot of typing, which also -improves readability since often 1 page of code is easier to understand -than 4. +description. This can be used to save a lot of typing, which can also +improve readability since often one page of code is easier to understand +than four. + +@xref{Preprocessor macros}, for details. Here is an example from the M32R port. @@ -602,9 +723,11 @@ This is best explained with an example. Here's a simplifying macro for writing ifield definitions with every element specified. +@xref{List splicing}, for details. + @example -; dwf: define-word-field (??? pick a better name) +; dwf: define-word-field (define-pmacro (dwf x-name x-comment x-attrs x-word-offset x-word-length x-start x-length @@ -640,11 +763,11 @@ One would then write f-op1 as: @node Interactive development @subsection Interactive development -The normal way@footnote{Normal for me anyway, certainly each person will have -their own preference} of writing a CPU description file involves starting Guile -and developing the .CPU file interactively. 
The basic steps are +The normal way@footnote{Normal for some anyway, certainly each person will have +their own preference.} of writing a CPU description file involves starting Guile +and developing the .CPU file interactively. The basic steps are: -@enumerate 1 +@enumerate @item Run @code{guile}. @item @code{(load "dev.scm")} @item Load application, e.g. @code{(load-opc)} or @code{(load-sim)} @@ -702,7 +825,7 @@ Only the @code{#:arch} argument is mandatory. The best way to begin a port is to take an existing one (preferably one that is similar to the new port) and use it as a template. -@enumerate 1 +@enumerate @item Run @code{guile}. @item @code{(load "dev.scm")}. This loads in a set of interactive development routines. @@ -731,11 +854,6 @@ eight opcodes files (use the M32R port as an example). @item Run @code{make all-opcodes} from the top level build directory. @end enumerate -Note that Guile is not currently shipped with Binutils, etc. Until -Guile is shipped with Binutils, etc. or a C implementation of CGEN is -done, the generated files are installed in the source directory and -checked into CVS. - @node Doing a GAS port @section Doing a GAS port @@ -853,7 +971,7 @@ then assembled, disassembled, verified, and checked into CVS. Further changes are usually done by hand as it's easier. The goal here is to save the enormous amount of initial typing that is required. -@enumerate 1 +@enumerate @item @code{cd} to the CGEN build directory @item @code{make gas-test} @@ -882,7 +1000,7 @@ At this point further additions/modifications are usually done by hand. The same basic procedure for opcodes porting applies here. -@enumerate 1 +@enumerate @item Run @code{guile}. @item @code{(load "dev.scm")} @item @code{(load-sim)} @@ -1003,7 +1121,7 @@ for the latter. 
Three versions are currently supported: -@enumerate 1 +@enumerate @item simple -- fetch/decode/execute one insn @item scache -- same as simple but results of decoding are cached @item pbb -- same as scache but several insns are handled each iteration @@ -1048,7 +1166,7 @@ then verified and checked into CVS. Further changes are usually done by hand as it's easier. The goal here is to save the enormous amount of initial typing that is required. -@enumerate 1 +@enumerate @item @code{cd} to the CGEN build directory @item @code{make sim-test ISA=} diff --git a/cgen/doc/rtl.texi b/cgen/doc/rtl.texi index 32a1cdbb20..c8c6ad366f 100644 --- a/cgen/doc/rtl.texi +++ b/cgen/doc/rtl.texi @@ -1,4 +1,4 @@ -@c Copyright (C) 2000, 2003 Red Hat, Inc. +@c Copyright (C) 2000, 2003, 2009 Red Hat, Inc. @c This file is part of the CGEN manual. @c For copying conditions, see the file cgen.texi. @@ -21,13 +21,13 @@ its CPU description language. * Hardware elements:: Elements of a CPU * Instruction fields:: Fields of an instruction * Enumerated constants:: Assigning useful names to important numbers -* Instruction operands:: +* Instruction operands:: Operands of instructions * Derived operands:: Operands for CISC-like architectures -* Instructions:: -* Macro-instructions:: -* Modes:: -* Expressions:: -* Macro-expressions:: +* Instructions:: Instructions +* Macro-instructions:: Macro instructions +* Modes:: Operand types in expressions +* Expressions:: Expressions in the language +* Macro-expressions:: A simplification of arithmetic expressions @end menu @node RTL Introduction @@ -116,28 +116,26 @@ treated as optional so a shorthand form of @code{(add dr sr)} works. A few basic guidelines for all entries: @itemize @bullet -@item names must be valid Scheme symbols. -@item comments are used, for example, to comment the generated C code +@item Names must be valid Scheme symbols. 
+@item Comments are used, for example, to comment the generated C code @footnote{It is possible to produce a reference manual from @file{.cpu} files and such an application wouldn't be a bad idea.}. -@item comments may be any number of lines, though generally succinct comments +@item Comments may be any number of lines, though generally succinct comments are preferable@footnote{It would be reasonable to have a short form and a long form of comment. Either as two entries are as one entry with the short form separated from the long form via some delimiter (say the first newline).}. -@item everything is case sensitive.@footnote{??? This is true in RTL, +@item Everything is case sensitive.@footnote{??? This is true in RTL, though some apps add symbols and convert case that can cause collisions.} -@item while "_" is a valid character to use in symbols, "-" is preferred -@item except for the @samp{comment} and @samp{attrs} fields and unless +@item While "_" is a valid character to use in symbols, "-" is preferred +@item Except for the @samp{comment} and @samp{attrs} fields and unless otherwise specified all fields must be present. -@end itemize - -Symbols and strings - -Symbols in CGEN are the same as in Scheme. -Symbols can be used anywhere a string can be used. -The reverse is not true, and in general strings can't be used in place +@item Symbols used to be allowed anywhere a string can be used. +This is what earlier versions of Guile supported. +Guile is more strict now, so this relaxation is gone. +The reverse has never been allowed: strings can't be used in place of symbols. +@end itemize @node Definitions @section Definitions @@ -147,7 +145,7 @@ Each entry has the same format: @code{(define-foo arg1 arg2 ...)}, where @samp{foo} designates the type of entry (e.g. @code{define-insn}). In the general case each argument is a name/value pair expressed as @code{(name value)}. 
-(*note: Another style in common use is `:name value' and doesn't require +(*Note: Another style in common use is `:name value' and doesn't require parentheses. Maybe that would be a better way to go here. The current style is easier to construct from macros though.) @@ -156,7 +154,7 @@ the normal case. To reduce this verbosity, a second version of most define-foo's exists that takes positional arguments. To further reduce this verbosity, preprocessor macros can be written to simplify things further for the normal case. See sections titled ``Simplification -macros'' below. +macros'' later in this manual. @node Attributes @section Attributes @@ -179,19 +177,20 @@ Another useful addition might be functional attributes (the attribute is computed at run-time - currently all attributes are computed at compile time). One way to implement functional attributes would be to record the attributes as byte-code and lazily evaluate them, caching the -results as appropriate. The syntax has been carefully done to not +results as appropriate. The syntax has been done to not preclude either as an upward compatible extension. Attributes must be defined before they can be used. There are several predefined attributes for entry types that need them (instruction field, hardware, operand, and instruction). Predefined -attributes are documented in each relevant section below. +attributes are documented in each relevant section. In C applications an enum is created that defines all the attributes. Applications that wish to be architecture independent need the attribute to have the same value across all architectures. This is achieved by -giving the attribute the INDEX attribute, which specifies the enum value -must be fixed across all architectures. +giving the attribute the INDEX attribute +@footnote{Yes, attributes can have attributes.}, +which specifies the enum value must be fixed across all architectures. @c FIXME: Give an example here. @c FIXME: Need a better name than `INDEX'. 
@@ -216,9 +215,11 @@ Boolean attributes are defined with: The default value of boolean attributes is always false. This can be relaxed, but it's one extra complication that is currently unnecessary. -Boolean attributes are specified in either of two forms: (NAME expr), -and NAME, !NAME. The first form is the canonical form. The latter two -are shorthand versions. `NAME' means "true" and `!NAME' means "false". +Boolean attributes are specified in one of three forms: +@code{(NAME expr)}, @code{NAME}, and @code{!NAME}. +The first form is the canonical form. The latter two +are shorthand versions. +@code{NAME} means "true" and @code{!NAME} means "false". @samp{expr} is an expression that evaluates to 0 for false and non-zero for true @footnote{The details of @code{expr} is still undecided.}. @@ -246,10 +247,10 @@ Integer attributes are defined with: If omitted, the default is 0. -(*note: The details of `expr' is still undecided. For now it must be -an integer.) +(*Note: The details of @code{expr} are still undecided. +For now it must be an integer.) -Integer attributes are specified with (NAME expr). +Integer attributes are specified with @code{(NAME expr)}. @subsection Enumerated Attributes @cindex Attributes, enumerated @@ -272,10 +273,10 @@ Enumerated attributes are defined with If omitted, the default is the first specified value. -(*note: The details of `expr' is still undecided. For now it must be the -name of one of the specified values.) +(*Note: The details of @code{expr} are still undecided. +For now it must be the name of one of the specified values.) -Enum attributes are specified with (NAME expr). +Enum attributes are specified with @code{(NAME expr)}. @subsection Bitset Attributes @cindex Attributes, bitset @@ -284,7 +285,7 @@ Bitset attributes are for situations where you want to indicate something is a subset of a small set of possibilities. 
The MACH attribute uses this for example to allow specifying which of the various machines support a particular insn. -(*note: At present the maximum number of possibilities is 32. +(*Note: At present the maximum number of possibilities is 32. This is an implementation restriction which can be relaxed, but there's currently no rush.) @@ -309,12 +310,12 @@ Bitset attributes are specified with @code{(NAME val1,val2,...)}. There must be no spaces in ``@code{val1,val2,...}'' and each value must be a valid Scheme symbol. -(*note: it's not clear whether allowing arbitrary expressions will be +(*Note: It's not clear whether allowing arbitrary expressions will be useful here, but doing so is not precluded. For now each value must be the name of one of the specified values.) @node Architecture variants -@section Architecture Variants +@section Architecture variants @cindex Architecture variants The base architecture and its variants are described in four parts: @@ -354,6 +355,7 @@ The syntax of @code{define-arch} is: Specify the default alignment to use when fetching data (and instructions) from memory. At present this can't be overridden, but support can be added if necessary. The default is @code{aligned}. +@c Definitely need to say more here. @subsubsection insn-lsb0? @cindex insn-lsb0? @@ -648,11 +650,11 @@ mach is used. List of names of ISA's the machine supports. @node Model variants -@section Model Variants +@section Model variants For each `machine', as defined here, there is one or more `models'. There must be at least one model for each machine. -(*note: There could be a default, but requiring one doesn't involve that much +(*Note: There could be a default, but requiring one doesn't involve that much extra typing and forces the programmer to at least think about such things.) @example @@ -760,7 +762,7 @@ behind the operands must be marked with the attribute @code{PROFILE} and the hardware item must not be scalar. 
@node Hardware elements -@section Hardware Elements +@section Hardware elements The elements of hardware that make up a CPU are defined with @code{define-hardware}. Examples of hardware elements include @@ -991,6 +993,7 @@ How @samp{function_name} is used is application specific, but in general it is the name of a function to call. The only application that uses this at present is Opcodes. See the Opcodes documentation for a description of each function's expected prototype. +@c FIXME: Need ref here. @subsection get @@ -1135,7 +1138,7 @@ or @code{handlers} specs. @end example @node Instruction fields -@section Instruction Fields +@section Instruction fields @cindex Fields, instruction Instruction fields define the raw bitfields of each instruction. @@ -1161,7 +1164,7 @@ The syntax for defining instruction fields is: ) @end example -(*note: Whether to also provide a way to specify instruction formats is not yet +(*Note: Whether to also provide a way to specify instruction formats is not yet clear. Currently they are computed from the instructions, so there's no current *need* to provided them. However, providing the ability as an option may simplify other tools CGEN is used to generate. This @@ -1171,7 +1174,7 @@ may also simplify expression of more complicated instruction sets. Providing instruction formats may also simplify the support of really complex ISAs like i386 and m68k). -(*note: Positional specification simplifies instruction description somewhat +(*Note: Positional specification simplifies instruction description somewhat in that there is no required order of fields, and a disjunct set of fields can be referred to as one. On the other hand it can require knowledge of the length of the instruction which is inappropriate in cases like the M32R where @@ -1205,7 +1208,7 @@ to validate programs, either statically or dynamically. @item VIRTUAL The field does not directly contribute to the instruction's value. 
This -is used to simplify semantic or assembler descriptions where a fields +is used to simplify semantic or assembler descriptions where a field's value is based on other values. Multi-ifields are always virtual. @end table @@ -1213,7 +1216,7 @@ value is based on other values. Multi-ifields are always virtual. The offset in bits from the start of the instruction to the word containing the field. -NOTE: Either both of @samp{word-offset} and @samp{word-length} must be +Either both of @samp{word-offset} and @samp{word-length} must be specified or neither of them must be specified. The presence of @samp{word-offset} means the long form of specifying the field's position is being used. If absent then the short form is being used and the value for @@ -1227,7 +1230,7 @@ The bit number of the field's most significant bit in the instruction. Bit numbering is determined by the @code{insn-lsb0?} field of @code{define-arch}. -NOTE: If using the long form of specifying the field's position +If using the long form of specifying the field's position (@samp{word-offset} is present) then this value is the value within the containing word. If using the short form then this value includes the word offset. See the Porting document for more info @@ -1262,7 +1265,7 @@ rather than, say, char if the specified mode was @code{QI}). An expression to apply to convert from usable values to raw field values. The syntax is @code{(encode (value pc) expression)} or more specifically @code{(encode (( value) (IAI pc)) )}, -where @code{} is the mode of the the ``incoming'' value, and +where @code{} is the mode of the ``incoming'' value, and @code{} is an rtx to convert @code{value} to something that can be stored in the field. @@ -1324,7 +1327,7 @@ such fields is: ) @end example -(*note: insert/extract are analogous to encode/decode so maybe these +(*Note: insert/extract are analogous to encode/decode so maybe these fields are misnamed. The operations are subtly different though.) 
Example: @@ -1431,7 +1434,7 @@ in an instruction's @code{format} entry. ) @end example -(*note: @code{define-insn-enum} isn't implemented yet: use +(*Note: @code{define-insn-enum} isn't implemented yet: use @code{define-normal-insn-enum}) Example: @@ -1484,14 +1487,15 @@ Example: @code{(define-normal-insn-enum name comment attrs prefix ifield vals)} @node Instruction operands -@section Instruction Operands +@section Instruction operands @cindex Operands, instruction Instruction operands provide: @itemize @bullet @item a layer between the assembler and the raw hardware description -@item the main means of manipulating instruction fields in the semantic code +@item the main means of making an instruction's fields useful to +the semantic code @c More? @end itemize @@ -1540,6 +1544,7 @@ a relaxable/relaxed instruction. Use the SEM-ONLY attribute for cases where the operand will only be used in semantic specification, and not assembly code specification. A typical example is condition codes. +@c Does this attribute need to exist? @end table To refer to a hardware element in semantic code one must either use an @@ -1565,7 +1570,7 @@ hardware element. @subsection index The index of the hardware element. This is used to mate the hardware element with the instruction field that selects it, and must be the name -of an ifield entry. (*note: The index may be other things besides +of an ifield entry. (*Note: The index may be other things besides ifields in the future.) It must not be a multi-ifield, currently. @subsection asm @@ -1585,7 +1590,7 @@ where @code{asm-spec} is one or more of: These functions are intended to be provided in a separate @file{.opc} file. The prototype of a parse function depends on the hardware type. -See @file{cgen/*.opc} for examples. +See @file{cgen/cpu/*.opc} for examples. @c FIXME: The following needs review. @@ -1612,7 +1617,7 @@ relocations are processed after the instruction has been parsed. The result is an error message or NULL if successful. 
The prototype of a print function depends on the hardware type. See -@file{cgen/*.opc} for examples. For integers it is: +@file{cgen/cpu/*.opc} for examples. For integers it is: @example void print_foo (CGEN_CPU_DESC cd, @@ -1746,7 +1751,7 @@ amount of information that must be expressed, how succinct can one express it and still be clean and usable? I'm open to opinions on how to improve this, but such improvements must take everything CGEN wishes to be into account. -(*note: Of course no claim is made that the current design is the +(*Note: Of course no claim is made that the current design is the be-all and end-all or that there is one be-all and end-all.) The syntax for defining an instruction is: @@ -1791,14 +1796,14 @@ There must be no spaces in ``@code{mach1,mach2,...}''. @item UNCOND-CTI The instruction is an unconditional ``control transfer instruction''. -(*note: This attribute is derived from the semantic code. However if the +(*Note: This attribute is derived from the semantic code. However if the computed value is wrong (dunno if it ever will be) the value can be overridden by explicitly mentioning it.) @item COND-CTI The instruction is an conditional "control transfer instruction". -(*note: This attribute is derived from the semantic code. However if the +(*Note: This attribute is derived from the semantic code. However if the computed value is wrong (dunno if it ever will be) the value can be overridden by explicitly mentioning it.) @@ -2001,8 +2006,10 @@ of it that uses the stack pointer. Modes provide a simple and succinct way of specifying data types. -(*note: Should more complex types will be needed (e.g. structs? unions?), +(*Note: Should more complex types be needed (e.g. structs? unions?), these can be handled by extending the definition of a mode to encompass them.) +@c Also, have registers as just bits and have the operand / semantic operation +@c provide the mode. 
Modes are similar to their usage in GCC, but there are some differences: @@ -2050,12 +2057,12 @@ specified in the operation, _not_ in the data. Ergo from this perspective Umodes don't belong in .cpu files. This is the perspective to use when writing .cpu files. -??? I'm not entirely sure these unsigned modes are needed. -They are useful in removing any ambiguity in how to sign extend constants -which has been a source of problems in GCC. -OTOH, maybe adding uconst akin to const is the way to go? - -??? Some existing ports use these modes. +@c I'm not entirely sure these unsigned modes are needed. +@c They are useful in removing any ambiguity in how to sign extend constants +@c which has been a source of problems in GCC. +@c OTOH, maybe adding uconst akin to const is the way to go? +@c +@c ?? Some existing ports use these modes. @item WI,UWI word int, unsigned word int (word_mode in gcc). @@ -2067,9 +2074,9 @@ Same as GCC. SF is a 32 bit IEEE float ("single float"). DF is a 64 bit IEEE float ("double float"). XF is either an 80 or 96 bit IEEE float ("extended float"). -(*note: XF values on m68k and i386 are different so may +(*Note: XF values on m68k and i386 are different so may wish to give them different names). -TF is a 128 bit IEEE float ("??? float"). +TF is a 128 bit IEEE float. @item AI Address integer @@ -2144,6 +2151,8 @@ The value must be from a previously defined enum. @item (subword mode value word-num) Return part of @samp{value}. Which part is determined by @samp{mode} and @samp{word-num}. There are three cases. +@c Blech. ``subword'' is a source of confusion in GCC. +@c Maybe have three separate rtxs. If @samp{mode} is the same size as the mode of @samp{value}, @samp{word-num} must be @samp{0} and the result is @samp{value} recast in the new mode. @@ -2171,7 +2180,8 @@ Concatenate @samp{arg1[,arg2[,...]]} to create a value of mode @samp{out-mode}. @samp{arg1} becomes the most significant part of the result. 
Each argument is interpreted in mode @samp{in-mode}. @samp{in-mode} must evenly divide @samp{out-mode}. -??? Endianness issues have yet to be decided. +@c ??? Endianness issues have yet to be decided. +@c Blech. Time to decide them. @item (sequence mode ((mode1 local1) ...) expr1 expr2 ...) Execute @samp{expr1}, @samp{expr2}, etc. sequentially. @samp{mode} is the @@ -2206,11 +2216,13 @@ always a single bit. @samp{binop-with-bit} is one of @code{addc}, @item (shiftop mode operand1 operand2) Perform a shift operation. @samp{shiftop} is one of @code{sll}, @code{srl}, @code{sra}, @code{ror}, @code{rol}. +@c Need to be precise about the semantics, and not leave it to C. @item (boolifop mode operand1 operand2) Perform a sequential boolean operation. @samp{operand2} is not processed if @samp{operand1} ``fails''. @samp{boolifop} is one of @code{andif}, @code{orif}. +@c Extend to handle more than two operands? @item (convop mode operand) Perform a mode->mode conversion operation. @samp{convop} is one of @@ -2221,6 +2233,7 @@ Perform a mode->mode conversion operation. @samp{convop} is one of Perform a comparison. @samp{cmpop} is one of @code{eq}, @code{ne}, @code{lt}, @code{le}, @code{gt}, @code{ge}, @code{ltu}, @code{leu}, @code{gtu}, @code{geu}. +@c floating point compare-unordered? @item (mathop mode operand) Perform a mathematical operation. @samp{mathop} is one of @code{sqrt}, @@ -2254,7 +2267,7 @@ An escape hook to emit a subroutine call to function named @samp{symbol} passing operands @samp{operand1}, @samp{operand2}, etc. An implicit first argument of @code{current_cpu} is passed to @samp{symbol}. @samp{mode} is the mode of the result. Be aware that @samp{symbol} will -be restricted by reserved words in the C programming language any by +be restricted by reserved words in the C programming language and by existing symbols in the generated code. @item (c-raw-call mode symbol operand1 operand2 ...) 
@@ -2333,7 +2346,7 @@ Return non-zero if the value of attribute @samp{attr-name} is Return the index of @samp{operand}. For registers this is the register number. @item (regno operand) -Same as @code{index-of}, but improves readability for registers +Same as @code{index-of}, but improves readability for registers. @item (error mode message) Emit an error message from CGEN RTL. Error message is specified by @samp{message}. @@ -2343,8 +2356,11 @@ A no-op. @item (ifield field-name) Return the value of field @samp{field-name}. @samp{field-name} must be a -field in the instruction. Operands can be any of: -@c ??? +field in the instruction. + +@end table + +Operands can be any of: @itemize @bullet @item an operand defined in the description file @@ -2358,18 +2374,18 @@ field in the instruction. Operands can be any of: The @samp{symbol} in a @code{c-call} or @code{c-raw-call} function is currently the name of a C function or macro that is invoked by the generated semantic code. -@end table @node Macro-expressions @section Macro-expressions @cindex Macro-expressions -Macro RTL expressions started out by wanting to not have to always +Macro RTL expressions are a way to not have to always specify a mode for every expression (and sub-expression -thereof). Whereas the formal way to specify, say, an add is @code{(add -SI arg1 arg2)} if SI is the default mode of `arg1' then this can be -simply written as @code{(add arg1 arg2)}. This gets expanded to -@code{(add DFLT arg1 arg2)} where @code{DFLT} means ``default mode''. +thereof). Whereas the formal way to specify, say, an add is +@code{(add SI arg1 arg2)} if SI is the default mode of `arg1' then +this can be simply written as @code{(add arg1 arg2)}. +This gets expanded to @code{(add DFLT arg1 arg2)} where +@code{DFLT} means ``default mode''. It might be possible to replace macro expressions with preprocessor macros, however for the nonce there is no plan to do this. 
diff --git a/cgen/doc/running.texi b/cgen/doc/running.texi index 07984054a6..1c1b0292c1 100644 --- a/cgen/doc/running.texi +++ b/cgen/doc/running.texi @@ -1,9 +1,421 @@ -@c Copyright (C) 2000 Red Hat, Inc. +@c Copyright (C) 2000, 2009 Red Hat, Inc. @c This file is part of the CGEN manual. @c For copying conditions, see the file cgen.texi. @node Running CGEN @chapter Running CGEN -This chapter needs to explain how to run CGEN, how it fits together, and -what to expect when you do run it (i.e., output, resultant files, etc). +CGEN is usually run from a shell script provided by the application. +For example, in @file{libopcodes} there is @file{cgen.sh}. + +The main tasks of this script are to: + +@enumerate +@item Set up the arguments for cgen. +@item Run cgen. +@item Apply any post processing to the output files. +@end enumerate + +@subsection Set up the arguments for cgen. + +CGEN takes several standard arguments. +Each application can then add its own arguments. +By convention generic CGEN options are lowercase letters +and applications use uppercase letters for their arguments. + +@c organization of application-specific args needs work + +@menu +* a:: -a Specify path of .cpu file to load. +* b:: -b Use debugging evaluator, for backtraces. +* d:: -d Start interactive debugging session. +* f:: -f Specify a set of flags to control code generation. +* h:: -h,--help Print usage information. +* i:: -i Specify isa-list entries to keep. +* m:: -m Specify mach-list entries to keep. +* s:: -s Specify the source directory. +* v:: -v Increment the verbosity level. +* version:: --version Print version info. + +* opcodes:: Opcodes generator arguments. +* sim:: Simulator generator arguments. +* sid:: Sid generator arguments. +* html:: HTML doc generator arguments. +@end menu + +@node a +@section Specify path of architecture's .cpu file to load. @option{-a} @var{path} + +Use this option to specify the @file{.cpu} file to load. 
+ +@node b +@section Use debugging evaluator, for backtraces. @option{-b} + +Use this option when trying to debug a cgen failure. +It turns on the debugging facilities of the underlying system, e.g. Guile, +and is typically used to produce better error messages (e.g. better +backtraces). +Guile's debugging evaluator is slower than the normal one, +so this option is off by default. + +@node d +@section Start interactive debugging session. @option{-d} + +Use this option when trying to debug a cgen failure and you +want to enter a debugging +@code{repl}@footnote{Read-Evaluate-Print-Loop} +in the underlying system, e.g. Guile. + +@node f +@section Specify a set of flags to control code generation. @option{-f} @var{flags} + +Use this option to pass various code generation options to the application. +@var{flags} is a space-separated list of options with the format +@code{name} or @code{name=value}. +Each application accepts its own set of options. + +@c Need to say more here, and for each option. + +@menu +* Opcodes Generator Options:: Opcodes Generator Options +* GDB Simulator Generator Options:: GDB Generator Simulator +* SID Simulator Generator Options:: SID Generator Simulator +* HTML Doc Generator Options:: HTML Doc Generator Options +@end menu + +@node Opcodes Generator Options +@subsection Opcodes Generator Options + +The @code{Opcodes} generator accepts the following options: + +@table @code + +@item opinst +Include the operand instance table in the generated code. + +@item copyright= +The argument is the copyright to add to the generated code. +It must be one of @code{fsf} or @code{redhat}. + +@item package= +The argument is the package the opcodes files are being generated for. +It must be one of @code{binutils}, @code{gnusim} (the simulators in GDB +releases) or @code{cygsim} (SID simulators). 
+ +@end table + +@node GDB Simulator Generator Options +@subsection GDB Simulator Generator Options + +@table @code + +@item with-scache + +Specify this option to enable the ``semantic cache'' of the simulator. +The simulator uses the semantic cache to speed up simulation by caching +the decoding of instructions. + +@item with-profile= + +Specify this option to enable basic profiling support. + +fn - do profiling in the semantic function + +sw - do profiling in the semantic switch + +@item with-multiple-isa + +Specify this option to enable multiple-isa support. +This is useful for the arm+thumb simulator, +and allows the simulator to simulate programs that use both ISAs. + +@item with-generic-write + +This option is for architectures that can execute multiple +instructions in parallel. +Instruction semantics are performed by recording the results +in a generic buffer, and doing a post-semantics writeback pass. +@c What happens if this option is left off? + +@item with-parallel-only +@c Only generate parallel versions of each insn. + +@item copyright= +The argument is the copyright to add to the generated code. +It must be one of @code{fsf} or @code{redhat}. + +@item package= +The argument is the package the simulator files are being generated for. +It must be one of @code{gnusim} (the simulators in GDB +releases) or @code{cygsim} (SID simulators). +@c Is cygsim old or what? SID has its own generators. + +@end table + +@node SID Simulator Generator Options +@subsection SID Simulator Generator Options + +@table @code + +@item with-scache + +Specify this option to enable the ``semantic cache'' of the simulator. +The simulator uses the semantic cache to speed up simulation by caching +the decoding of instructions. + +@emph{NOTE:} Not all targets support this option. + +@item with-pbb + +Specify this option to enable the ``pseudo basic block'' engine. 
+The simulator uses the pbb engine to speed up simulation by analyzing +the instruction stream a pseudo basic block at a time. + +@emph{NOTE:} Not all targets support this option. + +@item with-sem-frags + +Specify this option to enable the semantic fragment engine. + +@emph{NOTE:} This option requires @code{with-pbb}. + +@emph{NOTE:} Not all targets support this option. + +@item with-profile= + +Specify this option to enable basic profiling support. + +fn - do profiling in the semantic function + +sw - do profiling in the semantic switch + +@item with-multiple-isa + +Specify this option to enable multiple-isa support. +This is useful for the arm+thumb simulator, +and allows the simulator to simulate programs that use both ISAs. + +@item copyright= +The argument is the copyright to add to the generated code. +It must be one of @code{fsf} or @code{redhat}. + +@item package= +The argument is the package the simulator files are being generated for. +It must be one of @code{gnusim} (the simulators in GDB +releases) or @code{cygsim} (SID simulators). +@c What's gnusim doing here? + +@end table + +@node HTML Doc Generator Options +@subsection HTML Doc Generator Options + +@table @code + +@item copyright= +The argument is the copyright to add to the generated code. +It must be @code{doc}. + +@item package= +The argument is the package the opcodes files are being generated for. +It must be @code{cgen}. + +@end table + +@node h +@section Print usage information. @option{-h,--help} + +The standard --help option. + +@node i +@section Specify isa-list entries to keep. @option{-i} @var{isa-list} + +Use this option to select a subset of the ISAs for the architecture. +This is useful, for example, to generate only Thumb support from an +arm+thumb description. + +@node m +@section Specify mach-list entries to keep. @option{-m} @var{mach-list} + +Use this option to select a subset of the machines of the architecture. 
+This is useful, for example, to generate a simulator for a specific +variant of the architecture. + +@node s +@section Specify the source directory. @option{-s} @var{srcdir} + +Use this to specify where the rest of CGEN's files are. + +For example, in @code{Binutils} CGEN is typically a sibling +of @file{src/opcodes}, i.e., @file{src/cgen}. + +@node v +@section Increment the verbosity level. @option{-v} + +Specifying multiple @code{-v} options will increase the verbosity. + +@node version +@section Print version info. @option{--version} + +The standard --version option. + +@node opcodes +@section Opcodes generator arguments + +The opcodes generator accepts these arguments. + +@table @code +@item @code{-OPC} @var{FILE} +Specify the path to the @file{.opc} file. +The @file{.opc} file contains C code that is copied to the output. +It's useful for providing non-standard or non-straightforward +parsers and printers. + +@item @code{-H} @var{FILE} +Generate $arch-desc.h in FILE. + +@item @code{-C} @var{FILE} +Generate $arch-desc.c in FILE. + +@item @code{-O} @var{FILE} +Generate $arch-opc.h in FILE. + +@item @code{-P} @var{FILE} +Generate $arch-opc.c in FILE. + +@item @code{-Q} @var{FILE} +Generate $arch-opinst.c in FILE. + +@item @code{-B} @var{FILE} +Generate $arch-ibld.h in FILE. + +@item @code{-L} @var{FILE} +Generate $arch-ibld.in in FILE. + +@item @code{-A} @var{FILE} +Generate $arch-asm.in in FILE. + +@item @code{-D} @var{FILE} +Generate $arch-dis.in in FILE. + +@end table + +@node sim +@section Simulator generator arguments + +The simulator generator accepts these arguments. + +@table @code + +@item @code{-A} @var{FILE} +Generate arch.h in FILE. + +@item @code{-B} @var{FILE} +Generate arch.c in FILE. + +@item @code{-C} @var{FILE} +Generate cpu-.h in FILE. + +@item @code{-U} @var{FILE} +Generate cpu-.c in FILE. + +@item @code{-N} @var{FILE} +Generate cpu-all.h in FILE. + +@item @code{-F} @var{FILE} +Generate memops.h in FILE. 
+ +@item @code{-G} @var{FILE} +Generate defs.h in FILE. + +@item @code{-P} @var{FILE} +Generate semops.h in FILE. + +@item @code{-T} @var{FILE} +Generate decode.h in FILE. + +@item @code{-D} @var{FILE} +Generate decode.c in FILE. + +@item @code{-E} @var{FILE} +Generate extract.c in FILE. + +@item @code{-R} @var{FILE} +Generate read.c in FILE. + +@item @code{-W} @var{FILE} +Generate write.c in FILE. + +@item @code{-S} @var{FILE} +Generate semantics.c in FILE. + +@item @code{-X} @var{FILE} +Generate sem-switch.c in FILE. + +@item @code{-O} @var{FILE} +Generate ops.c in FILE. + +@item @code{-M} @var{FILE} +Generate model.c in FILE. + +@item @code{-L} @var{FILE} +Generate mainloop.in in FILE. + +@end table + +@node sid +@section SID generator arguments + +The SID simulator generator accepts these arguments. + +@table @code + +@item @code{-H} @var{FILE} +Generate desc.h in FILE. + +@item @code{-C} @var{FILE} +Generate cpu.h in FILE. + +@item @code{-E} @var{FILE} +Generate defs.h in FILE. + +@item @code{-T} @var{FILE} +Generate decode.h in FILE. + +@item @code{-D} @var{FILE} +Generate decode.cxx in FILE. + +@item @code{-W} @var{FILE} +Generate write.cxx in FILE. + +@item @code{-S} @var{FILE} +Generate semantics.cxx in FILE. + +@item @code{-X} @var{FILE} +Generate sem-switch.cxx in FILE. + +@item @code{-M} @var{FILE} +Generate model.cxx in FILE. + +@item @code{-N} @var{FILE} +Generate model.h in FILE. + +@end table + +@node html +@section HTML doc generator arguments + +The HTML doc generator accepts these arguments. + +@table @code + +@item @code{-H} @var{FILE} +Generate $arch.html in FILE. + +@item @code{-I} @var{FILE} +Generate $arch-insn.html in FILE. + +@item @code{-N} @var{FILE} +Set the name of the insn.html file to FILE. + +@end table diff --git a/cgen/doc/sim.texi b/cgen/doc/sim.texi index f52848ce10..d78b2ed1fb 100644 --- a/cgen/doc/sim.texi +++ b/cgen/doc/sim.texi @@ -1,4 +1,4 @@ -@c Copyright (C) 2000 Red Hat, Inc. +@c Copyright (C) 2000, 2009 Red Hat, Inc. 
@c This file is part of the CGEN manual. @c For copying conditions, see the file cgen.texi. diff --git a/cgen/doc/stamp-vti b/cgen/doc/stamp-vti index 168ca37c5e..dde65c58b9 100644 --- a/cgen/doc/stamp-vti +++ b/cgen/doc/stamp-vti @@ -1,3 +1,4 @@ -@set UPDATED 28 March 2001 +@set UPDATED 9 June 2009 +@set UPDATED-MONTH June 2009 @set EDITION 1.1 -@set VERSION 1.0 +@set VERSION 1.1 diff --git a/cgen/doc/version.texi b/cgen/doc/version.texi index 168ca37c5e..dde65c58b9 100644 --- a/cgen/doc/version.texi +++ b/cgen/doc/version.texi @@ -1,3 +1,4 @@ -@set UPDATED 28 March 2001 +@set UPDATED 9 June 2009 +@set UPDATED-MONTH June 2009 @set EDITION 1.1 -@set VERSION 1.0 +@set VERSION 1.1 -- 2.11.0