* JAN-FEB 2023\r
* WITH COMPILER SECURITY\r
* AND VARIABLE LENGTH NAMES\r
-* Try again with RTS mode\r
+* Flattening the RTS mode\r
+*\r
+* When I got to BUILDS/DOES in FIG68KSB.S,\r
+* I realized that the far branch kludge\r
+* is more of a paradigm-breaker than I had been expecting.\r
+* So I came back to FIG68KRT.S to flatten\r
+* the indirect-threading, subroutine-call inner interpreter.\r
+* Subroutine-call inner interpreter should allow\r
+* seamless interface with CPU native runtime libraries,\r
*\r
* Adapted by Joel Matthew Rees \r
* from fig-FORTH for 6800 (via buggy fig-FORTH for 6809) by Dave Lion, et. al.\r
*\r
* The original version was developed on an AMI EVK 300 PROTO\r
* system using an ACIA for the I/O.\r
-* This version is developed targeting the Atar ST.\r
+* This version is developed targeting the Atari ST.\r
\r
* All terminal 1/0\r
* is done in three subroutines:\r
* are not supported as ORG arguments, etc.\r
*\r
* These will be defined elsewhere:\r
+* Except the buffers must be defined before being used.\r
+*\r
+\r
+* The following buffered I/O definitions must be resolved before being used in the first pass,\r
+* for assemblers that insist.\r
+*\r
+* Traditional SCREEN size is 1024.\r
+* But it must be a power of 2 multiple of BLOCK size.\r
+* The model I/O routines from fig Forth may work better at 8 BLOCKs per SCREEN,\r
+* but only 4 buffers (half a SCREEN) in memory\r
+* -- because of certain tightly-coupled design features (bugs).\r
+*\r
+* * If your block I/O works directly on sector buffers, \r
+* BLOCK size should be SECTOR size.\r
+* * If your sector buffer memory is not in Forth-controlled memory,\r
+* BLOCK size can be decoupled from SECTOR size. But your low-level routines\r
+* have to handle the copying between correctly.\r
*\r
-* NBLK EQU 4 # of disc buffer blocks for virtual memory\r
+* This version of the model does not handle BLOCK 0 buffering well.\r
+*\r
+* And, of course, this should be in a table with entries for each block I/O device.\r
+*\r
+NBLK EQU 4 ; # of disc buffer blocks for "virtual memory"\r
+*\r
+SCRSZ EQU 1024 \r
+*\r
+RSECSZ EQU 256 ; size of the RAM-resident emulated block I/O sector\r
+RBLKSZ EQU SCRSZ/NBLK ; \r
+* each block buffer is RBLKSZ+SECTRL bytes in size,\r
+* holding RBLKSZ characters\r
+SECTRL EQU 2*NATWID ; Currently held sector number, etc.\r
+BUFSZ EQU (RBLKSZ+SECTRL)*NBLK\r
+*\r
+\r
+\r
* MEMEND EQU 132*NBLK+ENDofCODE end of ram\r
* each block is 132 bytes in size,\r
* holding 128 characters\r
* ######>> screen 13 <<\r
* These are of questionable use anyway, \r
* and are too much trouble to use with native subroutine call anyway.\r
-POPD0X MOVE.L (PSP)+,D0 ; These may actually not end up being used.\r
-STD0X MOVE.L D0,(A0)\r
- BRA.S NEXT\r
-GETX MOVE.L (A0),D0\r
+*POPD0X MOVE.L (PSP)+,D0 ; These may actually not end up being used.\r
+*STD0X MOVE.L D0,(A0)\r
+* BRA.S NEXT\r
+*GETX MOVE.L (A0),D0\r
PUSHD0 MOVE.L D0,-(PSP) ; fall through to NEXT\r
\r
* "NEXT" takes ?? cycles if TRACE is removed,\r
DC.B 'U' ; 'U/'\r
DC.B '/'|$80\r
DC.L USTAR-3-NATWID\r
-* Using the bit divide to reduce testing burden, working in registers.\r
USLASH:\r
DC.L *+NATWID\r
+ MOVEM.L (PSP),D0/D1/D2 ; divisor in D0\r
+ TST.L D0 ; divisor 0?\r
+ BEQ.S USL0 ; dodge divide-by-zero exception\r
+ CMP.L #$10000,D0 ; 16-bit divisor?\r
+ BHS.S USLD32 ; no, can't use the easy way\r
+ TST.L D1 ; dividend greater than 32-bit?\r
+ BNE.S USLH64 ; handle the high word\r
+ DIVU.W D0,D2 ; result remainder in high 16 bits ; about ~140\r
+ MOVE.L D2,D1 ; Move the results into place.\r
+ SWAP D1\r
+ AND.L #$FFFF,D1 ; only the remainder\r
+ AND.L #$FFFF,D2 ; only the quotient\r
+ BRA.S BSLR\r
+USLH64:\r
+USLD32:\r
+ BRA.S BSLENT ; bail for now\r
+USL0 MOVE.L D2,D1 ; dividend low word as remainder\r
+ MOVEQ.L #-1,D2 ; saturated quotient\r
+ BRA.S BSLR\r
+ \r
+\r
+*\r
+ EVEN\r
+ DC.B 0\r
+ DC.B $82\r
+ DC.B 'B' ; 'B/'\r
+ DC.B '/'|$80\r
+ DC.L USLASH-3-NATWID\r
+* Using the bit divide to reduce testing burden, working in registers.\r
+BSLASH:\r
+ DC.L *+NATWID\r
MOVEM.L (PSP),D0/D1/D2 ; D1:D2 by D0 (40~ ignore attempts to count cycles)\r
+BSLENT:\r
MOVE.W #32,D3 ; bit ct for DBcc (8~)\r
-USLDIV:\r
+BSLDIV:\r
CMP.L D0,D1 ; divisor (6~)\r
- BHS.S USLSUB (8/10~)\r
+ BHS.S BSLSUB (8/10~)\r
AND #~F_EXT,CCR ; X-carry clear (20~)\r
- BRA.S USLBIT (10~)\r
-USLSUB:\r
+ BRA.S BSLBIT (10~)\r
+BSLSUB:\r
SUB.L D0,D1 (6~)\r
OR #F_EXT,CCR ; quotient, (X-carry set) (20~)\r
-USLBIT:\r
+BSLBIT:\r
ROXL.L #1,D2 ; save it (8~)\r
- DBF D3,USLMOR ; more bits? Don't mess with CCR ((12/14)/10~)\r
-USLR:\r
+ DBF D3,BSLMOR ; more bits? Don't mess with CCR ((12/14)/10~)\r
+BSLR:\r
LEA NATWID(PSP),PSP (8~)\r
MOVE.L D1,NATWID(PSP) (16~)\r
MOVE.L D2,(PSP) (12~)\r
RTS\r
-USLMOR\r
+BSLMOR\r
ROXL.L #1,D1 ; remainder (8~)\r
- BCC.S USLDIV (8/10~)\r
- BRA.S USLSUB (10~) (~90*32=~2880+entry+exit, about 800 μS at 4 MHz)\r
+ BCC.S BSLDIV (8/10~)\r
+ BRA.S BSLSUB (10~) (~90*32=~2880+entry+exit, about 800 μS at 4 MHz)\r
\r
* The following is not yet functional, only here to help me remember:\r
* cUSLASH DC.L *+NATWID\r
DC.B 'AN' ; 'AND'\r
DC.B "D"|$80\r
* DC.L I-2-NATWID ; ***** debug link *****\r
- DC.L USLASH-3-NATWID ; correct link\r
+ DC.L BSLASH-3-NATWID ; correct link\r
AND DC.L *+NATWID\r
MOVE.L (PSP)+,D0\r
AND.L D0,(PSP)\r
DC.B '<'|$80\r
DC.L ZEQU-3-NATWID\r
ZLESS DC.L *+NATWID\r
- CLR.L D0\r
+* CLR.L D0\r
TST.L (PSP)\r
SMI D0\r
- BRA.S ZEQMSK ; trade a few cycles for several bytes\r
+* BRA.S ZEQMSK ; don't trade a few cycles for several bytes\r
+* AND.W #1,D0 ; flatten it, instead.\r
+ AND.L #1,D0 ; flatten it, instead.\r
+ MOVE.L D0,(PSP)\r
+ RTS\r
*\r
* ######>> screen 29 <<\r
* ======>> 33 <<\r
* This would not be hard to flatten to native code,\r
* especially in the 6809 or 68000.\r
* But that's not the purpose of a model.\r
+* Except that's the purpose of this model, now.\r
+* So we will start flattening here, to see how it goes.\r
+* First need to flatten certain of the called words.\r
EVEN\r
DC.B $C1 ; : immediate\r
DC.B ':'|$80\r
DC.L CREATE,RBRAK\r
DC.L PSCODE\r
\r
+* What was I thinking?\r
+*COLON DC.L *+NATWID\r
+* BSR.W QEXEC+NATWID\r
+** BSR.W SCSP+NATWID\r
+* MOVE.L PSP,XCSP-UORIG(UP) ; SCSP\r
+* MOVE.L XCURR-UORIG(UP),XCONT-UORIG(UP)\r
+* BSR.W CREATE+NATWID\r
+* BSR.W RBRAK+NATWID\r
+* MOVE.L #STCOMP,XSTATE-UORIG(UP) ; RBRAK\r
+** NOW WHAT? Should work.\r
+* DC.L PSCODE\r
+\r
* Here is the IP pusher for allowing\r
* nested words in the virtual machine:\r
* ( ;S is the equivalent un-nester )\r
MOVE.L A0,-(PSP)\r
RTS\r
\r
-* Hey, the per-user table can actually be larger than 256 bytes!\r
+* Hey, the per-user table can actually be larger than 256 bytes, esp. on the 68000!\r
*\r
PAGE\r
*\r
DC.B 'F'|$80\r
DC.L BCTL-6-NATWID\r
BBUF DC.L DOCON\r
- DC.L SECTSZ\r
+ DC.L RBLKSZ\r
* Hardcoded in 6800 model:\r
* FDB 128\r
*\r
* ======>> 60 <<\r
* ( --- blocksperscreen ) \r
* The size, in blocks, of a screen.\r
-* Should this be the same as NBLK, the number of block buffers maintained?\r
EVEN\r
DC.B $85\r
DC.B 'B/SC' ; 'B/SCR' : (blocks/screen)\r
DC.B 'R'|$80\r
DC.L BBUF-6-NATWID\r
-BSCR DC.L DOCON\r
- DC.L SCRSZ/SECTSZ\r
+*BSCR DC.L DOCON\r
+* DC.L NBLK\r
+BSCR DC.L *+NATWID\r
+ MOVE.L #NBLK,-(PSP)\r
+ RTS\r
* Hardcoded in 6800 model as:\r
* FDB 8\r
* blocks/screen = 1024 / "B/BUF" = 8, if sectors are 128 bytes.\r
DC.B 'S' ; 'S0'\r
DC.B '0'|$80\r
DC.L PORIG-8-NATWID\r
-SZERO DC.L DOUSER\r
- DC.L XSPZER-UORIG\r
+*SZERO DC.L DOUSER\r
+* DC.L XSPZER-UORIG\r
+SZERO DC.L *+NATWID\r
+ LEA XSPZER-UORIG(UP),A0\r
+ MOVE.L A0,-(PSP) ; Note that BRA.W takes 32 bits to encode, so saves no space.\r
+ RTS\r
*\r
* ======>> 63 <<\r
* ( n --- adr )\r
DC.B 'R' ; 'R0'\r
DC.B '0'|$80\r
DC.L SZERO-3-NATWID\r
-RZERO DC.L DOUSER\r
- DC.L XRZERO-UORIG\r
+*RZERO DC.L DOUSER\r
+* DC.L XRZERO-UORIG\r
+RZERO DC.L *+NATWID\r
+ LEA XRZERO-UORIG(UP),A0\r
+ MOVE.L A0,-(PSP) ; Note that BRA.W takes 32 bits to encode, so saves no space.\r
+ RTS\r
*\r
* ======>> 64 <<\r
* ( --- vadr ) \r
DC.B 'WARNIN' ; 'WARNING'\r
DC.B 'G'|$80\r
DC.L WIDTH-6-NATWID\r
-WARN DC.L DOUSER\r
- DC.L XWARN-UORIG\r
+*WARN DC.L DOUSER ; Must be callable from low-level.\r
+* DC.L XWARN-UORIG\r
+WARN DC.L *+NATWID\r
+ LEA XWARN-UORIG(UP),A0\r
+ MOVE.L A0,-(PSP) ; Note that BRA.W takes 32 bits to encode, so saves no space.\r
+ RTS\r
*\r
* ======>> 67 <<\r
* ( --- vadr ) \r
DC.B 'OFFSE' ; 'OFFSET'\r
DC.B 'T'|$80\r
DC.L SCR-4-NATWID\r
-OFSET DC.L DOUSER\r
- DC.L XOFSET-UORIG\r
+*OFSET DC.L DOUSER\r
+* DC.L XOFSET-UORIG\r
+OFSET DC.L *+NATWID\r
+ LEA XOFSET-UORIG(UP),A0\r
+ MOVE.L A0,-(PSP)\r
+ RTS\r
*\r
* ======>> 74 <<\r
* ( --- vadr ) \r
DC.B '!CS' ; '!CSP'\r
DC.B 'P'|$80\r
DC.L PFA-4-NATWID\r
-SCSP DC.L DOCOL,SPAT,CSP,STORE\r
- DC.L SEMIS\r
+*SCSP DC.L DOCOL,SPAT,CSP,STORE\r
+* DC.L SEMIS\r
+SCSP DC.L *+NATWID\r
+ MOVE.L PSP,XCSP-UORIG(UP) \r
+ RTS\r
+* How would the optimizer have been able to work through the following\r
+* to get the above?\r
+*SCSP DC.L *+NATWID\r
+* MOVE.L PSP,-(PSP)\r
+** MOVE.L (W),D0 ; Offset into the table.\r
+** LEA (UP,D0.L),A0\r
+* LEA XCSP-UORIG(UP),A0\r
+* MOVE.L A0,-(PSP)\r
+* MOVEM.L (PSP)+,D0/A0\r
+* EXG D0,A0\r
+* MOVE.L D0,(A0)\r
+* RTS\r
*\r
PAGE\r
*\r
DC.B $81 ; ]\r
DC.B ']'|$80\r
DC.L LBRAK-2-NATWID\r
-RBRAK DC.L DOCOL,LIT16\r
- DC.W STCOMP\r
- DC.L STATE,STORE\r
- DC.L SEMIS\r
+*RBRAK DC.L DOCOL,LIT16\r
+* DC.W STCOMP\r
+* DC.L STATE,STORE\r
+* DC.L SEMIS\r
+RBRAK DC.L *+NATWID\r
+ MOVE.L #STCOMP,XSTATE-UORIG(UP)\r
+ RTS\r
*\r
* ======>> 114 <<\r
* ( --- )\r
DC.B 'COUN' ; 'COUNT'\r
DC.B 'T'|$80\r
DC.L DOES-6-NATWID\r
-COUNT DC.L DOCOL,DUP,ONEP,SWAP,CAT\r
- DC.L SEMIS\r
+*COUNT DC.L DOCOL,DUP,ONEP,SWAP,CAT\r
+* DC.L SEMIS\r
+COUNT DC.L *+NATWID\r
+ MOVE.L (PSP),A0\r
+ CLR.L D0\r
+ MOVE.B (A0)+,D0\r
+ MOVE.L A0,(PSP)\r
+ MOVE.L D0,-(PSP)\r
+ RTS\r
*\r
* ======>> 122 <<\r
* ( strptr count --- )\r
DC.B 'TYP' ; 'TYPE'\r
DC.B 'E'|$80\r
DC.L COUNT-6-NATWID\r
-TYPE DC.L DOCOL,DDUP,ZBRAN\r
- DC.L TYPE3-*-NATWID\r
- DC.L OVER,PLUS,SWAP,XDO\r
-TYPE2 DC.L I,CAT,EMIT,XLOOP\r
- DC.L TYPE2-*-NATWID\r
- DC.L BRAN\r
- DC.L TYPE4-*-NATWID\r
-TYPE3 DC.L DROP\r
-TYPE4 DC.L SEMIS\r
+*TYPE DC.L DOCOL,DDUP,ZBRAN\r
+* DC.L TYPE3-*-NATWID\r
+* DC.L OVER,PLUS,SWAP,XDO\r
+*TYPE2 DC.L I,CAT,EMIT,XLOOP\r
+* DC.L TYPE2-*-NATWID\r
+* DC.L BRAN\r
+* DC.L TYPE4-*-NATWID\r
+*TYPE3 DC.L DROP\r
+*TYPE4 DC.L SEMIS\r
+*\r
+TYPE DC.L *+NATWID\r
+ MOVEM.L (PSP)+,D0/A0\r
+ LEA (A0,D0.L),A1\r
+ BRA.S TYPET\r
+TYPEL CLR.L D0\r
+ MOVE.B (A0)+,D0\r
+ MOVE.L D0,-(PSP)\r
+ BSR.W EMIT+NATWID\r
+TYPET CMP.L A0,A1\r
+ BHI.S TYPEL\r
+TYPEX RTS\r
*\r
* ======>> 123 <<\r
* ( strptr count1 --- strptr count2 )\r
DC.L DTRAIL-10-NATWID\r
* PDOTQ DC.L DOCOL,R,TWOP,COUNT,DUP,ONEP\r
* PDOTQ DC.L DOCOL,R,NATP,COUNT,DUP,ONEP\r
-PDOTQ DC.L DOCOL,R ; A5/IP is post-inc.\r
- DC.L COUNT,DUP,ONEP ; There's a count byte, too.\r
- DC.L ZERO,ALGNB,PLUS ; Align the count.\r
- DC.L FROMR,PLUS,TOR ; IP ready to continue after the string.\r
- DC.L TYPE\r
- DC.L BREAK ; DBG *****\r
- DC.L SEMIS\r
+*PDOTQ DC.L DOCOL,R ; A5/IP is post-inc.\r
+* DC.L COUNT,DUP,ONEP ; There's a count byte, too.\r
+* DC.L ZERO,ALGNB,PLUS ; Align the count.\r
+* DC.L FROMR,PLUS,TOR ; IP ready to continue after the string.\r
+* DC.L TYPE\r
+* DC.L BREAK ; DBG *****\r
+* DC.L SEMIS\r
+*\r
+PDOTQ DC.L *+NATWID ; DOCOL\r
+ MOVE.L IP,-(PSP) ; R -- Without DOCOL, IP (post-inc) is where the pointer is.\r
+ BSR.W COUNT+NATWID ; Don't want to break the binding to COUNT.\r
+ MOVEM.L (PSP),D0/A0 ; count and pointer to string, leave ready for TYPE\r
+ ADD.L A0,D0 ; pointer to end of string in D0\r
+ BTST #0,D0 ; Odd?\r
+ BEQ.S PDOTQZ\r
+ ADDQ #1,D0 ; Bump it even.\r
+PDOTQZ MOVE.L D0,IP ; Bump IP over the string.\r
+ BRA.W TYPE ; Tail-call.\r
*\r
* ======>> 125 <<\r
* ( --- ) P\r
* ( anything --- nothing ) ( anything *** nothing )\r
* An indirection for ABORT, for ERROR,\r
* which may be modified carefully.\r
+* We are now using PABORT for what it probably was originally intended --\r
+* a way to break the dependency cycle in ERROR.\r
EVEN\r
DC.B $87\r
DC.B '(ABORT' ; '(ABORT)'\r
DC.B ')'|$80\r
DC.L DFIND-6-NATWID\r
-PABORT DC.L DOCOL,ABORT\r
- DC.L SEMIS\r
+*PABORT DC.L DOCOL,ABORT\r
+* DC.L SEMIS\r
+PABORT DC.L *+NATWID\r
+ MOVE.L #ABORT+NATWID,IP\r
+ BRA.W NEXT ; Don't even return.\r
*\r
* ======>> 143 <<\r
* ERROR ( anything line --- IN BLK ) ( anything *** nothing )\r
DC.B 'ERRO' ; 'ERROR'\r
DC.B 'R'|$80\r
DC.L PABORT-8-NATWID\r
-* This really should not be high level, according to best practices.\r
-* But fixing that cascades through MESSAGE,\r
-* requiring re-architecting the disk block system.\r
-* First, we need to get this transliteration running.\r
+* It's time to make this low-level.\r
ERROR DC.L DOCOL,WARN,AT,ZLESS\r
DC.L ZBRAN\r
DC.L ERROR2-*-NATWID\r
DC.B 'S->' ; 'S->D'\r
DC.B 'D'|$80\r
DC.L COLD-5-NATWID ; Note that this does not link to FORTH (RFORTH)!\r
-STOD DC.L DOCOL,DUP,ZLESS,MINUS\r
- DC.L SEMIS\r
-\r
+*STOD DC.L DOCOL,DUP,ZLESS,MINUS\r
+* DC.L SEMIS\r
+STOD DC.L *+NATWID ; Make it directly callable.\r
+ TST.L (PSP)\r
+ SMI D0\r
+ AND.L #1,D0\r
+ NEG.L D0\r
+ MOVE.L D0,-(PSP)\r
+ RTS\r
+* CLR.L D0\r
+* TST.L (PSP)\r
+* BMI.S STODS\r
+* COM.L D0\r
+*STODS MOVE.L D0,-(PSP)\r
+* RTS\r
\r
+* TST.L (PSP)\r
+* SMI D0\r
+* AND.L #1,D0\r
+* NEG.L D0\r
+* MOVE.L D0,-(PSP)\r
+* RTS\r
*\r
* ======>> 159 <<\r
* ( multiplier multiplicand --- product )\r
* ALIGN 256 ; want to do this, but the ATARI CNOP directive doesn't look standard to me.\r
*\r
* substitute for disc mass memory\r
-NBLK EQU 4 ; # of disc buffer blocks for virtual memory\r
-* Should NBLK be SCRSZ/SECTSZ? maybe not.\r
-* each block is SECTSZ+SECTRL bytes in size,\r
-* holding SECTSZ characters\r
-SECTSZ EQU 256\r
-SECTRL EQU 2*NATWID ; Currently held sector number, etc.\r
-BUFSZ EQU (SECTSZ+SECTRL)*NBLK\r
+*NBLK EQU 4 ; # of disc buffer blocks for virtual memory, must be defined before using\r
+* Should NBLK be SCRSZ/RBLKSZ? maybe not.\r
+* each block is RBLKSZ+SECTRL bytes in size,\r
+* holding RBLKSZ characters\r
+*RBLKSZ EQU 256 ; must be defined before using.\r
+*SECTRL EQU 2*NATWID ; Currently held sector number, etc., define with rest\r
+*BUFSZ EQU (RBLKSZ+SECTRL)*NBLK ; define with rest\r
*\r
BUFBAS DS.L BUFSZ\r
* This is a really awkward place to define the disk buffer records.\r
* *BUG* SECTRL was magic-number hard-wired into several definitions.\r
* It will take a bit of work to ferret them out.\r
* It is too small, and it should not be hard-wired.\r
-* SECTSZ was also magic-number hard-wired into several definitions,\r
+* RBLKSZ was also magic-number hard-wired into several definitions,\r
* will I find them all?\r
DC.L 0,0,0,0,0,0,0,0 ; put a little space between\r
* ALIGN 256 ; Again, I want to, but ...\r
MEMEND EQU *\r
*\r
-SCRSZ EQU 1024\r
+*SCRSZ EQU 1024 ; must be defined before using\r
*\r
* FIRST\r
*\r