The ForthVM is an attempt at making a very small compiled form of forth that is still easily portable to different machines. It exists as an inbetween for forth code and assembler code. A quick opcode list follows: 00 dup
01 drop
02 swap
03 over
04 r>
05 >r
06 +
07 -
08 *
09 /
0A %
0B @
0C !
0D c@
0E c!
0F ,
10 c,
11 ~
12 &
13 |
14 x|
15 >>
16 <<
17 <
18 >
19 =
1A <>
1B 0;
1C inc
1D dec
1E key
1F emit
20 .
21 ret
22 cname
23 name (1-byte number follows)
24 lit (4-byte number follows)
25 resb (4-byte count follows)
26 call (4-byte address follows)
27 jmp (4-byte address follows)
28 ?call (4-byte address follows)
29 ?jmp (4-byte address follows)
With this opcode list, one could easily rewrite regular forth code into a form easily translated into bytecode (and indeed, this is essentially what a forth->bytecode compiler would do). The existence of 0; is explicitly for looping structures (if tos=0, exit word, else do nothing), and as you can see, it reduces code length by a bit. You'll notice that there are no looping words. Commonly looping words must be added into a forth system at the CODE word level. By making extensive use of recursion and iteration, the ForthVM removes the need for looping constructs. This puts the bytecode a bit closer to assembler level than forth level, pushing more of the work onto the forth to bytecode translation than the bytecode to native code translation.
Here's a sample: : ^ 1 swap for over * next swap drop ;
5 3 ^ .
: ^ 1 swap ^f swap drop ;
: ^f dup 0; >r over * r> dec ^f ;
5 3 ^ .
This code would end up being compiled like this: 0: lit 5 lit 3 call 1 .
1: lit 1 swap call 2 swap drop ret
2: dup 0; >r over * r> dec jmp 2
Note that in line 1, there is a jump at the end, returning to line 1. How does it return to 0 then? Remember that 0; itself is an implied ret :). ForthVM will compile any calls to words as jumps if the call is at the end of a word definition. Generally it's safe because whatever word you're jumping to is going to either return (which will exit back to the word that called the parent word), or it will also jump to another word and in turn will wait for a word down the line to return. This decreases the amount of activity on the return stack, the amount of memory used by the return stack, and eliminates unnecessary recursion, which should greatly increase performance.
Notes on memory allocation
Originally, ForthVM had a system where 'create...allot' structures would be factored out and set aside as separate memory, and the 'create...,' structures would be left in the heap. This caused a problem (due to my lack of foresight) where there was no reference to the create'd word (doh!). So, the new method is to do everything in the heap, and add a new opcode that creates a named entry (this is separate from resb to avoid having to create names for memory areas that don't need to be named). We're presented with a new problem though: what do we do when we run out of memory for the heap? Well, basically my idea is to allocate the heap in blocks, and when you run out of memory in a block, allocate a new block, copy the contents over from the last created word, and change the named word reference. This may prove to be inflexible later though as you cannot allocate memory larger than a single block, but at the same time, using large blocks wastes memory. This will probably change as I change my mind =p.
Notes on FORTH->ForthVM translation
Since ForthVM has no looping structures, all looping words are extracted out as separate words (which is evident in the example above). Create statements outside of words become part of the data section (in the case of an allot that is. in the case of using commas, the memory will still be allocated, but the code section will begin with code to populate the memory area. this again may change in future versions). Jmps will be compiled when tail recursion is detected (when a word calls itself again just before a ; ). This will save on time, space, and resource usage. I haven't yet decided if the translator will churn out the simplified source code along with the bytecode or not. I'm in favor of it because it means I can write only a few small programs to get all the work done (namely Forth->simplified forth, simplified forth->bytecode, and bytecode->executable).
The translator will make 2 passes to compile into ForthVM's bytecode. On the first pass, all the data section will be laid out, as well as all the code. Instead of memory locations for calls and jumps though, ID's will be inserted for the word names. On the second pass, these IDs are turned into concrete references in the code.
There are pretty solid transformations that take place to convert regular forth code to ForthVM. Here's how ForthVM will compile various different types of conditionals: : word 2dup > if do-x else do-y then do-z ;
0: over over > ?jmp 1 jmp 2
1: do-x jmp 3
2: do-y jmp 3
3: do-z ret
: word 2dup > if do-x then do-y ;
0: over over > ?call 1 do-y ret
1: do-x ret
: word 2dup > if do-x ;; then do-y ;
0: over over > ?jmp 1 do-y ret
1: do-x ret
: word 2dup > if do-x else do-y then ;
0: over over > ?jmp 1 jmp 2
1: do-x ret
2: do-y ret
In the event that there're multiple things after a then statement, ForthVM will put them into a single word, and jump to it from after then. Here's an example of the for/next, repeat/until, and repeat/again loops: : word do-x for do-y next do-z ;
0: do-x call 1 do-z
1: dup 0; >r do-y r> dec jmp 1
: word do-x repeat do-y until do-z ;
0: do-x call 1 do-z
1: do-y dup 0; dec jmp 1
: word do-x repeat do-y again do-z ;
0: do-x call 1 do-z
1: do-y jmp 1
Notes on ForthVM style
Generally speaking, ?call/?jmp is only used when you really have to test for a specific inequality. Programming using 0; is almost always preferable as it takes up far less space in your code and doesn't result in extra words. Personally I find ForthVM very comfortable coming from a retroforth background, but I know others will not be as comfortable. It takes some getting used to, and the Forth->ForthVM translator should let you do most everything you normally do.
ForthVM will translate your code into fairly reasonable bytecode. However, it will not factor common parts of your code automatically. It will (given the appropriate option to flatten code) inline a lot of code, and remove any calls and jumps that it sees as unnecessary. The flattened code when decompiled however will be forth code, but will not be even remotely readable. I suggest that you flatten the code only after proper debugging has been done. |