1
Fork 0
mirror of https://git.savannah.gnu.org/git/guile.git synced 2025-05-18 18:40:22 +02:00

Fixed the compiler, got the disassembler working.

* doc/guile-vm.texi:  Texified and cleaned up.
* src/vm.c:  Use `scm_from_locale_string ()' instead of `scm_makfrom0str ()'.
* src/vm_engine.c:  Likewise.
* src/programs.c (scm_program_bytecode):  Return a u8vector instead of a string.
* module/system/vm/conv.scm (make-byte-decoder):  Fixed a few things wrt. to
  the string to u8vector transition.
* src/objcodes.c (bytecode->objcode):  Fixed a bug where the last 10 bytes of
  the bytecode where ignored.
* module/system/vm/assemble.scm (dump-object!):  Don't convert everything
  to a u8vector, keep strings where it makes sense.
* module/system/vm/conv.scm (code->bytes):  Accordingly, convert strings to
  u8vectors when needed.
  (make-byte-decoder):  Accordingly too, when decoding instructions, return
  variable-length instructions' argument as strings except for `load-program'.
* module/system/vm/disasm.scm:  Export `disassemble-bytecode'.

git-archimport-id: lcourtes@laas.fr--2004-libre/guile-vm--revival--0.6--patch-4
This commit is contained in:
Ludovic Court`es 2005-04-27 09:36:52 +00:00 committed by Ludovic Courtès
parent 054599f117
commit fa19602c28
9 changed files with 223 additions and 135 deletions

View file

@ -6,9 +6,9 @@
@setchapternewpage odd
@c %**end of header
@set EDITION 0.3
@set VERSION 0.3
@set UPDATED 2000-08-22
@set EDITION 0.6
@set VERSION 0.6
@set UPDATED 2005-04-26
@ifinfo
@dircategory Scheme Programming
@ -79,10 +79,14 @@ approved by the Free Software Foundation.
This document corresponds to Guile VM @value{VERSION}.
@menu
* Introduction::
* Variable Management::
* Program Execution::
* Instruction Set::
@end menu
@c *********************************************************************
@node Introduction, Getting Started, Top, Top
@node Introduction, Variable Management, Top, Top
@chapter What is Guile VM?
A Guile VM has a set of registers and its own stack memory. Guile may
@ -90,33 +94,35 @@ have more than one VM's. Each VM may execute at most one program at a
time. Guile VM is a CISC system so designed as to execute Scheme and
other languages efficiently.
@unnumberdsubsec Registers
@unnumberedsubsec Registers
pc - Program counter ;; ip (instruction poiner) is better?
sp - Stack pointer
bp - Base pointer
ac - Accumulator
@itemize
@item pc - Program counter ;; ip (instruction poiner) is better?
@item sp - Stack pointer
@item bp - Base pointer
@item ac - Accumulator
@end itemize
@unnumberdsubsec Engine
@unnumberedsubsec Engine
A VM may have one of three engines: reckless, regular, or debugging.
Reckless engine is fastest but dangerous. Regular engine is normally
fail-safe and reasonably fast. Debugging engine is safest and
functional but very slow.
@unnumberdsubsec Memory
@unnumberedsubsec Memory
Stack is the only memory that each VM owns. The other memory is shared
memory that is shared among every VM and other part of Guile.
@unnumberdsubsec Program
@unnumberedsubsec Program
A VM program consists of a bytecode that is executed and an environment
in which execution is done. Each program is allocated in the shared
memory and may be executed by any VM. A program may call other programs
within a VM.
@unnumberdsubsec Instruction
@unnumberedsubsec Instruction
Guile VM has dozens of system instructions and (possibly) hundreds of
functional instructions. Some Scheme procedures such as cons and car
@ -129,18 +135,19 @@ Most instructions deal with the accumulator (ac). The VM stores all
results from functions in ac, instead of pushing them into the stack.
I'm not sure whether this is a good thing or not.
@node Variable Management
@node Variable Management, Program Execution, Introduction, Top
@chapter Variable Management
A program may have access to local variables, external variables, and
top-level variables.
** Local/external variables
@section Local/external variables
A stack is logically divided into several blocks during execution. A
"block" is such a unit that maintains local variables and dynamic chain.
A "frame" is an upper level unit that maintains subprogram calls.
@example
Stack
dynamic | | | |
chain +==========+ - =
@ -159,9 +166,11 @@ A "frame" is an upper level unit that maintains subprogram calls.
/|frame data| | |
| +----------+ - |
| | | | |
@end example
The first block of each frame may look like this:
@example
Address Data
------- ----
xxx0028 Local variable 2
@ -172,6 +181,7 @@ The first block of each frame may look like this:
xxx0014 Stack pointer (block data)
xxx0010 Return address (frame data)
xxx000c Parent program (frame data)
@end example
The base pointer (bp) always points to the lowest address of local
variables of the recent block. Local variables are referred as "bp[n]".
@ -185,6 +195,7 @@ The external link field of a block has a pointer to such a variable set,
which I call "fragment" (what should I call?). A fragment has a set of
variables and its own chain.
@example
local external
chain| | chain
| +-----+ .--------, |
@ -195,6 +206,7 @@ variables and its own chain.
`-|block|---->|external|-'
+-----+ `--------'
| |
@end example
An external variable is referred as "bp[-2]->variables[n]" or
"bp[-2]->link->...->variables[n]". This is also represented by a pair
@ -204,19 +216,21 @@ current environment of a program.
Other data fields are described later.
** Top-level variables
@section Top-level variables
Guile VM uses the same top-level variables as the regular Guile. A
program may have direct access to vcells. Currently this is done by
calling scm_intern0, but a program is possible to have any top-level
environment defined by the current module.
*** Scheme and VM variable
@section Scheme and VM variable
Let's think about the following Scheme code as an example:
@example
(define (foo a)
(lambda (b) (list foo a b)))
@end example
In the lambda expression, "foo" is a top-level variable, "a" is an
external variable, and "b" is a local variable.
@ -228,24 +242,28 @@ creates a subprogram (closure), associating the fragment with the
subprogram as its external environment. When the closure is executed,
its environment will look like this:
@example
block Top-level: foo
+-------------+
|local var: b | fragment
+-------------+ .-----------,
|external link|---->|variable: a|
+-------------+ `-----------'
@end example
The fragment remains as long as the closure exists.
** Addressing mode
@section Addressing mode
Guile VM has five addressing modes:
o Real address
o Local position
o External position
o Top-level location
o Constant object
@itemize
@item Real address
@item Local position
@item External position
@item Top-level location
@item Constant object
@end itemize
Real address points to the address in the real program and is only used
with the program counter (pc).
@ -262,50 +280,56 @@ object directly.
[ We'll also need dynamic scope addressing to support Emacs Lisp? ]
*** At a Glance
@unnumberedsubsec At a Glance
Guile VM has a set of instructions for each instruction family. `%load'
is, for example, a family to load an object from memory and set the
accumulator (ac). There are four basic `%load' instructions:
@example
%loadl - Local addressing
%loade - External addressing
%loadt - Top-level addressing
%loadi - Immediate addressing
@end example
A possible program code may look like this:
@example
%loadl (0 . 1) ; ac = local[0][1]
%loade (2 . 3) ; ac = external[2][3]
%loadt (foo . #<undefined>) ; ac = #<undefined>
%loadi "hello" ; ac = "hello"
@end example
One instruction that uses real addressing is `%jump', which changes the
value of the program counter:
@example
%jump 0x80234ab8 ; pc = 0x80234ab8
@end example
* Program Execution
@node Program Execution, Instruction Set, Variable Management, Top
@chapter Program Execution
Overall procedure:
1. A source program is compiled into a bytecode.
2. A bytecode is given an environment and becomes a program.
3. A VM starts execution, creating a frame for it.
4. Whenever a program calls a subprogram, a new frame is created for it.
5. When a program finishes execution, it returns a value, and the VM
@enumerate
@item A source program is compiled into a bytecode.
@item A bytecode is given an environment and becomes a program.
@item A VM starts execution, creating a frame for it.
@item Whenever a program calls a subprogram, a new frame is created for it.
@item When a program finishes execution, it returns a value, and the VM
continues execution of the parent program.
@item When all programs terminated, the VM returns the final value and stops.
@end enumerate
6. When all programs terminated, the VM returns the final value and stops.
** Environment
@section Environment
Local variable:
@example
(let ((a 1) (b 2) (c 3)) (+ a b c)) ->
%pushi 1 ; a
@ -317,9 +341,11 @@ Local variable:
%pushl (0 . 2) ; local variable c
add 3 ; ac = a + b + c
%unbind ; remove local bindings
@end example
External variable:
@example
(define foo (let ((n 0)) (lambda () n)))
%pushi 0 ; n
@ -335,15 +361,19 @@ External variable:
%call 0 ; change the current external link
%loade (0 . 0) ; external variable n
%return ; recover the external link
@end example
Top-level variable:
@example
foo ->
%loadt (foo . #<program xxx>) ; top-level variable foo
@end example
** Flow control
@section Flow control
@example
(if #t 1 0) ->
%loadi #t
@ -352,11 +382,13 @@ Top-level variable:
%jump L2
L1: %loadi 0
L2:
@end example
** Function call
@section Function call
Builtin function:
@example
(1+ 2) ->
%loadi 2 ; ac = 2
@ -374,9 +406,11 @@ Builtin function:
%pushi 2 ; 2 -> stack
%pushi 3 ; 3 -> stack
add 3 ; many argument
@end example
External function:
@example
(version) ->
%func0 (version . #<primitive-procedure version>) ; no argument
@ -399,9 +433,11 @@ External function:
%pushi 3
%loadi 3 ; the number of arguments
%func (equal . #<primitive-procedure equal>) ; many arguments
@end example
** Subprogram call
@section Subprogram call
@example
(define (plus a b) (+ a b))
(plus 1 2) ->
@ -413,8 +449,10 @@ External function:
%loadl (0 . 1) ; argument 2
add2 ; ac = 1 + 2
%return ; result is 3
@end example
* Instruction Set
@node Instruction Set, , Program Execution, Top
@chapter Instruction Set
The Guile VM instruction set is roughly divided two groups: system
instructions and functional instructions. System instructions control
@ -422,79 +460,97 @@ the execution of programs, while functional instructions provide many
useful calculations. By convention, system instructions begin with a
letter `%'.
** Environment control instructions
@section Environment control instructions
- %alloc
- %bind
- %export
- %unbind
@itemize
@item %alloc
@item %bind
@item %export
@item %unbind
@end itemize
** Subprogram control instructions
@section Subprogram control instructions
- %make-program
- %call
- %return
@itemize
@item %make-program
@item %call
@item %return
@end itemize
** Data control instructinos
@section Data control instructinos
- %push
- %pushi
- %pushl, %pushl:0:0, %pushl:0:1, %pushl:0:2, %pushl:0:3
- %pushe, %pushe:0:0, %pushe:0:1, %pushe:0:2, %pushe:0:3
- %pusht
@itemize
@item %push
@item %pushi
@item %pushl, %pushl:0:0, %pushl:0:1, %pushl:0:2, %pushl:0:3
@item %pushe, %pushe:0:0, %pushe:0:1, %pushe:0:2, %pushe:0:3
@item %pusht
@end itemize
- %loadi
- %loadl, %loadl:0:0, %loadl:0:1, %loadl:0:2, %loadl:0:3
- %loade, %loade:0:0, %loade:0:1, %loade:0:2, %loade:0:3
- %loadt
@itemize
@item %loadi
@item %loadl, %loadl:0:0, %loadl:0:1, %loadl:0:2, %loadl:0:3
@item %loade, %loade:0:0, %loade:0:1, %loade:0:2, %loade:0:3
@item %loadt
@end itemize
- %savei
- %savel, %savel:0:0, %savel:0:1, %savel:0:2, %savel:0:3
- %savee, %savee:0:0, %savee:0:1, %savee:0:2, %savee:0:3
- %savet
@itemize
@item %savei
@item %savel, %savel:0:0, %savel:0:1, %savel:0:2, %savel:0:3
@item %savee, %savee:0:0, %savee:0:1, %savee:0:2, %savee:0:3
@item %savet
@end itemize
** Flow control instructions
@section Flow control instructions
- %br-if
- %br-if-not
- %jump
@itemize
@item %br-if
@item %br-if-not
@item %jump
@end itemize
** Function call instructions
@section Function call instructions
- %func, %func0, %func1, %func2
@itemize
@item %func, %func0, %func1, %func2
@end itemize
** Scheme buitin functions
@section Scheme built-in functions
- cons
- car
- cdr
@itemize
@item cons
@item car
@item cdr
@end itemize
** Mathematical buitin functions
@section Mathematical buitin functions
- 1+
- 1-
- add, add2
- sub, sub2, minus
- mul2
- div2
- lt2
- gt2
- le2
- ge2
- num-eq2
@itemize
@item 1+
@item 1-
@item add, add2
@item sub, sub2, minus
@item mul2
@item div2
@item lt2
@item gt2
@item le2
@item ge2
@item num-eq2
@end itemize
@c *********************************************************************
@node Concept Index, Command Index, Related Information, Top
@unnumbered Concept Index
@printindex cp
@c @node Concept Index, Command Index, Related Information, Top
@c @unnumbered Concept Index
@c @printindex cp
@node Command Index, Variable Index, Concept Index, Top
@unnumbered Command Index
@printindex fn
@c @node Command Index, Variable Index, Concept Index, Top
@c @unnumbered Command Index
@c @printindex fn
@node Variable Index, , Command Index, Top
@unnumbered Variable Index
@printindex vr
@c @node Variable Index, , Command Index, Top
@c @unnumbered Variable Index
@c @printindex vr
@bye

View file

@ -77,6 +77,7 @@
(label-alist '())
(object-alist '()))
(define (push-code! code)
(format #t "push-code! ~a~%" code)
(set! stack (cons (code->bytes code) stack)))
(define (push-object! x)
(cond ((object->code x) => push-code!)
@ -90,7 +91,7 @@
(push-code! `(object-ref ,i))))))
(define (current-address)
(define (byte-length x)
(cond ((string? x) (u8vector-length x))
(cond ((u8vector? x) (u8vector-length x))
(else 3)))
(apply + (map byte-length stack)))
(define (generate-code x)
@ -167,6 +168,7 @@
;;
;; main
(for-each generate-code body)
(format #t "codegen: stack = ~a~%" (reverse stack))
(let ((bytes (stack->bytes (reverse! stack) label-alist)))
(if toplevel
(bytecode->objcode bytes vars.nlocs vars.nexts)
@ -211,18 +213,6 @@
;; NOTE: undumpped in vm_load.c.
(define (dump-object! push-code! x)
(define (symbol->u8vector sym)
(apply u8vector
(map char->integer
(string->list (symbol->string sym)))))
(define (number->u8vector num)
(apply u8vector
(map char->integer
(string->list (number->string num)))))
(define (string->u8vector str)
(apply u8vector
(map char->integer (string->list str))))
(let dump! ((x x))
(cond
((object->code x) => push-code!)
@ -256,7 +246,7 @@
(push-code! `(load-program ,bytes)))
(($ <vlink> module name)
;; FIXME: dump module
(push-code! `(link ,(symbol->u8vector name))))
(push-code! `(link ,(symbol->string name))))
(($ <vmod> id)
(push-code! `(load-module ,id)))
((and ($ integer) ($ exact))
@ -266,14 +256,14 @@
(apply u8vector l)))))
(push-code! `(load-integer ,str))))
(($ number)
(push-code! `(load-number ,(number->u8vector x))))
(push-code! `(load-number ,(number->string x))))
(($ string)
(push-code! `(load-string ,(string->u8vector x))))
(push-code! `(load-string ,(string->string x))))
(($ symbol)
(push-code! `(load-symbol ,(symbol->u8vector x))))
(push-code! `(load-symbol ,(symbol->string x))))
(($ keyword)
(push-code! `(load-keyword
,(symbol->u8vector (keyword-dash-symbol x)))))
,(symbol->string (keyword-dash-symbol x)))))
(($ list)
(for-each dump! x)
(push-code! `(list ,(length x))))

View file

@ -24,6 +24,7 @@
:use-module (ice-9 match)
:use-module (ice-9 regex)
:use-module (srfi srfi-4)
:use-module (srfi srfi-1)
:export (code-pack code-unpack object->code code->object code->bytes
make-byte-decoder))
@ -86,7 +87,18 @@
(('load-keyword s) (make-keyword-from-dash-symbol (string->symbol s)))
(else #f)))
; (let ((c->o code->object))
; (set! code->object
; (lambda (code)
; (format #t "code->object: ~a~%" code)
; (let ((ret (c->o code)))
; (format #t "code->object returned ~a~%" ret)
; ret))))
(define (code->bytes code)
(define (string->u8vector str)
(apply u8vector (map char->integer (string->list str))))
(let* ((code (code-pack code))
(inst (car code))
(rest (cdr code))
@ -95,6 +107,8 @@
(cond ((< len 0)
;; Variable-length code
;; Typical instructions are `link' and `load-program'.
(if (string? (car rest))
(set-car! rest (string->u8vector (car rest))))
(let* ((str (car rest))
(str-len (u8vector-length str))
(encoded-len (encode-length str-len))
@ -121,9 +135,11 @@
(define (make-byte-decoder bytes)
(let ((addr 0) (size (u8vector-length bytes)))
(define (pop)
(let ((byte (char->integer (u8vector-ref bytes addr))))
(let ((byte (u8vector-ref bytes addr)))
(set! addr (1+ addr))
byte))
(define (sublist lst start end)
(take (drop lst start) (- end start)))
(lambda ()
(if (< addr size)
(let* ((start addr)
@ -132,12 +148,16 @@
(code (if (< n 0)
;; variable length
(let* ((end (+ (decode-length pop) addr))
(str (apply u8vector
(list-tail (u8vector->list
bytes)
addr))))
(subbytes (sublist
(u8vector->list bytes)
addr end))
(->string? (not (eq? inst 'load-program))))
(set! addr end)
(list inst str))
(list inst
(if ->string?
(list->string
(map integer->char subbytes))
(apply u8vector subbytes))))
;; fixed length
(do ((n n (1- n))
(l '() (cons (pop) l)))

View file

@ -27,7 +27,7 @@
:use-module (ice-9 format)
:use-module (ice-9 receive)
:use-module (ice-9 and-let-star)
:export (disassemble-objcode disassemble-program))
:export (disassemble-objcode disassemble-program disassemble-bytecode))
(define (disassemble-objcode objcode . opts)
(let* ((prog (objcode->program objcode))
@ -129,6 +129,15 @@
(define (list->info list)
(object->string list))
; (define (u8vector->string vec)
; (list->string (map integer->char (u8vector->list vec))))
; (case (car list)
; ((link)
; (object->string `(link ,(u8vector->string (cadr list)))))
; (else
; (object->string list))))
(define (print-info addr info extra)
(if extra
(format #t "~4@A ~32A;; ~A\n" addr info extra)

View file

@ -87,7 +87,7 @@ SCM_DEFINE (scm_instruction_list, "instruction-list", 0, 0, 0,
SCM list = SCM_EOL;
struct scm_instruction *ip;
for (ip = scm_instruction_table; ip->opcode != scm_op_last; ip++)
list = scm_cons (scm_str2symbol (ip->name), list);
list = scm_cons (scm_from_locale_symbol (ip->name), list);
return scm_reverse_x (list, SCM_EOL);
}
#undef FUNC_NAME
@ -150,7 +150,7 @@ SCM_DEFINE (scm_opcode_to_instruction, "opcode->instruction", 1, 0, 0,
SCM_VALIDATE_INUM (1, op);
i = scm_to_int (op);
SCM_ASSERT_RANGE (1, op, 0 <= i && i < scm_op_last);
return scm_str2symbol (scm_instruction_table[i].name);
return scm_from_locale_symbol (scm_instruction_table[i].name);
}
#undef FUNC_NAME

View file

@ -153,6 +153,8 @@ SCM_DEFINE (scm_bytecode_to_objcode, "bytecode->objcode", 3, 0, 0,
c_bytecode = scm_u8vector_elements (bytecode, &handle, &size, &increment);
assert (increment == 1);
/* Account for the 10 byte-long header. */
size += 10;
objcode = make_objcode (size);
base = SCM_OBJCODE_BASE (objcode);

View file

@ -188,15 +188,26 @@ SCM_DEFINE (scm_program_external, "program-external", 1, 0, 0,
SCM_DEFINE (scm_program_bytecode, "program-bytecode", 1, 0, 0,
(SCM program),
"")
"Return a u8vector containing @var{program}'s bytecode.")
#define FUNC_NAME s_scm_program_bytecode
{
size_t size;
char *c_bytecode;
SCM_VALIDATE_PROGRAM (1, program);
return scm_makfromstr (SCM_PROGRAM_DATA (program)->base,
SCM_PROGRAM_DATA (program)->size, 0);
size = SCM_PROGRAM_DATA (program)->size;
c_bytecode = malloc (size);
if (!c_bytecode)
return SCM_BOOL_F;
memcpy (c_bytecode, SCM_PROGRAM_DATA (program)->base, size);
return scm_take_u8vector (c_bytecode, size);
}
#undef FUNC_NAME
void
scm_init_programs (void)

View file

@ -301,7 +301,7 @@ SCM_DEFINE (scm_vm_version, "vm-version", 0, 0, 0,
"")
#define FUNC_NAME s_scm_vm_version
{
return scm_makfrom0str (VERSION);
return scm_from_locale_string (VERSION);
}
#undef FUNC_NAME

View file

@ -123,44 +123,44 @@ vm_run (SCM vm, SCM program, SCM args)
/* Errors */
{
vm_error_unbound:
err_msg = scm_makfrom0str ("VM: Unbound variable: ~A");
err_msg = scm_from_locale_string ("VM: Unbound variable: ~A");
goto vm_error;
vm_error_wrong_type_arg:
err_msg = scm_makfrom0str ("VM: Wrong type argument");
err_msg = scm_from_locale_string ("VM: Wrong type argument");
err_args = SCM_EOL;
goto vm_error;
vm_error_wrong_num_args:
err_msg = scm_makfrom0str ("VM: Wrong number of arguments");
err_msg = scm_from_locale_string ("VM: Wrong number of arguments");
err_args = SCM_EOL;
goto vm_error;
vm_error_wrong_type_apply:
err_msg = scm_makfrom0str ("VM: Wrong type to apply: ~S");
err_msg = scm_from_locale_string ("VM: Wrong type to apply: ~S");
err_args = SCM_LIST1 (program);
goto vm_error;
vm_error_stack_overflow:
err_msg = scm_makfrom0str ("VM: Stack overflow");
err_msg = scm_from_locale_string ("VM: Stack overflow");
err_args = SCM_EOL;
goto vm_error;
vm_error_stack_underflow:
err_msg = scm_makfrom0str ("VM: Stack underflow");
err_msg = scm_from_locale_string ("VM: Stack underflow");
err_args = SCM_EOL;
goto vm_error;
#if VM_CHECK_IP
vm_error_invalid_address:
err_msg = scm_makfrom0str ("VM: Invalid program address");
err_msg = scm_from_locale_string ("VM: Invalid program address");
err_args = SCM_EOL;
goto vm_error;
#endif
#if VM_CHECK_EXTERNAL
vm_error_external:
err_msg = scm_makfrom0str ("VM: Invalid external access");
err_msg = scm_from_locale_string ("VM: Invalid external access");
err_args = SCM_EOL;
goto vm_error;
#endif