Update "Variables and the VM"

* doc/ref/vm.texi (Variables and the VM): Update.
2025-07-12 12:10:30 +02:00 · 2018-09-28 12:16:39 +02:00 · 2018-09-28 12:16:39 +02:00 · 4c53593bbe
commit 4c53593bbe
parent 5e671cea02
1 changed files with 110 additions and 53 deletions
--- a/doc/ref/vm.texi
+++ b/doc/ref/vm.texi
@@ -240,7 +240,7 @@ Consider the following Scheme code as an example:
@example
  (define (foo a)
-    (lambda (b) (list foo a b)))
+    (lambda (b) (vector foo a b)))
@end example
 Within the lambda expression, @code{foo} is a top-level variable,
@@ -307,50 +307,107 @@ segment that contains the compiled bytecode, and accessed directly by
 the bytecode.
 Another use for statically allocated data is to serve as a cache for a
-bytecode.  Top-level variable lookups are handled in this way.  If the
+bytecode.  Top-level variable lookups are handled in this way; the first
-@code{toplevel-box} instruction finds that it does not have a cached
+time a top-level binding is referenced, the resolved variable will be
-variable for a top-level reference, it accesses other static data to
+stored in a cache.  Thereafter all access to the variable goes through
-resolve the reference, and fills in the cache slot.  Thereafter all
+the cache cell.  The variable's value may change in the future, but the
-access to the variable goes through the cache cell.  The variable's
+variable itself will not.
-value may change in the future, but the variable itself will not.
 We can see how these concepts tie together by disassembling the
@code{foo} function we defined earlier to see what is going on:
@smallexample
-scheme@@(guile-user)> (define (foo a) (lambda (b) (list foo a b)))
+scheme@@(guile-user)> (define (foo a) (lambda (b) (vector foo a b)))
 scheme@@(guile-user)> ,x foo
-Disassembly of #<procedure foo (a)> at #xea4ce4:
+Disassembly of #<procedure foo (a)> at #xf1da30:
-   0    (assert-nargs-ee/locals 2 0)    ;; 2 slots (1 arg)    at (unknown file):1:0
+   0    (instrument-entry 164)                                at (unknown file):5:0
-   1    (make-closure 1 7 1)            ;; anonymous procedure at #xea4d04 (1 free var)
+   2    (assert-nargs-ee/locals 2 1)    ;; 3 slots (1 arg)
-   4    (free-set! 1 0 0)               ;; free var 0
+   3    (allocate-words/immediate 2 3)                        at (unknown file):5:16
-   6    (mov 0 1)
+   4    (load-u64 0 0 65605)
-   7    (return-values 2)               ;; 1 value
+   7    (word-set!/immediate 2 0 0)
   8    (load-label 0 7)                ;; anonymous procedure at #xf1da6c
  10    (word-set!/immediate 2 1 0)
  11    (scm-set!/immediate 2 2 1)
  12    (reset-frame 1)                 ;; 1 slot
  13    (handle-interrupts)
  14    (return-values)
 ----------------------------------------
-Disassembly of anonymous procedure at #xea4d04:
+Disassembly of anonymous procedure at #xf1da6c:
-   0    (assert-nargs-ee/locals 2 2)    ;; 4 slots (1 arg)    at (unknown file):1:16
+   0    (instrument-entry 183)                                at (unknown file):5:16
-   1    (toplevel-box 1 74 58 68 #t)    ;; `foo'
+   2    (assert-nargs-ee/locals 2 3)    ;; 5 slots (1 arg)
-   6    (box-ref 1 1)                   
+   3    (static-ref 2 152)              ;; #<variable 112e530 value: #<procedure foo (a)>>
-   7    (make-short-immediate 0 772)    ;; ()                 at (unknown file):1:28
+   5    (immediate-tag=? 2 7 0)         ;; heap-object?
-   8    (cons 2 2 0)                    
+   7    (je 19)                         ;; -> L2
-   9    (free-ref 3 3 0)                ;; free var 0
+   8    (static-ref 2 119)              ;; #<directory (guile-user) ca9750>
-  11    (cons 3 3 2)
+  10    (static-ref 1 127)              ;; foo
-  12    (cons 2 1 3)
+  12    (call-scm<-scm-scm 2 2 1 40)
-  13    (return-values 2)               ;; 1 value
+  14    (immediate-tag=? 2 7 0)         ;; heap-object?
  16    (jne 8)                         ;; -> L1
  17    (scm-ref/immediate 0 2 1)
  18    (immediate-tag=? 0 4095 2308)   ;; undefined?
  20    (je 4)                          ;; -> L1
  21    (static-set! 2 134)             ;; #<variable 112e530 value: #<procedure foo (a)>>
  23    (j 3)                           ;; -> L2
 L1:
  24    (throw/value 1 151)             ;; #(unbound-variable #f "Unbound variable: ~S")
 L2:
  26    (scm-ref/immediate 2 2 1)
  27    (allocate-words/immediate 1 4)                        at (unknown file):5:28
  28    (load-u64 0 0 781)
  31    (word-set!/immediate 1 0 0)
  32    (scm-set!/immediate 1 1 2)
  33    (scm-ref/immediate 4 4 2)
  34    (scm-set!/immediate 1 2 4)
  35    (scm-set!/immediate 1 3 3)
  36    (mov 4 1)
  37    (reset-frame 1)                 ;; 1 slot
  38    (handle-interrupts)
  39    (return-values)
@end smallexample
-First there's some prelude, where @code{foo} checks that it was called
+The first thing to notice is that the bytecode is at a fairly low level.
-with only 1 argument.  Then at @code{ip} 1, we allocate a new closure
+When a program is compiled from Scheme to bytecode, it is expressed in
-and store it in slot 1, relative to the @code{sp}.
+terms of more primitive operations.  As such, there can be more
 instructions than you might expect.
-At run-time, local variables in Guile are usually addressed relative to
+The first chunk of instructions is the outer @code{foo} procedure.  It
-the stack pointer, which leads to a pleasantly efficient
+is followed by the code for the contained closure.  The code can look
-@code{sp[@var{n}]} access.  However it can make the disassembly hard to
+daunting at first glance, but with practice it quickly becomes
-read, because the @code{sp} can change during the function, and because
+comprehensible, and indeed being able to read bytecode is an important
-incoming arguments are relative to the @code{fp}, not the @code{sp}.
+step to understanding the low-level performance of Guile programs.
 The @code{foo} function begins with a prelude.  The
@code{instrument-entry} bytecode increments a counter associated with
 the function.  If the counter reaches a certain threshold, Guile will
 emit machine code (``JIT-compile'') for @code{foo}.  Emitting machine
 code is fairly cheap but it does take time, so it's not something you
 want to do for every function.  Using a per-function counter and a
 global threshold allows Guile to spend time JIT-compiling only the
 ``hot'' functions.
 Next in the prelude is an argument-checking instruction, which checks
 that it was called with only 1 argument (plus the callee function itself
 makes 2) and then reserves stack space for 1 additional local.
 Then from @code{ip} 3 to 11, we allocate a new closure by allocating a
 three-word object, initializing its first word to store a type tag,
 setting its second word to its code pointer, and finally at @code{ip}
 11, storing local value 1 (the @code{a} argument) into the third word
 (the first free variable).
 Before returning, @code{foo} ``resets the frame'' to hold only one local
 (the return value), runs any pending interrupts (@pxref{Asyncs}) and
 then returns.
 Note that local variables in Guile's virtual machine are usually
 addressed relative to the stack pointer, which leads to a pleasantly
 efficient @code{sp[@var{n}]} access.  However it can make the
 disassembly hard to read, because the @code{sp} can change during the
 function, and because incoming arguments are relative to the @code{fp},
 not the @code{sp}.
 To know what @code{fp}-relative slot corresponds to an
@code{sp}-relative reference, scan up in the disassembly until you get
@@ -362,30 +419,30 @@ Guile doesn't need the value of the closure to compute its result, and
 so slot 0 was free for re-use, in this case for the result of making a
 new closure.
-A closure is code with data.  The @code{6} in the @code{(make-closure 1
+A closure is code with data.  As you can see, making the closure
-6 1)} is a relative offset from the instruction pointer of the code for
+involved making an object (@code{ip} 3), putting a code pointer in it
-the closure, and the final @code{1} indicates that the closure has space
+(@code{ip} 8 and 10), and putting in the closure's free variable
-for 1 free variable.  @code{Ip} 4 initializes free variable 0 in the new
+(@code{ip} 11).
-closure with the value from @code{sp}-relative slot 0, which corresponds
-to @code{fp}-relative slot 1, the first argument of @code{foo}:
-@code{a}.  Finally we return the closure.
 The second stanza disassembles the code for the closure.  After the
-prelude, we load the variable for the toplevel variable @code{foo} into
+prelude, all of the code between @code{ip} 5 and 24 is related to loading
-slot 1.  This lookup occurs lazily, the first time the variable is
+the variable for the toplevel variable @code{foo} into slot 2.
-actually referenced, and the location of the lookup is cached so that
+This lookup happens only once, and is associated with a cache; after the
-future references are very cheap.  @xref{Top-Level Environment
+first run, the value in the cache will be a bound variable, and the code
-Instructions}, for more details.  The @code{box-ref} dereferences the
+will jump from @code{ip} 7 to 26.  On the first run, Guile gets the
-variable cell, replacing the contents of slot 1.
+module associated with the function, calls out to a run-time routine to
 look up the variable, and checks that the variable is bound before
 initializing the cache.  Either way, @code{ip} 26 dereferences the
 variable into local 2.
-What follows is a sequence of conses to build up the result list.
+What follows is the allocation and initialization of the vector return
-@code{Ip} 7 makes the tail of the list.  @code{Ip} 8 conses on the value
+value.  @code{Ip} 27 does the allocation, and the following two
-in slot 2, corresponding to the first argument to the closure: @code{b}.
+instructions initialize the type-and-length tag for the object's first
-@code{Ip} 9 loads free variable 0 of slot 3 -- the procedure being
+word.  @code{Ip} 32 sets word 1 of the object (the first vector slot) to
-called, in @code{fp}-relative slot 0 -- into slot 3, then @code{ip} 11
+the value of @code{foo}; @code{ip} 33 fetches the closure variable for
-conses it onto the list.  Finally we cons the value in slot 1,
+@code{a}, then in @code{ip} 34 stores it in the second vector slot; and
-containing the @code{foo} toplevel, onto the front of the list, and we
+finally, in @code{ip} 35, local @code{b} is stored to the third vector
-return it.
+slot.  This is followed by the return sequence.
@node Object File Format
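
Note on the cached top-level lookup described in the new text: the slow-path/fast-path logic that the closure's bytecode performs before @code{ip} 26 can be modeled roughly in Scheme. This is only a sketch under stated assumptions, not Guile's implementation: the @code{cache} binding and the @code{lookup-cached} procedure are hypothetical names standing in for the static data slot and the inlined bytecode sequence. Only @code{module-variable}, @code{variable-bound?}, @code{variable-ref} and @code{current-module} are existing Guile procedures.

```scheme
;; Hypothetical model of the per-reference cache cell.  In the real
;; bytecode this is a slot in the statically allocated data segment.
(define cache #f)

;; Hypothetical helper mirroring the bytecode's control flow.
(define (lookup-cached module name)
  ;; Fast path: the cache already holds a variable object.
  (or cache
      ;; Slow path: resolve the binding in the module, check that it
      ;; is bound, then fill in the cache for later references.
      (let ((var (module-variable module name)))
        (unless (and var (variable-bound? var))
          (error "Unbound variable:" name))
        (set! cache var)
        var)))

(define foo 42)
(define var (lookup-cached (current-module) 'foo))

;; The value may change later, but the cached variable object does not.
(set! foo 43)
(define same-variable?
  (eq? var (lookup-cached (current-module) 'foo)))
(define current-value (variable-ref var))
```

After the first call, every lookup returns the same variable object, so reading the current value is a single dereference, which is the role @code{scm-ref/immediate} plays at @code{ip} 26 in the listing.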