bye bye

2025-06-25 20:50:31 +02:00 · 2002-03-24 00:44:31 +00:00 · 2002-03-24 00:44:31 +00:00 · fbea34b7cc
commit fbea34b7cc
parent dffe307d60
13 changed files with 0 additions and 1759 deletions
--- a/devel/ChangeLog
+++ b/devel/ChangeLog
@@ -1,58 +0,0 @@
 2001-06-27  Thien-Thi Nguyen  <ttn@revel.glug.org>
 	* README: Remove tasks.text.
 	* tasks.text: Bye bye (contents folded into ../TODO).
 2001-05-08  Martin Grabmueller  <mgrabmue@cs.tu-berlin.de>
 	* modules/module-snippets.texi: Fixed a lot of typos and clarified
 	some points.  Thanks to Neil for the typo+questions patch!
 2001-05-07  Martin Grabmueller  <mgrabmue@cs.tu-berlin.de>
 	* modules/module-snippets.texi: New file, documenting the module
 	system.  Placed in `devel' for review purposes.
 2001-03-16  Martin Grabmueller  <mgrabmue@cs.tu-berlin.de>
 	* modules: New directory.
 	* modules/module-layout.text: New file.
 2000-08-26  Mikael Djurfeldt  <mdj@linnaeus.mit.edu>
 	* strings: New directory.
 	* strings/sharedstr.text (sharedstr.text): New file.
 2000-08-12  Mikael Djurfeldt  <mdj@linnaeus.mit.edu>
 	* translate: New directory.
 	* translate/langtools.text: New file.
 2000-05-30  Mikael Djurfeldt  <mdj@mdj.nada.kth.se>
 	* tasks.text: Use outline-mode.  Added section for tasks in need
 	of attention.
 2000-05-29  Mikael Djurfeldt  <mdj@mdj.nada.kth.se>
 	* tasks.text: New file.
 2000-05-25  Mikael Djurfeldt  <mdj@mdj.nada.kth.se>
 	* README: New file.
 	* build/snarf-macros.text: New file.
 2000-05-20  Mikael Djurfeldt  <mdj@mdj.nada.kth.se>
 	* policy/goals.text, policy/principles.text, policy/plans.text:
 	New files.
 2000-03-21  Mikael Djurfeldt  <mdj@thalamus.nada.kth.se>
 	* policy/names.text: New file.
--- a/devel/README
+++ b/devel/README
@@ -1,13 +0,0 @@
 Directories:
 policy		Guile policy documents
 build		Build/installation process
 string		Strings and characters
 translation	Language translation
 vm		Virtual machines
 vm/ior		Mikael's ideas on a new type of Scheme interpreter
--- a/devel/build/snarf-macros.text
+++ b/devel/build/snarf-macros.text
--- a/devel/modules/module-layout.text
+++ b/devel/modules/module-layout.text
@@ -1,288 +0,0 @@
 Module Layout Proposal
 ======================
 Martin Grabmueller
 <mgrabmue@cs.tu-berlin.de>
 Draft: 2001-03-11
 Version: $Id: module-layout.text,v 1.1 2001-03-16 08:37:37 mgrabmue Exp $
 * Table of contents
 ** Abstract
 ** Overview
 *** What do we have now?
 *** What should we change?
 ** Policy of module separation
 *** Functionality
 *** Standards
 *** Importance
 *** Compatibility
 ** Module naming
 *** Scheme
 *** Object oriented programming
 *** Systems programming
 *** Database programming
 *** Text processing
 *** Math programming
 *** Network programming
 *** Graphics
 *** GTK+ programming
 *** X programming
 *** Games
 *** Multiple names
 *** Application modules
 ** Future ideas
 * Abstract
 This is a proposal for a new layout of the module name space.  The
 goal is to reduce (or even eliminate) the clutter in the current ice-9
 module directory, and to provide a clean framework for splitting
 libguile into subsystems, grouped by functionality, standards
 compliance and maybe other characteristics.
 This is not a completed policy document, but rather a collection of
 ideas and proposals which still have to be decided.  I will mention my
 personal preference, where appropriate, but the final decisions are of
 course up to the maintainers.
 * Overview
 Currently, new modules are added in an ad-hoc manner to the ice-9
 module name space when the need for them arises.  I think that was
 mainly because no other directory for installed Scheme modules was
 created.  With the integration of GOOPS, the new top-level module
 directory oop was introduced, and we should follow this practice for
 other subsystems which share functionality.
 DISCLAIMER: Please note that I am no expert on Guile's module system,
 so be patient with me and correct me where I got anything wrong.
 ** What do we have now?
 The module (oop goops) contains all functionality needed for
 object-oriented programming with Guile (with a few exceptions in the
 evaluator, which are clearly needed for performance).
 Except for the procedures in the module (ice-9 rdelim), all Guile
 primitives are currently located in the root module (I think it is the
 module (guile)), and some procedures defined in `boot-9.scm' are
 installed in the module (guile-user).
 ** What should we change?
 In the core, there are a lot of primitive procedures which can cleanly
 be grouped into subsystems, and then grouped into modules.  That would
 make the core library more maintainable, would ease separate testing
 of subsystems and clean up dependencies between subsystems.
 * Policy of module separation
 There are several possibilities to group procedures into modules.
 - They could be grouped by functionality.
 - They could be grouped by standards compliance.
 - They could be grouped by level of importance.
 One important group of modules should of course be provided
 additionally:
 - Compatibility modules.
 So the first thing to decide is: Which of these policies should we
 adopt?  Personally, I think that it is not possible to cleanly use
 exactly one of the possibilities; we will probably use a mixture of
 them.  I propose to group by functionality, and maybe use some
 `bridge-modules', which make functionality available when the user
 requests the modules for a given standard.
 ** Functionality
 Candidates for the first possibility are groups of procedures, which
 already are grouped in source files, such as
 - Regular expression procedures.
 - Network procedures.
 - Systems programming procedures.
 - Random number procedures.
 - Math/numeric procedures.
 - String-processing procedures.
 - List-processing procedures.
 - Character handling procedures.
 - Object-oriented programming support.
 ** Standards
 Guile now complies with R5RS, and I think that the procedures required
 by this standard should always be available to the programmer.
 People who do not want them can always create :pure modules when
 they need to.
 On the other hand, the SRFI procedures fit nicely into a `group by
 standards' scheme.  An example which is already provided is the
 SRFI-8 syntax `receive'.  Following that, we could provide two modules
 for each SRFI, one named after the SRFI (like `srfi-8') and one named
 after the main functionality (`receive').
 ** Importance
 By importance, I mean `how important are procedures for the average
 Guile user'.  That means that procedures which are only useful to a
 small group of users (the Guile developers, for example) should not be
 immediately available at the REPL, so that they do not confuse the user
 when they appear in the `apropos' output or the tab completion.
 A good example would be debugging procedures (which also could be
 added with a special command-line option), or low-level system calls.
 ** Compatibility
 This group is for modules providing compatibility procedures.  An
 example would be a module for old string-processing procedures, which
 could someday get overridden by incompatible SRFI procedures of the
 same name.
 * Module naming
 Provided we choose to take the `group by functionality' approach, I
 propose the following naming hierarchy (some of them were actually
 suggested by Mikael Djurfeldt).
 - Scheme language related in     (scheme)
 - Object oriented programming in (oop)
 - Systems programming in         (system)
 - Database programming in        (database)
 - Text processing in             (text)
 - Math/numeric programming in    (math)
 - Network programming in         (network)
 - Graphics programming in	 (graphics)
 - GTK+ programming in		 (gtk)
 - X programming in               (xlib)
 - Games in			 (games)
 The layout of sub-hierarchies is up to the writers of modules; we
 should not enforce a strict policy here, because we cannot predict
 what will happen in this area.
 ** Scheme
 (scheme r5rs)		 Complete R5RS procedures set.
 (scheme safe)		 Safe modules.
 (scheme srfi-1)		 List processing.
 (scheme srfi-8)		 Multiple values via `receive'.
 (scheme receive)	 ditto.
 (scheme and-let-star)	 and-let*
 (scheme syncase)	 syntax-case hygienic macros (maybe included in
 			 (scheme r5rs)?).
 (scheme slib)		 SLIB, for historic reasons in (scheme).
 ** Object oriented programming
 Examples in this section are
 (oop goops)              For GOOPS.
 (oop goops ...)          For lower-level GOOPS functionality and utilities.
 ** Systems programming
 (system shell)	         Shell utilities (glob, system etc).
 (system process)	 Process handling.
 (system file-system)	 Low-level filesystem support.
 (system user)		 getuid, setpgrp, etc.
 _or_
 (system posix)		 All posix procedures.
 ** Database programming
 In the database section, there should be sub-module hierarchies for
 each supported database, containing the low-level code, and a common
 database layer, which should unify access to SQL databases via a
 single interface a la Perl's DBI.
 (database postgres ...)  Low-level database functionality. 
 (database oracle ...)    ...
 (database mysql ...)     ...
 (database msql ...)      ...
 (database sql)		 Common SQL accessors. 
 (database gdbm ...)      ...
 (database hashed)        Common hashed database accessors (like gdbm).
 (database util)		 Leftovers.
 ** Text processing
 (text rdelim)            Line oriented in-/output.
 (text util)		 Mangling text files.
 ** Math programming
 (math random)            Random numbers.
 (math primes)		 Prime numbers.
 (math vector)		 Vector math.
 (math algebra)		 Algebra.
 (math analysis)		 Analysis.
 (math util)		 Leftovers.
 ** Network programming
 (network inet)		 Internet procedures.
 (network socket)	 Socket interface.
 (network db)		 Network database accessors.
 (network util)		 ntohl, htons and friends.
 ** Graphics
 (graphics vector)	 Generalized vector graphics handling.
 (graphics vector vrml)	 VRML parsers etc.
 (graphics bitmap)	 Generalized bitmap handling.
 (graphics bitmap ...)    Bitmap format handling (TIFF, PNG, etc.).
 ** GTK+ programming
 (gtk gtk)		 GTK+ procedures.
 (gtk gdk)		 GDK procedures.
 (gtk threads)		 gtkthreads.
 ** X programming
 (xlib xlib)		 Low-level XLib programming.
 ** Games
 (games robots)		 GNU robots.
 ** Multiple names
 As already mentioned above, I think that some modules should have
 several names, to make it easier for the user to get the functionality
 she needs.  For example, a user could say: `hey, I need the receive
 macro', or she could say: `I want to stick to SRFI syntax, so where
 the hell is the module for SRFI-8?!?'.
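 A minimal sketch of how one module could be reachable under two names,
 using the module system's re-export facility.  The names
 (scheme receive) and (client) are hypothetical and only illustrate the
 proposal; (srfi srfi-8) is assumed to be the module that actually
 provides `receive'.

```scheme
;; Hypothetical alias module: the name (scheme receive) comes from this
;; proposal and is not a module that Guile ships.  It adds no code of
;; its own and only re-exports the SRFI-8 binding.
(define-module (scheme receive)
  #:use-module (srfi srfi-8)
  #:re-export (receive))

;; A hypothetical client that asks for the functionality by its
;; descriptive name instead of by its SRFI number.
(define-module (client)
  #:use-module (scheme receive))

;; Split an argument list into its first element and the rest.
(define (split-first . args)
  (receive (first . rest) (apply values args)
    (list first rest)))
```

 A user who prefers the SRFI name would write (srfi srfi-8) in the
 client header instead; both names lead to the same binding.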
 ** Application modules
 We should not enforce policy on applications.  So I propose that
 application writers should be advised to place modules either in
 application-specific directories $PREFIX/share/$APP/guile/... and name
 them however they like, or to use the application's name as the first
 part of the module name, e.g. (gnucash import), (scwm background),
 (rcalc ui).
 * Future ideas
 I have not yet come up with a good idea for grouping modules which
 deal, for example, with XML processing.  They would fit into the (text)
 module space, because most XML files contain text data, but they would
 also fit into (database), because XML files are essentially databases.
 On the other hand, XML processing is such a large field that it
 probably is worth its own top-level name space (xml).
 Local Variables:
 mode: outline
 End:
--- a/devel/modules/module-snippets.texi
+++ b/devel/modules/module-snippets.texi
--- a/devel/policy/goals.text
+++ b/devel/policy/goals.text
--- a/devel/policy/names.text
+++ b/devel/policy/names.text
--- a/devel/policy/plans.text
+++ b/devel/policy/plans.text
--- a/devel/policy/principles.text
+++ b/devel/policy/principles.text
--- a/devel/strings/sharedstr.text
+++ b/devel/strings/sharedstr.text
@@ -1,143 +0,0 @@
 Implementation of shared substrings with fresh-copy semantics
 =============================================================
 Version: $Id: sharedstr.text,v 1.1 2000-08-26 20:55:21 mdj Exp $
 Background
 ----------
 In Guile, most string operations work on two other data types apart
 from strings: shared substrings and read-only strings (which include
 symbols).  One of Guile's sub-goals is to be a scripting language in
 which string management is important.  Read-only strings and shared
 substrings were introduced in order to reduce overhead in string
 manipulation.
 We now want to simplify the Guile API by removing these two data
 types, but keeping performance by allowing ordinary strings to share
 storage.
 The idea is to let operations like `symbol->string' and `substring'
 return a pointer into the original string/symbol, thus avoiding the
 need to copy the string.
 Two of the problems which then arise are:
 * If s2 is derived from s1, and therefore shares storage with s1, a
  modification to either s1 or s2 will affect the other.
 * Guile is supposed to interact closely with many UNIX libraries in
  which the NUL character is used to terminate strings.  Therefore
  Guile strings contain a NUL character at the end, in addition to the
  string length (the latter of which is used by Guile's string
  operations).
 The solutions to these problems are to
 * Copy a string with shared storage when it's modified.
 * Copy a string with shared storage when it's being used as argument
  to a C library call.  (Copying implies inserting an ending NUL
  character.)
 But this leads to memory management problems.  When is it OK to free
 a character array which was allocated for a symbol or a string?
 Abstract description of proposed solution
 -----------------------------------------
 Definitions
  STRING = <TYPETAG, LENGTH, CHARRECORDPTR, CHARPTR>
  SYMBOL = <TYPETAG, LENGTH, CHARRECORDPTR, CHARPTR>
  CHARRECORD = <PHASE, SHAREDFLAG, CHARS>
  PHASE = black | white
  SHAREDFLAG = private | shared
  CHARS is a character array
  CHARPTR points into it
 Memory management
 A string or symbol is initially allocated with its contents stored in
 a character array in a character record.  The string/symbol header
 contains a pointer to this record.  The initial value of the shared
 flag in the character record is `private'.
 The GC mark phases alternate between black and white---every second
 phase is black, the rest are white.  This is used to distinguish
 whether a character record has been encountered before:
 During a black mark phase, when the GC encounters a string or symbol,
 it changes the PHASE and SHAREDFLAG marks of the corresponding
 character record according to the following table:
  <white, private> --> <black, private>   (white => unconditionally
  <white, shared>  --> <black, private>    set to <black, private>)
  <black, private> --> <black, shared>    (SHAREDFLAG changed)
  <black, shared>  --> <black, shared>    (no change)
 The behaviour of a white phase is equivalent, with the color names
 switched.
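 The transition table can be sketched as a pure function.  This is only
 a model: a character record's state is represented as a pair
 (PHASE . SHAREDFLAG) of symbols, and the names `mark-record' and
 `mark-n-times' are made up for the illustration.

```scheme
;; STATE is a pair (PHASE . SHAREDFLAG); CURRENT-PHASE is the color of
;; the mark phase in progress, either 'black or 'white.
(define (mark-record state current-phase)
  (let ((phase (car state))
        (flag  (cdr state)))
    (cond
     ;; Wrong color: first visit in this phase, so the old flag is
     ;; discarded and the record becomes <current, private>.
     ((not (eq? phase current-phase)) (cons current-phase 'private))
     ;; Right color and private: this is the second header found.
     ((eq? flag 'private) (cons current-phase 'shared))
     ;; Right color and already shared: no change.
     (else state))))

;; Model a record that is reached from N live headers in one phase.
(define (mark-n-times state current-phase n)
  (if (= n 0)
      state
      (mark-n-times (mark-record state current-phase)
                    current-phase
                    (- n 1))))
```

 Passing 'white as CURRENT-PHASE gives the white phase, since the
 function only compares the record's color with the current one.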
 The GC sweep phase frees any unmarked string or symbol header and
 frees its character record either if it is marked with the "wrong"
 color (not matching the color of the last mark phase) or if its
 SHAREDFLAG is `private'.
 Copy-on-write
 An attempt at mutating string contents leads to copying if SHAREDFLAG
 is `shared'.  Copying means making a copy of the character record and
 mutating the CHARRECORDPTR and CHARPTR fields of the object header to
 point to the copy.
 Substring operation
 When making a substring, a new string header is allocated, with new
 contents for the LENGTH and CHARPTR fields.
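 The copy-on-write and substring rules can be modelled in Scheme itself.
 This is a sketch under stated assumptions, not the proposed C
 implementation: all names are invented, the PHASE field is left out,
 and the shared flag is set eagerly when a substring is made, whereas
 the proposal lets the GC mark phase discover sharing.

```scheme
(use-modules (srfi srfi-9))

;; Character record: a shared flag plus the character array.
(define-record-type <char-record>
  (make-char-record shared? chars)
  char-record?
  (shared? record-shared? set-record-shared!)
  (chars record-chars))

;; String header: pointer to the record, offset into it, and length.
(define-record-type <str>
  (make-str record start len)
  str?
  (record str-record set-str-record!)
  (start str-start)
  (len str-len))

(define (new-str text)
  (make-str (make-char-record #f (string-copy text))
            0
            (string-length text)))

;; Substring: a new header with new start/length, same character record.
(define (str-substring s from to)
  (set-record-shared! (str-record s) #t)   ; assumption: flagged eagerly
  (make-str (str-record s) (+ (str-start s) from) (- to from)))

;; Return the visible characters as a fresh ordinary string.
(define (str->string s)
  (string-copy (record-chars (str-record s))
               (str-start s)
               (+ (str-start s) (str-len s))))

;; Mutation: copy the whole character record first if it is shared.
(define (str-set! s k ch)
  (when (record-shared? (str-record s))
    (set-str-record! s (make-char-record
                        #f
                        (string-copy (record-chars (str-record s))))))
  (string-set! (record-chars (str-record s)) (+ (str-start s) k) ch))

;; Usage: s2 shares storage with s1 until s2 is mutated.
(define s1 (new-str "hello world"))
(define s2 (str-substring s1 6 11))
(str-set! s2 0 #\W)
```

 After the mutation s1 still owns a record flagged as shared although
 nothing else points to it; in the proposal it is the next mark phase
 that resets such a record to `private'.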
 Implementation details
 ----------------------
 * We store the character record consecutively with the character
  array and lump the PHASE and SHAREDFLAG fields together into one
  byte containing an integer code for the four possible states of the
  PHASE and SHAREDFLAG fields.  Another way of viewing it is that
  these fields are represented as bits 1 and 0 in the "header" of the
  character array.  We let CHARRECORDPTR point to the first character
  position instead of on this header:
  CHARRECORDPTR
   |
   V
  FCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
  F = 0, 1, 2, 3
 * We represent strings as the sub-types `simple-string' and
  `substring'.
 * In a simple string, CHARRECORDPTR and CHARPTR are represented by a
  single pointer, so a `simple-string' is an ordinary heap cell with
  TYPETAG and LENGTH in the CAR and CHARPTR in the CDR.
 * Substrings are represented as double cells, with TYPETAG and LENGTH
  in word 0, CHARRECORDPTR in word 1 and CHARPTR in word 2
  (alternatively, we could store an offset from CHARRECORDPTR).
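 The one-byte header F described in the first item can be sketched as
 follows.  The text only says that PHASE and SHAREDFLAG occupy bits 1
 and 0; which field gets which bit, and which value maps to 1, are
 assumptions made here for the sake of the example.

```scheme
;; Assumed layout: bit 1 = PHASE (1 for black), bit 0 = SHAREDFLAG
;; (1 for shared).  The result F is one of 0, 1, 2, 3.
(define (encode-flags phase sharedflag)
  (+ (if (eq? phase 'black) 2 0)
     (if (eq? sharedflag 'shared) 1 0)))

;; Inverse mapping, returning the pair (PHASE . SHAREDFLAG).
(define (decode-flags f)
  (cons (if (>= f 2) 'black 'white)
        (if (odd? f) 'shared 'private)))
```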
 Problems with this implementation
 ---------------------------------
 * How do we make copy-on-write thread-safe?  Is there a different
  implementation which is efficient and thread-safe?
 * If small substrings are frequently generated from large, temporary
  strings and the small substrings are kept in a data structure, the
  heap will still have to host the large original strings.  Should we
  simply accept this?
--- a/devel/translation/langtools.text
+++ b/devel/translation/langtools.text
@@ -1,592 +0,0 @@
 * Introduction
 Version: $Id: langtools.text,v 1.5 2000-08-13 04:47:26 mdj Exp $
 This is a proposal for how Guile could interface with language
 translators.  It will be posted on the Guile list and revised for some
 short time (days rather than weeks) before being implemented.
 The document can be found in the CVS repository as
 guile-core/devel/translation/langtools.text.  All Guile developers are
 welcome to modify and extend it according to the ongoing discussion
 using CVS.
 Ideas and comments are welcome.
 For clarity, the proposal is partially written as if describing an
 already existing system.
 MDJ 000812 <djurfeldt@nada.kth.se>
 * Language names
 A translator for Guile is a certain kind of Guile module, implemented
 in Scheme, C, or a mixture of both.
 To make things simple, the name of the language is closely related to
 the name of the translator module.
 Languages have long and short names.  The long form is simply the name
 of the translator module: `(lang ctax)', `(lang emacs-lisp)',
 `(my-modules foo-lang)' etc.
 Languages with the long name `(lang IDENTIFIER)' can be referred to
 with the short name IDENTIFIER, for example `emacs-lisp'.
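 The relation between the two forms is simple enough to sketch.  The
 procedure names below are hypothetical and not part of the proposal.

```scheme
;; A short name is a symbol; a long name is a module name (a list).
(define (long-language-name name)
  (if (symbol? name)
      (list 'lang name)
      name))

;; Only languages living directly under (lang ...) have a short name;
;; for any other module name this returns #f.
(define (short-language-name name)
  (if (and (pair? name)
           (= (length name) 2)
           (eq? (car name) 'lang))
      (cadr name)
      #f))
```

 So `emacs-lisp' and `(lang emacs-lisp)' name the same translator, while
 `(my-modules foo-lang)' can only be referred to by its long name.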
 * How to tell Guile to read code in a different language (than Scheme)
 There are four methods of specifying which translator to use when
 reading a file:
 ** Command option
 The options to the guile command are parsed linearly from left to
 right.  You can change the language at zero or more points using the
 option
 -t, --language LANGUAGE
 Example:
  guile -t emacs-lisp -l foo -l bar -t scheme -l baz
 will use the emacs-lisp translator while reading "foo" and "bar", and
 the default translator (scheme) for "baz".
 You can use this technique in a script together with the meta switch:
 #!/usr/local/bin/guile \
 -t emacs-lisp -s
 !#
 ** Commentary in file
 When opening a file for reading, Guile will read the first few lines,
 looking for the string "-*- LANGNAME -*-", where LANGNAME can be
 either the long or short form of the name.
 If found, the corresponding translator is loaded and used to read the
 file.
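 A sketch of how the scan for "-*- LANGNAME -*-" might look.  How many
 lines count as "the first few" is not specified, so it is a parameter
 here; the procedure names are invented, and the sketch does not try to
 cope with Emacs-style "mode: ..." variables.

```scheme
(use-modules (ice-9 regex) (ice-9 rdelim))

;; Matches "-*- something -*-" anywhere in a line.
(define cookie-rx (make-regexp "-\\*-(.*)-\\*-"))

;; Return the language name found in LINE as a symbol (short form) or
;; a list (long form), or #f if the line carries no cookie.
(define (cookie-in-line line)
  (let ((m (regexp-exec cookie-rx line)))
    (and m
         (let ((name (call-with-input-string (match:substring m 1) read)))
           (and (not (eof-object? name)) name)))))

;; Look at no more than MAX-LINES lines of PORT.
(define (find-language-cookie port max-lines)
  (let loop ((n 0))
    (if (>= n max-lines)
        #f
        (let ((line (read-line port)))
          (cond ((eof-object? line) #f)
                ((cookie-in-line line))
                (else (loop (+ n 1))))))))
```

 Using `read' on the text between the markers handles both forms of the
 name: "emacs-lisp" reads as a symbol and "(lang ctax)" as a list.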
 ** File extension
 Guile maintains an alist mapping filename extensions to languages.
 Each entry has the form:
  (REGEXP . LANGNAME)
 where REGEXP is a string and LANGNAME a symbol or a list of symbols.
 The alist can be accessed using `language-alist' which is exported
 by the module `(core config)':
  (language-alist)			--> current alist
  (language-alist ALIST) 		sets the alist to ALIST
  (language-alist ALIST :prepend)	prepends ALIST onto the current list
  (language-alist ALIST :append)	appends ALIST after current list
 The `load' command will match filenames against this alist and choose
 the translator to use accordingly.
 There will be a default alist for common translators.  For translators
 not listed, the alist has to be extended in .guile just as Emacs users
 extend auto-mode-alist in .emacs.
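 A minimal sketch of the lookup `load' would perform.  The entries in
 the default alist and the name `language-for-file' are assumptions made
 for the example; the proposal does not list the defaults.

```scheme
(use-modules (ice-9 regex))

;; Hypothetical defaults.  Each entry is (REGEXP . LANGNAME), where
;; LANGNAME is a symbol or a list of symbols.
(define default-language-alist
  '(("\\.scm$" . scheme)
    ("\\.el$"  . emacs-lisp)))

;; What a user might add in .guile for an unlisted translator.
(define user-language-alist
  '(("\\.ctax$" . (lang ctax))))

;; Return the language of the first entry whose regexp matches
;; FILENAME, or #f if none does.
(define (language-for-file filename alist)
  (let loop ((entries alist))
    (cond ((null? entries) #f)
          ((string-match (car (car entries)) filename)
           (cdr (car entries)))
          (else (loop (cdr entries))))))

;; Prepending lets user entries take precedence over the defaults.
(define combined-alist
  (append user-language-alist default-language-alist))
```

 Because the first match wins, prepending and appending differ in which
 entry takes effect when several regexps match the same filename.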
 ** Module header
 You specify the language used by a module with the :language option in
 the module header.  (See below under "Module configuration language".)
 * Module system
 This section describes how the Guile module system is adapted to use
 with other languages.
 ** Module configuration language
 *** The `(config)' module
 Guile has a sophisticated module system.  We don't require each
 translator implementation to implement its own syntax for modules.
 That would be too much work for the implementor, and users would have
 to learn the module system anew for each syntax.
 Instead, the module `(config)' exports the module header form
 `(define-module ...)'.
 The config module also exports a number of primitives by which you can
 customize the Guile library, such as `language-alist' and `load-path'.
 *** Default module environment
 The bindings of the config module are available in the default
 interaction environment when Guile starts up.  This is because the
 config module is on the module use list for the startup environment.
 However, config bindings are *not* available by default in new
 modules.
 The default module environment provides bindings from the R5RS module
 only.
 *** Module headers
 The module header of the current module system is the form
  (define-module NAME OPTION1 ...)
 You can specify a translator using the option
  :language LANGNAME
 where LANGNAME is the long or short form of language name as described
 above.
 The translator is being fed characters from the module file, starting
 immediately after the end-parenthesis of the module header form.
 NOTE: There can be only one module header per file.
 It is also possible to put the module header in a separate file and
 use the option
  :file FILENAME
 to point out a file containing the actual code.
 Example:
 foo.gm:
 ----------------------------------------------------------------------
 (define-module (foo)
  :language emacs-lisp
  :file "foo.el"
  :export (foo bar)
  )
 ----------------------------------------------------------------------
 foo.el:
 ----------------------------------------------------------------------
 (defun foo ()
  ...)
 (defun bar ()
  ...)
 ----------------------------------------------------------------------
 ** Repl commands
 Up till now, Guile has been dependent upon the available bindings in
 the selected module in order to do basic operations such as moving to
 a different module, entering the debugger or getting documentation.
 This is not acceptable since we want to be able to control Guile
 consistently regardless of which module we are in, and since we don't
 want to equip a module with bindings which don't have anything to do
 with the purpose of the module.
 Therefore, the repl provides a special command language on top of
 whatever syntax the current module provides.  (Scheme48 and RScheme
 provide similar repl command languages.)
 [Jost Boekemeier has suggested the following alternative solution:
 Commands are bindings just like any other binding.  It is enough if
 some modules carry command bindings (it's in fact enough if *one*
 module has them), because from such a module you can use the command
 (in MODULE) to walk into a module not carrying command bindings, and
 then use CTRL-D to exit.
 However, this has the disadvantage of mixing the "real" bindings with
 command bindings (the module might want to use "in" for other
 purposes), that CTRL-D could cause problems since for some channels
 CTRL-D might close down the connection, and that using one type of
 command ("in") to go "into" the module and another (CTRL-D) to "exit"
 is more complex than simply "going to" a module.]
 *** Repl command syntax
 Normally, repl commands have the syntax
  ,COMMAND ARG1 ...
 Input starting with an arbitrary amount of whitespace followed by a comma thus
 works as an escape syntax.
 This syntax is probably compatible with all languages.  (Note that we
 don't need to activate the lexer of the language until we've checked
 if the first non-whitespace char is a comma.)
 (Hypothetically, if this would become a problem, we can provide means
 of disabling this behaviour of the repl and let that particular
 language module take sole control of reading at the repl prompt.)
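 The check for the escape syntax can be sketched without involving any
 language lexer.  `parse-repl-line' is a hypothetical name; it returns
 the command and its unparsed argument text, or #f when the line should
 be handed to the current language's reader.

```scheme
;; Return (COMMAND . ARGUMENT-TEXT) if LINE is a repl command, else #f.
(define (parse-repl-line line)
  (let ((text (string-trim line)))          ; drop leading whitespace
    (if (and (> (string-length text) 0)
             (char=? (string-ref text 0) #\,))
        (let* ((body (substring text 1))
               (end  (or (string-index body #\space)
                         (string-length body))))
          (cons (string->symbol (substring body 0 end))
                (string-trim-both (substring body end))))
        #f)))
```

 The argument text is left as a string on purpose: for ",in MODULE EXPR"
 the EXPR part has to be read with the syntax of MODULE's language, which
 the command dispatcher cannot know in advance.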
 Among the commands available are
 *** ,in MODULE
 Select module named MODULE, that is any new expressions typed by the
 user after this command will be evaluated in the evaluation
 environment provided by MODULE.
 *** ,in MODULE EXPR
 Evaluate expression EXPR in MODULE.  EXPR has the syntax supplied by
 the language used by MODULE.
 *** ,use MODULE
 Import all bindings exported by MODULE to the current module.
 * Language modules
 Since code written in any kind of language should be able to implement
 most tasks, which may include reading, evaluating and writing, and
 generally computing with, expressions and data originating from other
 languages, we want the basic reading, evaluation and printing
 operations to be independent of the language.
 That is, instead of supplying separate `read', `eval' and `write'
 procedures for different languages, a language module is required to
 use the system procedures in the translated code.
 This means that the behaviour of `read', `eval' and `write' is
 context dependent.  (See further "How Guile system procedures `read',
 `eval', `write' use language modules" below.)
 ** Language data types
 Each language module should try to use the fundamental Scheme data
 types as far as this is possible.
 Some data types have important differences in semantics between
 languages, though, and all required data types may not exist in
 Guile.
 In such cases, the language module must supply its own, distinct, data
 types.  So, each language supported by Guile uses a certain set of
 data types, with the basic Scheme data types as the intersection
 between all sets.
 Specifically, syntax trees representing source code expressions should
 normally be a distinct data type.
 ** Foreign language escape syntax
 Note that such data can flow freely between modules.  In order to
 accommodate data with different native syntaxes, each language module
 provides a foreign language escape syntax.  In Scheme, this syntax
 uses the sharp comma extension specified by SRFI-10.  The read
 constructor is simply the last symbol in the long language name (which
 is usually the same as the short language name).
 ** Example 1
 Characters have syntax in both Scheme and ctax.  Lists currently
 have syntax in Scheme but lack ctax syntax.  Ctax doesn't have a
 datatype "enum", but we pretend it has for this example.
 The following table now shows the syntax used for reading and writing
 these expressions in module A using the language scheme, and module B
 using the language ctax (we assume that the foreign language escape
 syntax in ctax is #LANGUAGE EXPR):
 	  A		   B
 chars	  #\X		   'X'
 lists	  (1 2 3)	   #scheme (1 2 3)
 enums	  #,(ctax ENUM)	   ENUM
 ** Example 2
  A user is typing expressions in a ctax module which imports the
  bindings x and y from the module `(foo)':
  ctax> x = read ();
  1+2;
  1+2;
  ctax> x
  1+2;
  ctax> y = 1;
  1
  ctax> y;
  1  
  ctax> ,in (guile-user)
  guile> ,use (foo)
  guile> x
  #,(ctax 1+2;)
  guile> y
  1
  guile>
 The example shows that ctax uses a distinct representation for ctax
 expressions, but Scheme integers for integers.
 ** Language module interface
 A language module is an ordinary Guile module importing bindings from
 other modules and exporting bindings through its public interface.
 It is required to export the following variable and procedures:
 *** language-environment --> ENVIRONMENT
 Returns a fresh top-level ENVIRONMENT (a module) where expressions
 in this language are evaluated by default.
 Modules using this language will by default have this environment
 on their use list.
 The intention is for this procedure to provide the "run-time
 environment" for the language.
 *** native-read PORT --> OBJECT
 Read next expression in the foreign syntax from PORT and return an
 object OBJECT representing it.
 It is entirely up to the language module to define what one
 expression is, that is, how much to read.
 In lisp-like languages, `native-read' corresponds to `read'.  Note
 that in such languages, OBJECT need not be source code, but could
 be data.
 The representation of OBJECT is also chosen by the language
 module.  It can consist of Scheme data types, data types distinct for
 the language, or a mixture.
 There is one requirement, however: Distinct data types must be
 instances of a subclass of `language-specific-class'.
 This procedure will be called during interactive use (the user
 types expressions at a prompt) and when the system `read'
 procedure is called at a time when a module using this language is
 selected.
 Some languages (for example Python) parse differently depending on
 whether the session is interactive or non-interactive.  Guile provides
 the predicate `interactive-port?' to test for this.
 *** language-specific-class
 This variable contains the superclass of all non-Scheme data-types
 provided by the language.
 *** native-write OBJECT PORT
 This procedure prints the OBJECT on PORT using the specific
 language syntax.
 *** write-foreign-syntax OBJECT LANGUAGE NATIVE-WRITE PORT
 Write OBJECT in the foreign language escape syntax of this module.
 The object is specific to language LANGUAGE and can be written using
 NATIVE-WRITE.
 Here's an implementation for Scheme:
 (define (write-foreign-syntax object language native-write port)
  ;; Open the sharp-comma form, e.g. "#,(ctax ".
  (format port "#,(~A " language)
  ;; Let the object's own language print it.
  (native-write object port)
  (display #\) port))
 *** translate EXPRESSION --> SCHEMECODE
 Translate an EXPRESSION into SCHEMECODE.
 EXPRESSION can be anything returned by `read'.
 SCHEMECODE is Scheme source code represented using ordinary Scheme
 data.  It will be passed to `eval' in an environment containing
 bindings in the environment returned by `language-environment'.
 This procedure will be called during interactive use and when the
 system `eval' procedure is called.
 *** translate-all PORT [ALIST] --> THUNK
 Translate the entire stream of characters PORT until #<eof>.
 Return a THUNK which can be called repeatedly like this:
  THUNK --> SCHEMECODE
 Each call will yield a new piece of scheme code.  The THUNK signals
 end of translation by returning the value *end-of-translation* (which
 is tested using the predicate `end-of-translation?').
 The optional argument ALIST provides compilation options for the
 translator:
  (debug . #t) means produce code suitable for debugging
 This procedure will be called by the system `load' command and by
 the module system when loading files.
 The intentions are:
 1. To let the language module decide when and in how large chunks
   to do the processing.  It may choose to do all processing at
   the time translate-all is called, all processing when THUNK is
   called the first time, or small pieces of processing each time
   THUNK is called, or any conceivable combination.
 2. To let the language module decide in how large chunks to output
   the resulting Scheme code in order not to overload memory.
 3. To enable the language module to use temporary files, and
   whole-module analysis and optimization techniques.
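 A toy sketch of the protocol, assuming a "language" whose expressions
 already are Scheme forms, so that translation is the identity.  The
 end marker's representation and the driver `load-translated' are
 assumptions; only the names `translate-all', `*end-of-translation*'
 and `end-of-translation?' come from the text above.

```scheme
;; A unique object used as the end marker.
(define *end-of-translation* (list 'end-of-translation))

(define (end-of-translation? x)
  (eq? x *end-of-translation*))

;; Toy translator: does a small piece of work per call by reading one
;; form from PORT each time the thunk is invoked.  OPTIONS stands for
;; the optional ALIST and is ignored here.
(define (translate-all port . options)
  (lambda ()
    (let ((form (read port)))
      (if (eof-object? form)
          *end-of-translation*
          form))))

;; What `load' might do with the thunk: evaluate each piece as it
;; arrives and return the value of the last one.
(define (load-translated port env)
  (let ((next (translate-all port)))
    (let loop ((last #f))
      (let ((code (next)))
        (if (end-of-translation? code)
            last
            (loop (eval code env)))))))
```

 A real translator could just as well do all of its work when
 `translate-all' is called and let the thunk hand out precomputed
 chunks; the caller cannot tell the difference.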
 *** untranslate SCHEMECODE --> EXPRESSION
 Attempt to do the inverse of `translate'.  An approximation is OK.  It
 is also OK to return #f.  This procedure will be called from the
 debugger, when generating error messages, backtraces etc.
 The debugger uses the local evaluation environment to determine from
 which module an expression comes.  This is how the debugger can know
 which `untranslate' procedure to call for a given expression.
 (This is used currently to decide which backtrace frames to
 display.  System modules use the option :no-backtrace to prevent
 displaying of Guile's internals to the user.)
 Note that `untranslate' can use source-properties set by `native-read'
 to give hints about how to do the reverse translation.  Such hints
 could for example be the filename, and line and column numbers for the
 source expression, or an actual copy of the source expression.
 ** How Guile system procedures `read', `eval', `write' use language modules
 *** read
 The idea is that the `read' exported from the R5RS library will
 continue to work when called from other languages, and will keep its
 semantics.
 A call to `read' simply means "read in an expression from PORT using
 the syntax associated with that port".
 Each module carries information about its language.
 When an input port is created for a module to be read or during
 interaction with a given module, this information is copied to the
 port object.
 `read' uses this information to call `native-read' in the correct
 language module.
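 The chain module -> port -> `native-read' can be sketched in C as
 follows.  All names here (language, module, port,
 open_port_for_module, generic_read and the two toy readers) are
 assumptions made for the illustration, not existing Guile
 interfaces; the toy readers only count syntactic units.

```c
/* Hypothetical language module: a name and its native reader. */
typedef struct language {
  const char *name;
  int (*native_read) (const char *input);
} language;

/* Each module carries information about its language. */
typedef struct module {
  const language *lang;
} module;

/* The port gets a copy of that information when it is created. */
typedef struct port {
  const char *input;
  const language *lang;
} port;

/* Toy reader: counts opening parentheses. */
static int scheme_native_read (const char *input)
{
  int n = 0;
  for (; *input != '\0'; input++)
    if (*input == '(')
      n++;
  return n;
}

/* Toy reader: counts statement-terminating semicolons. */
static int ctax_native_read (const char *input)
{
  int n = 0;
  for (; *input != '\0'; input++)
    if (*input == ';')
      n++;
  return n;
}

static const language scheme_language = { "scheme", scheme_native_read };
static const language ctax_language = { "ctax", ctax_native_read };

static port open_port_for_module (const module *m, const char *input)
{
  port p;
  p.input = input;
  p.lang = m->lang;   /* language information copied to the port */
  return p;
}

/* The generic `read': dispatch on the language stored in the port. */
static int generic_read (const port *p)
{
  return p->lang->native_read (p->input);
}
```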
 *** eval
 [To be written.]
 *** write
 [To be written.]
 * Error handling
 ** Errors during translation
 Errors during translation are generated as usual by calling scm-error
 (from Scheme) or scm_misc_error etc (from C).  The effect of
 throwing errors from within `translate-all' is the same as when they
 are generated within a call to the THUNK returned from
 `translate-all'.
 scm-error takes a fifth argument.  This is a property list (alist)
 which you can use to pass extra information to the error reporting
 machinery.
 Currently, the following properties are supported:
  filename  filename of file being translated
  line	    line number of erring expression
  column    column number
 ** Run-time errors (errors in SCHEMECODE)
 This section pertains to what happens when a run-time error occurs
 during evaluation of the translated code.
 In order to get "foreign code" in error messages, make sure that
 `untranslate' yields good output.  Note the possibility of maintaining
 a table (preferably using weak references) mapping SCHEMECODE to
 EXPRESSION.
 Note the availability of source-properties for attaching filename,
 line and column number, and other information, such as EXPRESSION, to
 SCHEMECODE.  If filename, line, and column properties are defined,
 they will be automatically used by the error reporting machinery.
 * Proposed changes to Guile
 ** Implement the above proposal.
 ** Add new fields `reader' and `translator' to all module objects
 Make sure they are initialized when a language is specified.
 ** Use `untranslate' during error handling.
 ** Implement the use of arg 5 to scm-error
 (specified in "Errors during translation")
 ** Implement a generic lexical analyzer with interface similar to read/rp
 Mikael is working on this.  (It might take a few days, since he is
 busy with his studies right now.)
 ** Remove scm:eval-transformer
 This is replaced by new fields in each module object (environment).
 `eval' will instead directly use the `transformer' field in the module
 passed as second arg.
 Internal evaluation will, similarly, use the transformer of the module
 representing the top-level of the local environment.
 Note that this level of transformation is something independent of
 language translation.  *This* is a hook for adding Scheme macro
 packages and belongs to the core language.
 We also need to check the new `translator' field, potentially using
 it.
 ** Package local environments as smobs
 so that environment list structures can't leak out on the Scheme
 level.  (This has already been done in SCM.)
 ** Introduce new fields in input ports
 These carry state information such as
 *** which keyword syntax to support
 *** whether to be case sensitive or not
 *** which lexical grammar to use
 *** whether the port is used in an interactive session or not
 There will be a new Guile primitive `interactive-port?' testing for this.
 ** Move configuration of keyword syntax and case sensitivity to the read-state
 Add new fields to the module objects for these values, so that the
 read-state can be initialized from them.
  *fixme* When? Why? How?
 Probably as soon as the language has been determined during file loading.
 Need to figure out how to set these values.
 Local Variables:
 mode: outline
 End:
--- a/devel/vm/ior/ior-intro.text
+++ b/devel/vm/ior/ior-intro.text
--- a/devel/vm/ior/ior.text
+++ b/devel/vm/ior/ior.text
@ -1,665 +0,0 @@
 ***
 *** These notes about the design of a new type of Scheme interpreter
 *** "Ior" are cut out from various emails from early spring 2000.
 ***
 *** MDJ 000817 <djurfeldt@nada.kth.se>
 ***
 Generally, we should try to make a design which is clean and
 minimalistic in as many respects as possible.  For example, even if we
 need more primitives than those in R5RS internally, I don't think
 these should be made available to the user in the core, but rather be
 made available *through* libraries (implementation in core,
 publication via library).
 The suggested working name for this project is "Ior" (Swedish name for
 the donkey in "Winnie the Pooh" :).  If, against the odds, we really
 would succeed in producing an Ior, and we find it suitable, we could
 turn it into a Guile 2.0 (or whatever).  (The architecture still
 allows for support of the gh interface and uses conservative GC (Hans
 Böhm's, in fact).)
 Beware now that I'm just sending over my original letter, which is
 just a sketch of the more detailed, but cryptic, design notes I made
 originally, which are, in turn, not as detailed as the design has
 become now. :)
 Please also excuse the lack of structure.  I shouldn't work on this at
 all right now.  Choose for yourselves if you want to read this
 unstructured information or if you want to wait until I've structured
 it after end of January.
 But then I actually have to blurt out the basic idea of my
 architecture already now.  (I had hoped to present you with a proper
 and fairly detailed spec, but I won't be able to complete such a spec
 quickly.)
 The basic idea is this:
 * Don't waste time on non-computation!
 Why waste a lot of time on type-checks, unboxing and boxing of data?
 None of these actions does any computation!
 I'd like both interpreter and compiled code to work directly with data
 in raw, native form (integers represented as 32bit longs, inexact
 numbers as doubles, short strings as bytes in a word, longer strings
 as a normal pointer to malloced memory, bignums are just pointers to a
 gmp (GNU MultiPrecision library) object, etc.)
 * Don't we need to dispatch on type to know what to do?
 But don't we need to dispatch on the type in order to know how to
 compute with the data?  E.g., `display' does entirely different
 computations on a <fixnum> and a <string>.  (<fixnum> is an integer
 between -2^31 and 2^31-1.)
 The answer is *no*, not in 95% of all cases.  The main reason is that
 the interpreter does type analysis while converting closures to
 bytecode, and knows already when _calling_ `display' what types its
 arguments have.  This means that the bytecode compiler can choose a
 suitable _version_ of `display' which handles that particular type.
 This type analysis is greatly simplified by the fact that just as the
 type analysis _results_ in the type of the argument in the call to
 `display', and, thus, we can select the correct _version_ of
 `display', the closure byte-code itself will only be one _version_ of
 the closure with the types of its arguments fixed at the start of the
 analysis.
 As you already have understood by now, the basic architecture is that
 all procedures are generic functions, and the "versions" I'm speaking
 about are a kind of method.  Let's call them "branches" for now.
 For example:
 (define foo
  (lambda (x)
    ...
    (display x)
     ...))
 may result in the following two branches:
 1. [<fixnum>-foo] =
     (branch ((x <fixnum>))
       ...
       ([<fixnum>-display] x)
       ...)
 2. [<string>-foo] =
     (branch ((x <string>))
       ...
       ([<string>-display] x)
       ...)
 and a new closure
 (define bar
  (lambda (x y)
    ...
    (foo x)
    ...))
 results in
 [<fixnum>-<fixnum>-bar] =
  (branch ((x <fixnum>) (y <fixnum>))
    ...
    ([<fixnum>-foo] x)
    ...)
 Note how all type dispatch is eliminated in these examples.
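 In C terms, the branches above correspond to monomorphic functions
 which call each other directly.  The following is a sketch under
 assumptions: the function names are invented, <fixnum> is
 represented as a C long, and output goes to a buffer so that it is
 easy to inspect.

```c
#include <stdio.h>

/* [<fixnum>-display]: knows its argument is a raw integer. */
static int display_fixnum (char *out, size_t n, long x)
{
  return snprintf (out, n, "%ld", x);
}

/* [<string>-display]: knows its argument is a raw C string. */
static int display_string (char *out, size_t n, const char *x)
{
  return snprintf (out, n, "%s", x);
}

/* [<fixnum>-foo]: the call to display is resolved statically. */
static int foo_fixnum (char *out, size_t n, long x)
{
  return display_fixnum (out, n, x);
}

/* [<string>-foo] */
static int foo_string (char *out, size_t n, const char *x)
{
  return display_string (out, n, x);
}

/* [<fixnum>-<fixnum>-bar]: calls the fixnum branch of foo directly;
   no type tag is inspected anywhere along the way. */
static int bar_fixnum_fixnum (char *out, size_t n, long x, long y)
{
  (void) y;   /* y is unused in this fragment of bar */
  return foo_fixnum (out, n, x);
}
```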
 As a further reinforcement to the type analysis, branches will not
 only have typed parameters but also have return types.  This means
 that the type of a branch will look like
  <type 1> x ... x <type n> --> <type r>
 In essence, the entire system will be very ML-like internally, and we
 can benefit from the research done on ML-compilation.
 However, we now get three major problems to confront:
 1. In the Scheme language not all situations can be completely type
   analyzed.
 2. In particular, for some operations, even if the types of the
   parameters are well defined, we can't determine the return type
   generically.  For example, [<fixnum>-<fixnum>-+] may have return
   type <fixnum> _or_ <bignum>.
 3. Even if we can do a complete analysis, some closures will generate
   a combinatoric explosion of branches.
 Problem 1: Incomplete analysis
 We introduce a new type <boxed>.  This data type has type <boxed> and
 contents
 struct ior_boxed_t {
  ior_type *type; /* pointer to class struct */
  void *data;     /* generic field, may also contain immediate objects */
 };
 For example, a boxed fixnum 4711 has type <boxed> and contents
 { <fixnum>, 4711 }.  The boxed type essentially corresponds to Guile's
 SCM type.  It's just that the 1 or 3 or 7 or 16-bit type tag has been
 replaced with a 32-bit type tag (the pointer to the class structure
 describing the type of the object).
 This is more inefficient than the SCM type system, but it's no problem
 since it won't be used in 95% of all cases.  The big advantage
 compared to SCM's type system is that it is so simple and uniform.
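 Boxing and unboxing a fixnum with this representation might look
 like the sketch below.  The class struct is reduced to a name, and
 box_fixnum, unbox_fixnum and ior_fixnum_class are hypothetical
 names.  Storing the integer directly in the pointer field relies on
 intptr_t round-tripping through void *, which holds on mainstream
 platforms but is an assumption.

```c
#include <stdint.h>

/* Stand-in for the class structure describing a type. */
typedef struct ior_type {
  const char *name;
} ior_type;

typedef struct ior_boxed_t {
  const ior_type *type;  /* pointer to class struct */
  void *data;            /* generic field, may hold an immediate */
} ior_boxed_t;

static const ior_type ior_fixnum_class = { "<fixnum>" };

/* Box a raw integer: the type tag is the pointer to the class. */
static ior_boxed_t box_fixnum (intptr_t n)
{
  ior_boxed_t b;
  b.type = &ior_fixnum_class;
  b.data = (void *) n;   /* immediate object stored in the data field */
  return b;
}

/* Unbox again; the caller must already know that b is a <fixnum>. */
static intptr_t unbox_fixnum (ior_boxed_t b)
{
  return (intptr_t) b.data;
}
```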
 I should note here that while SCM and Guile are centered around the
 cell representation and all objects either _are_ cells or have a cell
 handle, objects in Ior will look more like malloced blocks.  This is the
 reason why I planned to start with Böhm's GC which has C pointers as
 object handles.  But it is of course still possible to use a heap, or,
 preferably several heaps for different kinds of objects.  (Böhm's GC
 has multiple heaps for different sizes of objects.)  If we write a
 custom GC, we can increase speed further.
 Problem 3 (yes, I skipped :) Combinatoric explosion
 We simply don't generate all possible branches.  In the interpreter we
 generate branches "just-too-late" (well, it's normally called "lazy
 compilation" or "just-in-time", but if it was "in-time", the procedure
 would already be compiled when it was needed, right? :) as when Guile
 memoizes or when a Java machine turns byte-codes into machine code, or
 as when GOOPS turns methods into cmethods for that matter.
 Have you noticed that branches (although still without return type
 information) already exist in GOOPS?  They are currently called
 "cmethods" and are generated on demand from the method code and put
 into the GF cache during evaluation of GOOPS code.  :-)  (I have not
 utilized this fully yet.  I plan to soon use this method compilation
 (into branches) to eliminate almost all type dispatch in calls to
 accessors.)
 For the compiler, we use profiling information, just as the modern GCC
 scheduler, or else rely on some type analysis (if a procedure says
 (+ x y), x is not normally a <string> but rather some subclass of
 <number>) and some common sense (it's usually more important to
 generate <fixnum> branches than <foobar> branches).
 The rest of the cases can be handled by <boxed>-branches.  We can, for
 example, have a:
 [<boxed>-<boxed>-bar] =
  (branch ((x <boxed>) (y <boxed>))
    ...
    ([<boxed>-foo] x)
    ...)
 [<boxed>-foo] will use an efficient type dispatch mechanism (for
 example akin to the GOOPS one) to select the right branch of
 `display'.
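 One simple way to picture the dispatch done by a <boxed> branch is
 a function pointer stored in the class structure.  This is an
 assumption made for illustration; it is not the GOOPS mechanism,
 which uses a method cache.  All names are invented for the sketch.

```c
#include <stdint.h>
#include <stdio.h>

/* Class struct carrying the display branch for its type. */
typedef struct ior_type {
  const char *name;
  int (*display) (char *out, size_t n, void *data);
} ior_type;

typedef struct ior_boxed_t {
  const ior_type *type;
  void *data;
} ior_boxed_t;

/* [<fixnum>-display] working on the unboxed contents. */
static int display_fixnum (char *out, size_t n, void *data)
{
  return snprintf (out, n, "%ld", (long) (intptr_t) data);
}

/* [<string>-display] working on the unboxed contents. */
static int display_string (char *out, size_t n, void *data)
{
  return snprintf (out, n, "%s", (const char *) data);
}

static const ior_type fixnum_class = { "<fixnum>", display_fixnum };
static const ior_type string_class = { "<string>", display_string };

/* [<boxed>-display]: the only place where the type tag is read. */
static int display_boxed (char *out, size_t n, ior_boxed_t x)
{
  return x.type->display (out, n, x.data);
}
```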
 Problem 2: Ambiguous return type
 If the return type of a branch is ambiguous, we simply define the
 return type as <boxed>, and box data at the point in the branch where
 it can be decided which type of data we will return.  This is how
 things can be handled in the general case.  However, we might be able
 to handle things in a more neat way, at least in some cases:
 During compilation to byte code, we'll probably use an intermediate
 representation in continuation passing style.  We might even use a
 subtype of branches represented as continuations (not a heavy
 representation, as in Guile and SCM, but probably not much more than a
 function pointer).  This is, for example, one way of handling tail
 recursion, especially mutual tail recursion.
 One case where we would like to try really hard not to box data is
 when fixnums "overflow into" bignums.
 Let's say that the branch [<fixnum>-<fixnum>-bar] contains a form
  (+ x y)
 where the type analyzer knows that x and y are fixnums.  We then split
 the branch right after the form and let it fork into two possible
 continuation branches bar1 and bar2:
 [The following is only pseudo code.  It can be made efficient on the C
 level.  We can also use the asm compiler directive in conditional
 compilation for GCC on i386.  We could even let autoconf/automake
 substitute an architecture specific solution for multiple
 architectures, but still support a C level default case.]
    (if (sum-over/underflow? x y)
 	(bar1 (fixnum->bignum x) (fixnum->bignum y) ...)
        (bar2 x y ...))
 bar1 begins with the evaluation of the form
  ([<bignum>-<bignum>-+] x y)
 while bar2 begins with
  ([<fixnum>-<fixnum>-+] x y)
 Note that the return type of each of these forms is unambiguous.
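 A portable C version of this fork could look like the sketch
 below.  It assumes 32-bit fixnums and uses int64_t as a stand-in
 for a gmp bignum, purely to keep the example self-contained; the
 type and function names are hypothetical.

```c
#include <stdint.h>

typedef enum { RESULT_FIXNUM, RESULT_BIGNUM } result_kind;

typedef struct sum_result {
  result_kind kind;
  int64_t value;
} sum_result;

/* (sum-over/underflow? x y): test without performing the addition. */
static int sum_over_underflow_p (int32_t x, int32_t y)
{
  return (y > 0 && x > INT32_MAX - y) || (y < 0 && x < INT32_MIN - y);
}

/* bar1 starts with ([<bignum>-<bignum>-+] x y). */
static sum_result bar1 (int64_t x, int64_t y)
{
  sum_result r;
  r.kind = RESULT_BIGNUM;
  r.value = x + y;
  return r;
}

/* bar2 starts with ([<fixnum>-<fixnum>-+] x y); it cannot overflow
   because the caller has already checked. */
static sum_result bar2 (int32_t x, int32_t y)
{
  sum_result r;
  r.kind = RESULT_FIXNUM;
  r.value = x + y;
  return r;
}

/* The split point: fork into one of the two continuation branches. */
static sum_result bar (int32_t x, int32_t y)
{
  if (sum_over_underflow_p (x, y))
    return bar1 ((int64_t) x, (int64_t) y);
  return bar2 (x, y);
}
```

 On i386 the test and the addition collapse into a single add
 followed by a jump on the overflow flag.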
 Now some random points from the design:
 * The basic concept in Ior is the class.  A type is a concrete class.
  Classes which are subclasses of <object> are concrete, otherwise they
  are abstract.
 * A procedure is a collection of methods. Each method can have
  arbitrary number of parameters of arbitrary class (not type).
 * The type of a method is the tuple of its argument classes.
 * The type of a procedure is the set of its method types.
 But the most important new concept is the branch.
 Regard the procedure:
 (define (half x)
  (quotient x 2))
 The procedure half will have the single method
  (method ((x <top>))
    (quotient x 2))
 When `(half 128)' is called the Ior evaluator will create a new branch
 during the actual evaluation.  I'm now going to extend the branch
 syntax by adding a second list of formals: the continuations of the
 branch.
 * The type of a branch is the tuple of the tuple of its
  argument types (not classes!) and the tuple of its continuation
  argument types.  The branch generated above will be:
  (branch ((x <fixnum>)) ((c <fixnum>))
    (c (quotient x 2)))
  If the method
  (method ((x <top>) (y <top>))
    (quotient (+ x 1) y))
  is called with arguments 1 and 2 it results in the branch
  (branch ((x <fixnum>) (y <fixnum>)) ((c1 <fixnum>) (c2 <bignum>))
    (quotient (+ x 1 c3) 2))
  where c3 is:
  (branch ((x <fixnum>) (y <fixnum>)) ((c <bignum>))
    (quotient (+ (fixnum->bignum x) 1) 2))
 The generated branches are stored in a cache in the procedure object.
 But wait a minute!  What about variables and data structures?
 In essence, what we do is that we fork up all data paths so that they
 can be typed: We put the type tags on the _data paths_ instead of on
 the data itself.  You can look upon the "branches" as tubes of
 information where the type tag is attached to the tube instead of on
 what passes through it.
 Variables and data structures are part of the "tubes", so they need to
 be typed.  For example, the generic pair looks like:
 (define-class <pair> ()
  car-type
  car
  cdr-type
  cdr)
 But note that since car and cdr are generic procedures, we can let
 more efficient pairs exist in parallel, like
 (define-class <immutable-fixnum-list> ()
  (car (class <fixnum>))
  (cdr (class <immutable-fixnum-list>))) 
 Note that instances of this last type only take two words of memory!
 They are easy to use too.  We can't use `cons' or `list' to create
 them, since these procedures can't assume immutability, but we don't
 need to specify the type <fixnum> in our program.  Something like
  (const-cons 1 x)
 where x is in the data flow path tagged as <immutable-fixnum-list>, or
  (const-list 1 2 3)
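 The difference between the two pair layouts can be made concrete
 in C.  The struct and function names below are assumptions; the
 point is only that the generic pair needs four words while the
 specialized one needs two, because the element types live in the
 class of the list and not in each cell.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct ior_type ior_type;   /* opaque class structure */

/* Generic <pair>: a type tag stored next to each field. */
typedef struct generic_pair {
  const ior_type *car_type;
  void *car;
  const ior_type *cdr_type;
  void *cdr;
} generic_pair;

/* <immutable-fixnum-list>: the field types are fixed by the class. */
typedef struct fixnum_list {
  intptr_t car;                   /* always a raw fixnum */
  const struct fixnum_list *cdr;  /* always the same class; NULL = empty */
} fixnum_list;

/* Code on a data path tagged <immutable-fixnum-list> needs no
   type checks while traversing the list. */
static intptr_t fixnum_list_sum (const fixnum_list *l)
{
  intptr_t sum = 0;
  for (; l != NULL; l = l->cdr)
    sum += l->car;
  return sum;
}
```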
 Some further notes:
 * The concepts module and instance are the same thing.  Using other
  modules means 1. creating a new module class which inherits the
  classes of the used modules and 2. instantiating it.
 * Module definitions and class definitions are equivalent but
  different syntactic sugar adapted for each kind of use.
 * (define x 1) means: create an instance variable which is itself a
  subclass of <boxed> with initial value 1 (which is an instance of
  <fixnum>).
 The interpreter is a mixture between a stack machine and a register
 machine.  The evaluator looks like this...  :)
  /* the interpreter! */
  if (!setjmp (ior_context->exit_buf))
 #ifndef i386_GCC
    while (1)
 #endif
      (*ior_continue) (IOR_MICRO_OP_ARGS);
 The branches are represented as an array of pointers to micro
 operations.  In essence, the evaluator doesn't exist in itself, but is
 folded out over the entire implementation.  This allows for an extreme
 form of modularity!
 The i386_GCC is a machine specific optimization which avoids all
 unnecessary popping and pushing of the CPU stack (which is different
 from the Ior data stack).
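 A stripped-down, portable version of such a loop is sketched
 below.  It is not the Ior source: the vm struct, the micro
 operations and run_branch are invented for the illustration, and a
 `running' flag replaces the setjmp/longjmp exit used above.

```c
/* Hypothetical machine state. */
typedef struct vm vm;
typedef void (*micro_op) (vm *);

struct vm {
  const micro_op *pc;  /* "continue register": next micro operation */
  long stack[16];      /* data stack for arguments and results */
  int sp;              /* number of values currently on the stack */
  int running;         /* cleared by op_end to leave the loop */
};

/* Each micro operation does its work and then advances the
   continue register itself. */
static void op_push_40 (vm *m) { m->stack[m->sp++] = 40; m->pc++; }
static void op_push_2 (vm *m) { m->stack[m->sp++] = 2; m->pc++; }

static void op_add (vm *m)
{
  long b = m->stack[--m->sp];
  m->stack[m->sp - 1] += b;
  m->pc++;
}

static void op_end (vm *m) { m->running = 0; }

/* The whole "evaluator": call whatever the continue register
   points at until a micro operation stops the machine. */
static long run_branch (const micro_op *branch)
{
  vm m;
  m.pc = branch;
  m.sp = 0;
  m.running = 1;
  while (m.running)
    (*m.pc) (&m);
  return m.sp > 0 ? m.stack[m.sp - 1] : 0;
}
```

 A branch is then just an array such as
 { op_push_40, op_push_2, op_add, op_end }.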
 The execution environment consists of
 * a continue register similar to the program counter in the CPU
 * a data stack (where micro operation arguments and results are stored)
 * a linked chain of environment frames (but look at exception below!)
 * a dynamic context
 I've written a small baby Ior which uses Guile's infrastructure.
 Here's the context from that baby Ior:
 typedef struct ior_context_t {
  ior_data_t *env;		/* rest of environment frames */
  ior_cont_t save_continue;	/* saves or represents continuation */
  ior_data_t *save_env;		/* saves or represents environment */
  ior_data_t *fluids;		/* array of fluids (use GC_malloc!) */
  int n_fluids;
  int fluids_size;
  /* dynwind chain is stored directly in the environment, not in context */
  jmp_buf exit_buf;
  IOR_SCM guile_protected;	/* temporary */
 } ior_context_t;
 There's an important exception regarding the lowest environment
 frame.  That frame isn't stored in a separate block on the heap, but
 on Ior's data stack.  Frames are copied out onto the heap when
 necessary (for example when closures "escape").
 Now a concrete example:
 Look at:
 (define sum
  (lambda (from to res)
    (if (= from to)
 	res
 	(sum (+ 1 from) to (+ from res)))))
 This can be rewritten into CPS (which captures a lot of what happens
 during flow analysis):
 (define sum
  (lambda (from to res c1)
    (let ((c2 (lambda (limit?)
 		(let ((c3 (lambda ()
 			    (c1 res)))
 		      (c4 (lambda ()
 			    (let ((c5 (lambda (from+1)
 					(let ((c6 (lambda (from+res)
 						    (sum from+1 to from+res c1))))
 					  (_+ from res c6)))))
 			      (_+ 1 from c5)))))
 		  (_if limit? c3 c4)))))
      (_= from to c2))))
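 As a rough C analogue of the CPS form, the continuation c1 can be
 passed as a function pointer.  Because the recursive call passes
 c1 along unchanged, it is a tail call and can be written as a
 loop.  The names cont, identity and sum_cps are assumptions made
 for this sketch.

```c
/* A continuation taking the final result. */
typedef long (*cont) (long);

static long identity (long x)
{
  return x;
}

/* sum in continuation passing style.  Like the Scheme original,
   this does not terminate if `from' starts out greater than `to'. */
static long sum_cps (long from, long to, long res, cont c1)
{
  /* c4: (sum from+1 to from+res c1), written as a loop */
  while (from != to)
    {
      res = from + res;
      from = from + 1;
    }
  /* c3: (c1 res) */
  return c1 (res);
}
```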
 Finally, after branch expansion, some optimization, code generation,
 and some optimization again, we end up with the byte code for the two
 branches (here marked by labels `sum' and `sumbig'):
 c5
 (ref -3)
 (shift -1)
 (+ <fixnum> <fixnum> c4big)
 ;; c4
 (shift -2)
 (+ <fixnum> 1 sumbig)
 ;; c6
 sum
 (shift 3)
 (ref2 -3)
 ;; c2
 (if!= <fixnum> <fixnum> c5)
 ;; c3
 (ref -1)
 ;; c1
 (end)
 c5big
 (ref -3)
 (shift -1)
 (+ <bignum> <bignum>)
 c4big
 (shift -2)
 (+ <bignum> 1)
 ;; c6
 sumbig
 (shift 3)
 (ref2 -3)
 ;; c2
 (= <bignum> <bignum>)
 (if! c5big)
 ;; c3
 (ref -1)
 ;; c1
 (end)
 Let's take a closer look at the (+ <fixnum> 1 sumbig) micro
 operation.  The generated assembler from the Ior C source + machine
 specific optimizations for i386_GCC looks like this (with some rubbish
 deleted):
 ior_int_int_sum_intbig:
 	movl 4(%ebx),%eax	; fetch arg 2
 	addl (%ebx),%eax        ; fetch arg 1 and do the work!
 	jo ior_big_sum_int_int  ; dispatch to other branch on overflow
 	movl %eax,(%ebx)	; store result in first environment frame
 	addl $8,%esi		; increment program counter
 	jmp (%esi)		; execute next opcode
 ior_big_sum_int_int:
 To clarify: This is output from the C compiler.  I added the comments
 afterwards.
 The source currently looks like this:
 IOR_MICRO_BRANCH_2_2 ("+", int, big, sum, int, int, 1, 0)
 {
  int res = IOR_ARG (int, 0) + IOR_ARG (int, 1);
  IOR_JUMP_OVERFLOW (res, ior_big_sum_int_int);
  IOR_NEXT2 (z);
 }
 where the macros allow for different definitions depending on if we
 want to play pure ANSI or optimize for a certain machine/compiler.
 The plan is actually to write all source in the Ior language and write
 Ior code to translate the core code into bootstrapping C code.
 Please note that if i386_GCC isn't defined, we run plain portable ANSI C.
 Just one further note:
 In Ior, there are three modes of evaluation
 1. evaluating and type analyzing (these go in parallel)
 2. code generation
 3. executing byte codes
 It is mode 3 which is really fast in Ior.
 You can look upon your program as a web of branch segments where one
 branch segment can be generated from fragments of many closures.  Mode
 switches don't occur at the procedure borders, but at "growth
 points".  I don't have time to define them here, but they are based
 upon the idea that the continuation together with the type signature
 of the data flow path is unique.
 We normally run in mode 3.  When we come to a source growth point
 (essentially an apply instruction) for uncompiled code we "dive out"
 of mode 3 into mode 1 which starts to eval/analyze code until we come
 to a "sink".  When we reach the "sink", we have enough information
 about the data path to do code generation, so we backtrack to the
 source growth point and grow the branch between source and sink.
 Finally, we "dive into" mode 3!
 So, code generation doesn't respect procedure borders.  We instead get
 a very neat kind of inlining, which, e.g., means that it is OK to use
 closures instead of macros in many cases.
 ----------------------------------------------------------------------
 Ior and module system
 =====================
 What, exactly, should the module system of Ior look like?
 There is this general issue of whether to have a single-dispatch or
 multi-dispatch system.  Personally, I see that Scheme already uses
 multi-dispatch.  Compare (+ 1.0 2) and (+ 1 2.0).
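 That `+' dispatches on both arguments can be illustrated with a
 two-dimensional table indexed by the types of the operands.  The
 sketch below is an assumption-laden illustration (invented names,
 only fixnums and flonums), not how any particular Scheme
 implements `+'.

```c
typedef enum { T_FIXNUM, T_FLONUM, N_TYPES } num_type;

typedef struct number {
  num_type type;
  union { long i; double d; } u;
} number;

static number make_fixnum (long i)
{
  number n;
  n.type = T_FIXNUM;
  n.u.i = i;
  return n;
}

static number make_flonum (double d)
{
  number n;
  n.type = T_FLONUM;
  n.u.d = d;
  return n;
}

/* One method per combination of argument types. */
static number add_ii (number a, number b) { return make_fixnum (a.u.i + b.u.i); }
static number add_id (number a, number b) { return make_flonum ((double) a.u.i + b.u.d); }
static number add_di (number a, number b) { return make_flonum (a.u.d + (double) b.u.i); }
static number add_dd (number a, number b) { return make_flonum (a.u.d + b.u.d); }

/* Rows: type of the first argument; columns: type of the second. */
static number (*const add_table[N_TYPES][N_TYPES]) (number, number) = {
  { add_ii, add_id },
  { add_di, add_dd }
};

/* The generic `+': the method is selected on *both* arguments. */
static number generic_add (number a, number b)
{
  return add_table[a.type][b.type] (a, b);
}
```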
 As you've seen if you've read the notes about Ior design, efficiency
 is not an issue here, since almost all dispatch will be eliminated
 anyway.
 Also, note an interesting thing: GOOPS actually has a special,
 implicit, argument to all of its methods: the lexical environment.
 It would be very ugly to add a second, special, argument to this.
 Of course, the theoreticians have already recognised this, and in many
 systems, the implicit argument (the object) and the environment for
 the method is the same thing.
 I think we should especially take impressions from Matthias Blume's
 module/object system.
 The idea, now, for Ior (remember that everything about Ior is
 negotiable between us) is that a module is a type, as well as an
 instance of that type.  The idea is that we basically keep the GOOPS
 style of methods, with the implicit argument being the module object
 (or some other lexical environment, in a chain with the module as
 root).
 Let's say now that module C uses modules A and B.  Modules A and B
 both export the procedure `foo'.  But A:foo and B:foo have different
 sets of methods.
 What does this mean?  Well, it obviously means that the procedure
 `foo' in module C is a subtype of A:foo and B:foo.  Note how this is
 similar in structure to slot inheritance: When class C is created with
 superclasses A and B, the properties of a slot in C are created
 through slot inheritance.  One way of interpreting variable foo in
 module A is as a slot with init value foo.  Through the MOP, we can
 specify that procedure slot inheritance in a module class implies
 creation of new init values through inheritance.
 This may look like a kludge, and perhaps it is, and, sure, we are not
 going to accept any kludges in Ior.  But, it might actually not be a
 kludge...
 I think it is commonly accepted by computer scientists that a module,
 and/or at least a module interface is a type.  Again, this type can be
 seen as the set of types of the functions in the interface.  The types
 of our procedures are the set of branch types they provide.  It is then
 natural that a module using two other modules creates new procedure
 types by folding.
 This thing would become less cloudy (yes, this is a cloudy part of my
 reasoning; I meant previously that the interpreter itself is now
 clear) if module interfaces were required to be explicitly typed.
 Actually, this would fit much better together with the rest of Ior's
 design.  On one hand, we might be free to introduce such a restriction
 (compiler writers would applaud it), since R5RS hasn't specified any
 module system.  On the other hand, it might be strange to require
 explicit typing when Scheme is fundamentally implicitly typed...
 We also have to consider that a module has an "inward" face, which is
 one type, and possibly many "outward" faces, which are different
 types.  (Compare the idea of "interfaces" in Scheme48.)
 It thus seems that, while a module can truly be an Ior class, the
 reverse should probably not hold in the general case...
 Unless
  instance		<-> module proper
  class of the instance <-> "inward interface"
  superclasses		<-> "outward interfaces + inward uses"
 ...hmm, is this possible to reconcile with Rees' object system?
 Please think about these issues.  We should try to end up with a
 beautiful and consistent object/module system.
 ----------------------------------------------------------------------
 Here's a difficult problem in Ior's design:
 Let's say that we have a mutable data structure, like an ordinary
 list.  Since, in Ior, the type tag (which is really a pointer to a
 class structure) is stored separately from the data, it is thinkable
 that another thread modifies the location in the list between when our
 thread reads the type tag and when it reads the data.
 The reading of type and data must be made atomic in some way.
 Probably, some kind of locking of the heap is required.  It's just
 that it may cause a lot of overhead to lock the heap at every *read*
 from a mutable data structure.
 Look how much trouble those set!-operations cause!  Not only does it
 force us to store type tags for each car and cdr in the list, but it
 also forces a lot of explicit dispatch to be done, and causes troubles
 in a threaded system...
 ----------------------------------------------------------------------
 Jim Blandy <jimb@red-bean.com> writes:
 > We also should try to make less work for the GC, by avoiding consing
 > up local environments until they're closed over.
 Did the texts which I sent to you talk about Ior's solution?
 It basically is: Use *two* environment "arguments" to the evaluator
 (in Ior, they aren't arguments but registers):
 * One argument is a pointer to the "top" of an environment stack.
  This is used in the "inner loop" for very efficient access to
  in-between results.  The "top" segment of the environment stack is
  also regarded as the first environment frame in the lexical
  environment.  ("top" is bottom on a stack which grows downwards)
 * The other argument points to a structure holding the evaluation
  context.  In this context, there is a pointer to the chain of the
  rest of the environment frames.  Note that since frames are just
  blocks of SCM values, you can very efficiently "release" a frame
  into the heap by block copying it (remember that Ior uses Boehm's GC;
  this is how we allocate the block).
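 Releasing the top frame into the heap by block copying might look
 like the following sketch.  The names value, heap_frame and
 release_frame are hypothetical, `value' stands in for SCM, and
 plain malloc is used where Ior would allocate through the
 collector.

```c
#include <stdlib.h>
#include <string.h>

typedef long value;   /* stand-in for an SCM value */

/* A frame that has been moved off the environment stack. */
typedef struct heap_frame {
  struct heap_frame *next;  /* rest of the environment frames */
  size_t n_slots;
  value slots[];            /* the copied block of values */
} heap_frame;

/* Copy the n_slots values at the top of the environment stack into
   a fresh heap block and link it in front of `rest'.  Returns NULL
   if the allocation fails. */
static heap_frame *release_frame (const value *stack_top, size_t n_slots,
                                  heap_frame *rest)
{
  /* malloc here; a GC allocation call in a collected system */
  heap_frame *f = malloc (sizeof *f + n_slots * sizeof (value));
  if (f == NULL)
    return NULL;
  f->next = rest;
  f->n_slots = n_slots;
  memcpy (f->slots, stack_top, n_slots * sizeof (value));
  return f;
}
```

 After the copy, the stack segment can be reused freely: a closure
 holding the heap frame no longer depends on it.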