mirror of
https://git.savannah.gnu.org/git/guile.git
synced 2025-06-03 18:50:19 +02:00
bye bye
parent
a7dc0db49a
commit
8660251f7d
29 changed files with 0 additions and 1467 deletions
|
@ -1,295 +0,0 @@
|
|||
@c devel/modules/desgin-notes.texi
|
||||
|
||||
@c TODO
|
||||
@c - distill wishlist, index
|
||||
@c - in Findings, characterize current module system wrt wishlist
|
||||
|
||||
@node Module System Design Notes
|
||||
@chapter Module System Design Notes
|
||||
|
||||
This chapter documents module system design history. At the moment
|
||||
(guile-1.5.4, 2002-02-08), the module system is supposedly undergoing
|
||||
redesign; the provisional implementation works but has problems, notably
|
||||
making compilation difficult.
|
||||
|
||||
Because module system design is (was?) an area of active research and
|
||||
development in the Scheme community, many different features are possible.
|
||||
This section is a historical record of the selection process used by the Guile
|
||||
hackers (if one can be discerned).
|
||||
|
||||
@menu
|
||||
* Wishlist::
|
||||
* Findings::
|
||||
* Selection Criteria::
|
||||
* Rationale Statements::
|
||||
* Specification::
|
||||
@end menu
|
||||
|
||||
@node Wishlist
|
||||
@subsection Wishlist
|
||||
|
||||
In the guile-related mailing lists of yore, discussion resulted in the
|
||||
following desirable traits. Note that some of these contradict each other.
|
||||
|
||||
@itemize @bullet
|
||||
|
||||
@item
|
||||
support separate compilation
|
||||
|
||||
@item
|
||||
hierarchical module names
|
||||
|
||||
@item
|
||||
support relative references within the module name space (so that a
|
||||
module within a package can use a sibling module without knowing the
|
||||
prefix of the module name)
|
||||
|
||||
@item
|
||||
support re-use of code (the same implementation can be presented to
|
||||
the world through several interfaces)
|
||||
|
||||
@item
|
||||
support individual and group renaming of bindings when using other
|
||||
modules
|
||||
|
||||
@item
|
||||
easy to import and re-export entire interfaces (so that a main
|
||||
interface in a package can function as a "relay" and publish
|
||||
bindings from many modules in the package)
|
||||
|
||||
@item
|
||||
support both variable and syntactic bindings (these should be
|
||||
clearly separated, though) and mesh well with hygienic macro
|
||||
systems
|
||||
|
||||
@item
|
||||
hygiene implies that we shouldn't need to export bindings
that an exported macro introduces at the point of its use
|
||||
|
||||
@item
|
||||
autoloading
|
||||
|
||||
@item
|
||||
unmemoization protocol
|
||||
|
||||
@item
|
||||
cleanliness
|
||||
|
||||
A module should be able to be totally clean. There should be no
|
||||
need to have *any* extra bindings in a module (a la
|
||||
%module-interface or `define-module').
|
||||
|
||||
Therefore, we should have at least one dedicated "command" or
|
||||
"config" or "repl" module.
|
||||
|
||||
It would probably be a good idea to follow other good Scheme
|
||||
interpreters' lead and introduce the ,<command> syntax for walking
|
||||
around modules, inspecting things, entering the debugger, etc.
|
||||
Such commands can be looked up in this repl module.
|
||||
|
||||
If we insist on not using ,<command> syntax, we are forced to let
|
||||
the module use list consist of a "sticky" part and the rest, where
|
||||
the "sticky" part is only available at the repl prompt and not to
|
||||
the code within the module, and which follows us when we walk around
|
||||
in the system.
|
||||
|
||||
@item
|
||||
well integrated with the translator framework
|
||||
|
||||
We should be able to say that a module uses a different syntax or
|
||||
language.
|
||||
|
||||
Note here the nice config language of the Scheme48 module system
|
||||
where it is possible to separate code from module specification: the
|
||||
module specification can use scheme syntax and rest in one file,
|
||||
while the module itself can use the syntax of another language.
|
||||
|
||||
This config language also supports re-use of code in a neat way.
|
||||
|
||||
@item
|
||||
examine connection with object system: how easy is it to support
|
||||
Java and other class-centered object systems?
|
||||
|
||||
@item
|
||||
easy to export the same module under several different names
|
||||
|
||||
@item
|
||||
easily supports both compiled and interpreted modules
|
||||
|
||||
@item
|
||||
compiled modules can be dynamically loaded or statically linked in
|
||||
(libltdl might provide this automatically)
|
||||
|
||||
@item
|
||||
convenient syntax for referencing bindings in modules that are
|
||||
loaded but not used
|
||||
|
||||
(Assuming this distinction exists.) But maybe group renaming is a better
|
||||
solution to a similar class of problems.
|
||||
|
||||
@item
|
||||
ability to unuse a module (and have it get collected when there are
|
||||
no more references)
|
||||
|
||||
@item
|
||||
orthogonality between source files, directories and modules, i.e. the ability to
|
||||
have multiple modules in one source file and/or multiple source files in one
|
||||
module
|
||||
|
||||
@item
|
||||
backward compatibility
|
||||
|
||||
@item
|
||||
whenever possible the module's meta-information should be stored
|
||||
within the module itself (only possible for scheme files)
|
||||
|
||||
@item
|
||||
the compiler stores the meta-information into a single file and updates it
|
||||
accordingly
|
||||
|
||||
(FIXME: per module, per package, directory?, per project?) This
|
||||
meta-information should still be human readable (Sun's EJB uses XML for its
|
||||
deployment descriptors).
|
||||
|
||||
@item
|
||||
use the file system as module repository
|
||||
|
||||
Since guile is a GNU project we can expect a GNU (or Unix) environment. That
|
||||
means that we should use the file system as the module repository. (This
|
||||
would help us avoid modules pointing to files which are pointing to other
|
||||
files telling the user "hey, that's me (the module) over there".)
|
||||
|
||||
@item
|
||||
every module has exactly @emph{one} owner who is responsible for the
|
||||
module @emph{and} its interface (this contradicts the "more than one
|
||||
interface" concept)
|
||||
|
||||
@item
|
||||
support module collections
|
||||
|
||||
Provide "packages" with a package scope for people working together on a
|
||||
project. In some sense a module is a view on the underlying package.
|
||||
|
||||
@item
|
||||
ability to request (i.e. import or access) complete packages
|
||||
|
||||
@item
|
||||
support module "generations" (or "versions")
|
||||
|
||||
Whenever a new module fails to handle a request (due to an error) it will be
|
||||
replaced by the old version.
|
||||
|
||||
@item
|
||||
help the user to handle exceptions (note: exceptions
|
||||
are not errors, see above)
|
||||
|
||||
@item
|
||||
no special configuration language (like @code{,in} etc.)
|
||||
|
||||
You can always press Control-D to terminate the module's repl and return to
|
||||
the config module.
|
||||
|
||||
@item
|
||||
both C and Scheme level interfaces
|
||||
|
||||
@item
|
||||
programming interface to module system primitives
|
||||
|
||||
One should be able to make customized module systems from the low-level
|
||||
interface, as an alternative to using the default module system. The
|
||||
default module system should use the same low-level procedures.
|
||||
|
||||
@item
|
||||
Here are some features Keisuke Nishida desires to support his VM/compiler
|
||||
[snarfed directly from post <m37l33z0cl.wl@kei.cwru.edu> dated 2001-02-06,
|
||||
and requires formatting]:
|
||||
|
||||
* There is no "current module".
|
||||
|
||||
* Module variables are globally identified by an identifier
|
||||
like "system::core::car".
|
||||
|
||||
* Bindings are solved syntactically, either by a translator
|
||||
or a compiler. If you write just "car", it is expanded to
|
||||
"system::core::car" by a translator or compiler, depending
|
||||
on what modules you use.
|
||||
|
||||
* An interpreter (repl) may memorize the "current working module".
|
||||
It tells the translator or the compiler what modules should be
|
||||
used to identify a variable. So, during interactive sessions,
|
||||
a user may feel as if there *is* the current module.
|
||||
|
||||
* But the fact is, all variables are globally identified at
|
||||
syntax level. Therefore, the compiler can identify all
|
||||
variables at compile time. This means the following code
|
||||
is not allowed:
|
||||
|
||||
  ;; I want to access `foo' in the module named some-module-name
  (let ((module (lookup-module some-module-name)))
    (set! (current-module) module)
    (foo x))
  -> ERROR: Unbound variable: current-module

  (let ((module (lookup-module some-module-name)))
    (module::foo x))
  -> ERROR: There is no variable named "module::foo"

Instead, you should write as follows if you need a dynamic access:

  (let ((module (lookup-module some-module-name)))
    ((module-ref module 'foo) x))

  (let ((module (lookup-module some-module-name)))
    ((module 'foo) x)) ;; if module is an applicable smob
|
||||
|
||||
@end itemize
|
||||
|
||||
@c $Date: 2002-02-08 10:50:36 $
|
||||
@node Findings
|
||||
@subsection Findings
|
||||
|
||||
This section briefly describes module system truths that are one step more
|
||||
detailed than "module systems are hairy". These truths are not self-evident;
|
||||
we rely on research, active experimentation and lots of discussion. The goal
|
||||
of this section is to save ourselves from rehashing that which was hashed
|
||||
previously.
|
||||
|
||||
@itemize @bullet
|
||||
|
||||
@item Kent Dybvig's module system
|
||||
|
||||
A paper is available at
|
||||
@uref{http://www.cs.indiana.edu/~dyb/papers/popl99.ps.gz,
|
||||
http://www.cs.indiana.edu/~dyb/papers/popl99.ps.gz}.
|
||||
|
||||
This was discussed in 2000-11 and 2000-12.
|
||||
|
||||
@item Distinction between Top-Level Environment and Module
|
||||
|
||||
These two are different beasts! Each of the following needs to be
|
||||
well-defined with respect to both of these concepts: @code{eval},
|
||||
@code{define}, @code{define-public}, @code{define-module}, @code{load},
|
||||
working from REPL, top-level @code{begin}, [add here].
|
||||
|
||||
In guile-1.4, the distinction was not clear.
|
||||
|
||||
@item Current module system internals
|
||||
|
||||
@xref{Top,Module Internals,,module-snippets}, for implementation
|
||||
details of the module system up to and including guile-1.6.x.
|
||||
|
||||
@item [add here]
|
||||
|
||||
@end itemize
|
||||
|
||||
@node Selection Criteria
|
||||
@subsection Selection Criteria
|
||||
|
||||
@node Rationale Statements
|
||||
@subsection Rationale Statements
|
||||
|
||||
@node Specification
|
||||
@subsection Specification
|
||||
|
||||
|
||||
@c devel/modules/desgin-notes.texi ends here
|
|
@ -1,288 +0,0 @@
|
|||
Module Layout Proposal
|
||||
======================
|
||||
|
||||
Martin Grabmueller
|
||||
<mgrabmue@cs.tu-berlin.de>
|
||||
Draft: 2001-03-11
|
||||
|
||||
Version: $Id: module-layout.text,v 1.1 2001-03-16 08:37:37 mgrabmue Exp $
|
||||
|
||||
* Table of contents
|
||||
|
||||
** Abstract
|
||||
** Overview
|
||||
*** What do we have now?
|
||||
*** What should we change?
|
||||
** Policy of module separation
|
||||
*** Functionality
|
||||
*** Standards
|
||||
*** Importance
|
||||
*** Compatibility
|
||||
** Module naming
|
||||
*** Scheme
|
||||
*** Object oriented programming
|
||||
*** Systems programming
|
||||
*** Database programming
|
||||
*** Text processing
|
||||
*** Math programming
|
||||
*** Network programming
|
||||
*** Graphics
|
||||
*** GTK+ programming
|
||||
*** X programming
|
||||
*** Games
|
||||
*** Multiple names
|
||||
*** Application modules
|
||||
** Future ideas
|
||||
|
||||
|
||||
* Abstract
|
||||
|
||||
This is a proposal for a new layout of the module name space. The
|
||||
goal is to reduce (or even eliminate) the clutter in the current ice-9
|
||||
module directory, and to provide a clean framework for splitting
|
||||
libguile into subsystems, grouped by functionality, standards
|
||||
compliance and maybe other characteristics.
|
||||
|
||||
This is not a completed policy document, but rather a collection of
|
||||
ideas and proposals which still have to be decided. I will mention my
|
||||
personal preference, where appropriate, but the final decisions are of
|
||||
course up to the maintainers.
|
||||
|
||||
|
||||
* Overview
|
||||
|
||||
Currently, new modules are added in an ad-hoc manner to the ice-9
|
||||
module name space when the need for them arises. I think that was
|
||||
mainly because no other directory for installed Scheme modules was
|
||||
created. With the integration of GOOPS, the new top-level module
|
||||
directory oop was introduced, and we should follow this practice for
|
||||
other subsystems which share functionality.
|
||||
|
||||
DISCLAIMER: Please note that I am no expert on Guile's module system,
|
||||
so be patient with me and correct me where I got anything wrong.
|
||||
|
||||
** What do we have now?
|
||||
|
||||
The module (oop goops) contains all functionality needed for
|
||||
object-oriented programming with Guile (with a few exceptions in the
|
||||
evaluator, which are clearly needed for performance).
|
||||
|
||||
Except for the procedures in the module (ice-9 rdelim), all Guile
|
||||
primitives are currently located in the root module (I think it is the
|
||||
module (guile)), and some procedures defined in `boot-9.scm' are
|
||||
installed in the module (guile-user).
|
||||
|
||||
** What should we change?
|
||||
|
||||
In the core, there are a lot of primitive procedures which can cleanly
|
||||
be grouped into subsystems, and then grouped into modules. That would
|
||||
make the core library more maintainable, would ease separate testing
|
||||
of subsystems and clean up dependencies between subsystems.
|
||||
|
||||
|
||||
* Policy of module separation
|
||||
|
||||
There are several possibilities to group procedures into modules.
|
||||
|
||||
- They could be grouped by functionality.
|
||||
- They could be grouped by standards compliance.
|
||||
- They could be grouped by level of importance.
|
||||
|
||||
One important group of modules should of course be provided
|
||||
additionally:
|
||||
|
||||
- Compatibility modules.
|
||||
|
||||
So the first thing to decide is: Which of these policies should we
|
||||
adopt? Personally, I think that it is not possible to cleanly use
|
||||
exactly one of the possibilities; we will probably use a mixture of
|
||||
them. I propose to group by functionality, and maybe use some
|
||||
`bridge-modules', which make functionality available when the user
|
||||
requests the modules for a given standard.
|
||||
|
||||
** Functionality
|
||||
|
||||
Candidates for the first possibility are groups of procedures, which
|
||||
already are grouped in source files, such as
|
||||
|
||||
- Regular expression procedures.
|
||||
- Network procedures.
|
||||
- Systems programming procedures.
|
||||
- Random number procedures.
|
||||
- Math/numeric procedures.
|
||||
- String-processing procedures.
|
||||
- List-processing procedures.
|
||||
- Character handling procedures.
|
||||
- Object-oriented programming support.
|
||||
|
||||
** Standards
|
||||
|
||||
Guile now complies with R5RS, and I think that the procedures required
by this standard should always be available to the programmer.
People who do not want them can always create :pure modules when
they need to.
|
||||
|
||||
On the other hand, the SRFI procedures fit nicely into a `group by
|
||||
standards' scheme. An example which is already provided is the
|
||||
SRFI-8 syntax `receive'. Following that, we could provide two modules
|
||||
for each SRFI, one named after the SRFI (like `srfi-8') and one named
|
||||
after the main functionality (`receive').
|
||||
|
||||
** Importance
|
||||
|
||||
By importance, I mean `how important are procedures for the average
|
||||
Guile user'. That means that procedures which are only useful to a
|
||||
small group of users (the Guile developers, for example) should not be
|
||||
immediately available at the REPL, so that they do not confuse the user
when they appear in the `apropos' output or the tab completion.
|
||||
|
||||
A good example would be debugging procedures (which also could be
|
||||
added with a special command-line option), or low-level system calls.
|
||||
|
||||
** Compatibility
|
||||
|
||||
This group is for modules providing compatibility procedures. An
|
||||
example would be a module for old string-processing procedures, which
|
||||
could someday get overridden by incompatible SRFI procedures of the
|
||||
same name.
|
||||
|
||||
|
||||
* Module naming
|
||||
|
||||
Provided we choose to take the `group by functionality' approach, I
|
||||
propose the following naming hierarchy (some of them were actually
|
||||
suggested by Mikael Djurfeldt).
|
||||
|
||||
- Scheme language related in (scheme)
|
||||
- Object oriented programming in (oop)
|
||||
- Systems programming in (system)
|
||||
- Database programming in (database)
|
||||
- Text processing in (text)
|
||||
- Math/numeric programming in (math)
|
||||
- Network programming in (network)
|
||||
- Graphics programming in (graphics)
|
||||
- GTK+ programming in (gtk)
|
||||
- X programming in (xlib)
|
||||
- Games in (games)
|
||||
|
||||
The layout of sub-hierarchies is up to the writers of modules; we
|
||||
should not enforce a strict policy here, because we can not imagine
|
||||
what will happen in this area.
|
||||
|
||||
** Scheme
|
||||
|
||||
(scheme r5rs) Complete R5RS procedures set.
|
||||
(scheme safe) Safe modules.
|
||||
(scheme srfi-1) List processing.
|
||||
(scheme srfi-8) Multiple values via `receive'.
|
||||
(scheme receive) ditto.
|
||||
(scheme and-let-star) and-let*
|
||||
(scheme syncase) syntax-case hygienic macros (maybe included in
(scheme r5rs)?).
|
||||
(scheme slib) SLIB, for historic reasons in (scheme).
|
||||
|
||||
** Object oriented programming
|
||||
|
||||
Examples in this section are
|
||||
(oop goops) For GOOPS.
|
||||
(oop goops ...) For lower-level GOOPS functionality and utilities.
|
||||
|
||||
** Systems programming
|
||||
|
||||
(system shell) Shell utilities (glob, system etc).
|
||||
(system process) Process handling.
|
||||
(system file-system) Low-level filesystem support.
|
||||
(system user) getuid, setpgrp, etc.
|
||||
|
||||
_or_
|
||||
|
||||
(system posix) All posix procedures.
|
||||
|
||||
** Database programming
|
||||
|
||||
In the database section, there should be sub-module hierarchies for
each supported database which contain the low-level code, and a
common database layer, which should unify access to SQL databases via
a single interface a la Perl's DBI.
|
||||
|
||||
(database postgres ...) Low-level database functionality.
|
||||
(database oracle ...) ...
|
||||
(database mysql ...) ...
|
||||
(database msql ...) ...
|
||||
(database sql) Common SQL accessors.
|
||||
(database gdbm ...) ...
|
||||
(database hashed) Common hashed database accessors (like gdbm).
|
||||
(database util) Leftovers.
|
||||
|
||||
** Text processing
|
||||
|
||||
(text rdelim) Line oriented in-/output.
|
||||
(text util) Mangling text files.
|
||||
|
||||
** Math programming
|
||||
|
||||
(math random) Random numbers.
|
||||
(math primes) Prime numbers.
|
||||
(math vector) Vector math.
|
||||
(math algebra) Algebra.
|
||||
(math analysis) Analysis.
|
||||
(math util) Leftovers.
|
||||
|
||||
** Network programming
|
||||
|
||||
(network inet) Internet procedures.
|
||||
(network socket) Socket interface.
|
||||
(network db) Network database accessors.
|
||||
(network util) ntohl, htons and friends.
|
||||
|
||||
** Graphics
|
||||
|
||||
(graphics vector) Generalized vector graphics handling.
|
||||
(graphics vector vrml) VRML parsers etc.
|
||||
(graphics bitmap) Generalized bitmap handling.
|
||||
(graphics bitmap ...) Bitmap format handling (TIFF, PNG, etc.).
|
||||
|
||||
** GTK+ programming
|
||||
|
||||
(gtk gtk) GTK+ procedures.
|
||||
(gtk gdk) GDK procedures.
|
||||
(gtk threads) gtkthreads.
|
||||
|
||||
** X programming
|
||||
|
||||
(xlib xlib) Low-level XLib programming.
|
||||
|
||||
** Games
|
||||
|
||||
(games robots) GNU robots.
|
||||
|
||||
** Multiple names
|
||||
|
||||
As already mentioned above, I think that some modules should have
|
||||
several names, to make it easier for the user to get the functionality
|
||||
she needs. For example, a user could say: `hey, I need the receive
|
||||
macro', or she could say: `I want to stick to SRFI syntax, so where
|
||||
the hell is the module for SRFI-8?!?'.
|
||||
|
||||
** Application modules
|
||||
|
||||
We should not enforce policy on applications. So I propose that
|
||||
application writers should be advised to place modules either in
|
||||
application-specific directories $PREFIX/share/$APP/guile/... and name
|
||||
them however they like, or to use the application's name as the first
part of the module name, e.g. (gnucash import), (scwm background),
|
||||
(rcalc ui).
|
||||
|
||||
* Future ideas
|
||||
|
||||
I have not yet come up with a good idea for grouping modules, which
|
||||
deal for example with XML processing. They would fit into the (text)
|
||||
module space, because most XML files contain text data, but they would
|
||||
also fit into (database), because XML files are essentially databases.
|
||||
|
||||
On the other hand, XML processing is such a large field that it
|
||||
probably is worth its own top-level name space (xml).
|
||||
|
||||
|
||||
Local Variables:
|
||||
mode: outline
|
||||
End:
|
|
@ -1,149 +0,0 @@
|
|||
* Intro / Index (last modified: $Date: 2002-02-28 05:09:19 $)
|
||||
|
||||
This working document explains the design of the libguile API,
|
||||
specifically the interface to the C programming language.
|
||||
Note that this is NOT an API reference manual.
|
||||
|
||||
- Motivation
|
||||
- History
|
||||
- Decisions
|
||||
- gh_ removal
|
||||
- malloc policy
|
||||
- [add here]
|
||||
|
||||
|
||||
* Motivation
|
||||
|
||||
The file goals.text says:
|
||||
|
||||
Guile's primary aim is to provide a good extension language
|
||||
which is easy to add to an application written in C for the GNU
|
||||
system. This means that it must export the features of a higher
|
||||
level language in a way that makes it easy not to break them
|
||||
from C code.
|
||||
|
||||
Although it may no longer be a "primary aim", creating a stable API is
|
||||
important anyway since without something defined, people will take libguile
|
||||
and use it in ad-hoc ways that may cause them trouble later.
|
||||
|
||||
|
||||
* History
|
||||
|
||||
The initial (in ttn's memory, which only goes back to guile-1.3.4) stab at an
|
||||
API was known as the "gh_ interface", which provided "high-level" abstraction
|
||||
to libguile w/ the premise of supporting multiple implementations at some
|
||||
future date. In practice this approach resulted in many gh_* objects being
|
||||
very slight wrappers for the underlying scm_* objects, so eventually this
|
||||
maintenance burden outweighed the (as yet unrealized) hope for alternate
|
||||
implementations, and the gh_ interface was deprecated starting in guile-1.5.x.
|
||||
|
||||
Starting w/ guile-1.7.x, in concurrence w/ an effort to make libguile
|
||||
available to usloth windows platforms, the naked library was once again
|
||||
dressed w/ the "SCM_API interface".
|
||||
|
||||
Here is a table of versions (! means planned):
|
||||
|
||||
  guile   libguile  readline  qthreads  srfi-4  -13-14
  -----------------------------------------------------
  1.3.4    6.0.0     0.0.0     0.0.0    -       -
  1.4      9.0.0     0.0.0     0.0.0    -       -
  1.4.1   10.0.0     TBD      15.0.0    -       -       !
  1.6.x   15.0.0    10.0.0    15.0.0    1.0.0   1.0.0   !
|
||||
|
||||
Note: These are libtool-style versions: CURRENT:REVISION:AGE
|
||||
|
||||
|
||||
* Decisions
|
||||
|
||||
** gh_ removal
|
||||
|
||||
At some point, we need to remove gh_ entirely: guile-X.Y.Z.
|
||||
|
||||
** malloc policy
|
||||
|
||||
Here's a proposal by ela:
|
||||
|
||||
I would like to propose the following gh_() equivalents:
|
||||
|
||||
gh_scm2newstr() -> scm_string2str() in string.[ch]
|
||||
gh_symbol2newstr() -> scm_symbol2str() in symbol.[ch]
|
||||
|
||||
Both taking the (SCM obj, char *context) and allocating memory via
|
||||
scm_must_malloc(). Thus the user can safely free the returned strings
|
||||
with scm_must_free(). The latter feature would be an improvement to the
|
||||
previous gh_() interface functions, for which the user was told to free()
|
||||
them directly. This caused problems when libguile.so used libc malloc()
|
||||
and the calling application used its own standard free(), which might not
|
||||
be libc free().
|
||||
|
||||
It seems to address the general question of: How should client programs use
|
||||
malloc with respect to libguile? Some specific questions:
|
||||
|
||||
* Do you like the names of the functions? Maybe they should be named
|
||||
scm_c_*() instead of scm_*().
|
||||
* Do they make sense?
|
||||
* Should we provide something like scm_c_free() for pointers returned by
|
||||
this kind of function?
|
||||
|
||||
The first proposal regarding a malloc policy has been rejected for the
|
||||
following reasons:
|
||||
|
||||
That would mean, users of guile - even on non M$ systems - would have to
|
||||
keep track where their memory came from?
|
||||
Assume there are users which have some kind of hash table where they store
|
||||
strings in. The hash table is responsible for removing entries from the
|
||||
table. Now, if you want to put strings from guile as well as other
|
||||
strings into that table you would have to store a pointer to the
|
||||
corresponding version of 'free' with every string? We should demand such
|
||||
coding from all guile users?
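
To make the objection concrete, here is a minimal C sketch of the bookkeeping it describes. All names (entry_t, entry_make, entry_release, guile_side_free) are hypothetical and not part of libguile; guile_side_free merely stands in for whatever deallocator the library would require.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical table entry: every string must remember how to free itself. */
typedef struct entry {
  char *str;
  void (*release) (void *);   /* matching deallocator for str */
} entry_t;

/* Counts calls, so the dispatch can be observed. */
static int guile_side_frees = 0;

/* Stand-in for a library-specific deallocator (an assumption, not a real
   libguile function). */
static void
guile_side_free (void *p)
{
  guile_side_frees++;
  free (p);
}

/* Copy TEXT with malloc so the sketch stays self-contained. */
static char *
dup_string (const char *text)
{
  size_t n = strlen (text) + 1;
  char *copy = malloc (n);
  assert (copy != NULL);
  memcpy (copy, text, n);
  return copy;
}

static entry_t
entry_make (char *str, void (*release) (void *))
{
  entry_t e = { str, release };
  return e;
}

/* The table can no longer call free() blindly; it has to dispatch. */
static void
entry_release (entry_t *e)
{
  e->release (e->str);
  e->str = NULL;
}
```

Each entry carries one extra pointer, and every insertion site has to know where its string came from, which is the burden the objection refers to.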
|
||||
|
||||
The proposal itself read: For a clean memory interface of a client program
|
||||
to libguile we use the following functions from libguile:
|
||||
|
||||
* scm_c_malloc -- should be used to allocate memory returned by some
|
||||
of the SCM to C converter functions in libguile if the
|
||||
client program does not supply memory
|
||||
* scm_c_free -- must be used by the client program to free the memory
|
||||
returned by the SCM to C converter functions in
|
||||
libguile if the client program did not supply a buffer
|
||||
* scm_c_realloc -- to be complete, do not know a real purpose yet
|
||||
|
||||
|
||||
Yet another proposal regarding this problem reads as follows: We could make
|
||||
life easier, if we supplied the following:
|
||||
|
||||
  [in gc.h]
  typedef void * (* scm_t_malloc_func) (size_t);
  typedef void (* scm_t_free_func) (void *);
  SCM_API scm_t_malloc_func scm_c_malloc;
  SCM_API scm_t_free_func scm_c_free;

  [in gc.c]
  {
    /* At some library initialization point. */
    scm_c_malloc = malloc;
    scm_c_free = free;
  }
|
||||
|
||||
Then the SCM to C converters allocating memory to store their results use
|
||||
scm_c_malloc() instead of simply malloc(). This way all libguile/Unix users
|
||||
can stick to the previous free() policy, saying that you need to free()
|
||||
pointers delivered by libguile. On the other hand M$-Windows users can pass
|
||||
their own malloc()-function-pointer to the library and use their own free()
|
||||
then. Basically this can be achieved in the following order:
|
||||
|
||||
  {
    char *str;
    scm_boot_guile (...);
    scm_c_malloc = malloc;
    str = scm_c_string2str (obj, NULL, NULL);
    free (str);
  }
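
The hook arrangement can be tried out without libguile. The following is only a sketch under stated assumptions: demo_c_malloc, demo_c_free, demo_string2str and demo_release are hypothetical stand-ins for the proposed scm_c_malloc, scm_c_free and converter functions, and the counting allocator plays the role of a client that brings its own malloc/free pair.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef void *(*demo_malloc_func) (size_t);
typedef void (*demo_free_func) (void *);

/* Global hooks, defaulting to the C library allocator.  Hypothetical
   stand-ins for the proposed scm_c_malloc / scm_c_free variables. */
static demo_malloc_func demo_c_malloc = malloc;
static demo_free_func demo_c_free = free;

/* A client-supplied allocator pair that counts live blocks. */
static int client_live_blocks = 0;

static void *
client_malloc (size_t size)
{
  client_live_blocks++;
  return malloc (size);
}

static void
client_free (void *p)
{
  client_live_blocks--;
  free (p);
}

/* Stand-in for an SCM-to-C converter: returns a fresh NUL-terminated copy
   of the first LEN bytes of CHARS, allocated through the hook. */
static char *
demo_string2str (const char *chars, size_t len)
{
  char *copy = demo_c_malloc (len + 1);
  if (copy == NULL)
    return NULL;
  memcpy (copy, chars, len);
  copy[len] = '\0';
  return copy;
}

/* Release a converter result through the matching hook. */
static void
demo_release (char *str)
{
  demo_c_free (str);
}
```

With the defaults in place a result can be released with plain free(); once the client installs client_malloc and client_free, results come from the client's allocator and must go back to it.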
|
||||
|
||||
This policy is still under discussion:
|
||||
If there is one global variable scm_c_malloc, then setting it within one
|
||||
thread may interfere with another thread that expects scm_c_malloc to be
|
||||
set differently. In other words, you would have to introduce some locking
|
||||
mechanism to guarantee that the sequence of setting scm_c_malloc and
|
||||
calling scm_string2str can not be interrupted by a different thread that
|
||||
sets scm_c_malloc to a different value.
|
|
@ -1,143 +0,0 @@
|
|||
Implementation of shared substrings with fresh-copy semantics
|
||||
=============================================================
|
||||
|
||||
Version: $Id: sharedstr.text,v 1.1 2000-08-26 20:55:21 mdj Exp $
|
||||
|
||||
Background
|
||||
----------
|
||||
|
||||
In Guile, most string operations work on two other data types apart
|
||||
from strings: shared substrings and read-only strings (which includes
|
||||
symbols). One of Guile's sub-goals is to be a scripting language in
|
||||
which string management is important. Read-only strings and shared
|
||||
substrings were introduced in order to reduce overhead in string
|
||||
manipulation.
|
||||
|
||||
We now want to simplify the Guile API by removing these two data
|
||||
types, but keeping performance by allowing ordinary strings to share
|
||||
storage.
|
||||
|
||||
The idea is to let operations like `symbol->string' and `substring'
|
||||
return a pointer into the original string/symbol, thus avoiding the
|
||||
need to copy the string.
|
||||
|
||||
Two of the problems which then arise are:
|
||||
|
||||
* If s2 is derived from s1, and therefore shares storage with s1, a
|
||||
modification to either s1 or s2 will affect the other.
|
||||
|
||||
* Guile is supposed to interact closely with many UNIX libraries in
|
||||
which the NUL character is used to terminate strings. Therefore
|
||||
Guile strings contain a NUL character at the end, in addition to the
|
||||
string length (the latter of which is used by Guile's string
|
||||
operations).
|
||||
|
||||
The solutions to these problems are to
|
||||
|
||||
* Copy a string with shared storage when it's modified.
|
||||
|
||||
* Copy a string with shared storage when it's being used as argument
|
||||
to a C library call. (Copying implies inserting an ending NUL
|
||||
character.)
|
||||
|
||||
But this leads to memory management problems. When is it OK to free
|
||||
a character array which was allocated for a symbol or a string?
|
||||
|
||||
Abstract description of proposed solution
|
||||
-----------------------------------------
|
||||
|
||||
Definitions
|
||||
|
||||
STRING = <TYPETAG, LENGTH, CHARRECORDPTR, CHARPTR>
|
||||
|
||||
SYMBOL = <TYPETAG, LENGTH, CHARRECORDPTR, CHARPTR>
|
||||
|
||||
CHARRECORD = <PHASE, SHAREDFLAG, CHARS>
|
||||
|
||||
PHASE = black | white
|
||||
|
||||
SHAREDFLAG = private | shared
|
||||
|
||||
CHARS is a character array
|
||||
|
||||
CHARPTR points into it
|
||||
|
||||
Memory management
|
||||
|
||||
A string or symbol is initially allocated with its contents stored in
|
||||
a character array in a character record. The string/symbol header
|
||||
contains a pointer to this record. The initial value of the shared
|
||||
flag in the character record is `private'.
|
||||
|
||||
The GC mark phases alternate between black and white---every second
|
||||
phase is black, the rest are white. This is used to distinguish
|
||||
whether a character record has been encountered before:
|
||||
|
||||
During a black mark phase, when the GC encounters a string or symbol,
|
||||
it changes the PHASE and SHAREDFLAG marks of the corresponding
|
||||
character record according to the following table:
|
||||

  <white, private> --> <black, private>   (white => unconditionally
  <white, shared>  --> <black, private>    set to <black, private>)
  <black, private> --> <black, shared>    (SHAREDFLAG changed)
  <black, shared>  --> <black, shared>    (no change)
|
||||
|
||||
The behaviour of a white phase is quivalent with the color names
|
||||
switched.
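
The transition table above can be sketched in C as follows. This is
only an illustration of the rule, not Guile code: the type and
function names (gc_phase, share_state, char_record_marks, mark_visit)
are made up for the example. The observation it relies on is that a
record whose PHASE differs from the current mark color has not been
seen yet in this phase, so it becomes `private'; a record already
carrying the current color has been reached through another header,
so it becomes `shared'.

```c
#include <assert.h>

typedef enum { WHITE, BLACK } gc_phase;
typedef enum { PRIVATE, SHARED } share_state;

/* The PHASE and SHAREDFLAG marks of one character record.  */
typedef struct {
  gc_phase    phase;
  share_state shared;
} char_record_marks;

/* Called each time the mark phase of color CURRENT reaches a string
   or symbol header whose character record carries the marks M.
   Returns the new marks for that record.  */
static char_record_marks
mark_visit (char_record_marks m, gc_phase current)
{
  char_record_marks r;
  r.phase = current;
  /* Same color as the running phase: some other header got here
     first, so the record is shared.  Otherwise this is the first
     visit in this phase.  */
  r.shared = (m.phase == current) ? SHARED : PRIVATE;
  return r;
}
```

Because the function only compares M's color with CURRENT, the same
code serves both black and white phases, which is the "color names
switched" symmetry described above.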
|
||||
|
||||
The GC sweep phase frees any unmarked string or symbol header and
|
||||
frees its character record either if it is marked with the "wrong"
|
||||
color (not matching the color of the last mark phase) or if its
|
||||
SHAREDFLAG is `private'.
|
||||
|
||||
Copy-on-write
|
||||
|
||||
An attempt at mutating string contents leads to copying if SHAREDFLAG
|
||||
is `shared'. Copying means making a copy of the character record and
|
||||
mutating the CHARRECORDPTR and CHARPTR fields of the object header to
|
||||
point to the copy.
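
A minimal C sketch of this copy-on-write step, under stated
assumptions: the struct layouts, the `size' bookkeeping field and all
function names are hypothetical, and the sketch never sets SHAREDFLAG
itself (in the proposal that is the job of the GC mark phase), so a
caller or test has to set it by hand.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
  int    shared;   /* SHAREDFLAG: 0 = private, 1 = shared */
  size_t size;     /* bytes in chars (bookkeeping added for this sketch) */
  char   chars[];  /* CHARS */
} char_record;

typedef struct {
  size_t       length;   /* LENGTH */
  char_record *record;   /* CHARRECORDPTR */
  char        *charptr;  /* CHARPTR, points into record->chars */
} string_header;

static char_record *
record_new (const char *src, size_t size)
{
  char_record *r = malloc (sizeof *r + size);
  if (r == NULL)
    abort ();
  r->shared = 0;               /* initial value is `private' */
  r->size = size;
  memcpy (r->chars, src, size);
  return r;
}

static string_header
make_string (const char *contents)
{
  string_header s;
  s.length = strlen (contents);
  s.record = record_new (contents, s.length);
  s.charptr = s.record->chars;
  return s;
}

/* Substring operation: a new header over the same character record.  */
static string_header
make_substring (const string_header *s, size_t start, size_t length)
{
  string_header sub;
  assert (start + length <= s->length);
  sub.length = length;
  sub.record = s->record;
  sub.charptr = s->charptr + start;
  return sub;
}

/* Mutation: copy the record first if it is marked shared, then
   repoint CHARRECORDPTR and CHARPTR at the copy.  */
static void
string_set (string_header *s, size_t index, char c)
{
  assert (index < s->length);
  if (s->record->shared)
    {
      size_t offset = (size_t) (s->charptr - s->record->chars);
      char_record *copy = record_new (s->record->chars, s->record->size);
      s->record = copy;
      s->charptr = copy->chars + offset;
    }
  s->charptr[index] = c;
}
```

The sketch copies the whole record and keeps the old offset; copying
only the bytes the header actually covers would be an equally valid
reading of the text and would avoid keeping large originals alive.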
|
||||
|
||||
Substring operation
|
||||
|
||||
When making a substring, a new string header is allocated, with new
|
||||
contents for the LENGTH and CHARPTR fields.
|
||||
|
||||
Implementation details
|
||||
----------------------
|
||||
|
||||
* We store the character record consecutively with the character
|
||||
array and lump the PHASE and SHAREDFLAG fields together into one
|
||||
byte containing an integer code for the four possible states of the
|
||||
PHASE and SHAREDFLAG fields. Another way of viewing it is that
|
||||
these fields are represented as bits 1 and 0 in the "header" of the
|
||||
character array. We let CHARRECORDPTR point to the first character
|
||||
position instead of on this header:
|
||||
|
||||
CHARRECORDPTR
|
||||
|
|
||||
V
|
||||
FCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
|
||||
|
||||
F = 0, 1, 2, 3
|
||||
|
||||
* We represent strings as the sub-types `simple-string' and
|
||||
`substring'.
|
||||
|
||||
* In a simple string, CHARRECORDPTR and CHARPTR are represented by a
|
||||
single pointer, so a `simple-string' is an ordinary heap cell with
|
||||
TYPETAG and LENGTH in the CAR and CHARPTR in the CDR.
|
||||
|
||||
* substring:s are represented as double cells, with TYPETAG and LENGTH
|
||||
in word 0, CHARRECORDPTR in word 1 and CHARPTR in word 2
|
||||
(alternatively, we could store an offset from CHARRECORDPTR).
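
The first item in the list above (flag byte stored immediately before
the characters, with CHARRECORDPTR pointing at the first character)
could look roughly like this in C. The function names are invented
for the example, and the assignment of SHAREDFLAG to bit 0 and PHASE
to bit 1 is an assumption; the text only says that bits 1 and 0 are
used.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed bit assignment within the flag byte F.  */
enum { SHARED_BIT = 1 << 0, PHASE_BIT = 1 << 1 };

/* Allocate a flag byte followed by LEN characters and return a
   pointer to the first character, i.e. the value to store in
   CHARRECORDPTR.  Returns NULL if allocation fails.  */
static char *
chars_alloc (const char *contents, size_t len)
{
  unsigned char *block = malloc (len + 1);
  if (block == NULL)
    return NULL;
  block[0] = 0;                     /* F = 0 under the assumed encoding */
  memcpy (block + 1, contents, len);
  return (char *) (block + 1);
}

/* The flag byte lives one position before the first character.  */
static unsigned char *
chars_flags (char *charrecordptr)
{
  return (unsigned char *) charrecordptr - 1;
}

/* Free the whole block, starting at the flag byte.  */
static void
chars_free (char *charrecordptr)
{
  free (charrecordptr - 1);
}
```

With this layout a simple string needs no separate CHARPTR: the
pointer to the first character serves both purposes, and the flags
are reached with a fixed negative offset.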
|
||||
|
||||
Problems with this implementation
|
||||
---------------------------------
|
||||
|
||||
* How do we make copy-on-write thread-safe? Is there a different
|
||||
implementation which is efficient and thread-safe?
|
||||
|
||||
* If small substrings are frequently generated from large, temporary
|
||||
strings and the small substrings are kept in a data structure, the
|
||||
heap will still have to host the large original strings. Should we
|
||||
simply accept this?
|
|
@ -1,592 +0,0 @@
|
|||
* Introduction
|
||||
|
||||
Version: $Id: langtools.text,v 1.5 2000-08-13 04:47:26 mdj Exp $
|
||||
|
||||
This is a proposal for how Guile could interface with language
|
||||
translators. It will be posted on the Guile list and revised for some
|
||||
short time (days rather than weeks) before being implemented.
|
||||
|
||||
The document can be found in the CVS repository as
|
||||
guile-core/devel/translation/langtools.text. All Guile developers are
|
||||
welcome to modify and extend it according to the ongoing discussion
|
||||
using CVS.
|
||||
|
||||
Ideas and comments are welcome.
|
||||
|
||||
For clarity, the proposal is partially written as if describing an
|
||||
already existing system.
|
||||
|
||||
MDJ 000812 <djurfeldt@nada.kth.se>
|
||||
|
||||
* Language names
|
||||
|
||||
A translator for Guile is a certain kind of Guile module, implemented
|
||||
in Scheme, C, or a mixture of both.
|
||||
|
||||
To make things simple, the name of the language is closely related to
|
||||
the name of the translator module.
|
||||
|
||||
Languages have long and short names. The long form is simply the name
|
||||
of the translator module: `(lang ctax)', `(lang emacs-lisp)',
|
||||
`(my-modules foo-lang)' etc.
|
||||
|
||||
Languages with the long name `(lang IDENTIFIER)' can be referred to
|
||||
with the short name IDENTIFIER, for example `emacs-lisp'.
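
As an illustration of the naming rule, here is a small C sketch that
expands a short name to its long form. The function name and the
convention that anything starting with an open parenthesis is already
a long name are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write the long form of language NAME into OUT.  A short name
   IDENTIFIER becomes "(lang IDENTIFIER)"; a name that already starts
   with '(' is taken to be a long name and copied unchanged.
   Returns 1 on success, 0 if OUT is too small.  */
static int
long_language_name (const char *name, char *out, size_t outsize)
{
  int n;
  if (name[0] == '(')
    n = snprintf (out, outsize, "%s", name);
  else
    n = snprintf (out, outsize, "(lang %s)", name);
  return n >= 0 && (size_t) n < outsize;
}
```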
|
||||
|
||||
* How to tell Guile to read code in a different language (than Scheme)
|
||||
|
||||
There are four methods of specifying which translator to use when
|
||||
reading a file:
|
||||
|
||||
** Command option
|
||||
|
||||
The options to the guile command are parsed linearly from left to
|
||||
right. You can change the language at zero or more points using the
|
||||
option
|
||||
|
||||
-t, --language LANGUAGE
|
||||
|
||||
Example:
|
||||
|
||||
guile -t emacs-lisp -l foo -l bar -t scheme -l baz
|
||||
|
||||
will use the emacs-lisp translator while reading "foo" and "bar", and
|
||||
the default translator (scheme) for "baz".
|
||||
|
||||
You can use this technique in a script together with the meta switch:
|
||||
|
||||
#!/usr/local/bin/guile \
|
||||
-t emacs-lisp -s
|
||||
!#
|
||||
|
||||
** Commentary in file
|
||||
|
||||
When opening a file for reading, Guile will read the first few lines,
|
||||
looking for the string "-*- LANGNAME -*-", where LANGNAME can be
|
||||
either the long or short form of the name.
|
||||
|
||||
If found, the corresponding translator is loaded and used to read the
|
||||
file.
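
A rough C sketch of the cookie search follows. It is not how Guile
does it; the function name is hypothetical, the caller is assumed to
pass only the first few lines of the file (the proposal does not say
how many), and Emacs-style `mode: NAME' cookies are not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Look for "-*- LANGNAME -*-" in TEXT and copy LANGNAME, with
   surrounding whitespace removed, into OUT.  LANGNAME may contain
   spaces, as in the long form "(lang ctax)".
   Returns 1 if a name was found and fits in OUT, otherwise 0.  */
static int
find_language_cookie (const char *text, char *out, size_t outsize)
{
  const char *start = strstr (text, "-*-");
  const char *end;
  size_t len;

  if (start == NULL)
    return 0;
  start += 3;
  end = strstr (start, "-*-");
  if (end == NULL)
    return 0;

  /* Trim whitespace at both ends of the name.  */
  while (start < end && isspace ((unsigned char) *start))
    start++;
  while (end > start && isspace ((unsigned char) end[-1]))
    end--;

  len = (size_t) (end - start);
  if (len == 0 || len >= outsize)
    return 0;
  memcpy (out, start, len);
  out[len] = '\0';
  return 1;
}
```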
|
||||
|
||||
** File extension
|
||||
|
||||
Guile maintains an alist mapping filename extensions to languages.
|
||||
Each entry has the form:
|
||||
|
||||
(REGEXP . LANGNAME)
|
||||
|
||||
where REGEXP is a string and LANGNAME a symbol or a list of symbols.
|
||||
|
||||
The alist can be accessed using `language-alist' which is exported
|
||||
by the module `(core config)':
|
||||
|
||||
(language-alist) --> current alist
|
||||
(language-alist ALIST) sets the alist to ALIST
|
||||
(language-alist ALIST :prepend) prepends ALIST onto the current list
|
||||
(language-alist ALIST :append) appends ALIST after current list
|
||||
|
||||
The `load' command will match filenames against this alist and choose
|
||||
the translator to use accordingly.
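
The lookup that `load' would perform can be sketched in C with POSIX
regular expressions. The struct and function names are hypothetical,
and two details are assumptions the proposal leaves open: the regexps
are treated as POSIX extended regular expressions, and the first
matching entry wins.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* One (REGEXP . LANGNAME) entry of the alist.  */
typedef struct {
  const char *regexp;
  const char *langname;
} language_entry;

/* Return the LANGNAME of the first entry whose REGEXP matches
   FILENAME, or NULL if none matches.  Entries whose regexp fails to
   compile are skipped.  */
static const char *
language_for_file (const language_entry *alist, size_t n,
                   const char *filename)
{
  size_t i;
  for (i = 0; i < n; i++)
    {
      regex_t re;
      int rc;
      if (regcomp (&re, alist[i].regexp, REG_EXTENDED | REG_NOSUB) != 0)
        continue;
      rc = regexec (&re, filename, 0, NULL, 0);
      regfree (&re);
      if (rc == 0)
        return alist[i].langname;
    }
  return NULL;
}
```

Since the first match wins, the :prepend and :append options
described above decide whether new entries take precedence over, or
yield to, the entries already in the alist.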
|
||||
|
||||
There will be a default alist for common translators. For translators
|
||||
not listed, the alist has to be extended in .guile just as Emacs users
|
||||
extend auto-mode-alist in .emacs.
|
||||
|
||||
** Module header
|
||||
|
||||
You specify the language used by a module with the :language option in
|
||||
the module header. (See below under "Module configuration language".)
|
||||
|
||||
* Module system
|
||||
|
||||
This section describes how the Guile module system is adapted to use
|
||||
with other languages.
|
||||
|
||||
** Module configuration language
|
||||
|
||||
*** The `(config)' module
|
||||
|
||||
Guile has a sophisticated module system. We don't require each
|
||||
translator implementation to implement its own syntax for modules.
|
||||
That would be too much work for the implementor, and users would have
|
||||
to learn the module system anew for each syntax.
|
||||
|
||||
Instead, the module `(config)' exports the module header form
|
||||
`(define-module ...)'.
|
||||
|
||||
The config module also exports a number of primitives by which you can
|
||||
customize the Guile library, such as `language-alist' and `load-path'.
|
||||
|
||||
*** Default module environment
|
||||
|
||||
The bindings of the config module are available in the default
|
||||
interaction environment when Guile starts up. This is because the
|
||||
config module is on the module use list for the startup environment.
|
||||
|
||||
However, config bindings are *not* available by default in new
|
||||
modules.
|
||||
|
||||
The default module environment provides bindings from the R5RS module
|
||||
only.
|
||||
|
||||
*** Module headers
|
||||
|
||||
The module header of the current module system is the form
|
||||
|
||||
(define-module NAME OPTION1 ...)
|
||||
|
||||
You can specify a translator using the option
|
||||
|
||||
:language LANGNAME
|
||||
|
||||
where LANGNAME is the long or short form of language name as described
|
||||
above.
|
||||
|
||||
The translator is being fed characters from the module file, starting
|
||||
immediately after the end-parenthesis of the module header form.
|
||||
|
||||
NOTE: There can be only one module header per file.
|
||||
|
||||
It is also possible to put the module header in a separate file and
|
||||
use the option
|
||||
|
||||
:file FILENAME
|
||||
|
||||
to point out a file containing the actual code.
|
||||
|
||||
Example:
|
||||
|
||||
foo.gm:
|
||||
----------------------------------------------------------------------
|
||||
(define-module (foo)
|
||||
:language emacs-lisp
|
||||
:file "foo.el"
|
||||
:export (foo bar)
|
||||
)
|
||||
----------------------------------------------------------------------
|
||||
|
||||
foo.el:
|
||||
----------------------------------------------------------------------
|
||||
(defun foo ()
|
||||
...)
|
||||
|
||||
(defun bar ()
|
||||
...)
|
||||
----------------------------------------------------------------------
|
||||
|
||||
** Repl commands
|
||||
|
||||
Up till now, Guile has been dependent upon the available bindings in
|
||||
the selected module in order to do basic operations such as moving to
|
||||
a different module, entering the debugger or getting documentation.
|
||||
|
||||
This is not acceptable since we want to be able to control Guile
|
||||
consistently regardless of which module we are in, and since we don't
|
||||
want to equip a module with bindings which don't have anything to do
|
||||
with the purpose of the module.
|
||||
|
||||
Therefore, the repl provides a special command language on top of
|
||||
whatever syntax the current module provides. (Scheme48 and RScheme
|
||||
provide similar repl command languages.)
|
||||
|
||||
[Jost Boekemeier has suggested the following alternative solution:
|
||||
Commands are bindings just like any other binding. It is enough if
|
||||
some modules carry command bindings (it's in fact enough if *one*
|
||||
module has them), because from such a module you can use the command
|
||||
(in MODULE) to walk into a module not carrying command bindings, and
|
||||
then use CTRL-D to exit.
|
||||
|
||||
However, this has the disadvantage of mixing the "real" bindings with
|
||||
command bindings (the module might want to use "in" for other
|
||||
purposes), that CTRL-D could cause problems since for some channels
|
||||
CTRL-D might close down the connection, and that using one type of
|
||||
command ("in") to go "into" the module and another (CTRL-D) to "exit"
|
||||
is more complex than simply "going to" a module.]
|
||||
|
||||
*** Repl command syntax
|
||||
|
||||
Normally, repl commands have the syntax
|
||||
|
||||
,COMMAND ARG1 ...
|
||||
|
||||
Input starting with an arbitrary amount of whitespace + a comma thus
|
||||
works as an escape syntax.
|
||||
|
||||
This syntax is probably compatible with all languages. (Note that we
|
||||
don't need to activate the lexer of the language until we've checked
|
||||
if the first non-whitespace char is a comma.)
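
The check described above needs no knowledge of the current language,
as this small C sketch shows. The function name is hypothetical, and
what counts as whitespace is taken from isspace, which is an
assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* If INPUT is a repl command (optional whitespace, then a comma),
   return a pointer to the text just after the comma.  Otherwise
   return NULL, meaning the input should be handed to the reader of
   the current language.  */
static const char *
repl_command_text (const char *input)
{
  while (isspace ((unsigned char) *input))
    input++;
  return *input == ',' ? input + 1 : NULL;
}
```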
|
||||
|
||||
(Hypothetically, if this would become a problem, we can provide means
|
||||
of disabling this behaviour of the repl and let that particular
|
||||
language module take sole control of reading at the repl prompt.)
|
||||
|
||||
Among the commands available are
|
||||
|
||||
*** ,in MODULE
|
||||
|
||||
Select module named MODULE, that is any new expressions typed by the
|
||||
user after this command will be evaluated in the evaluation
|
||||
environment provided by MODULE.
|
||||
|
||||
*** ,in MODULE EXPR
|
||||
|
||||
Evaluate expression EXPR in MODULE. EXPR has the syntax supplied by
|
||||
the language used by MODULE.
|
||||
|
||||
*** ,use MODULE
|
||||
|
||||
Import all bindings exported by MODULE to the current module.
|
||||
|
||||
* Language modules
|
||||
|
||||
Since code written in any kind of language should be able to implement
|
||||
most tasks, which may include reading, evaluating and writing, and
|
||||
generally computing with, expressions and data originating from other
|
||||
languages, we want the basic reading, evaluation and printing
|
||||
operations to be independent of the language.
|
||||
|
||||
That is, instead of supplying separate `read', `eval' and `write'
|
||||
procedures for different languages, a language module is required to
|
||||
use the system procedures in the translated code.
|
||||
|
||||
This means that the behaviour of `read', `eval' and `write' is
|
||||
context dependent. (See further "How Guile system procedures `read',
|
||||
`eval', `write' use language modules" below.)
|
||||
|
||||
** Language data types
|
||||
|
||||
Each language module should try to use the fundamental Scheme data
|
||||
types as far as this is possible.
|
||||
|
||||
Some data types have important differences in semantics between
|
||||
languages, though, and all required data types may not exist in
|
||||
Guile.
|
||||
|
||||
In such cases, the language module must supply its own, distinct, data
|
||||
types. So, each language supported by Guile uses a certain set of
|
||||
data types, with the basic Scheme data types as the intersection
|
||||
between all sets.
|
||||
|
||||
Specifically, syntax trees representing source code expressions should
|
||||
normally be a distinct data type.
|
||||
|
||||
** Foreign language escape syntax
|
||||
|
||||
Note that such data can flow freely between modules. In order to
|
||||
accommodate data with different native syntaxes, each language module
|
||||
provides a foreign language escape syntax. In Scheme, this syntax
|
||||
uses the sharp comma extension specified by SRFI-10. The read
|
||||
constructor is simply the last symbol in the long language name (which
|
||||
is usually the same as the short language name).
|
||||
|
||||
** Example 1
|
||||
|
||||
Characters have syntax in both Scheme and ctax. Lists currently
|
||||
have syntax in Scheme but lack ctax syntax. Ctax doesn't have a
|
||||
datatype "enum", but we pretend it has for this example.
|
||||
|
||||
The following table now shows the syntax used for reading and writing
|
||||
these expressions in module A using the language scheme, and module B
|
||||
using the language ctax (we assume that the foreign language escape
|
||||
syntax in ctax is #LANGUAGE EXPR):
|
||||
|
||||
          A                 B

chars     #\X               'X'

lists     (1 2 3)           #scheme (1 2 3)

enums     #,(ctax ENUM)     ENUM
|
||||
|
||||
** Example 2
|
||||
|
||||
A user is typing expressions in a ctax module which imports the
|
||||
bindings x and y from the module `(foo)':
|
||||
|
||||
ctax> x = read ();
|
||||
1+2;
|
||||
1+2;
|
||||
ctax> x
|
||||
1+2;
|
||||
ctax> y = 1;
|
||||
1
|
||||
ctax> y;
|
||||
1
|
||||
ctax> ,in (guile-user)
|
||||
guile> ,use (foo)
|
||||
guile> x
|
||||
#,(ctax 1+2;)
|
||||
guile> y
|
||||
1
|
||||
guile>
|
||||
|
||||
The example shows that ctax uses a distinct representation for ctax
|
||||
expressions, but Scheme integers for integers.
|
||||
|
||||
** Language module interface
|
||||
|
||||
A language module is an ordinary Guile module importing bindings from
|
||||
other modules and exporting bindings through its public interface.
|
||||
|
||||
It is required to export the following variable and procedures:
|
||||
|
||||
*** language-environment --> ENVIRONMENT
|
||||
|
||||
Returns a fresh top-level ENVIRONMENT (a module) where expressions
|
||||
in this language are evaluated by default.
|
||||
|
||||
Modules using this language will by default have this environment
|
||||
on their use list.
|
||||
|
||||
The intention is for this procedure to provide the "run-time
|
||||
environment" for the language.
|
||||
|
||||
*** native-read PORT --> OBJECT
|
||||
|
||||
Read next expression in the foreign syntax from PORT and return an
|
||||
object OBJECT representing it.
|
||||
|
||||
It is entirely up to the language module to define what one
|
||||
expression is, that is, how much to read.
|
||||
|
||||
In lisp-like languages, `native-read' corresponds to `read'. Note
|
||||
that in such languages, OBJECT need not be source code, but could
|
||||
be data.
|
||||
|
||||
The representation of OBJECT is also chosen by the language
|
||||
module. It can consist of Scheme data types, data types distinct for
|
||||
the language, or a mixture.
|
||||
|
||||
There is one requirement, however: Distinct data types must be
|
||||
instances of a subclass of `language-specific-class'.
|
||||
|
||||
This procedure will be called during interactive use (the user
|
||||
types expressions at a prompt) and when the system `read'
|
||||
procedure is called at a time when a module using this language is
|
||||
selected.
|
||||
|
||||
Some languages (for example Python) parse differently depending on whether
|
||||
it's an interactive or non-interactive session. Guile provides the
|
||||
predicate `interactive-port?' to test for this.
|
||||
|
||||
*** language-specific-class
|
||||
|
||||
This variable contains the superclass of all non-Scheme data-types
|
||||
provided by the language.
|
||||
|
||||
*** native-write OBJECT PORT
|
||||
|
||||
This procedure prints the OBJECT on PORT using the specific
|
||||
language syntax.
|
||||
|
||||
*** write-foreign-syntax OBJECT LANGUAGE NATIVE-WRITE PORT
|
||||
|
||||
Write OBJECT in the foreign language escape syntax of this module.
|
||||
The object is specific to language LANGUAGE and can be written using
|
||||
NATIVE-WRITE.
|
||||
|
||||
Here's an implementation for Scheme:
|
||||
|
||||
(define (write-foreign-syntax object language native-write port)
  ;; Emit the SRFI-10 escape `#,(LANGUAGE OBJECT)'.
  (format port "#,(~A " language)
  (native-write object port)
  (display #\) port))
|
||||
|
||||
*** translate EXPRESSION --> SCHEMECODE
|
||||
|
||||
Translate an EXPRESSION into SCHEMECODE.
|
||||
|
||||
EXPRESSION can be anything returned by `read'.
|
||||
|
||||
SCHEMECODE is Scheme source code represented using ordinary Scheme
|
||||
data. It will be passed to `eval' in an environment containing
|
||||
bindings in the environment returned by `language-environment'.
|
||||
|
||||
This procedure will be called during interactive use and when the
|
||||
system `eval
|
||||
|
||||
*** translate-all PORT [ALIST] --> THUNK
|
||||
|
||||
Translate the entire stream of characters PORT until #<eof>.
|
||||
Return a THUNK which can be called repeatedly like this:
|
||||
|
||||
THUNK --> SCHEMECODE
|
||||
|
||||
Each call will yield a new piece of scheme code. The THUNK signals
|
||||
end of translation by returning the value *end-of-translation* (which
|
||||
is tested using the predicate `end-of-translation?').
|
||||
|
||||
The optional argument ALIST provides compilation options for the
|
||||
translator:
|
||||
|
||||
(debug . #t) means produce code suitable for debugging
|
||||
|
||||
This procedure will be called by the system `load' command and by
|
||||
the module system when loading files.
|
||||
|
||||
The intentions are:
|
||||
|
||||
1. To let the language module decide when and in how large chunks
|
||||
to do the processing. It may choose to do all processing at
|
||||
the time translate-all is called, all processing when THUNK is
|
||||
called the first time, or small pieces of processing each time
|
||||
THUNK is called, or any conceivable combination.
|
||||
|
||||
2. To let the language module decide in how large chunks to output
|
||||
the resulting Scheme code in order not to overload memory.
|
||||
|
||||
3. To enable the language module to use temporary files, and
|
||||
whole-module analysis and optimization techniques.
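
The calling protocol for the THUNK returned by `translate-all' can be
illustrated in C. Everything here is a toy under stated assumptions:
the "language" simply turns each input line WORD into the Scheme form
(display "WORD"), the struct stands in for a closure, the input is a
string rather than a port, and *end-of-translation* is modelled as a
unique address. It shows the "small pieces of processing each time
THUNK is called" strategy from intention 1.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Unique sentinel standing in for *end-of-translation*.  */
static const char end_marker = 0;
#define END_OF_TRANSLATION (&end_marker)

static int
end_of_translation_p (const char *piece)
{
  return piece == END_OF_TRANSLATION;
}

/* A thunk: a function plus the state it closes over.  */
typedef struct thunk {
  const char *(*call) (struct thunk *self);
  const char *input;     /* untranslated remainder of the source */
  char buf[128];         /* holds the most recent piece of Scheme code */
} thunk;

/* Translate one more line; the result is valid until the next call.  */
static const char *
next_piece (thunk *self)
{
  size_t len;
  while (*self->input == '\n')          /* skip line ends and blank lines */
    self->input++;
  if (*self->input == '\0')
    return END_OF_TRANSLATION;
  len = strcspn (self->input, "\n");
  snprintf (self->buf, sizeof self->buf, "(display \"%.*s\")",
            (int) len, self->input);
  self->input += len;
  return self->buf;
}

/* Toy stand-in for translate-all: no work is done until the thunk
   is called.  */
static thunk
translate_all (const char *source)
{
  thunk t;
  t.call = next_piece;
  t.input = source;
  t.buf[0] = '\0';
  return t;
}
```

A loader would keep calling t.call (&t), evaluating each piece, until
end_of_translation_p reports the sentinel.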
|
||||
|
||||
*** untranslate SCHEMECODE --> EXPRESSION
|
||||
|
||||
Attempt to do the inverse of `translate'. An approximation is OK. It
|
||||
is also OK to return #f. This procedure will be called from the
|
||||
debugger, when generating error messages, backtraces etc.
|
||||
|
||||
The debugger uses the local evaluation environment to determine from
|
||||
which module an expression comes. This is how the debugger can know
|
||||
which `untranslate' procedure to call for a given expression.
|
||||
|
||||
(This is used currently to decide which backtrace frames to
|
||||
display. System modules use the option :no-backtrace to prevent
|
||||
displaying of Guile's internals to the user.)
|
||||
|
||||
Note that `untranslate' can use source-properties set by `native-read'
|
||||
to give hints about how to do the reverse translation. Such hints
|
||||
could for example be the filename, and line and column numbers for the
|
||||
source expression, or an actual copy of the source expression.
|
||||
|
||||
** How Guile system procedures `read', `eval', `write' use language modules
|
||||
|
||||
*** read
|
||||
|
||||
The idea is that the `read' exported from the R5RS library will
|
||||
continue to work when called from other languages, and will keep its
|
||||
semantics.
|
||||
|
||||
A call to `read' simply means "read in an expression from PORT using
|
||||
the syntax associated with that port".
|
||||
|
||||
Each module carries information about its language.
|
||||
|
||||
When an input port is created for a module to be read or during
|
||||
interaction with a given module, this information is copied to the
|
||||
port object.
|
||||
|
||||
read uses this information to call `native-read' in the correct
|
||||
language module.
|
||||
|
||||
*** eval
|
||||
|
||||
[To be written.]
|
||||
|
||||
*** write
|
||||
|
||||
[To be written.]
|
||||
|
||||
* Error handling
|
||||
|
||||
** Errors during translation
|
||||
|
||||
Errors during translation are generated as usual by calling scm-error
|
||||
(from Scheme) or scm_misc_error etc (from C). The effect of
|
||||
throwing errors from within `translate-all' is the same as when they
|
||||
are generated within a call to the THUNK returned from
|
||||
`translate-all'.
|
||||
|
||||
scm-error takes a fifth argument. This is a property list (alist)
|
||||
which you can use to pass extra information to the error reporting
|
||||
machinery.
|
||||
|
||||
Currently, the following properties are supported:
|
||||
|
||||
filename filename of file being translated
|
||||
line line number of erring expression
|
||||
column column number
|
||||
|
||||
** Run-time errors (errors in SCHEMECODE)
|
||||
|
||||
This section pertains to what happens when a run-time error occurs
|
||||
during evaluation of the translated code.
|
||||
|
||||
In order to get "foreign code" in error messages, make sure that
|
||||
`untranslate' yields good output. Note the possibility of maintaining
|
||||
a table (preferably using weak references) mapping SCHEMECODE to
|
||||
EXPRESSION.
|
||||
|
||||
Note the availability of source-properties for attaching filename,
|
||||
line and column number, and other information, such as EXPRESSION, to
|
||||
SCHEMECODE. If filename, line, and column properties are defined,
|
||||
they will be automatically used by the error reporting machinery.
|
||||
|
||||
* Proposed changes to Guile
|
||||
|
||||
** Implement the above proposal.
|
||||
|
||||
** Add new fields `reader' and `translator' to all module objects
|
||||
|
||||
Make sure they are initialized when a language is specified.
|
||||
|
||||
** Use `untranslate' during error handling.
|
||||
|
||||
** Implement the use of arg 5 to scm-error
|
||||
|
||||
(specified in "Errors during translation")
|
||||
|
||||
** Implement a generic lexical analyzer with interface similar to read/rp
|
||||
|
||||
Mikael is working on this. (It might take a few days, since he is
|
||||
busy with his studies right now.)
|
||||
|
||||
** Remove scm:eval-transformer
|
||||
|
||||
This is replaced by new fields in each module object (environment).
|
||||
|
||||
`eval' will instead directly use the `transformer' field in the module
|
||||
passed as second arg.
|
||||
|
||||
Internal evaluation will, similarly, use the transformer of the module
|
||||
representing the top-level of the local environment.
|
||||
|
||||
Note that this level of transformation is something independent of
|
||||
language translation. *This* is a hook for adding Scheme macro
|
||||
packages and belongs to the core language.
|
||||
|
||||
We also need to check the new `translator' field, potentially using
|
||||
it.
|
||||
|
||||
** Package local environments as smobs
|
||||
|
||||
so that environment list structures can't leak out on the Scheme
|
||||
level. (This has already been done in SCM.)
|
||||
|
||||
** Introduce new fields in input ports
|
||||
|
||||
These carry state information such as
|
||||
|
||||
*** which keyword syntax to support
|
||||
|
||||
*** whether to be case sensitive or not
|
||||
|
||||
*** which lexical grammar to use
|
||||
|
||||
*** whether the port is used in an interactive session or not
|
||||
|
||||
There will be a new Guile primitive `interactive-port?' testing for this.
|
||||
|
||||
** Move configuration of keyword syntax and case sensitivity to the read-state
|
||||
|
||||
Add new fields to the module objects for these values, so that the
|
||||
read-state can be initialized from them.
|
||||
|
||||
*fixme* When? Why? How?
|
||||
|
||||
Probably as soon as the language has been determined during file loading.
|
||||
|
||||
Need to figure out how to set these values.
|
||||
|
||||
|
||||
Local Variables:
|
||||
mode: outline
|
||||
End:
|