mirror of
https://git.savannah.gnu.org/git/guile.git
synced 2025-05-01 04:10:18 +02:00
Clarify 'file-encoding' docs: heuristics may be improved later.
* doc/ref/api-evaluation.texi (Character Encoding of Source Files): Mention UTF-8 as another common encoding used for Scheme source files, and that it is used by default. Change the description to leave open the possibility of adding additional heuristics in the future. Mention that if the coding declaration is in a #!-style block comment, it must be the first such comment in the file. Mention the '#:guess-encoding' keyword argument.
parent 3ace9a8e4e
commit 7099eec4fb
1 changed file with 22 additions and 14 deletions
--- a/doc/ref/api-evaluation.texi
+++ b/doc/ref/api-evaluation.texi
@@ -991,17 +991,19 @@ three arguments.
 @cindex source file encoding
 @cindex primitive-load
 @cindex load
-Scheme source code files are usually encoded in ASCII, but, the
-built-in reader can interpret other character encodings. The
-procedure @code{primitive-load}, and by extension the functions that
-call it, such as @code{load}, first scan the top 500 characters of the
-file for a coding declaration.
+Scheme source code files are usually encoded in ASCII or UTF-8, but the
+built-in reader can interpret other character encodings as well. When
+Guile loads Scheme source code, it uses the @code{file-encoding}
+procedure (described below) to try to guess the encoding of the file.
+In the absence of any hints, UTF-8 is assumed. One way to provide a
+hint about the encoding of a source file is to place a coding
+declaration in the top 500 characters of the file.
 
 A coding declaration has the form @code{coding: XXXXXX}, where
 @code{XXXXXX} is the name of a character encoding in which the source
 code file has been encoded. The coding declaration must appear in a
-scheme comment. It can either be a semicolon-initiated comment or a block
-@code{#!} comment.
+scheme comment. It can either be a semicolon-initiated comment, or the
+first block @code{#!} comment in the file.
 
 The name of the character encoding in the coding declaration is
 typically lower case and containing only letters, numbers, and hyphens,
@@ -1050,15 +1052,21 @@ the port's character encoding should be set to the encoding returned
 by @code{file-encoding}, if any, again by using
 @code{set-port-encoding!}. Then the code can be read as normal.
 
+Alternatively, one can use the @code{#:guess-encoding} keyword argument
+of @code{open-file} and related procedures. @xref{File Ports}.
+
 @deffn {Scheme Procedure} file-encoding port
 @deffnx {C Function} scm_file_encoding (port)
-Scan the port for an Emacs-like character coding declaration near the
-top of the contents of a port with random-accessible contents
-(@pxref{Recognize Coding, how Emacs recognizes file encoding,, emacs,
-The GNU Emacs Reference Manual}). The coding declaration is of the form
-@code{coding: XXXXX} and must appear in a Scheme comment. Return a
-string containing the character encoding of the file if a declaration
-was found, or @code{#f} otherwise. The port is rewound.
+Attempt to scan the first few hundred bytes from the @var{port} for
+hints about its character encoding. Return a string containing the
+encoding name or @code{#f} if the encoding cannot be determined. The
+port is rewound.
+
+Currently, the only supported method is to look for an Emacs-like
+character coding declaration (@pxref{Recognize Coding, how Emacs
+recognizes file encoding,, emacs, The GNU Emacs Reference Manual}). The
+coding declaration is of the form @code{coding: XXXXX} and must appear
+in a Scheme comment. Additional heuristics may be added in the future.
 @end deffn
 
 
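
For illustration only, the two approaches the patched text describes (calling @code{file-encoding} and then @code{set-port-encoding!}, or passing @code{#:guess-encoding} to @code{open-file}) might be sketched in Guile as follows. This is not part of the commit. The helper names open-with-declared-encoding and open-with-guessed-encoding are hypothetical, and the sketch assumes a Guile version in which open-file accepts the #:guess-encoding keyword.

```scheme
;; Hypothetical helper (not part of Guile): open FILENAME for reading and
;; apply the encoding reported by `file-encoding', if there is one.
(define (open-with-declared-encoding filename)
  (let* ((port (open-input-file filename))
         ;; `file-encoding' returns a string or #f and rewinds the port.
         (enc  (file-encoding port)))
    ;; Leave the port's default encoding alone when no hint was found.
    (when enc
      (set-port-encoding! port enc))
    port))

;; Hypothetical helper: a similar effect, delegated to `open-file'.
;; Assumes `open-file' supports the #:guess-encoding keyword.
(define (open-with-guessed-encoding filename)
  (open-file filename "r" #:guess-encoding #t))
```

Either port can then be handed to @code{read} as usual. The first form makes the fallback explicit, which may be useful if the caller wants to choose its own default when no hint is found.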