mirror of https://git.savannah.gnu.org/git/guile.git synced 2025-04-29 19:30:36 +02:00

PEG: Add full support for PEG + some extensions

This commit adds support for PEG as described in:

    <https://bford.info/pub/lang/peg.pdf>

It adds support for the missing features (comments, underscores in
identifiers and escaping) while keeping the extensions (dashes in
identifiers, < and <--).

The naming system tries to be as close as possible to the one proposed
in the paper.

* module/ice-9/peg/string-peg.scm: Rewrite PEG parser.
* test-suite/tests/peg.test: Fix import

Signed-off-by: Ludovic Courtès <ludo@gnu.org>
Ekaitz Zarraga 2024-09-11 21:19:26 +02:00 committed by Ludovic Courtès
parent 47807c9b11
commit ff11753df1
GPG key ID: 090B11993D9AEBB5
4 changed files with 313 additions and 180 deletions

@@ -17,6 +17,10 @@ Wikipedia has a clear and concise introduction to PEGs if you want to
familiarize yourself with the syntax:
@url{http://en.wikipedia.org/wiki/Parsing_expression_grammar}.
+The paper that introduced PEG contains a more detailed description of how PEG
+works, and describes its syntax in detail:
+@url{https://bford.info/pub/lang/peg.pdf}
The @code{(ice-9 peg)} module works by compiling PEGs down to lambda
expressions. These can either be stored in variables at compile-time by
the define macros (@code{define-peg-pattern} and
@@ -216,8 +220,8 @@ should propagate up the parse tree. The normal @code{<-} propagates the
matched text up the parse tree, @code{<--} propagates the matched text
up the parse tree tagged with the name of the nonterminal, and @code{<}
discards that matched text and propagates nothing up the parse tree.
-Also, nonterminals may consist of any alphanumeric character or a ``-''
-character (in normal PEGs nonterminals can only be alphabetic).
+Also, nonterminals may include ``-'' character, while in normal PEG it is not
+allowed.
For example, if we:
@lisp