Grammar modularization¶
A grammar can be split across multiple files and combined using the import
statement. The statement takes a path to the target grammar file, relative to
the importing grammar file, and an optional name for the target grammar given
after the as keyword.
For example,
import '../../othergrammar.pg';
or
import '../../othergrammar.pg' as og;
Rules from an imported grammar are referenced by a fully qualified name: the
target grammar name and the rule name, separated by a dot. By default, the name
of the target grammar is the base name of the grammar file. In the first example,
rules are referenced using the othergrammar. prefix, for example:
SomeRule: INT othergrammar.SomeTargetRule+;
In the second example, og is the target grammar name, so the previous rule is
written as:
SomeRule: INT og.SomeTargetRule+;
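The naming rule above can be illustrated with a small Python sketch. It only models how a prefix could be derived from an import path and an optional alias; the helpers `default_grammar_name` and `qualified` are hypothetical and are not part of parglare.

```python
import os


def default_grammar_name(import_path):
    """Base name of the grammar file without its extension (illustrative helper)."""
    return os.path.splitext(os.path.basename(import_path))[0]


def qualified(rule, import_path, alias=None):
    """Build the dotted reference for a rule of an imported grammar."""
    # An explicit name given after `as` replaces the default base name.
    prefix = alias or default_grammar_name(import_path)
    return f"{prefix}.{rule}"


without_alias = qualified("SomeTargetRule", "../../othergrammar.pg")
with_alias = qualified("SomeTargetRule", "../../othergrammar.pg", alias="og")
print(without_alias)
print(with_alias)
```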
The import statement supports diamond imports as well as recursive imports.
Each rule in the overall grammar has a fully qualified name (FQN). This name is a dot-separated chain of target grammar names ending with the rule name, built from the first chain of imports that leads to the rule. This naming scheme allows grammar files to be imported from arbitrary locations while still giving each rule of the grammar a deterministic FQN.
For example, suppose grammar file A.pg imports B.pg and C.pg, and B.pg also
imports C.pg. If A.pg is the root of the grammar, all rules in C.pg have an FQN
of the form B.C.some_rule. Notice that the first path to C.pg goes through B.pg,
because B.pg is imported first in A.pg. See fqn tests for an example.
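One way to picture "the first chain of imports" is a depth-first walk over the import graph in import order. The sketch below assumes that traversal, which is consistent with the A.pg example above but is not taken from parglare's source. The `first_import_chains` helper and the `imports` table are hypothetical.

```python
def first_import_chains(root, imports):
    """Map each grammar file to the first chain of import names that reaches it.

    `imports` maps a grammar file to an ordered list of
    (target grammar name, target file) pairs.
    """
    chains = {root: ()}

    def visit(grammar):
        for name, target in imports.get(grammar, []):
            if target in chains:
                # Already reached through an earlier chain (diamond or
                # recursive import), so the first chain is kept.
                continue
            chains[target] = chains[grammar] + (name,)
            visit(target)

    visit(root)
    return chains


imports = {
    "A.pg": [("B", "B.pg"), ("C", "C.pg")],
    "B.pg": [("C", "C.pg")],
}
chains = first_import_chains("A.pg", imports)
fqn = ".".join(chains["C.pg"] + ("some_rule",))
print(fqn)  # B.C.some_rule
```

Because a grammar is skipped once it has a chain, the same walk also terminates on recursive imports.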
Imported rules override¶
Grammar rules defined in an imported grammar can be overridden by redefining them under the appropriate FQN. For example, given the following grammars:
file: first.pg
import 'second.pg' as s;
...
file: second.pg
import 'third.pg' as t;
...
file: third.pg
S: A B+;
B: 'foo' C;
In first.pg we could then write:
S: s.t.S;
s.t.B: 'bar' C+;
This effectively overrides rule B from third.pg (note the FQN s.t.B). All
grammar rules that used the old B from third.pg will now use the s.t.B rule
from first.pg. This allows an importing grammar to modify an existing grammar
only where needed, without rewriting every rule that uses the modified rule.
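The effect of an override can be modelled as a table of rules keyed by FQN, where a later definition replaces an earlier one. This is only an illustrative model with rule bodies kept as plain strings; the `rules` table and the `body_of` helper are hypothetical and say nothing about how parglare stores rules internally.

```python
# Rules of third.pg as seen from first.pg, keyed by FQN.
rules = {
    "s.t.S": "A B+",
    "s.t.B": "'foo' C",
}

# first.pg redefines B under its FQN; in this model the later definition wins.
rules["s.t.B"] = "'bar' C+"


def body_of(reference, scope="s.t"):
    """Resolve a rule reference made inside third.pg to its current body."""
    return rules[f"{scope}.{reference}"]


# S is untouched, but its reference to B now resolves to the overriding body.
print(body_of("S"))
print(body_of("B"))
```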
Grammar file recognizers¶
A grammar file can optionally provide its own recognizers. These should be given
in a Python file named <base grammar name>_recognizers.py, located in the same
folder as the grammar file.
For example, if the grammar file is named mygrammar.pg, then the recognizers
module should be named mygrammar_recognizers.py.
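The naming convention can be expressed as a short path computation. This is a sketch of the convention only, using the standard library; `recognizers_path` is a hypothetical helper, not a parglare function.

```python
from pathlib import Path


def recognizers_path(grammar_file):
    """Return the expected recognizers module path for a grammar file."""
    grammar_file = Path(grammar_file)
    # Same folder, base name of the grammar plus the `_recognizers.py` suffix.
    return grammar_file.with_name(grammar_file.stem + "_recognizers.py")


expected = recognizers_path("grammars/mygrammar.pg")
print(expected.name)
```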
For parglare to be able to collect all recognizers defined in a module, a
collector is used. It is a decorator constructed in the Python recognizer
module and used to decorate each recognizer function.
For example, mygrammar_recognizers.py might be given as:
from parglare import get_collector
recognizer = get_collector()
@recognizer
def term_a(input, position):
    ...  # some recognition
@recognizer
def term_b(input, position):
    ...
The recognizer object is the collector in this case. It builds a dictionary of
all recognizers decorated by it, and that dictionary is available as
recognizer.all. The parglare recognizer loader implicitly searches for
recognizer.all.
By default, the name of a decorated function serves as the name of the terminal
the recognizer is defined for. You can provide a different name by passing a
string to the recognizer decorator:
@recognizer('NUMERIC_ID')
def number(input, pos):
    ...
In this case the grammar terminal is named NUMERIC_ID while the recognition
function is named number. This can be used, for example, to create a library
of common recognizer functions and use them in a grammar for terminals with
different names:
from somemodule import myrecognizer
from parglare import get_collector
recognizer = get_collector()
recognizer('NUMERIC_ID')(myrecognizer)
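To make the three registration styles concrete, here is a minimal stand-in for a collector written in plain Python. It is a sketch under the assumption that a collector only needs to fill a dictionary exposed as `.all`. `make_collector` and the terminal name `ALT_ID` are hypothetical, and this is not parglare's actual `get_collector` implementation.

```python
def make_collector():
    """Hypothetical stand-in for a collector; not parglare's real code."""
    registry = {}

    def collect(name_or_func):
        if isinstance(name_or_func, str):
            # Used as @collect('NAME'): return a decorator bound to that name.
            def decorator(func):
                registry[name_or_func] = func
                return func
            return decorator
        # Used as bare @collect: register under the function's own name.
        registry[name_or_func.__name__] = name_or_func
        return name_or_func

    collect.all = registry
    return collect


recognizer = make_collector()


@recognizer
def term_a(input, position):
    return None


@recognizer('NUMERIC_ID')
def number(input, pos):
    return None


def myrecognizer(input, pos):
    return None


# Call style, useful for functions imported from a shared library module.
recognizer('ALT_ID')(myrecognizer)

print(sorted(recognizer.all))  # ['ALT_ID', 'NUMERIC_ID', 'term_a']
```

The dictionary keys are terminal names, so the same function can be registered for several terminals by calling the collector once per name.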
You can use a fully qualified terminal name to override the recognizer for an imported terminal:
@recognizer('base.COMMA')
def comma_recognizer(input, pos):
    if input[pos] == ',':
        return input[pos:pos + 1]
In this case there is an imported grammar base whose COMMA terminal recognizer
has been overridden by the comma_recognizer function.
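A recognizer of this shape can be tried out on its own, without a grammar. The sketch below leaves out the decorator and uses a slice comparison instead of indexing, so that a position at the end of the input does not raise an IndexError. That bounds-safe variant is an assumption of this example rather than something the original code requires.

```python
def comma_recognizer(input, pos):
    # Slicing never raises, even when pos is past the last character.
    if input[pos:pos + 1] == ',':
        return input[pos:pos + 1]
    # Falling through returns None, meaning nothing was recognized here.


matched = comma_recognizer('a,b', 1)
missed = comma_recognizer('a,b', 0)
at_end = comma_recognizer('a,b', 3)
print(matched, missed, at_end)
```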
Warning
Because of the way a recognizer module is imported in Python, you must use only absolute module imports inside the recognizer module.
Recognizer search order¶
Recognizers are loaded from the grammar's recognizer module but can be
overridden from importing grammars, using the FQN of the terminal, or by using
the recognizers parameter of the grammar.
Grammar file actions¶
Similarly to recognizers, actions can be provided in a Python file named
<base grammar name>_actions.py, located in the same folder as the grammar file.
For example, if the grammar file is named mygrammar.pg, then the actions module
should be named mygrammar_actions.py.
For parglare to be able to collect all actions defined in a module, a collector
is used in much the same way as it is used for recognizers. It is a decorator
constructed in the Python actions module and used to decorate each action
function.
For example, mygrammar_actions.py might be given as:
from parglare import get_collector
action = get_collector()
@action
def first_rule(context, nodes):
    ...
@action
def second_rule(context, nodes):
    ...
The action object is the collector in this case. It builds a dictionary of all
actions decorated by it, and that dictionary is available as action.all. The
parglare action loader implicitly searches for action.all.
By default, the name of a decorated function serves as the name of the grammar
symbol, or of the in-grammar defined action (using @, see syntax for action
specification), that the action is defined for. As with recognizers, you can
provide a different name by passing a string to the action decorator.
Warning
Because of the way an action module is imported in Python, you must use only absolute module imports inside of it.
Action search order¶
Actions are searched in the order of specificity by searching the following (in the given order):
- Actions given using the actions parameter by FQN of the symbol,
- Actions loaded from the actions module using FQN of the symbol,
- Actions given using the actions parameter by FQN of the action,
- Actions loaded from the actions module using FQN of the action,
- Actions given using the actions parameter by symbol name,
- Actions loaded from the actions module using symbol name,
- Actions given using the actions parameter by action name,
- Actions loaded from the actions module using action name,
- parglare built-in actions using action name.
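The lookup order above can be sketched as a chain of dictionary lookups that stops at the first hit. This is an illustrative model only. `resolve_action` and its arguments are hypothetical, actions are represented by plain strings, and the assumption that a short name is the last dotted segment of an FQN is made here for the example rather than taken from parglare.

```python
def resolve_action(symbol_fqn, action_fqn, user_actions, module_actions,
                   builtin_actions):
    """Return the first action found, following the order listed above."""
    # Assumption: the short name is the last dot-separated part of the FQN.
    symbol_name = symbol_fqn.rsplit(".", 1)[-1]
    action_name = action_fqn.rsplit(".", 1)[-1]
    candidates = [
        (user_actions, symbol_fqn),
        (module_actions, symbol_fqn),
        (user_actions, action_fqn),
        (module_actions, action_fqn),
        (user_actions, symbol_name),
        (module_actions, symbol_name),
        (user_actions, action_name),
        (module_actions, action_name),
        (builtin_actions, action_name),
    ]
    for source, key in candidates:
        if key in source:
            return source[key]
    return None


user_actions = {"B": "user action registered by symbol name"}
module_actions = {"s.t.B": "module action registered by symbol FQN"}
builtin_actions = {"collect": "built-in collect"}

# The module action wins: a symbol FQN match is more specific than a
# plain symbol name, even though the latter came from the actions parameter.
chosen = resolve_action("s.t.B", "s.t.collect", user_actions,
                        module_actions, builtin_actions)
print(chosen)
```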