Actions¶
Actions (a.k.a. semantic actions or reductions actions) are Python callables (functions or lambdas mostly) that get called to reduce the recognized pattern to some higher concept. E.g. in the calc example actions are called to calculate sub-expressions.
Note
LR parser can call the actions during parsing. GLR parser always build the parse
forest and the actions can be called afterwards on a chosen tree with
parser.call_actions
.
There are two consideration to think of:
- Which actions are called?
- When actions are called?
Custom actions and built-in actions¶
If you don't provide actions of your own the parser will return nested list
corresponding to your grammar. Each non-terminal result in a list of evaluated
sub-expression while each terminal result in the matched string. If the parser
parameter build_tree
is set to True
the parser will build a parse
tree.
Custom actions are provided to the parser during parser instantiation as
actions
parameter which must be a Python dict where the keys are the names of
the rules from the grammar and values are the action callables or a list of
callables if the rule has more than one production/choice. You can provide
additional actions that are not named after the grammar rule names, these
actions may be referenced from the grammar using @
syntax for action
specification.
Lets take a closer look at the quick intro example:
grammar = r"""
E: E '+' E {left, 1}
| E '-' E {left, 1}
| E '*' E {left, 2}
| E '/' E {left, 2}
| E '^' E {right, 3}
| '(' E ')'
| number;
terminals
number: /\d+(\.\d+)?/;
"""
actions = {
"E": [lambda _, nodes: nodes[0] + nodes[2],
lambda _, nodes: nodes[0] - nodes[2],
lambda _, nodes: nodes[0] * nodes[2],
lambda _, nodes: nodes[0] / nodes[2],
lambda _, nodes: nodes[0] ** nodes[2],
lambda _, nodes: nodes[1],
lambda _, nodes: nodes[0]],
"number": lambda _, value: float(value),
}
g = Grammar.from_string(grammar)
parser = Parser(g, actions=actions)
result = parser.parse("34 + 4.6 / 2 * 4^2^2 + 78")
Here you can see that for rule E
we provide a list of lambdas, one lambda for
each operation. The first element of the list corresponds to the first
production of the E
rule (E '+' E {left, 1}
), the second to the second and
so on. For number
rule there is only a single lambda which converts the
matched string to the Python float
type, because number
is a terminal
definition and thus the second parameter in action call will not be a list but a
matched value itself.
At the end we instantiate the parser and pass in our actions
using the
parameter.
Each action callable receive two parameters. The first is the context object
which gives parsing context information (like
the start and end position where the match occurred, the parser instance etc.).
The second parameters nodes
is a list of actual results of sub-expressions
given in the order defined in the grammar.
For example:
lambda _, nodes: nodes[0] * nodes[2],
In this line we don't care about the context thus giving it the _
name.
nodes[0]
will cary the value of the left sub-expression while nodes[2]
will
carry the result of the right sub-expression. nodes[1]
must be *
and we
don't need to check that as the parser already did that for us.
The result of the parsing will be the evaluated expression as the actions will
get called along the way and the result of each actions will be used as an
element of the nodes
parameter in calling actions higher in the hierarchy.
If we don't provide actions
, by default parglare will return a matched string
for each terminal and a list of sub-expressions for each non-terminal
effectively producing nested lists. If we set build_tree
parameter of the
parser to True
the parser will produce a parse tree
whose elements are instances of NodeNonTerm
and NodeTerm
classes
representing a non-terminals and terminals respectively.
action
decorator¶
You can use a special decorator/collector factory parglare.get_collector
to
create decorator that can be used to collect all actions.
from parglare import get_collector
action = get_collector()
@action
def number(_, value):
return float(value)
@action('E')
def sum_act(_, nodes):
return nodes[0] + nodes[2]
@action('E')
def pass_act_E(_, nodes):
return nodes[0]
@action
def T(_, nodes):
if len(nodes) == 3:
return nodes[0] * nodes[2]
else:
return nodes[0]
@action('F')
def parenthesses_act(_, nodes):
return nodes[1]
@action('F')
def pass_act_F(_, nodes):
return nodes[0]
p = Parser(grammar, actions=action.all)
In the previous example action
decorator is created using get_collector
factory. This decorator is parametrized where optional parameter is the name of
the action. If the name is not given the name of the decorated function will be
used. As you can see in the previous example, same name can be used multiple
times (e.g. E
for sum_act
and pass_act_E
). If same name is used multiple
times all action functions will be collected as a list in the order of
definition. Dictionary holding all actions for the created action decorator is
action.all
.
Time of actions call¶
Note
This applies for LR parsing only. GLR parser always build a forest and actions
are called afterwards with parser.call_actions
.
In parglare actions can be called during parsing (i.e. on the fly) which you could use if you want to transform input immediately without building the parse tree. But there are times when you would like to build a tree first and call actions afterwards.
To get the tree and call actions afterwards you supply actions
parameter to
the parser as usual and set build_tree
to True
. When the parser finishes
successfully it will return the parse tree which you pass to the call_actions
method of the parser object to execute actions. For example:
parser = Parser(g, actions=actions)
tree = parser.parse("34 + 4.6 / 2 * 4^2^2 + 78")
result = parser.call_actions(tree)
Built-in actions¶
parglare provides some common actions in the module parglare.actions
. You can
reference these actions directly from the
grammar.
Built-in actions are used implicitly by parglare as default actions in
particular case (e.g. for syntactic
sugar) but you might want
to reference some of these actions directly.
Following are parglare built-in actions from the parglare.actions
module:
-
pass_none - returns
None
; -
pass_nochange - returns second parameter of action callable (
value
ornodes
) unchanged; -
pass_empty - returns an empty list
[]
; -
pass_single - returns
nodes[0]
. Used implicitly by rules where all productions have only a single rule reference on the RHS; -
pass_inner - returns
nodes[1:-1]
ornodes[1] if len(nodes)==3
. Handy to strip surrounding parentheses; -
collect - Used for rules of the form
Elements: Elements Element | Element;
. Implicitly used for+
operator. Returns list; -
collect_sep - Used for rules of the form
Elements: Elements separator Element | Element;
. Implicitly used for+
with separator. Returns list; -
collect_optional - Can be used for rules of the form
Elements: Elements Element | Element | EMPTY;
. Returns list; -
collect_sep_optional - Can be used for rules of the form
Elements: Elements separator Element | Element | EMPTY;
. Returns list; -
collect_right - Can be used for rules of the form
Elements: Element Elements | Element;
. Returns list; -
collect_right_sep - Can be used for rules of the form
Elements: Element separator Elements | Element;
. Returns list; -
collect_right_optional - Can be used for rules of the form
Elements: Element Elements | Element | EMPTY;
. Returns list; -
collect_right_sep_optional - Can be used for rules of the form
Elements: Element separator Elements | Element | EMPTY;
. Returns list; -
optional - Used for rules of the form
OptionalElement: Element | EMPTY;
. Implicitly used for?
operator. Returns either a sub-expression value orNone
if empty match. -
obj - Used implicitly by rules using named matches. Creates Python object with attributes derived from named matches. Objects created this way have additional attributes
_pg_start_position
/_pg_end_position
with start/end position in the input stream where the object is found.
Actions for rules using named matches¶
If named matches are used in
the grammar rule, action will be called with additional keyword parameters named
by the name of LHS of rule assignments. If no action is specified for the rule a
built-in action obj
is called and will produce instance of dynamically created
Python class corresponding to the grammar rule. See more in the section
on named matches.
Dynamically created Python objects will have attributes created from assignments
in the grammar rules. Also, a special _pg_children
attribute is provided with
child nodes in the order as they are matched in the input. This may be useful
for tree iteration order. Please see this
test
for an example. In addition, _pg_children_names
is a list of attribute names
(i.e. a LHS of the assignments in the grammar.). Each created object has a
to_str()
method which produce a nice textual tree representation.
For performance reasons, AST nodes created with the default obj
action uses
slots so no dynamic attribute creation is possible. However, there is
uninitialized _pg_extras
attribute which can be used to add additional
user-defined information on AST nodes.
If for some reason you want to override the default behavior which creates Python object you can create an action like this:
S: first=a second=digit+[comma];
terminals
a: "a";
digit: /\d+/;
now create an action function that accepts additional params:
def s_action(context, nodes, first, second):
... do some transformation and return the result of S evaluation
... nodes will contain subexpression results by position while
... first and second will contain the values of corresponding
... sub-expressions
register action on Parser
instance as usual:
parser = Parser(grammar, actions={"S": s_action})