Actions¶
Actions (a.k.a. semantic actions or reductions actions) are Python callables (functions or lambdas mostly) that get called to reduce the recognized pattern to some higher concept. E.g. in the calc example actions are called to calculate sub-expressions.
There are two consideration to think of:
- Which actions are called?
- When actions are called?
Custom actions and built-in actions¶
If you don't provide actions of your own the parser will return nested list
corresponding to your grammar. Each non-terminal result in a list of evaluated
sub-expression while each terminal result in the matched string. If the parser
parameter build_tree
is set to True
the parser will
build a parse tree.
Custom actions are provided to the parser during parser instantiation as
actions
parameter which must be a Python dict where the keys are the names of
the rules from the grammar and values are the action callables or a list of
callables if the rule has more than one production/choice. You can provide
additional actions that are not named after the grammar rule names, these
actions may be referenced from the grammar
using
@
syntax for action specification.
Lets take a closer look at the quick intro example:
grammar = r"""
E: E '+' E {left, 1}
| E '-' E {left, 1}
| E '*' E {left, 2}
| E '/' E {left, 2}
| E '^' E {right, 3}
| '(' E ')'
| number;
number: /\d+(\.\d+)?/;
"""
actions = {
"E": [lambda _, nodes: nodes[0] + nodes[2],
lambda _, nodes: nodes[0] - nodes[2],
lambda _, nodes: nodes[0] * nodes[2],
lambda _, nodes: nodes[0] / nodes[2],
lambda _, nodes: nodes[0] ** nodes[2],
lambda _, nodes: nodes[1],
lambda _, nodes: nodes[0]],
"number": lambda _, value: float(value),
}
g = Grammar.from_string(grammar)
parser = Parser(g, actions=actions)
result = parser.parse("34 + 4.6 / 2 * 4^2^2 + 78")
Here you can see that for rule E
we provide a list of lambdas, one lambda for
each operation. The first element of the list corresponds to the first
production of the E
rule (E '+' E {left, 1}
), the second to the second and
so on. For number
rule there is only a single lambda which converts the
matched string to the Python float
type, because number
has only a single
production. Actually, number
is a terminal definition and thus the second
parameter in action call will not be a list but a matched value itself. At the
end we instantiate the parser and pass in our actions
using the parameter.
Each action callable receive two parameters. The first is the context object
which gives parsing context information (like the start
and end position where the match occured, the parser instance etc.). The second
parameters nodes
is a list of actual results of sub-expressions given in the
order defined in the grammar.
For example:
lambda _, nodes: nodes[0] * nodes[2],
In this line we don't care about context thus giving it the _
name. nodes[0]
will cary the value of the left sub-expression while nodes[2]
will carry the
result of the right sub-expression. nodes[1]
must be *
and we don't need to
check that as the parser already did that for us.
The result of the parsing will be the evaluated expression as the actions will
get called along the way and the result of each actions will be used as an
element of the nodes
parameter in calling actions higher in the hierarchy.
If we don't provide actions
, by default parglare will return a matched string
for each terminal and a list of sub-expressions for each non-terminal
effectively producing nested lists. If we set build_tree
parameter of the
parser to True
the parser will produce a parse tree whose
elements are instances of NodeNonTerm
and NodeTerm
classes representing a
non-terminals and terminals respectively.
action
decorator¶
You can use a special decorator/collector factory parglare.get_collector
to
create decorator that can be used to collect all actions.
from parglare import get_collector
action = get_collector()
@action
def number(_, value):
return float(value)
@action('E')
def sum_act(_, nodes):
return nodes[0] + nodes[2]
@action('E')
def pass_act_E(_, nodes):
return nodes[0]
@action
def T(_, nodes):
if len(nodes) == 3:
return nodes[0] * nodes[2]
else:
return nodes[0]
@action('F')
def parenthesses_act(_, nodes):
return nodes[1]
@action('F')
def pass_act_F(_, nodes):
return nodes[0]
p = Parser(grammar, actions=action.all)
In the previous example action
decorator is created using get_collector
factory. This decorator is parametrized where optional parameter is the name of
the action. If the name is not given the name of the decorated function will be
used. As you can see in the previous example, same name can be used multiple
times (e.g. E
for sum_act
and pass_act_E
). If same name is used multiple
times all action functions will be collected as a list in the order of
definition. Dictionary holding all actions for the created action decorator is
action.all
.
Time of actions call¶
In parglare actions can be called during parsing (i.e. on the fly) which you could use if you want to transform input immediately without building the parse tree. But there are times when you would like to build a tree first and call actions afterwards. For example, a very good reason is if you are using GLR and you want to be sure that actions are called only on the final tree.
Note
If you are using GLR be sure that your actions has no side-effects, as the
dying parsers will left those side-effects behind leading to unpredictable
behaviour. In case of doubt create trees first, choose the right one and
call actions afterwards with the call_actions
parser method.
To get the tree and call actions afterwards you supply actions
parameter to
the parser as usual and set build_tree
to True
. When the parser finishes
successfully it will return the parse tree which you pass to the call_actions
method of the parser object to execute actions. For example:
parser = Parser(g, actions=actions)
tree = parser.parse("34 + 4.6 / 2 * 4^2^2 + 78")
result = parser.call_actions(tree)
The Context object¶
The first parameter passed to the action function is the Context object. This object provides us with the parsing context of where the match occurred.
Following attributes are available on the context object:
-
start_position/end_position - the beginning and the end in the input stream where the match occured.
start_position
is the location of the first element/character in the input while theend_position
is one past the last element/character of the match. Thusend_position - start_position
will give the lenght of the match including the layout. You can useparglare.pos_to_line_col(input, position)
function to get line and column of the position. This function returns a tuple(line, column)
. -
file_name - the name/path of the file being parsed.
None
if Python string is parsed. -
input_str - the input string (or list of objects) that is being parsed.
-
layout_content - is the layout (whitespaces, comments etc.) that are collected from the previous non-layout match. Default actions for building the parse tree will attach these layouts to the tree nodes.
-
symbol - the grammar symbol (instance of either
Terminal
orNonTerminal
which inheritsGrammarSymbol
) this match is for. -
production - an instance of
parglare.grammar.Production
class available only on reduction actions (not on shifts). Represents the grammar production. -
node - this is available only if the actions are called over the parse tree using
call_actions
. It represens the instance ofNodeNonTerm
orNodeTerm
classes from the parse tree where the actions is executed. -
parser - is the reference to the parser instance. You should use this only to investigate parser configuration not to alter its state.
You can also use context object to pass information between lower level and upper level actions. You can attach any attribute you like, the context object is shared between action calls. It is shared with the internal layout parser too.
Built-in actions¶
parglare provides some common actions in the module parglare.actions
. You
can
reference these actions directly from the grammar.
Built-in actions are used implicitly by parglare as default actions in
particular case (e.g.
for syntactic sugar) but
you might want to reference some of these actions directly.
Following are parglare built-in actions from the parglare.actions
module:
-
pass_none - returns
None
; -
pass_nochange - returns second parameter of action callable (
value
ornodes
) unchanged; -
pass_empty - returns an empty list
[]
; -
pass_single - returns
nodes[0]
. Used implicitly by rules where all productions have only a single rule reference on the RHS; -
pass_inner - returns
nodes[1]
. Handy to extract sub-expression value for values in parentheses; -
collect - Used for rules of the form
Elements: Elements Element | Element;
. Implicitly used for+
operator. Returns list; -
collect_sep - Used for rules of the form
Elements: Elements separator Element | Element;
. Implicitly used for+
with separator. Returns list; -
collect_optional - Can be used for rules of the form
Elements: Elements Element | Element | EMPTY;
. Returns list; -
collect_sep_optional - Can be used for rules of the form
Elements: Elements separator Element | Element | EMPTY;
. Returns list; -
collect_right - Can be used for rules of the form
Elements: Element Elements | Element;
. Returns list; -
collect_right_sep - Can be used for rules of the form
Elements: Element separator Elements | Element;
. Returns list; -
collect_right_optional - Can be used for rules of the form
Elements: Element Elements | Element | EMPTY;
. Returns list; -
collect_right_sep_optional - Can be used for rules of the form
Elements: Element separator Elements | Element | EMPTY;
. Returns list; -
optional - Used for rules of the form
OptionalElement: Element | EMPTY;
. Implicitly used for?
operator. Returns either a sub-expression value orNone
if empty match. -
obj - Used implicitly by rules using named matches. Creates Python object with attributes derived from named matches. Objects created this way have additional attributes
_pg_start_position
/_pg_end_position
with start/end position in the input stream where the object is found.
Actions for rules using named matches¶
If named matches are used in
the grammar rule, action will be called with additional keyword parameters named
by the name of LHS of rule assignments. If no action is specified for the rule a
built-in action obj
is called and will produce instance of dynamically created
Python class corresponding to the grammar rule. See more in the section
on named matches.
If for some reason you want to override default behavior that create Python object you can create action like this:
S: first=a second=digit+[comma];
a: "a";
digit: /\d+/;
now create action function that accepts additional params:
def s_action(context, nodes, first, second):
... do some transformation and return the result of S evaluation
... nodes will contain subexpression results by position while
... first and second will contain the values of corresponding
... sub-expressions
register action on Parser
instance as usual:
parser = Parser(grammar, actions={"S": s_action})