Parser parameters¶
There are several parameters you can pass during the parser construction. The
mandatory first parameter is the Grammar instance. Other parameters are
explained in the rest of this section.
All parameters described here work for both parglare.Parser and
parglare.GLRParser classes.
actions¶
This parameter is a dict of actions keyed by the name of the grammar rule.
layout_actions¶
This parameter is used to specify actions called when the rules of the layout sub-grammar are reduced. This is rarely needed, but there are times when you would like to process matched layout (e.g. whitespace, comments).
It is given in the same format as actions parameter, a dict of callables keyed
by grammar rule names.
ws¶
This parameter specifies a string whose characters are considered to be
whitespace. By default its value is '\n\r\t '. It is used if the layout
sub-grammar (LAYOUT grammar rule) is not defined. If the LAYOUT rule is given
in the grammar, it is used instead and this parameter is ignored.
build_tree¶
A boolean whose default value is False. If set to True, the parser will call
implicit actions that build the parse tree.
call_actions_during_tree_build¶
By default, this parameter is set to False. If set to True, the parser will
call actions during the parse tree building process. The return value of each
action will be discarded, since the implicit tree-building actions determine
the result.
Warning
Use this parameter with special care when GLR is used, since actions will be called even on trees that can't be completed (unsuccessful parses).
consume_input¶
A boolean whose value is True by default. If True the whole input must be
consumed for the parse to be considered successful. This is most of the time
what you want. If set to False then LR parser will parse as much as possible
and leave the rest of the input unconsumed while GLR parser will produce all
possible parses with both completely and incompletely consumed input.
Warning
Be aware that setting this option to False for GLR usually leads to a high
level of ambiguity and multiple parses, as any substring from the beginning of
the input that parses will be considered a valid parse.
prefer_shifts¶
By default set to True for the LR parser and to False for the GLR parser. In
case of shift/reduce conflicts this strategy favors shift over reduce. You can
still use associativity rules to decide per production.
You can disable this rule on a per-production basis by using nops on the
production.
Warning
Do not use prefer_shifts if you don't understand the implications. Try to
understand conflicts and
resolution strategies.
prefer_shifts_over_empty¶
By default set to True for the LR parser and to False for the GLR parser. In
case of shift/reduce conflicts on empty reductions this strategy favors shift
over reduce. You can still use associativity rules to decide per production.
You can disable this rule on a per-production basis by using nopse on the
production.
Warning
Do not use prefer_shifts_over_empty if you don't understand the
implications. Try to understand conflicts and
resolution strategies.
error_recovery¶
By default set to False. If set to True default error recovery will be used.
If set to a Python function, the function will be called to recover from errors.
For more information see Error recovery.
debug/debug_layout¶
If set to True, this parameter puts the parser in debug mode. In this mode the
parser prints detailed information about its actions to the standard output.
To put the layout subparser in debug mode, use the debug_layout parameter. Both
parameters are set to False by default.
For more information see Debugging.
debug_colors¶
Set this to True to enable terminal colors in debug/trace output. False by
default.
tables¶
The value of this parameter is either parglare.LALR or parglare.SLR and it
is used to choose the type of LR tables to create. By default LALR tables are
used with a slight twist to avoid Reduce/Reduce conflicts that may happen with
pure LALR tables. This parameter should not be used in normal circumstances and
is provided more for experimentation purposes.
force_load_table¶
The LR table is loaded from the <grammar_file_name>.pgt file if the file exists
and is newer than all of the grammar files, root and imported. If the
modification time of any grammar file is greater than the modification time of
the cached LR table file, the table is recalculated and persisted. If you are
deploying the parser in a way that changes file modification times, which would
trigger table calculation, you can set force_load_table to True. If this flag
is set, no modification check is performed and table calculation happens only
if the .pgt file doesn't exist.
table¶
You can pass a precomputed parsing table here. This is useful for implementing
custom parse table caching. A None value for this parameter (the default)
instructs the parser to build (or fetch from cache) its own tables internally.
The flow for custom caching is shown in an example.
Warning
Be careful to provide parse tables compatible with the parser type. Passing
tables containing conflicts to the Parser class will probably result in an
error, but passing tables with automatically resolved conflicts
(prefer_shifts=True) to GLRParser will result in a parser that may skip
proper parses.
parse and parse_file calls¶
The parse call is used to parse an input string or a list of objects. To parse
a textual file, use parse_file.
These two calls accept the following parameters:
- input_str - first positional and mandatory parameter only for the parse call - the input string/list of objects.
- position - the start position to parse from. By default 0.
- extra - an object used for arbitrary user state kept during parsing. It will be accessible on context-like objects. If not given, an instance of dict will be created.
- file_name - first positional and mandatory parameter only for the parse_file call - the name/path of the file to parse.
Token class¶
This class from parglare.parser is used to represent lookahead tokens. Token
is a concrete matched terminal from the input stream.
Attributes:
- symbol (Terminal) - terminal grammar symbol represented by this token,
- value (list or str) - matched part of the input stream,
- additional_data (list) - additional information returned by a custom recognizer,
- length (int) - length of the matched input.