Parsers
Parsers use tokens from lexer as inputs and recognize syntactic elements. Then, they call a builder to produce the final output.
There are two flavours of parsers supported by Rustemo:
- Deterministic LR
- Non-deterministic GLR, or more precise Right-Nulled GLR
GLR parsing is more complex as it must handle all possibilities so there is some overhead and LR parsing is generally faster. Thus, use GLR only if you know that you need it or in the early development process when you want to deal with SHIFT/REDUCE conflicts later.
Another benefit of LR parsing is that it is deterministic and non-ambiguous. If the input can be parsed there is only one possible way to do it with LR.
The API for both flavours is similar. You create an instance of the generated
parser type and call either parse
or parse_file
where the first method
accepts the input directly while the second method accepts the path to the file
that needs to be parsed.
For example, in the calculator tutorial, we create a new parser instance and
call parse
to parse the input supplied by the user on the stdin:
fn main() { let mut expression = String::new(); // Read the line from the input println!("Expression:"); io::stdin() .read_line(&mut expression) .expect("Failed to read line."); // Parse the line and get the result. let result = CalculatorParser::new().parse(&expression); // Print the result using Debug formatter. println!("{:#?}", result); }
The parser type CalculatorParser
is generated by Rustemo from grammar
calculator.rustemo
.
The result of the parsing process is a Result
value which contains either the
result of parsing if successful, in the Ok
variant, or the error value in
Err
variant.
If deterministic parsing is used the result will be the final output constructed by the configured builder.
For GLR the result will be Forest
which contains all the possible
trees/solution for the given input. For the final output you have to choose the
tree and call the builder over it.
To generate GLR parser either set the algorithm using settings API (e.g. from build.rs
script):
#![allow(unused)] fn main() { rustemo_compiler::Settings::new().parser_algo(ParserAlgo::GLR).process_dir() }
or call rcomp
CLI with --parser-algo glr
over your grammar file.
For example of calling GLR parser see this test:
#![allow(unused)] fn main() { #[test] fn glr_extract_tree_from_forest() { let forest = CalcParser::new().parse("1 + 4 * 9 + 3 * 2 + 7").unwrap(); output_cmp!( "src/glr/forest/forest_tree_first.ast", format!("{:#?}", forest.get_first_tree().unwrap()) ); output_cmp!( "src/glr/forest/forest_tree_17.ast", format!("{:#?}", forest.get_tree(17).unwrap()) ); output_cmp!( "src/glr/forest/forest_tree_last.ast", format!("{:#?}", forest.get_tree(41).unwrap()) ); // Accessing a tree past the last. assert!(forest.get_tree(42).is_none()); let tree = forest.get_tree(41).unwrap(); output_cmp!( "src/glr/forest/forest_tree_children.ast", format!("{:#?}", tree.children()[0].children()) ); } }
The most useful API calls for Forest
are get_tree
and get_first_tree
.
There is also solutions
which gives your the number of trees in the forest.
Forest
supports into_iter()
and iter()
so it can be used in the context of
a for loop.
#![allow(unused)] fn main() { #[test] fn glr_forest_into_iter() { let forest = CalcParser::new().parse("1 + 4 * 9 + 3 * 2 + 7").unwrap(); let mut forest_get_tree_string = String::new(); let mut forest_iter_string = String::new(); for tree_idx in 0..forest.solutions() { forest_get_tree_string.push_str(&format!("{:#?}", forest.get_tree(tree_idx).unwrap())) } for tree in forest { forest_iter_string.push_str(&format!("{tree:#?}")); } assert_eq!(forest_get_tree_string, forest_iter_string); output_cmp!("src/glr/forest/forest_into_iter.ast", forest_iter_string); } #[test] fn glr_forest_iter() { let forest = CalcParser::new().parse("1 + 4 * 9 + 3 * 2 + 7").unwrap(); let mut forest_get_tree_string = String::new(); let mut forest_iter_string = String::new(); let mut forest_iter_ref_string = String::new(); for tree_idx in 0..forest.solutions() { forest_get_tree_string.push_str(&format!("{:#?}", forest.get_tree(tree_idx).unwrap())) } for tree in forest.iter() { forest_iter_string.push_str(&format!("{tree:#?}")); } for tree in &forest { forest_iter_ref_string.push_str(&format!("{tree:#?}")); } assert_eq!(forest_get_tree_string, forest_iter_string); assert_eq!(forest_get_tree_string, forest_iter_ref_string); output_cmp!("src/glr/forest/forest_iter.ast", forest_iter_string); } }
A tree can accept a builder using the build
method. For an example of calling
the default builder over the forest tree see this test:
#![allow(unused)] fn main() { }