Parsers
Parsers take tokens from the lexer as input and recognize syntactic elements. They then call a builder to produce the final output.
There are two flavours of parsers supported by Rustemo:
- Deterministic LR
- Non-deterministic GLR or, more precisely, Right-Nulled GLR
GLR parsing is more complex because it must handle all possibilities, so it carries some overhead and LR parsing is generally faster. Use GLR only if you know that you need it, or early in development when you want to deal with SHIFT/REDUCE conflicts later.
Another benefit of LR parsing is that it is deterministic and unambiguous: if the input can be parsed, there is only one possible way to do it with LR.
The API for both flavours is similar. You create an instance of the generated
parser type and call either parse or parse_file. The first method accepts the
input directly, while the second accepts the path to the file that needs to be
parsed.
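The relationship between the two entry points can be pictured with a toy type. The sketch below is not Rustemo's generated code: TokenCounter and its behaviour are hypothetical, and the only assumption is that a parse_file-style method reads the file and hands its content to the parse-style method.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical toy "parser": it only counts whitespace-separated tokens.
struct TokenCounter;

impl TokenCounter {
    /// Accepts the input directly.
    fn parse(&self, input: &str) -> usize {
        input.split_whitespace().count()
    }

    /// Accepts a path, reads the file and delegates to `parse`.
    fn parse_file<P: AsRef<Path>>(&self, path: P) -> io::Result<usize> {
        let content = fs::read_to_string(path)?;
        Ok(self.parse(&content))
    }
}

fn main() {
    let parser = TokenCounter;

    // Input supplied directly.
    println!("tokens in string: {}", parser.parse("1 + 2"));

    // Input supplied through a file in the system temporary directory.
    let path = std::env::temp_dir().join("parse_file_sketch.txt");
    fs::write(&path, "1 + 2 * 3").expect("Failed to write the file.");
    let count = parser.parse_file(&path).expect("Failed to read the file.");
    println!("tokens in file: {count}");
    fs::remove_file(&path).expect("Failed to remove the file.");
}
```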
For example, in the calculator tutorial, we create a new parser instance and
call parse to parse the input supplied by the user on stdin:
    fn main() {
        let mut expression = String::new();

        // Read the line from the input
        println!("Expression:");
        io::stdin()
            .read_line(&mut expression)
            .expect("Failed to read line.");

        // Parse the line and get the result.
        let result = CalculatorParser::new().parse(&expression);

        // Print the result using Debug formatter.
        println!("{result:#?}");
    }
The parser type CalculatorParser is generated by Rustemo from the grammar
calculator.rustemo.
The parsing process returns a Result value: the Ok variant holds the result of
a successful parse, and the Err variant holds the error value.
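Handling both variants usually comes down to a match. In the minimal sketch below, parse_sum is a hypothetical stand-in for a generated parser's parse method, and its String error is an assumption made for illustration rather than Rustemo's error type.

```rust
/// Hypothetical stand-in for a generated parser: sums numbers separated by `+`.
/// The `String` error type is an assumption made for this sketch.
fn parse_sum(input: &str) -> Result<i64, String> {
    let mut total = 0;
    for term in input.split('+') {
        let term = term.trim();
        let value: i64 = term
            .parse()
            .map_err(|_| format!("expected a number, found {term:?}"))?;
        total += value;
    }
    Ok(total)
}

fn main() {
    for input in ["1 + 2 + 3", "1 + x"] {
        match parse_sum(input) {
            // Successful parse: the Ok variant carries the result.
            Ok(value) => println!("{input} => {value}"),
            // Failed parse: the Err variant carries the error value.
            Err(error) => eprintln!("{input} failed: {error}"),
        }
    }
}
```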
If deterministic parsing is used, the result will be the final output constructed by the configured builder.
For GLR, the result will be a Forest which contains all the possible
trees/solutions for the given input. To get the final output, you have to choose
a tree and call the builder over it.
To generate a GLR parser, either set the algorithm using the settings API (e.g. from a build.rs script):
    rustemo_compiler::Settings::new().parser_algo(ParserAlgo::GLR).process_dir()
or call the rcomp CLI with --parser-algo glr on your grammar file.
For an example of calling the GLR parser, see this test:
    #[test]
    fn glr_extract_tree_from_forest() {
        let forest = CalcParser::new().parse("1 + 4 * 9 + 3 * 2 + 7").unwrap();
        output_cmp!(
            "src/glr/forest/forest_tree_first.ast",
            format!("{:#?}", forest.get_first_tree().unwrap())
        );
        output_cmp!(
            "src/glr/forest/forest_tree_17.ast",
            format!("{:#?}", forest.get_tree(17).unwrap())
        );
        output_cmp!(
            "src/glr/forest/forest_tree_last.ast",
            format!("{:#?}", forest.get_tree(41).unwrap())
        );

        // Accessing a tree past the last.
        assert!(forest.get_tree(42).is_none());

        let tree = forest.get_tree(41).unwrap();
        output_cmp!(
            "src/glr/forest/forest_tree_children.ast",
            format!("{:#?}", tree.children()[0].children())
        );
    }
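The indices used in this test (tree 41 is the last one, tree 42 does not exist) mean that the forest holds 42 trees. That is the figure one would expect if the test grammar is fully ambiguous, with no priorities or associativities, which is an assumption here: n binary operators can then be grouped in as many ways as there are binary trees with n inner nodes, the n-th Catalan number. A small sketch, independent of Rustemo, that counts them (count_trees is a name introduced for illustration):

```rust
/// Number of distinct binary trees that can be built over `operators`
/// binary operators (the Catalan number for that count).
fn count_trees(operators: usize) -> u64 {
    // counts[n] = number of trees with n operators.
    let mut counts = vec![0u64; operators + 1];
    counts[0] = 1; // A single operand forms exactly one tree.
    for n in 1..=operators {
        let mut total = 0;
        // Pick the root operator: `left` operators go to its left subtree,
        // the remaining `n - 1 - left` go to its right subtree.
        for left in 0..n {
            total += counts[left] * counts[n - 1 - left];
        }
        counts[n] = total;
    }
    counts[operators]
}

fn main() {
    let input = "1 + 4 * 9 + 3 * 2 + 7";
    let operators = input
        .split_whitespace()
        .filter(|token| *token == "+" || *token == "*")
        .count();
    // Five operators give 42 possible trees.
    println!("{}", count_trees(operators));
}
```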
The most useful API calls for Forest are get_tree and get_first_tree.
There is also solutions, which gives you the number of trees in the forest.
Forest supports into_iter() and iter() so it can be used in the context of
a for loop.
    #[test]
    fn glr_forest_into_iter() {
        let forest = CalcParser::new().parse("1 + 4 * 9 + 3 * 2 + 7").unwrap();

        let mut forest_get_tree_string = String::new();
        let mut forest_iter_string = String::new();

        for tree_idx in 0..forest.solutions() {
            forest_get_tree_string.push_str(&format!("{:#?}", forest.get_tree(tree_idx).unwrap()))
        }

        for tree in forest {
            forest_iter_string.push_str(&format!("{tree:#?}"));
        }
        assert_eq!(forest_get_tree_string, forest_iter_string);
        output_cmp!("src/glr/forest/forest_into_iter.ast", forest_iter_string);
    }

    #[test]
    fn glr_forest_iter() {
        let forest = CalcParser::new().parse("1 + 4 * 9 + 3 * 2 + 7").unwrap();

        let mut forest_get_tree_string = String::new();
        let mut forest_iter_string = String::new();
        let mut forest_iter_ref_string = String::new();

        for tree_idx in 0..forest.solutions() {
            forest_get_tree_string.push_str(&format!("{:#?}", forest.get_tree(tree_idx).unwrap()))
        }

        for tree in forest.iter() {
            forest_iter_string.push_str(&format!("{tree:#?}"));
        }

        for tree in &forest {
            forest_iter_ref_string.push_str(&format!("{tree:#?}"));
        }
        assert_eq!(forest_get_tree_string, forest_iter_string);
        assert_eq!(forest_get_tree_string, forest_iter_ref_string);
        output_cmp!("src/glr/forest/forest_iter.ast", forest_iter_string);
    }
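What makes a type usable in all three loop forms is an implementation of IntoIterator for both the type and a reference to it. The sketch below shows that pattern on a toy type. ToyForest is hypothetical: it borrows the method names from the text, but it stores its trees as plain strings in a Vec and hands out references, which is an assumption for illustration and not how Rustemo's Forest is implemented.

```rust
/// Toy stand-in for a parse forest: trees are just strings kept in a Vec.
struct ToyForest {
    trees: Vec<String>,
}

impl ToyForest {
    /// Number of trees in the forest.
    fn solutions(&self) -> usize {
        self.trees.len()
    }

    /// The tree at `idx`, or `None` past the last tree.
    fn get_tree(&self, idx: usize) -> Option<&String> {
        self.trees.get(idx)
    }

    fn get_first_tree(&self) -> Option<&String> {
        self.trees.first()
    }

    fn iter(&self) -> std::slice::Iter<'_, String> {
        self.trees.iter()
    }
}

/// Enables `for tree in forest` (consumes the forest).
impl IntoIterator for ToyForest {
    type Item = String;
    type IntoIter = std::vec::IntoIter<String>;

    fn into_iter(self) -> Self::IntoIter {
        self.trees.into_iter()
    }
}

/// Enables `for tree in &forest` (borrows the forest).
impl<'a> IntoIterator for &'a ToyForest {
    type Item = &'a String;
    type IntoIter = std::slice::Iter<'a, String>;

    fn into_iter(self) -> Self::IntoIter {
        self.trees.iter()
    }
}

/// The two groupings an ambiguous grammar allows for "1 + 2 * 3".
fn toy_forest() -> ToyForest {
    ToyForest {
        trees: vec!["(1 + (2 * 3))".to_string(), "((1 + 2) * 3)".to_string()],
    }
}

fn main() {
    let forest = toy_forest();
    println!("first: {}", forest.get_first_tree().unwrap());
    for idx in 0..forest.solutions() {
        println!("get_tree({idx}): {}", forest.get_tree(idx).unwrap());
    }
    for tree in forest.iter() {
        println!("iter: {tree}");
    }
    for tree in &forest {
        println!("&forest: {tree}");
    }
    // This loop consumes the forest, so it comes last.
    for tree in forest {
        println!("into_iter: {tree}");
    }
}
```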
A tree can accept a builder through its build method; this is how the default
builder is called over a tree taken from the forest.
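The general idea of a tree accepting a builder can be sketched without Rustemo. Everything below is hypothetical: the Tree enum, the Builder trait with its leaf and node methods, and the Evaluator are stand-ins invented for illustration, and Rustemo's own builder interface may look quite different. The sketch builds an output from each of the two trees an ambiguous grammar allows for "1 + 2 * 3".

```rust
/// Toy parse tree for arithmetic expressions (not Rustemo's tree type).
enum Tree {
    Num(i64),
    Op(Box<Tree>, char, Box<Tree>),
}

/// Hypothetical builder interface: turns tree nodes into an output value.
trait Builder {
    type Output;
    fn leaf(&mut self, value: i64) -> Self::Output;
    fn node(&mut self, left: Self::Output, op: char, right: Self::Output) -> Self::Output;
}

impl Tree {
    /// Walks the tree bottom-up and lets the builder produce the output.
    fn build<B: Builder>(&self, builder: &mut B) -> B::Output {
        match self {
            Tree::Num(value) => builder.leaf(*value),
            Tree::Op(left, op, right) => {
                let left = left.build(builder);
                let right = right.build(builder);
                builder.node(left, *op, right)
            }
        }
    }
}

/// A builder whose output is the evaluated value of the expression.
struct Evaluator;

impl Builder for Evaluator {
    type Output = i64;

    fn leaf(&mut self, value: i64) -> i64 {
        value
    }

    fn node(&mut self, left: i64, op: char, right: i64) -> i64 {
        match op {
            '+' => left + right,
            '*' => left * right,
            other => panic!("unsupported operator {other}"),
        }
    }
}

fn num(value: i64) -> Box<Tree> {
    Box::new(Tree::Num(value))
}

/// The two trees for "1 + 2 * 3": 1 + (2 * 3) and (1 + 2) * 3.
fn trees_for_example() -> Vec<Tree> {
    vec![
        Tree::Op(num(1), '+', Box::new(Tree::Op(num(2), '*', num(3)))),
        Tree::Op(Box::new(Tree::Op(num(1), '+', num(2))), '*', num(3)),
    ]
}

fn main() {
    let mut builder = Evaluator;
    for tree in trees_for_example() {
        // Prints 7 for the first tree and 9 for the second.
        println!("{}", tree.build(&mut builder));
    }
}
```

The two trees produce different outputs, which is why the tree has to be chosen before the builder is called.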