Smart parser
This software is a library that generates the source code of recursive descent parsers from a grammar that combines parsing expressions with native Dart source code.
Version: 1.0.3
- Smart parser
- About this software
- Practical use
- Grammar
- Generating the parser source code
- Error handling system
- Expressions
- Expression AnyCharacter
- Expression AndPredicate
- Expression CharacterClass
- Expression Group
- Expression Literal
- Expression NotPredicate
- Expression OneOrMore
- Expression Optional
- Expression OrderedChoice
- Expression Sequence
- Expression ZeroOrMore
- Expression Action
- Expression Capture
- Expression Predicate
- Meta expression @position
- Meta expression @while
- Semantic values
- Parsing case-insensitive data
- Parsing data from files
- Examples of generated errors
About this software
This software is a library that generates the source code of recursive descent parsers from a grammar that combines parsing expressions with native Dart source code.
Productions are generated as functions, and expressions are generated as statements.
The expression generator produces source code in several stages.
In the first stage, states are generated from the expressions.
In the second stage, parsing code generators are created for the expressions within the states.
In the third stage, the state machine is built; as each individual state is built, the production code is filled with expression code in the order of the corresponding state events.
This approach allows for the generation of more efficient parsing algorithms.
In particular, the advantages are as follows:
- The number of variables (RHS) used is reduced
- The number of direct uses of values (LHS), including constant values, increases
- The efficiency of control transfer increases
As a result, the generated parsing code implements the same algorithms that could be written by hand or produced by a conventional generator, but with some degree of optimization owing to the use of a state machine.
An example of transferring control using the return statement.
Grammar code:
`int` ABC =>
[a] / [b] / [c]
Dart code:
/// [int] **ABC**
/// ```txt
/// `int` ABC =>
/// [a] / [b] / [c]
/// ```
Result<int>? parseABC(State state) {
final $0 = state.peek();
// [a]
if ($0 == 97) {
state.position += 1;
return const Ok(97);
}
// [b]
if ($0 == 98) {
state.position += 1;
return const Ok(98);
}
// [c]
if ($0 == 99) {
state.position += 1;
return const Ok(99);
}
return null;
}
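The generated functions rely on a few runtime types (State, Result, Ok) that are not shown in this section. The sketch below uses simplified, hypothetical stand-ins for them, written only so that the parseABC function above can be run in isolation. The library's real definitions are likely to differ, for example in how peek() handles characters outside the Basic Multilingual Plane.

```dart
// Hypothetical, simplified stand-ins for the runtime types referenced by the
// generated code. They are assumptions made for illustration only.
class Result<T> {
  /// The parsed value, exposed under the name the generated code uses.
  final T $1;

  const Result(this.$1);
}

class Ok<T> extends Result<T> {
  const Ok(T value) : super(value);
}

class State {
  final String source;
  int position = 0;

  State(this.source);

  /// Returns the UTF-16 code unit at the current position, or -1 at the end
  /// of the input. Surrogate pairs are ignored in this sketch.
  int peek() => position < source.length ? source.codeUnitAt(position) : -1;
}

// The generated function from the example above, unchanged apart from layout.
Result<int>? parseABC(State state) {
  final $0 = state.peek();
  // [a]
  if ($0 == 97) {
    state.position += 1;
    return const Ok(97);
  }
  // [b]
  if ($0 == 98) {
    state.position += 1;
    return const Ok(98);
  }
  // [c]
  if ($0 == 99) {
    state.position += 1;
    return const Ok(99);
  }
  return null;
}

void main() {
  final state = State('b');
  final result = parseABC(state);
  print(result?.$1); // 98
  print(state.position); // 1
  print(parseABC(State('x'))); // null
}
```

A null return value signals a failure, and the position is left untouched in that case, which is why no backtracking code is needed in this function.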
An example of transferring control using labeled break statements.
Grammar code:
`(int, int)` ABC =>
ab = ([a] / [b])
c = [c]
$ = { (ab, c) }
Dart code:
/// [(int, int)] **ABC**
/// ```txt
/// `(int, int)` ABC =>
/// ab = ([a] / [b])
/// c = [c]
/// $ = { (ab, c) }
/// ```
Result<(int, int)>? parseABC(State state) {
final $0 = state.position;
Result<int>? $1;
$l:
{
final $2 = state.peek();
// [a]
if ($2 == 97) {
state.position += 1;
$1 = const Ok(97);
break $l;
}
// [b]
if ($2 == 98) {
state.position += 1;
$1 = const Ok(98);
break $l;
}
}
if ($1 != null) {
final ab = $1.$1;
final $3 = state.peek();
// [c]
if ($3 == 99) {
state.position += 1;
const c = 99;
final $4 = (ab, c);
return Ok($4);
} else {
state.backtrack($0);
}
}
return null;
}
An example of transferring control using continue statements.
Grammar code:
`List<int>` AB =>
$ = ([a] / [b])+
Dart code:
/// [List<int>] **AB**
/// ```txt
/// `List<int>` AB =>
/// $ = ([a] / [b])+
/// ```
Result<List<int>>? parseAB(State state) {
final $0 = <int>[];
// (1)
while (true) {
final $1 = state.peek();
// [a]
if ($1 == 97) {
state.position += 1;
$0.add(97);
continue;
}
// [b]
if ($1 == 98) {
state.position += 1;
$0.add(98);
continue;
}
break;
}
if ($0.isNotEmpty) {
return Ok($0);
}
return null;
}
Practical use
The grammar is simple and intuitive, so understanding it should not be difficult.
The quality of the generated code is quite acceptable.
The performance of the generated parsers is quite good.
All of the above makes this software suitable for implementing practical applications, including tokenizers and real-time parsers (for JSON, CSV, XML, and other formats).
A planned feature is the generation of parsers that parse the tokenized input produced by tokenizers.
Example of parsing a C escape sequence (partially).
Grammar code:
`String` Escape =>
"n"
$ = `const` { '\n' }
----
"r"
$ = `const` { '\r' }
----
"t"
$ = `const` { '\t' }
Dart code:
/// [String] **Escape**
/// ```txt
/// `String` Escape =>
/// "n"
/// $ = `const` { '\n' }
/// ----
/// "r"
/// $ = `const` { '\r' }
/// ----
/// "t"
/// $ = `const` { '\t' }
/// ```
Result<String>? parseEscape(State state) {
final $0 = state.peek();
// 'n'
if ($0 == 110) {
state.position += 1;
const $1 = '\n';
return const Ok($1);
}
// 'r'
if ($0 == 114) {
state.position += 1;
const $2 = '\r';
return const Ok($2);
}
// 't'
if ($0 == 116) {
state.position += 1;
const $3 = '\t';
return const Ok($3);
}
return null;
}
Example of punctuation token generation.
The generator does not analyze the code inside actions, but the source code of an action must contain balanced pairs of { and } characters. For this reason, these characters are written in escaped form (\u007B and \u007D) inside the string literals below.
Grammar code:
`Token` Punctuation =>
{ final start = state.position; }
","
$ = { _token(start, state.position, ",", tokenKind.comma) }
----
"}"
$ = { _token(start, state.position, "\u007D", tokenKind.closeBrace) }
----
"{"
$ = { _token(start, state.position, "\u007B", tokenKind.openBrace) }
----
":"
$ = { _token(start, state.position, ":", tokenKind.colon) }
----
"=>"
$ = { _token(start, state.position, "=>", tokenKind.rightArrow) }
Dart code:
/// [Token] **Punctuation**
/// ```txt
/// `Token` Punctuation =>
/// { final start = state.position; }
/// ","
/// $ = { _token(start, state.position, ",", tokenKind.comma) }
/// ----
/// "}"
/// $ = { _token(start, state.position, "\u007D", tokenKind.closeBrace) }
/// ----
/// "{"
/// $ = { _token(start, state.position, "\u007B", tokenKind.openBrace) }
/// ----
/// ":"
/// $ = { _token(start, state.position, ":", tokenKind.colon) }
/// ----
/// "=>"
/// $ = { _token(start, state.position, "=>", tokenKind.rightArrow) }
/// ```
Result<Token>? parsePunctuation(State state) {
final start = state.position;
final $0 = state.peek();
// ','
if ($0 == 44) {
state.position += 1;
final $1 = _token(start, state.position, ",", tokenKind.comma);
return Ok($1);
}
// '}'
if ($0 == 125) {
state.position += 1;
final $2 = _token(start, state.position, "\u007D", tokenKind.closeBrace);
return Ok($2);
}
// '{'
if ($0 == 123) {
state.position += 1;
final $3 = _token(start, state.position, "\u007B", tokenKind.openBrace);
return Ok($3);
}
// ':'
if ($0 == 58) {
state.position += 1;
final $4 = _token(start, state.position, ":", tokenKind.colon);
return Ok($4);
}
if ($0 == 61 && state.startsWith('=>')) {
state.position += 2;
final $5 = _token(start, state.position, "=>", tokenKind.rightArrow);
return Ok($5);
}
return null;
}
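The brace-balancing requirement mentioned before this example can be illustrated with a short sketch. The helper hasBalancedBraces below is hypothetical and is not part of the library; it only counts brace characters, which is roughly what a generator that does not parse Dart code could be expected to do.

```dart
/// Hypothetical helper: returns true when every `{` in [code] is closed by a
/// later `}`. It counts characters only and knows nothing about Dart strings.
bool hasBalancedBraces(String code) {
  var depth = 0;
  for (final unit in code.codeUnits) {
    if (unit == 0x7B) {
      depth++;
    } else if (unit == 0x7D) {
      depth--;
      if (depth < 0) {
        return false;
      }
    }
  }
  return depth == 0;
}

void main() {
  // A raw brace inside a string literal unbalances the action source.
  print(hasBalancedBraces(r'_token(start, end, "{", kind)')); // false
  // The escaped form keeps the action source balanced...
  print(hasBalancedBraces(r'_token(start, end, "\u007B", kind)')); // true
  // ...and still denotes the same one-character string at run time.
  print('\u007B' == '{'); // true
}
```

Writing the braces as Unicode escapes therefore changes nothing in the produced tokens; it only keeps the action text acceptable to a character-counting check.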
Example of identifier token generation.
Grammar code:
`String` Identifier =>
# !keyword
!(
# keywords
(
"foreach"
---
"for"
)
# !identifier cont.
! [a-zA-Z0-9]
)
# identifier
$ = <
[a-zA-Z]
[a-zA-Z0-9]*
>
Dart code:
/// [String] **Identifier**
/// ```txt
/// `String` Identifier =>
/// # !keyword
/// !(
/// # keywords
/// (
/// "foreach"
/// ---
/// "for"
/// )
/// # !identifier cont.
/// ! [a-zA-Z0-9]
/// )
/// # identifier
/// $ = <
/// [a-zA-Z]
/// [a-zA-Z0-9]*
/// >
/// ```
Result<String>? parseIdentifier(State state) {
final $0 = state.position;
state.predicate++;
var $1 = true;
var $2 = false;
$l:
{
final $3 = state.peek();
if ($3 == 102 && state.startsWith('foreach')) {
state.position += 7;
$2 = true;
break $l;
}
if ($3 == 102 && state.startsWith('for')) {
state.position += 3;
$2 = true;
break $l;
}
}
if ($2) {
final $4 = state.position;
state.predicate++;
var $5 = true;
final $6 = state.peek();
// [a-zA-Z0-9]
final $7 = $6 <= 90 ? $6 >= 65 || $6 >= 48 && $6 <= 57 : $6 >= 97 && $6 <= 122;
if ($7) {
state.position += 1;
$5 = false;
state.backtrack($4);
}
state.predicate--;
if ($5) {
$1 = false;
state.backtrack($0);
} else {
state.backtrack($0);
}
}
state.predicate--;
if ($1) {
final $8 = state.peek();
// [a-zA-Z]
final $9 = $8 <= 90 ? $8 >= 65 : $8 >= 97 && $8 <= 122;
if ($9) {
state.position += 1;
// (0)
while (true) {
final $10 = state.peek();
// [a-zA-Z0-9]
final $11 = $10 <= 90 ? $10 >= 65 || $10 >= 48 && $10 <= 57 : $10 >= 97 && $10 <= 122;
if ($11) {
state.position += 1;
continue;
}
break;
}
final $12 = state.substring($0, state.position);
return Ok($12);
}
}
return null;
}
In practice, this algorithm works quite quickly.
Although this is not the fastest parsing method in the world, it is simple and understandable.
Grammar
The grammar is declared in sections, similar to the sections used by a preprocessor. Note, however, that no preprocessing is performed: the grammar is processed (parsed) in a single stage.
Three sections are used to declare the grammar:
- Section for declaring directives and global members
- Section for declaring members of instances of the parser class
- Section for declaring grammar rules
Example of a grammar declaration:
%{
import 'foo.dart';
}%
%%
const SimpleParser();
%%
AZ => [A-Za-z]*
The grammar must contain at least one production rule, which means that using a section to declare grammar rules is mandatory. The use of other sections is optional and is determined by the actual needs based on the chosen method of declaring the grammar.
Generating the parser source code
The parser source code is generated using the ParserGenerator class.
An example of generating parser source code.
import 'dart:io';
import 'package:smart_parser/parser_generator.dart';
void main(List<String> args) {
const inputFile = 'lib/src/smart_parser/smart_parser.grammar';
const outputFile = 'lib/src/smart_parser/smart_parser.dart';
final source = File(inputFile).readAsStringSync();
final options = ParserGeneratorOptions(name: 'SmartParser');
final parserGenerator = ParserGenerator(options: options, source: source);
final output = parserGenerator.generate();
File(outputFile).writeAsStringSync(output);
Process.runSync(Platform.executable, ['format', outputFile]);
}
Error handling system
The error handling system is based on the recommendation of Brian Ford, who introduced PEG.
Though there is probably no perfect method of deciding exactly what information is the “most relevant” to an error, a simple heuristic that provides good results in practice is simply to prefer information produced at positions farthest to the right in the input stream.
-- Brian Ford
This means that everything that has been successfully parsed is considered valid input data.
However, this method does not in any way determine how exactly to generate errors.
This method only offers a way to determine the location of the most relevant error.
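The heuristic quoted above can be sketched in a few lines. The FarthestErrors class below is a hypothetical illustration and not the library's error buffer: it keeps only the messages reported at the farthest input position seen so far and discards everything reported to the left of it.

```dart
/// Hypothetical error buffer that keeps only the messages reported at the
/// farthest input position seen so far.
class FarthestErrors {
  int position = -1;
  final List<String> messages = [];

  void report(int at, String message) {
    if (at < position) {
      // Less relevant than what is already recorded: ignore it.
      return;
    }
    if (at > position) {
      // A failure farther to the right replaces everything recorded so far.
      position = at;
      messages.clear();
    }
    messages.add(message);
  }
}

void main() {
  final errors = FarthestErrors();
  errors.report(3, "Expected 'for'");
  errors.report(7, "Expected ')'");
  errors.report(5, "Expected identifier");
  errors.report(7, "Expected ','");
  print(errors.position); // 7
  print(errors.messages); // [Expected ')', Expected ',']
}
```

The sketch decides only which position wins. What message to record at that position is a separate question, which is the subject of the rest of this section.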
The error handling system used in this software classifies parsing failures into two types:
- Failure
- Error
The difference between an error and a failure is that an error is a failure with additional information about the cause of the failure.
This, in turn, means that failures can be detected automatically, while errors must be generated explicitly.
For this purpose, the ability to define error handlers and error generation procedures is provided.
An error handler can be defined at the end of a Sequence expression using the ~{} notation.
⚠ Important information:
To reduce the size of the generated code and increase the performance of error handling, the generator will check for the presence of procedure calls in the handler code.
This is a trivial check for the presence of certain signatures.
Thus, for correct operation it is necessary that calls to processing procedures be implemented directly in the error handler code.
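To show why calls hidden inside a helper function go unnoticed, here is a minimal sketch of such a textual check. It is an assumption about the general idea only: the function findErrorProcedures and the exact list of signatures are hypothetical, and the generator's actual check may look different.

```dart
/// Assumed procedure signatures, taken from the procedures named in this
/// document. The generator's real list is not documented here.
const errorProcedures = [
  'state.error(',
  'state.errorExpected(',
  'state.errorIncorrect(',
  'state.removeRecentErrors(',
];

/// Hypothetical check: returns the signatures that appear literally in
/// [handlerSource]. No Dart parsing is involved.
List<String> findErrorProcedures(String handlerSource) {
  return [
    for (final signature in errorProcedures)
      if (handlerSource.contains(signature)) signature,
  ];
}

void main() {
  // The calls are hidden inside a helper, so a textual check finds nothing.
  print(findErrorProcedures('_handle_errors(state);')); // []
  // Direct calls are visible to the check.
  print(findErrorProcedures("state.errorIncorrect('Unterminated number');"));
  // [state.errorIncorrect(]
}
```

Under this assumption, a handler that delegates to its own function gives the generator no indication of which supporting code to emit around the handler.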
Example of incorrect usage.
Grammar code:
`Expression` Expression =>
Additional
~ { _handle_errors(state); }
Dart code:
/// [Expression] **Expression**
/// ```txt
/// `Expression` Expression =>
/// Additional
/// ~ { _handle_errors(state); }
/// ```
Result<Expression>? parseExpression(State state) {
final $0 = parseAdditional(state);
if ($0 != null) {
return $0;
} else {
_handle_errors(state);
}
return null;
}
Examples of correct usage.
Grammar code:
`String` For =>
$ = <[fF][oO][rR]>
~ { state.errorExpected('FOR'); }
Dart code:
/// [String] **For**
/// ```txt
/// `String` For =>
/// $ = <[fF][oO][rR]>
/// ~ { state.errorExpected('FOR'); }
/// ```
Result<String>? parseFor(State state) {
Result<String>? $0;
$l:
{
final $1 = state.position;
final $2 = state.peek();
// [fF]
final $3 = $2 == 70 || $2 == 102;
if ($3) {
state.position += 1;
final $4 = state.peek();
// [oO]
final $5 = $4 == 79 || $4 == 111;
if ($5) {
state.position += 1;
final $6 = state.peek();
// [rR]
final $7 = $6 == 82 || $6 == 114;
if ($7) {
state.position += 1;
final $8 = state.substring($1, state.position);
$0 = Ok($8);
break $l;
} else {
state.backtrack($1);
}
} else {
state.backtrack($1);
}
}
}
if ($0 != null) {
return $0;
} else {
state.errorExpected('FOR');
}
return null;
}
Grammar code:
`Expression` Expression =>
Additional
~ {
state.removeRecentErrors();
state.errorExpected('expression');
}
Dart code:
/// [Expression] **Expression**
/// ```txt
/// `Expression` Expression =>
/// Additional
/// ~ {
/// state.removeRecentErrors();
/// state.errorExpected('expression');
/// }
/// ```
Result<Expression>? parseExpression(State state) {
final $0 = state.setErrorState();
final $1 = parseAdditional(state);
if ($1 != null) {
state.restoreErrorState($0);
return $1;
} else {
state.removeRecentErrors();
state.errorExpected('expression');
state.restoreErrorState($0);
}
return null;
}
Grammar code:
`num` Number =>
NumberRaw
~ {
state.errorIncorrect('Unterminated number');
state.errorExpected('number');
}
Dart code:
/// [num] **Number**
/// ```txt
/// `num` Number =>
/// NumberRaw
/// ~ {
/// state.errorIncorrect('Unterminated number');
/// state.errorExpected('number');
/// }
/// ```
Result<num>? parseNumber(State state) {
final $0 = state.beginErrorHandling();
final $1 = parseNumberRaw(state);
if ($1 != null) {
state.endErrorHandling($0);
return $1;
} else {
state.errorIncorrect('Unterminated number');
state.errorExpected('number');
state.endErrorHandling($0);
}
return null;
}
The following procedures are available for generating errors:
- state.error()
- state.errorExpected()
- state.errorIncorrect()
The procedure state.removeRecentErrors() is also available. It removes the recent errors generated at the starting position of the Sequence expression being parsed.
Example of state.errorIncorrect() procedure.
Grammar code:
`int` HexValue =>
n = <
@while (4, 4) {
Hex
}
>
$ = { int.parse(n, radix: 16) }
~ {
state.errorIncorrect('Invalid four-digit number', true);
state.errorExpected('hex number');
}
Dart code:
/// [int] **HexValue**
/// ```txt
/// `int` HexValue =>
/// n = <
/// @while (4, 4) {
/// Hex
/// }
/// >
/// $ = { int.parse(n, radix: 16) }
/// ~ {
/// state.errorIncorrect('Invalid four-digit number', true);
/// state.errorExpected('hex number');
/// }
/// ```
Result<int>? parseHexValue(State state) {
final $0 = state.beginErrorHandling();
final $1 = state.position;
var $2 = 0;
// (4, 4)
while ($2 < 4) {
final $3 = parseHex(state);
if ($3 != null) {
$2++;
continue;
}
break;
}
if ($2 >= 4) {
final $4 = state.substring($1, state.position);
final n = $4;
final $5 = int.parse(n, radix: 16);
state.endErrorHandling($0);
return Ok($5);
} else {
state.backtrack($1);
state.errorIncorrect('Invalid four-digit number', true);
state.errorExpected('hex number');
state.endErrorHandling($0);
}
return null;
}
Expressions
The following parsing expressions are supported:
- AnyCharacter
- AndPredicate
- CharacterClass
- Group
- Literal
- NotPredicate
- OneOrMore
- Optional
- OrderedChoice
- Sequence
- ZeroOrMore
Detailed information about these expressions can be obtained from the following sources:
In the rest of this document, the above expressions are described only through examples of generated code or through notes on features specific to this software.
The following additional parsing expressions are supported:
- Action
- Capture
- Predicate
The following parsing meta-expressions are supported:
- @position
- @while
Additional features:
- Semantic values
Expression AnyCharacter
The AnyCharacter expression . is a parsing expression that matches any single character.
The AnyCharacter expression does not add any errors to the error buffer.
Grammar code:
`int` AnyCharacter =>
.
Dart code:
/// [int] **AnyCharacter**
/// ```txt
/// `int` AnyCharacter =>
/// .
/// ```
Result<int>? parseAnyCharacter(State state) {
final $0 = state.peek();
if ($0 >= 0) {
state.position += $0 > 0xffff ? 2 : 1;
return Ok($0);
}
return null;
}
Grammar code:
`void` AnyCharacter =>
.
Dart code:
/// [void] **AnyCharacter**
/// ```txt
/// `void` AnyCharacter =>
/// .
/// ```
Result<void>? parseAnyCharacter(State state) {
final $0 = state.peek();
if ($0 >= 0) {
state.position += $0 > 0xffff ? 2 : 1;
return Result.none;
}
return null;
}
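The statement state.position += $0 > 0xffff ? 2 : 1 in the generated code suggests that positions are counted in UTF-16 code units while peek() returns a whole code point. The sketch below illustrates that assumption with a hypothetical peekCodePoint function; it is not the library's implementation.

```dart
/// Hypothetical sketch of a code point reader over a Dart string.
/// Returns -1 at the end of the input.
int peekCodePoint(String source, int position) {
  if (position >= source.length) {
    return -1;
  }
  final unit = source.codeUnitAt(position);
  // A high surrogate followed by a low surrogate encodes one code point.
  if (unit >= 0xD800 && unit <= 0xDBFF && position + 1 < source.length) {
    final next = source.codeUnitAt(position + 1);
    if (next >= 0xDC00 && next <= 0xDFFF) {
      return 0x10000 + ((unit - 0xD800) << 10) + (next - 0xDC00);
    }
  }
  return unit;
}

void main() {
  // 'a', U+1F600 (stored as a surrogate pair), 'b'.
  const source = 'a\u{1F600}b';
  var position = 0;
  final codePoints = <int>[];
  while (true) {
    final c = peekCodePoint(source, position);
    if (c < 0) {
      break;
    }
    codePoints.add(c);
    // Same advance rule as in the generated code.
    position += c > 0xffff ? 2 : 1;
  }
  print(codePoints); // [97, 128512, 98]
  print(source.length); // 4
}
```

Three characters are matched even though the string holds four code units, which is why the generated code advances by two for code points above 0xffff.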
Example of the expression eof.
Grammar code:
`void` Eof =>
! .
Dart code:
/// [void] **Eof**
/// ```txt
/// `void` Eof =>
/// ! .
/// ```
Result<void>? parseEof(State state) {
final $0 = state.position;
state.predicate++;
var $1 = true;
final $2 = state.peek();
if ($2 >= 0) {
state.position += $2 > 0xffff ? 2 : 1;
$1 = false;
state.backtrack($0);
}
state.predicate--;
if ($1) {
return Result.none;
}
return null;
}
Another example of the expression eof.
Grammar code:
`void` Eof =>
& { state.peek() < 0 }
Dart code:
/// [void] **Eof**
/// ```txt
/// `void` Eof =>
/// & { state.peek() < 0 }
/// ```
Result<void>? parseEof(State state) {
final $0 = state.peek() < 0;
if ($0) {
return Result.none;
}
return null;
}
Expression AndPredicate
The AndPredicate expression & e invokes the sub-expression e, and then succeeds if e succeeds and fails if e fails, but in either case never consumes any input.
Example with single branch.
Grammar code:
`String` AndPredicate =>
$ = <[a-zA-Z]>
& "=>"
Dart code:
/// [String] **AndPredicate**
/// ```txt
/// `String` AndPredicate =>
/// $ = <[a-zA-Z]>
/// & "=>"
/// ```
Result<String>? parseAndPredicate(State state) {
final $0 = state.position;
final $1 = state.peek();
// [a-zA-Z]
final $2 = $1 <= 90 ? $1 >= 65 : $1 >= 97 && $1 <= 122;
if ($2) {
state.position += 1;
final $3 = state.substring($0, state.position);
state.predicate++;
final $4 = state.position;
final $5 = state.peek();
if ($5 == 61 && state.startsWith('=>')) {
state.position += 2;
state.backtrack($4);
state.predicate--;
return Ok($3);
} else {
state.predicate--;
state.backtrack($0);
}
}
return null;
}
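The "never consumes any input" rule can be shown independently of the generated code. The following sketch uses hypothetical names (Cursor, matchLiteral, andPredicate, notPredicate) that are not part of the library; it saves the position, runs the sub-expression, and restores the position regardless of the outcome.

```dart
/// Hypothetical minimal input cursor used only by this sketch.
class Cursor {
  final String source;
  int position = 0;

  Cursor(this.source);
}

/// Consumes [literal] at the current position and reports whether it matched.
bool matchLiteral(Cursor cursor, String literal) {
  if (cursor.source.startsWith(literal, cursor.position)) {
    cursor.position += literal.length;
    return true;
  }
  return false;
}

/// Succeeds if [parse] succeeds; the position is always restored.
bool andPredicate(Cursor cursor, bool Function(Cursor) parse) {
  final start = cursor.position;
  final matched = parse(cursor);
  cursor.position = start;
  return matched;
}

/// Succeeds if [parse] fails; the position is always restored.
bool notPredicate(Cursor cursor, bool Function(Cursor) parse) {
  return !andPredicate(cursor, parse);
}

void main() {
  final cursor = Cursor('x=>1');
  cursor.position = 1;
  print(andPredicate(cursor, (c) => matchLiteral(c, '=>'))); // true
  print(cursor.position); // 1
  print(notPredicate(cursor, (c) => matchLiteral(c, '=>'))); // false
  print(cursor.position); // 1
}
```

The generated code above follows the same pattern with state.backtrack($4), and additionally restores the start of the whole sequence when the predicate fails.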
Expression CharacterClass
The CharacterClass expression [] is a parsing expression that matches a single character from the specified set of characters and ranges.
The CharacterClass expression does not add any errors to the error buffer.
The following forms of character specifiers are supported:
- Single character in natural form, e.g. [a]
- Multiple character ranges in natural form, e.g. [a-z], [0-9]
- Single character in hexadecimal form, e.g. [{20}], [\u{20}]
- Multiple character ranges in hexadecimal form, e.g. [{30-39}], [\u{30}-\u{39}]
- C-escape sequences, e.g. [\n\r\t]
- Escaping special characters: \, ^, -, [, ], {, }, e.g. [\^], [\]]
- Matching characters with negation in all available forms, e.g. [^a-z], [^{30-39}]
Examples of single character.
Grammar code:
`int` A =>
[a]
Dart code:
/// [int] **A**
/// ```txt
/// `int` A =>
/// [a]
/// ```
Result<int>? parseA(State state) {
final $0 = state.peek();
// [a]
if ($0 == 97) {
state.position += 1;
return const Ok(97);
}
return null;
}
Grammar code:
`void` A =>
[a]
Dart code:
/// [void] **A**
/// ```txt
/// `void` A =>
/// [a]
/// ```
Result<void>? parseA(State state) {
final $0 = state.peek();
// [a]
if ($0 == 97) {
state.position += 1;
return Result.none;
}
return null;
}
Example of character range.
Grammar code:
`int` Digits =>
[0-9]
Dart code:
/// [int] **Digits**
/// ```txt
/// `int` Digits =>
/// [0-9]
/// ```
Result<int>? parseDigits(State state) {
final $0 = state.peek();
// [0-9]
final $1 = $0 >= 48 && $0 <= 57;
if ($1) {
state.position += 1;
return Ok($0);
}
return null;
}
Example of negated character range.
Grammar code:
`int` NotDigits =>
[^0-9]
Dart code:
/// [int] **NotDigits**
/// ```txt
/// `int` NotDigits =>
/// [^0-9]
/// ```
Result<int>? parseNotDigits(State state) {
final $0 = state.peek();
// [^0-9]
final $1 = !($0 >= 48 && $0 <= 57) && !($0 < 0);
if ($1) {
state.position += $0 > 0xffff ? 2 : 1;
return Ok($0);
}
return null;
}
Example of negated character ranges.
Grammar code:
`int` NotDigitsNotLetters =>
[^0-9a-zA-Z]
Dart code:
/// [int] **NotDigitsNotLetters**
/// ```txt
/// `int` NotDigitsNotLetters =>
/// [^0-9a-zA-Z]
/// ```
Result<int>? parseNotDigitsNotLetters(State state) {
final $0 = state.peek();
// [^0-9a-zA-Z]
final $1 = !($0 <= 90 ? $0 >= 65 || $0 >= 48 && $0 <= 57 : $0 >= 97 && $0 <= 122) && !($0 < 0);
if ($1) {
state.position += $0 > 0xffff ? 2 : 1;
return Ok($0);
}
return null;
}
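The range test emitted for several ranges is more compact than a range-by-range comparison: it appears to split the ranges around one boundary value (90 in this case) with a conditional expression. The sketch below copies the test for [a-zA-Z0-9] from the generated code and compares it with a straightforward formulation; the function names are chosen here for illustration.

```dart
/// The test emitted for [a-zA-Z0-9] in the generated code above.
bool generatedTest(int c) =>
    c <= 90 ? c >= 65 || c >= 48 && c <= 57 : c >= 97 && c <= 122;

/// A direct, range-by-range formulation of the same character class.
bool naiveTest(int c) =>
    (c >= 48 && c <= 57) || (c >= 65 && c <= 90) || (c >= 97 && c <= 122);

void main() {
  var mismatches = 0;
  // -1 stands for the end of input, as in the generated code.
  for (var c = -1; c <= 0x10FFFF; c++) {
    if (generatedTest(c) != naiveTest(c)) {
      mismatches++;
    }
  }
  print(mismatches); // 0
}
```

Both functions accept exactly the same code points, so the compact form changes only the number of comparisons performed, not the set of characters matched.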
Example of hexadecimal value.
Grammar code:
`int` Space =>
[{20}]
Dart code:
/// [int] **Space**
/// ```txt
/// `int` Space =>
/// [{20}]
/// ```
Result<int>? parseSpace(State state) {
final $0 = state.peek();
// [ ]
if ($0 == 32) {
state.position += 1;
return const Ok(32);
}
return null;
}
Example of hexadecimal range.
Grammar code:
`int` Digits =>
[{30-39}]
Dart code:
/// [int] **Digits**
/// ```txt
/// `int` Digits =>
/// [{30-39}]
/// ```
Result<int>? parseDigits(State state) {
final $0 = state.peek();
// [0-9]
final $1 = $0 >= 48 && $0 <= 57;
if ($1) {
state.position += 1;
return Ok($0);
}
return null;
}
Example of Unicode code point.
Grammar code:
`int` Space =>
[\u{20}]
Dart code:
/// [int] **Space**
/// ```txt
/// `int` Space =>
/// [\u{20}]
/// ```
Result<int>? parseSpace(State state) {
final $0 = state.peek();
// [ ]
if ($0 == 32) {
state.position += 1;
return const Ok(32);
}
return null;
}
Examples of escaping special characters.
Grammar code:
`int` Space =>
[\^]
Dart code:
/// [int] **Space**
/// ```txt
/// `int` Space =>
/// [\^]
/// ```
Result<int>? parseSpace(State state) {
final $0 = state.peek();
// [\^]
if ($0 == 94) {
state.position += 1;
return const Ok(94);
}
return null;
}
Grammar code:
`int` Space =>
[\{]
Dart code:
/// [int] **Space**
/// ```txt
/// `int` Space =>
/// [\{]
/// ```
Result<int>? parseSpace(State state) {
final $0 = state.peek();
// [\{]
if ($0 == 123) {
state.position += 1;
return const Ok(123);
}
return null;
}
Expression Group
The Group expression groups expressions into a single expression.
⚠ Important information:
For performance reasons, the Group expression does not create a separate naming scope.
Thus, conflicts of names of semantic values are possible.
To avoid duplicate name conflicts, it is necessary to use different semantic value identifiers within a scope.
Example of a Group expression at the end of a Sequence expression.
Grammar code:
`int` AB =>
[a]
$ = ([b] / [c])
Dart code:
/// [int] **AB**
/// ```txt
/// `int` AB =>
/// [a]
/// $ = ([b] / [c])
/// ```
Result<int>? parseAB(State state) {
final $0 = state.position;
final $1 = state.peek();
// [a]
if ($1 == 97) {
state.position += 1;
final $2 = state.peek();
// [b]
if ($2 == 98) {
state.position += 1;
return const Ok(98);
}
// [c]
if ($2 == 99) {
state.position += 1;
return const Ok(99);
}
state.backtrack($0);
}
return null;
}
Example of a Group expression not at the end of a Sequence expression.
Grammar code:
`int` AB =>
$ = ([b] / [c])
[a]
Dart code:
/// [int] **AB**
/// ```txt
/// `int` AB =>
/// $ = ([b] / [c])
/// [a]
/// ```
Result<int>? parseAB(State state) {
final $0 = state.position;
Result<int>? $1;
$l:
{
final $2 = state.peek();
// [b]
if ($2 == 98) {
state.position += 1;
$1 = const Ok(98);
break $l;
}
// [c]
if ($2 == 99) {
state.position += 1;
$1 = const Ok(99);
break $l;
}
}
if ($1 != null) {
final $3 = state.peek();
// [a]
if ($3 == 97) {
state.position += 1;
return $1;
} else {
state.backtrack($0);
}
}
return null;
}
Expression Literal
The Literal expression is a parsing expression that matches a string.
The Literal expression can be specified in both normal and extended forms.
In its normal form, a Literal expression is written in double quotes (""); in its extended form, it is written in single quotes ('').
The difference between the normal form and the extended form is that when using the extended form, an expected error is added to the error buffer if parsing fails.
Examples of normal form.
Grammar code:
`String` For =>
"for"
Dart code:
/// [String] **For**
/// ```txt
/// `String` For =>
/// "for"
/// ```
Result<String>? parseFor(State state) {
final $0 = state.peek();
if ($0 == 102 && state.startsWith('for')) {
state.position += 3;
return const Ok('for');
}
return null;
}
Grammar code:
`void` For =>
"for"
Dart code:
/// [void] **For**
/// ```txt
/// `void` For =>
/// "for"
/// ```
Result<void>? parseFor(State state) {
final $0 = state.peek();
if ($0 == 102 && state.startsWith('for')) {
state.position += 3;
return Result.none;
}
return null;
}
Examples of extended form.
Grammar code:
`String` For =>
'for'
Dart code:
/// [String] **For**
/// ```txt
/// `String` For =>
/// 'for'
/// ```
Result<String>? parseFor(State state) {
final $0 = state.peek();
if ($0 == 102 && state.startsWith('for')) {
state.position += 3;
return const Ok('for');
} else {
state.errorExpected('for');
}
return null;
}
Grammar code:
`void` For =>
'for'
Dart code:
/// [void] **For**
/// ```txt
/// `void` For =>
/// 'for'
/// ```
Result<void>? parseFor(State state) {
final $0 = state.peek();
if ($0 == 102 && state.startsWith('for')) {
state.position += 3;
return Result.none;
} else {
state.errorExpected('for');
}
return null;
}
The extended form is very similar to the following expression, but the two are not the same.
Grammar code:
`String` For =>
"for"
~ { state.errorExpected('foo'); }
Dart code:
/// [String] **For**
/// ```txt
/// `String` For =>
/// "for"
/// ~ { state.errorExpected('foo'); }
/// ```
Result<String>? parseFor(State state) {
final $0 = state.peek();
if ($0 == 102 && state.startsWith('for')) {
state.position += 3;
return const Ok('for');
} else {
state.errorExpected('foo');
}
return null;
}
Example of parsing an empty string.
Grammar code:
`void` EmptyString =>
""
Dart code:
/// [void] **EmptyString**
/// ```txt
/// `void` EmptyString =>
/// ""
/// ```
Result<void> parseEmptyString(State state) {
return Result.none;
}
Expression NotPredicate
The NotPredicate expression ! e invokes the sub-expression e, and then succeeds if e fails and fails if e succeeds, but in either case never consumes any input.
Example with a child expression with single branch.
Grammar code:
`void` NotPredicate =>
! [a]
Dart code:
/// [void] **NotPredicate**
/// ```txt
/// `void` NotPredicate =>
/// ! [a]
/// ```
Result<void>? parseNotPredicate(State state) {
final $0 = state.position;
state.predicate++;
var $1 = true;
final $2 = state.peek();
// [a]
if ($2 == 97) {
state.position += 1;
$1 = false;
state.backtrack($0);
}
state.predicate--;
if ($1) {
return Result.none;
}
return null;
}
Example with a child expression with multiple branches.
Grammar code:
`void` NotPredicate =>
! ([a] / [b])
Dart code:
/// [void] **NotPredicate**
/// ```txt
/// `void` NotPredicate =>
/// ! ([a] / [b])
/// ```
Result<void>? parseNotPredicate(State state) {
final $0 = state.position;
state.predicate++;
var $1 = true;
$l:
{
final $2 = state.peek();
// [a]
if ($2 == 97) {
state.position += 1;
$1 = false;
state.backtrack($0);
break $l;
}
// [b]
if ($2 == 98) {
state.position += 1;
$1 = false;
state.backtrack($0);
break $l;
}
}
state.predicate--;
if ($1) {
return Result.none;
}
return null;
}
Expression OneOrMore
The OneOrMore expression e+ matches a sequence of one or more repetitions of a sub-expression e.
Examples with a child expression with single branch.
Grammar code:
`List<int>` OneOrMore =>
[a]+
Dart code:
/// [List<int>] **OneOrMore**
/// ```txt
/// `List<int>` OneOrMore =>
/// [a]+
/// ```
Result<List<int>>? parseOneOrMore(State state) {
final $0 = <int>[];
// (1)
while (true) {
final $1 = state.peek();
// [a]
if ($1 == 97) {
state.position += 1;
$0.add(97);
continue;
}
break;
}
if ($0.isNotEmpty) {
return Ok($0);
}
return null;
}
Grammar code:
`void` OneOrMore =>
[a]+
Dart code:
/// [void] **OneOrMore**
/// ```txt
/// `void` OneOrMore =>
/// [a]+
/// ```
Result<void>? parseOneOrMore(State state) {
var $0 = false;
// (1)
while (true) {
final $1 = state.peek();
// [a]
if ($1 == 97) {
state.position += 1;
$0 = true;
continue;
}
break;
}
if ($0) {
return Result.none;
}
return null;
}
Examples with a child expression with multiple branches.
Grammar code:
`List<int>` OneOrMore =>
([a] / [b])+
Dart code:
/// [List<int>] **OneOrMore**
/// ```txt
/// `List<int>` OneOrMore =>
/// ([a] / [b])+
/// ```
Result<List<int>>? parseOneOrMore(State state) {
final $0 = <int>[];
// (1)
while (true) {
final $1 = state.peek();
// [a]
if ($1 == 97) {
state.position += 1;
$0.add(97);
continue;
}
// [b]
if ($1 == 98) {
state.position += 1;
$0.add(98);
continue;
}
break;
}
if ($0.isNotEmpty) {
return Ok($0);
}
return null;
}
Grammar code:
`void` OneOrMore =>
([a] / [b])+
Dart code:
/// [void] **OneOrMore**
/// ```txt
/// `void` OneOrMore =>
/// ([a] / [b])+
/// ```
Result<void>? parseOneOrMore(State state) {
var $0 = false;
// (1)
while (true) {
final $1 = state.peek();
// [a]
if ($1 == 97) {
state.position += 1;
$0 = true;
continue;
}
// [b]
if ($1 == 98) {
state.position += 1;
$0 = true;
continue;
}
break;
}
if ($0) {
return Result.none;
}
return null;
}
Expression Optional
The Optional expression e? tries to match the expression e and always succeeds, with or without a result.
Example with single branch.
Grammar code:
`int?` Optional =>
[a]?
Dart code:
/// [int?] **Optional**
/// ```txt
/// `int?` Optional =>
/// [a]?
/// ```
Result<int?> parseOptional(State state) {
int? $0;
final $1 = state.peek();
// [a]
if ($1 == 97) {
state.position += 1;
$0 = 97;
}
return Ok($0);
}
Grammar code:
`void` Optional =>
[a]?
Dart code:
/// [void] **Optional**
/// ```txt
/// `void` Optional =>
/// [a]?
/// ```
Result<void> parseOptional(State state) {
final $0 = state.peek();
// [a]
if ($0 == 97) {
state.position += 1;
}
return Result.none;
}
Example with multiple branches.
Grammar code:
`int?` Optional =>
([a] / [b])?
Dart code:
/// [int?] **Optional**
/// ```txt
/// `int?` Optional =>
/// ([a] / [b])?
/// ```
Result<int?> parseOptional(State state) {
int? $0;
$l:
{
final $1 = state.peek();
// [a]
if ($1 == 97) {
state.position += 1;
$0 = 97;
break $l;
}
// [b]
if ($1 == 98) {
state.position += 1;
$0 = 98;
break $l;
}
}
return Ok($0);
}
Grammar code:
`void` Optional =>
([a] / [b])?
Dart code:
/// [void] **Optional**
/// ```txt
/// `void` Optional =>
/// ([a] / [b])?
/// ```
Result<void> parseOptional(State state) {
$l:
{
final $0 = state.peek();
// [a]
if ($0 == 97) {
state.position += 1;
break $l;
}
// [b]
if ($0 == 98) {
state.position += 1;
break $l;
}
}
return Result.none;
}
Example with Production expression.
Grammar code:
`int?` Optional =>
P?
Dart code:
/// [int?] **Optional**
/// ```txt
/// `int?` Optional =>
/// P?
/// ```
Result<int?> parseOptional(State state) {
final $0 = parseP(state);
return $0;
}
Grammar code:
`int?` Optional =>
p = P?
$ = { p ?? 41 }
Dart code:
/// [int?] **Optional**
/// ```txt
/// `int?` Optional =>
/// p = P?
/// $ = { p ?? 41 }
/// ```
Result<int?> parseOptional(State state) {
final $0 = parseP(state);
final p = $0?.$1;
final $1 = p ?? 41;
return Ok($1);
}
Grammar code:
`void` Optional =>
P?
Dart code:
/// [void] **Optional**
/// ```txt
/// `void` Optional =>
/// P?
/// ```
Result<void> parseOptional(State state) {
parseP(state);
return Result.none;
}
Expression OrderedChoice
The OrderedChoice expression has the following syntax.
e1 / e2
Where e1 and e2 are alternative expressions.
If the first alternative successfully parses the input, it is accepted. If it fails, the parser then attempts the next alternative, and so on, until a match is found or all alternatives have been exhausted.
Examples of the OrderedChoice expression.
Grammar code:
`int` AOrB =>
[a]
/
[b]
Dart code:
/// [int] **AOrB**
/// ```txt
/// `int` AOrB =>
/// [a]
/// /
/// [b]
/// ```
Result<int>? parseAOrB(State state) {
final $0 = state.peek();
// [a]
if ($0 == 97) {
state.position += 1;
return const Ok(97);
}
// [b]
if ($0 == 98) {
state.position += 1;
return const Ok(98);
}
return null;
}
Grammar code:
`void` AOrB =>
[a]
/
[b]
Dart code:
/// [void] **AOrB**
/// ```txt
/// `void` AOrB =>
/// [a]
/// /
/// [b]
/// ```
Result<void>? parseAOrB(State state) {
final $0 = state.peek();
// [a]
if ($0 == 97) {
state.position += 1;
return Result.none;
}
// [b]
if ($0 == 98) {
state.position += 1;
return Result.none;
}
return null;
}
Examples using alternative syntax.
Grammar code:
`int` AOrB =>
[a]
----
[b]
Dart code:
/// [int] **AOrB**
/// ```txt
/// `int` AOrB =>
/// [a]
/// ----
/// [b]
/// ```
Result<int>? parseAOrB(State state) {
final $0 = state.peek();
// [a]
if ($0 == 97) {
state.position += 1;
return const Ok(97);
}
// [b]
if ($0 == 98) {
state.position += 1;
return const Ok(98);
}
return null;
}
Grammar code:
`void` AOrB =>
[a]
----
[b]
Dart code:
/// [void] **AOrB**
/// ```txt
/// `void` AOrB =>
/// [a]
/// ----
/// [b]
/// ```
Result<void>? parseAOrB(State state) {
final $0 = state.peek();
// [a]
if ($0 == 97) {
state.position += 1;
return Result.none;
}
// [b]
if ($0 == 98) {
state.position += 1;
return Result.none;
}
return null;
}
Expression Sequence
The Sequence expression e1 e2 first invokes e1, and if e1 succeeds, subsequently invokes e2 on the remainder of the input data left unconsumed by e1, and returns the result. If either e1 or e2 fails, then the sequence expression e1 e2 fails (consuming no input).
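The "consuming no input" rule can be tried out with a small, self-contained sketch. The State class below is a hypothetical stand-in that only mimics the members used here (position, peek, backtrack); it is not the library's own class, and the assumption that peek returns 0 at the end of input is made only for this illustration.

```dart
// Hypothetical minimal stand-in for the library's State class.
class State {
  final String input;
  int position = 0;

  State(this.input);

  // Returns the code unit at the current position, or 0 at end of input
  // (an assumption made for this sketch).
  int peek() => position < input.length ? input.codeUnitAt(position) : 0;

  // Restores a previously saved position.
  void backtrack(int savedPosition) => position = savedPosition;
}

// Hand-written equivalent of the sequence `[a] [b]`; returns true on success.
bool parseAB(State state) {
  final start = state.position;
  if (state.peek() == 97) {
    state.position += 1;
    if (state.peek() == 98) {
      state.position += 1;
      return true;
    }
    // The second element failed, so the consumed 'a' is given back.
    state.backtrack(start);
  }
  return false;
}

void main() {
  final ok = State('ab');
  print(parseAB(ok)); // true
  print(ok.position); // 2

  final bad = State('ac');
  print(parseAB(bad)); // false
  print(bad.position); // 0
}
```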
Examples of the Sequence expression.
Grammar code:
`(int, int)` AB =>
a = [a]
b = [b]
$ = `const` { (a, b) }
Dart code:
/// [(int, int)] **AB**
/// ```txt
/// `(int, int)` AB =>
/// a = [a]
/// b = [b]
/// $ = `const` { (a, b) }
/// ```
Result<(int, int)>? parseAB(State state) {
final $0 = state.position;
final $1 = state.peek();
// [a]
if ($1 == 97) {
state.position += 1;
const a = 97;
final $2 = state.peek();
// [b]
if ($2 == 98) {
state.position += 1;
const b = 98;
const $3 = (a, b);
return const Ok($3);
} else {
state.backtrack($0);
}
}
return null;
}
Grammar code:
`void` AB =>
[a]
[b]
Dart code:
/// [void] **AB**
/// ```txt
/// `void` AB =>
/// [a]
/// [b]
/// ```
Result<void>? parseAB(State state) {
final $0 = state.position;
final $1 = state.peek();
// [a]
if ($1 == 97) {
state.position += 1;
final $2 = state.peek();
// [b]
if ($2 == 98) {
state.position += 1;
return Result.none;
} else {
state.backtrack($0);
}
}
return null;
}
Expression ZeroOrMore
The ZeroOrMore expression e* matches a sequence of zero or more repetitions of a sub-expression e.
Examples with a single-branch child expression.
Grammar code:
`List<int>` OneOrMore =>
[a]*
Dart code:
/// [List<int>] **OneOrMore**
/// ```txt
/// `List<int>` OneOrMore =>
/// [a]*
/// ```
Result<List<int>> parseOneOrMore(State state) {
final $0 = <int>[];
// (0)
while (true) {
final $1 = state.peek();
// [a]
if ($1 == 97) {
state.position += 1;
$0.add(97);
continue;
}
break;
}
return Ok($0);
}
Grammar code:
`void` OneOrMore =>
[a]*
Dart code:
/// [void] **OneOrMore**
/// ```txt
/// `void` OneOrMore =>
/// [a]*
/// ```
Result<void> parseOneOrMore(State state) {
// (0)
while (true) {
final $0 = state.peek();
// [a]
if ($0 == 97) {
state.position += 1;
continue;
}
break;
}
return Result.none;
}
Examples with a multi-branch child expression.
Grammar code:
`List<int>` OneOrMore =>
([a] / [b])*
Dart code:
/// [List<int>] **OneOrMore**
/// ```txt
/// `List<int>` OneOrMore =>
/// ([a] / [b])*
/// ```
Result<List<int>> parseOneOrMore(State state) {
final $0 = <int>[];
// (0)
while (true) {
final $1 = state.peek();
// [a]
if ($1 == 97) {
state.position += 1;
$0.add(97);
continue;
}
// [b]
if ($1 == 98) {
state.position += 1;
$0.add(98);
continue;
}
break;
}
return Ok($0);
}
Grammar code:
`void` OneOrMore =>
([a] / [b])*
Dart code:
/// [void] **OneOrMore**
/// ```txt
/// `void` OneOrMore =>
/// ([a] / [b])*
/// ```
Result<void> parseOneOrMore(State state) {
// (0)
while (true) {
final $0 = state.peek();
// [a]
if ($0 == 97) {
state.position += 1;
continue;
}
// [b]
if ($0 == 98) {
state.position += 1;
continue;
}
break;
}
return Result.none;
}
Expression Action
The Action expression { } is a piece of code that always succeeds, with or without a result, depending on how it is used.
The following usage methods are supported:
- List of statements
- Expression
- Sub-expression
⚠ Important information:
- Changing the parsing position in the action code is prohibited. This will cause the parser to malfunction. For this case, there is a meta expression @position
- The source code must have balanced pairs of { and } characters. Unbalanced characters should be presented in a different form, e.g. '\u007B'
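As a small illustration of the second point, the Unicode escapes below produce the brace characters without writing a literal brace inside a string. This is plain Dart and only demonstrates the escape form; the variable names are chosen for this sketch.

```dart
void main() {
  // '\u007B' is the left brace and '\u007D' is the right brace.
  const open = '\u007B';
  const close = '\u007D';

  print(open.codeUnitAt(0)); // 123
  print(close.codeUnitAt(0)); // 125
  print('$open$close'.length); // 2
}
```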
Example with a list of statements.
Grammar code:
`List<int>` Action =>
{
final list = <int>[];
list.add(41);
}
$ = { list }
Dart code:
/// [List<int>] **Action**
/// ```txt
/// `List<int>` Action =>
/// {
/// final list = <int>[];
/// list.add(41);
/// }
/// $ = { list }
/// ```
Result<List<int>> parseAction(State state) {
final list = <int>[];
list.add(41);
final $0 = list;
return Ok($0);
}
In this example, the first action has no semantic value assigned, so its code is treated as a list of statements.
The second action is assigned to a semantic value, so its code is treated as an expression.
In general, an action that is assigned to a semantic value defines an expression; otherwise, it defines statements.
In all cases, the Action expression completes parsing successfully.
Example with sub-expression.
Grammar code:
`void` Action =>
& { some_expression }
Dart code:
/// [void] **Action**
/// ```txt
/// `void` Action =>
/// & { some_expression }
/// ```
Result<void>? parseAction(State state) {
final $0 = some_expression;
if ($0) {
return Result.none;
}
return null;
}
Expression Capture
The Capture expression <e> invokes the expression e and succeeds if e succeeds; otherwise it fails. On success, it returns the substring of the input data that spans from the position where e started to the position where e ended.
Examples of the Capture expression.
Grammar code:
`String` Digits =>
<[0-9]+>
Dart code:
/// [String] **Digits**
/// ```txt
/// `String` Digits =>
/// <[0-9]+>
/// ```
Result<String>? parseDigits(State state) {
final $0 = state.position;
var $1 = false;
// (1)
while (true) {
final $2 = state.peek();
// [0-9]
final $3 = $2 >= 48 && $2 <= 57;
if ($3) {
state.position += 1;
$1 = true;
continue;
}
break;
}
if ($1) {
final $4 = state.substring($0, state.position);
return Ok($4);
}
return null;
}
Grammar code:
`void` SkipDigits =>
<[0-9]+>
Dart code:
/// [void] **SkipDigits**
/// ```txt
/// `void` SkipDigits =>
/// <[0-9]+>
/// ```
Result<void>? parseSkipDigits(State state) {
var $0 = false;
// (1)
while (true) {
final $1 = state.peek();
// [0-9]
final $2 = $1 >= 48 && $1 <= 57;
if ($2) {
state.position += 1;
$0 = true;
continue;
}
break;
}
if ($0) {
return Result.none;
}
return null;
}
Expression Predicate
The Predicate expression & { } invokes the action { }, and then succeeds if the action code evaluates to true, and fails otherwise, without consuming any input.
The Predicate expression ! { } invokes the action { }, and then succeeds if the action code evaluates to false, and fails otherwise, without consuming any input.
Example of positive predicate.
Grammar code:
`void` Action =>
& { some_expression }
Dart code:
/// [void] **Action**
/// ```txt
/// `void` Action =>
/// & { some_expression }
/// ```
Result<void>? parseAction(State state) {
final $0 = some_expression;
if ($0) {
return Result.none;
}
return null;
}
Example of negative predicate.
Grammar code:
`void` Action =>
! { some_expression }
Dart code:
/// [void] **Action**
/// ```txt
/// `void` Action =>
/// ! { some_expression }
/// ```
Result<void>? parseAction(State state) {
final $0 = some_expression;
if (!$0) {
return Result.none;
}
return null;
}
Meta expression @position
The Position meta expression @position(n) changes the parsing position to n, then succeeds and does not return any value.
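The typical use of this meta expression is to jump ahead to a position found by a fast search instead of advancing character by character. The idea can be sketched in plain Dart as follows; the function skipToEndTag is hypothetical and works on a String rather than on the library's State.

```dart
// Hypothetical helper, not part of the library.
// Searches for '-->' starting at [position] and returns the index just
// past it, or null when the marker is not found.
int? skipToEndTag(String input, int position) {
  final index = input.indexOf('-->', position);
  if (index == -1) {
    return null;
  }
  return index + 3;
}

void main() {
  print(skipToEndTag('<!-- note --> rest', 0)); // 13
  print(skipToEndTag('<!-- unterminated', 0)); // null
}
```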
Example of input data scanning.
Grammar code:
`String` EndTag =>
{ final index = state.indexOf('-->'); }
@position({ index != -1 ? index : state.length })
$ = '-->'
Dart code:
/// [String] **EndTag**
/// ```txt
/// `String` EndTag =>
/// { final index = state.indexOf('-->'); }
/// @position({ index != -1 ? index : state.length })
/// $ = '-->'
/// ```
Result<String>? parseEndTag(State state) {
final $0 = state.position;
final index = state.indexOf('-->');
state.position = index != -1 ? index : state.length;
final $1 = state.peek();
if ($1 == 45 && state.startsWith('-->')) {
state.position += 3;
return const Ok('-->');
} else {
state.errorExpected('-->');
state.backtrack($0);
}
return null;
}
Meta expression @while
The @while meta expression is a repetition expression and works similarly to the while statement.
A slight difference is that this expression takes two positional parameters, m and n.
The first parameter m is required and specifies the minimum number of repetitions.
The second parameter n is optional and specifies the maximum number of repetitions.
If the n parameter is not specified, the number of repetitions is unlimited.
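The meaning of the two parameters can be sketched as a plain Dart function. This is only an illustration of the bounds described above, not the code that the generator emits; the names repeat and isLetter are hypothetical.

```dart
// Hypothetical helper, not part of the library.
// Consumes code units of [input] from [start] while [test] accepts them,
// stopping after [max] repetitions when [max] is given.
// Returns the new position, or null if fewer than [min] repetitions matched.
int? repeat(String input, int start, bool Function(int) test, int min,
    [int? max]) {
  var position = start;
  var count = 0;
  while (max == null || count < max) {
    if (position < input.length && test(input.codeUnitAt(position))) {
      position++;
      count++;
      continue;
    }
    break;
  }
  return count >= min ? position : null;
}

// Accepts the ASCII letters A-Z and a-z.
bool isLetter(int c) => (c >= 65 && c <= 90) || (c >= 97 && c <= 122);

void main() {
  print(repeat('abc1', 0, isLetter, 0)); // 3
  print(repeat('1', 0, isLetter, 0)); // 0
  print(repeat('1', 0, isLetter, 1)); // null
  print(repeat('abcd', 0, isLetter, 2, 3)); // 3
  print(repeat('a1', 0, isLetter, 2, 3)); // null
}
```

When the minimum is 0 the function can never return null, which corresponds to the non-nullable result type in the generated examples with a minimum of 0.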
Examples of repetitions from 0 and no limit on the maximum number of repetitions.
Grammar code:
`List<int>` Letters =>
@while (0) {
[a-zA-Z]
}
Dart code:
/// [List<int>] **Letters**
/// ```txt
/// `List<int>` Letters =>
/// @while (0) {
/// [a-zA-Z]
/// }
/// ```
Result<List<int>> parseLetters(State state) {
final $0 = <int>[];
// (0)
while (true) {
final $1 = state.peek();
// [a-zA-Z]
final $2 = $1 <= 90 ? $1 >= 65 : $1 >= 97 && $1 <= 122;
if ($2) {
state.position += 1;
$0.add($1);
continue;
}
break;
}
return Ok($0);
}
Grammar code:
`void` Letters =>
@while (0) {
[a-zA-Z]
}
Dart code:
/// [void] **Letters**
/// ```txt
/// `void` Letters =>
/// @while (0) {
/// [a-zA-Z]
/// }
/// ```
Result<void> parseLetters(State state) {
// (0)
while (true) {
final $0 = state.peek();
// [a-zA-Z]
final $1 = $0 <= 90 ? $0 >= 65 : $0 >= 97 && $0 <= 122;
if ($1) {
state.position += 1;
continue;
}
break;
}
return Result.none;
}
Examples of repetitions of at least 1 and no limit on the maximum number of repetitions.
Grammar code:
`List<int>` Letters =>
@while (1) {
[a-zA-Z]
}
Dart code:
/// [List<int>] **Letters**
/// ```txt
/// `List<int>` Letters =>
/// @while (1) {
/// [a-zA-Z]
/// }
/// ```
Result<List<int>>? parseLetters(State state) {
final $0 = <int>[];
// (1)
while (true) {
final $1 = state.peek();
// [a-zA-Z]
final $2 = $1 <= 90 ? $1 >= 65 : $1 >= 97 && $1 <= 122;
if ($2) {
state.position += 1;
$0.add($1);
continue;
}
break;
}
if ($0.isNotEmpty) {
return Ok($0);
}
return null;
}
Grammar code:
`void` Letters =>
@while (1) {
[a-zA-Z]
}
Dart code:
/// [void] **Letters**
/// ```txt
/// `void` Letters =>
/// @while (1) {
/// [a-zA-Z]
/// }
/// ```
Result<void>? parseLetters(State state) {
var $0 = false;
// (1)
while (true) {
final $1 = state.peek();
// [a-zA-Z]
final $2 = $1 <= 90 ? $1 >= 65 : $1 >= 97 && $1 <= 122;
if ($2) {
state.position += 1;
$0 = true;
continue;
}
break;
}
if ($0) {
return Result.none;
}
return null;
}
Examples of at least 2 and at most 3 repetitions.
Grammar code:
`List<int>` Letters =>
@while (2, 3) {
[a-zA-Z]
}
Dart code:
/// [List<int>] **Letters**
/// ```txt
/// `List<int>` Letters =>
/// @while (2, 3) {
/// [a-zA-Z]
/// }
/// ```
Result<List<int>>? parseLetters(State state) {
final $0 = state.position;
final $1 = <int>[];
// (2, 3)
while ($1.length < 3) {
final $2 = state.peek();
// [a-zA-Z]
final $3 = $2 <= 90 ? $2 >= 65 : $2 >= 97 && $2 <= 122;
if ($3) {
state.position += 1;
$1.add($2);
continue;
}
break;
}
if ($1.length >= 2) {
return Ok($1);
} else {
state.backtrack($0);
}
return null;
}
Grammar code:
`void` Letters =>
@while (2, 3) {
[a-zA-Z]
}
Dart code:
/// [void] **Letters**
/// ```txt
/// `void` Letters =>
/// @while (2, 3) {
/// [a-zA-Z]
/// }
/// ```
Result<void>? parseLetters(State state) {
final $0 = state.position;
var $1 = 0;
// (2, 3)
while ($1 < 3) {
final $2 = state.peek();
// [a-zA-Z]
final $3 = $2 <= 90 ? $2 >= 65 : $2 >= 97 && $2 <= 122;
if ($3) {
state.position += 1;
$1++;
continue;
}
break;
}
if ($1 >= 2) {
return Result.none;
} else {
state.backtrack($0);
}
return null;
}
Examples of exactly 4 repetitions.
Grammar code:
`List<int>` Letters =>
@while (4, 4) {
[a-zA-Z]
}
Dart code:
/// [List<int>] **Letters**
/// ```txt
/// `List<int>` Letters =>
/// @while (4, 4) {
/// [a-zA-Z]
/// }
/// ```
Result<List<int>>? parseLetters(State state) {
final $0 = state.position;
final $1 = <int>[];
// (4, 4)
while ($1.length < 4) {
final $2 = state.peek();
// [a-zA-Z]
final $3 = $2 <= 90 ? $2 >= 65 : $2 >= 97 && $2 <= 122;
if ($3) {
state.position += 1;
$1.add($2);
continue;
}
break;
}
if ($1.length >= 4) {
return Ok($1);
} else {
state.backtrack($0);
}
return null;
}
Grammar code:
`void` Letters =>
@while (4, 4) {
[a-zA-Z]
}
Dart code:
/// [void] **Letters**
/// ```txt
/// `void` Letters =>
/// @while (4, 4) {
/// [a-zA-Z]
/// }
/// ```
Result<void>? parseLetters(State state) {
final $0 = state.position;
var $1 = 0;
// (4, 4)
while ($1 < 4) {
final $2 = state.peek();
// [a-zA-Z]
final $3 = $2 <= 90 ? $2 >= 65 : $2 >= 97 && $2 <= 122;
if ($3) {
state.position += 1;
$1++;
continue;
}
break;
}
if ($1 >= 4) {
return Result.none;
} else {
state.backtrack($0);
}
return null;
}
Semantic values
Semantic values are the values produced by parsing expressions.
Semantic values are used to form parsing results.
The syntax for using semantic values is as follows.
v = e
v = `type` e
Where v is a semantic value, e is a parsing expression and type is a native type.
An example of the use of semantic value.
Grammar code:
`int` Digit =>
n = [0-9]
$ = { n - 48 }
Dart code:
/// [int] **Digit**
/// ```txt
/// `int` Digit =>
/// n = [0-9]
/// $ = { n - 48 }
/// ```
Result<int>? parseDigit(State state) {
final $0 = state.peek();
// [0-9]
final $1 = $0 >= 48 && $0 <= 57;
if ($1) {
state.position += 1;
final n = $0;
final $2 = n - 48;
return Ok($2);
}
return null;
}
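The action in this example relies on the fact that the code unit of the character '0' is 48, so subtracting 48 from the code unit of a digit gives its numeric value. A minimal sketch in plain Dart follows; the function digitsToInt is hypothetical and is shown only to make the arithmetic concrete.

```dart
// Hypothetical helper, not part of the library.
// Converts the code units of ASCII digits into an integer value.
int digitsToInt(List<int> codeUnits) {
  var value = 0;
  for (final n in codeUnits) {
    value = value * 10 + (n - 48); // 48 is the code unit of '0'
  }
  return value;
}

void main() {
  print('7'.codeUnitAt(0) - 48); // 7
  print(digitsToInt('2024'.codeUnits)); // 2024
}
```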
The value $ is the resulting semantic value. It cannot be read by name; it is available only as an assignment target.
It is used exclusively to assign the result value of the Sequence expression.
In certain cases, it may be useful to specify the type of semantic value.
For example, if the value is a constant or if it is necessary for the code generator to automatically infer the type of the parent expression.
Example with constant value.
Grammar code:
`String` For =>
[fF][oO][rR]
$ = `const` { 'FOR' }
~ { state.errorExpected('FOR'); }
Dart code:
/// [String] **For**
/// ```txt
/// `String` For =>
/// [fF][oO][rR]
/// $ = `const` { 'FOR' }
/// ~ { state.errorExpected('FOR'); }
/// ```
Result<String>? parseFor(State state) {
Result<String>? $0;
$l:
{
final $1 = state.position;
final $2 = state.peek();
// [fF]
final $3 = $2 == 70 || $2 == 102;
if ($3) {
state.position += 1;
final $4 = state.peek();
// [oO]
final $5 = $4 == 79 || $4 == 111;
if ($5) {
state.position += 1;
final $6 = state.peek();
// [rR]
final $7 = $6 == 82 || $6 == 114;
if ($7) {
state.position += 1;
const $8 = 'FOR';
$0 = const Ok($8);
break $l;
} else {
state.backtrack($1);
}
} else {
state.backtrack($1);
}
}
}
if ($0 != null) {
return $0;
} else {
state.errorExpected('FOR');
}
return null;
}
Example with an explicitly specified value type.
Grammar code:
`Expression` Primary =>
# ...
n = (
{ final pos = state.position; }
e = SequenceElement
{ e.sourceCode = state.substring(pos, state.position).trimRight(); }
$ = `Expression` { e }
)+
# ...
Dart code:
/// [Expression] **Primary**
/// ```txt
/// `Expression` Primary =>
/// # ...
/// n = (
/// { final pos = state.position; }
/// e = SequenceElement
/// { e.sourceCode = state.substring(pos, state.position).trimRight(); }
/// $ = `Expression` { e }
/// )+
/// # ...
/// ```
Result<Expression>? parsePrimary(State state) {
final $0 = <Expression>[];
// (1)
while (true) {
final pos = state.position;
final $1 = parseSequenceElement(state);
if ($1 != null) {
final e = $1.$1;
e.sourceCode = state.substring(pos, state.position).trimRight();
final Expression $2 = e;
$0.add($2);
continue;
}
break;
}
if ($0.isNotEmpty) {
final n = $0;
return Ok($0);
}
return null;
}
Parsing case-insensitive data
There are no special features for parsing case-insensitive data.
Parsing such data is only possible character by character.
Below are examples of how this can be implemented.
Example for a case when the result value is not important.
Grammar code:
`void` For =>
([fF][oO][rR])
~ { state.errorExpected('FOR'); }
Dart code:
/// [void] **For**
/// ```txt
/// `void` For =>
/// ([fF][oO][rR])
/// ~ { state.errorExpected('FOR'); }
/// ```
Result<void>? parseFor(State state) {
var $0 = false;
$l:
{
final $1 = state.position;
final $2 = state.peek();
// [fF]
final $3 = $2 == 70 || $2 == 102;
if ($3) {
state.position += 1;
final $4 = state.peek();
// [oO]
final $5 = $4 == 79 || $4 == 111;
if ($5) {
state.position += 1;
final $6 = state.peek();
// [rR]
final $7 = $6 == 82 || $6 == 114;
if ($7) {
state.position += 1;
$0 = true;
break $l;
} else {
state.backtrack($1);
}
} else {
state.backtrack($1);
}
}
}
if ($0) {
return Result.none;
} else {
state.errorExpected('FOR');
}
return null;
}
Example for a case when the result value is not very important.
Grammar code:
`String` For =>
[fF][oO][rR]
$ = `const` { 'FOR' }
~ { state.errorExpected('FOR'); }
Dart code:
/// [String] **For**
/// ```txt
/// `String` For =>
/// [fF][oO][rR]
/// $ = `const` { 'FOR' }
/// ~ { state.errorExpected('FOR'); }
/// ```
Result<String>? parseFor(State state) {
Result<String>? $0;
$l:
{
final $1 = state.position;
final $2 = state.peek();
// [fF]
final $3 = $2 == 70 || $2 == 102;
if ($3) {
state.position += 1;
final $4 = state.peek();
// [oO]
final $5 = $4 == 79 || $4 == 111;
if ($5) {
state.position += 1;
final $6 = state.peek();
// [rR]
final $7 = $6 == 82 || $6 == 114;
if ($7) {
state.position += 1;
const $8 = 'FOR';
$0 = const Ok($8);
break $l;
} else {
state.backtrack($1);
}
} else {
state.backtrack($1);
}
}
}
if ($0 != null) {
return $0;
} else {
state.errorExpected('FOR');
}
return null;
}
Example for a case when the result value is important.
Grammar code:
`String` For =>
$ = <[fF][oO][rR]>
~ { state.errorExpected('FOR'); }
Dart code:
/// [String] **For**
/// ```txt
/// `String` For =>
/// $ = <[fF][oO][rR]>
/// ~ { state.errorExpected('FOR'); }
/// ```
Result<String>? parseFor(State state) {
Result<String>? $0;
$l:
{
final $1 = state.position;
final $2 = state.peek();
// [fF]
final $3 = $2 == 70 || $2 == 102;
if ($3) {
state.position += 1;
final $4 = state.peek();
// [oO]
final $5 = $4 == 79 || $4 == 111;
if ($5) {
state.position += 1;
final $6 = state.peek();
// [rR]
final $7 = $6 == 82 || $6 == 114;
if ($7) {
state.position += 1;
final $8 = state.substring($1, state.position);
$0 = Ok($8);
break $l;
} else {
state.backtrack($1);
}
} else {
state.backtrack($1);
}
}
}
if ($0 != null) {
return $0;
} else {
state.errorExpected('FOR');
}
return null;
}
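For comparison, the same character-by-character check can be sketched as a hand-written Dart helper. It handles ASCII letters only, and the names matchesIgnoreCase and toLowerAscii are hypothetical; they are not part of this library.

```dart
// Hypothetical helper, not part of the library.
// Maps the ASCII letters A-Z to a-z and leaves other code units unchanged.
int toLowerAscii(int c) => (c >= 65 && c <= 90) ? c + 32 : c;

// Returns true if [input] contains [keyword] at [position], ignoring the
// case of ASCII letters.
bool matchesIgnoreCase(String input, int position, String keyword) {
  if (position + keyword.length > input.length) {
    return false;
  }
  for (var i = 0; i < keyword.length; i++) {
    final actual = toLowerAscii(input.codeUnitAt(position + i));
    final expected = toLowerAscii(keyword.codeUnitAt(i));
    if (actual != expected) {
      return false;
    }
  }
  return true;
}

void main() {
  print(matchesIgnoreCase('For x', 0, 'FOR')); // true
  print(matchesIgnoreCase('x fOr', 2, 'for')); // true
  print(matchesIgnoreCase('fo', 0, 'for')); // false
  print(matchesIgnoreCase('fox', 0, 'for')); // false
}
```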
Parsing data from files
To implement data parsing from files, it is necessary to extend the State class.
The following class members must be overridden:
- charSize
- indexOf
- length
- peek
- startsWith
- strlen
- substring
When generating the parser, specify InputType.file as the value of the inputType parameter.
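The general idea of reading input on demand from a file can be sketched with dart:io. The FileInput class below is entirely hypothetical: it does not extend the library's State class, its member signatures are assumptions, and it treats every byte as one character, which real implementations of charSize and strlen would presumably need to refine.

```dart
import 'dart:io';

// Hypothetical byte-oriented reader; NOT the library's State class.
class FileInput {
  final RandomAccessFile _file;
  final int length;
  int position = 0;

  FileInput(this._file) : length = _file.lengthSync();

  // Returns the byte at the current position, or 0 at end of file
  // (an assumption made for this sketch).
  int peek() {
    if (position >= length) {
      return 0;
    }
    _file.setPositionSync(position);
    return _file.readByteSync();
  }

  // Reads the bytes in the range [start, end) as single-byte characters.
  String substring(int start, int end) {
    _file.setPositionSync(start);
    return String.fromCharCodes(_file.readSync(end - start));
  }
}

void main() {
  final dir = Directory.systemTemp.createTempSync('file_input_sketch');
  final file = File('${dir.path}/input.txt')..writeAsStringSync('abc');
  final raf = file.openSync();
  try {
    final input = FileInput(raf);
    print(input.peek()); // 97
    input.position += 1;
    print(input.peek()); // 98
    print(input.substring(0, input.length)); // abc
  } finally {
    raf.closeSync();
    dir.deleteSync(recursive: true);
  }
}
```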
Examples of generated errors
A small program in the file example/example_json_errors.dart demonstrates what errors can be generated by a fairly simple JSON parser.
----------------------------------------
Source: "string
FormatException: line 1, column 8: Expected: '"'
╷
1 │ "string
│ ^
╵
----------------------------------------
Source: {"key" : "value"
FormatException: line 1, column 17: Expected: ',', '}'
╷
1 │ {"key" : "value"
│ ^
╵
----------------------------------------
Source: [0, ]
FormatException: line 1, column 5: Expected: '"', '[', 'false', 'null', 'number', 'true', '{'
╷
1 │ [0, ]
│ ^
╵
----------------------------------------
Source: [0, 1
FormatException: line 1, column 6: Expected: ',', ']'
╷
1 │ [0, 1
│ ^
╵
----------------------------------------
Source: -
FormatException: line 1, column 1: Unterminated number
╷
1 │ -
│ ^
╵
----------------------------------------
Source: 1.
FormatException: line 1, column 3: Fractional part is missing a number
╷
1 │ 1.
│ ^
╵
----------------------------------------
Source: 1E
FormatException: line 1, column 3: Exponent part is missing a number
╷
1 │ 1E
│ ^
╵
----------------------------------------
Source: "\
FormatException: line 1, column 3: Expected: 'escape character'
╷
1 │ "\
│ ^
╵
----------------------------------------
Source: "\z
FormatException: line 1, column 3: Illegal escape character
╷
1 │ "\z
│ ^
╵
----------------------------------------
Source: "\u
FormatException: line 1, column 4: Expected: '4 hexadecimal digit number'
╷
1 │ "\u
│ ^
╵
----------------------------------------
Source: "\u00
FormatException: line 1, column 6: Expected hexadecimal digit
╷
1 │ "\u00
│ ^
╵
line 1, column 4: Unterminated 4 hexadecimal digit number
╷
1 │ "\u00
│ ^^
╵
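The line and column numbers in these messages can be derived from a character offset in the source. The sketch below shows one straightforward way to do that; it is an assumption about the general technique, not the library's actual implementation, and the function lineAndColumn is hypothetical.

```dart
// Hypothetical helper, not part of the library.
// Computes a 1-based (line, column) pair for [offset] in [source].
(int, int) lineAndColumn(String source, int offset) {
  var line = 1;
  var column = 1;
  for (var i = 0; i < offset && i < source.length; i++) {
    if (source.codeUnitAt(i) == 10) {
      // A line feed starts a new line.
      line++;
      column = 1;
    } else {
      column++;
    }
  }
  return (line, column);
}

void main() {
  print(lineAndColumn('"string', 7)); // (1, 8)
  print(lineAndColumn('[0, ]', 4)); // (1, 5)
  print(lineAndColumn('[\n0, ]', 5)); // (2, 4)
}
```

The first two calls reproduce the positions reported above for the sources "string and [0, ].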