rookie_yaml
A (rookie) Dart YAML 1.2+ parser.
Warning
The parser is still in active development and is missing some features and intermediate functionality. Until a stable 1.0.0 is released, the package API may have breaking changes in each version.
Documentation is also limited to this README for now to provide a quick overview of the features this parser supports.
Despite the stick it gets, YAML is, under all that complexity, a highly configurable data format that you can bend to your will with the right tools. Most people blame the spec. Once you start reading, it's like going down a rabbit hole. Think matryoshka doll, but each layer has a low chance of recurring with quantum branching. At the start, the spec looks inviting (as it should). At the end, you end up with several ways of doing a single thing. All in all, YAML has potential.
Main features of this parser include:
- Guarantees that a YamlSourceNode and a Dart object are equal if they are the same kind (type). For Map and List, they must have the same entries/elements respectively. The == operator can be used (this behaviour may change to require the use of DeepCollectionEquality for it to evaluate to true).
- Preserves the parsed integer radix.
- Extensive node support and an expressive way to declare directives and tags that aligns with the YAML spec.
- Support for custom tags and their resolvers based on their tag shorthand suffix. There is no limitation (currently) if the ! has no global tag as its prefix.
Note
Verbatim tags have no suffix and are usually not resolved.
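To make "preserves the parsed integer radix" concrete, here is a minimal sketch that uses only the Dart core library. RadixInt and parseRadixInt are hypothetical names made up for illustration. They are not part of this package's API and say nothing about how the parser stores the radix internally. The sketch assumes the YAML 1.2 core schema prefixes (0x for hex, 0o for octal).

```dart
/// Hypothetical value type (NOT part of rookie_yaml) that remembers the radix
/// an integer was written in.
class RadixInt {
  const RadixInt(this.value, this.radix);

  final int value;
  final int radix;

  /// Re-emits the integer with the prefix that matches its original radix.
  @override
  String toString() {
    if (radix == 16) return '0x${value.toRadixString(16)}';
    if (radix == 8) return '0o${value.toRadixString(8)}';
    return value.toString();
  }
}

/// Hypothetical helper: parses `0x` (hex), `0o` (octal) or base 10 integers.
/// Returns null if [source] is not an integer in the detected radix.
RadixInt? parseRadixInt(String source) {
  var digits = source;
  var radix = 10;

  if (source.startsWith('0x')) {
    digits = source.substring(2);
    radix = 16;
  } else if (source.startsWith('0o')) {
    digits = source.substring(2);
    radix = 8;
  }

  final value = int.tryParse(digits, radix: radix);
  return value == null ? null : RadixInt(value, radix);
}

void main() {
  final hex = parseRadixInt('0x1f')!;
  print(hex.value); // 31
  print(hex); // 0x1f
  print(parseRadixInt('0o17')); // 0o17
  print(parseRadixInt('24')); // 24
}
```

The point of keeping the radix next to the value is that the number can be written back the way it was read, instead of collapsing everything to base 10.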
Parsing Documents
Based on the process model, the current parser provides a YamlDocument and a YamlSourceNode that are an amalgamation of the last two stages. Future changes may (or may not) separate these stages based on programmer (actual user) sentiment.
- Bare documents - Clean documents with no directives
const yaml = '''
# Okay if empty
...
Wow! Nice! This looks clean
...
''';
final docs = YamlParser(yaml).parseDocuments();
print(docs.length); // 2
// True
print(
docs.every(
(doc) =>
doc.hasExplicitEnd &&
!doc.hasExplicitStart &&
doc.docType == YamlDocType.bare,
),
);
- Explicit documents - Documents with directive end markers (---) and, optionally, document end markers (...). Why optionally? The directive end markers signify the start of a document, so the next --- also ends the previous document.
const yaml = '''
--- # Ends after the next comment
# LFG
...
---
"This one has a double quoted scalar, but no doc end"
---
status: Started immediately the marker was seen.
''';
final docs = YamlParser(yaml).parseDocuments();
print(docs.length); // 3
// True
print(
docs.every(
(doc) => doc.hasExplicitStart && doc.docType == YamlDocType.explicit,
),
);
- Directive documents - Documents with directives. The directives must always end with the directive end marker (---) even if the document is empty!
const yaml = '''
%YAML 1.1
%SUPPORT on that version is limited
%TAG !for-real! !yah-for-real
---
"You can just do this things. Do them with version 1.2+ features"
''';
final doc = YamlParser(yaml).parseDocuments().first;
// True
print(
doc.hasExplicitStart &&
doc.docType == YamlDocType.directiveDoc &&
doc.tagDirectives.isNotEmpty &&
doc.otherDirectives.isNotEmpty &&
doc.versionDirective == YamlDirective.ofVersion('1.1'),
);
Parsing Nodes
Declare block or flow nodes. You can use flow in block but not the other way around.
- Scalars
/// Do not use dynamic (anti-pattern in Dart).
/// This is for demo purposes to showcase equality.
dynamic value = 24;
final node = YamlParser('$value').parseNodes().first.castTo<Scalar>();
print(node == value); // True.
- Sequences (Lists) - allows all node types
import 'package:collection/collection.dart';
const yaml = '''
- rookie_yaml. The
- new kid
- in town
''';
final node = YamlParser(yaml).parseNodes().first.castTo<Sequence>();
// True.
print(
DeepCollectionEquality().equals(node, [
'rookie_yaml. The',
'new kid',
'in town',
]),
);
/// For the skeptics, if you cheat the analyzer. Order is maintained!
/// A Sequence is a direct subtype of a Dart List
print((node as List) == ['rookie_yaml. The', 'new kid', 'in town']);
- Mapping (Map) - allows all node types.
// Let's get funky.
const funkyMap = {
'name': 'rookie_yaml',
'in_active_development': true,
'supports': {
1: 'Full YAML spec',
2: 'Custom tags and resolvers',
}
};
// Native Dart objects as strings are just flow nodes in yaml
final node = YamlParser(
funkyMap.toString(),
).parseNodes().first.castTo<Mapping>();
// True.
print(DeepCollectionEquality().equals(node, funkyMap));
/// Dart Analyzer: "unrelated type equality lint without casting?" Valid, but..
/// "Just put my fries in the bag, bro"
print((node as Map) == funkyMap);
Caution
The parser does not restrict implicit keys to at most 1024 unicode characters as instructed by the YAML spec for flow and for block styles. This may change in later versions.
Anchors & Alias in Nodes
An alias acts as a reference to an anchor. Think pointers in C (and any language that has them) or object references in Dart (and any other object-oriented language). You must declare an anchor before using it. Anchor names must also consist of valid uri characters.
A node cannot have both an anchor and an alias. YAML demands that they be mutually exclusive. This also disqualifies an alias from having a tag, since it "borrows" its kind from the anchor node.
- Flow nodes - Anchors and aliases for flow nodes are straightforward
const yaml = '''
# Indent is moot in flow styles
# It is used for readability
{
"double quoted": &ref-seq [
&ref-single-quoted 'single quoted',
&ref-plain plain,
&ref-map {key: value}
],
# Colon ":" is a valid uri char. Do not forget space
*ref-plain : *ref-single-quoted ,
# Use sequence as a key
*ref-seq : *ref-map
}
''';
// The sequence anchored as "ref-seq" has three entries. The key
// "double quoted" is not one of them.
final expectedMap = {
  'double quoted': [
    'single quoted',
    'plain',
    {'key': 'value'},
  ],
  'plain': 'single quoted',
  [
    'single quoted',
    'plain',
    {'key': 'value'},
  ]: {'key': 'value'}
};
final node = YamlParser(yaml).parseNodes().first.castTo<Mapping>();
/// Currently aliases don't behave the same way as normal nodes do.
/// Equality seems a bit off (will be fixed)
/// The underlying node is still the same!
print(node.toString() == expectedMap.toString()); // True
// If we use the map without aliases
final noAliasNode = YamlParser(
expectedMap.toString(),
).parseNodes().first.castTo<Mapping>();
// If we cast, equal.
print((noAliasNode as Map) == expectedMap);
- Block map nodes are somewhat unique in this aspect. You need to declare the entire node on a new line for properties to be assigned to the node if it degenerates to a map. However, in this case, the first node can never have properties. This is because the parser can never know if the first block scalar is an implicit key to a block map unless it sees the ": " (colon + space combination).
// This goes to the entire map
const yaml = '''
&map-anchor !!map
key: value
--- # Next document!
&key-anchor !!str key: value
''';
final docs = YamlParser(yaml).parseDocuments();
// Anchor in first document goes to the root map
print(docs[0].root.anchorOrAlias != null); // True
// Anchor in second document goes to the first key
print(docs[1].root.anchorOrAlias != null); // False
- Block explicit keys and block sequences cannot have properties before their ? and - indicators respectively. Their node content begins after these indicators. The parser currently allows you to declare such properties only if they are multiline and the block sequence entry or explicit key entry is the first entry in a block list and map respectively.
const yaml = '''
# This is okay
&map-anchor !!map
? key
: value
...
# This is also okay
&list-anchor !!seq
- entry
- next
''';
final docs = YamlParser(yaml).parseDocuments();
// True
print(docs.every((d) => d.root.anchorOrAlias != null));
/// All the yaml declared below will fail and also applies to block sequences.
/// Rule of thumb:
/// - If it is the first entry, okay if multiline
/// - In all other cases, it is an error
const mapErr = '''
# Invalid use in block map
key: value
# Throws. This is the second key. We already know this is a map.
# Even if it is multiline.
&anchor
? next-key
: value
''';
const anotherMapErr = '''
# First key. Properties are inline. Error
&anchor ? key
: value
''';
const listErr = '''
# Invalid use in block list
- value
# Throws. This is the second entry. Even if it is multiline.
&anchor
- anothervalue
''';
const anotherListErr = '''
# First entry. Properties are inline. Error
&anchor - entry
''';
Important
An alias cannot be recursive. The node must be parsed completely and resolved before its anchor can be used. In addition to those in the spec, the parser CURRENTLY abides by the following rules:
- An anchor to a collection cannot be used by an entry in the same collection. In programming terms, you cannot use a variable before it has been declared or its value determined and assigned.
- An anchor can be redeclared to point to another node.
Ergo, if rules 1 and 2 are satisfied and the anchor exists, an alias is valid.
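The two rules can be pictured as a lookup table that is filled in while the document is read from top to bottom. The sketch below is a toy model that uses only the Dart core library. AnchorTable is a hypothetical class and is not how the parser is implemented. The YAML each step stands for is shown in the comments.

```dart
/// Toy model (NOT the package's implementation) of anchors seen so far while
/// a document is parsed from top to bottom.
class AnchorTable {
  final _anchors = <String, Object?>{};

  /// Registers [node] under [name] once the node has been fully parsed.
  /// Declaring the same name again repoints it to the new node.
  void declare(String name, Object? node) => _anchors[name] = node;

  /// Resolves an alias. Throws if no anchor named [name] was declared earlier.
  Object? resolve(String name) {
    if (!_anchors.containsKey(name)) {
      throw FormatException('Alias "*$name" used before anchor "&$name"');
    }
    return _anchors[name];
  }
}

void main() {
  final anchors = AnchorTable();

  // - &a first
  // - *a
  anchors.declare('a', 'first');
  print(anchors.resolve('a')); // first

  // - &a second   (same anchor name, redeclared)
  // - *a
  anchors.declare('a', 'second');
  print(anchors.resolve('a')); // second

  // &list [ *list ]
  // The list is registered only after its last entry has been parsed, so the
  // alias inside it finds nothing.
  try {
    anchors.resolve('list');
  } on FormatException catch (e) {
    print(e.message); // Alias "*list" used before anchor "&list"
  }
}
```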
Tags
Tags are also node properties. They can be declared in tandem with an anchor but never with an alias. The package has an expressive way to write tags in Dart (in line with the spec). They form the backbone of the custom resolvers this package allows. Do not skip this section (it is a compacted version of the spec). You can use it as a reference when reading the spec (create an issue if there is an error).
Every node tag begins with the ! indicator. This signifies the start of a node tag's tag handle. There are 3 types of tag handles in YAML:
- Primary - Has a single !. You can use this to declare your own tags without declaring a global tag to resolve them.
- Secondary - !!. Reserved for tags that instruct the parser on how a node can be represented based on the YAML spec. To that effect, tags with this handle are restricted to those the spec recognizes and, by default, are resolved to the official YAML global tag prefix, tag:yaml.org,2002:. This handle can be overridden by a custom global tag prefix. See supported tags.
- Named - Starts with a ! + a custom name + a closing ! indicator. This handle must have a corresponding global tag (more on this later).
%TAG !example! !named-must-have-global
---
[
!my-tag scalar, # Custom primary tag
!!int 24, # Secondary tag supported by yaml
# Named tag resolved to "!named-must-have-global-tag"
!example!-tag just-a-value,
]
Based on the tag's use, there are 4 ways you can declare a YAML tag:
- As a tag shorthand
- As a non-specific tag
- As a global tag
- As a verbatim tag
Tag shorthand
This represents a valid tag handle and a non-empty suffix. This is the tag declared with(out) the anchor. Using the example in the tag handle paragraph:
%TAG !example! !named-must-have-global
---
[
# Custom primary tag
#
# "!" - tag handle
# "my-tag" - suffix
#
!my-tag scalar,
# Secondary tag
#
# "!!" - tag handle
# "int" - suffix
#
!!int 24,
# Named tag
#
# "!example!" - tag handle
# "-tag" - suffix
#
!example!-tag just-a-value,
]
Non-specific tag
This is a primary tag handle without a suffix which is, objectively, just a tag shorthand with an empty suffix.
[
# Non-specific tag "!"
! scalar,
! 24,
! just-a-value,
]
Important
No other tag handle can be used as a non-specific tag. Only primary tag handles.
Global Tag
A global tag MUST be declared with other directives before the document is parsed. This is the only tag form that must be known ahead of time. It is restricted to a single line with three parts (separated by whitespace) in the following order:
- Directive - %TAG
- Handle - The tag handle this global tag is a prefix to.
- Prefix - A valid uri or tag shorthand that is the prefix to the handle. The uri must have a scheme.
Every handle can only have a single global tag per document. Every global tag is restricted to the document it was declared in. This means a handle declared in one YamlDocument does not carry over to another YamlDocument unless you explicitly declare that global tag in each document!
By default, the secondary tag handle (!!) resolves to the YAML tag prefix, tag:yaml.org,2002:.
Important
A named tag handle must have a corresponding global tag
%TAG ! !non-specific-looks-naked
%TAG !! !no-yaml-uri-in-
%TAG !meme! meme://look.at.me
---
[
# Resolved as: !non-specific-looks-naked
! "I am so non-specific",
# Resolved as: !no-yaml-uri-in-int
!!int 24,
# Resolved as: !meme://look.at.me:iAmTheCaptainNow
!meme!iAmTheCaptainNow "Oh captain, my captain"
]
--- # This next document has no global tags.
# For the same node, the parser throws when it sees the named tag handle
[
# Resolved as: !!str.
# Non-specific tags let the parser resolve it to a specific tag based on its
# kind (type)
! "I am so non-specific",
# Resolved as: !!int
!!int 24,
# Parser throws here
!meme!iAmTheCaptainNow "Oh captain, my captain"
]
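The per-document scoping shown above can be sketched as a small lookup that is rebuilt for every document. This is a toy model that uses only the Dart core library. DocumentTags is a hypothetical class, not the package's API. It assumes the plain prefix + suffix concatenation the YAML spec describes for tag shorthands. It does not model how non-specific tags are resolved by kind or how the package joins a uri prefix to a suffix.

```dart
/// Toy model (NOT the package's API) of the global tags a single document sees.
class DocumentTags {
  DocumentTags([Map<String, String> declared = const {}])
      : _prefixes = {
          '!': '!', // primary handle: stays a local tag by default
          '!!': 'tag:yaml.org,2002:', // secondary handle: YAML's own prefix
          ...declared, // %TAG directives of THIS document override defaults
        };

  final Map<String, String> _prefixes;

  /// Resolves a tag shorthand by concatenating the handle's prefix and the
  /// suffix. Throws for a handle that has no prefix in this document.
  String resolve(String handle, String suffix) {
    final prefix = _prefixes[handle];
    if (prefix == null) {
      throw FormatException('No global tag declared for handle "$handle"');
    }
    return '$prefix$suffix';
  }
}

void main() {
  // %TAG ! !non-specific-looks-naked
  // %TAG !! !no-yaml-uri-in-
  final first = DocumentTags({
    '!': '!non-specific-looks-naked',
    '!!': '!no-yaml-uri-in-',
  });
  print(first.resolve('!', '')); // !non-specific-looks-naked
  print(first.resolve('!!', 'int')); // !no-yaml-uri-in-int

  // Next document: no directives, so nothing is carried over.
  final second = DocumentTags();
  print(second.resolve('!!', 'int')); // tag:yaml.org,2002:int

  try {
    second.resolve('!meme!', 'iAmTheCaptainNow');
  } on FormatException catch (e) {
    print(e.message); // No global tag declared for handle "!meme!"
  }
}
```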
Verbatim tag
Every valid (un)resolved tag can be declared in "verbatim" based on its resolution status. Using the global tag example:
%TAG ! !non-specific-looks-naked
%TAG !! !no-yaml-uri-in-
%TAG !meme! meme://look.at.me
---
[
# Resolved as: !non-specific-looks-naked
# Suffixes for non-specific tags are empty.
#
# Verbatim: !<!non-specific-looks-naked>
! "I am so non-specific",
# Resolved as: !no-yaml-uri-in-int
#
# Verbatim: !<!no-yaml-uri-in-int>
!!int 24,
# Resolved as: !meme://look.at.me:iAmTheCaptainNow
#
# Verbatim: !<!meme://look.at.me:iAmTheCaptainNow>
!meme!iAmTheCaptainNow "Oh captain, my captain"
]
--- # This next document has no global tags.
# Global tags are never carried over
[
# Inferred as: !!str. This is its kind. Verbatim uses the actual tag
#
# Verbatim: !<!my-custom-tag>
!my-custom-tag "I am so custom",
# Resolved as: !!int. Uses the global yaml prefix for handle "!!"
#
# Verbatim: !<!tag:yaml.org,2002:int>
!!int 24,
]
Warning
At this time, a non-specific tag ! with no global tag prefix will be printed as !<!>, which is invalid. This will be fixed in the next (breaking) version.
A verbatim tag is a node's tag declared in verbatim rather than as a tag shorthand. Such tags are handed off "as is" with no resolution to any global tag.
- !<!my-custom-tag> "I am so custom",
- !<!tag:yaml.org,2002:int> 24,
Important
Tags only accept characters considered valid uri characters. Therefore:
- Any uri character that must be escaped as required by the URI RFC must also be escaped.
- All collection flow indicators ("{", "}", "[", "]" and ",") must be escaped as hex using the % indicator.
- The tag indicator ! must be escaped if used within the tag suffix.
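As a rough illustration of the last two rules, the sketch below percent-encodes the flow indicators and the tag indicator in a suffix. It uses only the Dart core library. escapeTagSuffix is a hypothetical helper, not something this package exposes. It assumes the input has not been escaped already and does not attempt the wider escaping required by the URI RFC.

```dart
/// Characters that must be percent-escaped inside a tag suffix according to
/// the rules above: the flow indicators and the tag indicator "!".
const mustEscape = {'{', '}', '[', ']', ',', '!'};

/// Hypothetical helper (NOT part of rookie_yaml): percent-encodes every
/// character in [mustEscape] and leaves everything else untouched.
String escapeTagSuffix(String suffix) {
  final buffer = StringBuffer();

  for (final char in suffix.split('')) {
    if (mustEscape.contains(char)) {
      // Two uppercase hex digits after "%", e.g. "," becomes "%2C".
      final hex = char.codeUnitAt(0).toRadixString(16).toUpperCase();
      buffer.write('%${hex.padLeft(2, '0')}');
    } else {
      buffer.write(char);
    }
  }

  return buffer.toString();
}

void main() {
  print(escapeTagSuffix('point[x,y]')); // point%5Bx%2Cy%5D
  print(escapeTagSuffix('wow!')); // wow%21
  print(escapeTagSuffix('plain-suffix')); // plain-suffix
}
```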
Tags in code
The package provides an expressive way to declare any handle, tag or directive, which favours intent to minimize errors. Let's look at a few examples from earlier.
print(TagHandle.primary()); // !
print(TagHandle.secondary()); // !!
print(TagHandle.named('example')); // !example!
print(TagShorthand.fromTagUri(TagHandle.primary(), 'primary')); // !primary
print(TagShorthand.fromTagUri(TagHandle.secondary(), 'int')); // !!int
print(TagShorthand.fromTagUri(TagHandle.named('example'), 'named')); // !example!named
// Global tags allow uri or tag shorthand prefixes
/// %TAG !! !no-yaml-uri
///
/// Has 3 parts
/// - %TAG - is implied in the type `GlobalTag`
/// - !! - Secondary tag handle
/// - !no-yaml-uri - tag shorthand with primary tag handle
///
/// The syntax matches the YAML format discussed earlier
print(
GlobalTag.fromTagShorthand(
TagHandle.secondary(),
TagShorthand.fromTagUri(TagHandle.primary(), 'no-yaml-uri'),
),
);
/// %TAG !meme! meme://look.at.me
///
/// Unlike the previous example, this is a URI.
print(
GlobalTag.fromTagUri(
TagHandle.named('meme'),
'meme://look.at.me',
),
);
// %TAG ! !non-specific-looks-naked
print(
GlobalTag.fromTagShorthand(
TagHandle.primary(),
TagShorthand.fromTagUri(TagHandle.primary(), 'non-specific-looks-naked'),
),
);
- Verbatim tags - verbatim tags are, well, verbatim. You are to provide the tag in its fully resolved verbatim form unless it's a tag shorthand.
/// Declaring !!int in verbatim:
///
/// 1. Has secondary tag handle. Use the "tag:yaml.org,2002" prefix
/// 2. Kind is "int".
/// 3. Use ":" or "/" as separator
/// 4. However, since this is a uri. We need to escape the "," in the prefix
/// as %2C
///
/// Final uri: "tag:yaml.org%2C2002:int". The object automatically wraps it for
/// you to match yaml requirements.
///
/// In verbatim: !<!tag:yaml.org,2002:int>
print(VerbatimTag.fromTagUri('tag:yaml.org%2C2002:int'));
/// Tag shorthand: "!my-custom-tag"
///
/// In verbatim: !<!my-custom-tag>
print(
VerbatimTag.fromTagShorthand(
TagShorthand.fromTagUri(TagHandle.primary(), 'my-custom-tag'),
),
);
Important
If you declared a tag shorthand in verbatim:
- It cannot have a named tag handle. All verbatim tags are resolved to their global tag prefixes
- It cannot be a non-specific tag
Custom Resolvers
Custom resolvers bind themselves to parsed tag shorthands and control how the parser resolves a node's kind (type). Currently, you can only declare a resolver in two ways:
- NodeResolver - resolves any YamlSource, i.e. Sequence, Mapping or Scalar, after the node has been fully parsed and instantiated. This is the safest option. The node is converted when the asCustomType method is called.
- ContentResolver - as the name suggests, this resolver resolves the parsed content. This is limited to the Scalar type, which is a wrapper around basic types inferred from the parsed yaml content. Unlike a NodeResolver, you must declare a function that converts the type back to a string. This is because the type lives within the Scalar and a ScalarValue must declare a way to safely convert the type back to a string.
You cannot declare these resolvers directly. Instead, you delegate this to the parser by creating a PreResolver, which has helper constructors for both. Let's see a few examples.
Simple Resolver
YAML does not support a variety of modern encodings such as base64 out of the box.
In one swoop, let's decode a base64 string to a Dart string using a ContentResolver and a sequence of code units to the same string using a NodeResolver.
import 'dart:convert'; // base64Encode and base64Decode
const string = 'I love Dart';
final codeUnits = string.codeUnits;
final encoded = base64Encode(codeUnits);
final handle = TagHandle.primary();
final utf16Tag = TagShorthand.fromTagUri(handle, 'utf16');
final base64Tag = TagShorthand.fromTagUri(handle, 'base64');
// A NodeResolver for the "codeUnits"
final utf16Resolver = PreResolver.node(
utf16Tag,
resolver: (node) => String.fromCharCodes(
node.castTo<Sequence>().map((e) => e.castTo<Scalar>().value),
),
);
// A ContentResolver for the "encoded" base64 string
final base64Resolver = PreResolver.string(
base64Tag,
contentResolver: (string) => String.fromCharCodes(base64Decode(string)),
toYamlSafe: (string) => string.codeUnits.toString(),
);
final yaml =
'''
- $base64Tag $encoded
- $utf16Tag $codeUnits
''';
final node = YamlParser(
yaml,
resolvers: [utf16Resolver, base64Resolver],
).parseNodes().first.castTo<Sequence>();
/// base64 string decoded and embedded in Scalar
/// Sequence values inferred as "int"
print((node as List) == [string, codeUnits]);
// Convert Sequence safely on demand
print(node[1].asCustomType<String>());
Tip
Avoid declaring callbacks that throw errors when defining a ContentResolver. Return null instead. The resolution process is an extension of the parsing capabilities. Allow the parser to infer a type (based on the ones available to it) to partially create the Scalar.
As an example, let's try decoding a base64 string as base32.
import 'dart:convert'; // base64Encode
const radix = 32;
String toYamlSafe(int value) => value.toRadixString(radix);
final encoded = base64Encode('I love Dart'.codeUnits);
final base32Tag = TagShorthand.fromTagUri(TagHandle.primary(), 'base32');
final aggressiveResolver = PreResolver.string(
base32Tag,
contentResolver: (string) => int.parse(string, radix: radix),
toYamlSafe: toYamlSafe,
);
final safeResolver = PreResolver.string(
base32Tag,
contentResolver: (string) => int.tryParse(string, radix: radix),
toYamlSafe: toYamlSafe,
);
final yaml = '$base32Tag $encoded';
// Defaults to string
print(
YamlParser(
yaml,
resolvers: [safeResolver],
).parseNodes().first,
);
// Throws
print(
YamlParser(
yaml,
resolvers: [aggressiveResolver],
).parseNodes().first,
);
Note
The resolver functionality is optional.
Additionally, the parser limits each tag shorthand to a single resolver since a node cannot exist as two kinds at the same time.
Schema Tags
The secondary tag handle !! is limited to the tags below, which all resolve to the YAML global tag prefix, tag:yaml.org,2002:.
- YAML schema tags
  - !!map - Map
  - !!seq - List
  - !!str - String
- JSON schema tags
  - !!null - null
  - !!bool - Boolean.
  - !!int - Integer. hex, octal and base 10 should use this.
  - !!float - double.
- Dart-specific schema tags (More will be supported)
  - !!uri - URI