Extracting data

The parse method of a grammar returns a Match object through which you can access all the relevant information of the match. Named regexes that match within the grammar are accessible through the Match object as if it were a hash: the keys are the regex names, and the values are the Match objects that represent those parts of the overall match. Similarly, portions of the match that are captured with parentheses are available as positional elements of the Match object (as if it were an array).
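
As a minimal sketch of both access styles, the following standalone grammar reads a named submatch by key and parenthesized captures by index. The grammar FieldVersion and its token names are hypothetical and not part of JSON::Tiny, and the code assumes a reasonably current Rakudo.

    # Hypothetical grammar, for illustration only
    grammar FieldVersion {
        token TOP     { <field> '=' <version> }
        token field   { \w+ }
        token version { (\d+) '.' (\d+) }
    }

    my $match = FieldVersion.parse('version=6.0');

    say ~$match<field>;        # hash-style access by regex name: version
    say ~$match<version>[0];   # first parenthesized capture: 6
    say ~$match<version>[1];   # second parenthesized capture: 0

The positional captures belong to the Match object of the regex that contains the parentheses, which is why they are read from $match<version> rather than from $match itself.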

Once you have the Match object, what can you do with it? You could traverse it recursively and build data structures or execute code based on what you find. A more convenient alternative exists: action methods.

    class JSON::Tiny::Actions {
        method TOP($/)      { make $/.values.[0].ast }
        method object($/)   { make $<pairlist>.ast.hash }
        method pairlist($/) { make $<pair>».ast }
        method pair($/)     { make $<string>.ast => $<value>.ast }
        method array($/)    { make [$<value>».ast] }
        method string($/)   { make join '', $/.caps».value».ast }

        # Numify the matched text directly rather than evaluating
        # it as code
        method value:sym<number>($/) { make +$/.Str }
        method value:sym<string>($/) { make $<string>.ast }
        method value:sym<true>  ($/) { make Bool::True  }
        method value:sym<false> ($/) { make Bool::False }
        method value:sym<null>  ($/) { make Any }
        method value:sym<object>($/) { make $<object>.ast }
        method value:sym<array> ($/) { make $<array>.ast }

        method str($/)               { make ~$/ }

        method str_escape($/) {
            if $<xdigit> {
                make chr(:16($<xdigit>.join));
            } else {
                my %h = '\\' => "\\",
                        'n'  => "\n",
                        't'  => "\t",
                        'f'  => "\f",
                        'r'  => "\r";
                make %h{$/};
            }
        }
    }

    my $actions = JSON::Tiny::Actions.new();
    JSON::Tiny::Grammar.parse($str, :$actions);

This example passes an actions object to the grammar's parse method. Whenever the grammar engine finishes parsing a regex, it looks for a method on the actions object with the same name as that regex. If no such method exists, the grammar engine moves along. If a method does exist, the grammar engine calls it and passes the current match object as a positional argument.
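
The calling convention can be observed with a small sketch. The grammar Greeting and the class GreetingLog are hypothetical names chosen for illustration; the actions class simply records which methods the engine called, and deliberately omits a method for one of the regexes.

    # Hypothetical grammar and actions class, for illustration only
    grammar Greeting {
        token TOP      { <greeting> ' ' <audience> }
        token greeting { \w+ }
        token audience { \w+ }
    }

    class GreetingLog {
        has @.seen;
        method greeting($/) { @!seen.push: 'greeting: ' ~ $/ }
        method TOP($/)      { @!seen.push: 'TOP: ' ~ $/ }
        # There is no method named audience, so the engine
        # calls nothing when that regex finishes.
    }

    my $actions = GreetingLog.new;
    Greeting.parse('hello world', :$actions);
    .say for $actions.seen;
    # greeting: hello
    # TOP: hello world

The method for greeting runs before the method for TOP because a regex finishes only after all of the regexes it calls have finished.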

Each match object has a slot called ast (short for abstract syntax tree) for a payload object. This slot can hold a custom data structure that you create from the action methods. Calling make $thing in an action method sets the ast attribute of the current match object to $thing.
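
A minimal sketch of make and the ast slot, using a hypothetical grammar named Addition that is unrelated to JSON::Tiny: each number attaches its numeric value as its payload, and TOP combines the payloads of its submatches into its own.

    # Hypothetical grammar and actions class, for illustration only
    grammar Addition {
        token TOP    { <number> '+' <number> }
        token number { \d+ }
    }

    class AdditionActions {
        # Attach the numeric value of the matched digits
        method number($/) { make +$/ }
        # Read the payloads of both submatches and attach their sum
        method TOP($/)    { make $<number>».ast.sum }
    }

    my $match = Addition.parse('2+40', actions => AdditionActions.new);
    say $match.ast;    # 42

The match object still holds the matched text; the ast slot carries the computed value alongside it.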

In the case of the JSON parser, the payload is the data structure that the JSON string represents. For each matching rule, the grammar engine calls an action method to populate the ast slot of the match object. This process transforms the match tree into a different tree--in this case, the actual JSON tree.

Although the rules and action methods live in different namespaces (and in a real-world project probably even in separate files), here they are adjacent to demonstrate their correspondence:

    rule TOP        { ^ [ <object> | <array> ]$ }
    method TOP($/)  { make $/.values.[0].ast }

The TOP rule has an alternation with two branches, object and array. Each branch is a named capture. $/.values returns a list of all captures; because only one branch can match, that list holds either the object or the array capture.

The action method takes the AST attached to the match object of that subcapture and promotes it to its own AST by calling make.

    rule object        { '{' ~ '}' <pairlist>  }
    method object($/)  { make $<pairlist>.ast.hash }

The action method for object extracts the AST of the pairlist submatch and turns it into a hash by calling its hash method.

    rule pairlist       { [ <pair> ** [ \, ] ]? }
    method pairlist($/) { make $<pair>».ast }

The pairlist rule matches zero or more comma-separated pairs. The action method calls the .ast method on each matched pair and sets the resulting list as its own AST.

    rule pair       { <string> ':' <value> }
    method pair($/) { make $<string>.ast => $<value>.ast }

A pair consists of a string key and a value, so the action method constructs a Perl 6 pair with the => operator.

The other action methods work the same way. They transform the information they extract from the match object into native Perl 6 data structures, and call make to set those native structures as their own ASTs.
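
The chain from pair to pairlist to hash can be condensed into a standalone sketch. Settings, SettingsActions, and the word and number tokens are hypothetical names, not part of JSON::Tiny. The sketch uses tokens only, so it does not accept whitespace, and it writes the separator with the % quantifier modifier of current Rakudo releases.

    # Hypothetical grammar and actions class, for illustration only
    grammar Settings {
        token TOP      { '{' ~ '}' <pairlist> }
        token pairlist { <pair>* % ',' }
        token pair     { <word> ':' <number> }
        token word     { <[a..z]>+ }
        token number   { \d+ }
    }

    class SettingsActions {
        method TOP($/)      { make $<pairlist>.ast.hash }
        method pairlist($/) { make $<pair>».ast }
        method pair($/)     { make $<word>.ast => $<number>.ast }
        method word($/)     { make ~$/ }
        method number($/)   { make +$/ }
    }

    my $match = Settings.parse('{width:80,height:24}',
                               actions => SettingsActions.new);
    my $settings = $match.ast;
    say $settings<width>;     # 80
    say $settings<height>;    # 24

Each action method reads only the ASTs of its direct submatches, so the data structure is assembled from the leaves upward.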

The action methods that belong to a proto token are parametrized in the same way as the alternatives of that token:

    token value:sym<null>        { <sym>    };
    method value:sym<null>($/)   { make Any }

    token value:sym<object>      { <object> };
    method value:sym<object>($/) { make $<object>.ast }

When a <value> call matches, the action method with the same parametrization as the matching alternative executes.
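
The dispatch can be sketched with a small standalone grammar. Literal and LiteralActions are hypothetical names, and the proto declaration uses the {*} body of current Rakudo releases. Only the action method whose :sym<...> name equals that of the matching alternative runs.

    # Hypothetical grammar and actions class, for illustration only
    grammar Literal {
        token TOP { <value> }

        proto token value {*}
        token value:sym<true>   { <sym> }
        token value:sym<false>  { <sym> }
        token value:sym<number> { \d+ }
    }

    class LiteralActions {
        method TOP($/)               { make $<value>.ast }
        method value:sym<true>($/)   { make True }
        method value:sym<false>($/)  { make False }
        method value:sym<number>($/) { make +$/ }
    }

    my $actions = LiteralActions.new;
    say Literal.parse('true', :$actions).ast;    # True
    say Literal.parse('17',   :$actions).ast;    # 17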