Introduction
Tests are often repetitive; for instance, you have several implementations of a single interface and you want to test not the interface itself (you will already have done this), but what each implementation of this interface produces.
I recently created grappa, a fork of parboiled (1) aimed at continuing its development. And I stumbled upon this in its tests:
        test(parser.Clause(), "1+5").hasNoErrors()
            .hasParseTree("" +
                "[Clause] '1+5'\n" +
                "  [digit] '1'\n" +
                "  [Operator] '+'\n" +
                "    ['+'] '+'\n" +
                "  [digit] '5'\n" +
                "  [EOI]\n");
Uh. OK, so what this is supposed to do is test a generated parse tree... You can see that this is very far from being sustainable. For one thing, if you change the string output for some reason, you're basically f*ed.
My plan was therefore to replace this with a more sustainable approach which did not depend on string outputs... And as I am a big fan of JSON, I decided to go JSON.
This is the result of my efforts so far; it is not perfect yet (I will have more things to test; parse inputs will also find their way into JSON or even separate files in the future), but it can give you a good indication of how to factor the repetition out of your test code!
Software used
There are quite a few pieces of software involved here:
- first and foremost, Guava; if you don't know this library, you should really have a look at it, and use it;
- Jackson for JSON processing; if you don't know which library to choose for your JSON tasks, use this one;
- TestNG as a basic test framework;
- AssertJ as an assertion library.
How the parse tree is implemented
A parse tree is created if you annotate your parser class appropriately (by default, a parser class will not create a parse tree, for obvious performance reasons); when created, it is a tree of Nodes, with one node at the root and its child nodes below it. Among other information, a node contains:
- the name of the parsing rule having created this node;
- the start and end index of the match if the node has matched at all;
- an ordered list of its children.
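As a rough illustration of the three pieces of information listed above, here is a minimal, self-contained sketch of such a node. SketchNode, SketchNodeDemo and their methods are hypothetical names made up for this example; this is not grappa's actual Node class, and the indices used for the sample tree are assumptions derived by hand from the input "1+5".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a parse tree node; NOT the real grappa Node.
final class SketchNode
{
    private final String label;   // name of the rule which created the node
    private final int startIndex; // start of the match in the input (inclusive)
    private final int endIndex;   // end of the match in the input (exclusive)
    private final List<SketchNode> children = new ArrayList<>();

    SketchNode(final String label, final int startIndex, final int endIndex)
    {
        this.label = label;
        this.startIndex = startIndex;
        this.endIndex = endIndex;
    }

    // Appends a child; children are kept in insertion order
    SketchNode addChild(final SketchNode child)
    {
        children.add(child);
        return this;
    }

    String getLabel()
    {
        return label;
    }

    List<SketchNode> getChildren()
    {
        return Collections.unmodifiableList(children);
    }

    // Extracts the text matched by this node from the full input
    String matchIn(final String input)
    {
        return input.substring(startIndex, endIndex);
    }
}

final class SketchNodeDemo
{
    // Builds by hand a tree shaped like the one in the introduction
    // (indices are assumed for the input "1+5")
    static SketchNode sampleTree()
    {
        final SketchNode operator = new SketchNode("Operator", 1, 2)
            .addChild(new SketchNode("'+'", 1, 2));
        return new SketchNode("Clause", 0, 3)
            .addChild(new SketchNode("digit", 0, 1))
            .addChild(operator)
            .addChild(new SketchNode("digit", 2, 3))
            .addChild(new SketchNode("EOI", 3, 3));
    }

    public static void main(final String... args)
    {
        final String input = "1+5";
        // Print each direct child of the root with the text it matched
        for (final SketchNode child: sampleTree().getChildren())
            System.out.println(child.getLabel() + " -> '"
                + child.matchIn(input) + "'");
    }
}
```

Note that the matched text is not stored in the node itself; it is recovered from the input using the two indices, which is why the assertions below need access to the input buffer.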
Testing a parse tree therefore requires three things:
- a parser class;
- assertions for one single node;
- assertions for a whole tree.
All of this is done below... The code is only written once; after that you just have to write JSON files to test!
Assertions
The code for the two assertions is below; a few paragraphs after it explain how it all works.
Node assertion
Here is the code for the node assertion; it extends AssertJ's AbstractAssert:
public final class NodeAssert<V>
    extends AbstractAssert<NodeAssert<V>, Node<V>>
{
    private final InputBuffer buffer;
    NodeAssert(final Node<V> actual, final InputBuffer buffer)
    {   
        super(actual, NodeAssert.class);
        this.buffer = buffer;
    }   
    private NodeAssert<V> doHasLabel(final String expectedLabel)
    {   
        final String actualLabel = actual.getLabel();
        assertThat(actualLabel).overridingErrorMessage(
            "node's label is null! I didn't expect it to be" 
        ).isNotNull();
        assertThat(actualLabel).overridingErrorMessage(
            "node's label is not what was expected!\n"
            + "Expected: '%s'\nActual  : '%s'\n", 
            expectedLabel, actualLabel
        ).isEqualTo(expectedLabel);
        return this;
    }   
    NodeAssert<V> hasLabel(@Nonnull final Optional<String> label)
    {   
        return label.isPresent() ? doHasLabel(label.get()) : this;
    }   
    private NodeAssert<V> doHasMatch(final String expectedMatch)
    {   
        final String actualMatch = buffer.extract(
            actual.getStartIndex(), actual.getEndIndex());
        assertThat(actualMatch).overridingErrorMessage(
            "rule did not match what was expected!\n"
            + "Expected: -->%s<--\nActual  : -->%s<--\n",
            expectedMatch, actualMatch
        ).isEqualTo(expectedMatch);
        return this;
    }   
    NodeAssert<V> hasMatch(@Nonnull final Optional<String> match)
    {   
        return match.isPresent() ? doHasMatch(match.get()) : this;
    }   
}
Parse tree assertion
More Guava usage in there; heavy Guava usage, in fact, and this is also where Jackson is used:
public abstract class ParseTreeAssert<V>
{
    private static final ObjectMapper MAPPER = new ObjectMapper();
    private static final String RESOURCE_PREFIX = "/parseTrees/";
    public abstract void verify(
        @Nonnull final Optional<Node<V>> node);
    public static final class Builder<V>
    {
        private String label;
        private String match;
        private final List<Builder<V>> children 
            = Lists.newArrayList();
        Builder()
        {
        }
        @JsonProperty("label")
        public Builder<V> withLabel(@Nonnull final String label)
        {
            this.label = Preconditions.checkNotNull(label);
            return this;
        }
        @JsonProperty("children")
        public Builder<V> withChildren(
            @Nonnull final List<Builder<V>> list)
        {
            Preconditions.checkNotNull(list);
            children.addAll(list);
            return this;
        }
        @JsonProperty("match")
        public Builder<V> withMatch(@Nonnull final String match)
        {
            this.match = Preconditions.checkNotNull(match);
            return this;
        }
        public ParseTreeAssert<V> build(final InputBuffer buffer)
        {
            return new WithNode<V>(this, buffer);
        }
    }
    @ParametersAreNonnullByDefault
    private static final class WithNode<E>
        extends ParseTreeAssert<E>
    {
        private final InputBuffer buffer;
        private final Optional<String> label;
        private final Optional<String> match;
        private final List<ParseTreeAssert<E>> children;
        private WithNode(final Builder<E> builder,
            final InputBuffer buffer)
        {
            this.buffer = buffer;
            label = Optional.fromNullable(builder.label);
            match = Optional.fromNullable(builder.match);
            final ImmutableList.Builder<ParseTreeAssert<E>>
                listBuilder = ImmutableList.builder();
            for (final Builder<E> element: builder.children)
                listBuilder.add(element.build(buffer));
            children = listBuilder.build();
        }
        @Override
        public void verify(final Optional<Node<E>> node)
        {
            assertThat(node.isPresent()).overridingErrorMessage(
                "expected to have a node, but I didn't!"
            ).isTrue();
            final NodeAssert<E> nodeAssert
                = new NodeAssert<E>(node.get(), buffer);
            nodeAssert.hasLabel(label).hasMatch(match);
            verifyChildren(node.get());
        }
        private void verifyChildren(final Node<E> node)
        {
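            final List<Node<E>> nodeList
                = node.getChildren();
            // Iterate up to the longer of the two lists so that both
            // missing and unexpected children are detected
            final int size
                = Math.max(children.size(), nodeList.size());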
            ParseTreeAssert<E> childDescriptor;
            Optional<Node<E>> childNode;
            for (int i = 0; i < size; i++) {
                childDescriptor = Optional
                    .fromNullable(Iterables.get(children, i, null))
                    .or(new NoNode<E>(i));
                childNode = Optional
                    .fromNullable(Iterables.get(nodeList, i, null));
                childDescriptor.verify(childNode);
            }
        }
    }
    private static final class NoNode<E>
        extends ParseTreeAssert<E>
    {
        private final int index;
        private NoNode(final int index)
        {
            this.index = index;
        }
        @Override
        public void verify(@Nonnull final Optional<Node<E>> node)
        {
            fail("did not expect a node at index " + index);
        }
    }
    public static <E> ParseTreeAssert<E> read(
        final String resourceName, final InputBuffer buffer)
        throws IOException
    {
        final String path = RESOURCE_PREFIX + resourceName;
        final TypeReference<Builder<E>> typeRef
            = new TypeReference<Builder<E>>() {};
        final Closer closer = Closer.create();
        final InputStream in;
        final Builder<E> builder;
        try {
            in = closer.register(ParseTreeAssert.class
                .getResourceAsStream(path));
            if (in == null)
                throw new IOException("resource " + path 
                    + " not found");
            builder = MAPPER.readValue(in, typeRef);
            return builder.build(buffer);
        } finally {
            closer.close();
        }
    }
}
OK, so, how does it work?
First of all, you can see that the expected values passed to the core assertions in NodeAssert are wrapped in Optionals (reminds you of Java 8? Yes, Java 8 stole it from Guava). This lets you test only what you want to test for a given parse tree. For instance, you may want to test the match only and not the label.
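To make the "only check what is specified" idea concrete, here is a minimal sketch which does not depend on Guava or AssertJ. It uses java.util.Optional from Java 8 in place of Guava's Optional, and OptionalChecks is a hypothetical class written for this example, not part of the code above.

```java
import java.util.Optional;

// Hypothetical sketch: a check only runs when an expectation is present.
final class OptionalChecks
{
    private final String actualLabel;
    private final String actualMatch;

    OptionalChecks(final String actualLabel, final String actualMatch)
    {
        this.actualLabel = actualLabel;
        this.actualMatch = actualMatch;
    }

    // An empty Optional means "no expectation": nothing is verified
    OptionalChecks hasLabel(final Optional<String> expected)
    {
        expected.ifPresent(label -> check("label", label, actualLabel));
        return this;
    }

    OptionalChecks hasMatch(final Optional<String> expected)
    {
        expected.ifPresent(match -> check("match", match, actualMatch));
        return this;
    }

    // Fails with an AssertionError if the two values differ
    private static void check(final String what, final String expected,
        final String actual)
    {
        if (!expected.equals(actual))
            throw new AssertionError(what + " is not what was expected!\n"
                + "Expected: '" + expected + "'\n"
                + "Actual  : '" + actual + "'\n");
    }
}
```

With this sketch, new OptionalChecks("digit", "1").hasLabel(Optional.empty()).hasMatch(Optional.of("1")) verifies the match and silently skips the label.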
You will also have noticed that ParseTreeAssert is abstract and has two implementations, one where a node is expected (WithNode) and one where a node is not expected (NoNode); and that its .verify() method also takes an Optional as an argument. This makes it possible to detect the following scenarios:
- you expected to see n nodes but you only have m, where m is less than n: in this case, node.isPresent() will return false, and the failure is detected;
- the reverse: you expected to see fewer nodes than the tree actually contains. In this case the ParseTreeAssert implementation is a NoNode, which fails immediately, reporting that it did not expect a node to exist at this index.
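The two scenarios above boil down to walking both lists up to the size of the longer one and treating a missing element on either side as a failure. Here is a simplified sketch of that loop, under a few assumptions: it uses java.util.Optional instead of Guava's, it compares plain label strings rather than nodes, and it collects messages instead of failing at the first problem as the real code does. ChildPairing is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of the index-by-index comparison of children.
final class ChildPairing
{
    // Returns one message per index where expected and actual disagree
    static List<String> compare(final List<String> expected,
        final List<String> actual)
    {
        final List<String> problems = new ArrayList<>();
        // Go up to the longer list so that neither side is cut short
        final int size = Math.max(expected.size(), actual.size());

        for (int i = 0; i < size; i++) {
            final Optional<String> descriptor = get(expected, i);
            final Optional<String> node = get(actual, i);
            if (!descriptor.isPresent())
                // The role played by NoNode in the code above
                problems.add("did not expect a node at index " + i);
            else if (!node.isPresent())
                problems.add("expected a node at index " + i
                    + ", but there is none");
            else if (!descriptor.get().equals(node.get()))
                problems.add("index " + i + ": expected '"
                    + descriptor.get() + "', got '" + node.get() + "'");
        }
        return problems;
    }

    // Element at the given index, or empty if the list is too short
    private static Optional<String> get(final List<String> list,
        final int index)
    {
        return index < list.size()
            ? Optional.of(list.get(index))
            : Optional.empty();
    }
}
```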
Finally, the static .read() method in ParseTreeAssert is where the "real magic" happens: it reads your JSON file which contains the tree you actually expect!
The test classes
Two things are needed: a base parser for testers to extend, and a core test class to extend when you want to run actual tests.
The basic test parser to extend
This one is quite simple; there is a "trick" however:
@BuildParseTree
public abstract class TestParser
    extends BaseParser<Object>
{
    public abstract Rule mainRule();
}
The trick here is the @BuildParseTree annotation; OK, its name is self-explanatory, but if you do not annotate a parser class with it, the parse tree will not be built and you won't be able to test it!
The core test class...
... Is abstract. Note that there is yet another trick in there:
@Test
public abstract class ParseTreeTest
{
    private final Node<Object> tree;
    private final ParseTreeAssert<Object> treeAssert;
    protected ParseTreeTest(final Class<? extends TestParser> c,
        final String resourceName, final String input)
        throws IOException
    {
        final TestParser parser = Parboiled.createParser(c);
        final ParseRunner<Object> runner
            = new ReportingParseRunner<Object>(parser.mainRule());
        final ParsingResult<Object> result = runner.run(input);
        tree = result.parseTreeRoot;
        treeAssert = ParseTreeAssert.read(resourceName, 
            result.inputBuffer);
    }
    @Test
    public final void treeIsWhatIsExpected()
    {
        treeAssert.verify(Optional.of(tree));
    }
}
The trick is the @Test annotation at the class level; if you do not do that, subclasses will not run the @Test methods inherited from the abstract class... And the only test method is in the abstract class (and what is more, it is final).
You will note the use of ParseTreeAssert.read() which will read from the JSON file (a resource in the classpath) and deserialize, all in one.
And now, finally, how to use it!
And here is what becomes of the test mentioned in the introduction. First, the test class:
public final class SplitParseTreeTest
    extends ParseTreeTest
{
    static class SplitParser
        extends TestParser
    {
        final Primitives primitives 
            = Parboiled.createParser(Primitives.class);
        public Rule clause() {
            return sequence(
                digit(),
                primitives.operator(),
                primitives.digit(),
                EOI
            );
        }
        @Override
        @DontLabel
        public Rule mainRule()
        {
            return clause();
        }
    }
    @BuildParseTree
    static class Primitives
        extends BaseParser<Object>
    {
        public Rule operator()
        {
            return firstOf('+', '-');
        }
    }
    public SplitParseTreeTest()
        throws IOException
    {
        super(SplitParser.class, "split.json", "1+5");
    }
}
The parser is coded within the test class here; you could create it outside of it, of course. The constructor of the abstract class is called with all necessary arguments: the parser class, the JSON file to read and the input to test. As to the JSON file, here it is:
{
    "label": "clause",
    "match": "1+5",
    "children": [
        {
            "label": "digit",
            "match": "1"
        },
        {
            "label": "operator",
            "match": "+",
            "children": [
                {
                    "label": "'+'",
                    "match": "+"
                }
            ]
        },
        {
            "label": "digit",
            "match": "5"
        },
        {
            "label": "EOI"
        }
    ]
}
And that's it! There are nearly 50 other tests to convert from the "string based testing" into that, but all I have to do now is write the JSON files and very simple test classes!
But this is not over yet...
As I said, this is the beginning of my effort; more can be done and will be done. For instance, handling parser inputs differently, and adding more assertions; the latter is quite simple:
- add the field to ParseTreeAssert.Builder and ParseTreeAssert.WithNode,
- update NodeAssert,
- update the relevant JSON files.
Much easier than modifying strings!
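To give an idea of what these three steps look like, here is a self-contained sketch in which a new expectation is added next to the label and the match. Everything in it is an assumption made for illustration: DescriptorSketch, withChildCount and the childCount field are hypothetical and do not exist in the code above, java.util.Optional replaces Guava's, and the check returns a message instead of using AssertJ.

```java
import java.util.Optional;

// Hypothetical sketch; none of these names exist in the code above.
final class DescriptorSketch
{
    private final Optional<String> label;
    private final Optional<String> match;
    // The newly added expectation (an assumption made for this example)
    private final Optional<Integer> childCount;

    private DescriptorSketch(final Builder builder)
    {
        // Fields never set on the builder become "no expectation"
        label = Optional.ofNullable(builder.label);
        match = Optional.ofNullable(builder.match);
        childCount = Optional.ofNullable(builder.childCount);
    }

    // Describes the first expectation which does not hold, if any
    Optional<String> firstMismatch(final String actualLabel,
        final String actualMatch, final int actualChildCount)
    {
        if (label.isPresent() && !label.get().equals(actualLabel))
            return Optional.of("label: expected " + label.get());
        if (match.isPresent() && !match.get().equals(actualMatch))
            return Optional.of("match: expected " + match.get());
        if (childCount.isPresent() && childCount.get() != actualChildCount)
            return Optional.of("childCount: expected " + childCount.get());
        return Optional.empty();
    }

    static final class Builder
    {
        private String label;
        private String match;
        private Integer childCount;

        Builder withLabel(final String label)
        {
            this.label = label;
            return this;
        }

        Builder withMatch(final String match)
        {
            this.match = match;
            return this;
        }

        // The setter for the new field; with Jackson, this is the method
        // which would carry the @JsonProperty annotation
        Builder withChildCount(final int childCount)
        {
            this.childCount = childCount;
            return this;
        }

        DescriptorSketch build()
        {
            return new DescriptorSketch(this);
        }
    }
}
```

Since every expectation is optional, existing JSON files which do not mention the new field would keep working unchanged; only the files which need the new check have to be updated.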
