portrait

End of Line blog

Thoughts on software development, by Adam Ruka

Graal Truffle tutorial part 1 – setup, Nodes, CallTarget

This article is part of a tutorial on GraalVM's Truffle language implementation framework.


Now that we know what Truffle (and Graal) is, let’s get started with a simple example that shows the basic Truffle features: Node, RootNode, and CallTarget.

To make the concepts in this series more concrete, I’ll be showing code of a Truffle implementation of a language I call EasyScript, which is a very simplified subset of JavaScript. We’ll start with basics expressions in this part, and add more features to the language as the series progresses.

Let’s begin by setting up GraalVM.

GraalVM setup

We need a GraalVM installation on our local machine. The free Community Edition works fine if you don’t have the paid Enterprise Edition. You can download it from here: https://github.com/graalvm/graalvm-ce-builds/releases (we need one of the binaries whose name starts with graalvm-ce). The example code I’ll be showing in these blog posts uses Java 11 features, so make sure to download a version for Java 11.

Once you’ve downloaded the correct archive for your operating system and extracted it somewhere on your machine, you need to set the JAVA_HOME environment variable to point to the directory containing the uncompressed contents:

$ export JAVA_HOME=/path/to/extracted/archive

You can verify the installation works by executing the java command using JAVA_HOME:

$ $JAVA_HOME/bin/java -version

openjdk version "11.0.8" 2020-07-14
OpenJDK Runtime Environment GraalVM CE 20.2.0 (build 11.0.8+10-jvmci-20.2-b03)
OpenJDK 64-Bit Server VM GraalVM CE 20.2.0 (build 11.0.8+10-jvmci-20.2-b03, mixed mode, sharing)

AST

Now that we have the necessary tools installed, let’s talk about the basic Truffle concepts.

A language’s compiler is usually visualized as a pipeline, where each stage’s output is used as the input to the next stage:

To simplify the process of implementing languages, Truffle replaces everything after stage 3. It asks you to create an interpreter of the abstract syntax tree (either annotated, or regular – it’s up to the language author to decide, Truffle doesn’t care about that distinction). Because of this, the concept of the AST is central to Truffle.

If you’ve never seen an abstract syntax tree before, here’s an example. Let’s say we have the following JavaScritpt code:

function factorial(n) {
    if (n < 3)
        return n;
    return n * factorial(n - 1);
}

The AST for this code will look something like:

FunctionNode(name: "factorial", arguments: ["n"])
    StatementBlockNode
        IfStatementNode
            LessThanNode
                ReferenceNode(name: "n")
                IntLiteralNode(value: 3)
            ReturnStatementNode
                ReferenceNode(name: "n")
        ReturnStatementNode
            MultiplyNode
                ReferenceNode(name: "n")
                FunctionCallNode
                    ReferenceNode(name: "factorial")
                    SubtractNode
                        ReferenceNode(name: "n")
                        IntLiteralNode(value: 1)

As we can see, there are different kinds of nodes, with different attributes (for example, FunctionNode has two attributes, name and arguments; IntLiteralNode has one attribute, value). As with all trees, the nodes also have children. Some node types have a set number of children (for example, both LessThanNode and SubtractNode always have 2 children, as they’re binary operations), while others have a variable number of children (like StatementBlockNode, that can have arbitrary many children, each being a statement in the block).

A Truffle language implementation consists of an interpreter of such an AST. In this series, we’ll explore the various APIs Truffle offers for representing and executing that interpreter.

Truffle Node

A vertex in the AST is represented in Truffle by the Node class from the com.oracle.truffle.api.nodes package. It’s an abstract class that you’re supposed to extend. Usually, each node kind like we’ve seen above will become a separate subclass of Node in the class hierarchy (so, there’ll be a subclass for IfStatementNode, a separate one for LessThanNode, etc.).

When you subclass Node, you might be surprised that the class doesn’t have any abstract methods. Since these nodes are meant to be interpreted, you might expect some abstract interpret method that you have to override. But it doesn’t work that way. Instead, Truffle expects you to define that interpretation method yourself (the reason it does that are type specializations, which are specific to the language being implemented; we will cover specializations in detail in later parts of the tutorial). For that reason, it’s very common to define your own abstract superclass for all node kinds of your language.

We’ll do that for EasyScript here. In this first iteration, our language will be extremely simple: it will only allow the addition of integer literals (so, expressions like 1 + 2 + 3). Because of that, our superclass will define the interpret method to return int:

import com.oracle.truffle.api.frame.VirtualFrame;
import com.oracle.truffle.api.nodes.Node;

public abstract class EasyScriptNode extends Node {
    public abstract int executeInt(VirtualFrame frame);
}

Truffle places pretty strict requirements on the “interpret” method. It has to start with execute, and take a frame as the only argument (a VirtualFrame corresponds to an activation record on the call stack, and is used for things like local variables. We’ll talk about it in more detail in later parts of the tutorial).

Our first node is the integer literal node, and it’s very simple:

public final class IntLiteralNode extends EasyScriptNode {
    private final int value;

    public IntLiteralNode(int value) {
        this.value = value;
    }

    @Override
    public int executeInt(VirtualFrame frame) {
        return this.value;
    }
}

And the second one is the addition node:

public final class AdditionNode extends EasyScriptNode {
    @SuppressWarnings("FieldMayBeFinal")
    @Child
    private EasyScriptNode leftNode, rightNode;

    public AdditionNode(EasyScriptNode leftNode, EasyScriptNode rightNode) {
        this.leftNode = leftNode;
        this.rightNode = rightNode;
    }

    @Override
    public int executeInt(VirtualFrame frame) {
        int leftValue = this.leftNode.executeInt(frame);
        int rightValue = this.rightNode.executeInt(frame);
        return leftValue + rightValue;
    }
}

The most interesting part here is the @Child annotation. That’s how you tell Truffle those attributes are actually subnodes of the AST. Interestingly, they cannot be final, as Truffle needs the capability to rewrite parts of the AST as it gathers profiling information about your code, hence the @SuppressWarnings("FieldMayBeFinal"). However, during partial evaluation, they will be treated as effectively final, thanks to the @Child annotation.

Entrypoint – RootNode

So, are we done – can we write a unit test that interprets a simple expression? Not so fast. While we have nodes to represent parts of the AST, we’re missing a top-level node that acts as the entrypoint for the interpreter. Those kinds of nodes are represented in Truffle as subclasses of RootNode (which is itself a subclass of Node), from the same package com.oracle.truffle.api.nodes.

A RootNode represents a callable AST. Most commonly, those are functions or methods, but they also represent the top-level language entrypoints (for example, in Java, you can execute a class, which will invoke its static main method; in NodeJs, you can execute the given code with the -e switch, like node -e 'console.log("Hello, world!");'; etc.).

Interestingly, RootNode does have an abstract execute method that you have to override. In our case, it will be easy to implement – we will just delegate straight to an EasyScriptNode that we get through the constructor:

import com.oracle.truffle.api.nodes.RootNode;

public final class EasyScriptRootNode extends RootNode {
    @SuppressWarnings("FieldMayBeFinal")
    @Child
    private EasyScriptNode exprNode;

    public EasyScriptRootNode(EasyScriptNode exprNode) {
        super(null);

        this.exprNode = exprNode;
    }

    @Override
    public Object execute(VirtualFrame frame) {
        return this.exprNode.executeInt(frame);
    }
}

Invokable element – CallTarget

So, now we have to be done, right? Now we can finally write our first unit test? Nope, there’s one more Truffle API we need to learn first, and that is the concept of CallTargets.

CallTargets are a layer of indirection that wrap RootNodes. Unlike nodes, this is an interface whose instances are created with static factory methods, not an abstract class that you’re supposed to extend. They are used for:

You create CallTargets by calling the createCallTarget() static factory method of the TruffleRuntime interface, which is a singleton you obtain by calling getRuntime() on the Truffle class, and passing it a RootNode instance.

With this information, we are finally ready to write our unit test:

import com.oracle.truffle.api.CallTarget;
import com.oracle.truffle.api.Truffle;
import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class ExecuteNodesTest {
    @Test
    public void adds_12_and_34_correctly() {
        EasyScriptNode exprNode = new AdditionNode(
            new IntLiteralNode(12),
            new IntLiteralNode(34));
        var rootNode = new EasyScriptRootNode(exprNode);
        CallTarget callTarget = Truffle.getRuntime().createCallTarget(rootNode);

        var result = callTarget.call();

        assertEquals(46, result);
    }
}

Phew! It was a lot of work, but we finally managed to add 12 and 34 together using Truffle!

All of the code from the article is available on GitHub.

Next article

In part 2 of the series, we’ll introduce one of the most important concepts in Truffle – specializations.


This article is part of a tutorial on GraalVM's Truffle language implementation framework.