Writing an Interpreter from Scratch

Abou ZuhayrAbou Zuhayr
6 min read

Interpreters are the backbone of many programming languages, converting code into actions line by line. In this article, we'll build a simple interpreter from scratch using Python.

If you are new to this series, you can find the previous articles here.

The code to this series is available here

We’ll explore how the interpreter ties together with a lexer and a parser, transforming raw code into executable instructions. By the end, you'll understand how interpreters work and have your own fully functional interpreter capable of handling arithmetic operations, variables, assignments, and print statements.

What is an Interpreter, Anyway?

Before we start writing code, let’s get on the same page about what an interpreter is. An interpreter reads and executes code written in a programming language. It goes through each instruction, interprets it, and performs the action directly—think of it as the live translator for your code!

To get this working, we also need a lexer and a parser. Here’s how they fit into the process:

  1. Lexer: It reads the raw source code and breaks it into tokens—like words in a sentence. For example, in the line x = 5, the lexer identifies tokens: x, =, and 5.

  2. Parser: It takes these tokens and structures them into an Abstract Syntax Tree (AST), which is like the grammatical structure of a sentence. It groups and organizes tokens based on the rules of the language, turning x = 5 into something that shows the assignment relationship.

  3. Interpreter: Our focus today! The interpreter takes the AST and evaluates it, executing the instructions it represents.

Sound good so far? Perfect—let’s start writing the interpreter step by step.

Step 1: Creating the Interpreter Class

We need a place to manage our variables and a method to evaluate the nodes in the AST. We'll create a class called Interpreter:

class Interpreter:
    def __init__(self):
        self.variables = {}
  • self.variables: This is a dictionary where we'll store variables as we encounter them in the code. If the code says x = 10, we'll keep x and its value 10 here.

Step 2: Evaluating Numbers

Let’s add a method called eval that will process different types of nodes in our AST. We'll start simple—handling numbers. If the node is a number, we return its value.

def eval(self, node):
    if isinstance(node, Number):
        return node.value

This is straightforward. The Number node just wraps a value like 5 or 10, and our job is to unwrap it.

Step 3: Adding Support for Basic Operations

Now, let’s make things interesting! We want our interpreter to handle basic math: addition, subtraction, multiplication, and division. In our AST, these are represented as BinOp nodes (short for Binary Operation).

elif isinstance(node, BinOp):
    left_val = self.eval(node.left)
    right_val = self.eval(node.right)
    if node.op == '+':
        return left_val + right_val
    elif node.op == '-':
        return left_val - right_val
    elif node.op == '*':
        return left_val * right_val
    elif node.op == '/':
        return left_val / right_val
  • We check if the node is a BinOp.

  • We recursively call eval on the left and right parts (node.left and node.right) to get their values.

  • We perform the operation based on node.op, which could be +, -, *, or /.

This allows us to evaluate expressions like 3 + 5 or 7 * (2 - 1).

Step 4: Handling Variables

Next, let’s make our interpreter capable of understanding variables. In our language, a Variable node will hold a variable's name. We need to check if the variable is defined and retrieve its value:

elif isinstance(node, Variable):
    if node.name in self.variables:
        return self.variables[node.name]
    else:
        raise NameError(f"Variable '{node.name}' is not defined")
  • If the variable exists in our dictionary (self.variables), we return its value.

  • If it doesn’t exist, we raise an error, letting the user know the variable wasn’t found.

Step 5: Adding Assignment

Variables are useful, but they need to be assigned values. Let’s handle assignments with an Assign node. This is when the code says something like x = 5:

elif isinstance(node, Assign):
    value = self.eval(node.value)
    self.variables[node.name] = value
  • We evaluate the right side of the assignment (node.value).

  • We then store the result in self.variables under the variable’s name.

Now, if our code says y = 10, we can remember that y equals 10.

Step 6: Implementing Print Statements

We also want our language to be able to print values, right? Let’s handle the Print node:

elif isinstance(node, Print):
    value = self.eval(node.expression)
    print(value)
  • The interpreter evaluates the expression inside the Print node and displays its value.

  • Now, when the code says print(x), we can print whatever value x holds.

Step 7: Dealing with Unknown Node Types

Finally, if we encounter a node type that our interpreter doesn’t know about, it should raise an error. This helps us catch mistakes or unsupported operations.

else:
    raise TypeError("Unknown node type")

Putting It All Together

Here’s the complete Interpreter class so far:

class Interpreter:
    def __init__(self):
        self.variables = {}

    def eval(self, node):
        if isinstance(node, Number):
            return node.value
        elif isinstance(node, BinOp):
            left_val = self.eval(node.left)
            right_val = self.eval(node.right)
            if node.op == '+':
                return left_val + right_val
            elif node.op == '-':
                return left_val - right_val
            elif node.op == '*':
                return left_val * right_val
            elif node.op == '/':
                return left_val / right_val
        elif isinstance(node, Variable):
            if node.name in self.variables:
                return self.variables[node.name]
            else:
                raise NameError(f"Variable '{node.name}' is not defined")
        elif isinstance(node, Assign):
            value = self.eval(node.value)
            self.variables[node.name] = value
        elif isinstance(node, Print):
            value = self.eval(node.expression)
            print(value)
        else:
            raise TypeError("Unknown node type")

Step 8: Testing It All in the Main Function

Now, we need to tie everything together. In the main function, we read the code, pass it through the lexer and parser, and then interpret it:

def main():
    # Check if the file path is provided as a command-line argument
    if len(sys.argv) < 2:
        print("Usage: python interpreter.py <source_file>")
        sys.exit(1)

    # Get the file path from command-line arguments
    source_file = sys.argv[1]

    # Read the source code from the file
    with open(source_file, 'r') as file:
        source_code = file.read()

    # Lexical analysis
    tokens = lexer(source_code)

    # Parsing
    parser = Parser(tokens)
    ast = parser.parse()

    # Interpreting
    interpreter = Interpreter()
    for stmt in ast:
        interpreter.eval(stmt)

if __name__ == "__main__":
    main()

This function:

  1. Reads the source code from a file.

  2. Uses the lexer to tokenize the code and the parser to build the AST.

  3. Evaluates each statement in the AST with our Interpreter.

And there you have it! You’ve built an interpreter step by step. This code now runs simple arithmetic operations, variables, assignments, and prints.

Conclusion

Congratulations! You’ve successfully built a basic interpreter capable of evaluating expressions, managing variables, and executing print statements. This foundational knowledge opens the door to more advanced features, such as adding conditionals, loops, or even creating your own full-fledged programming language. As we move ahead in this series, we’ll try creating more complex features and add it to our compiler.

Till then, happy coding!


P.S. Got stuck or have questions? Drop a comment below or reach out—I'm all ears!

4
Subscribe to my newsletter

Read articles from Abou Zuhayr directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Abou Zuhayr
Abou Zuhayr

Hey, I’m Abou Zuhayr, Android developer by day, wannabe rockstar by night. With 6 years of Android wizardry under my belt and currently working at Gather.ai, I've successfully convinced multiple devices to do my bidding while pretending I'm not secretly just turning them off and on again. I’m also deeply embedded (pun intended) in systems that make you question if they’re alive. When I'm not fixing bugs that aren’t my fault (I swear!), I’m serenading my code with guitar riffs or drumming away the frustration of yet another NullPointerException. Oh, and I write sometimes – mostly about how Android development feels like extreme sport mixed with jazz improvisation. Looking for new challenges in tech that don’t involve my devices plotting against me!