Writing an Interpreter from Scratch
Interpreters are the backbone of many programming languages, converting code into actions line by line. In this article, we'll build a simple interpreter from scratch using Python.
If you are new to this series, you can find the previous articles here.
The code to this series is available here
We’ll explore how the interpreter ties together with a lexer and a parser, transforming raw code into executable instructions. By the end, you'll understand how interpreters work and have your own fully functional interpreter capable of handling arithmetic operations, variables, assignments, and print statements.
What is an Interpreter, Anyway?
Before we start writing code, let’s get on the same page about what an interpreter is. An interpreter reads and executes code written in a programming language. It goes through each instruction, interprets it, and performs the action directly—think of it as the live translator for your code!
To get this working, we also need a lexer and a parser. Here’s how they fit into the process:
Lexer: It reads the raw source code and breaks it into tokens—like words in a sentence. For example, in the line
x = 5
, the lexer identifies tokens:x
,=
, and5
.Parser: It takes these tokens and structures them into an Abstract Syntax Tree (AST), which is like the grammatical structure of a sentence. It groups and organizes tokens based on the rules of the language, turning
x = 5
into something that shows the assignment relationship.Interpreter: Our focus today! The interpreter takes the AST and evaluates it, executing the instructions it represents.
Sound good so far? Perfect—let’s start writing the interpreter step by step.
Step 1: Creating the Interpreter Class
We need a place to manage our variables and a method to evaluate the nodes in the AST. We'll create a class called Interpreter
:
class Interpreter:
def __init__(self):
self.variables = {}
self.variables
: This is a dictionary where we'll store variables as we encounter them in the code. If the code saysx = 10
, we'll keepx
and its value10
here.
Step 2: Evaluating Numbers
Let’s add a method called eval
that will process different types of nodes in our AST. We'll start simple—handling numbers. If the node is a number, we return its value.
def eval(self, node):
if isinstance(node, Number):
return node.value
This is straightforward. The Number
node just wraps a value like 5
or 10
, and our job is to unwrap it.
Step 3: Adding Support for Basic Operations
Now, let’s make things interesting! We want our interpreter to handle basic math: addition, subtraction, multiplication, and division. In our AST, these are represented as BinOp
nodes (short for Binary Operation).
elif isinstance(node, BinOp):
left_val = self.eval(node.left)
right_val = self.eval(node.right)
if node.op == '+':
return left_val + right_val
elif node.op == '-':
return left_val - right_val
elif node.op == '*':
return left_val * right_val
elif node.op == '/':
return left_val / right_val
We check if the node is a
BinOp
.We recursively call
eval
on the left and right parts (node.left
andnode.right
) to get their values.We perform the operation based on
node.op
, which could be+
,-
,*
, or/
.
This allows us to evaluate expressions like 3 + 5
or 7 * (2 - 1)
.
Step 4: Handling Variables
Next, let’s make our interpreter capable of understanding variables. In our language, a Variable
node will hold a variable's name. We need to check if the variable is defined and retrieve its value:
elif isinstance(node, Variable):
if node.name in self.variables:
return self.variables[node.name]
else:
raise NameError(f"Variable '{node.name}' is not defined")
If the variable exists in our dictionary (
self.variables
), we return its value.If it doesn’t exist, we raise an error, letting the user know the variable wasn’t found.
Step 5: Adding Assignment
Variables are useful, but they need to be assigned values. Let’s handle assignments with an Assign
node. This is when the code says something like x = 5
:
elif isinstance(node, Assign):
value = self.eval(node.value)
self.variables[node.name] = value
We evaluate the right side of the assignment (
node.value
).We then store the result in
self.variables
under the variable’s name.
Now, if our code says y = 10
, we can remember that y
equals 10
.
Step 6: Implementing Print Statements
We also want our language to be able to print values, right? Let’s handle the Print
node:
elif isinstance(node, Print):
value = self.eval(node.expression)
print(value)
The interpreter evaluates the expression inside the
Print
node and displays its value.Now, when the code says
print(x)
, we can print whatever valuex
holds.
Step 7: Dealing with Unknown Node Types
Finally, if we encounter a node type that our interpreter doesn’t know about, it should raise an error. This helps us catch mistakes or unsupported operations.
else:
raise TypeError("Unknown node type")
Putting It All Together
Here’s the complete Interpreter
class so far:
class Interpreter:
def __init__(self):
self.variables = {}
def eval(self, node):
if isinstance(node, Number):
return node.value
elif isinstance(node, BinOp):
left_val = self.eval(node.left)
right_val = self.eval(node.right)
if node.op == '+':
return left_val + right_val
elif node.op == '-':
return left_val - right_val
elif node.op == '*':
return left_val * right_val
elif node.op == '/':
return left_val / right_val
elif isinstance(node, Variable):
if node.name in self.variables:
return self.variables[node.name]
else:
raise NameError(f"Variable '{node.name}' is not defined")
elif isinstance(node, Assign):
value = self.eval(node.value)
self.variables[node.name] = value
elif isinstance(node, Print):
value = self.eval(node.expression)
print(value)
else:
raise TypeError("Unknown node type")
Step 8: Testing It All in the Main Function
Now, we need to tie everything together. In the main
function, we read the code, pass it through the lexer and parser, and then interpret it:
def main():
# Check if the file path is provided as a command-line argument
if len(sys.argv) < 2:
print("Usage: python interpreter.py <source_file>")
sys.exit(1)
# Get the file path from command-line arguments
source_file = sys.argv[1]
# Read the source code from the file
with open(source_file, 'r') as file:
source_code = file.read()
# Lexical analysis
tokens = lexer(source_code)
# Parsing
parser = Parser(tokens)
ast = parser.parse()
# Interpreting
interpreter = Interpreter()
for stmt in ast:
interpreter.eval(stmt)
if __name__ == "__main__":
main()
This function:
Reads the source code from a file.
Uses the lexer to tokenize the code and the parser to build the AST.
Evaluates each statement in the AST with our
Interpreter
.
And there you have it! You’ve built an interpreter step by step. This code now runs simple arithmetic operations, variables, assignments, and prints.
Conclusion
Congratulations! You’ve successfully built a basic interpreter capable of evaluating expressions, managing variables, and executing print statements. This foundational knowledge opens the door to more advanced features, such as adding conditionals, loops, or even creating your own full-fledged programming language. As we move ahead in this series, we’ll try creating more complex features and add it to our compiler.
Till then, happy coding!
P.S. Got stuck or have questions? Drop a comment below or reach out—I'm all ears!
Subscribe to my newsletter
Read articles from Abou Zuhayr directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by
Abou Zuhayr
Abou Zuhayr
Hey, I’m Abou Zuhayr, Android developer by day, wannabe rockstar by night. With 6 years of Android wizardry under my belt and currently working at Gather.ai, I've successfully convinced multiple devices to do my bidding while pretending I'm not secretly just turning them off and on again. I’m also deeply embedded (pun intended) in systems that make you question if they’re alive. When I'm not fixing bugs that aren’t my fault (I swear!), I’m serenading my code with guitar riffs or drumming away the frustration of yet another NullPointerException. Oh, and I write sometimes – mostly about how Android development feels like extreme sport mixed with jazz improvisation. Looking for new challenges in tech that don’t involve my devices plotting against me!