Building an Interpreter in Go - Chapter 2: Pratt Parsing
Overview
In the previous chapter, we built a basic lexer that could break down our source code into individual tokens. This provided the foundation for the next step - parsing those tokens into an Abstract Syntax Tree (AST) that represents the structure of the program.
In this chapter, we dive into Pratt Parsing, an elegant parsing technique that allows us to handle complex expressions in a clean and extensible way. We'll learn how to parse let
statements, return
statements, if-else
expressions, function definitions, and function calls using this method.
Key Concepts Learned
Pratt Parsing
Pratt Parsing is a top-down parsing technique developed by Vaughan Pratt. It allows us to parse expressions of arbitrary complexity by associating precedence and binding power to our language's operators.
The core idea is to have separate parsing functions for each precedence level, allowing the parser to "climb up and down" the expression tree as it encounters different operators.
Parsing let Statements
Let's start with the basic let
statement, which assigns a value to a variable:
let x = 5;
let y = x + 3;
We'll need to parse the variable name, the assignment operator, and the expression on the right-hand side. Pratt Parsing makes this relatively straightforward.
Parsing Return Statements
Next, we'll tackle return
statements, which allow us to exit a function and provide a return value:
let addTwo = fn(x) { return x + 2; };
Parsing a return
statement involves identifying the keyword, parsing the expression being returned, and creating the appropriate AST node.
Parsing If-Else Expressions
We'll also implement parsing for if-else
expressions, which provide conditional branching in our language:
let result = if (x > 10) {
return "greater";
} else {
return "less";
};
This requires parsing the if
keyword, the condition expression, the consequent block, the else
keyword (if present), and the alternative block.
Parsing Function Definitions
Next, we'll handle function definitions, which allow us to create reusable pieces of code:
let addTwo = fn(x) { return x + 2; };
Parsing a function definition involves identifying the fn
keyword, parsing the parameter list, and parsing the function body.
Parsing Function Calls
Finally, we'll implement parsing for function calls, which allow us to execute the defined functions:
let result = addTwo(5);
Parsing a function call requires identifying the function name, parsing the argument list, and creating the appropriate AST node.
Code Implementation Highlights
Parsing the Expression Statement and the Prefix and Infix Expressions does all the heavy lifting so I am including those as snippets here.
var precedences = map[token.TokenType]int{
token.EQ: EQUALS,
token.NOT_EQ: EQUALS,
token.LT: LESSGREATER,
token.GT: LESSGREATER,
token.PLUS: SUM,
token.MINUS: SUM,
token.SLASH: PRODUCT,
token.ASTERISK: PRODUCT,
token.LPAREN: CALL,
}
func (p *Parser) parseExpressionStatement() *ast.ExpressionStatement {
stmt := &ast.ExpressionStatement{Token: p.curToken}
stmt.Expression = p.parseExpression(LOWEST)
if p.peekTokenIs(token.SEMICOLON) {
p.nextToken()
}
return stmt
}
func (p *Parser) parseExpression(precedence int) ast.Expression {
prefix := p.prefixParseFns[p.curToken.Type]
if prefix == nil {
p.noPrefixParseFnError(p.curToken.Type)
return nil
}
leftExp := prefix()
for !p.peekTokenIs(token.SEMICOLON) && precedence < p.peekPrecedence() {
infix := p.infixParseFns[p.peekToken.Type]
if infix == nil {
return leftExp
}
p.nextToken()
leftExp = infix(leftExp)
}
return leftExp
}
func (p *Parser) parsePrefixExpression() ast.Expression {
pexp := &ast.PrefixExpression{
Token: p.curToken,
Operator: p.curToken.Literal,
}
p.nextToken()
pexp.Right = p.parseExpression(PREFIX)
return pexp
}
func (p *Parser) parseInfixExpression(left ast.Expression) ast.Expression {
expression := &ast.InfixExpression{
Token: p.curToken,
Operator: p.curToken.Literal,
Left: left,
}
precedence := p.curPrecedence()
p.nextToken()
expression.Right = p.parseExpression(precedence)
return expression
}
Results
As can be seen below, the parser is able to assign correct precedence to operators and is able to identify synctatical errors as well (notice how I got a cute monkey face, when I mistyped “let” as “ley”).
Challenges Faced
Understanding how the Pratt Parser Works was the biggest challenge in this stage. It would require multiple revisits and implementations for a greater clarity.
Go Language Features Learned
(Any new Go features used for the function definition and call parsing)
Project Progress
[✓] Lexer Implementation
[✓] Parsing let Statements
[✓] Parsing Return Statements
[✓] Parsing If-Else Expressions
[✓] Parsing Function Definitions
[✓] Parsing Function Calls
Parsing Array Literals
Parsing Hash Literals
Code Repository
github: https://github.com/AkshayContributes/interpreter
Resources
Next Steps
The next step would be adding Evaluative Capabilities to our Monkey Language
Personal Notes
Feels really good to be able to complete half of the book in a couple of weeks despite the time constraints at work.
Last Updated: November 5, 2024
Progress: 50% through the book
Next Update Expected: November 10, 2024
Subscribe to my newsletter
Read articles from Akshay Thakur directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by
Akshay Thakur
Akshay Thakur
Developer from India.