Learn How to Navigate Code Structures and Extract Details Using Tree-sitter


If you’ve ever wished you could query code like data, Tree-sitter might be your new best friend.
Whether you're building a code analysis tool, editor extension, or just exploring syntax trees—this guide will help you understand Tree-sitter Queries from scratch using real Python examples. Let’s dive in!
What Is Tree-sitter?
Tree-sitter is a parser generator and runtime for building fast, accurate parsers for programming languages. It's used in editors such as Neovim and Zed, and in VS Code extensions, for:
Syntax highlighting
Structural editing
Code navigation
Language-aware tools
Tree-sitter Queries
Tree-sitter parses source code into a syntax tree. A Tree-sitter query is a way to search through that syntax tree to find specific code patterns. Think of it like a super-powered search tool that not only looks for words but also understands the structure of the code.
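To make the difference between text search and structural search concrete, here is a minimal sketch. It does not use Tree-sitter itself: as a stand-in it uses Python's built-in `ast` module, which also parses code into a tree. The sample source and variable names are made up for illustration.

```python
import ast

# Illustrative sample, not the article's full example.
source = '''
from rest_framework.response import Response

def get(request):
    note = "Response goes here"
    return Response({"ok": True})
'''

# Plain text search counts every occurrence of the word,
# including the import and the text inside the string literal.
text_hits = source.count("Response")

# Structural search keeps only call expressions whose callee
# is the plain name `Response`.
tree = ast.parse(source)
call_lines = [
    node.lineno
    for node in ast.walk(tree)
    if isinstance(node, ast.Call)
    and isinstance(node.func, ast.Name)
    and node.func.id == "Response"
]

print(text_hits)   # 3
print(call_lines)  # [6]
```

The text search reports three hits, while the structural search reports only the one real call. A Tree-sitter query gives you the same kind of precision, expressed as a pattern instead of Python conditions.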
Let’s say we’re analyzing the following Python code using Tree-sitter:
from rest_framework.views import APIView
from rest_framework.response import Response
from rest_framework import generics, serializers
from django.contrib.auth.models import User

class UserView(APIView):
    def get(self, request):
        user_id = request.GET.get('id')
        if user_id:
            return Response({"user_id": user_id})
        return Response({"error": "User ID missing"}, status=400)

    def post(self, request):
        data = request.data
        username = data.get('username')
        return Response({"username": username})

class UserSerializer(serializers.ModelSerializer):
    class Meta:
        model = User
        fields = ['id', 'username', 'email']

class UserDetailUpdateView(generics.RetrieveUpdateAPIView):
    queryset = User.objects.all()
    serializer_class = UserSerializer
    lookup_field = 'pk'
We will go through various Tree-sitter queries to match parts of this code, so let's begin.
You can practice writing queries in the Tree-sitter playground. We will be using it here for the demo.
Every Tree-sitter query is composed of nodes. Let's go through some of the node types first.
Node Types
Every piece of code is represented as a node in the syntax tree.
Some Examples (Python):
identifier
call
string
assignment
parameters
argument_list
return_statement
attribute
if_statement
identifier
An identifier is a name that the programmer gives to things like variables, functions, classes, or parameters.
(identifier) @var-name
string
In Tree-sitter, a (string) node represents a string literal in the source code — i.e., any value enclosed in quotation marks, like "hello" or 'world'.
(string) @string-val
call
In Tree-sitter, a (call) node represents a function call — when a function is being invoked/executed in the code.
(call
  function: (identifier) @called-func)
assignment
In Tree-sitter, an (assignment) node represents an assignment statement, where a value is stored in a variable.
(assignment
  left: (identifier) @left-var
  right: (_) @right-value)
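To show what the two captures of this query would hold, here is a rough analog written with Python's built-in `ast` module rather than Tree-sitter. It is a sketch under that assumption: `ast.Assign` and `ast.Name` play the roles of `assignment` and `identifier`, and the sample source is made up.

```python
import ast

source = (
    "data = request.data\n"
    "username = data.get('username')\n"
    "self.count = 0\n"  # left side is an attribute, not a plain name
)

captures = []
for node in ast.walk(ast.parse(source)):
    # Rough analog of `(assignment left: (identifier) ...)`:
    # an assignment with exactly one target that is a plain name.
    if isinstance(node, ast.Assign) and len(node.targets) == 1:
        target = node.targets[0]
        if isinstance(target, ast.Name):
            captures.append({
                "left-var": target.id,
                # Like `right: (_)`, accept any expression and keep its source text.
                "right-value": ast.get_source_segment(source, node.value),
            })

for capture in captures:
    print(capture)
```

The third line is skipped because its left side is `self.count`, which is not a bare identifier. The query above behaves the same way, since it requires `left:` to be an `(identifier)`.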
parameters
In Tree-sitter, a (parameters) node represents the list of parameters that a function accepts.
This query captures each (identifier) inside the parameter list and tags it as @param-name.
(parameters
  (identifier) @param-name)
argument_list
In Tree-sitter, an (argument_list) node represents the list of arguments passed to a function when it's being called.
(argument_list
  (string) @arg)
return_statement
In Tree-sitter, a (return_statement) node represents a return statement in a function, used to send a value back to the caller.
(return_statement) @return-line
attribute
In Tree-sitter, an (attribute) node represents attribute access with dot notation, such as request.data.
(attribute
  object: (identifier) @object
  attribute: (identifier) @prop)
if_statement
In Tree-sitter, an (if_statement) node captures the structure of an if block in a language like Python. The query below captures the condition and the body that runs when the condition is true.
(if_statement
  condition: (_) @cond
  consequence: (_) @if-body)
Named vs Anonymous Nodes
Named nodes are meaningful parts of the code defined by the grammar, like function calls, variable names, or statements.
Anonymous nodes are just syntax symbols or punctuation like =, (, ), or commas — they don’t have special names in the grammar.
| Node Type | Description            | Examples                           |
| --------- | ---------------------- | ---------------------------------- |
| Named     | Grammar-defined        | call, identifier, return_statement |
| Anonymous | Just syntax characters | '=', '(', ')', ','                 |
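The split between named and anonymous nodes can be sketched with a toy model. The `ToyNode` class and the hand-built tree below are hypothetical and are not produced by a real parser; they only mimic the shape of the tree for the statement `x = f(1)`.

```python
from dataclasses import dataclass, field

@dataclass
class ToyNode:
    """Hypothetical stand-in for a syntax tree node."""
    type: str
    is_named: bool
    children: list = field(default_factory=list)

# Hand-built tree for `x = f(1)`.
tree = ToyNode("assignment", True, [
    ToyNode("identifier", True),          # x
    ToyNode("=", False),                  # anonymous: punctuation
    ToyNode("call", True, [
        ToyNode("identifier", True),      # f
        ToyNode("argument_list", True, [
            ToyNode("(", False),
            ToyNode("integer", True),     # 1
            ToyNode(")", False),
        ]),
    ]),
])

def node_types(node, named):
    """Collect the types of all nodes whose is_named flag equals `named`."""
    found = [node.type] if node.is_named == named else []
    for child in node.children:
        found.extend(node_types(child, named))
    return found

print(node_types(tree, named=True))
print(node_types(tree, named=False))
```

The first call lists the grammar-defined nodes (assignment, identifier, call, and so on), while the second lists only the punctuation: `=`, `(`, and `)`.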
Logical Operators in Tree-sitter Queries
Logical operators help you choose exactly what you want when searching code with Tree-sitter.
Think of them like filters — they check if the thing you found matches or doesn't match certain words or patterns.
You write them inside a pattern, with a # before the name, and they apply to the captures (the @names) that the pattern has already matched. Tree-sitter's documentation calls them predicates.
1. #match?: Regex match
(function_definition
  name: (identifier) @func-name
  (#match? @func-name "^get"))
This matches any function whose name starts with get, such as get_user, getData, etc.
In the playground output, the matched text is highlighted in blue.
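The effect of this filter can be approximated in plain Python. The sketch below is not Tree-sitter code: it assumes Python's built-in `ast` module as the parser and the `re` module for the regex, and the function names in the sample are made up.

```python
import ast
import re

source = '''
def get_user(): pass
def getData(): pass
def post(): pass
def target(): pass
'''

# Same regex as in the query: the name must start with "get".
pattern = re.compile(r"^get")

matched = [
    node.name
    for node in ast.walk(ast.parse(source))
    # First find the structure (a function definition),
    # then filter on the text of its name.
    if isinstance(node, ast.FunctionDef) and pattern.search(node.name)
]

print(matched)  # ['get_user', 'getData']
```

Note that `target` contains the letters "get" but is not matched, because the `^` anchor requires the name to begin with them.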
Written by Rijul Rajesh T P
