A Yogi's Guide to Debug Python Programs
There are many resources on how to write code but not many on how to debug. In this article, I highlight my approach to debugging both synchronous and asynchronous Python programs.
Approach to debugging
Using IDE:
Running a program in debug mode in an IDE like PyCharm or VS Code is the easiest way to debug for anyone who likes IDE-assisted features. The debugger shows the state of objects and the attributes available on them, which makes debugging easier. You can set a breakpoint on any line, start debugging, and step through the code until the bug is found and fixed.
The print statements or logging module
When I started programming, I mostly used good old print statements and/or the logging module to learn the state of the program at each line. I used to insert random print statements, often with funny messages, to debug problems.
The most effective debugging tool is still careful thought, coupled with judiciously placed print statements.
- Brian W. Kernighan
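As a minimal sketch of this approach, the snippet below uses the standard logging module instead of bare print calls, so each message carries the function name and line number. The function apply_discount and its values are hypothetical and exist only for illustration.

```python
import logging

# DEBUG shows every message; raise the level to INFO or WARNING to quiet the output.
logging.basicConfig(
    level=logging.DEBUG,
    format="%(levelname)s %(funcName)s:%(lineno)d %(message)s",
)
logger = logging.getLogger(__name__)


def apply_discount(price, percent):
    """Hypothetical function under investigation."""
    logger.debug("inputs: price=%r percent=%r", price, percent)
    discount = price * percent / 100
    logger.debug("computed discount=%r", discount)
    return price - discount


final_price = apply_discount(200, 15)
logger.info("final price: %s", final_price)
```

Unlike print statements, these calls can stay in the code: raising the log level silences them without deleting anything.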
Rubber Duck
The gist of this debugging approach is to verbally explain the problem you are facing to someone else; it may be a rubber duck or a colleague (if you are lucky). While explaining the problem to others, our own understanding improves, which helps in connecting the dots required to solve it.
REPL
REPL (Read-Eval-Print Loop) is the way of providing Python statements directly to the interpreter console and seeing the result immediately. This approach saves time because you can evaluate individual statements rather than executing the whole Python file. The REPL is most useful when exploring standard modules or checking the results of functions on common data types; it is a convenient way to understand the behavior of the underlying operations.
Using dir to look up all attributes available for modules, objects, and data types is still my favorite thing. The REPL session below applies dir to the string module and to a set.
>>> import string
>>> dir(string)
['Formatter', 'Template', '_ChainMap', '__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '_re', '_sentinel_dict', '_string', 'ascii_letters', 'ascii_lowercase', 'ascii_uppercase', 'capwords', 'digits', 'hexdigits', 'octdigits', 'printable', 'punctuation', 'whitespace']
>>> a={1,2,3}
>>> dir(a)
['__and__', '__class__', '__class_getitem__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__iand__', '__init__', '__init_subclass__', '__ior__', '__isub__', '__iter__', '__ixor__', '__le__', '__len__', '__lt__', '__ne__', '__new__', '__or__', '__rand__', '__reduce__', '__reduce_ex__', '__repr__', '__ror__', '__rsub__', '__rxor__', '__setattr__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__xor__', 'add', 'clear', 'copy', 'difference', 'difference_update', 'discard', 'intersection', 'intersection_update', 'isdisjoint', 'issubset', 'issuperset', 'pop', 'remove', 'symmetric_difference', 'symmetric_difference_update', 'union', 'update']
>>> b={3,4,5}
>>> a.intersection(b)
{3}
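The dir output is dominated by double-underscore names. One possible way to cut the noise is a small helper that keeps only the public names; public_attributes is a hypothetical name, not part of the standard library.

```python
import string


def public_attributes(obj):
    """Return the names from dir(obj) that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


# Only methods such as 'add', 'intersection', and 'union' remain for a set.
print(public_attributes({1, 2, 3}))

# The same helper works on modules.
print(public_attributes(string))
```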
The Python debugger or pdb
This is the tool I have used most in my software engineering career to debug Python programs. The pdb module (or the built-in breakpoint() in Python 3.7 and later) temporarily stops the execution of a program and lets you interact with its state. You can insert a breakpoint on any line and step to the next statement to find and fix problems. Combining the pdb prompt with dir is a match made in heaven when it comes to debugging. The Python debugger (pdb) has a set of commands such as n (next), c (continue), and l (list) that you can use within the debugging prompt.
❯ python test_requests.py
> /tmp/test_requests.py(6)<module>()
-> print(response.text)
(Pdb) l
1 import requests
2
3 response = requests.get('https://example.org/')
4
5 import pdb; pdb.set_trace()
6 -> print(response.text)
[EOF]
(Pdb) dir(response)
['__attrs__', '__bool__', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__nonzero__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_content', '_content_consumed', '_next', 'apparent_encoding', 'close', 'connection', 'content', 'cookies', 'elapsed', 'encoding', 'headers', 'history', 'is_permanent_redirect', 'is_redirect', 'iter_content', 'iter_lines', 'json', 'links', 'next', 'ok', 'raise_for_status', 'raw', 'reason', 'request', 'status_code', 'text', 'url']
(Pdb) response.headers
{'Content-Encoding': 'gzip', 'Accept-Ranges': 'bytes', 'Age': '411952', 'Cache-Control': 'max-age=604800', 'Content-Type': 'text/html; charset=UTF-8', 'Date': 'Sun, 10 Sep 2023 19:57:02 GMT', 'Etag': '"3147526947+gzip"', 'Expires': 'Sun, 17 Sep 2023 19:57:02 GMT', 'Last-Modified': 'Thu, 17 Oct 2019 07:18:26 GMT', 'Server': 'ECS (nyb/1D07)', 'Vary': 'Accept-Encoding', 'X-Cache': 'HIT', 'Content-Length': '648'}
(Pdb)
The pdbpp and ipdb Python packages, available on PyPI, enhance the debugging experience further.
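To try the pdb commands without typing at a prompt, a session can also be scripted. The sketch below feeds the commands p, n, and c to pdb.Pdb through an in-memory stream and captures what the debugger prints. The function average is hypothetical, and this is only one way to drive pdb; day-to-day use is normally interactive through breakpoint().

```python
import io
import pdb


def average(values):
    """Hypothetical function to step through."""
    total = sum(values)
    return total / len(values)


# Commands that would normally be typed at the (Pdb) prompt:
# print an argument, step to the next line, print a local, then continue.
commands = io.StringIO("p values\nn\np total\nc\n")
session_output = io.StringIO()

# readrc=False keeps a personal ~/.pdbrc file from affecting the session.
debugger = pdb.Pdb(stdin=commands, stdout=session_output, readrc=False)
result = debugger.runcall(average, [2, 4, 6])

print(session_output.getvalue())
print("result:", result)
```

The captured output contains the same prompts and values an interactive session would show, which makes it easy to see what each command does.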
Traceback module
I have used the standard traceback module to figure out the sequence of function calls leading up to an exception. It aids debugging by displaying detailed information on the call stack, line numbers, and the source of the error. It is especially useful when an exception is handled without exposing many details of the error.
import requests
import traceback


def print_response():
    try:
        response = requests.get('https://thisdoesnot.exist/')
    except Exception:
        # Print the full call stack that led to the exception.
        traceback.print_exc()
        print("Request failed")
        return
    print(response.text)


def main():
    print("inside main")
    print_response()


main()
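When the traceback should go somewhere other than standard error, for example into a log message or an error report, traceback.format_exc() returns the same text as a string. The sketch below needs no network access; divide and safe_divide are hypothetical helpers used only for illustration.

```python
import traceback


def divide(a, b):
    return a / b


def safe_divide(a, b):
    """Return (result, None) on success or (None, traceback_text) on failure."""
    try:
        return divide(a, b), None
    except ZeroDivisionError:
        # format_exc() returns the text that print_exc() would have printed.
        return None, traceback.format_exc()


result, trace_text = safe_divide(1, 0)
print(trace_text)
```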
AI-Assisted debugging
LLM tools like ChatGPT, GitHub Copilot, Codium AI, etc. are quite good at explaining and even generating code. They can be leveraged while debugging, as they can sometimes provide valuable insights into the bug or issue at hand.
Debugging Asynchronous Python Programs
Debugging synchronous programs is hard, but debugging asynchronous programs is harder.
As mentioned in the Python documentation, we can enable asyncio debug mode to make debugging asynchronous programs easier.
Ways to enable asyncio debug mode:
Setting the PYTHONASYNCIODEBUG environment variable to 1.
Using the Python development mode with python -X dev or by setting the environment variable PYTHONDEVMODE to 1.
Passing debug=True to asyncio.run().
Calling loop.set_debug() when the instance of loop is available.
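The last two options can be sketched in a few lines. The coroutine names report_debug and enable_on_loop are hypothetical; the calls themselves (asyncio.run with debug=True, loop.set_debug, loop.get_debug) are part of the standard asyncio API.

```python
import asyncio


async def report_debug():
    # get_debug() reports whether the running loop is in debug mode.
    return asyncio.get_running_loop().get_debug()


async def enable_on_loop():
    # Alternative: switch debug mode on once a loop instance is available.
    loop = asyncio.get_running_loop()
    loop.set_debug(True)
    return loop.get_debug()


debug_via_run = asyncio.run(report_debug(), debug=True)
debug_via_loop = asyncio.run(enable_on_loop())
print("debug via asyncio.run:", debug_via_run)
print("debug via loop.set_debug:", debug_via_loop)
```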
These are the benefits of using asyncio debug mode:
Finds coroutines that were never awaited and logs them
Logs the execution time of the I/O selector when an I/O operation takes too long
Logs callbacks that take more than 100 ms by default. This threshold can be changed by setting loop.slow_callback_duration to the minimum number of seconds a callback must run to be considered slow
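The slow-callback report can be observed with a short experiment. In this sketch a coroutine deliberately blocks the event loop with time.sleep, and a small handler collects what the asyncio logger emits. ListHandler and blocking_step are hypothetical names, and the 0.05 second threshold is an arbitrary value chosen for the demonstration.

```python
import asyncio
import logging
import time


class ListHandler(logging.Handler):
    """Collect formatted log messages in a list for inspection."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


async def blocking_step():
    loop = asyncio.get_running_loop()
    # Lower the threshold from the default 0.1 seconds.
    loop.slow_callback_duration = 0.05
    # time.sleep blocks the whole event loop, unlike asyncio.sleep.
    time.sleep(0.2)


handler = ListHandler()
asyncio_logger = logging.getLogger("asyncio")
asyncio_logger.addHandler(handler)
try:
    asyncio.run(blocking_step(), debug=True)
finally:
    asyncio_logger.removeHandler(handler)

# Debug mode reports the slow step with a message containing "took".
slow_reports = [message for message in handler.messages if "took" in message]
print(slow_reports)
```

Replacing time.sleep with await asyncio.sleep should make the report disappear, which is a quick way to confirm that a blocking call is the culprit.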
In addition to the debug mode above, there are tools like aiomonitor, which inspects the asyncio loop and provides debugging capabilities. It exposes a telnet server with REPL capabilities for inspecting the state of an asyncio application. It can also be used to debug async programs running within a Docker container or on a remote server.
Whatever the bug, debugging always requires mental clarity just like a yogi. Don't stress and happy debugging!! 🐍
Feel free to comment, and if you learned something from this article, please share it with your friends.
Written by
Shiva Gaire
I am a backend focused full stack software engineer