Python Object Caching - How does Python Optimize Memory Management for Integers?
Python's model of objects and references shapes how the language uses memory. In this blog article, I'll look at how Python creates and manages objects in memory, specifically integer objects.
It is critical to note that everything in Python is an object, including numbers, strings, lists, functions, and more. Each object has its own identity, type, and value. When a variable is assigned a value, the variable name is bound to a reference to an object in memory; the variable does not hold the value itself.
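To make the idea of names and references concrete, here is a minimal sketch. The names x and y are only illustrative; the point is that both names refer to one list object, so a change made through one name is visible through the other.

```python
# Two names bound to the same list object.
x = [1, 2, 3]
y = x                 # no copy is made; y refers to the same object as x

y.append(4)           # mutate the object through the second name

print(x)              # [1, 2, 3, 4]
print(x is y)         # True: same identity
print(type(x))        # <class 'list'>: the object's type
print(id(x) == id(y)) # True: id() reports the object's identity
```

The built-in functions id() and type() expose an object's identity and type, and the is operator compares identities directly.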
In the case of integers, Python employs a technique known as object caching to make programs more efficient. Object caching is a technique in which a small number of objects are pre-allocated and kept in memory to be reused later. This is especially handy for small integers, which appear in almost every program. This behavior is an implementation detail of CPython, not a guarantee of the Python language, and may differ in other implementations. CPython is written in C and is the most widely used Python implementation.
Examples of Object Caching
Suppose we have the following code snippet:
>>> print("I")
>>> print("Love")
>>> print("Python")
Assume we are using a CPython implementation of Python 3 with default options and configuration. Before the execution of line 2 (print("Love")), how many int objects have been created and are still in memory?
At first glance, this code appears to create no int objects. However, that is not quite right. When the CPython interpreter starts, int objects are pre-allocated for every integer from -5 to 256, inclusive. CPython does this so that these common values can be reused rather than created anew each time an integer appears in the code. That range contains 256 - (-5) + 1 = 262 integers, so 262 cached int objects already exist before line 2 executes. The interpreter's own startup code may create further int objects outside this range, so 262 is best read as the number of pre-allocated small integers rather than an exact count of every int in memory.
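The count above can be checked with a short sketch. The helper is_shared is a hypothetical name introduced here for illustration. It builds each integer at run time from a string so that the compiler cannot merge two identical literals into one object. The identity probe is guarded because it relies on CPython's behavior and other implementations may differ.

```python
import sys

def is_shared(n: int) -> bool:
    """Return True if two ints equal to n, built separately, are one object."""
    # int(str(n)) constructs the value at run time, so any sharing we see
    # comes from the interpreter's cache, not from compile-time constants.
    first = int(str(n))
    second = int(str(n))
    return first is second

small_range = range(-5, 257)   # -5 through 256 inclusive
print(len(small_range))        # 262

if sys.implementation.name == "cpython":
    # Probe a wider window and keep only the values that come back shared.
    shared = [n for n in range(-10, 300) if is_shared(n)]
    print(shared[0], shared[-1], len(shared))
```

On a default CPython 3 build, the last line is expected to report the bounds -5 and 256 and a count of 262, matching the arithmetic.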
To check the above claim, let's look at the code below (the ids will differ from run to run):
>>> a = 98
>>> b = 98
>>> print(id(a))
11534016
>>> print(id(b))
11534016
In the above example, we create two variables, a and b, and assign them the value 98. When we print the ids of both variables, we can see that they are the same, which shows that they reference the same object in memory. This is because Python reused the previously allocated int object for the value 98 rather than creating a new object (98 lies within the range -5 to 256).
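The same check can be written with the is operator, which compares identities without printing raw ids. This is a small sketch using the same variable names as above; the a is b result reflects CPython's small-integer cache and is not guaranteed by the language.

```python
a = 98
b = 98

print(a == b)             # True: the values are equal
print(a is b)             # True on CPython: both names refer to the cached 98
print(id(a) == id(b))     # always agrees with the result of `a is b`
```

In everyday code, integers should be compared with ==. The is operator is useful here only as a way to observe the cache.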
Now, let's take a look at this code snippet:
>>> c = 259
>>> d = 259
>>> print(id(c))
140461041507472
>>> print(id(d))
140461041506672
We have two variables, c and d, and assign them the value 259. When we print their ids, we can see that they are not the same, which shows that they reference different objects in memory. This is because the value 259 falls outside the pre-allocated range of integers (-5 to 256), so each assignment typed at the interactive prompt creates a new int object. Note that this result depends on entering the statements one at a time at the prompt: when the same lines run together in a script, the compiler may store identical constants once, and the two ids can then match. By pre-allocating a range of small integers, the interpreter saves time and memory, since it reuses existing int objects instead of creating new int instances.
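Because literal constants can be merged at compile time, a script needs a different approach to reproduce the result above. The sketch below builds the values at run time with a hypothetical helper named make_int, which is an assumption for illustration and not part of the original example.

```python
import sys

def make_int(text: str) -> int:
    """Build an int at run time so the compiler cannot share a constant."""
    return int(text)

c = make_int("259")
d = make_int("259")

print(c == d)   # True: equal values
print(c is d)   # False on CPython: 259 is outside the cached range

# Two identical literals in the same script are often stored once by the
# compiler, so this comparison may print True even though 259 is not cached.
e = 259
f = 259
print(e is f)
```

The last comparison is left without an expected result on purpose, since it depends on how the code is compiled rather than on the small-integer cache.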
Conclusion
In conclusion, Python's use of objects and references supports efficient memory management and better program performance. Python accomplishes this in a variety of ways, one of which is object caching, specifically for small int objects in the range -5 to 256. However, it is worth noting that this behavior is an implementation detail of CPython, and other implementations may behave differently, so programs should compare integers with == rather than rely on identity.
Written by
Micah Ondiwa
Software Engineer at IBM Research | Africa.