Understanding Reference Counting and Object Identity in Python

Harsh Gupta

When working with Python, it's important to understand how reference counting and object identity work under the hood. Let's walk through these concepts step by step with practical examples.

Reference Counting in Python

When two variables point to the same value (e.g., x = 10 followed by y = x), both names refer to one object rather than to two copies, and the second assignment adds one more reference to that object. Python needs a way to keep track of how many references exist for each object. In CPython, that mechanism is reference counting.

Each object in memory has a reference count that tracks how many references are pointing to it.

Example:

import sys
print(sys.getrefcount(24601))  # Might return 3

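Here, 3 means the integer 24601 had three references at that moment. The count includes the temporary reference created by passing the object to sys.getrefcount(), so it is usually one higher than you might expect, and the exact number can vary between Python versions.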
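To watch the count move, here is a minimal sketch that assumes CPython. It uses a freshly created object() so that caching of literals does not interfere; the names obj and alias are only illustrative.

```python
import sys

# A brand-new object, so no other code holds references to it.
obj = object()
base = sys.getrefcount(obj)

# Binding a second name adds one reference.
alias = obj
with_alias = sys.getrefcount(obj)

# Deleting that name removes the reference again.
del alias
after_del = sys.getrefcount(obj)

print(with_alias - base)  # 1
print(after_del - base)   # 0
```

Comparing differences rather than absolute numbers sidesteps the extra temporary reference that sys.getrefcount() itself adds.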

If we try:

print(sys.getrefcount("python"))  # May print a huge number like 4294967295 on recent CPython versions

But we haven't assigned "python" to any variable, so why do we see such a huge reference count?

This number is not a real count. CPython interns string literals like this one, meaning it keeps a single shared copy and reuses it. On recent CPython versions (3.12 and later), interned strings and other always-needed objects such as None and small integers are treated as immortal: their reference count is pinned to a special large value instead of tracking actual references.

That is also why you get the same number every time you run the line, so it is not a problem. On older versions, the same call typically returns an ordinary, much smaller count.
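One way to probe this is to create many references to None and compare the reported count before and after. This is only a sketch, and it assumes CPython: on 3.12 and later, where None is treated as immortal with a fixed reference count, the difference should be 0, while on older versions it should be about 1000. The names many_refs and delta are illustrative.

```python
import sys

before = sys.getrefcount(None)

# Build a list holding 1000 references to the same None object.
many_refs = [None] * 1000

after = sys.getrefcount(None)
delta = after - before

# Printed values depend on the interpreter version, so none are asserted here.
print(before, after, delta)
```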

Important point: the data type always belongs to the value (the object) in memory, never to a variable we create, such as count or score.

That is why we don't declare a data type for a variable. The object in memory carries its type, and a variable is only a name that refers to that object.
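A short sketch of this idea: the same name can refer to values of different types over time, and type() always reports the type of the object the name currently refers to. The variable names here are illustrative.

```python
count = 10
first_type = type(count).__name__   # type of the object 10

# Rebind the same name to a string; the name itself has no fixed type.
count = "ten"
second_type = type(count).__name__  # type of the object "ten"

print(first_type, second_type)  # int str
```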

Variable and Value Relationship

Remember: data types are associated with values in memory, not with variables.

Example:

score = 10
another_score = 3
another_score = score

Here, another_score now points to the same value (10) that score points to. Nothing in our code refers to 3 anymore, so in principle it becomes eligible for cleanup.

In practice, CPython keeps commonly used objects alive: small integers (-5 through 256) are cached and interned strings are reused, so these objects are not destroyed when our variables stop referring to them.
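The rebinding above can be checked directly with the is operator and id(), which compare object identity rather than value:

```python
score = 10
another_score = 3

# Rebind another_score to the object that score refers to.
another_score = score

print(another_score is score)          # True
print(id(another_score) == id(score))  # True
```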

Value Reassignment and Memory Allocation

a = 5
b = 2
a = a + 2
print(a)  # Outputs: 7

What happens under the hood:

  • a = a + 2 is evaluated as a = 5 + 2

  • The result is an int object with value 7, and a now points to it (for small integers, CPython reuses a cached object rather than creating a new one)

  • a no longer refers to 5, but the object 5 is not removed from memory, because CPython caches small integers
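The steps above can be sketched by keeping a second name, here called old purely for illustration, on the original object before reassigning a:

```python
a = 5
old = a      # keep a reference to the object a started with

a = a + 2    # builds the result 7 and rebinds a to it

print(a, old)    # 7 5
print(a is old)  # False
```

The integer 5 was never modified; only the name a moved to a different object.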

Mutable vs Immutable Example with Lists

myListOne = [1, 2, 3]
myListTwo = myListOne
myListOne = "python"
print(myListTwo)  # [1, 2, 3]

At this point, myListTwo still refers to the original list. myListOne was rebound to a new string value, so it now refers to "python" while the list itself is unchanged.

Now:

# Assign a new [1, 2, 3] list to myListOne
myListOne = [1, 2, 3]
myListOne[0] = 33
print(myListOne)  # [33, 2, 3]
print(myListTwo)  # [1, 2, 3]

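Here, myListOne and myListTwo no longer share the same reference. myListOne refers to a newly created list, so changing it does not affect the list that myListTwo refers to.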
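A small sketch makes the split visible by checking identity before and after the reassignment. The helper variable names (shared_at_start and so on) are illustrative.

```python
myListOne = [1, 2, 3]
myListTwo = myListOne
shared_at_start = myListOne is myListTwo  # both names, one list

# A new list literal creates a separate object with equal contents.
myListOne = [1, 2, 3]
equal_after = myListOne == myListTwo
shared_after = myListOne is myListTwo

print(shared_at_start, equal_after, shared_after)  # True True False
```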

Shared vs Copied References

listOne = [1,2,3]
listTwo = listOne
listOne[0] = 33
print(listOne) # Outputs: [33, 2, 3]
print(listTwo) # Outputs: [33, 2, 3]

# listTwo points to the same list object as listOne.
# We never created a new list; we changed the existing one in place,
# so the change is visible through both names.

Both listOne and listTwo share the same reference.
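The same sharing happens when a list is passed to a function: the parameter is just another name for the caller's list. The sketch below uses a hypothetical helper, add_bonus, written only for this illustration.

```python
def add_bonus(scores):
    """Hypothetical helper: appends to the list it receives, in place."""
    scores.append(100)


listOne = [1, 2, 3]
add_bonus(listOne)  # scores inside the function is the same object as listOne

print(listOne)  # [1, 2, 3, 100]
```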

h1 = [1, 2, 3]
h2 = h1[:]  # A slice with no start or end copies the whole list, so h2 is [1, 2, 3].
            # With bounds, e.g. h2 = h1[0:2], the slice runs from index 0 up to, but not including, index 2, giving [1, 2].
h1[0] = 22
print(h1)  # [22, 2, 3]
print(h2)  # [1, 2, 3]

A shallow copy is made using slicing, so h2 is a separate list and does not reflect changes made to h1.
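One caveat worth sketching: a slice copies only the outer list. If the list contains other lists, the inner lists are still shared, and copy.deepcopy from the standard library is one way to copy them as well. The variable names below are illustrative.

```python
import copy

nested = [[1, 2], [3, 4]]
shallow = nested[:]            # new outer list, same inner lists
deep = copy.deepcopy(nested)   # new outer list and new inner lists

nested.append([5, 6])  # changes only the outer list called nested
nested[0][0] = 99      # changes an inner list that shallow also holds

print(nested)   # [[99, 2], [3, 4], [5, 6]]
print(shallow)  # [[99, 2], [3, 4]]
print(deep)     # [[1, 2], [3, 4]]
```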

Value vs Reference Comparison

n = [1, 2, 3]
m = n
print(m == n)  # True (same value)
print(m is n)  # True (same reference)

m = [1, 2, 3]
print(m == n)  # True (same value)
print(m is n)  # False (different references)

Use == to compare values and is to compare object identity (i.e., references).
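The difference matters most with your own classes, where == can be customized through __eq__ while is cannot. The following is a minimal sketch; Point is a hypothetical class written only for this illustration.

```python
class Point:
    """Hypothetical class used only for this illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        # Two points are equal when their coordinates match.
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


p = Point(1, 2)
q = Point(1, 2)

print(p == q)  # True
print(p is q)  # False

# Identity checks are the usual way to test for None.
result = None
print(result is None)  # True
```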

Conclusion

Understanding reference counting and object identity helps you write more efficient Python code, especially when dealing with memory and performance issues. Python handles most of this for you with its garbage collector, but a deep understanding is beneficial for debugging and writing optimized code.


Written by

Harsh Gupta

I am a CSE undergrad learning DevOps and sharing what I learn.