Python “is” operator behaves unexpectedly with integers
Today I was debugging my project, and after a few hours of analysis I ended up with this:
>>> (0-6) is -6
False
>>> (0-5) is -5
True
Could you explain why this happens?
Maybe it is some kind of bug, or just very strange behavior.
Python 2.7.3 (default, Apr 24 2012, 00:00:54)
[GCC 4.7.0 20120414 (prerelease)] on linux2
>>> type(0-6)
<type 'int'>
>>> type(-6)
<type 'int'>
>>> type((0-6) is -6)
<type 'bool'>
In CPython, all integers from -5 to 256 inclusive are cached as global objects that share the same address, so the is test passes.
This artifact is explained in detail in http://www.laurentluce.com/posts/python-integer-objects-implementation/, and you can check the current source code at http://hg.python.org/cpython/file/tip/Objects/longobject.c.
A specific structure is used to refer to small integers and share them so access is fast. It is an array of 262 pointers to integer objects. Those integer objects are allocated during initialization in a block of integer objects we saw above. The small integers range is from -5 to 256. Many Python programs spend a lot of time using integers in that range so this is a smart decision.
This is only an implementation detail of CPython and you shouldn’t rely on it. For instance, PyPy implemented the id of an integer to return itself, so (0-6) is -6 is always true even if they are “different objects” internally; it also allows you to configure whether to enable this integer caching, and even to set the lower and upper bounds. But in general, objects retrieved from different origins will not be identical. If you want to compare for equality, just use ==.
CPython keeps the integers from -5 to 256 in a pool inside the interpreter, and any integer in that range is returned from the pool rather than created anew. That’s why (0-5) and -5 are the same object, but (0-6) and -6 are not: those are created on the spot.
Here’s the relevant part of the CPython source code:
#define NSMALLPOSINTS 257
#define NSMALLNEGINTS 5
static PyIntObject *small_ints[NSMALLNEGINTS + NSMALLPOSINTS];
(view CPython source code: /trunk/Objects/intobject.c). The source code includes the following comment:
/* References to small integers are saved in this array so that they
   can be shared.
   The integers that are saved are those in the range
   -NSMALLNEGINTS (inclusive) to NSMALLPOSINTS (not inclusive).
*/
The is operator will then compare the first pair, (0-5) and -5, as equal because they are the same object (same memory location), but the two other new integers, (0-6) and -6, will be at different memory locations (and then is won’t return True). Note that 257 in the above source code is the count of non-negative integers, so it covers 0 to 256 (inclusive).
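To make the lookup concrete, here is a minimal sketch in Python that models the table described above. It is not CPython’s actual code: the Boxed class and the make_int function are hypothetical names invented for this illustration, and only the two constants are taken from the source quoted above.

```python
# Toy model of a small-integer cache. Only the two constants come from
# the CPython source quoted above; Boxed and make_int are hypothetical.
NSMALLPOSINTS = 257
NSMALLNEGINTS = 5


class Boxed:
    """Stand-in for a heap-allocated integer object."""

    def __init__(self, value):
        self.value = value


# One shared object per value in the range -5 .. 256 (262 entries).
small_ints = [Boxed(v) for v in range(-NSMALLNEGINTS, NSMALLPOSINTS)]


def make_int(value):
    """Return the shared object for small values, a fresh one otherwise."""
    if -NSMALLNEGINTS <= value < NSMALLPOSINTS:
        return small_ints[value + NSMALLNEGINTS]
    return Boxed(value)


print(make_int(-5) is make_int(-5))  # shared object from the table
print(make_int(-6) is make_int(-6))  # two separate fresh objects
```

In this model the upper bound is exclusive, which matches the comment in the source: 256 comes from the table, 257 does not.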
It’s not a bug. is is not an equality test. == will give the expected results.
The technical reason for this behavior is that a Python implementation is free to treat different instances of the same constant value as either the same object or as different objects. The Python implementation you’re using chooses to make certain small constants share the same object for memory-saving reasons. You can’t rely on this behavior staying the same from version to version or across different Python implementations.
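A small sketch of the difference, assuming nothing beyond the standard int constructor (the helper name compare is made up for this example). The integers are built at runtime with int() so that the compiler cannot merge two literals into one constant. The equality result is the same everywhere; the identity result is deliberately left unannotated because it depends on the implementation and version.

```python
def compare(text):
    """Parse the same text twice and compare the two resulting ints."""
    a = int(text)
    b = int(text)
    # == compares values, so it is True on every implementation.
    # is compares identity, so its result is implementation-dependent.
    return a == b, a is b


for text in ("-5", "-6", "256", "257"):
    equal, identical = compare(text)
    print(text, "equal:", equal, "identical:", identical)
```

If the code only cares whether two numbers have the same value, the first element of the pair is the one to use.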
This happens because CPython caches some small integers and small strings, and gives every instance of such an object the same id(). (0-5) and -5 have the same value for id(), which is not true for (0-6) and -6:
>>> id((0-6))
12064324
>>> id((-6))
12064276
>>> id((0-5))
10022392
>>> id((-5))
10022392
Similarly for strings:
>>> x = 'abc'
>>> y = 'abc'
>>> x is y
True
>>> x = 'a little big string'
>>> y = 'a little big string'
>>> x is y
False
For more details on string caching, read: “is operator behaves differently when comparing strings with spaces”
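As a rough illustration of string caching, the sketch below uses sys.intern, which in Python 3 returns one canonical object for each distinct string value (in Python 2, as used in the question, the same function is the builtin intern). The build helper is a made-up name; it joins the words at runtime so the two strings are not merged as compile-time constants.

```python
import sys


def build(words):
    """Assemble a string at runtime (hypothetical helper)."""
    return " ".join(words)


x = build(["a", "little", "big", "string"])
y = build(["a", "little", "big", "string"])

print(x == y)  # value comparison, does not depend on caching

# After interning, both names refer to the single canonical object.
x = sys.intern(x)
y = sys.intern(y)
print(x is y)
```

Even so, == remains the right tool for comparing string values; interning is an optimization, not something to build comparisons on.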