__cmp__ methods (on dict and unicode, at least) I had to check how __cmp__ behaves on CPython. And I got a few surprises:
>>> {} == ''
False
It sounds right, but...
>>> {}.__eq__('')
NotImplemented
Oh. So == isn't getting its answer from __eq__ here. It falls back to the old three-way comparison function:
>>> cmp({}, '')
-1
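A small aside on why {}.__eq__('') can return NotImplemented while {} == '' still yields False: NotImplemented is a signal asking the interpreter to try something else, not the final answer. The class name Quiet below is made up for illustration. The snippet should run on both Python 2 and Python 3, although the fallback the interpreter uses after receiving the signal differs between them.

```python
class Quiet(object):
    """Hypothetical class whose __eq__ always declines to answer."""

    def __eq__(self, other):
        return NotImplemented


q = Quiet()
direct = q.__eq__('')       # the raw method result: the NotImplemented signal
via_operator = (q == '')    # the operator falls back and still produces a bool

print(direct is NotImplemented)
print(via_operator)
```

Calling the method directly exposes the signal; going through the operator hides it behind whatever fallback the interpreter picks.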
So, since '' is greater than {}, they are not equal. But...
>>> {}.__cmp__('1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: dict.__cmp__(x,y) requires y to be a 'dict', not a 'str'
Oops. Isn't cmp(foo, bar) the same as foo.__cmp__(bar), at least when hasattr(foo, '__cmp__')? Well, obviously, not always.
For some reason, CPython does a bit of "type checking" when you indirectly use dict.__cmp__. If you compare a dict with an instance of an incompatible type, it does a "default comparison" by class name, instead of raising TypeError. By looking at the CPython sources, it seems that this is the case for every type where tp_compare is implemented in C.
So, we get a -1 from cmp({}, '') because 'dict' < 'str'. Weird. But that isn't all. If it were, I probably wouldn't have bothered to write this.
Let's derive from dict and check what happens:
>>> class dict_derived(dict): pass
...
>>> cmp(dict_derived(), '')
-1
>>> dict_derived().__cmp__('')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: dict_derived.__cmp__(x,y) requires y to be a 'dict_derived', not a 'str'
No surprises: It inherits the behavior from dict. So, remembering what I said above:
If you compare a dict with an instance of an incompatible type, Python does a "default comparison" by class name, instead of raising TypeError.
Now we can extend it to:
If you compare a dict or a dict-derived instance with an instance of an incompatible type, Python does a "default comparison" by class name, instead of raising TypeError.
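The "default comparison" by class name in that rule can be modelled roughly like this. It is only a sketch of the idea, not CPython's actual code: the function name class_name_cmp is made up, and it ignores the special cases the real fallback has (for example for None, numbers, or two objects of the same type).

```python
def class_name_cmp(a, b):
    """Rough model of the fallback: order two objects by their type names."""
    a_name = type(a).__name__
    b_name = type(b).__name__
    # Same three-way convention as cmp(): -1, 0 or 1.
    return (a_name > b_name) - (a_name < b_name)


class dict_derived(dict):
    pass


print(class_name_cmp({}, ''))              # 'dict' sorts before 'str'
print(class_name_cmp(dict_derived(), ''))  # 'dict_derived' sorts before 'str'
```

Both calls give -1, matching what cmp() returned in the sessions above.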
But why am I saying that it applies only to dicts? [Or, AFAICS, special types where the comparison function is written in C] Why not to every type? Answer:
>>> class Foo(object):
...     def __cmp__(self, other):
...         raise TypeError("Foos are not comparable")
...
>>> Foo() == ''
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in __cmp__
TypeError: Foos are not comparable
>>> cmp(Foo(), '')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in __cmp__
TypeError: Foos are not comparable
So, on one hand we have dict (and maybe other builtin types), where cmp() and the comparison operators don't raise TypeError even if __cmp__ does. On the other, user-defined classes, where the raised TypeError does "leak". In the middle, our dict_derived class inherited the behavior from dict. But look at this:
>>> class dict_derived2(dict):
...     def __cmp__(self, other):
...         return super(dict_derived2, self).__cmp__(other)
...
>>> cmp(dict_derived2(), '')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in __cmp__
TypeError: dict_derived2.__cmp__(x,y) requires y to be a 'dict_derived2', not a 'str'
Dict-derived types inherit the behavior of dict, unless they override __cmp__. CPython doesn't care that the new __cmp__ just calls the original dict.__cmp__. The only important thing is that there is a __cmp__ implemented in Python code. Once you write a "custom" __cmp__, cmp(), == and all the other comparison operators will raise the exception.
To summarize, here is the final rule for dict.__cmp__:
If you compare a dict or a dict-derived instance with an instance of an incompatible type, and __cmp__ is not overridden, Python does a "default comparison" by class name, instead of raising TypeError.
Note that this rule is not directly applicable to other builtin types that implement __cmp__:
>>> set().__cmp__('')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: set.__cmp__(x,y) requires y to be a 'set', not a 'str'
>>> cmp(set(), '')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only compare to a set
>>> set() == ''
False
>>> set().__eq__('')
False
With set, TypeError is raised on __cmp__ and on cmp(), but not on ==. That's because set.__eq__ takes care of returning False if the argument type is not compatible. The end result sounds quite reasonable, because you can still check for equality against instances of other types (like set() != ''), but can't compare for ordering against them (set() > 1 raises an error instead of doing a weird class name comparison).
I suppose that the roots of this inconsistency are historical accidents. I'm curious to see if all this changed in Python 3.0.
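On that last question, here is a short script that repeats the comparisons from this post, assuming a Python 3 interpreter (where cmp() and __cmp__ no longer exist). The helper name raises_type_error is made up for this sketch.

```python
import operator


def raises_type_error(op, left, right):
    """Return True if op(left, right) raises TypeError (hypothetical helper)."""
    try:
        op(left, right)
    except TypeError:
        return True
    return False


# Equality against an unrelated type still quietly answers False.
print({} == '', set() == '')

# Ordering against an unrelated type raises instead of comparing class names.
print(raises_type_error(operator.lt, {}, ''))
print(raises_type_error(operator.gt, set(), 1))
```

Under that assumption, dict and set end up behaving the same way: equality checks against other types work, ordering comparisons raise TypeError.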


2 comments:
They actually got rid of __cmp__ in Py3k -- at least for now. Some folks are pushing to have it re-added.
Python 3.0a5 (r30a5:62856, May 10 2008, 10:34:28)
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> cmp({},"")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unorderable types: dict() < str()