Suppose you have a reference to some object in some variable:
    x
Since the variable always contains a reference to an object, you can access that object through the variable by applying all sorts of operators to it:
    x += 1
    x["foo"] = "bar"
    x(1, 2)
    x.foo("bar")
and so on. Whether each of those accesses succeeds depends on the target object, but the worst thing that can happen if you mistreat an object is a runtime exception. For example,
    x = 1
    x()
results in
    TypeError: 'int' object is not callable
(Note on samples: they are written for Python 3k. With Python 2.x the theory is the same, but some of the samples may need to be slightly modified.)
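To make the "worst case is a runtime exception" point concrete, here is a minimal sketch that applies several operators to an int and records what happens. The attempt helper and the results dictionary are names I made up for illustration.

```python
def attempt(operation):
    """Run operation() and return the exception type name, or None on success."""
    try:
        operation()
    except Exception as exc:
        return type(exc).__name__
    return None

x = 1

# Each lambda applies one operator to whatever x refers to.
results = {
    'x + 1': attempt(lambda: x + 1),
    'x(1, 2)': attempt(lambda: x(1, 2)),
    'x["foo"]': attempt(lambda: x["foo"]),
    'x.foo("bar")': attempt(lambda: x.foo("bar")),
}

for expression, outcome in results.items():
    print(expression, "->", outcome or "ok")
```

Only the addition succeeds. The call and the subscript raise TypeError and the attribute access raises AttributeError, but nothing worse than an exception ever happens.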
Let's keep looking. Since Python is an OOP-capable language (whatever on Earth that means), it supports classes and methods:
    class C:
        def foo(self, x):
            print(x)
and allows overriding the reaction to some of the operators. For example, the following two pieces of code have similar meaning:
    class C:
        def __call__(self):
            pass
-vs-
    class C
    {
    public:
        void operator()(void) {}
    };
It might seem that there is no difference apart from the Python way of having a fancy double-underscore method for anything advanced, but in fact Python offers more.
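Before moving on, here is a small sketch of that kind of operator overriding in action. The Greeter class and its return values are hypothetical. The only point is that each special method answers for one piece of syntax.

```python
class Greeter:
    """Hypothetical class: each special method handles one operator."""

    def __call__(self, name):
        # Handles greeter(name)
        return "hello, " + name

    def __getitem__(self, key):
        # Handles greeter[key]
        return "item " + repr(key)

    def __add__(self, other):
        # Handles greeter + other
        return "sum with " + repr(other)

g = Greeter()
print(g("world"))   # parentheses go to __call__
print(g["biz"])     # square brackets go to __getitem__
print(g + 1)        # plus goes to __add__
```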
Python allows overriding the "dot" operator. For example, the following class (despite being a little unclean) appears to support just about any method you throw at it:
    class C:
        def __getattr__(self, name):
            def any_method(*args, **kwargs):
                print(name, args, kwargs)
            return any_method
        def i_exist(self):
            print("i would not budge")
    c = C()
    c.ping()
    c.add(1, 2)
    c.lookup([1, 2], key = 1)
    c.i_exist()
prints out
    ping () {}
    add (1, 2) {}
    lookup ([1, 2],) {'key': 1}
    i would not budge
The magic method is, apparently, __getattr__. It is invoked when you apply the dot operator to a class instance that does not have an attribute with that name by itself. Note how the i_exist method stepped up despite __getattr__ being overridden.
    x.foo
     ^---- __getattr__ is invoked when the dot is crossed
So what does it mean? It means that you can override anything, including the dot operator, something that is generally not possible in statically typed compiled languages, and this feature makes it really simple to hide all sorts of advanced behavior behind a simple method access. For example, consider the XMLRPC client in Python:
    from xmlrpc.client import ServerProxy
    p = ServerProxy("http://1.2.3.4:5678")
    p.AddNumbers(1, 2, 3)
and see how straightforward access to a network service with a procedural interface is. The ServerProxy class simply intercepts the method access and turns it into a network call. This is done transparently at runtime, with no need to recompile any stub or anything: you can access any target service method without any preparation. Compare this to an XMLRPC client library of your choice.
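To get a feeling for how such interception can work, here is a minimal sketch of a proxy that turns any method access into a call to a transport function. It is not the actual ServerProxy implementation. RecordingProxy and fake_transport are hypothetical names, and the "network" is just a local function.

```python
class RecordingProxy:
    """Hypothetical stand-in for a remote proxy, for illustration only."""

    def __init__(self, transport):
        self._transport = transport

    def __getattr__(self, name):
        # Reached only for names not found the usual way, e.g. proxy.AddNumbers
        def remote_method(*args):
            return self._transport(name, args)
        return remote_method

def fake_transport(method, args):
    # Stand-in for the network; a real client would serialize and send a request here.
    if method == "AddNumbers":
        return sum(args)
    raise NotImplementedError(method)

p = RecordingProxy(fake_transport)
result = p.AddNumbers(1, 2, 3)
print(result)
```

The proxy class never mentions AddNumbers. The method name is only discovered at the moment the dot is crossed.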
Now take a look at the following fictional line:
    foo.bar["biz"]("baz").keep.on("going")
Can you see now that every delimiter (except for the string literal quotes) can be intercepted and have its behavior modified? Given this, I can (and almost universally do) apply aesthetic thinking: how would I like my code to look? One of the Python principles is to have code that is (pleasantly) readable. In each case, for each relation between program modules (whatever that means), I can have it
    like["this"]    -OR-
    like("this")    -OR-
    like_this       -OR-
    like + "this"   -OR-
    like.this
and so on. Depending on the situation, I can pick whichever option makes the code clearer. And guess what? Overriding the dot is sometimes useful.
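As a rough illustration of "every delimiter can be intercepted", the sketch below makes that fictional line actually run. Trace is a hypothetical helper of mine that does nothing except remember which operators were applied to it.

```python
class Trace:
    """Hypothetical helper: records every operator applied to it."""

    def __init__(self, path="foo"):
        self._path = path

    def __getattr__(self, name):
        # The dot
        return Trace(self._path + "." + name)

    def __getitem__(self, key):
        # Square brackets
        return Trace(self._path + "[" + repr(key) + "]")

    def __call__(self, *args):
        # Parentheses
        return Trace(self._path + "(" + ", ".join(map(repr, args)) + ")")

    def __str__(self):
        return self._path

foo = Trace()
print(foo.bar["biz"]("baz").keep.on("going"))
```

Every step returns a new Trace, so the chain can go on for as long as you like.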
Anyhow, this is only half of the story.
The other half is told from the other side of the dot. See, __getattr__ notifies an instance that one of its members is about to be accessed (and was not found the usual way) and allows it to override the result. But Python also allows the accessed member itself to be notified whenever it is being accessed as a member of some other instance. Sounds weird? Take a look at this:
    class Member:
        def __get__(self, instance, owner):
            print("I'm a member of {0}".format(instance))
            return self
    class C:
        x = Member()
    c = C()
    c.x
prints out
    I'm a member of <__main__.C object at ...>
See? The Member instance, being a member of some other class, is notified whenever it is accessed. Where can it be useful, you may ask? Oh, it is the key to the magic "self" in Python.
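Here is a slightly more practical sketch of the same mechanism: a member that computes its value from the instance it is accessed through. Celsius and Thermometer are made-up names. Note that instance is None when the member is fetched through the class rather than through an instance.

```python
class Celsius:
    """Hypothetical descriptor that computes a value on every access."""

    def __get__(self, instance, owner):
        if instance is None:
            # Accessed through the class (Thermometer.celsius): no instance to use
            return self
        return (instance.fahrenheit - 32) * 5 / 9

class Thermometer:
    celsius = Celsius()

    def __init__(self, fahrenheit):
        self.fahrenheit = fahrenheit

t = Thermometer(212)
print(t.celsius)
print(Thermometer.celsius)
```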
Consider the following very simple piece of code:
    class C:
        def foo(self):
            print(self)
Have you ever thought about what "self" is? I mean, it obviously is an argument containing a reference to the instance being called, but where did it come from? It doesn't even have to be called "self"; that is just a convention. The following will work just as well:
    class C:
        def foo(magic):
            print(magic)
And so it turns out that somehow, at the moment of invocation, the first argument of every method points to the containing instance. How is it done?
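One quick way to convince yourself that the first argument is nothing special is to pass it by hand. In this sketch (Python 3, with a hypothetical extra parameter x) both calls end up running the same function with the same arguments.

```python
class C:
    def foo(magic, x):
        # "magic" is an ordinary first parameter; the name is irrelevant
        return (magic, x)

c = C()
via_instance = c.foo(1)    # the instance is supplied for you
via_class = C.foo(c, 1)    # the instance is passed explicitly
print(via_instance == via_class)
```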
What happens when you do
    c = C()
    c.foo()
anyhow? At first sight, access to c.foo should return a reference to a method, something related to C and unrelated to c. But it appears that the following two accesses to foo
    c1 = C()
    c1.foo
    c2 = C()
    c2.foo
fetch different things: c1.foo returns a method with its first argument set to c1, and c2.foo returns one with it set to c2. How could that happen? The key here is that you access a method (which is a member of a class) through a class instance. The class itself contains its methods in a half-cooked "unbound" state; they don't have any "self":
    class C:
        def foo(self):
            pass
    print(C.foo)
    print(C().foo)
prints out
    <function foo at ...>
    <bound method C.foo of <__main__.C object at ...>>
(Python 3.3 and later show the qualified name in the first line, <function C.foo at ...>.) See? When fetched directly from a class, a method is nothing but a regular function; it is not "bound" to anything. In Python 3 you can even call it, but you will have to provide its first argument "self" yourself, as you see fit:
    class C:
        def foo(self):
            print(self)
    C.foo("123")
prints out
    123
But as soon as you instantiate the class and fetch the same method through an instance, the magic __get__ method comes into play and allows the returned reference to be "bound" to the actual instance. Something like this:
    class Method:
        def __init__(self, target):
            self._target = target
        def __get__(self, instance, owner):
            self._self = instance # <<<< binding ahoy !
            return self
        def __call__(self, *args, **kwargs):
            return self._target(self._self, *args, **kwargs)
    class C:
        foo = Method(lambda self, *args, **kwargs:
                     print(self, args, kwargs))
    c = C()
    print(c)
    c.foo(1, 2, foo = "bar")
prints out
    <__main__.C object at 0x00ADA0D0>
    <__main__.C object at 0x00ADA0D0> (1, 2) {'foo': 'bar'}
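The Method class above stores the instance on the shared Method object, so a reference fetched earlier through one instance gets silently rebound by a later access through another. A slightly more careful sketch, assuming the standard types.MethodType, returns a fresh bound object on every access. The bar function at the end is a hypothetical plain function, included only to show that ordinary functions carry the same __get__ hook.

```python
import types

class Method:
    """Hypothetical descriptor mimicking how functions bind to instances."""

    def __init__(self, target):
        self._target = target

    def __get__(self, instance, owner):
        if instance is None:
            # Fetched from the class: hand back the plain function
            return self._target
        # A new bound object per access, so c1.foo and c2.foo share no state
        return types.MethodType(self._target, instance)

class C:
    foo = Method(lambda self, *args, **kwargs: (self, args, kwargs))

c1, c2 = C(), C()
f1, f2 = c1.foo, c2.foo
print(f1(1, k=2))
print(f2())

def bar(self):
    return self

# Plain functions are descriptors too: __get__ produces the bound method
bound = bar.__get__(c2, C)
print(bound() is c2)
```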
And so I could demonstrate a reimplementation of a major language feature in a few lines. It may not be obviously useful most of the time, but such experience certainly makes you understand the language better.
One more thing: have I told you Python was cool? :)
To be continued...
6 comments:
the descriptor protocol example is not correct (unless you're using py3k which I haven't tried yet - then you should state this!). All classes should be new-style (otherwise getters will work, but setters won't), and there's no format() method on strings up to python2.5
I do indeed use Python3k, which can be seen from
print(stuff)
"{0:s}".format(s)
which are a sure telltale.
You are right, with Python 2.x some of the samples may need to be slightly modified. The idea stays the same though. Will have the post modified.
I suspected that, but I couldn't find any reference to it in the article :-) I was just fearing that you might have been mixing different versions.
This was insightful for me. I'm fairly new to Python and haven't yet found many explorations of the way __whatever__ type methods can be toyed with.
This was instructive and has given me some fun things to explore on my own since that have really expanded and clarified my thinking about how things happen in Python (both 2.7 and 3k, btw).
This was much better (and more fun!) than, say, reading a book about Python fundamentals that can't escape the Java-esque idioms of most (bad) programming book authors.
FYI, the following code from the above example did not work. Could you explain why?
class C:
def foo(self):
print(self)
if __name__=="__main__":
C.foo("123")
/System/Library/Frameworks/Python.framework/Versions/2.7/bin/python2.7 /Users/trambone/PycharmProjects/DataPatterns/Other/DotOperator/test.py
Traceback (most recent call last):
File "/Users/trambone/PycharmProjects/DataPatterns/Other/DotOperator/test.py", line 6, in <module>
C.foo("123")
TypeError: unbound method foo() must be called with C instance as first argument (got str instance instead)
Process finished with exit code 1
hmm, the formatting on the code post was lost. It should be indented as the code in the blog...