July 15, 2008

This is Python, variable name lookup

As I already noted, there are no declarations in Python. In general, there is no way to tell in advance what an arbitrary piece of code means, whether it is semantically correct, or whether it can be executed successfully. All you have is a syntactically correct code fragment, and the meaning of each symbol stays undetermined until the code is actually executed. For example,
foo = bar
is syntactically correct, but you cannot tell whether the variable bar is defined at that point or what kind of object it refers to. What does this simplest of assignments do? If the variable bar is defined, a local variable foo will reference the same object as bar. Something like this:
current_namespace["foo"] = reference_by_name("bar")
This may be a trivial example, except for the behavior of the fictional reference_by_name function. Where does the language look for a variable? As in other languages that support procedural programming, Python procedures are natural namespace compartments. For example:
def foo(a):     # begins foo's local namespace
    b = 1       # modifies foo's namespace
print(a)        # fails because a is invisible here
print(b)        # same
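The failure is easy to observe. In the following sketch each access from outside foo is wrapped in try/except so that the script keeps running; the names result and outcomes are only illustrative:

```python
def foo(a):              # a and b exist only in foo's local namespace
    b = 1
    return a + b

result = foo(1)          # inside foo both names are visible

outcomes = {}
try:
    print(a)             # a was never bound at module level
except NameError:
    outcomes["a"] = "NameError"
try:
    print(b)             # neither was b
except NameError:
    outcomes["b"] = "NameError"

print(result, outcomes)
```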
Each procedure's individual namespace is called a "local namespace" in Python terms. Namespaces of nested procedures nest along with their definitions, so a name inside an inner procedure may refer to a variable defined in an outer one:
def foo():
    b = 1
    def bar():
        print(b)    # prints 1
    bar()
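One way to see the link between the two namespaces is to inspect the inner procedure. This is only a sketch: here foo returns bar instead of calling it, a change made purely so that bar can be examined from outside, and free_names and value are illustrative names.

```python
def foo():
    b = 1
    def bar():
        return b                        # b is found in foo's namespace
    return bar

bar = foo()
free_names = bar.__code__.co_freevars   # names bar borrows from enclosing scopes
value = bar()                           # foo has returned, yet b is still reachable

print(free_names, value)
```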
On the other hand, whether a name is actually bound in a namespace is determined dynamically, at the moment of access, rather than being fixed statically, and this invites all sorts of awkward ambiguities like
def foo():
    b = 1
    del b           # would have deleted b from foo's namespace,
    def bar():      # but could not be done, because this nested
        print(b)    # reference to b would hang (ouch !)
    bar()
and
def foo():                              def foo():
    def bar():                              def bar():
        print(b) # prints 1   -VS-              print(b) # fails because b is
    b = 1                                   bar()        # only almost there
    bar()                                   b = 1
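Both variants can be run side by side. In this sketch the function names defined_before_call and defined_after_call are made up for illustration, and values are returned rather than printed so that the outcome can be checked:

```python
def defined_before_call():
    def bar():
        return b
    b = 1
    return bar()                # b is bound by the time bar runs

def defined_after_call():
    def bar():
        return b
    try:
        outcome = bar()         # b belongs to this scope but is not bound yet
    except NameError:
        outcome = "NameError"
    b = 1
    return outcome

first = defined_before_call()
second = defined_after_call()
print(first, second)
```

The two procedures differ only in the order of two statements, which is exactly what makes this kind of code fragile.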
Unless you want to maintain such ugly code, you should minimize the use of foreign variables in nested scopes and pass arguments instead. A procedure's arguments automatically become part of its local namespace, so every variable accessed this way is explicitly local:
def foo():
    b = 2
    def bar(b):     # explicitly local, no possible ambiguity
        print(b)    # prints 1
    bar(1)
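With the value passed as an argument, the order of statements in the outer procedure stops mattering. A small sketch of that, with illustrative names inner and outer:

```python
def foo():
    def bar(b):          # b is bar's own local name
        return b
    inner = bar(1)       # works even though foo's b is not bound yet
    b = 2
    return inner, b      # bar saw its argument, foo's b is independent

inner, outer = foo()
print(inner, outer)
```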
Nevertheless, it is convenient to visualize name resolution as scanning the chain of nested scopes upwards:
module.py:
5) is b here?
def foo():
    4) is b here?
    def bar():
        3) is b here?
        def biz():
            2) is b here?
            def baz():
                1) is b here?
                a = b
Note that in step 5 the containing module becomes an implicit embracing namespace, the last of the enclosing scopes in which the name can be found. In Python this module namespace is called the "global namespace". Finally, in addition to the local and global namespaces, there is a "built-in namespace", which contains the language primitives that are not explicitly defined anywhere.

Therefore, even the simplest access to a variable is a lookup in three kinds of namespaces - local (together with those of any enclosing procedures), global and built-in, in that order.
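The fictional reference_by_name from the beginning of this post can be sketched along these lines. This is a simplified model that uses plain dictionaries as namespaces, not what the interpreter actually does; local_ns and global_ns are hypothetical stand-ins for real namespaces.

```python
import builtins

def reference_by_name(name, *namespaces):
    """Scan the given namespaces, innermost first, then the built-ins."""
    for namespace in namespaces:
        if name in namespace:
            return namespace[name]
    if name in vars(builtins):              # last resort: built-in namespace
        return vars(builtins)[name]
    raise NameError("name %r is not defined" % name)

local_ns = {"bar": 42}
global_ns = {"bar": "global bar", "baz": "global baz"}

current_namespace = {}
current_namespace["foo"] = reference_by_name("bar", local_ns, global_ns)

found_baz = reference_by_name("baz", local_ns, global_ns)   # falls through to global_ns
found_len = reference_by_name("len", local_ns, global_ns)   # falls through to built-ins

print(current_namespace, found_baz, found_len)
```

A name bound in an inner namespace hides the same name further out, which is why bar resolves to 42 rather than to the global value.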

To be continued...

4 comments:

Anonymous said...

Which version of Python are you using? For me the following code doesn't hang:

def foo():
    b = 1
    del b           # would have deleted b from foo's namespace,
    def bar():      # but could not be done, because this nested
        print(b)    # reference to b would hang (ouch !)
    bar()

In fact it gives:

File "x.py", line 3
del b # would have deleted b from foo's namespace,
SyntaxError: can not delete variable 'b' referenced in nested scope

Which is what I would expect. This is using Python 2.4. Python doesn't hang if it cannot find a name, it simply raises NameError.

Dmitry Dvoinikov said...

Ahem... I said that the _reference_ would hang, not the code. And yes, with my Python3k I have the same exception outcome as you.

eelgoois said...

Is there a cross-module name space in Python? At least, can I have a true reference to the other module's variable?

If I import another module, I can read from and write to its variables prefixed with the module name. But creating a reference to the other module's variable will expose unexpected copy-on-write semantics.

a.py:

v = 1

b.py:

import a

print a.v # 1

w = a.v

w = 2

print a.v # 1

eelgoois said...

My mistake. The cross-module references are no different from the module's own references. My confusion arose from a misunderstanding of Python variables. They always return a reference to their content and will re-seat to the right-hand value's reference, unlike C++ references.