August 22, 2008

This is Python, calling a spade a spade

Python is a high-level programming language, but what does this term mean? What does it mean for a language to be high level or low level? Can you compare the levels of different languages?

The meaning of the term is nebulous and there is no single or final definition. Here is one approach: the more effectively a language allows you to handle things, the higher level it is. And by things I don't mean just objects in the sense of class instances. Things, you know, everything, even if I occasionally call them objects.

Enter the notion of first-class objects. Put simply, something is called a first-class object in a programming language if it can be treated just like an instance of a primitive type, such as int. For example, when you declare a variable (and being able to declare a variable of that kind is already a valuable feature)
int i;
you can then do all sorts of things with it, such as passing it as a parameter:
foo(i);
returning it as the result of a function:
return i;
and do other things, depending on the language. The point is that first-class objects can be handled more effectively and provide additional flexibility. Thus, the more objects in a language are first-class, the higher level that language is.

In Python pretty much everything is first-class. I won't be digging into the language reference to find out whether or not this is formally true, but in practice it is just like that. This is partly because Python is a dynamically typed language with reference semantics for variables: as soon as something exists, you should be able to get a reference to it, and then, once you have a reference, you pass it around as a primitive, not caring about the nature of the object it points to. The language itself does not care what kind of object is referenced by the variable you pass. It is only when it comes to real work, such as accessing the object's methods, that the object may turn out to be incompatible with the operation you throw at it. Such just-in-time type compatibility is a very old idea; in Python it is called "protocol compatibility", or more commonly duck typing.
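To make the just-in-time compatibility check concrete, here is a minimal sketch. The names Duck, Robot and make_it_speak are made up for illustration and are not from any library; the point is only that the function never asks what type it was given.

```python
class Duck:
    def speak(self):
        return "Quack"

class Robot:
    def speak(self):
        return "Beep"

def make_it_speak(thing):
    # No type check here: any object that has a speak() method will do.
    return thing.speak()

print(make_it_speak(Duck()))   # prints Quack
print(make_it_speak(Robot()))  # prints Beep

try:
    # An int can be passed in just as freely as anything else...
    make_it_speak(42)
except AttributeError as e:
    # ...and it fails only when the missing method is actually looked up.
    print("incompatible:", e)
```

Duck and Robot share no base class and declare no interface; they are compatible simply because both happen to support the operation that is asked of them.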

Why is it good? Because I can call a spade a spade. If I need to pass a class as a parameter, what the heck, I can do it:
def create(c, i):
    return c(i)

create(int, 0) # returns 0
See? Generic programming right there.

Or why wouldn't I be able to pass in a function?
def apply(f, x):
    return f(x)

def mul_by_2(x):
    return x * 2

print(apply(mul_by_2, 1)) # prints 2
Uhm, was that functional programming?

One other curious and extremely useful first-class thing, which you wouldn't find in many other languages, is the call arguments. Remember, as I have said before, there are no declarations in Python. Compatibility of a called function with the actually supplied arguments is checked just in time, just like anything else:
def foo(a):
    ...

foo(1, 2) # raises TypeError at runtime
But nothing stops you from writing a function which accepts any arguments:
def apply(f, *args):
    return [f(arg) for arg in args]

apply(mul_by_2, 1) # returns [2]
apply(mul_by_2, 1, 2) # returns [2, 4]
...
And the point is that inside the apply function, args is a variable that references a tuple of the actually passed arguments:
def apply(*args):
    print(args)

apply(1, 2, 3) # prints (1, 2, 3)
There may be a little stretch in calling args a first-class object that represents "the arguments to the call", but in practice it is just that. Imagine the flexibility of the things you can do with it.
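As one example of that flexibility, here is a small sketch of a wrapper that records the arguments of every call and then forwards them untouched. The helper logged and its calls attribute are hypothetical names invented for this illustration, and the use of **kwargs (the keyword-argument counterpart of *args) goes slightly beyond what was shown above.

```python
def logged(f):
    # Hypothetical helper: wraps f so that the arguments of each call are recorded.
    calls = []

    def wrapper(*args, **kwargs):
        # args is a tuple and kwargs is a dict: ordinary objects we can store...
        calls.append((args, kwargs))
        # ...and then unpack again to forward the call unchanged.
        return f(*args, **kwargs)

    wrapper.calls = calls
    return wrapper

def add(a, b=0):
    return a + b

logged_add = logged(add)
print(logged_add(1, 2))     # prints 3
print(logged_add(5, b=10))  # prints 15
print(logged_add.calls)     # prints [((1, 2), {}), ((5,), {'b': 10})]
```

The wrapper knows nothing about the signature of add. It just treats "the arguments to the call" as data.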

Anyway, in conclusion I will demonstrate another situation where calling a spade a spade is good: a state machine, that is, an object with a state and a set of state transition rules. What would it typically look like?
class C:

    def __init__(self):
        self._state = "A"

    def _switch(self, to):
        self._state = to

    def _state_A(self):
        print("A->B")
        self._switch("B")

    def _state_B(self):
        print("STOP")
        self._switch(None)

    def simulate(self):
        while self._state is not None:
            if self._state == "A":
                self._state_A()
            elif self._state == "B":
                self._state_B()

C().simulate() # prints A->B, then STOP
This is a quickly thrown together sample, so please don't be too picky. The problem with it, which I will try to eliminate, is this: there are two ways to represent the same thing, the state. What is the reason for aliasing _state_A by "A" and _state_B by "B"? Oh, the last letter matches, I see... And what's the point of having the state-by-state switch in simulate? Why don't we just call a spade a spade?
class C:

    def __init__(self):
        self._state = self._state_A

    def _switch(self, to):
        self._state = to

    def _state_A(self):
        print("A->B")
        self._switch(self._state_B)

    def _state_B(self):
        print("STOP")
        self._switch(None)

    def simulate(self):
        while self._state is not None:
            self._state()

C().simulate() # prints A->B, then STOP
In this second example I don't have any arbitrary aliases for states; instead, I use each state's own handler as the state. A method which handles a state is the state here. It simplifies things just a bit: the if/elif switch in simulate is gone, and overall it is cleaner and more consistent, to my taste.
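One practical detail if you ever need to ask "which state am I in?" with this approach: compare with ==, not with is. The sketch below is a stripped-down variant of the class above; the trace list and the step method are my additions for illustration. The behaviour of is shown here is what CPython does, where every attribute access such as c._state_A builds a fresh bound method object.

```python
class C:

    def __init__(self):
        self._state = self._state_A
        self.trace = []  # hypothetical addition: records the states visited

    def _state_A(self):
        self.trace.append("A")
        self._state = self._state_B

    def _state_B(self):
        self.trace.append("B")
        self._state = None

    def step(self):
        # Run the handler of the current state exactly once.
        self._state()

c = C()
print(c._state == c._state_A)  # prints True: same function bound to the same object
print(c._state is c._state_A)  # prints False: two distinct bound method objects
c.step()
print(c._state == c._state_B)  # prints True
c.step()
print(c.trace)                 # prints ['A', 'B']
```

So the handler really can serve as the state's identity, as long as the comparison is done by equality.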

Well, that's about what I had to say.

Python being a high-level language... Other factors, such as the wide variety of built-in container types and the huge standard library, also help Python be higher level than many other languages, but that's another story.

To be continued...

12 comments:

Paddy3118 said...

You state "Python is an untyped language".
That's not quite right: Python has types, but a type is not associated with a variable name, it is associated with the value of the name.

- Paddy.

Dmitry Dvoinikov said...

Well, it could have been poor wording from my side, but I meant to say the same thing as you. Check out one of my recent blog entries, it starts with exactly the same statement.

BTW, do you know any practically useful language with no types whatsoever ?

Dmitry Dvoinikov said...

Fixed the "Python is an untyped language" part.

Paddy3118 said...

TCL *used* to have only strings. Numbers and things like lists were stored and re-interpreted from the string representation as needed.

- Paddy.

Dmitry Dvoinikov said...

Isn't it about the same as saying that in all languages everything is stored in bytes and different sequences of bytes are re-interpreted as needed ? Isn't such "re-interpretation" effectively a type casting ?

At some point during execution you have to decide what behavior you expect from this thing that you have, and should it be an integer then it doesn't matter if it has been previously stored in a string or in binary machine word, no ?

Paddy3118 said...

Hi Dmitry,
You get to the same place in the end, but there are efficiency considerations. If you store an integer constant in its binary form, then each time you come to use it you might read a type tag that says it's an int and then use the bits as an int. If it is stored as a string, then each time you need to use the constant you must first convert it from a string to an int (or whatever type the string represents). The second takes longer.

Some languages would keep, for example, a function as the textual string of that function - every time you called the function, it had to be re-interpreted. Python/Perl/Ruby now parse the program into 'byte-code' and interpret the bytecode at a much faster rate.

- Paddy.

Dmitry Dvoinikov said...

Hi Paddy, I can see your point and it is perfectly correct.

But it is slightly different from what we have started at. My point was that "untyped" language is impossible at all (at least impractical) - as soon as you impose certain behavior on a thing - it becomes an instance of some type. You give it a name, it becomes. After all one definition of a type is what - a set of values plus a set of supported operations. And we expect it to support an operation.

Performance considerations do exist in many languages, where primitive types can be handled out-of-band, but it is a different matter.

Anonymous said...

"BTW, do you know any practically useful language with no types whatsoever ?"

*nix shell languages

Anonymous said...

If by "untyped" you mean "weakly typed", then some practically useful languages include PHP and ECMAScript.

http://en.wikipedia.org/wiki/Type_system

Anonymous said...

Is this really a good idea?

def __init__(self):
    self._state = self._state_A

Honest question. Just wondering if it lingers on the borders of clarity?

Dmitry Dvoinikov said...

I guess that's for you to decide, depending on where your borders of clarity are.

reine said...

Right ON, Dmitry!, keep on rocking'em!, (and keep on trucking!!!).