June 30, 2008

Block-drawing characters in Firefox. WTF ?

Unicode defines a family of characters shaped like boxes of increasing height. Presumably useful for drawing diagrams in text. Something like this
x
xx
xxx
only fancier. The 8 characters under discussion have code points 0x2581-0x2588, and range from 1/8th to 8/8ths, i.e. full block. Here is a sample:
▁▂▃▄▅▆▇█
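For the curious, the same row can be rebuilt from the code points alone. This is just a minimal sketch; the variable names are mine, only the code point range comes from the text above.

```python
import unicodedata

# Code points quoted in the text: 0x2581 (1/8 block) through 0x2588 (full block).
FIRST, LAST = 0x2581, 0x2588

# Build the row of eight characters from their code points.
blocks = "".join(chr(cp) for cp in range(FIRST, LAST + 1))

# List each code point with its official Unicode name (ASCII-only output).
for cp in range(FIRST, LAST + 1):
    print(f"U+{cp:04X}  {unicodedata.name(chr(cp))}")
```

Adding print(blocks) shows the row itself, provided the terminal and font can display these characters.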
Now, correct me if I'm wrong, but those characters are only useful as long as they are aligned with each other. You can't draw a diagram if one box is slightly offset - it turns out ugly. And so, can anyone tell me why Firefox 2 renders the 4/8ths (half block, code point 0x2584) and the 8/8ths (full block, code point 0x2588) shifted down a little ? Here, have a look:
This glitch makes them practically useless. WTF ?

June 22, 2008

Different ways to understand things in software engineering

The proficiency of a software developer is determined not only by which technologies he has used or for how long, but more importantly by how exactly he understands and interprets the principles behind them. Because the basic principles of software engineering are so numerous and often not specified formally, the individual developer's view means everything.

In the course of work, a developer adapts his understanding to the problems he is working on, somewhat like the way the shapes of a key and a lock match. For this reason two people may have used the same technology for the same number of years and still be totally unable to understand each other, to the point of waging religious wars over the simplest points.

Now I understand why, whenever I have a chance to interview a job applicant, I ask rather unspecific questions, even of a philosophical kind - to see not what he knows, but how he actually understands it and whether his understanding matches mine. Because if it doesn't, we'd probably have a hard time working together.

The difficult part here is trying to keep your knowledge deep and broad at the same time, because both the details and the perspective are required to understand.

June 17, 2008

The set of good programmers is still very small, a great joke by David Parnas

A reviewer explained his rejection of my best-known paper on the subject by writing, "Obviously Parnas does not know what he is talking about because nobody does it that way". Only a decade later, however, a textbook stated, "Parnas only wrote down what all good programmers did anyway". A logician would conclude that the set of good programmers was empty; that set is still very small.

-- David L. Parnas

This is Python, everything is executable

A dynamically typed language is by definition one where variables don't have types, but the actual values do. This is by all means true in Python, where the following code works fine
x = 10
x = "ten"
print(x) # prints ten
but limiting dynamism to untyped variables only would be missing the point.

As with many "scripting" languages, a Python program is started by passing the name of its main module to the "interpreter", such as
c:> python main.py
The transition from Python source code to an actually executing program begins with loading and parsing the module file. This step succeeds as long as the module does not contain any syntax errors. Successful parsing only guarantees that the module is not totally broken - a weak guarantee, only useful for catching unbalanced parentheses and such.

What happens next is magic - the parsed module file is executed as though it was just a chunk of source code. Wait a minute ! It is a chunk of source code ! Anyway, execution of every module at its first import is the major part of a Python program run. This process is identical no matter whether the module being loaded is the program's main module or some other module explicitly imported on demand.
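The parse-then-execute sequence can be imitated by hand. The sketch below is not what the import machinery literally does (it skips module lookup, caching and so on); the module name and source text are made up for illustration.

```python
import types

# Hypothetical module source; normally this text would be read from a .py file.
SOURCE = '''
print("module body is running")
answer = 6 * 7

def greet(name):
    return "hello, " + name
'''

# Step 1: parse and compile. A syntax error would be raised right here.
code = compile(SOURCE, "demo_module.py", "exec")

# Step 2: execute the compiled code inside a fresh, empty module namespace.
module = types.ModuleType("demo_module")
exec(code, module.__dict__)

# Everything the module "defines" exists only because its body has just run.
print(module.answer)
print(module.greet("world"))
```

Note that the print call in the module body fires during step 2, exactly as it would on a first import.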

I came to Python from C++, and it took me a long time to change my perspective. The change is this - in Python you should look at everything as though it is an executable statement, because it really is. To illustrate this principle, consider definitions vs. declarations.

In statically typed languages, declarations exist for the sake of separate compilation - for the compiler to be able to tell whether one part of the code is compatible with another without having to dig through the entire program. In Python, which is a dynamic language, there is no separate compilation stage, therefore declarations are useless, and what's left only looks like definitions.

For example, where in C++ you have two files (if you do it properly)
// foo.h                      // foo.cpp
class Foo                     int Foo::GetX(void) const
{                             {
private:                          return x;
    int x;                    }
public:
    int GetX(void) const;
};
the .h file is a declaration - your promise to the compiler that you will provide the matching implementation - and the .cpp file is that promise fulfilled. In Python there is no separate compile step to make promises to, so you don't have to feel obliged. The equivalent code in Python would be
class Foo:
    def get_x(self):
        return self.x
What you see in this Python code is neither a declaration nor a definition. It is a piece of executable code, which, when executed, introduces a new class to the containing module's namespace. Rewritten to its actual effect in pseudocode it would look like this:
class Foo:                    temp1 = new class()

    def get_x(self):          temp2 = new method()
        return self.x         temp2.__code__ = return self.x
                              temp1["get_x"] = temp2

                              module["Foo"] = temp1
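The pseudocode has a close real-world cousin: the built-in type() called with three arguments builds a class object at runtime. This is only a sketch of the idea, not a claim about what the class statement does internally step by step.

```python
# An ordinary function; it only becomes a method once it is placed in a class.
def get_x(self):
    return self.x

# type(name, bases, namespace) creates a new class object at runtime.
Foo = type("Foo", (), {"get_x": get_x})

# The familiar spelling, for comparison.
class Bar:
    def get_x(self):
        return self.x

foo = Foo()
foo.x = 10
bar = Bar()
bar.x = 10

print(foo.get_x(), bar.get_x())
```

Both spellings end up binding a name to a class object; the class statement is the convenient one.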
What you just saw was an illustration that a Python class definition is an executable statement, just like anything else and it executes once when the module is first imported. For example, it is possible to do something like C++'s conditional compilation:
import sys

class Foo:
    if sys.platform == "win32":
        def do_it(self): # windows way
            ...
    else:
        def do_it(self): # unix way
            ...
The effect of the above code is that when the module is imported, the compiled version of class Foo will contain method do_it matching the current environment. It is not the same as the straightforward approach, where the check would have been performed upon each call to do_it:
import sys

class Foo:
    def do_it(self):
        if sys.platform == "win32": # windows way
            ...
        else: # unix way
            ...
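The difference between the two approaches can be made visible by counting how often the platform check runs. The helper is_windows and the class names below are invented for this sketch.

```python
import sys

# Counts how many times the platform check runs in each style.
checks = {"definition": 0, "call": 0}

def is_windows(counter):
    checks[counter] += 1
    return sys.platform == "win32"

class ChosenOnce:
    # This check runs exactly once, while the class body executes.
    if is_windows("definition"):
        def do_it(self):
            return "windows way"
    else:
        def do_it(self):
            return "unix way"

class ChosenEachCall:
    def do_it(self):
        # This check runs again on every single call.
        if is_windows("call"):
            return "windows way"
        return "unix way"

for _ in range(3):
    ChosenOnce().do_it()
    ChosenEachCall().do_it()

print(checks)
```

After three calls to each method, the first counter still stands at one while the second has reached three.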
In a similar vein, your class definition could fail to execute:
class Foo:
    1 / 0 # this throws at import time
and the module will fail to import, throwing an exception to the caller.
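A small sketch makes the failure tangible without breaking a whole module: wrap the class statement in try/except and look at what is left afterwards. The class name Broken is made up.

```python
error = None

try:
    class Broken:
        before = "this line runs"
        1 / 0                        # raises while the class body is executing
        after = "this line never runs"
except ZeroDivisionError as exc:
    error = exc

# The body blew up half way, so the name Broken was never bound.
print(type(error).__name__)
print("Broken" in globals())
```

The class statement behaves like any other statement that raises: nothing after the failing line runs and no class is created.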

Now it should not surprise you in the least that when one module imports another it is again not a declaration. When module foo does
import bar
the described process repeats for module bar, unless it has already been imported, in which case the import statement does nothing (from the discussed point of view). Similarly, you can import modules as you need them at runtime:
if need_time:
    import time
    print(time.time())
Python therefore does not have any declarative semantics, only executional - ask yourself - what does it do when executed ?
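The "executes once, then does nothing" behaviour of import can be checked with a throwaway module. Everything named here (import_demo_mod, the IMPORT_DEMO_COUNT environment variable) is invented for the sketch; the module simply bumps a counter each time its body runs.

```python
import importlib
import os
import sys
import tempfile

MODULE_NAME = "import_demo_mod"  # hypothetical throwaway module
SOURCE = (
    "import os\n"
    "count = int(os.environ.get('IMPORT_DEMO_COUNT', '0')) + 1\n"
    "os.environ['IMPORT_DEMO_COUNT'] = str(count)\n"
)

os.environ.pop("IMPORT_DEMO_COUNT", None)

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, MODULE_NAME + ".py"), "w") as f:
        f.write(SOURCE)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        first = importlib.import_module(MODULE_NAME)   # module body executes
        second = importlib.import_module(MODULE_NAME)  # found in sys.modules
        runs_after_two_imports = os.environ["IMPORT_DEMO_COUNT"]

        del sys.modules[MODULE_NAME]                   # make Python forget it
        third = importlib.import_module(MODULE_NAME)   # body executes again
        runs_after_reimport = os.environ["IMPORT_DEMO_COUNT"]
    finally:
        sys.path.remove(tmp)
        sys.modules.pop(MODULE_NAME, None)

print(first is second, runs_after_two_imports, runs_after_reimport)
```

The second import hands back the very same module object without running anything; only after the entry is removed from sys.modules does the body run a second time.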

To be continued...

June 16, 2008

This is Python, language installation and program structure

Installing Python is easy. If you use Windows, you have no choice at all - run setup.exe and you are done. Under Unix, Python may be preinstalled, or you can install it manually. I always prefer to install from source on a clean machine, but if you have it preinstalled, you should be fine too.

A Python installation is fully self-contained and can be migrated to a different machine by copying all the files (or just the necessary ones) from c:\pythonXX or /usr/local/whatever/ to the destination. Multiple versions of Python coexist peacefully in different directories (although you should copy them around manually, because the installation process writes to the Windows registry and does other such things of global effect).

A Python installation essentially contains the language parser+compiler and a huge and poorly structured standard library. The compiler itself, along with a minimum set of libraries, lives in pythonXX.dll or pythonXX.so.1, and the executable python.exe or bin/python is nothing but the simplest driver program of the (read line, execute, repeat) sort. The standard library lives in c:\pythonXX\lib + DLLs or /usr/local/lib/pythonXX/ and is just a heap of assorted utilities.

Python can be and is easily embedded into another application. It is a DLL, remember ? You take the DLL, zip the standard library and there you have it in two files - an embedded Python. In your application you create an instance of a compiler at runtime and start feeding it with stuff, that's all. Python can also be embedded into a diskless machine, it works just fine in a very restricted environment (such as the high security FreeBSD CD that I have here).

A Python program consists of a set of separate modules; each module is a separate .py file containing some source code. The program is therefore available to the language in source form, but Python is nevertheless not an interpreter. As each module is about to be used at runtime, it is loaded, parsed and compiled to intermediate byte code for a virtual machine. The compiled byte code is saved alongside the original source file in an identically named .pyc file for future reuse. The outcome is the same as with Java or C# or any other language that translates source into byte code; the difference is that in Python there is no separate compilation stage as such - the translation is performed at runtime and is in fact an important part of program execution.
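The translation to byte code can also be triggered by hand with the standard py_compile module. In this sketch the file names are made up, and the .pyc target is passed explicitly so that it lands next to the source as described above; left to itself, Python 3 puts cached byte code into a __pycache__ subdirectory instead.

```python
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    source_path = os.path.join(tmp, "hello.py")     # hypothetical module
    bytecode_path = os.path.join(tmp, "hello.pyc")  # explicit target next to it
    with open(source_path, "w") as f:
        f.write("x = 6 * 7\n")

    # Translate the source file to byte code and write the .pyc file.
    py_compile.compile(source_path, cfile=bytecode_path, doraise=True)
    pyc_exists = os.path.exists(bytecode_path)
    pyc_size = os.path.getsize(bytecode_path)

# The same translation done purely in memory, followed by execution.
code = compile("x = 6 * 7\n", "hello.py", "exec")
namespace = {}
exec(code, namespace)

print(pyc_exists, pyc_size > 0, namespace["x"])
```

The code object returned by compile() is what the virtual machine actually runs; the .pyc file is just that object saved to disk behind a small header.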

To be continued...

June 11, 2008

This is Python, intro

Having written Python for a few years now, I still get a kick out of it. A wonderful and very powerful language, if used in the right way (aren't they all like that ?). No matter if the code base runs to megabytes, I still occasionally sit back in silence, admiring the beauty of a little code fragment or the way an idea is expressed in code.

If there is one single snippet of Python code to introduce its power of simplicity, it would be the swapping of two variables. When I found it long ago in the Python Cookbook, it hit me like thunder. My understanding of Python was never the same again.

Here goes. To swap two variables in Python you write

a, b = b, a

This uses a Python feature called automatic tuple packing/unpacking. What's actually going on is more like

a, b <<< unpacked <<< (b, a) <<< packed <<< b, a

where (b, a) is Python notation for an immutable sequence called a tuple.
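The two halves of the swap can be pulled apart into separate statements. A minimal sketch, with the intermediate name packed being mine:

```python
a, b = 1, 2

# Packing: the comma-separated values on the right become one tuple.
packed = b, a
print(type(packed).__name__, packed)

# Unpacking: the tuple is spread over the names on the left.
a, b = packed
print(a, b)

# The one-liner does both steps at once and swaps the values back.
a, b = b, a
print(a, b)
```

The right-hand side is always evaluated in full before any name on the left is rebound, which is why no temporary variable is needed.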

To be continued...

June 06, 2008

Slow pace of software development

If you need it fast, make sure it *looks* good, because it won't be. Pay more attention to high quality advertisement than to high quality development, but note that once you start selling promises, you are unlikely to deliver a product.