January 10, 2011

Notes on implementing a MongoDB driver

It's been a week-long statutory holiday in Russia and I spent a few afternoons implementing MongoDB support for the Python 3 application framework Pythomnic3k.

It turned out to be a simpler task than I thought, a few odd hours here and there, even though I wanted to implement everything from the ground up. Still, there are a few notes I'd like to share.

Note #1: BSON

BSON is "Binary JSON", a binary format for serializing JSON-like documents for transmission or storage. It sets the byte representation rules for simple types such as "Integer" or "String", and also for MongoDB-specific structures such as "Regex" or "JavaScript with scope".

Supporting BSON is the #1 requirement for a MongoDB driver. Essentially, BSON is all there is to it.

Its entire specification is just two pages long and describes formats for 20 different objects, yet for such a small spec it is notably awkward. There are different ways to serialize similar objects, there are deprecated objects, and there are mysterious objects with unspecified format, probably reserved for internal use.

For example, String is serialized as
(length)(bytes)(NULL)
but Regex is serialized as just
(pattern)(NULL)(options)(NULL)
Also a key/value tuple is serialized as
(value type)(key)(NULL)(value)
which is strange because while deserializing you cannot simply read the key, then the value type, and then switch to an appropriate type parser, which would be possible for
(key)(NULL)(value type)(value)
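To make the difference concrete, here is a minimal Python sketch of the two string encodings and of the element layout. It follows the public BSON spec as I read it (a string carries an int32 little-endian length that counts the trailing NULL, a cstring has no length at all, type byte 0x02 is a UTF-8 string and 0x0B is a regex). The function names are made up for illustration and are not taken from any driver.

```python
import struct

def encode_cstring(text):
    # cstring: UTF-8 bytes followed by a single NULL, no length prefix
    data = text.encode("utf-8")
    if b"\x00" in data:
        raise ValueError("a cstring cannot contain a NULL byte")
    return data + b"\x00"

def encode_string(text):
    # string: int32 length (little-endian, counts the trailing NULL), bytes, NULL
    data = text.encode("utf-8") + b"\x00"
    return struct.pack("<i", len(data)) + data

def encode_string_element(key, value):
    # element layout: (value type)(key)(NULL)(value); 0x02 marks a UTF-8 string
    return b"\x02" + encode_cstring(key) + encode_string(value)

def encode_regex_element(key, pattern, options=""):
    # 0x0B marks a regex: two cstrings (pattern, options), neither length-prefixed
    return (b"\x0b" + encode_cstring(key)
            + encode_cstring(pattern) + encode_cstring(options))
```

Note that the key itself is always a cstring, so a parser has to scan for the NULL byte even when the value that follows is length-prefixed.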

Note #2: No response to packets

The application communicates with the MongoDB engine using request packets: OP_THIS, OP_THAT and OP_SO_ON. To some of them the database responds and to some it does not. In fact, it prefers to remain silent, responding only when it has to.

For example, if you want to find something, you send OP_QUERY and you do receive a response, OP_REPLY, containing the data you've been looking for. But if you want to insert data, you send OP_INSERT and (surprise) there is no response.

I understand that with MongoDB there are no guarantees of data integrity, no persistence and no transactions, but now there is not even an ack from the database. It makes me feel uncomfortable.
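A minimal sketch of what this looks like from the driver side, assuming the standard 16-byte message header of the MongoDB wire protocol (four little-endian int32 fields: message length, request id, response-to, opcode) and the documented opcode numbers. The helper names build_message and expects_reply are hypothetical.

```python
import struct

# Opcode numbers as listed in the MongoDB wire protocol documentation
OP_REPLY = 1
OP_INSERT = 2002
OP_QUERY = 2004
OP_GET_MORE = 2005

# Only these requests are answered with an OP_REPLY
OPCODES_WITH_REPLY = {OP_QUERY, OP_GET_MORE}

def build_message(request_id, opcode, body):
    # Header: messageLength, requestID, responseTo, opCode (int32 LE each).
    # messageLength includes the 16 bytes of the header itself;
    # responseTo is 0 for a request.
    length = 16 + len(body)
    return struct.pack("<iiii", length, request_id, 0, opcode) + body

def expects_reply(opcode):
    # The driver must know this in advance, otherwise it would block
    # forever waiting for a response to OP_INSERT.
    return opcode in OPCODES_WITH_REPLY
```

The body is passed in already serialized; building it is a matter of BSON encoding plus a few opcode-specific fields.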

Note #3: Explicit ordering of documents

This one is ugly. The main data structure used in BSON, and hence in MongoDB, is a document, which is a dict or associative array. And associative arrays have no order. They have an iteration order, which is arbitrary and implementation dependent, likely derived from the internal hash function. That order may even be well defined, for example smallest key comes first, but that could only work with comparable keys. In any case dicts don't have an n-th element.

The problem appears when a request packet containing a document is serialized to be transmitted. MongoDB relies on the order in which the dict appears on the wire. Specifically, if you send a command to the database, the first key must contain the name of the command. And in some cases the command requires positional arguments to be transmitted in a defined order. So you have to iterate over a dict in a specific order known only to the caller.

For example, if you have a dict
{ "foo": 1, "bar": 2 }
and you want to invoke the command
foo
with it, you have to serialize it like this:
[foo=1, bar=2]
but if the command is
bar
it is the other way around.
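One way a driver could deal with this is to have the caller name the command explicitly and build an ordered document before serializing. The sketch below uses collections.OrderedDict; the helper command_document is hypothetical, and it only guarantees the position of the first key. Commands with positional arguments would still need the caller to pass the remaining keys in the right order.

```python
from collections import OrderedDict

def command_document(command, params):
    """Return an ordered document with the command name as its first key."""
    if command not in params:
        raise KeyError(command)
    doc = OrderedDict()
    doc[command] = params[command]      # the command name must go first
    for key, value in params.items():   # remaining keys keep the caller's order
        if key != command:
            doc[key] = value
    return doc
```

The serializer then simply walks the OrderedDict and emits the elements as it finds them.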

The abysmal
{ 1 => "first", 2 => "second" }
kind of dictarray has also found its way here; it is the way BSON serializes lists, but this is at least an implementation detail hidden from the application inside the driver.
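Inside the driver the conversion is trivial. A minimal sketch, assuming (per the BSON spec) that an array is just a document whose keys are the decimal indexes as strings, starting from "0"; the two helper names are made up.

```python
from collections import OrderedDict

def list_to_document(items):
    # ["first", "second"] becomes {"0": "first", "1": "second"}, in index order
    return OrderedDict((str(i), item) for i, item in enumerate(items))

def document_to_list(doc):
    # Look the values up by index rather than trusting the iteration order
    return [doc[str(i)] for i in range(len(doc))]
```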

Note #4: Similar things in different ways

This is easy to illustrate with the way MongoDB reports errors.

For one, as already mentioned, some of the requests have no response whatsoever even though they might fail.

Then, if you get an OP_REPLY response, it has a bit flag QueryFailure, which if set should be accompanied by a single document with an $err key in it containing the error message. But it also has a bit flag CursorNotFound, which apparently is also an error condition, but then it has no message.

Furthermore, if the request has been a command (which only the caller knows), the OP_REPLY is a success, but it contains a document with an ok key that can be 0, in which case there is also an errmsg key containing the error message.

So in the first case error reporting is done on the packet level, and in the second case on the application level. And in some cases no error is reported at all.
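In a driver all of this ends up folded into one check. The following is only a sketch under a few assumptions: the OP_REPLY flag bits are CursorNotFound = bit 0 and QueryFailure = bit 1 as in the wire protocol documentation, the command reply uses the keys ok and errmsg, and the names MongoError and check_reply are invented for the example.

```python
CURSOR_NOT_FOUND = 1 << 0   # OP_REPLY responseFlags bit 0
QUERY_FAILURE = 1 << 1      # OP_REPLY responseFlags bit 1

class MongoError(Exception):
    pass

def check_reply(flags, documents, is_command=False):
    """Raise MongoError for any of the three error styles, else return documents."""
    if flags & CURSOR_NOT_FOUND:
        # Packet-level error with no message attached
        raise MongoError("cursor not found")
    if flags & QUERY_FAILURE:
        # Packet-level error, message expected under the "$err" key
        message = documents[0].get("$err", "query failure") if documents else "query failure"
        raise MongoError(message)
    if is_command and documents:
        # Application-level error: the packet is fine, the document says otherwise
        doc = documents[0]
        if not doc.get("ok"):
            raise MongoError(doc.get("errmsg", "command failed"))
    return documents
```

The is_command argument is needed precisely because only the caller knows whether the request was a command.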

All this leads to the final

Note #5: Ad Hoc-ish

The exercise of integrating with MongoDB leaves the impression of incompleteness. It feels like this database (and I suspect other currently existing NoSQL databases) is an experiment, an early prototype.

Well, they have 40 years of catching up with relational databases to be as well understood and implemented, it's a long way to go, so I do wish them good luck.


Kristina Chodorow said...

The BSON/protocol stuff is pretty young, but some of the stuff you mention has reasons/solutions:

You can get responses to packets, you just have to ask for them. You ask by sticking a getlasterror command to the end of the write message. Then the getlasterror is guaranteed to execute right after the write, get the result of the write, and you can get it like a normal db response. It's a little different than how relational dbs do things in that it's optional, but it is an option.

The error codes are at different levels because there are different types of errors that can occur and they should be handled differently. $err errors are caused by database asserts, often caused by user errors (e.g., duplicate key exceptions). errmsg is used when something goes wrong running a database command. A get_more basically tries to find a cursor's handle, so it's not crazy to return a special error for failing to do that, versus general database errors.

Dmitry Dvoinikov said...

Thank you for the information. I modified the driver code to always request error information by issuing a getlasterror command. It does work fine, but now I have another issue with error reporting.

The getlasterror command delivers information about the previous operation's packet error, but in yet another way: for example, what was called "$err" in a packet error now becomes just "err".

Next, packet error replies, either received directly or queried by getlasterror, may contain a "code" key. Command errors do not contain codes. There is no code for the "cursor not found" condition. Currently there appears to be no uniform way of determining a numeric error code from just any error. Is that so?