ZODB is a Python object-oriented database engine.
- New data is only ever appended to the database file. This makes ZODB very robust against corruption.
- Database file must periodically be "packed" to delete old data.
- Complete index of all objects is kept in memory.
- A write transaction can fail if someone else writes first, updating already read objects. In this case, the transaction must be repeated. (I quite like this approach.)
- Currently, reads can fail if one of the objects read is updated. This is dumb, and is being fixed with multiversion concurrency control (MVCC).
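To make the write-conflict rule concrete, here is a minimal sketch in plain Python. It is not ZODB's API: `Store`, `Transaction`, `ConflictError` as defined here, and `run_with_retry` are all made-up names, and the store is just a dict with a version counter per object. The only point is the shape of "commit fails if something you read was written first, so run the transaction again".

```python
class ConflictError(Exception):
    """Raised when an object read by a transaction changed before commit."""


class Store:
    """Hypothetical in-memory store; each object has a version counter."""

    def __init__(self):
        self.data = {}      # key -> value
        self.versions = {}  # key -> number of committed writes

    def begin(self):
        return Transaction(self)


class Transaction:
    def __init__(self, store):
        self.store = store
        self.read_versions = {}  # key -> version seen on first read
        self.writes = {}         # key -> pending value

    def get(self, key, default=None):
        if key in self.writes:
            return self.writes[key]
        # Remember which version we saw, so commit can detect later writes.
        self.read_versions.setdefault(key, self.store.versions.get(key, 0))
        return self.store.data.get(key, default)

    def set(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Fail if anything we read has been written by someone else since.
        for key, seen in self.read_versions.items():
            if self.store.versions.get(key, 0) != seen:
                raise ConflictError(key)
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.versions[key] = self.store.versions.get(key, 0) + 1


def run_with_retry(store, work, attempts=3):
    """Run work(txn) in a fresh transaction, repeating it on conflict."""
    for _ in range(attempts):
        txn = store.begin()
        result = work(txn)
        try:
            txn.commit()
        except ConflictError:
            continue
        return result
    raise ConflictError("gave up after %d attempts" % attempts)
```

The caller passes the whole unit of work as a function, so repeating it is cheap to express. That is most of why I like the approach.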
The need to pack and to keep the object index in memory make ZODB kind of sucky.
I don't like the need to "pack" because it means that periodically all writes to the database must stop. Instead I propose:
- Database is stored in multiple, sequentially numbered files.
- Data in the oldest file is progressively moved to the latest file. When all the old data has been moved, the old file can be deleted (or moved to an archive). Typically a new file would be created at this point, to keep the total number of active files constant.
- Data structure can be optimized while being copied. For example trees re-balanced, or overwritten items actually removed.
- Minimum of two active files. More files will be more space efficient.
- Cute feature: It is ok if an old file is still being read when it is deleted. Deleting a file simply unlinks it from the filesystem; a process with an open filehandle can still read it. (Note: this won't work on NFS--d'oh)
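A rough sketch of the progressive pack, simulated in memory. Everything here is an assumption made for illustration: `SegmentedLog` is a made-up name, each "file" is just a Python list of (key, value) records, and the index is an ordinary dict. Writes always go to the newest segment, and each `pack_step` moves a few records out of the oldest one, dropping any that have since been overwritten.

```python
from collections import deque


class SegmentedLog:
    """Hypothetical append-only store split across numbered segments.

    Each segment stands in for one database file. `index` maps a key to the
    (segment number, position) of its live record.
    """

    def __init__(self):
        self.segments = deque([(0, []), (1, [])])  # oldest first, two active
        self.index = {}
        self.next_number = 2

    def put(self, key, value):
        # New data is only ever appended to the newest segment.
        number, records = self.segments[-1]
        records.append((key, value))
        self.index[key] = (number, len(records) - 1)

    def get(self, key):
        number, pos = self.index[key]
        for seg_number, records in self.segments:
            if seg_number == number:
                return records[pos][1]
        raise KeyError(key)

    def pack_step(self, batch=1):
        """Move up to `batch` records out of the oldest segment.

        Returns True once the oldest segment is empty and has been retired.
        """
        number, records = self.segments[0]
        moved = 0
        while records and moved < batch:
            pos = len(records) - 1
            key, value = records.pop()
            # Copy only live records; overwritten ones are simply dropped.
            if self.index.get(key) == (number, pos):
                self.put(key, value)
            moved += 1
        if records:
            return False
        # Oldest segment is empty: retire it and open a fresh one so the
        # number of active segments stays constant.
        self.segments.popleft()
        self.segments.append((self.next_number, []))
        self.next_number += 1
        return True
```

Because each step is small, ordinary writes can be interleaved between calls to `pack_step`, which is the whole point: nothing ever has to stop for a pack.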
I also don't like the need to store the index in memory:
- Instead use functional programming approach: every object is const, every transaction updates root object.
- Root object would be an ObjectId->Object mapping, so that the system appears to be mutable. The "real" root would be the object with ObjectId zero, just as in ZODB.
- Mutability implies a need for garbage collection. This can safely be done concurrently (MVCC). The garbage collector just writes a transaction mutating deleted objects to some kind of "delete-me" marker; the progressive pack actually does the deletion.
- I seem to recall Andrew Clausen talking about some really nice data-structure for this. Or maybe just use a splay tree^H^H^H^H^H^H^H^H 2-3 tree.
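Here is a small sketch of the "every object is const, every transaction updates the root" idea. It uses a plain unbalanced binary search tree with path copying, not a 2-3 tree, purely to keep the example short; `Node`, `lookup` and `store` are names I made up. Each call to `store` returns a new root and leaves the old one intact, so readers holding an old root keep a consistent view.

```python
from typing import Any, NamedTuple, Optional


class Node(NamedTuple):
    """Immutable tree node binding one object id to one object."""
    oid: int
    obj: Any
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def lookup(root, oid):
    """Find the object bound to oid under the given root."""
    node = root
    while node is not None:
        if oid == node.oid:
            return node.obj
        node = node.left if oid < node.oid else node.right
    raise KeyError(oid)


def store(root, oid, obj):
    """Return a new root with oid bound to obj; the old root is untouched.

    Only the nodes on the path from the root to oid are copied. Every other
    subtree is shared between the old and the new version.
    """
    if root is None:
        return Node(oid, obj)
    if oid == root.oid:
        return root._replace(obj=obj)
    if oid < root.oid:
        return root._replace(left=store(root.left, oid, obj))
    return root._replace(right=store(root.right, oid, obj))


# Three "transactions", each producing a new root.
r1 = store(None, 0, "root v1")
r2 = store(r1, 5, "leaf")
r3 = store(r2, 0, "root v2")
```

In an on-disk version the child links would be file offsets rather than Python references, and only the location of the latest root would need to be found at startup, so no full index has to live in memory.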
Some nice features that should be possible (Aether][):
- Fast boot time. Plausible to use from CGI script without having a server process.
- Usable over NFS.
- Thus possible to use on bog standard virtual-hosting clusters.
btw... What should go in a database?
- Data? Zope: yes. SQL: yes. Me: yes.
- Source code? Zope: sometimes. SQL: mostly constraints and consistency checks. Me: I would rather gnaw off my own leg.
- HTML/templates? Zope: site specific HTML and templates. SQL: no. Me: only "content" HTML. "Architectural" HTML and templates should be treated like source code.
- Schema? Zope: yes, external code has schema installation procedure. SQL: yes, created by SQL script. Me: this contradicts zero-install (see previous blog entry), so no. An implication of this is that there should not be "folder" or "table" objects in the database, just leaves. This can be alternatively viewed as "one big table", in which things like type and location are just part of the primary key of objects.
- It is possible not to shoot yourself in the foot with this approach. Rather like Python. (Zope is like Perl, always ugly. SQL is like Java, tedious but hard to go wrong (so long as you use explicit foreign keys).)
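To illustrate the "one big table" view from the schema point above, here is a toy sketch in which type and location are simply parts of each object's key. The key layout (type, location, name), the sample records and the `select` helper are all invented for the example.

```python
# One flat mapping, no "folder" or "table" objects: just leaves.
# Hypothetical key layout: (type, location, name).
table = {
    ("page", "/docs", "intro"): {"title": "Introduction"},
    ("page", "/docs", "faq"): {"title": "FAQ"},
    ("page", "/blog", "zodb"): {"title": "Thoughts on ZODB"},
    ("comment", "/blog", "c1"): {"text": "Nice post"},
}


def select(table, kind=None, location=None):
    """Return the sorted keys matching the given type and/or location."""
    return sorted(
        key
        for key in table
        if (kind is None or key[0] == kind)
        and (location is None or key[1] == location)
    )


pages_in_docs = select(table, kind="page", location="/docs")
everything_in_blog = select(table, location="/blog")
```

A "folder listing" or a "table scan" is then just a query over part of the key, and nothing has to be installed before the first object is written.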