April 04, 2003
Exploring Object/Relational Mapping in Python
Long winded title that one, I'll have to revise it for future posts. This is the first in what I hope will be a series of posts. It's a learn-as-you-go exercise, just like the rest of this blog, so please bear with me whilst I veer all over the place merrily making mistakes as I go.
I'm interested in object/relational mappers for a number of reasons. Primarily because I'm the database guy working with a bunch of Java programmers during the day, but also because we need an easy to implement, lightweight persistence mechanism for PythonCard. These two driving factors aren't exactly identical but they have brought me to the same place. So, on with the investigation.
Our first cabs off the rank are the easiest to understand and explain, and serve as a gentle introduction to the world of impedance mismatch.
dtuple
This module was written by Greg Stein, and is available from his Python page. It converts the results of SQL queries on a relational database into special objects whose attributes can then be accessed by subscript or dot notation. This is an extension to the standard db-api functionality of just returning each row as a tuple, which requires the programmer to refer to each column value by index rather than by a meaningful name. Rather than explain its operation in detail here I will refer you to Steve Holden's recipe in the Python Cookbook.
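To give a flavour of the idea without reproducing the recipe, here is a minimal sketch of the pattern dtuple implements. This is illustrative only and not dtuple's actual API: the class name and method layout are my own, but the behaviour, pairing a db-api row tuple with its column names so values can be read by subscript or dot notation, is what the module provides.

```python
class Row:
    """Sketch of a dtuple-style row wrapper (not dtuple's real API):
    pairs a tuple of column values with their column names, which a
    real implementation would take from cursor.description."""

    def __init__(self, names, values):
        self._names = list(names)
        self._values = tuple(values)

    def __getitem__(self, key):
        # Subscript access: row['name'] by column name, row[0] by index.
        if isinstance(key, str):
            return self._values[self._names.index(key)]
        return self._values[key]

    def __getattr__(self, name):
        # Dot access: row.name. Only called when normal lookup fails.
        try:
            return self._values[self._names.index(name)]
        except ValueError:
            raise AttributeError(name)

# Wrap a row as a db-api driver would return it:
row = Row(['id', 'name'], (1, 'parrot'))
print(row.name, row['name'], row[0])   # parrot parrot 1
```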
Whilst dtuple is useful, it isn't an object relational mapper. It supports the transition of relations to objects but not back again. This means that the only way to create an object which dtuple understands is from the database. I can't create a new object and then persist it to the database, I have to do that myself and then fetch it into a dtuple type object.
dtuple makes the results of database queries more "pythonic" but doesn't support any kind of data manipulation. That's fine if all your application does is pull information out of a database, but I suspect that sooner or later you will want to change that data and put it back, in which case dtuple doesn't really help.
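To make the gap concrete, this is the kind of hand-written SQL you end up maintaining yourself for every table. I've used the sqlite3 module purely for illustration (the table and function names are invented); any db-api driver looks much the same:

```python
import sqlite3

# An in-memory database stands in for whatever RDBMS you actually use.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)')

def save_person(conn, person):
    """Hand-rolled persistence: neither dtuple nor db_row
    generates this INSERT for you."""
    conn.execute('INSERT INTO person (id, name) VALUES (?, ?)',
                 (person['id'], person['name']))
    conn.commit()

save_person(conn, {'id': 1, 'name': 'Brian'})
row = conn.execute('SELECT id, name FROM person').fetchone()
print(row)   # (1, 'Brian')
```

Only after writing that yourself can you fetch the row back into a dtuple-style object.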
db_row
Provided by The OPAL Group, db_row performs a similar function to dtuple and is available directly from their web page. db_row also attempts to put a more "pythonic" face on the results of database queries. Its advantage over dtuple is that it uses new style classes (mental note to self: figure out what on earth that means). In practice this means that it re-uses dictionaries rather than creating a whole new one for each row returned, and thus reduces the memory overhead when dealing with large result sets. For that reason alone it might be worth using, although I tend to avoid this problem these days by using generators to serve up the results of my queries.
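As I understand it, the new-style-class trick amounts to something like declaring __slots__ on the row class, so each instance stores its values in fixed slots instead of carrying its own dictionary. A sketch of the idea, not db_row's actual code:

```python
class PersonRow:
    """Sketch of the memory saving behind db_row (hypothetical class,
    not db_row's API): __slots__ on a new-style class replaces the
    per-instance __dict__ with fixed storage slots."""
    __slots__ = ('id', 'name')

    def __init__(self, id, name):
        self.id = id
        self.name = name

r = PersonRow(1, 'Brian')
print(r.name)                   # Brian
print(hasattr(r, '__dict__'))   # False - no per-row dictionary
```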
Aside from that implementation detail the functionality is very similar to dtuple. So it's also good for getting more meaningful objects back from queries, but you end up writing your own SQL for everything else.
My conclusion is that these modules aren't what I am looking for. What might be worth investigating is extending dtuple or db_row so they support the missing functionality. Realistically though, I will only really consider that if none of the other available object relational wrappers are up to scratch.
Posted by Andy Todd at April 04, 2003 04:36 PM
Why go for a "lightweight" persistence system that requires the weight of running a separate database server? Python has built-in very-basic-persistence support (Pickle, Shelve), and there are full featured ones out there (ZODB, which supports multiple back ends). These are much more "Pythonic" because there is no translation layer required to convert python objects to relational data and back again.
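[For anyone who hasn't used them, the built-in route this comment mentions really is about this simple. A sketch with the standard shelve module; the file path and keys are made up for the example:]

```python
import os
import shelve
import tempfile

# shelve persists ordinary Python objects keyed by string:
# no SQL, no translation layer between objects and relations.
path = os.path.join(tempfile.mkdtemp(), 'people')

db = shelve.open(path)
db['1'] = {'name': 'Brian', 'role': 'centurion'}
db.close()

# Reopen later and the object comes back as it went in.
db = shelve.open(path)
name = db['1']['name']
db.close()
print(name)   # Brian
```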
For non-Zope ZODB uses, there are some third party mechanisms (also open source, I believe) that add extra index and query support.
Something else to look at, possibly, is Roundup's "hyperdb" system. Roundup supports many backends, from simple dbm support to BerkeleyDB to SQL backends (SQLite, MySQL) and other database systems like MetaKit (a light and fast embedded database engine with Python bindings available).
Anyway, those are just some thoughts for PythonCard. I think with a system like that, pure-Python persistence (or using a more object friendly database like MetaKit) is best for most people's uses.
For object-relational mapping, I've found many of the available systems ultimately too simple to use for real business situations, or they just annoyed me by requiring too much SQL code to intermix with business objects. There's a Python system available called "Modeling" that I haven't had time to evaluate that looks to offer the robust model that I'm looking for. "Modeling" bases its design on Apple's Enterprise Objects Framework, a very powerful system typically used for O-R mapping. The EOF keeps SQL (or other storage) independent of the object code. Your model describes how the two worlds interact, with the theory being that you can swap one storage mechanism out for another by reconfiguring the model, without affecting application code. The EOF's design is meant to deal with multiple clients, editing contexts, and more. It's a very detailed system (that as a developer of applications you never have to deal with), so since "Modeling" is using that as its design, I think it will be able to deal with complex situations better than many of the other Python O-R systems. See http://modeling.sourceforge.net/ for more. It does do the data mutation/creation commands you want. PyDO (part of skunkweb) and SQLObject offer data creation/mutation as well. SQLObject looks to be the most lightweight of the bunch, while Modeling looks to provide the most robust architecture.
True OR bridges (as opposed to object oriented databases, which may or may not use an RDBMS as a back end) have two big advantages:
* Access to existing databases.
* Easy access to reporting tools.