4th August, 2008

Gerald release 0.2.3

Filed under: database,oracle,python — admin @ 7:08 pm

I have packaged and released version 0.2.3 of gerald. This is mainly a tidy up and bug fix release, with numerous additional unit tests. You can find all of the details in the CHANGELOG.txt file that comes with the source distribution (or view it here) but in a nutshell the changes from the last release are;

  • Numerous renames to more fully comply with PEP-8
  • Added numerous tests, full details in the CHANGELOG.txt file
  • Added an ‘enabled’ flag to table constraints (except in the MySQL module, because that database doesn’t support them)
  • Added support for column defaults in Oracle
  • Added a ‘table_name’ attribute to the schema.Trigger class
  • Removed main and usage functions from each module that shouldn’t be imported

Please download, install and enjoy. Bug reports or tumultuous praise to the usual address please.

9th May, 2008

Trouble Getting a Date

Filed under: database,python — admin @ 8:04 am

I’m having trouble with dates. This can be summed up in a couple of high level issues;

1. Date support in relational databases is insane, or at the best inconsistent.

As far as I can tell the ANSI SQL-92 standard defines date, time, interval and timestamp data types. Which doesn’t help when SQL Server only implements something called ‘datetime’ – at least I think so, have you tried accessing any sort of manual for a Microsoft product online? Blimey, I thought billg had embraced this web thing years ago. Oracle has the ‘date’ data type (which is actually a time stamp) and MySQL, well they’ve gone and outdone everyone by implementing DATETIME, DATE, TIMESTAMP, TIME, and YEAR.

2. The Python DB-API does not cope with date data type ambiguity well.

When it comes to the date question the Python DB-API states (and I quote) ” … may use mx.DateTime”, which if you ask me isn’t much of a standard. This needs to change so that all DB-API modules return consistent datetime objects, not such a big issue as datetime has been part of the standard library since, what, Python 2.3?

Sadly even if we fix this it won’t work with Sqlite as it doesn’t consistently support data typing. In my experiments regardless of what sort of date you insert into the database you get a unicode string back. Don’t believe me? Try this in Python 2.5;

>>> from sqlite3 import dbapi2 
>>> db = dbapi2.connect('test_db')
>>> cursor = db.cursor()
>>> cursor.execute('create table date_test (id integer not null primary key autoincrement, sample_date DATE NOT NULL)'
>>> stmt = "INSERT INTO date_test (sample_date) VALUES (?)"
>>> cursor.execute(stmt, (1234, ))
>>> import datetime
>>> cursor.execute(stmt, (datetime.date(2008, 3, 10), ))
>>> cursor.execute(stmt, ('My name is Earl', ))
>>> db.commit()
>>> cursor.execute("SELECT * FROM date_test")
>>> results = cursor.fetchall()
>>> for item in results:
...     print item[1], type(item[1])
1234 
2008-03-10 
My name is Earl 
>>>

But note that it is fine for integers.

3. The people writing the Python standard library modules are on crack.

Outside of the database world and within the batteries included Python standard library some modules use datetime, others time and there are even uses of calendar.

O.K. I’ll accept that maybe the module authors aren’t on full strength crack, because the time module just exposes underlying posix functions. But the people who wrote those were on something strong and hallucinogenic. I table the following function signatures from section 14.2 of the Python Library Reference 2.5 as an example;

strftime(format[, t ]) 
strptime(string[, format ])

This has bitten me twice in the last twenty four hours and frankly I’m not happy.

I appreciate that there are historical reasons for having inconsistent function signatures but can someone please fix this in Python 3.0. All we need is a single module that can access the underlying system clock and then convert between a number of different representations of that and other epoch driven dates. How hard can it be? As far as I can tell this is not part of the proposed standard library re-organisation. I think it should be.

22nd April, 2008

Whither Relational Databases?

Filed under: database — admin @ 9:16 pm

Following on from a theme that Simon has been pursuing here is an interesting piece – How SimpleDB differs from a RDBMS. A thorough analysis of SimpleDB, but I think the extra value here is in the comments. I particularly liked Greg Jorgensen’s submission that programmers just don’t like RDBMS because they take some learning. Whilst I don’t have empirical evidence to back up this supposition I can say that most Java programmers I’ve come across go slightly green if you suggest that they can solve most problems with a SQL statement (and yes, that was meant to be read ironically).

If I can sum up the message of this post and it’s comments it is that we should be thankful for having different tools available to us, because this isn’t a one size fits all world. Where you’ve got a big list of simple things ™ tools like BigTable and SimpleDB work well. Where you’ve got large pieces of unstructured data (sometimes referred to as ‘documents’) you can use CouchDB, and where you have complex, structured data that has to adhere to certain validation and usage rules use a relational database. Each of these will store up to terabytes of information so let’s not even talk about (the myth of) scalability. Choose the right tool for the job and stop insisting that every problem is a nail.

So to answer my question from the title of this post – still around, and still kicking arse.

20th January, 2008

Gerald release 0.2.2

Filed under: database,python — admin @ 7:11 am

I have packaged and released version 0.2.2 of gerald. You can find all of the details on the project page but in a nutshell the changes from the last release are;

  • A serious bug in the packaging script has been fixed and the distributed archive is now complete and valid.

Bug reports, patches and test cases to the usual email address please. Or just a message to let me know you are using the code would be nice.

4th December, 2007

Indexing isn’t hard

Filed under: database — admin @ 7:45 am

I found this blog post via Reddit and was happy to see the positive benefits the author is getting from profiling his SQL and indexing his database. He could have avoided the slow running query in the first place by following Andy’s simple rules of indexing;

  1. If your database doesn’t do it automatically, add indexes to all of your primary key columns
  2. Add indexes to all of your natural key columns
  3. Add indexes to all of your foreign key columns

When primary or foreign keys are composites (i.e. contain more than one column) create an index containing all of the columns in the same order that they are defined in the corresponding constraint.

If none of this makes any sense and you are responsible for your application’s database give me a call, my consulting rates are very reasonable.

30th September, 2007

Gerald release 0.2.1

Filed under: database,python — admin @ 4:01 pm

I have packaged and released version 0.2.1 of gerald. You can find all of the details on the project page but in a nutshell the changes from the last release are;

Bug reports, patches and test cases to the usual email address please. Or just a message to let me know you are using the code, that would be nice.

20th September, 2007

Insanity

Filed under: database — admin @ 5:29 pm

Today I’ve come across a new definition of insanity. It is this;

Expecting an update of every row in a 13 million row table (which is subject to frequent updates) to complete in one pass without timing out.

Of course, this wouldn’t be necessary if the system in question was running a version of PostgreSQL that was released after 2003. Damn those risk averse system administrators.

I should probably explain. In versions of PostgreSQL before release 8 you cannot increase the size of a column. Therefore to increase the size of one of the columns in our production database we have to add a new column, copy the contents of the old column to the new column, rename both columns and then drop the old column. Now that is what I call a database refactoring.

12th September, 2007

Much SQL Goodness

Filed under: database — admin @ 8:49 am

I rarely link to other blogs, but when I come across something that I think that you, dear readers, would appreciate I feel it would be churlish not to share. I’ve recently started reading Jeff Smith’s SQL Server Weblog and it is a veritable cornucopia of great information.

All of it is good solid advice about how to design, build and work with your database. I even forgive him using the wrong database because everything he writes is of such high quality.

This article alone will come in useful whenever I have to explain the concept of primary keys to monkey coders who only understand object models.

Thanks Jeff, keep up the good work.

12th July, 2007

In Search of Technologists

Filed under: database,General,python — admin @ 5:52 pm

In my new executive position I’m responsible for maintaining adequate staffing levels in the technology department of my company. We are pretty much staffed up at the moment but there are some potential projects on the horizon that will tax the capacity of my team. In a novel fit of forward planning I’m trying to find a consulting partner who will be able to help us in these times of high workload. It is proving quite tricky.

So, dear lazyweb, do you know of any reputable consulting companies who will fit our bill. They must be;

  1. On the ground in Australia. A Sydney office is a bonus.
  2. Have competencies in one or more of; Python, PostgreSQL and scaling internet applications
  3. Have a track record and reputable references. I will be checking.
  4. Not cost the earth. Although I’m very aware of how much consultants cost, having been one myself for most of the last fifteen years.

Any and all suggestions are welcomed in the comments or to my email address. As long as they fit our criteria. Outsourcing and offshore development are not an option at this time so please don’t suggest these.

7th March, 2007

Gerald Release 0.2

Filed under: database,python — admin @ 9:24 am

I’ve packaged and released version 0.2 of gerald. You can find all of the details on the project page but in a nutshell the changes from the last release are;

  1. A test suite – using py.test
  2. Support for more database objects – views, sequences, stored code objects, indexes, etc.
  3. A large number of bug fixes. Too many to mention here.
  4. Use of my dburi module for database connectivity

There’s probably more but I didn’t keep a comprehensive change register whilst I was tinkering with the code, an oversight I’ve already addressed for the inevitable 0.2.1 release.

This code has been fairly extensively tested and exercised in the real world but I’d appreciate any feedback, bug reports or patches.

« Previous PageNext Page »

Powered by WordPress