Playing with Software

I've been trying to install open source software, in this case an application called Moodle. I followed the instructions to install it on an Ubuntu Server virtual machine but they were missing a couple of key points.

After installing the Moodle package (under "Moodle Installation" in the instructions) you need to make the generated Apache configuration file available in your Apache conf.d directory. A symbolic link does the job, so try the following command:

$ sudo ln -s /etc/moodle/apache.conf /etc/apache2/conf.d/moodle.conf

As you can see from this handy guide to configuring Apache on Debian, application specific configuration files should go in the conf.d directory rather than being hacked into the httpd.conf file.

The second problem is that the default http://localhost/moodle URL won't work on a server, because by default access is restricted to the local machine. To allow access from other machines (in my case the host computer) you need to edit the generated apache.conf file and uncomment the line which says allow from all. This will enable remote access to the instance from your host machine.

How To Start Unit Testing

I had the great privilege to present this weekend at PyCon Australia 2013. My talk was originally titled "Why I Use py.test and Maybe You Should Too" but as I wrote the paper and accompanying slides I realised that it should really have been called "How to Start Unit Testing" as that was really the key message from my talk.

I've uploaded the slides and the paper I wrote to accompany them to the presentations section of this site. Feedback and corrections are always welcome. Thanks to the PyCon Australia committee, volunteers and delegates for another great conference.

Living in the Future

On my morning commute today I realised that I am actually living in the future. I remember when I got involved in the PythonCard project 10 years ago one of the major questions on the mailing list was why we were building a GUI toolkit when the future was the web. It wasn't true then but I think that it is now.

Why do I think we have moved now? It is in large part thanks to a book I have started reading called Python for Data Analysis. I have a copy of the book in ePub format and wanted to read it on my laptop. After some research, instead of installing an e-reader I installed a web browser plugin called Readium to view the book.

I then wanted to set up an environment for working through the examples from the book. I created a virtualenv on my Ubuntu server based VM and installed the required modules. After a couple of pages I realised that I needed some sort of graphical environment for rendering graphs. Rather than move to a desktop virtual machine I decided to go for another option. I read the documents and fired up an IPython notebook with remote access. The only thing missing from my useful toolset is a VIM instance. I'm sure that can't be far away.

All of which means that within a single browser (on separate tabs) I am both reading a book and interactively working through the Python code examples from it. I appreciate that there are back end processes involved especially with the IPython notebook. But here in 2012 it is possible to do some amazing things in the browser that I wouldn't have imagined even a couple of years ago. Did I mention that I really like living in the future?

Interrupted Service

This site may not be available for various periods over the next couple of days as I move it lock, stock and barrel to a new web host. It has been a good few years at Cornerhost but it's time to move. After an exhaustive search I've signed up at WebFaction and will be up and running on their servers in no time at all.

This does mean that email reception may be spotty as I migrate. In the rare event that you send me an email and I don't reply please accept my apologies. Then resend your email. See you on the other side.

Update (2012.08.24): the blog and web site migration seems to have worked. Hopefully email will as well.

Quotable Quotes

It started with a tweet from Tim O'Reilly. He mentioned a quote that I'm very familiar with - "Data matures like wine, applications like fish". When I read it I wondered if it was anything to do with me. His tweet linked to a blog post called the 11 best data quotes from the DataMarket blog. On that list (which I highly recommend reading) the quote was tentatively attributed to me based on a write up of my 2009 OSDC presentation entitled "Change Bad!". I'd like to take the credit for this, I really would. But I can't. I did feature it on the 9th slide of my presentation but I didn't write it. The problem is that I can't remember where I first read it. Normally I put a reference in at least my notes so that I know which giant's shoulders I am standing on, but I've rummaged around in various hard drives and back ups and can't find any reference. Which is a shame because it is a great quote - not least because it is very true. If anyone does know the real origin of the phrase please do leave a comment and I'll put proper attribution in my slides.

The good news is that it has prompted me to clean up and re-publish the papers and slides from various presentations that I have given. I've put up a new index page for my presentations and checked and uploaded the correct versions of all of the files.

UPDATE: thanks to A C Censi in the comments, the original quote was from James Governor's Monkchips - Why Applications Are Like Fish and Data is Like Wine.

Python path relative to application root

I've recently written some code to wrangle XML files. Part of the code validates a provided file against an XML Schema stored in a file. When I wrote this code I got tangled up in absolute and relative path manipulations trying to load the XML Schema file. Most of the Python file operations work relative to the current working directory and I needed to be able to load my XML Schema from a file relative to the application root directory. Regardless of where the code was executed from the schema file would always be up and across from the directory containing the Python module being executed. A picture will probably help.

Within the Python module I need to load and parse the XML Schema contained in Source_File.xsd. Here's how I did it. First, we need to work out the root directory of the application relative to that module. After a few false starts a tip from StackOverflow was the key. The full path to our etl directory is:

root_dir = os.path.abspath(os.path.dirname(__file__))

But we need to go up a directory so we use os.path.split to remove the last component of the path.

root_dir = os.path.split(os.path.abspath(os.path.dirname(__file__)))[0]

The final part is simply joining this with the name of the directory and schema file that we wish to load. Then we have a path that is the same irrespective of where we run the code from. To make reading the code easier I split this across a few lines and ended up with:

>>> import os
>>> from lxml import etree
>>> root_dir = os.path.split(os.path.abspath(os.path.dirname(__file__)))[0]
>>> schema_file = os.path.join(root_dir, 'schemas', 'Source_File.xsd')
>>> xmlschema_doc = etree.parse(schema_file)
>>> xmlschema = etree.XMLSchema(xmlschema_doc)
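The same lookup can also be sketched with pathlib. This is only an illustration and not the code I actually used: the helper name schema_path and its arguments are made up for the example, and it assumes the layout described above, with the schemas directory sitting beside the directory that holds the module.

```python
import os
from pathlib import Path


def schema_path(module_file, schema_name="Source_File.xsd"):
    """Return the schema path relative to the application root.

    Assumes the application root is the parent of the directory that
    contains module_file, and that schemas live in <root>/schemas.
    """
    module_dir = Path(module_file).resolve().parent  # e.g. <root>/etl
    root_dir = module_dir.parent                     # up one level: <root>
    return root_dir / "schemas" / schema_name


# Inside a real module you would call schema_path(__file__).
# A made-up path is used here so the result is easy to inspect.
example = schema_path(os.path.join(os.sep, "app", "etl", "loader.py"))
print(example)
```

Because the result is built from the module's own location, it does not change with the current working directory.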

Extracting a discrete set of values

Today's I love Python moment is brought to you by set types.

I have a file, XML naturally, that contains a series of transactions. Each transaction has a reference number, but the reference number may be repeated. I want to pull the distinct set of reference numbers from this file. The way I learnt to build up a discrete set of items (many years ago) was to use a dict and setdefault.

>>> ref_nos = {}
>>> for record in records:
...     ref_nos.setdefault(record.key, 1)
...
>>> ref_nos.keys()

But Python has had a sets module since 2.3 and a built-in set type since 2.4, so my knowledge is woefully out of date. The latest way to get the unique values from a sequence looks something like this:

>>> ref_nos = set([record.key for record in records])

I think I should get bonus points for using a list comprehension as well.
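For what it's worth, on Python 2.7 or any Python 3 the intermediate list isn't needed either, because a set comprehension builds the set directly. Here is a small sketch; the Record type and the sample data are made up for illustration and only the key attribute matters.

```python
from collections import namedtuple

# Hypothetical stand-in for the parsed transactions; only .key matters here.
Record = namedtuple("Record", ["key", "amount"])

records = [
    Record("A100", 10),
    Record("B200", 25),
    Record("A100", 5),  # repeated reference number
]

# A set comprehension builds the set directly, with no intermediate list.
ref_nos = {record.key for record in records}

# Sets are unordered, so sort when a stable order is needed for display.
print(sorted(ref_nos))  # ['A100', 'B200']
```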

Validating an XML File with LXML

I've been playing with XML files recently and have on the odd occasion needed to validate a file against an XML schema. This is surprisingly easy using lxml, the Swiss Army knife of Python XML processing. Allow me to demonstrate.

>>> from lxml import etree
>>> schema = etree.XMLSchema(etree.parse('schema_file_name.xsd'))
>>> xml_file = etree.parse('xml_file_name.xml')
>>> schema.validate(xml_file)
True

Job done. If you are unlucky enough that your file doesn't validate you can find out why by checking the error_log attribute of your XMLSchema object.
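As a rough sketch of what reading error_log might look like, here is a self-contained example. The tiny schema, the two documents and the validation_errors helper are all invented for illustration; it assumes lxml is installed and parses from strings rather than files so that it runs on its own.

```python
from lxml import etree

# A deliberately tiny, made-up schema: one <transaction> element
# containing a single integer <reference>.
XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="transaction">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="reference" type="xs:integer"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

schema = etree.XMLSchema(etree.fromstring(XSD))

good = etree.fromstring(b"<transaction><reference>42</reference></transaction>")
bad = etree.fromstring(b"<transaction><reference>abc</reference></transaction>")


def validation_errors(schema, document):
    """Return a list of readable messages, empty if the document is valid."""
    if schema.validate(document):
        return []
    # error_log holds the problems found by the most recent validation.
    return ["line %d: %s" % (entry.line, entry.message)
            for entry in schema.error_log]


print(validation_errors(schema, good))
for message in validation_errors(schema, bad):
    print(message)
```

The exact wording of the messages comes from the underlying libxml2 library, so treat them as something to show a human rather than something to parse.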


Due to a recent accounting error (on my part and in my favour) I recently found myself in possession of a netbook. I know that makes me a luddite and I should have bought a tablet. Call me a throwback. In my defence it was half the price of an iPad and a lot more practical for me. The major deal breaker for me is that iPads don't come with a command line client and can't (to the best of my knowledge) run the only editor worth having. Also, iPads don't run free software and that is becoming more important to me. So I bought a netbook.

As it came with Windows installed my first task was to install a decent operating system. I'm a fan of Xubuntu so I grabbed the latest release and then ... stopped. Because my first thought was to burn the Xubuntu .iso file to a disk and install from that, but my netbook doesn't have a CD drive. I've never installed from anything else in the past so I was a bit stuck.

The good news is that it is 2011 and Google came to the rescue. After a couple of false turns, and via, I found the rather wonderful LinuxLive USB Creator. Whilst it isn't an exhaustive test, and don't come to me with your problems, I simply installed and started LiLi, pointed it at my USB stick and the .iso file I had downloaded and 10 minutes later I had a bootable copy of Xubuntu.

Some words of praise, too, for the (X)ubuntu installer folks who have made getting their operating system on a new machine a complete breeze. Thanks everyone, top job.

Now all I've got to do is install all of the software that I rely on, configure the thing and I can start using it. At my pace that should only take a week or two. I'll be back then.

Use the right tool for the job

I was going to write an informed and opinionated piece about the use of proper tools in corporate IT departments. In particular I was going to say that I found it interesting that smaller, more cost conscious teams (in startups or open source projects) use more modern and sophisticated tools for issue management, project planning and code management than the big IT departments that I have the pleasure to work in.

But, well, I've got to go and write a status report showing the breakdown of issues by status, and that is going to take me about three and a half hours. So I don't have time to faff about on my blog.

Instead, I'll just paraphrase JWZ (who was apparently in turn paraphrasing an older comment about sed) and say;

Some people, when confronted with a problem think "I know, I'll use a SharePoint list." Now they have two problems.

I mean, a SharePoint list for issue management? When we could use Jira or FogBugz? I give up.