Archive

Random puppetry

I was talking to a colleague earlier about Puppet and its ability to install packages. I'd not really given it much thought beyond using it to install packages on classes of machines, but he mentioned one particular package which gets updated quite frequently, but is extremely low risk to update - tzdata. By setting this to "ensure => latest" rather than "ensure => present" I can forget about ever having to upgrade that package again \o/ Simple really, but it hadn't occurred to me.


Pick a letter, any letter

Earlier on my laptop suffered a slight mishap which resulted in a key popping off. I examined the mechanism and it didn't obviously go back on by itself, so I googled around a little and landed on the helpful chaps at laptopkey.com. I watched the video that pertains to my exact model, figured out which bits of metal had been slightly bent and a few minutes later I had everything back in working order. It's almost a shame I didn't need to buy anything from them in return for using their helpful video ;)


My python also spins webs

With Terminator 0.94 released I'm turning my little brain onto an idea I have for a web service and obviously I'm sticking with python. Clearly writing all the web gubbins by hand is mental, so I'm playing with Flask, a microframework for web apps. So far I'm really liking it, but it's taken a while to figure it and sqlalchemy out. I'm not at all convinced that this is going to be in any way scalable, but it's a nice way to test my idea :)


Who wants to see something really ugly?

I think it should be abundantly clear from my postings here that I'm not a very good programmer, and this means I give myself a lot of free rope to do some very stupid things. I'm in constant need of debugging information and in Terminator particularly where we have lots of objects all interacting and reparenting all the time. We've had a simple dbg() method for a long time, but I was getting very bored of typing out dbg('Class::method:: Some message about %d' % foo), so I decided to see what could be done about inferring the Class and method parts of the message. It turns out that python is very good at introspecting its own runtime, so back in January, armed with my own stupidity and some help from various folks on the Internet, I came up with the following:

# set this to true to enable debugging output
DEBUG = False
# set this to true to additionally list filenames in debugging
DEBUGFILES = False
# list of classes to show debugging for. empty list means show all classes
DEBUGCLASSES = []
# list of methods to show debugging for. empty list means show all methods
DEBUGMETHODS = []

def dbg(log = ""):
    """Print a message if debugging is enabled"""
    if DEBUG:
        stackitem = inspect.stack()[1]
        parent_frame = stackitem[0]
        method = parent_frame.f_code.co_name
        names, varargs, keywords, local_vars = inspect.getargvalues(parent_frame)
        try:
            self_name = names[0]
            classname = local_vars[self_name].__class__.__name__
        except IndexError:
            classname = "noclass"
        if DEBUGFILES:
            line = stackitem[2]
            filename = parent_frame.f_code.co_filename
            extra = " (%s:%s)" % (filename, line)
        else:
            extra = ""
        if DEBUGCLASSES != [] and classname not in DEBUGCLASSES:
            return
        if DEBUGMETHODS != [] and method not in DEBUGMETHODS:
            return
        try:
            print >> sys.stderr, "%s::%s: %s%s" % (classname, method, log, extra)
        except IOError:
            pass

How's about that for shockingly bad? ;) It also adds a really impressive amount of overhead to the execution time. I added the DEBUGCLASSES and DEBUGMETHODS lists so I could cut down on the huge amount of output - these are hooked up to command line options, so you can do something like "terminator -d --debug-classes=Terminal" and only receive debugging messages from the Terminal module. I'm not exactly sure what I hope to gain from this post, other than ridicule on the Internet, but maybe, just maybe, someone will pop up and point out how stupid I am in a way that turns this into a 2 line, low-overhead function :D


A good day

Today has been about creating, not consuming. Apart from half-watching Primal Fear with Rike, I have spent the day fixing bugs in Terminator and playing with the Akai Synthstation app on my iPad. I suspect I'm not going to be ruling the clubs anytime soon, and the UI is pretty dreadful for composing music, but it has a good library of sounds and synth mangling knobs :) I even filmed myself playing some of the parts and edited them together into a little music video, but it's really very poor ;) Rike's going to be out for most of tomorrow, so I have to decide between doing more of what I've been doing today, playing PS3 games or going out myself. Tricky!


Terminator 0.94 released!

Lots of bug fixes and some improvements to the preferences are in this release, as well as a couple of new plugins for watching terminals for activity, or taking screenshots of individual terminals. See the changelog for full details.


The Lawnmower Man

Introduction

This website shares a server with various other network services that form the foundation of my online life (i.e. IRC and Email) and I've been running into capacity issues in the last few months, so I'm running an experiment whereby I upgrade to brand new hardware (Quad Core i7, 8GB of RAM) and partition the available resources across virtual machines so the various network services are isolated into logical security zones.

Whining

I have plenty of experience using Xen for this sort of thing, but that's becoming more and more irrelevant in newer kernels/distributions. As much as I think that's a shame and a stupid upstream decision, I can't change it, so I need to move on to KVM and libvirt.

Resolution

So, with the beefy new server booted up in a -server kernel and a big, empty LVM Volume Group I got to work creating some virtual machines. This post is mainly a reminder to myself of the things I need to do for each VM :)

Action

These are the steps I used to make a VM with 1GB of RAM, 10GB / and 1GB of swap:

Create an LVM Logical Volume

lvcreate -L11G -n somehostname VolumeGroup

Create a VM image and libvirt XML definition

ubuntu-vm-builder kvm lucid --arch amd64 --mem=1024 --cpus=1 \
--raw=/dev/VolumeGroup/somehostname --rootsize=10240 --swapsize=1024 \
--kernel-flavour=server --hostname=somehostname \
--mirror=http://archive.ubuntu.com/ubuntu/ --components=main,universe \
--name 'Chris Jones' --user cmsj --pass 'ubuntu' --bridge virbr0 \
--libvirt qemu:///system --addpkg vim --addpkg ssh --addpkg ubuntu-minimal

Catchy command, huh? ;)

Wait

(building the VM will take a few minutes)

Modify the libvirt XML definition for performance

The best driver for disk/networking is the paravirtualised "virtio" driver. I found that ubuntu-vm-builder had already configured the networking to use this, but not the disk, so I modified the disk section to look like this:

<disk type='block' device='disk'>
  <source dev='/dev/VolumeGroup/somehostname'/>
  <target dev='vda' bus='virtio'/>
</disk>

Modify the libvirt XML definition for emulated serial console

I don't really want to use VNC to talk to the console of my VMs, so I add the following to the <devices> section of the XML definition to make a virtualised serial port and consider it a console:

    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target port='0'/>
    </console>

Modify the libvirt XML definition for a better CPU

I'm running this on an Intel Core i7 (Nehalem), but libvirt's newest defined CPU type is a Core2Duo, so we'll go with that in the root of the <domain> section:

<cpu match='minimum'>
  <model>core2duo</model>
</cpu>

Import the XML definition into the running libvirt daemon

virsh define /etc/libvirt/qemu/somehostname.xml

Mount the VM's root filesystem

The Logical Volume we created should be considered as a whole disk, not a mountable partition, but dmsetup can present the partitions within it, and these should still be present after running ubuntu-vm-builder:

mkdir /mnt/tmpvmroot
mount /dev/mapper/VolumeGroup-somehostnamep1 /mnt/tmpvmroot

Fix fstab in the VM

Edit /mnt/tmpvmroot/etc/fstab and s/hda/vda/

Configure serial console in the VM

Edit /etc/init/ttyS0.conf and place the following in it:

    # ttyS0 - getty
    #
    # This service maintains a getty on ttyS0 from the point the system is
    # started until it is shut down again.
    start on stopped rc RUNLEVEL=[2345]
    stop on runlevel [!2345]
    respawn
    exec /sbin/getty -L 115200 ttyS0 xterm

Edit /boot/grub/menu.lst and look for the commented "defoptions" line. Change it to:

    # defoptions=console=ttyS0 console=tty0

(the default "quiet splash" is not useful for servers IMHO)

Unmount the VM's root filesystem

umount /mnt/tmpvmroot
rmdir /mnt/tmpvmroot

Start the VM

virsh start somehostname

SSH into the VM

I didn't specify any networking details to ubuntu-vm-builder, so the machine will boot and try to get an address from DHCP. By default you'll have a bridge device for libvirt called virbr0 and dnsmasq will be running, so watch syslog for the VM getting its address.

ssh cmsj@192.168.122.xyz

you should now be in your VM! Now all you need to do is configure it to do things and then fix its networking. My plan is to switch the VMs to static IPs and then use NAT to forward connections from public IPs to the VMs, but you could bridge them onto the host's main ethernet device and assign public IPs directly to the VMs.


Python decisions

Every time I find myself hacking on some Python I find myself second guessing all sorts of tiny design decisions and so I figure the only way to get any kind of perspective on them is to talk about them. Either I'll achieve more clarity through constructing explanations of what I was thinking, or people will comment with useful insights. Hopefully the latter, but this is hardly the most popular blog in the world ;) So. What shall we look at first. Well, I just hacked up a tiny script last night to answer a simple question:

Is most of my music collection from the 90s?

Obviously what I want to do here is examine the ID3 tags of the files in my music collection and see how they're distributed. A quick search with apt showed that Ubuntu 10.04 has two python libraries for dealing with ID3 tags and a quick play with each suggested that the one with the API most relevant to my interests was eyeD3. After a few test iterations of the script I was getting bored of waiting for it to silently scan the roughly 4000 MP3s I have, so I did another quick search and found a progress barlibrary. So that's all of the motive and opportunity established, now let's examine the means to the end. If you want to follow along at home, the whole script is here.

try:
  import eyeD3
  import progressbar as pbar
except ImportError:
  print("You should make sure python-eyed3 and python-progressbar are installed")
 sys.exit(1)

First off this is the section where I'm importing the two non-default python libraries that I depend on. I want to provide a good experience when they're not installed, so I catch the exception and tell people the Debian/Ubuntu package names they need, and exit gracefully. I rename the progressbar module as I import it just because "progressbar" is annoyingly long as a name, and I don't like doing "from foo import *". Skipping further on, we find the code that extracts the ID3 year tag:

year = tag.getYear() or 'Unknown'

This is something I'm really not sure about the "correctness" of; One of the reasons I went with the eyeD3 library was that the getYear() method returns None if it can't find any data, but I don't really want to capture the result, then test the result explicitly and if it's None set the value to "Unknown", so I went with the above code which only needs a single line and is (IMHO) highly readable. This is ultimately the crux of the entire program - we've now collected the year, so we can work out which decade it's from:

if year is not 'Unknown':
 year = "%s0s" % str(year)[:3]

If this isn't an unknown year we chop the final digit off the year and replace it with a zero. Job done! Next up, another style question. Rather than store the year we just processed I want to know how many of each decade have been found, so the obvious choice is a dict where the keys are the decades and the values are the number of times each decade has been found. One option would be to pre-fill the dict with all the decades, each with a value of zero, but that seems redundant and ugly, so instead I start out with an empty dict. This presents a challenge - if we find a decade that isn't already a key in the dict (which will frequently be the case) we need to notice that and add it. We could do this by pre-emptively testing the dict with its has_key() method, but that struck me as annoyingly wordy, so I went with:

try:
  years[year] += 1
except KeyError:
  years[year] = 1

If we are incrementing a year that isn't already in the dict, python will raise a KeyError, at which point we know what's happened and know the correct value is 1, so we just set it explicitly. Seems like the simplest solution, but is it the sanest? The only other thing I wanted to say is a complaint - having built up the dict I then want to print it nicely, so I have a quick list comprehension to produce a list of strings of the format "19xx: yy" (i.e. the decade and the final number of tracks found for that decade), which I then join together using:

', '.join(OUTPUT)

which I hate! Why can't I do:

OUTPUT.join(', ')

(where "OUTPUT" is the list of strings). If that were possible, what I'd actually do is tack the .join() onto the end of the list comprehension and a single line would turn the dict into a printable string. So there we have it, my thoughts on the structure of my script. I'd also add that I've become mildly obsessive about getting good scores from pylint on my code, which is why it's rigorously formatted, docstring-ed and why the variable names in the __main__ section are in capitals. What are your thoughts? Oh, and the answer is no, most of my music is from the 2000s. The 1990s come in second :)


gtk icon cache search tool

Earlier on this evening I was asking the very excellent Ted Gould about a weird problem with my Gtk+ icon theme - an app I'd previously installed by hand in /usr/local/, but subsequently removed, had broken icons because Gtk+ was looking in /usr/local/share/icons/ instead of /usr/share/icons/. We did a little digging and realised I had an icon theme cache file in /usr/local/ that was overriding the one in /usr/. A bit of deleting later and it's back, but in the process we whipped up a little bit of python to print out the filename of an icon given an icon name.

#!/usr/bin/python
# gtk-find-icon by Chris Jones <cmsj@tenshu.net>
# Copyright 2010. GPL v2.

import sys
import gtk

THEME = gtk.IconTheme()
ICON = THEME.lookup_icon(sys.argv[1],
 gtk.ICON_SIZE_MENU,
 gtk.ICON_LOOKUP_USE_BUILTIN)

if not ICON:
 print "None found"
else:
 print(ICON.get_filename())

Hybrid - Disappear Here

It's a while since I wrote anything in the Music category of this blog, and since everything has been about my software projects recently, I figure it's time to mix things up a little. It's also just a few weeks since the release of the latest studio album by probably my favourite band of the last few years, Hybrid. The album is called Disappear Here and it's pretty damn good - I was going to describe the tracks individually, but that's what reviewers typically do and it always sounds insufferably poncy, so I suggest you just go to the album's site and listen to the damn thing yourself ;) They also post semi-frequent hour long DJ mixes on their Soundcloud page, which I would recommend!