Tuesday, December 29, 2009

When ctypes comes to the rescue

I recently purchased a DSLR (Canon 1000D), and at almost the same time I read an article about remote controlling your camera using gphoto. I tried it out and thought it was cool. During the holidays I had some spare time for hacking and wanted to try controlling my camera from Python. To my disappointment, Ubuntu did not include any Python bindings for gphoto. I did some googling but couldn't find any pre-compiled bindings. What to do?

Well, I could always try doing it with ctypes.

From the docs:
ctypes is a foreign function library for Python. It provides C compatible data types, and allows calling functions in DLLs or shared libraries. It can be used to wrap these libraries in pure Python.
Note: I haven't done anything serious with either ctypes or gphoto before, so if you find any errors, please post a comment.

It amazed me how easy it is to use ctypes. Here's a snippet that will take a picture (from the first camera found), download the image to local storage and then delete it from the camera's storage.
import ctypes
import os

# gphoto structures
""" From 'gphoto2-camera.h'
typedef struct {
        char name [128];
        char folder [1024];
} CameraFilePath;
"""
class CameraFilePath(ctypes.Structure):
    _fields_ = [('name', (ctypes.c_char * 128)),
                ('folder', (ctypes.c_char * 1024))]

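# gphoto constants
# Defined in 'gphoto2-port-result.h'
GP_OK = 0
# CameraCaptureType enum in 'gphoto2-camera.h'
GP_CAPTURE_IMAGE = 0
# CameraFileType enum in 'gphoto2-file.h'
GP_FILE_TYPE_NORMAL = 1

# Load library
gp = ctypes.CDLL('libgphoto2.so.2')

# Init camera
# gp_context_new() returns a pointer, so declare the return type;
# otherwise ctypes assumes a C int and may truncate it on 64-bit systems
gp.gp_context_new.restype = ctypes.c_void_p
context = ctypes.c_void_p(gp.gp_context_new())
camera = ctypes.c_void_p()
gp.gp_camera_new(ctypes.pointer(camera))
gp.gp_camera_init(camera, context)

# Capture image
cam_path = CameraFilePath()
gp.gp_camera_capture(camera, GP_CAPTURE_IMAGE,
                     ctypes.pointer(cam_path), context)

# Download and delete
cam_file = ctypes.c_void_p()
fd = os.open('image.jpg', os.O_CREAT | os.O_WRONLY)
gp.gp_file_new_from_fd(ctypes.pointer(cam_file), fd)
gp.gp_camera_file_get(camera, cam_path.folder, cam_path.name,
                      GP_FILE_TYPE_NORMAL, cam_file, context)
gp.gp_camera_file_delete(camera, cam_path.folder, cam_path.name, context)
gp.gp_file_unref(cam_file)

# Release the camera
gp.gp_camera_exit(camera, context)
gp.gp_camera_unref(camera)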
OK, remember that I haven't written any wrapper code here, so it looks almost like C. I have also skipped all error checking for brevity.

You can always use a c_void_p if you don't need access to the data in Python (if you only need to pass a pointer between foreign functions). I'm using c_void_p instead of defining ctypes structures for gphoto's data types such as Camera and CameraFile. I still had to define CameraFilePath since I needed access to the data in Python.
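The difference between the two approaches can be tried without a camera attached. The following is a minimal sketch written for Python 3, not part of the snippet above: the file and folder names are made up for illustration, and the fields are filled in from Python where a real program would let gphoto do it.

```python
import ctypes


class CameraFilePath(ctypes.Structure):
    # Same layout as the C struct: two fixed-size char arrays
    _fields_ = [('name', ctypes.c_char * 128),
                ('folder', ctypes.c_char * 1024)]


path = CameraFilePath()
# Hypothetical values; gp_camera_capture would normally write these
path.name = b'IMG_0001.JPG'
path.folder = b'/store_00010001/DCIM/100CANON'

# Reading a char array field gives the bytes up to the first NUL
print(path.name)
print(ctypes.sizeof(CameraFilePath))

# An opaque handle: Python never looks inside, it only passes it along
handle = ctypes.c_void_p()
print(handle.value)
handle_ref = ctypes.byref(handle)  # what a 'Camera **' argument would receive
```

Because the structure is declared field by field, Python can read name and folder directly, while the c_void_p handle stays a plain pointer that is only handed back to the library.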

I really like ctypes because I don't have to maintain code in C to access native functionality. Maybe you won't get the same performance as with traditional bindings but in this particular case it's not an issue.

Tuesday, December 22, 2009

Hello Planet Python!

I just got added to Planet Python and want to give the readers a quick introduction to myself.

My name is Mario and I live in Sweden. I've been working with Java development since 1999, mainly with enterprise systems. In 2007 I switched to embedded development, primarily focusing on embedded Linux systems.

Early this year (2009) I got curious about developing KDE/Qt applications, since I've been using the KDE desktop for years now. I didn't want to use Java or C++ for desktop development. I needed a language which was simple to use, feature rich and well supported by Qt and KDE. There were two candidates for the job, Ruby and Python. Guess which one I chose.

I really like Python, and coding feels fun again. It's my first choice of programming language now.

On my blog you'll find posts about everything regarding Python, language features, APIs, frameworks, etc. I hope you'll enjoy reading.

Tuesday, December 15, 2009

try/for/while else... else what?

The Python language actually has support for else-clauses in some compound statements such as try, for and while.

So how do you use them?

I'll illustrate the for usage with an example:
def has_only_alpha_string(lst):
    for s in lst:
        if not s.isalpha():
            print '[{0}] contains non alphabetic character'.format(s)
            break
    else:
        print 'All strings contain only alphabetic characters'

lst = ['Hello', 'World!']
has_only_alpha_string(lst)

If lst is empty or exhausted, execution continues in the else-clause. If the break is executed, the else-clause is skipped. The same applies to while statements: the else-clause is only executed if no break is executed within the while statement.

Note: It's valid to have a for/while-else without a break. In that case the else-clause will always be executed.
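Both loop forms can be checked with a small sketch. It is written for Python 3, and the function names are made up for illustration:

```python
def first_non_alpha(strings):
    """Return the first string with a non-alphabetic character, or None."""
    for s in strings:
        if not s.isalpha():
            found = s
            break
    else:
        # Runs only when the loop ended without break,
        # which includes the case where strings is empty
        found = None
    return found


def countdown_hits(start, target):
    """Count down from start; report whether target was reached."""
    n = start
    while n > 0:
        if n == target:
            break
        n -= 1
    else:
        # The condition became false without break being executed
        return False
    return True


print(first_non_alpha(['Hello', 'World!']))
print(first_non_alpha(['Hello', 'World']))
print(countdown_hits(5, 3))
print(countdown_hits(5, 7))
```

In both functions the else-clause is the "nothing was found" branch, which avoids keeping a separate flag variable.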

The first time I tried it I got it all wrong (before actually reading the documentation). I expected that the else-clause should be executed only if the list sequence was empty or exhausted, but it's the other way around.

What about try-else?

For me, at least, the try-else is probably a bit more intuitive.
try:
    f = None
    f = open('a_file', 'r')
except IOError as err:
    print err
else:
    print f.read()
finally:
    if f:
        print "Closing file..."
        f.close()
If the open call succeeds, the else-clause is executed; if open raises an IOError, it isn't. Both open and read can raise IOError, and the finally-clause is always executed, even if an exception occurs in the else-clause. In this particular case I'm only interested in catching the exception raised by open. If read raises an exception, it should propagate to the caller.
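The order in which the clauses run can be recorded with a small sketch. It is written for Python 3; BrokenFile, read_all and failing_opener are hypothetical helpers made up for this illustration, with io.StringIO standing in for a real file.

```python
import io


class BrokenFile(object):
    """Hypothetical stand-in for a file whose read() fails."""

    def read(self):
        raise IOError('read failed')

    def close(self):
        pass


def read_all(opener, log):
    """Call opener(), read the result, and record which clauses ran."""
    f = None
    try:
        f = opener()
    except IOError:
        log.append('except')
        return None
    else:
        log.append('else')
        return f.read()  # an IOError raised here is not caught above
    finally:
        log.append('finally')
        if f is not None:
            f.close()


def failing_opener():
    raise IOError('no such file')


log = []
print(read_all(lambda: io.StringIO(u'data'), log), log)
log = []
print(read_all(failing_opener, log), log)
```

Calling read_all(BrokenFile, log) records 'else' and 'finally' as well, but the IOError from read() reaches the caller instead of being swallowed by the except-clause.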

I could also write the code as follows:
try:
    f = None
    f = open('a_file', 'r')
except IOError as err:
    print err
    # return or sys.exit()

print f.read()
print "Closing file..."
f.close()
The problem with this solution is that the file wouldn't be closed if read raises an exception.

I hope this post made the else-clause in combination with try/for/while clearer.

Tuesday, December 8, 2009

Profiling your Python code

Python provides support for deterministic profiling of your application. It's very easy to set up and use. To start profiling your code, two Python modules are used:
  • cProfile - the module that collects profiling data
  • pstats - the module that makes the profiling data human readable
You can read more about Python profiling stuff at The Python Profilers page.

I'll show an example of how you can use the profiler.

Say I need to calculate the sum of all odd numbers from zero to an arbitrary positive value. My initial code might end up something like this:
def odd_numbers(max):
    """ Returns a list with all odd numbers between 0 and max (inclusive) """
    l = list()
    for i in xrange(max+1):
        if (i & 1):
            l.append(i)
    return l

def sum_odd_numbers(max):
    """ Sum all odd numbers between 0 and max (inclusive) """
    odd_nbrs = odd_numbers(max)

    res = 0
    for odd in odd_nbrs:
        res += odd
    return res

def main():
    # Run this 100 times to make it measurable
    for i in xrange(100):
        print sum_odd_numbers(1024)

if __name__ == '__main__':
    main()
Now I want to find out where my code spends most of its time, to help me optimize it if possible. To profile this snippet I run:
$ python -m cProfile sum_odd.py
This will output some statistics about the code (try it), but I'll show you a more handy way to browse and examine the profile dump.
$ python -m cProfile -o profile_dump sum_odd.py
This will output the profiling statistics to a file (in non-human readable format) which can be loaded and examined with pstats. Start pstats and browse the profile dump:
$ python -m pstats
Welcome to the profile statistics browser.
% help

Documented commands (type help <topic>):
EOF  add  callees  callers  quit  read  reverse  sort  stats  strip

Undocumented commands:

% read profile_dump
profile_dump% stats
Tue Dec  8 20:55:41 2009    profile_dump

         51405 function calls in 0.186 CPU seconds

   Random listing order was used

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    51200    0.082    0.000    0.082    0.000 {method 'append' of 'list' objects}
      100    0.099    0.001    0.181    0.002 main.py:1(odd_numbers)
        1    0.000    0.000    0.185    0.185 main.py:1(<module>)
        1    0.000    0.000    0.186    0.186 {execfile}
      100    0.004    0.000    0.184    0.002 main.py:9(sum_odd_numbers)
        1    0.000    0.000    0.186    0.186 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.001    0.001    0.185    0.185 main.py:18(main)

This will of course give the same information as if you just executed cProfile without specifying an output file. The advantage of using pstats interactively is that you can view the data in different ways.

Now I want to find out in which function we spend most time. This can be done by using the sort command:

profile_dump% sort
Valid sort keys (unique prefixes are accepted):
stdname -- standard name
nfl -- name/file/line
pcalls -- call count
file -- file name
calls -- call count
time -- internal time
line -- line number
cumulative -- cumulative time
module -- file name
name -- function name
profile_dump% sort time
profile_dump% stats
Tue Dec  8 20:55:41 2009    profile_dump

         51405 function calls in 0.186 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      100    0.099    0.001    0.181    0.002 main.py:1(odd_numbers)
    51200    0.082    0.000    0.082    0.000 {method 'append' of 'list' objects}
      100    0.004    0.000    0.184    0.002 main.py:9(sum_odd_numbers)
        1    0.001    0.001    0.185    0.185 main.py:18(main)
        1    0.000    0.000    0.186    0.186 {execfile}
        1    0.000    0.000    0.185    0.185 main.py:1(<module>)
        1    0.000    0.000    0.186    0.186 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
Nice! We can see that most time is spent in the odd_numbers function. The time key specifies that we would like to sort the data by the time spent in a function (excluding calls to other functions).
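The same sorting can also be done from code instead of the interactive browser. This is a minimal sketch written for Python 3 (hence range instead of xrange), with sum_odd_numbers shortened to use the built-in sum; it profiles in-process instead of going through a dump file.

```python
import cProfile
import io
import pstats


def odd_numbers(limit):
    """Return a list with all odd numbers between 0 and limit (inclusive)."""
    result = []
    for i in range(limit + 1):
        if i & 1:
            result.append(i)
    return result


def sum_odd_numbers(limit):
    return sum(odd_numbers(limit))


profiler = cProfile.Profile()
profiler.enable()
for _ in range(100):
    total = sum_odd_numbers(1024)
profiler.disable()

# Collect the report in a string instead of printing straight to stdout
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.strip_dirs().sort_stats('time').print_stats()
print(report.getvalue())
```

The 'time' key passed to sort_stats is the same sort key used in the browser session above, so the function with the largest internal time ends up at the top of the listing.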

Time to optimize. Change the odd_numbers function to the following snippet:
def odd_numbers(max):
    """ Returns a list with all odd numbers between 0 and max (inclusive) """
    return [i for i in xrange(max+1) if (i & 1)]
Now profile the code and load the dump in pstats:
profile_dump% stats
Tue Dec  8 21:20:19 2009    profile_dump

         205 function calls in 0.020 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      100    0.015    0.000    0.015    0.000 main.py:1(odd_numbers)
      100    0.004    0.000    0.019    0.000 main.py:5(sum_odd_numbers)
        1    0.001    0.001    0.020    0.020 main.py:14(main)
        1    0.000    0.000    0.020    0.020 {execfile}
        1    0.000    0.000    0.020    0.020 main.py:1(<module>)
        1    0.000    0.000    0.020    0.020 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
Wow! Not bad at all: we decreased the number of function calls from 51405 to 205. We also decreased the total time spent in the application from 0.186 to 0.020 CPU seconds by writing proper Python code :)

Tuesday, December 1, 2009

FUSE - Filesystem in Userspace part 3 (final)

Finally, as I promised, the last blog post on implementing file systems using FUSE.

I've created a file system, shoutcastfs, which enables you to mount the Shoutcast Radio directory as a file system. The genres are represented as directories and stations as files. Each file contains the station's playlist and the files are suffixed with .pls which makes it possible to load the playlist in a media player such as Amarok by double-clicking the file.

Of course, I'm using pyshoutcast (Python shoutcast API) to access the shoutcast service.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import errno
import fuse
import stat
import os
import shoutcast

fuse.fuse_python_api = (0, 2)

_shoutcastApi = shoutcast.ShoutCast()

class RootInfo(fuse.Stat):
    def __init__(self):
        self.st_mode = stat.S_IFDIR | 0755
        self.st_nlink = 2
        self._genres = {}

    @property
    def genres(self):
        if not self._genres:
            for g in _shoutcastApi.genres():
                self._genres[g] = GenreInfo(g)
        return self._genres

class GenreInfo(fuse.Stat):
    def __init__(self, name):
        self.st_mode = stat.S_IFDIR | 0755
        self.st_nlink = 2
        self.name = name
        self._stations = {}

    @property
    def stations(self):
        if not self._stations:
            for s in _shoutcastApi.stations(self.name):
                name = '{0}.pls'.format(s[0])
                name = name.replace('/', '|')
                self._stations[name] = StationInfo(name, s[1])
        return self._stations

class StationInfo(fuse.Stat):
    def __init__(self, name, station_id):
        self.st_mode = stat.S_IFREG | 0644
        self.st_nlink = 1
        # Hope no playlist exceeds this size
        self.st_size = 4096
        self.name = name
        self.station_id = station_id
        self._content = None

    @property
    def content(self):
        if self._content is None:
            self._content = _shoutcastApi.tune_in(self.station_id).read()
        return self._content

class ShoutcastFS(fuse.Fuse):
    def __init__(self, *args, **kw):
        fuse.Fuse.__init__(self, *args, **kw)
        self.root = RootInfo()

    def split_path(self, path):
        """ Returns genre and station """
        if path == '/':
            return (None, None)

        parts = path.split('/')[1:]
        if len(parts) == 1:
            return (parts[0], None)
        else:
            return parts

    def getattr(self, path):
        genre, station = self.split_path(path)

        if genre is None:
            stat = self.root
        else:
            stat = self.root.genres.get(genre)
            if not stat:
                return -errno.ENOENT

            if station:
                stat = stat.stations.get(station)
                if not stat:
                    return -errno.ENOENT
        return stat

    def readdir(self, path, offset):
        yield fuse.Direntry('.')
        yield fuse.Direntry('..')

        if path == '/':
            entries = self.root.genres.keys()
        else:
            entries = self.root.genres[path[1:]].stations.keys()

        for e in entries:
            yield fuse.Direntry(e)

    def open(self, path, flags):
        # Only support for 'READ ONLY' flag
        access_flags = os.O_RDONLY | os.O_WRONLY | os.O_RDWR
        if flags & access_flags != os.O_RDONLY:
            return -errno.EACCES
        else:
            return 0

    def read(self, path, size, offset):
        genre, station = self.split_path(path)
        info = self.root.genres[genre].stations[station]
        if offset < info.st_size:
            if offset + size > info.st_size:
                size = info.st_size - offset
            return info.content[offset:offset+size]
        else:
            return ''

if __name__ == '__main__':
    fs = ShoutcastFS()
    fs.multithreaded = False
    # Parse the command line (mount point, FUSE options) and start serving
    fs.parse(errex=1)
    fs.main()
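
The path splitting and the clamping done in read can be tried without mounting anything. This is a minimal sketch written for Python 3, with the logic pulled out into standalone functions; read_chunk and its reported_size parameter are names made up for this illustration, standing in for the fixed st_size of 4096 used above.

```python
def split_path(path):
    """Return (genre, station) for a path inside the mounted file system."""
    if path == '/':
        return (None, None)
    parts = path.split('/')[1:]
    if len(parts) == 1:
        return (parts[0], None)
    return tuple(parts)


def read_chunk(content, size, offset, reported_size=4096):
    """Return at most size bytes of content, never reading past reported_size."""
    if offset >= reported_size:
        return b''
    size = min(size, reported_size - offset)
    return content[offset:offset + size]


print(split_path('/'))
print(split_path('/Samba'))
print(split_path('/Samba/Some Station.pls'))
print(read_chunk(b'[playlist]', 4, 0))
```

Since every station file reports the same size, a reader asking for more than the playlist holds simply gets a shorter slice back.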

To try the file system, run:
$ # Download shoutcast.py
$ wget http://github.com/mariob/pyshoutcast/raw/master/src/shoutcast.py
$ mkdir mnt
$ ./shoutcastfs mnt
$ cd mnt/
$ ls
...A list of genres...
$ cd Samba
$ ls
...A list of 'Samba' stations...
$ cat [station name]
...Playlist data...
Have fun!