Why You Shouldn’t Hate VirtualEnv and PIP

So a friend passed me a link about Why [Someone] Hates Virtualenv and PIP.

Well, I have also written my fair share of angry posts, but there is a lot in this one that bothers me. Read it and then come back.

Back? Ok, let’s see…

Illusion of isolation

I think the argument is somewhat weak. What the author says in this section boils down to “virtualenv provides isolation only for Python things”, which is exactly what the box says: “Virtual Python Environment builder”. I kinda understand that some people may mistake this for full isolation, but that is like complaining that people may use Word and think it can do math because it has tables.

But stop for a second and think: “Who would think Word can do math just because it says ‘Tables’ in the menus?” Well, there you have it. Seriously, if someone thinks virtualenv can provide full isolation when the package clearly says “Python Environment”… well, they shouldn’t be coding anyway, right?

Full isolation

He has a point here: yes, if you want full isolation, you’ll need another solution. He offers two, Vagrant and LXC (which stands for Linux Containers). Thing is, creating a Vagrant environment is not an easy “5 seconds” process. Heck, even starting one is not an easy “5 seconds” process.

Vagrant, for those unaware, creates a virtual machine, boots it, starts an SSH session to it and provides a somewhat easy way to map a local directory to a directory inside your virtual machine. Vagrant provides full isolation by creating a full operating system inside your operating system, based simply on a file (the Vagrantfile). But, again, it’s far from being a “5 second” process, whether creating or starting.

LXC (which, again, and keep this in mind, stands for Linux Containers) provides something like Vagrant, but apparently using the Linux kernel’s own containment system to create such machines. Unfortunately, after installing it, I tried to use it but it requires some “templates”, which it can’t download from anywhere (something Vagrant does: it has its list of available images, so you just pass the URL and it will download it and create the machine — although it’s kinda hard to have two different OSes as base systems). So, let’s say, it’s Vagrant with a “10 second” create/start. The problem with LXC is that it is tied to Linux and, thus, it would require everyone to use Linux. While Linux is a nice operating system and all (and I use it as my primary OS these days), Python is not tied to a single operating system and we need a solution that works everywhere. Virtualenv works on Linux; virtualenv works on OS X; virtualenv works even on Windows. LXC works on Linux; LXC doesn’t work on OS X; LXC doesn’t work on Windows.

(The fact that LXC is even suggested makes the advice even more silly when you check the blog title and see “platform-agnostic python developer”. How can you suggest a platform-specific solution if you are a platform-agnostic developer?)

If you need full isolation, the only real solution is Vagrant. Which is slow, even if it provides full operating system isolation, which is way more than virtualenv provides — and, most of the time, way more than you need.

I’ll steal the point here and add something: virtualenv is a nice way to have two different apps running under the same server. You can wrap each in a different WSGI server (uWSGI or Chaussette), give each a different port and make Nginx serve each under a different URI. How would you do that with Vagrant or LXC? Install a separate Nginx inside each and use a third one outside your virtual machines as a load balancer? Make the outside Nginx access each via different ports, losing all the benefits Linux provides when dealing with sockets on the same machine? Either solution is stupid and moronic, especially if your apps are small or have a low access count and virtualenv provides the perfect isolation for such situations.

Virtualenv for deployment

Here I’ll admit my ignorance and say that the only type of Python deployments I ever did were deployments for web apps. And really, what we did was simply create a virtualenv and install the packages. New version? No problem, just install the package in the virtualenv. Done.

(Actually, I had one experience with desktop deployment even before virtualenv existed — or before it was as widely known as it is today — but I guess that doesn’t count.)

So… no, virtualenv is not for deployments. You can use it for deployment, but that’s not its primary function.

Also, if you need external dependencies (like the mysql-devel packages to be able to compile python-mysql), neither Vagrant nor LXC will help you there. You would need to install those even then (even worse, you can forget that you are using one of them, create your databases inside the virtual machine and, if something goes wrong with your installation, all the data will be gone — and it’s really easy to forget such configuration details.)

Virtualenv is full of messy hacks

The whole “hacks” complaint here is that you get a full copy of Python inside your virtualenv. Well, this is needed because there are slight differences even in the Python standard library between versions, and virtualenv can create an environment for any Python version installed. Thus, the packages must follow.

The binary inside the virtualenv also gets changed to reflect a lot of stuff. I’ll admit that some things are silly — not stupid — because things will break if you move your virtualenv directory. But hey, that’s your fault for messing with the environment (or would you say that Vagrant can gracefully recover if you change the virtual machine image filename?).

If you need to run a virtualenv’d Python app in a cron job you’ll need to run the virtualenv initialization first, yes. But you would also have to check if your Vagrant machine is running (unless you put your cron job inside the Vagrant machine, but then you’ll need to make sure the Vagrantfile reflects the creation of the cron job, or it will be lost forever if you need to recreate the environment). The same goes for LXC. If you forget to activate the virtualenv, to start the Vagrant machine or to start the LXC container, all three will fail. The fact that you need to activate your virtualenv before calling the script doesn’t make it any worse than the other options.

On top of that, if you need to keep going into virtualenvs to run your scripts, you’d do what any sysadmin worth their salt would do: create a script to do it. That’s what virtualenvwrapper does — heck, even I wrote something like that already.

bin/activate

Nope, bin/activate is not exciting. Neither is the Vagrantfile. But both do a lot of things in the background — setting PATHs, defining environment variables — which you don’t want to worry about. The fact that activate changes your prompt is not “exciting”, but it is a nice informative measure to tell you “hey, you are in a virtualenv now”. Do you want to make bin/activate “exciting”? Install powerline then.

Since we are talking about those “this thing starts a virtual environment/engine” files, does the Vagrantfile change anything to tell you that you are in a virtual machine? Nope. Unless your virtual machine uses a different prompt, you’ll never know you are in a virtual machine in the first place!

(You will see differences in the prompts, yes, but that’s because the people who upload the images for Vagrant actually change the original images’ prompts to reflect that — after all, all you’re doing is SSHing into a virtual machine. Or do you think Vagrant wraps SSH to change the prompt?)

And, since we are talking about scripts that suck, let’s talk about the Vagrantfile, which is the most stupid idea I’ve ever seen (sorry, I need to go into rant mode now). A Vagrantfile is, basically, a Ruby script, with access to everything Ruby can provide. If you can’t see the damage that can be done with it — or the pure laziness of its developers, who didn’t even care about designing a proper configuration format — seriously man, give up coding, for the sake of everyone else.

--no-site-packages

See the answer above about “messy hacks”: there is a reason things get cluttered inside the virtualenv, and that’s due to the versioning of packages inside the virtualenv.

I don’t even think it’s worth discussing this.

PIP and virtualenv buddies

I don’t know how to respond to this. At first, it seems the author has a personal vendetta against Ian Bicking, which makes the point about both going hand-in-hand moot. Actually, the same could be said about Werkzeug + Flask + Jinja: “Oh, look, they fit so perfectly together, I bet it’s because Armin Ronacher wants to promote his personal philosophy and workflows”. Yes, if I said something like that, a giant “WAT” would appear on top of your head. Thing is, Werkzeug + Flask + Jinja work so well together because the author knows each inside and out, which makes it easier to make one fit the other — and the same goes for PIP and virtualenv.

Also, easy_install is not a solution. easy_install does not have an uninstall. easy_install requires a special option to record which files have been added/modified. PIP has none of those problems. And if you think “oh damn, this package isn’t needed anymore, better leave it there” or “well, this package isn’t needed anymore, better destroy my virtualenv and create it again”, you’re doing package management wrong.

PIP builds from source

Anyone who had to deal with eggs knows they sucked. Yes, they did. The whole concept of eggs is so broken that it’s being replaced (the new format is called “wheels”), but really, after so many installations, fuck binary installs of Python stuff.

The fact that PIP generates its installs from source is a good thing: it promotes clean source trees, a proper setup.py for your project, a proper MANIFEST.in for your project, a proper project structure, a proper separation of each component and, seriously, no freaking hacks to read non-Python files inside your egg (try it, it’s terrible ’cause you need one behavior for development, when you have no eggs, and another when your project is packaged in one).
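
Just to make “proper” concrete, here is a minimal sketch of the kind of layout and setup.py pip is happy to build from source (all the names here are made up for illustration):

# A hypothetical source layout pip can build and install directly:
#
#   myproject/
#       setup.py
#       MANIFEST.in        (e.g. "include README.rst" and other extra files)
#       myproject/
#           __init__.py
#           core.py
#
# and the matching minimal setup.py:
from setuptools import setup

setup(
    name='myproject',
    version='0.1.0',
    packages=['myproject'],
)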

requirements.txt

PIP accepts a file as a list of requirements, yes, but you don’t need to name it “requirements.txt”; you can name it whatever you want. All you need to put in this file are the names of the packages your package/project requires. Just that. PIP does no magic over it.

The real magic happens when you read it inside your setup.py to provide the list of requirements to PIP/easy_install. And that’s it.
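
A hedged sketch of that “magic”, assuming a plain requirements.txt with one package name per line (the project and file names are just examples):

from setuptools import setup

# Read the plain list of package names and hand it to setup(); pip (or
# easy_install) resolves and installs them when the project is installed.
with open('requirements.txt') as handle:
    requirements = [line.strip() for line in handle if line.strip()]

setup(
    name='myproject',
    version='0.1.0',
    packages=['myproject'],
    install_requires=requirements,
)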

URIs as dependencies

Ok, semi-point. But it is not like "everyone is doing it, AMG!". Actually, I can’t remember any package that I used professionally (or even in my personal projects) whose author used a URI instead of the package name. Even in our projects, we always created a company-wide PyPI with the company packages for deployment and as a cache for the official PyPI.

Can the fact that PIP accepts URIs be considered a problem? It can be abused, yes, but, as I put before, Vagrantfile can be abused in terrible ways, so maybe we should ban Vagrant too, right?

Actually, no. The Vagrantfile, as stupid as it is, provides access to a lot of things that may be required when you’re creating your virtual machine, and URIs as requirements can do the same in that silly, stupid corner case.

But, again, no serious project uses URI in their requirements.

PIP freeze

Semi-point again. I see a lot of people who go “I need this, and this, and this… ok, everything is here, let me create my requirements.txt by using pip freeze”, which is utterly wrong. But that doesn’t make “freeze” a bad option: it’s a pretty nice way to see what is installed in your environment. Or is “ls” a bad tool? Are stdin/stdout redirects bad tools?

Conclusion

Dunno, some points are completely off the mark and the rest are semi-ok. I guess it was just a rant for the sake of ranting, nothing else.

It doesn’t mean virtualenv and pip don’t have their problems. But the fact that both are now part of the Python standard library may lead to a cleaner implementation and a tighter integration with the Python interpreter.

Auto-virtualenv, now with more magic

Following yesterday’s post about my auto-virtualenv trick, today I managed to fix the issue where auto-virtualenv loses the virtualenv if you go into a subdirectory of the directory with the .venv.

The only change is the _venv_cd function. All the other aliases remain the same.

function _upwards_search {
    # Walk up the directory tree looking for a .venv file; echo its content
    # (the virtualenv name) if found, or an empty string otherwise.
    venv=""
    curdir=`pwd`

    while [[ `pwd` != '/' ]]; do
        if [ -f ./.venv ]; then
            venv=`cat ./.venv`
            break
        fi
        cd ..
    done

    cd "$curdir"
    echo $venv
}

function _venv_cd {
    # Deactivate the current virtualenv (if any) when leaving a project, then cd
    # and reactivate whatever .venv is found in the target directory or above it.
    if [ ! -f $PWD/$1 -a "$VIRTUAL_ENV." != "." ]; then
        deactivate
    fi
    \cd $1
    venv=$(_upwards_search)
    if [ -n "$venv" ]; then
        venv "$venv"
    fi
}
alias cd=_venv_cd

Next step: remove all this stuff from my .bashrc, move it to a single file which can be easily sourced inside your own .bashrc and drop it in a repository somewhere.

NOTE: Apparently, there is something wrong with the test for an empty venv. Hold your horses for a sec.

NOTE 2: Ok, problem solved. Also, the repository is now live at https://bitbucket.org/juliobiason/auto-virtualenv.

My magical auto-virtualenv trick (without VirtualenvWrapper)

One thing that pissed me off a few days ago was working on a Python project with several modules and having to switch between virtualenvs all the time[1]. So I quickly hacked a solution.

But before going further, let me say that the solution is highly based on VirtualenvWrapper — to the point that I’m using the same environment variables. I just didn’t want to install a whole package for a simple feature.

And, without further ado…

The whole thing started with two aliases added to my .bashrc, one to create a virtualenv and another to “activate” the virtualenv. Today, they look like this:

export WORKON_HOME=$HOME/Venv
function venv { source $WORKON_HOME/$1/bin/activate; }
function mkenv { virtualenv $WORKON_HOME/$1; venv $1; echo "$1" > ./.venv; }

Nothing fancy here: I’m using WORKON_HOME exactly as it is used by VirtualenvWrapper, to point to the directory where all virtualenvs sit. Then, to avoid typing the full path to activate them, I can simply use venv <virtualenv-name> to activate any virtualenv and, finally, to create virtualenvs in the right directory, I have mkenv <virtualenv-name>. Simple as that.

One thing you may notice is that I’m saving the virtualenv name in a hidden file inside the current directory, called .venv. This is what makes the magic happen.

Then, I have this script + alias:

function _venv_cd { 
    if [ ! -f $PWD/$1 -a "$VIRTUAL_ENV." != "."  ]; then 
        deactivate
    fi;
    \cd $1; 
    if [ -f ./.venv ]; then 
        venv `cat ./.venv`; 
    fi
}
alias cd=_venv_cd

This basically replaces cd with my function, which checks if the target directory has a .venv and, if it does, activates the virtualenv (so I don’t need to use venv anymore in normal situations); if there is no .venv but a virtualenv is active, it deactivates it.

The only problem I still have is that going into a subdirectory of the project/module won’t find the .venv in any of the parent directories and, thus, will deactivate the virtualenv.

[1] It was just a matter of “keeping each with their own”. The whole project revolves around creating modules for a web framework and each module must be capable of working standalone, without the others.

Workaround for Broken ElementTrees

(or “Your broken XML broke my Tree!”)

Recently, I decided to parse an HTML file using ElementTree. Nothing fancy; it seems most of the HTML is well-formed anyway. But (there is always a but) the source file I’m trying to parse starts with an XML declaration, has no DOCTYPE and, for some reason, this makes ElementTree a very confused parser. And, by confused, I mean that all the elements, instead of keeping their original tags (e.g., table), get the prefix “{http://www.w3.org/1999/xhtml}” added to them (e.g., {http://www.w3.org/1999/xhtml}table).

Now, this isn’t purely a problem with ElementTree. If I save the file, run it through TextMate’s “tidy HTML” and feed it to ElementTree again, everything works perfectly.

Not only do the elements themselves get weird, but ElementTree can’t use any of the special filters in its XPath search, like indexes (div[2]) or attribute searches ([@class="class"]).

The solution I found was to convert the whole XPath (without any attribute search) to a longer form, which seems to work fine (it solves my problem): adding the “{http://www.w3.org/1999/xhtml}” prefix to every element and doing the index search manually.

def _find_xpath(root, path):
    """Finds an XPath element "path" in the element "root", converting to the
    weird information ElementTree."""
    elements = path.split('/')
    path = []
    for el in elements:
        if not el.endswith(']'):
            path.append('{http://www.w3.org/1999/xhtml}'+el)
        else:
            # collect what we have, find the element, reset root and path
            this_element = el.split('[')
            # first part, without the index
            path.append('{http://www.w3.org/1999/xhtml}'+this_element[0])
            xpath = '/'.join(path)
            root = root.findall(xpath)

            pos = int(this_element[1][0:-1]) -1
            root = root[pos]

            path = []

    if len(path) > 0:
        xpath = '/'.join(path)
        root = root.find(xpath)

    return root
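
Using it looks like this (the file name and the path are made up, just to show the shape of the call):

import xml.etree.ElementTree as ElementTree

tree = ElementTree.parse('page.xhtml')    # the namespaced, DOCTYPE-less file
root = tree.getroot()

# Same as the XPath "body/div[2]/table", but with the
# "{http://www.w3.org/1999/xhtml}" prefix added to every element
# and the index search done manually.
table = _find_xpath(root, 'body/div[2]/table')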

I reckon it’s not the cleanest solution and that I should probably use recursion somehow, but it works.

Improvement suggestions are welcomed.

“What if” iterable.join()

…or “Let’s keep flogging this dead horse”…

Over the weekend I got some reactions to the str.join() vs list.join() post. Well, just for fun, this morning I played “what if” in my head.

So, let’s say every iterable got a join() method. Then you could do

>>> a = ['s', 'l', 'o', 'w']
>>> a.join('')
slow

Exactly as JavaScript does. Mkay. a could even be a tuple and it would still work. Or, stupidly enough, a string, and it would still work. But, then, what would be the first thing to cross your mind when you saw this code for the first time:

>>> a = ['s', 'l', 'o', 'w']
>>> b = ['r', 'u', 'l', 'z']
>>> c = a.join(b)

You have a list on one side and a list on the other with a “join” method. List-join-list. Well, I’d expect another list with ['s', 'l', 'o', 'w', 'r', 'u', 'l', 'z'], which is what extend() does.

But let’s return to the original str.join() semantics: what it does is join the iterable passed as a parameter using “self” as the separator. Mkay, so what would c = a.join(b) be? This:

c = ['r', ['s', 'l', 'o', 'w'], 'u', ['s', 'l', 'o', 'w'], 'l', ['s', 'l', 'o', 'w'], 'z']

Which doesn’t make any sense.

Let’s go further. join() could, possibly, call str() on the parameter and still behave like JavaScript’s join(). Why? Well, as you may already know, Python has dynamic typing, which means any type can be passed. Just to remember:

var l = ['s', 'l', 'o', 'w']
var j = l.join('')

would result in “slow” in JavaScript. Mkay, so

>>> l = ['s', 'l', 'o', 'w']
>>> r = ['r', 'u', 'l', 'z']
>>> j = l.join(r)

Would result in… guess what, "s['r', 'u', 'l', 'z']l['r', 'u', 'l', 'z']o['r', 'u', 'l', 'z']w", which, not surprisingly, still doesn’t make sense. Ok, if r were a simple string, it would behave exactly like JavaScript. But it would be a mess with any other type. And what’s the point of dynamic typing if you force types?

So, str.join() not only makes code simpler and avoids the mess of monkey-patching, it also removes a bigger problem: ambiguity.

And yes, I understand that JavaScript’s list.join(str) makes sense, but it still has the problem of “What about the other types? Are you a racist?”

Edit: Just out of curiosity, I wrote a list.join(list) snippet in JavaScript to see the results. Here they are, for your mind-boggling pleasure:

js> var l = ['s', 'l', 'o', 'w'];
js> var r = ['r', 'u', 'l', 'z'];
js> var j = l.join(r);
js> j
sr,u,l,zlr,u,l,zor,u,l,zw

Python: why str.join() and not list.join()

or “Slowpoke finally understands Python”

When I was in Australia, one guy kept asking why Python had the “horrible” (in his opinion) str.join() instead of the obvious (in his opinion) list.join().

After working with JavaScript for a while, I can understand his opinion: in JS, you have a list.join() of sorts and it makes a hell of a lot of sense.

But, then again, this morning it finally hit me: str.join() uses an iterable object as parameter, so any iterable object will work. For example:

>>> p = 'python'
>>> '-'.join(p)
'p-y-t-h-o-n'
>>>

Ok, this is understandable, but why not have a list.join() too? Well, this would mean that every iterable object would have to have a join() method (str.join(), tuple.join(), dict.join(), list.join() and all the new iterable objects that appeared in Python 3.0.) Since the C API for Python doesn’t allow object hierarchies (and all the base types are implemented in C), the same method would have to be implemented over and over again. Not only that, but you would have several different ways to join() stuff instead of one (now) obvious way.
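
Which is also why a single str.join() already covers every container without needing a per-type method (a couple of throwaway examples):

>>> '-'.join(['s', 'l', 'o', 'w'])
's-l-o-w'
>>> '-'.join(('s', 'l', 'o', 'w'))
's-l-o-w'
>>> '-'.join(str(n) for n in range(4))
'0-1-2-3'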

Another way to fix this would be to monkey-patch every object to have a join() method, but that’s not the Python way. Monkey-patching is never the Python way.

And the same rule applies to len(): it works on any container, for the same reason.

setup.py, simplejson and json modules

When dealing with JSON objects in Python, we have the excellent simplejson module to convert a JSON object into a Python object. But, since Python 2.6 (and 3.0), we also have the json module in the standard library. Loading the right one is as simple as

try:
    # Python 2.6/3.0 JSON parser
    import json
except ImportError:
    # Fallback to SimpleJSON
    import simplejson as json

And then you would just use json.dump(), but that still leaves the problem of indicating in setup.py whether the application requires simplejson or not. The solution is to use sys.version_info, comparing versions with distutils.version.StrictVersion:

import sys

from distutils.version import StrictVersion

# Build a "major.minor.micro" string out of sys.version_info.
version_number = '.'.join([str(a) for a in sys.version_info[:3]])

# Only require simplejson when the standard json module isn't available.
if StrictVersion(version_number) < StrictVersion('2.6'):
    params['requires'] = ['simplejson']

Explaining:

  • First of all, params is a dictionary of parameters passed to setup (as setup(**params), since Python supports unpacking a dictionary as the keyword arguments of a function.) Another possible solution would be to use a requires variable and set it to None when the version is 2.6 or newer.
  • version_number uses just the first three elements of sys.version_info ’cause it also contains other information, like the release level and serial (e.g., Python 2.5rc1 has a sys.version_info of (2, 5, 0, 'candidate', 1)).
  • StrictVersion provides proper comparison between versions, so you don't have to worry about Major, Minor and Release versions or even have to write three cases in an if.

OSDC 2008, Day 2

The second day of OSDC started with Larry Wall, the creator of Perl. At first I thought he would ignore the elephant in the room, but all his talk was about Perl 6. He showed the parser and the weird new operators (like “»+«”, and no, this is not a typo), which means I’ll have a harder time trying to understand Perl code. But, on the other hand, the changes to regular expressions make them a lot easier to use and a lot more logical (well, once you understand the basics of regular expressions.)

After the opening talk, I went to see Joshua May talking about “Going mobile – tips, tricks and tools for building mobile web-apps”. He basically summarized all the things I’ve heard the mobile developers talking about in the office: it’s freaking hard to make something that works on every mobile, and WURFL is here to help with descriptions of the capabilities of each device. Another very interesting point was the Facebook mobile application, which has a special “Call” link on it, ’cause it is easier to call someone than to leave a message. He also suggested that, instead of a “Contact us” form, businesses should have a “Call us for suggestions” link or something along those lines, ’cause, again, it’s easier to call someone and tell them about something than to type a message on a phone.

Then it was Ben Balbo with a quick demonstration of how to stream video, in a talk called “Streaming the world for free”. Basically, using only free software, you have DVgrab to capture video; FFmpeg, VLC and Mencoder to encode the video; HTTP, RTSP and RTMP as streaming protocols; Darwin Streaming Server, VLC, LScube, Red5 and Helix Server as streaming servers; and VLC, MPlayer and Helix Player as, well, players. In a quick demonstration, he showed the stack of DVgrab, VLC, HTTP, Darwin and QuickTime Player playing real-time video from his laptop. He also demonstrated Kyte.Tv playing video captured from his mobile phone.

The third talk was Silvia Pfeiffer talking about “An open source ‘YouTube’”. The system they built for the University of Queensland uses FedoraCommons (not to be confused with the distribution) as the archiver/storage (FedoraCommons can store a lot of stuff, like documents) and Fez as a front-end to retrieve that information and display it.

Following her talk, John Ferlito joined Silvia to talk about “MetaVidWiki: When you need a web video solution”, which mixes MediaWiki (Wikipedia) with video playing. They also mentioned the W3 Consortium’s failed attempt to make Ogg Theora (the free media codec) the default media format for the new <video> tag in HTML 5. As a solution they presented Mv_Embed, a Theora-capable player that can work with MetaVidWiki to stream the videos.

After lunch, I went to see Andrew Bennetts talking about “How to make a FAST command line tool in Python” and his experiences with Bazaar. Just to demonstrate where the basic problem is, he ran two commands: time python -c "", which basically runs nothing, taking 0.013s, and time python -S -c "import os; os._exit(0)", which doesn’t load the site module and forces an “unclean” exit, before the garbage collector can do anything, reducing the run time to 0.008s. So, basically, the big culprits are imports. For that, both Bazaar and Mercurial have a “lazy import” mechanism, which does the actual import only when the module is first used. Also, a lot of imports are bad right now, like the string module, which imports the regular expression module, which is slow to load. The same goes for urllib, even when you just want the encoding functions (it loads the socket module, which loads the _ssl module, which is incredibly slow and large to load.) One suggestion he gave was to use PyFlakes to find unused imports. Later, during the questions, I got the sad news that importing modules locally (e.g., doing the import inside a function) does not improve the load speed, which means most of my new code in Mitter is bad.
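
I don’t know what Bazaar’s lazy importer actually looks like, but the general idea is something like this naive sketch (not their code):

class LazyModule(object):
    """Stand-in that only imports the real module on first attribute access."""

    def __init__(self, name):
        self._name = name
        self._module = None

    def __getattr__(self, attribute):
        if self._module is None:
            self._module = __import__(self._name)
        return getattr(self._module, attribute)

# The expensive import only happens when urllib is actually used.
urllib = LazyModule('urllib')
print(urllib.quote('hello world'))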

Then I went to see “Managing category structures in relational databases” by Antonie Osanz. He basically explained how to use nested sets in a relational database. I saw nested trees at uni, but I couldn’t understand how they work properly and it seems I still can’t. But basically you keep left and right values pointing to where the element belongs in the set. It makes inserts and deletes slower (’cause there are more records to be updated than just one) but searching is way faster (and the query is incredibly simple.)
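
If I got it right, the “incredibly simple” search looks roughly like this (a sketch with sqlite3 and made-up table and column names):

import sqlite3

connection = sqlite3.connect(':memory:')
connection.execute('CREATE TABLE category (name TEXT, lft INTEGER, rgt INTEGER)')
connection.executemany('INSERT INTO category VALUES (?, ?, ?)', [
    ('Electronics', 1, 8),   # the root owns everything between 1 and 8
    ('Phones', 2, 5),
    ('Android', 3, 4),
    ('TVs', 6, 7),
])

# Every descendant of "Phones" is simply any row whose left/right values
# fall inside the parent's left/right interval; no recursion needed.
query = ('SELECT child.name FROM category AS child, category AS parent '
         "WHERE parent.name = 'Phones' "
         'AND child.lft > parent.lft AND child.rgt < parent.rgt')
print([name for (name,) in connection.execute(query)])   # ['Android']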

In the same database line, I went to see Jonathan Oxer talking about “Self-Healing Databases: Managing Schema Updates in the field”. His suggestion is basically to never run a schema upgrade script by hand; instead, you code your ORM layer so that, in case of an error running a query, it checks for probable upgrades (e.g., a table creation script, a column creation script), runs them and then re-executes the query. Yes, you can have problems with two users hitting your site just after an upgrade and two upgrade scripts running at the same time, but things will probably go fine if you don’t swallow the error from the upgrade procedure (I’m not sure about data conversion.)
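
In code, the pattern he described boils down to something like this (a hedged sketch with sqlite3, nothing from his actual slides):

import sqlite3

def run_query(connection, query, params=()):
    """Run a query; if it fails, apply the pending schema upgrade and retry."""
    try:
        return connection.execute(query, params)
    except sqlite3.OperationalError:
        # "no such table"/"no such column": run the missing upgrade, then try again.
        connection.execute('CREATE TABLE IF NOT EXISTS visits (page TEXT, hits INTEGER)')
        return connection.execute(query, params)

connection = sqlite3.connect(':memory:')
# The first call fails, triggers the table creation and then succeeds.
run_query(connection, 'INSERT INTO visits VALUES (?, ?)', ('/home', 1))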

The last talk of the day was “MySQL Optimisation by design” by Arjen Lentz. He mostly talked about the good practices we usually apply. The good stuff, though, came in the little tips. For example, you can add a small C-style comment inside your SQL query and that comment will appear in your logs. He also suggested that you should worry about replication and write your code thinking about sending reads to one group of servers and updates to another, so your application is ready to work in a master-slave environment (updates on the master, reads from the slaves.)

And this day I decided to stick around and see the Lightning talks. Honestly, I have a problem with 30-minute presentations (they’re too short) and I was kinda worried about presentations of only 5 minutes. But it was fun, nonetheless (and yes, I love that word.) Unfortunately, I wrote down the names of the presentations but forgot to write down the presenters’ names. Shame on me. So we had:

  • Golly, a Conway’s Game of Life implementation of impressive size;
  • That joke about “if languages were cars”;
  • Faster Beer, where the presenter made use of Corepy to speed up his Python application (I think it was Michael Hudson, but my memory fails me here);
  • “So you’re a kick ass coder”, which pointed how we, as developers, should get more involved in the community;
  • “Ladies Get In Free” by Pamela Fox about a suggestion to bring more girls to this kind of conference;
  • “SiliconBeachAustralia.org”, a developer community;
  • “Freeway 2.0 on Zend Framework”, which I completely lost ’cause I decided to check something different (sorry about that);
  • “Counting Your Users Without Download Statistics”, which explains some tactics used to count users using surveys;
  • “SQL vs NP”, which was a really crazy talk with SQL, first making a text/ASCII-art fractal and then solving the travelling salesman problem using just SQL and PL/SQL;
  • “Geek my ride”, presented by Jonathan Oxer, showing how he added a computer to the trunk of his car, with a power source, Wi-Fi and 3G;
  • “OSDcLang for Mobile Devices”, also by Jonathan Oxer, with a simple, Turing-complete, Brainf**k-like language running on mobile devices — HIS CAR! (see above)

OSDC 2008, day 1

Ok, first “real” day of the conference.

The opening keynote was given by Chris DiBona, Google’s open source manager and license extraordinaire (or something along those lines.) He made some points about the growth of Summer of Code in Australia (with the sad note that only 7 students registered, even with 68 tutors), the number of licenses used by Google, Google’s usage of and contribution to open source code… And yes, suddenly, it started sounding like “Look, Google is a friend of Open Source!” propaganda, but DiBona managed to point out that most of what he was saying was meant to show other companies how they could take a step into and alongside open source. But, apart from the little propaganda side (which I guess was completely unintentional), the graphs he showed, like the number of students and tutors through the years of Summer of Code in Australia and the number of licenses used by Google, and other small pearls, were really interesting.

The first session I saw after the opening keynote was Michael Neale talking about “Rule based systems – using rules to manage business logic”, mostly because I once worked on a system written in C where we had to write all the business rules by hand. Well, I surely wasn’t expecting a talk about decision trees, expert systems and logical programming, but it was interesting nonetheless. The most interesting bits to me were the discussion of logical programming and how rule-based systems approach it. Even more interesting was learning that CLIPS can generate rules by analyzing other rules.

Then I went to see Nicolas Steenhout talk about “Web accessibility and Content Management Systems”, mostly ’cause I’m terribly annoyed by websites that force me to use the mouse (looong-time memories of using Lynx/Links ’cause it was the only thing that ran properly on Linux and because I wanted to be a “real nerd”) and ’cause I really dislike the way sites are built these days. He pointed out the current state of WCAG (Web Content Accessibility Guidelines): 1.0 is too old and 2.0 is not finished yet. He also mentioned the problems the Sydney 2000 Olympics website had due to not having any accessibility (they were sued over that) and how Target was also sued for not providing accessibility on their website.

The third session was Tennessee Leeuwenburg and “Google AppEngine + ExtJS prototyping demonstration.” That was one presentation that didn’t go well. Because the wireless was down and nobody had any internet at all, there was no prototyping and, obviously, no demonstration. So we saw some nice screenshots of AppEngine, some ExtJS screenshots and… that was it. I’m pretty sure the 30-minute slot for the topic was also a problem.

Next was Thomas Lee and “Python Language Internals: From Source to Execution”, which was a pretty good walkthrough of the Python code. And by Python code I mean the very core of Python. What he did was, using Python’s trunk code, add a new statement, from the parser all the way to generating the bytecode for it. Really impressive and, honestly, a big surprise that Python’s code is clean even where it’s not written in Python (most of the changed code was C code.)

Then Andrew Bennetts and “Python’s Unittest module: an under-appreciated gem”. I was expecting a lot of weird tricks with unittest, but it seems that after a whole year of using it, there was nothing new I could get from it. On the other hand, I got a pretty good list of extensions for unittest, which provide a few more things that I may use in the future. The canonical (pun intended, as you’ll see) home of such extensions is the Pyunit-Friends project on Launchpad.

“Extending Nagios with Python Plugins” by Maneschi (whose first name I managed to completely lose somewhere) was more about Nagios than Python, which was completely my fault for expecting otherwise. Anyway, it was an interesting talk, showing some of the code used to feed information to Nagios.

Lastly, I went to “Getting Your Average Joe to use Open Source Software” by Peter Serwylo. Nothing new, I know, but he pointed out some “gentler” methods to get people to use Open Source (like keeping the user’s files and such — something I had barely thought about, despite being constantly annoyed by “technical support” calls from my parents.) I think the way he described his methods was more “show the users what they can do” than “convert them!”, which seems to be the most common way of making people use free software.

And I completely skipped the Lightning talks, and the Dinner Keynote.