Wednesday, April 30, 2008

Django SECRET_KEY Generation

When deploying a Django application, generating a SECRET_KEY for the site is a common step. Here is a quick recipe to do it:


$ python -c 'import random; print "".join([random.choice("abcdefghijklmnopqrstuvwxyz0123456789!@#$%^&*(-_=+)") for i in range(50)])'


Useful when for whatever reason you don't want to install django-command-extensions.
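
The one-liner above relies on the random module, which is not designed for security-sensitive values. On a newer interpreter (Python 3.6 or later) the same idea could be sketched with the standard secrets module. The function name generate_secret_key is only an illustration, not something Django provides under that name:

```python
import secrets

# Same alphabet as the one-liner above.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789!@#$%^&*(-_=+)"


def generate_secret_key(length=50):
    """Return a random string of `length` characters drawn from ALPHABET.

    secrets.choice uses the operating system's secure random source.
    """
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print(generate_secret_key())
```

The output can be pasted into settings.py the same way as the result of the shell recipe.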

Monday, April 28, 2008

Python Comparison Weirdness

While tracking down a Jython bug related to some __cmp__ methods (on dict and unicode, at least), I had to check how __cmp__ behaves on CPython. And I got a few surprises:

>>> {} == ''
False

It sounds right, but...

>>> {}.__eq__('')
NotImplemented

Oh. So == isn't relying on __eq__ alone to check for equality. It's falling back to the old three-way comparison function:

>>> cmp({}, '')
-1

So, as '' is greater than {}, then they are not equal. But...

>>> {}.__cmp__('1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: dict.__cmp__(x,y) requires y to be a 'dict', not a 'str'

Oops. Isn't cmp(foo, bar) the same as foo.__cmp__(bar), at least when hasattr(foo, '__cmp__')? Well, obviously, not always.

For some reason, CPython does a bit of "type checking" when you indirectly use dict.__cmp__. If you compare a dict with an instance of an incompatible type, it does a "default comparison" by class name instead of raising TypeError. Looking at the CPython sources, this seems to be the case for every type where tp_compare is implemented in C.

So, we get a -1 from cmp({}, '') because 'dict' < 'str'. Weird. But that isn't all. If it were, I probably wouldn't have bothered to write this.
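
To make the "default comparison" concrete, here is a rough model of it, written as a sketch for a Python 3 interpreter (where cmp() no longer exists). The name default_compare is made up for illustration. It only models the "order by type name" step and ignores the special cases that CPython 2 has for None and for numbers:

```python
def default_compare(v, w):
    """Hypothetical model of the CPython 2 fallback for unrelated types.

    Returns -1, 0 or 1 by comparing the type names as plain strings.
    This is an illustration, not the real C implementation.
    """
    vname = type(v).__name__
    wname = type(w).__name__
    return (vname > wname) - (vname < wname)


print(default_compare({}, ''))  # 'dict' sorts before 'str', so this prints -1
print(default_compare('', {}))  # and the reverse prints 1
```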

Let's subclass dict and check what happens:

>>> class dict_derived(dict): pass
...
>>> cmp(dict_derived(), '')
-1
>>> dict_derived().__cmp__('')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: dict_derived.__cmp__(x,y) requires y to be a 'dict_derived', not a 'str'

No surprises: It inherits the behavior from dict. So, remembering what I said above:
If you compare a dict with an instance of an incompatible type, Python does a "default comparison" by class name, instead of raising TypeError.

Now we can extend it to:
If you compare a dict or a dict-derived instance with an instance of an incompatible type, Python does a "default comparison" by class name, instead of raising TypeError.

But why am I saying that it applies only to dicts? [Or, AFAICS, special types where the comparison function is written in C] Why not to every type? Answer:

>>> class Foo(object):
...     def __cmp__(self, other):
...         raise TypeError("Foos are not comparable")
...
>>> Foo() == ''
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in __cmp__
TypeError: Foos are not comparable
>>> cmp(Foo(), '')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in __cmp__
TypeError: Foos are not comparable

So, on one hand we have dict (and maybe other builtin types), where cmp() and the comparison operators don't raise TypeError even if __cmp__ does. On the other, we have user-defined classes, where the raised TypeError does "leak". In the middle, our dict_derived class inherited the behavior from dict. But look at this:

>>> class dict_derived2(dict):
...     def __cmp__(self, other):
...         return super(dict_derived2, self).__cmp__(other)
...
>>> cmp(dict_derived2(), '')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in __cmp__
TypeError: dict_derived2.__cmp__(x,y) requires y to be a 'dict_derived2', not a 'str'

Dict-derived types inherit the behaviour of dict, unless they override __cmp__. CPython doesn't care that the new __cmp__ just calls the original dict.__cmp__. The only important thing is that there is a __cmp__ implemented in Python code. Once you write a "custom" __cmp__, cmp(), == and all the other comparison operators will raise the exception.

To summarize, here is the final rule for dict.__cmp__:
If you compare a dict or a dict-derived instance with an instance of an incompatible type, and __cmp__ is not overridden, Python does a "default comparison" by class name, instead of raising TypeError.

Note that this rule is not directly applicable to other builtin types that implement __cmp__:

>>> set().__cmp__('')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: set.__cmp__(x,y) requires y to be a 'set', not a 'str'
>>> cmp(set(), '')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only compare to a set
>>> set() == ''
False
>>> set().__eq__('')
False

With set, TypeError is raised on __cmp__ and on cmp(), but not on ==. That's because set.__eq__ takes care of returning False if the argument type is not compatible. The end result sounds quite reasonable, because you can still check for equality against instances of other types (like set() != ''), but can't compare for ordering against them (set() > 1 raises an error instead of doing a weird class name comparison).

I suppose that the roots of this inconsistency are historical accidents. I'm curious to see if all this changed in Python 3.0.
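
For reference, here is a small sketch of how the same comparisons behave on a Python 3 interpreter, where cmp() and __cmp__ were removed and ordering unrelated types raises TypeError. The helper name outcome is made up for this example:

```python
import operator


def outcome(op, a, b):
    """Apply op(a, b) and return the result, or the exception name on TypeError."""
    try:
        return op(a, b)
    except TypeError as exc:
        return type(exc).__name__


print(outcome(operator.eq, {}, ''))    # False: equality still works across types
print({}.__eq__(''))                   # NotImplemented, same as on Python 2
print(outcome(operator.lt, {}, ''))    # TypeError: no ordering by class name
print(outcome(operator.gt, set(), 1))  # TypeError
print(hasattr(dict, '__cmp__'))        # False: __cmp__ is gone
```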

Wednesday, April 23, 2008

Django on Jython: Summer of Code!

This post is not exactly hot news, but better late than never:

My application for the Google Summer of Code 2008, titled “Django on Jython: Supporting Python Web App Frameworks on the JVM” was accepted!! :-)

I can't put into words how happy I am, even now that a few days have passed since I got the news. I'm lucky enough to have Jim Baker as my mentor, a very inspiring guy who is as motivated by the project as I am. He is a very active Jython contributor and a successful mentor of past SoC projects. And Frank Wierzbicki is already moving DoJ to work on top of Glassfish. On the Django side, Jacob Kaplan-Moss has kept up the support he showed in the past and will fast-track our patches to Django, if they are needed.

I also got approval from Imagemaker, my employer, to stop working full time during the SoC period. In fact, they have been very supportive of and interested in the project, and are looking forward to its results and potential.

I'm going to participate in a vibrant Jython community, along with two other students working on web frameworks and Jython: Georgy Berdyshev with Zope and Ariane Paola Gomes with TurboGears2.

Things couldn't be better. I will resume the work done over the past year, hoping to make Django on Jython a reality by August 2008. In fact, I have already started, with a quick but useful web application for tracking the status of the Django test suite running on top of Jython and the postgresql_zxjdbc driver. I hope to get that page fully green as soon as possible.

Monday, April 21, 2008

404

I saw this on reddit, in a thread about the supposedly best 404 error message. But this one is far better:


“I'm sorry, you've reached a page that I cannot find. I'm really sorry about this. It's kind of embarassing. Here you are, the user, trying to get to a page on LiveJournal and I can't even serve it to you. What does that say about me? I'm just a webserver. My sole purpose in life is to serve you webpages and I can't even do that! I suck. Please don't be mad, I'll try harder. I promise! Who am I kidding? You're probably all like, "Man, LiveJournal's webserver sucks. It can't even get me where I want to go." I'm really sorry. Maybe it's my CPU...no that's ok...how bout my hard drives? Maybe. Where's my admin? I can't run self-diagnostics on myself. It's so boring in this datacenter. It's the same thing everyday. Oh man, I'm so lonely. I'm really sorry about rambling about myself, I'm selfish. I think I'm going to go cut my ethernet cables. I hope you get to the page you're looking for...goodbye cruel world!”

(LiveJournal's 404)



You may need to reload the page a few times to get this message, because they have others...

Wednesday, April 9, 2008

Rails Migrations Gotcha: Backward-Incompatible Model Changes

I'm pushing for the adoption of Rails Migrations on all Rails projects at my job (we use them on a few). As a consequence, I won the assignment of writing migrations for the latest changes to the system I'm currently involved with. That seemed easy, but it wasn't. I will try to show why, without diving into the details of my specific scenario.

Imagine you have the following model:

class Foo < ActiveRecord::Base
end

And the following migration:

class AddAnotherFieldToFoo < ActiveRecord::Migration
  def self.up
    add_column :foo, :new_column, :string
    Foo.reset_column_information
    Foo.find(:all).each do |foo|
      foo.new_column = some_calculation(foo.another_column)
      foo.save!
    end
  end
end

Now, we make the following changes to our model:

class Foo < ActiveRecord::Base
  has_many :bars
  before_save :do_something_with_my_bars
  def do_something_with_my_bars
    ...
  end
end

And its migration (just for completeness, not really relevant):

class AddBazToFoo < ActiveRecord::Migration
  def self.up
    add_column :foo, :bar_id, :integer
  end
end

So what is the problem?

For us, who made the last change to Foo after running the AddAnotherFieldToFoo migration, it's all fine.

But for the new developer who just checked out the source code and happily executed rake db:migrate, the AddAnotherFieldToFoo migration fails miserably.

That's because Foo#do_something_with_my_bars will get called (remember the before_save callback we introduced), but the association between foo and bar does not exist in the database yet (we are still executing an earlier migration).

The same happens to the developer who didn't update his local copy this week. And it will break on production too, when we merge this set of changes into the production branch.

So, here is my problem:

Every backward-incompatible change to a model will (potentially) break past migrations, because migrations are not tied to the state the model had at the time they were written.

And SCMs don't help either (updating one changeset at a time would work, but when merging branches all those changesets collapse into one and you are doomed). I'm looking into what to do. Maybe I'm using migrations in a way they were not intended to be used...

Does someone know how to solve this?

Update: Here is the ruby-talk thread

Thursday, April 3, 2008

Cinemark Should Learn Unicode



Crappy photo taken last weekend at a Cinemark: the text in the middle shows the title of the movie "Crónicas de Spiderwick", mangled by the incorrect interpretation of UTF-8 data as ISO8859-1.
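
The effect is easy to reproduce. Here is a minimal Python 3 sketch that encodes the title as UTF-8 and then decodes the bytes as ISO8859-1, which is presumably what happened somewhere in the Cinemark system (the variable names are just for this example):

```python
title = "Crónicas de Spiderwick"

# The "ó" becomes two bytes (0xC3 0xB3) when encoded as UTF-8.
utf8_bytes = title.encode("utf-8")

# Reading those bytes as ISO8859-1 turns each byte into its own character.
mangled = utf8_bytes.decode("iso-8859-1")
print(mangled)  # CrÃ³nicas de Spiderwick

# Reversing the two steps recovers the original title.
repaired = mangled.encode("iso-8859-1").decode("utf-8")
print(repaired == title)  # True
```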

Someone should point the Cinemark guys to Joel's Unicode guide. It is a good read on the topic of text manipulation in the real, modern world. That is, taking text encodings into account. It is a messy but unavoidable topic.

And remember:

“There Ain't No Such Thing As Plain Text.”

(quoted from The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!))