Friday, September 26, 2008

Spider Python: Some Notes on IronPython

Someone on the internet has been learning IronPython. To learn it, he ported a simple web spider (from a version built on Retlang, a C# library for parallel processing) and posted the Python code:
He is normally a C# developer (but he has tried Boo). He has also posted some notes on his first impressions of Python:
  • Python list comprehensions can do the same as LINQ for simple cases, but LINQ is much more powerful, and it supports deferred execution, while list comprehensions are evaluated greedily. (Ed: generator expressions are evaluated lazily in Python)
  • Every description of the Python syntax I ever see emphasizes the fact you don't need to put in braces. Pity they don't spend more time telling you that you have to put in colons, that would actually be useful knowledge. This really bit me when I learnt Boo, which is syntactically very similar.
  • IronPython 1.0 targets CPython 2.4. This sounds fine until you realize that this was released in 2004. A fair bit has happened since then, not limited to the introduction of an inline if syntax. (Ed: IronPython 2 is Python 2.5, which is still the current stable version of Python)
  • While we're on the subject, the inline if (defaultValue if condition else exceptionValue) is actually quite cool.
  • The fairly lax approach to types means I don't need to call .Cast anymore.
  • Tabs are the devil as far as indents in Boo and Python are concerned. I highly recommend using something that shows tabs explicitly, and then eliminating the lot.
  • Instance methods that explicitly take self feel awkward to me.
  • Static methods that require an incantation such as "ToAbsoluteUrl = staticmethod(ToAbsoluteUrl)" also feel pretty awkward. (Ed: staticmethod can be used as a decorator, which is much nicer)
  • The casting to delegate isn't quite as slick as it is in C#. Passing in "spiderTracker.FoundUrl" instead of "lambda url: spiderTracker.FoundUrl(url)" results in an extremely unhelpful runtime error.
  • The lambda syntax is pretty elegant, but so is C#3's. Indeed, C#3 seems to have the edge.
  • Python's regular expressions are powerful, but not quite as good as .NET's. In particular, search does what you'd expect match to do.
  • findall doesn't return matches. Rather, it returns the value of the first group of the matches. This is often actually what you wanted, but it's a bit peculiar. The semantics of "Matches" in .NET are much easier to understand.
  • It did seem rather slow compared to the C# version. There are, of course, way too many variables here to make a final judgement, but it was disappointing.
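
Ed: for anyone following along, here is a minimal sketch of the Python features mentioned in the notes above: lazy generator expressions, the inline if, staticmethod as a decorator, and the difference between match, search and findall. It is not taken from the spider code. The class name UrlTools, the helper names and the example URLs are all made up for illustration, and the sketch assumes a current CPython rather than IronPython.

```python
import re


class UrlTools(object):
    """Hypothetical class, named only for this example."""

    @staticmethod
    def to_absolute_url(base, path):
        # The decorator replaces the older spelling:
        #   to_absolute_url = staticmethod(to_absolute_url)
        return base.rstrip("/") + "/" + path.lstrip("/")


def label(count):
    # Inline if: value_if_true if condition else value_if_false
    return "many" if count > 1 else "few"


# Track which inputs have actually been evaluated.
evaluated = []


def record(n):
    evaluated.append(n)
    return n * n


# A list comprehension runs record() for every element straight away.
eager = [record(n) for n in range(3)]
eager_calls = len(evaluated)

# A generator expression runs nothing until a value is requested.
lazy = (record(n) for n in range(3, 6))
lazy_calls_before = len(evaluated) - eager_calls
first = next(lazy)  # evaluates only the first element (n = 3)
lazy_calls_after = len(evaluated) - eager_calls

text = "see http://a.example and http://b.example"

# match only succeeds at the start of the string; search scans the whole string.
at_start = re.match(r"http://\S+", text)
anywhere = re.search(r"http://\S+", text)

# findall returns whole matches when the pattern has no groups,
# and only the group's text when the pattern has exactly one group.
no_group = re.findall(r"http://\S+", text)
one_group = re.findall(r"http://(\S+)", text)

# finditer yields match objects, which is closer to Matches in .NET.
whole_matches = [m.group(0) for m in re.finditer(r"http://(\S+)", text)]
```

With a pattern that has more than one group, findall returns a list of tuples instead, so finditer is usually the safer choice when you want the match objects themselves.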
He does have some quibbles with the error reporting. He's posted some screenshots of particularly cryptic tracebacks:


  1. Interesting stuff here. LINQ more powerful than list comprehensions? Better delegation and regexen? Maybe I'll take a(nother) look at .NET and C# one of these days.

  2. This comment has been removed by the author.

  3. Someone had to reference this just as I went to Egypt for three weeks. :) I clearly ought to add an "About Me" page. For reference, Retlang is a C# library for parallel processing, not actually a language. I've since updated the post to include corrections (thanks to all who commented).

    I stand by my assertion about LINQ. LINQ has two main advantages as a method of performing list comprehensions: grouping and sorting. The third feature, the deferred compilation model that allows such things as LINQ to SQL, isn't really that useful yet, but it may well be once NHibernate LINQ is completed.

    Julian Birch

