Monday, January 05, 2009

DevHawk: IronPython and Linq to XML

Continuing his IronPython and Microsoft Technologies series, Harry Pierson has posted four articles on working with XML and Linq from IronPython.
Linq stands for Language Integrated Query and is part of .NET 3.5. It adds first-class support for querying data sources to C# and VB.NET. Underlying this are libraries that can be used from any .NET language, whether or not it has syntactic support for Linq. In its basic form, "Linq over objects" is remarkably similar to Python list comprehensions and generator expressions.
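To make the comparison concrete, here is a small Python sketch of a filter-and-project query written first as a list comprehension and then as a generator expression. The `people` data is invented for illustration.

```python
# Hypothetical sample data: (name, gender) pairs, invented for this illustration.
people = [
    ("Bob", "boy"),
    ("Sally", "girl"),
    ("Jack", "boy"),
    ("Sarah", "girl"),
    ("Philbert", "boy"),
]

# List comprehension: filter (like Where) and project (like Select) eagerly.
boy_names = [name for name, gender in people if gender == "boy"]

# Generator expression: the same query, evaluated lazily as it is iterated,
# which is closer to the deferred execution of a Linq query.
lazy_boy_names = (name for name, gender in people if gender == "boy")

for name in lazy_boy_names:
    print(name)
```

The `if` clause plays the role of Where and the leading expression plays the role of Select.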

IronPython doesn't have syntactic support for Linq, but you can use the libraries. Part 2 of this series does exactly this:
LINQ to objects works just fine from IronPython, with a few caveats. First, IronPython doesn’t have extension methods, so you can’t chain calls together sequentially like you can in C#. So instead of collection.Where(…).Select(…), you have to write Select(Where(collection, …), …). Second, all the LINQ methods are generic, so you have to use the verbose list syntax (for example: Single[object] or Select[object,object]). Since Python doesn’t care about the generic types, I wrote a bunch of simple helper functions around the common LINQ methods that just use object as the generic type.
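The difference in call shape can be sketched in plain Python. The `where` and `select` functions below are pure-Python stand-ins invented for illustration; they are not Harry's helpers, which wrap the real System.Linq methods and need IronPython to run.

```python
# Pure-Python stand-ins for the Linq Where and Select methods (hypothetical;
# the real helpers would wrap the .NET Linq methods and need IronPython).
def where(items, predicate):
    """Lazily yield the items for which predicate(item) is true."""
    return (item for item in items if predicate(item))


def select(items, projection):
    """Lazily yield projection(item) for every item."""
    return (projection(item) for item in items)


numbers = [1, 2, 3, 4, 5, 6]  # made-up sample data

# C# would chain the calls: numbers.Where(...).Select(...)
# Without extension methods the calls nest, innermost first:
squares = list(select(where(numbers, lambda n: n % 2 == 0), lambda n: n * n))

print(squares)
```

Reading the nested form inside out gives the same order of operations as the chained form: filter first, then project.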
As for the motivation for this, part 1 explains:
There are lots of songs available for Rock Band - 461 currently available between on-disc and downloadable tracks – with more added every week. Frankly, there’s lots of music on that list that I don’t recognize. Luckily, I’m also a Zune Pass subscriber, so I can go out and download all the Rock Band tracks and listen to them on my Zune. But who has time to manually search for 461 songs? Not me. So I wrote a little Python app to download the list of Rock Band songs and save it as a Zune playlist.
Part 3 uses XDocument.Load (part of Linq to XML) to read XML, and part 4 uses XmlWriter to write out the playlist.

IronPython (well, the DLR as a whole) shares an important element with Linq. You construct Linq queries with lambda expressions that filter and project the results. For example (from this blog entry):

static void UseLINQ()
{
    var names = new List<GenderedName> {
        new GenderedName { Name="Bob", Gender=Gender.Boy }
        , new GenderedName { Name="Sally", Gender=Gender.Girl }
        , new GenderedName { Name="Jack", Gender=Gender.Boy }
        , new GenderedName { Name="Sarah", Gender=Gender.Girl }
        , new GenderedName { Name="Philbert", Gender=Gender.Boy }
    };

    // Keep only the boys, then project each match to an anonymous type with a Name property.
    var boyNames = names.Where((n) => n.Gender == Gender.Boy).Select((n) => new { n.Name });

    foreach (var name in boyNames)
        Console.WriteLine("{0}", name.Name);
}
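When a query targets a provider such as Linq to SQL, the lambda is not compiled straight to code: it becomes an expression tree, which is effectively an AST. The provider can then translate this tree into a query for its data source; unsurprisingly, Linq to SQL translates it to SQL queries. (For in-memory collections like the list above, the lambda is compiled to an ordinary delegate.)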
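As a loose analogy in Python (this is not how Linq or the DLR is implemented), the standard `ast` module can turn a lambda's source into a tree, and a provider-like function can walk that tree to produce a query string. The `to_sql_where` function and the single comparison shape it supports are assumptions made for this sketch.

```python
import ast


def to_sql_where(lambda_source):
    """Translate source like "lambda n: n.gender == 'boy'" into a SQL-style
    WHERE fragment. Hypothetical helper: it only handles a single
    `param.attribute == constant` comparison."""
    tree = ast.parse(lambda_source, mode="eval")
    lam = tree.body
    if not isinstance(lam, ast.Lambda):
        raise ValueError("expected a lambda expression")

    compare = lam.body
    if not (
        isinstance(compare, ast.Compare)
        and len(compare.ops) == 1
        and isinstance(compare.ops[0], ast.Eq)
        and isinstance(compare.left, ast.Attribute)
        and isinstance(compare.comparators[0], ast.Constant)
    ):
        raise ValueError("only `param.attribute == constant` is supported")

    column = compare.left.attr
    value = compare.comparators[0].value
    # repr() is enough for a demo; real code must use parameterised queries.
    return "WHERE {} = {!r}".format(column, value)


print(to_sql_where("lambda n: n.gender == 'boy'"))
```

The point is only that once the predicate is available as data rather than compiled code, it can be inspected and re-expressed in another language.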

The Dynamic Language Runtime uses expression trees as the AST representation for dynamic languages.

You can also represent the same expression with first-class Linq syntax (where and select keywords):
var boyNames = from n in names
               where n.Gender == Gender.Boy
               select new { n.Name };
Introducing this capability into IronPython without changing the language is a challenge for the IronPython team. It may not be possible, but adding new keywords would draw criticism from those worried that Microsoft have entered the 'extend' phase of 'embrace, extend and extinguish' in their newfound enthusiasm for dynamic languages!
