Monday, October 26, 2009

Python on .NET: PyPy-cli-jit

PyPy combines two things: a compiler toolchain that lets you write interpreters (virtual machines, really) in RPython, a static subset of Python, and an implementation of Python written in RPython.

The compiler toolchain allows you to compile your interpreters for several backends. The major backends are native code, the JVM and the CLI (which covers .NET and Mono). This allows you to maintain a single codebase and produce an interpreter that runs on all these platforms. The PyPy .NET backend is called pypy-cli.

PyPy on its own runs somewhere between the same speed as CPython and half that speed. Because it has better garbage collection (PyPy doesn't use reference counting) there are some things it does faster, but there are also places where it is slower.

There are various cool things about PyPy; for example, it is already useful where you want a sandboxed Python interpreter. The really cool thing, and perhaps one of the driving forces for creating PyPy, is the JIT compiler that aims to provide a radical performance improvement over standard Python.

The JIT has gone through several experimental versions, causing many in the Python community to lose hope that it will ever live up to expectations, but the latest version has proved viable. It is gradually being integrated with the compiler toolchain and expanded to cover more of the Python language. The PyPy team have posted a series of blog entries demonstrating the power of the JIT.

The clever thing about the JIT is that because of the way it works it is even useful for the backends that also have their own JIT, like the JVM and .NET. The JIT is able to generate statically typed bytecode for performing operations with known types, allowing operations that don't use the dynamic features of Python (which are typically only used in a small proportion of Python code) to run very fast. The JVM and .NET backends of PyPy are therefore 'double-JITted'. The JIT emits bytecode which is then compiled to native code by the platform JIT (the PyPy JIT for the native backend directly emits assembler).
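To make the idea of "operations with known types" a little more concrete, here is a toy sketch in plain Python. It is not how the PyPy JIT is implemented; the names generic_add and make_specialized_add are made up for illustration. It only shows the general shape: a fast path that is valid while a type guard holds, and a fallback to the fully dynamic path when it doesn't.

```python
import operator


def generic_add(a, b):
    """Fully dynamic path: Python works out what '+' means at run time."""
    return operator.add(a, b)


def make_specialized_add(expected_type):
    """Build an add function specialized for one operand type.

    Hypothetical helper for illustration only. A real JIT would emit
    typed bytecode or machine code here instead of a Python closure.
    """
    def specialized(a, b):
        # Guard: the fast path is only valid while both operands
        # have exactly the type the specialization was built for.
        if type(a) is expected_type and type(b) is expected_type:
            # Fast path: call the type's own addition directly.
            return expected_type.__add__(a, b)
        # Guard failed: fall back to the generic, dynamic path.
        return generic_add(a, b)
    return specialized


if __name__ == "__main__":
    add_int = make_specialized_add(int)
    print(add_int(2, 3))      # guard holds, fast path
    print(add_int(2.5, 1))    # guard fails, generic path
```

In this sketch the guard is just a Python type check, so there is no actual speed-up; the point is the structure, where code that sticks to known types stays on the fast path.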

A recent blog entry gives some performance metrics for the JIT on the CLI backend (pypy-cli-jit), including comparisons with IronPython on both .NET and Mono. The result: depending on the operation, pypy-cli-jit ranges from half the speed of IronPython to five times faster. As the PyPy JIT improves so will these numbers. From the entry:
As the readers of this blog already know, I've been working on porting the JIT to CLI/.NET for the last months. Now that it's finally possible to get a working pypy-cli-jit, it's time to do some benchmarks.

Warning: as usual, all of this has to be considered to be an alpha version: don't be surprised if you get a crash when trying to run pypy-cli-jit. Of course, things are improving very quickly so it should become more and more stable as days pass.

For this time, I decided to run four benchmarks. Note that for all of them we run the main function once in advance, to let the JIT recognize the hot loops and emit the corresponding code. Thus, the results reported do not include the time spent by the JIT compiler itself, but give a good measure of how good the code generated by the JIT is. At this point in time, I know that the CLI JIT backend spends way too much time compiling stuff, but this issue will be fixed soon.
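The warm-up approach described in the quoted entry can be sketched roughly like this. The main function below is a made-up stand-in for a real benchmark body, and timed_after_warmup is a hypothetical helper, not anything from the PyPy benchmarks themselves.

```python
import time


def main(n=100000):
    """Hypothetical hot loop standing in for a real benchmark."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed_after_warmup(func, *args):
    """Call func once untimed, then time a second call.

    The first call gives a JIT the chance to spot and compile the
    hot loops, so the timed call measures the generated code rather
    than the compilation itself.
    """
    func(*args)  # warm-up run, deliberately not timed
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed


if __name__ == "__main__":
    result, elapsed = timed_after_warmup(main, 100000)
    print("result=%d elapsed=%.6fs" % (result, elapsed))
```

Measured this way, the numbers say how good the JIT-generated code is, but they hide the compilation cost, which is exactly the caveat the entry raises about the CLI JIT backend.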
What pypy-cli doesn't yet have that IronPython has is the close integration with the .NET framework. Whilst there is some support (you can use some framework classes), the current focus is on getting the JIT to work. It will be quite some time before pypy-cli is a viable alternative to IronPython, or even useful in a practical sense for anything, but it is still good to see progress.
