My conclusions from this are:
- If it's not tested it doesn't work. And when you change the environment under which your code runs you need to retest.
- The unit tests paid off again, since they allowed me to find many problems in places I didn't expect.
- I need to do more to flush out these hidden assumptions. One way is to add a randomized test that runs over longer periods and plays with some variables. I could easily randomize the order in which I go over the checks, inject random errors (the system is supposed to be self healing) or delays in some strategic points, etc. The not-so-easy part is verifying the system behaved correctly, and being able to reproduce problems once they surface.
- Would be really good if I had code coverage tools. Don't think there's anything available for IPy :-(
- This happens every time I port non-trivial code. Need to stop being surprised :-)