GSoC Week #5

The good news is that I finally received my GSoC payment card this week. 🙂

I cleaned up my patches and fixed some small issues. With all of these patches applied, my branch now shows blue dots (which mean the tests passed without error) for most of the test jobs on the Cython Hudson site. A few others show yellow; these are the test runs against older Python versions such as 2.3 and 2.4. They should not be a big problem.

All the patches are uploaded to Rietveld. It should be time to start calling for reviewers. 🙂

I also produced a patch for bug #543. It was very interesting: after several hours of digging into the code to understand what was happening behind the bug, the patch I finally produced is just one line of code!

Happy hacking!

June 28, 2010. Uncategorized.

GSoC Week #4

I just spent the whole weekend fighting a refcounting bug. Stefan e-mailed me that my branch on the Cython Hudson site was turning red and that I should look at the console output on the site to see what had happened. What I found was that all the test jobs had crashed due to a Python assertion failure:

compiling (cpp) and running knuth_man_or_boy_test ...   Doctest:
knuth_man_or_boy_test.__test__.a (line 44) ... python:
Modules/gcmodule.c:276: visit_decref: Assertion `gc->gc.gc_refs != 0'

It had to be related to the Python GC and refcounting. It was surely caused by my work on nonlocal support (ticket #490), since that is the only patch in which I dealt with refcounting. It was also very strange that the problem only occurred when Cython’s refnanny was enabled.

To solve the problem, I spent some time studying how the Python GC works. I learned that this assertion error is due to incorrect refcounting of an object: during GC traversal, the number of owners found for that object is larger than its reference count, which should never happen.

The next several hours were spent checking the INCREF and DECREF code again and again, but I found no obvious problem. Then I noticed that the crash usually appeared in a GC collection triggered inside a refnanny INCREF or DECREF operation. That gave me the idea that the GC should not be triggered there, since during INCREF and DECREF the reference counts may be inconsistent with the objects’ actual owners. Looking at the code, I found I had written something like this:

scope->v1 = arg1;
scope->v2 = arg2;
scope->v3 = arg3;
Pyx_INCREF(scope->v1); Pyx_GIVEREF(scope->v1);
Pyx_INCREF(scope->v2); Pyx_GIVEREF(scope->v2);
Pyx_INCREF(scope->v3); Pyx_GIVEREF(scope->v3);

Well done, I caught the bug! During the INCREF for v1 or v2, v3 is already owned by the scope object but that reference is not counted yet, so if the GC starts at that point, the objects are in an inconsistent state.

So the lessons learnt are: always do the INCREF before we actually own the object, and do the DECREF after we disown it. INCREF and DECREF look like simple atomic operations, but object deallocation can happen in DECREF (which means execution of arbitrary code), and with a debugging mechanism like Cython’s refnanny, INCREF can also trigger a lot of extra code.
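
To convince myself of the ordering rule, here is a toy Python 3 model of the check the collector performs. It is only a sketch: Obj, consistent, incref and the "args_tuple" holder are made-up names, not CPython or Cython structures, and I am assuming the debugging hook runs right after the count is bumped.

```python
class Obj:
    """Toy stand-in for a GC-tracked object (hypothetical, not a CPython struct)."""

    def __init__(self, name):
        self.name = name
        self.refcount = 0   # what INCREF/DECREF would maintain
        self.slots = []     # objects this one points to (what traversal visits)


def consistent(tracked):
    """Copy refcounts, then subtract one per reference found by traversal.

    Returns False when an object is referenced more often than its refcount
    says, which is the situation the visit_decref assertion guards against.
    """
    gc_refs = {id(o): o.refcount for o in tracked}
    for owner in tracked:
        for target in owner.slots:
            if gc_refs[id(target)] == 0:
                return False
            gc_refs[id(target)] -= 1
    return True


def incref(obj, tracked, log):
    obj.refcount += 1
    # Assumption: a debugging hook runs here and may start a collection.
    log.append(consistent(tracked))


def make_world():
    """Three arguments, each already owned once by a tracked argument holder."""
    args = [Obj("arg%d" % i) for i in (1, 2, 3)]
    holder = Obj("args_tuple")
    for arg in args:
        arg.refcount += 1
        holder.slots.append(arg)
    scope = Obj("scope")
    return args, scope, args + [holder, scope]


def store_then_incref():
    """The buggy order: the scope owns all three before any is counted."""
    args, scope, tracked = make_world()
    log = []
    scope.slots.extend(args)
    for arg in args:
        incref(arg, tracked, log)
    return log


def incref_then_store():
    """The safe order: count each reference before taking ownership."""
    args, scope, tracked = make_world()
    log = []
    for arg in args:
        incref(arg, tracked, log)
        scope.slots.append(arg)
    return log
```

In this model store_then_incref() sees an inconsistent state during the first two INCREFs, while incref_then_store() never does.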

Stefan has also given some useful comments on the cython-dev mailing list.

June 21, 2010. Uncategorized.

GSoC Week #3

I was moving back to Singapore this week, so this post is coming out pretty late. However, I have now settled down and have some time to sit in front of my laptop and work on Cython again. Yeah!

Last week I attempted ticket #69: emulate the Python 3 print() function in Py2 <2.6. After some research, I figured out that the print() function could be implemented easily in Python/Cython code, by putting this code at the head of a module during compilation:

from __future__ import print_function

if __import__('sys').version_info < (2,6):
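    def print(*args, sep=' ', end='\n', file=None):
        import sys
        # Default to standard output, as the Python 3 built-in does.
        if file is None:
            file = sys.stdout
        sep = str(sep)
        end = str(end)
        file.write(sep.join([str(x) for x in args]))
        # Terminate the output with the requested line ending.
        file.write(end)
        return None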

The above code is actually invalid for Python <2.6, since those versions don’t support the print_function future directive. However, it is valid in Cython no matter which version of Python you are using. This is because Cython compiles the code to an extension module and thus circumvents the Python parser. Finally, Python will not complain if you define a function named “print” in an extension module.

But a function def inside an if statement does not work yet in Cython. So we need to implement function definitions in control structures (ticket #87). This means treating a function definition as an assignment, as pure Python code does. Cython already does so in some cases, such as for closures. So what I need to do is enable it for this case as well. It seems not too hard.
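
For reference, this is the plain Python behavior being targeted, as a small sketch (pick_greeting and greet are made-up names): a def statement is executed like an assignment, so whichever branch runs decides what the name is bound to.

```python
def pick_greeting(formal):
    # Each def is an ordinary statement that binds the name "greet"
    # when (and only when) its branch is executed.
    if formal:
        def greet(name):
            return "Good day, " + name
    else:
        def greet(name):
            return "Hi " + name
    return greet
```

Calling pick_greeting(True)("Ann") and pick_greeting(False)("Ann") goes through two different function objects bound to the same name.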

After some hacking, it started to work, but it could not pass some corner cases. I might be going the wrong way: instead of monkey patching against the test failures, shouldn’t I give a complete solution based on a deep understanding of the code base? Hmm... In the end I decided to put it aside. Maybe I can figure out a better solution later, when I have a better understanding of the Cython code base.

I’d like to start working on relative imports next. Meanwhile, it should be time to call on the community to review the patches I have produced, so that I can revise them and learn more.

June 17, 2010. Uncategorized.

GSoC Week #2

This week I continued working on the tickets. Robert reminded me that the -3 compiler switch for Python 3 is already checked in to cython-devel, so I merged the latest cython-devel and cython-closures into my branch. It seems Mercurial is not smart enough: it complained about some conflicts and started vimdiff for me to resolve them manually, even though there were actually no conflicts. Well, I took this as a chance to learn the vim diff mode: ‘do’ to get a diff and ‘dp’ to put a diff. ‘:%diffg’ is very useful for getting all the diffs from the right side (‘theirs’ in SVN terms).

After that, I did the exception catching semantic change in Python 3 (#541). It means doing proper cleanup of the caught exception. I did this with a new tree transform, and then worried a bit about the performance cost of adding an extra transform.
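
As an illustration of the Python 3 rule I take this ticket to be about (that reading is my assumption; the ticket text is not quoted here), the name bound by "except ... as" is deleted when the except block ends. A minimal sketch in plain Python 3, with a made-up function name:

```python
def name_after_except_block():
    try:
        raise ValueError("boom")
    except ValueError as exc:
        message = str(exc)
    # Python 3 implicitly deletes "exc" at the end of the except block,
    # so looking it up here fails (UnboundLocalError is a NameError).
    try:
        exc
    except NameError:
        return message, False
    return message, True
```

Values copied out of the exception inside the block, like message here, survive; only the exception name itself is cleaned up.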

I also added ‘...’ as the Ellipsis object (#488). Again this is an easy fix; only some modification of the parser is involved.
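
In plain Python 3 the literal behaves as sketched below; Recorder and ellipsis_checks are made-up names used only to show what an object receives when it is indexed with the literal.

```python
class Recorder:
    """Returns whatever key it is indexed with (illustrative helper)."""

    def __getitem__(self, key):
        return key


def ellipsis_checks():
    literal = ...                  # the bare literal is the Ellipsis singleton
    key = Recorder()[..., 0]       # inside a subscript it is passed through as-is
    return literal is Ellipsis, key == (Ellipsis, 0)
```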

Meanwhile, I found that Cython does not support relative imports (e.g. from . import bar), so I created a ticket for that (#542) and am wondering how this could be implemented.
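
For readers who have not used relative imports, here is a self-contained Python 3 sketch of the pure Python behavior. It builds a throwaway package on disk; the package name demo_pkg_rel and its modules foo and bar are hypothetical, and nothing here involves Cython itself.

```python
import importlib
import os
import shutil
import sys
import tempfile


def demo_relative_import():
    """Create a tiny package and import a module that uses 'from . import bar'."""
    root = tempfile.mkdtemp()
    pkg = os.path.join(root, "demo_pkg_rel")
    os.mkdir(pkg)
    sources = {
        "__init__.py": "",
        "bar.py": "VALUE = 42\n",
        # The leading dot means "the package this module lives in".
        "foo.py": "from . import bar\nRESULT = bar.VALUE\n",
    }
    for filename, text in sources.items():
        with open(os.path.join(pkg, filename), "w") as handle:
            handle.write(text)
    sys.path.insert(0, root)
    try:
        importlib.invalidate_caches()
        foo = importlib.import_module("demo_pkg_rel.foo")
        return foo.RESULT
    finally:
        # Undo the path change and remove the temporary files.
        sys.path.remove(root)
        for name in list(sys.modules):
            if name == "demo_pkg_rel" or name.startswith("demo_pkg_rel."):
                del sys.modules[name]
        shutil.rmtree(root, ignore_errors=True)
```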

June 8, 2010. Uncategorized.

GSoC Week #1

The first coding week of Google Summer of Code has just passed. I was working on some bug fixes for Cython, according to the schedule in my proposal.

The first thing I did was revise a patch I produced during the GSoC application period for Cython ticket #422 (bug in setting __module__). With some suggestions from Robert, I found a way to make the module name a const string by using a mechanism already provided by Cython, so there is no need for conversion between PyObject and C string.

After that, I came to the nonlocal keyword, which is a Python 3 feature specified by PEP 3104. The implementation is straightforward: modify the parser and add a new node that looks up the name, so that it brings the variable from the outer closure scope into the current scope. Since the ‘nonlocal’ keyword is similar to the ‘global’ keyword, I could use ‘global’ as a model to follow while implementing it. However, I then found that the type inferencer did something wrong with variables declared nonlocal, so I disabled it for nonlocal variables. Now the test case ran, but refnanny started to complain, since inner functions can now change the object that an outer name points to, and this confuses refnanny. To silence refnanny, I changed the refnanny-generating code to pull the closure variables completely out of refnanny’s control. Then it was done. With some more tests, I also discovered a new bug in the closure code.
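
A small Python 3 sketch of what the keyword does (make_counter and bump are made-up names): the inner function rebinds a variable that lives in the enclosing function’s scope, which is exactly the "inner function changes what an outer name points to" situation described above.

```python
def make_counter():
    count = 0

    def bump():
        # Without this declaration, "count += 1" would try to create a new
        # local variable and fail; nonlocal rebinds the outer one instead.
        nonlocal count
        count += 1
        return count

    return bump
```

Each call to make_counter() gets its own count, so two counters do not interfere with each other.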

Then I worked out the ‘with’ statement with multiple managers. This was easier. All I did was a bit of refactoring of the parser code for the with statement.
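
As a reminder of the semantics, here is a Python 3 sketch using contextlib (tracked and run_both are made-up names): the managers are entered left to right and exited in reverse order, as if the with statements were nested.

```python
from contextlib import contextmanager


@contextmanager
def tracked(name, log):
    """Record when the block is entered and left."""
    log.append("enter " + name)
    try:
        yield name
    finally:
        log.append("exit " + name)


def run_both():
    log = []
    # Equivalent to nesting "with tracked('b', log)" inside "with tracked('a', log)".
    with tracked("a", log) as a, tracked("b", log) as b:
        log.append("body " + a + b)
    return log
```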

I also looked into a ticket saying that Python 3 integer division is not respected by C ints. However, when I tried to reproduce the described erroneous behavior, I found that it seems to have been fixed already. So I just produced a test case for it.
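
To show why the two kinds of division can disagree, here is a plain Python 3 sketch. c_style_div is a made-up helper that imitates C’s truncating integer division; I am assuming this difference is what the ticket was about, since its text is not quoted here.

```python
def c_style_div(a, b):
    """Imitate C integer division, which truncates toward zero."""
    quotient = abs(a) // abs(b)
    return quotient if (a < 0) == (b < 0) else -quotient


def division_table(a, b):
    # Python 3: "/" is true division, "//" floors toward negative infinity.
    return {"true": a / b, "floor": a // b, "c_style": c_style_div(a, b)}
```

For positive operands floor and C-style division agree; they differ once the signs of the operands differ.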

Finally, I got the explicit exception chaining syntax done. Again, this involved changes in the parser, nodes and some utility code. It just took some time to write a test case that runs on both Python 2 and 3.
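
For reference, the syntax in question is "raise ... from ...". A minimal Python 3 sketch (lookup and cause_of_failed_lookup are made-up names): the original exception is attached to the new one as its __cause__.

```python
def lookup(table, key):
    try:
        return table[key]
    except KeyError as err:
        # Explicit chaining: keep the low-level error as the cause.
        raise RuntimeError("no entry for %r" % (key,)) from err


def cause_of_failed_lookup():
    try:
        lookup({}, "missing")
    except RuntimeError as exc:
        return type(exc.__cause__).__name__
    return None
```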

With this work, I think I have grasped the basic idea of how the parser, tree nodes and scoping work in Cython. The next challenge will be to figure out how the Cython “-3” option should change the behavior.

June 1, 2010. Uncategorized.