Setting up for developing Sage is one thing; actually contributing to the development is a completely different ball game. The development process, broadly speaking, consists of the following steps:
- Identify a defect, bug, enhancement or new feature that Sage has or requires.
- Open a ticket on Trac describing the exact problem.
- Discuss the issue with other community members to clarify it and to exchange solution and implementation ideas.
- Write code and submit.
- Get it reviewed by at least one other independent programmer.
- If the review is positive, the ticket is closed and the code is merged. Otherwise, go back to step 3 and repeat.
As a contributor, one needs to know how to do both: review code and write it successfully. That is what this post describes. I’ll use tickets that I worked on to illustrate these aspects of development in further detail. To avoid confusion, I’ll use Code when talking about Coding Theory and code when talking about programming.
Reviewing a Ticket:
David and Johan (my mentors for the project) suggested Ticket #20113. The `zero` method in the `linear_code.py` file was supposed to return the zero vector of the associated Code. The original implementation called the `gens` method (the generators of the Code), took the first element of the returned list and multiplied it by 0. This resulted in the all-zero vector whose length is the length of the Code. While the result wasn’t incorrect, this was a rather roundabout way of obtaining it. David had already submitted a patch that simply asked the ambient space of the Code for its zero element.
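To make the difference concrete, here is a toy sketch in plain Python. It is not Sage's code: the function names `zero_via_generator` and `zero_via_ambient` are made up for illustration, and the Code is simply represented by a list of generator tuples.

```python
# Toy illustration only: a Code is represented by its generators (tuples of 0/1).
# Both function names are hypothetical and do not exist in Sage.

def zero_via_generator(generators):
    """Roundabout way: take the first generator and multiply every entry by 0."""
    first = generators[0]
    return tuple(0 * entry for entry in first)


def zero_via_ambient(length):
    """Direct way: build the zero vector of the ambient space of given length."""
    return (0,) * length


generators = [(1, 0, 0, 1), (0, 1, 0, 1), (0, 0, 1, 1)]
print(zero_via_generator(generators) == zero_via_ambient(4))
```

Both routes give the same vector. The second one never has to compute the generators at all, which is the whole point of the patch.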
My job was to review this new code. At first glance it seemed trivially correct, and it was. But this may not always be the case. Sage is a massive beast, and a small change in one part of the code can break something in some other, seemingly unrelated part. There are a lot of things that need to be verified, and the best way is to simply follow the Reviewer’s Checklist. I’ll clarify some of the terms from there first and then provide a general rule-of-thumb workflow for interacting with the git-trac system during a review.
- Doctests and Coverage: The documentation for each method in Sage contains code examples that explain how to use the method. These commands can be extracted automatically, run, and their output compared to the one given in the documentation. If the patch changes code, the doctests for all the changed methods must still pass. To run them, use `./sage -t src/sage/coding/linear_code.py`. To check that every function in the file is covered by a doctest, run `./sage --coverage src/sage/coding/linear_code.py`.
- Manuals and Building: Manuals serve as a reference and can be incredibly useful for quick lookups. Sage adds material to its manuals from the documentation in the source code itself, which is why the documentation should follow a specific template. See here for examples and notice the formatting and indentation. Run `./sage --docbuild reference html` to build the HTML version of the “Reference” manual. Other manuals include tutorial, developer, constructions and installation. These are also available as PDFs.
- Speedup: Generally, it is hard to check the speed. Standard Python timing tools (`import time`, then `time.clock()`, or `time.perf_counter()` on Python 3) can be used. The main idea is to ensure that the new code does not drastically slow down Sage. Patchbot reports can also be used, but with caution: double-check any errors they report. Run methods on large inputs, compare against other software that offers similar functionality, and try boundary cases and consistency checks based on the mathematical foundations.
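To give a feel for what a doctest is, here is a small sketch using plain Python's standard `doctest` module. This is an assumption-laden stand-in: Sage has its own doctest runner and writes examples with a `sage:` prompt rather than `>>>`, and the function `hamming_weight` and the helper `run_examples` below are made up for this illustration.

```python
import doctest


def hamming_weight(word):
    """
    Return the number of nonzero entries of ``word``.

    EXAMPLES::

        >>> hamming_weight([1, 0, 1, 1])
        3
        >>> hamming_weight([0, 0, 0])
        0
    """
    return sum(1 for symbol in word if symbol != 0)


def run_examples(func):
    """Extract the examples from func's docstring, run them, report results."""
    parser = doctest.DocTestParser()
    # The globals dict tells the examples which names they are allowed to use.
    test = parser.get_doctest(func.__doc__, {func.__name__: func},
                              func.__name__, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)  # a (failed, attempted) result


results = run_examples(hamming_weight)
print("failed:", results.failed, "attempted:", results.attempted)
```

If a patch changed `hamming_weight` so that one of the documented outputs no longer matched, the failed count would go up. That is exactly the kind of breakage a reviewer is looking for.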
With the checklist in mind, a rule-of-thumb review workflow looks like this:
- `git trac checkout TICKET_NUM` (this creates a local branch, say t/20113)
- `git checkout t/20113`
- `./sage -b` (see if the code builds without errors)
- Run through the Checklist above and note down errors, if any.
- Run various test examples in an attempt to “break” the code.
- Go to Trac and write comment(s).
- Change ticket status.
- Write your full name in the Reviewer field of the ticket.
Writing a Ticket:
David and Johan suggested improving the efficiency of the `decode_to_code` method of the Nearest Neighbour Decoder. I opened Ticket #20201 on Trac to fix this. The previous implementation computed and stored the distance between the received word and every codeword. It then sorted the entire list in order to find the closest codeword. Since the number of codewords grows exponentially with the dimension of the Code, this took exponential memory and time, which is very inefficient for large inputs. An obvious solution is to compute the Hamming distance from the received word to the first codeword and set that as the minimum. Then, iterating over the rest of the codewords and updating the minimum whenever a closer codeword turns up drastically brings down the memory and run time requirements.
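The two approaches can be sketched in plain Python as follows. This is a toy model under stated assumptions, not the code from the ticket: Codes are binary and given by a list of generator tuples, and the names `codewords`, `decode_by_sorting` and `decode_by_running_minimum` are invented for the example. The generators used are those of a [7,4] Hamming Code in systematic form.

```python
import timeit
from itertools import product


def hamming_distance(u, v):
    """Number of positions in which u and v differ."""
    return sum(1 for a, b in zip(u, v) if a != b)


def codewords(generators):
    """Lazily yield every codeword of the binary Code spanned by generators."""
    n = len(generators[0])
    for coeffs in product((0, 1), repeat=len(generators)):
        word = [0] * n
        for c, g in zip(coeffs, generators):
            if c:
                word = [(w + x) % 2 for w, x in zip(word, g)]
        yield tuple(word)


def decode_by_sorting(received, generators):
    """Old idea: store the distance to every codeword, then sort the list."""
    distances = [(hamming_distance(received, c), c)
                 for c in codewords(generators)]
    distances.sort()
    return distances[0][1]


def decode_by_running_minimum(received, generators):
    """New idea: keep only the closest codeword seen so far."""
    words = codewords(generators)
    best = next(words)
    best_distance = hamming_distance(received, best)
    for c in words:
        d = hamming_distance(received, c)
        if d < best_distance:
            best, best_distance = c, d
    return best


hamming_7_4 = [(1, 0, 0, 0, 1, 1, 0),
               (0, 1, 0, 0, 1, 0, 1),
               (0, 0, 1, 0, 0, 1, 1),
               (0, 0, 0, 1, 1, 1, 1)]
# Codeword (1,1,0,0,0,1,1) with a single error at index 4.
received = (1, 1, 0, 0, 1, 1, 1)
print(decode_by_sorting(received, hamming_7_4))
print(decode_by_running_minimum(received, hamming_7_4))

# Rough timing comparison; the numbers depend on the machine.
for f in (decode_by_sorting, decode_by_running_minimum):
    print(f.__name__, timeit.timeit(lambda: f(received, hamming_7_4), number=200))
```

Both versions still visit every codeword, so the running time stays exponential in the dimension. What the second version saves is the stored list and the sort. On a Code this small the timing difference is negligible; the gain shows up as the dimension grows. When several codewords are equally close, the two versions may also return different ones, since they break ties differently.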
There are some major differences when it comes to actually opening a new ticket and writing code for it. Whenever you write new code in Sage, you might want to add some print statements, run the files, test, make small changes and test again. Each time, you would have to rebuild Sage (`./sage -b`) or in some cases even run `make distclean && make`, which can take a very long time. Instead, the best way is to write your new code in a `myfile.sage` file and run that while you experiment.
- Open a new ticket on Trac. If a ticket with commits already exists, create a local branch on your system and then pull the changes from the ticket onto your branch. Do not check out the ticket directly; it can result in very weird errors.
- Discuss the solution idea on Trac.
- On the local branch (NOT master) corresponding to the ticket, write your code.
- Test your code.
- `git add <changed_file(s)>`
- `git commit -m "insert detailed message here."`
- `git trac push TICKET_NUM`
- Once you’re satisfied with your commits, set the ticket status to “needs_review”.
- If the review is positive, the ticket is closed. Otherwise, go back to step 2.
As a final note, Sage is guaranteed to break down at some point for no discernible reason, and I’ve lost track of the time I’ve spent trying to fix it. But in doing so, I’ve learnt a lot about the codebase, and once you successfully manage to rebuild Sage, there is a weirdly awesome feeling of accomplishment!