We were given a semester-long project whose ultimate goal is to create a working language translator: we take text written in our made-up programming language and translate it into C. Sounds simple enough, right? Well, not exactly. If we tried to tackle the project all at once, it would be too much to handle. But if we take it on one bit at a time, it probably won't be so bad.
And this is exactly what happened. We were given the project in iterations, where the first iteration was to implement a scanner to tokenize a piece of text into "words" using regular expressions and to store these "words" into a data structure. Also, we were required to code this in C++.
Well, wait a minute. It's nice to break up the project into parts, but I am new to C++. I can't really do this by myself, not without some help. And so, we were allowed to work in pairs. Subversion was also introduced to aid us in our group work. Using Subversion, each partner could edit their own working copy of the project code and then merge both sets of changes together. With the help of Subversion and a project partner, we started working on iteration 1.
We started discussing all the requirements and general ideas about how the scanner should work, such as the sub-functions the main scanner function would use and how it would go through and scan the text. Then, we began creating the regular expressions for each of the token types, or "words". Afterwards, we started implementing the test cases to test if our regular expressions were correct. This is when all the "pain" started.
There was a lot of repetitive code. There were over 50 token types to match, and writing test cases to match each case would be quite tedious! So, why don't we just start coding the scanner? All these repetitive tests would take too much precious time. This would be such a pain! But in fact, we learned in class that writing test cases before coding the real deal would help set the foundation and guide us through what the scanner should do. Following this idea, we decided to divide the work of writing the test cases in half. My project partner wrote test cases for the first half of the token list and I wrote test cases for the second half. Afterwards, we committed and merged our revisions of our code and executed the makefile to run our tests. With a few errors and a few fixes, we got all our test cases to pass. So far, there was only minor pain. However, from here onward, the pain was only going to get worse.
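One way to cut down on that repetition is to keep the test cases in a table and loop over it. The sketch below is only an illustration, not the code we wrote: the RegexCase struct, the countFailures function, and the patterns in the usage note are all hypothetical names and values. It assumes each regular expression should match at the very start of the input, which is what std::regex_constants::match_continuous enforces.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical test-table row: a pattern, an input, and how many leading
// characters of the input the pattern is expected to match (0 = no match).
struct RegexCase {
    std::string pattern;
    std::string input;
    std::size_t expectedLength;
};

// Runs every case in the table and returns how many of them failed.
std::size_t countFailures(const std::vector<RegexCase>& cases) {
    std::size_t failures = 0;
    for (const RegexCase& c : cases) {
        const std::regex re(c.pattern);
        std::smatch m;
        std::size_t matched = 0;
        // match_continuous: the match must begin at the first character.
        if (std::regex_search(c.input, m, re,
                              std::regex_constants::match_continuous)) {
            matched = static_cast<std::size_t>(m.length(0));
        }
        if (matched != c.expectedLength) {
            ++failures;
        }
    }
    return failures;
}
```

With a helper like this, covering one more token type means adding one more row, for example {"[0-9]+", "42;", 2}, instead of another copy of the same test body.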
We then started implementing the scanner. With our test cases, we wanted the scanner to match what our test cases described. Thus, we wanted our scanner to check each "word" and see what type it was. We then wrote a bunch of if statements. What's next? Well, we also wanted the scan function to ignore white space and comments. So, we began creating a function to do that. We also wanted to create the constructor and the array to hold all the scanned tokens. So, we implemented that. There were a lot of things we needed to do in the scanner, but after a couple hours of coding, we committed, merged, and updated our code. We completed the scanner, or so we hoped.
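The overall shape of such a scanner can be sketched as follows. This is a minimal sketch under stated assumptions, not our actual code: it uses a small table of rules in place of our long chain of if statements, a std::vector in place of our array, only five made-up token types, and it assumes C-style // and /* */ comments, which may not match the project's language. The names TokenType, Token, skipIgnorable, and scan are all hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token types; the real project had more than 50.
enum class TokenType { IntConst, Identifier, Plus, Assign, Semicolon, Unknown };

struct Token {
    TokenType type;
    std::string lexeme;
};

// Returns the first position at or after pos that is neither white space
// nor inside a comment (assumed comment syntax: // and /* */).
std::size_t skipIgnorable(const std::string& text, std::size_t pos) {
    while (pos < text.size()) {
        if (std::isspace(static_cast<unsigned char>(text[pos]))) {
            ++pos;
        } else if (text.compare(pos, 2, "//") == 0) {
            pos = text.find('\n', pos);
            if (pos == std::string::npos) pos = text.size();
        } else if (text.compare(pos, 2, "/*") == 0) {
            const std::size_t end = text.find("*/", pos + 2);
            pos = (end == std::string::npos) ? text.size() : end + 2;
        } else {
            break;
        }
    }
    return pos;
}

// Scans text into tokens, keeping the longest match at each position.
std::vector<Token> scan(const std::string& text) {
    static const std::vector<std::pair<TokenType, std::regex>> rules = {
        {TokenType::IntConst, std::regex("[0-9]+")},
        {TokenType::Identifier, std::regex("[a-zA-Z_][a-zA-Z0-9_]*")},
        {TokenType::Plus, std::regex("\\+")},
        {TokenType::Assign, std::regex("=")},
        {TokenType::Semicolon, std::regex(";")},
    };
    std::vector<Token> tokens;
    std::size_t pos = skipIgnorable(text, 0);
    while (pos < text.size()) {
        TokenType bestType = TokenType::Unknown;
        std::size_t bestLen = 0;
        const auto start = text.begin() + static_cast<std::ptrdiff_t>(pos);
        for (const auto& rule : rules) {
            std::smatch m;
            if (std::regex_search(start, text.end(), m, rule.second,
                                  std::regex_constants::match_continuous)) {
                const auto len = static_cast<std::size_t>(m.length(0));
                if (len > bestLen) {
                    bestLen = len;
                    bestType = rule.first;
                }
            }
        }
        // Unrecognised character: consume it so the loop always advances.
        if (bestLen == 0) bestLen = 1;
        tokens.push_back({bestType, text.substr(pos, bestLen)});
        pos = skipIgnorable(text, pos + bestLen);
    }
    return tokens;
}
```

Because skipIgnorable and scan are separate functions, each one can get its own small test cases before the other is written. An unrecognised character becomes a one-character Unknown token here so the loop cannot get stuck; that is a choice made for this sketch only.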
To test the scanner, we ran the given iteration 1 assessment tests with fingers crossed. Two dreaded words popped up on the command line: Segmentation Fault. The initial pain of seeing this error was saddening, but we had seen it so many times that we were somewhat desensitised to it. However, the pain we had to go through afterwards to fix this error was almost unbearable. We had fallen into the trap that comes with writing long stretches of code without testing. We had written test cases for the regular expressions, but not for every aspect of the scan function before writing it. Now, there were no test cases to help us locate the errors. We had to debug everything. It was painful.
After several more laborious hours, we finished debugging our code and all the test cases passed. What a relief! Next, we started writing more tests to check for corner cases to see if the code would break. However, we spent so much time debugging and writing more tests that we neglected formatting and commenting the code along the way. First, we needed to indent everything correctly to follow the emulated pure blocks style. Without an auto-indenting feature, this process would have been painful. Second, we needed to comment our code. This process wasn't too painful. Most of our code was readable since we used descriptive names for variables and methods, so our comments were brief. Third, all lines of code needed to be under 90 characters long. This was a major pain. Over half of our lines of code extended beyond the 90-character limit. We had to manually reformat everything even though our scanner was already finished.
Thankfully, after a half hour or so, the formatting was done and iteration 1 was complete. We faced many painful challenges in iteration 1, but I learned a few important things along the way. One of the most important things to remember is to write test cases for every aspect of the scanner beforehand. Not only will this help you find errors in your code, it will also save you time in the end. Also, format and comment your code as you write it. It will save you the pain of going on a mass formatting and commenting spree to make it look nice, even though your code runs perfectly. This leads to my final lesson learned: Do everything a little bit at a time. Code a little, then test it. Comment and format it, then continue coding again. These small steps help keep all aspects of your program flowing and developing together as you write, and it helps avoid the pain of long stretches of monotonous work.
Of course, everything is easier said than done. But I think what we went through in iteration 1 definitely had an impact on how we would go about doing iteration 2. In iteration 2, we are implementing a parser. Much like the scanner, we will need to study how the parser should behave, and create a bunch of test cases to test each aspect of the parser before writing the parser itself. Hopefully, this will ease the pain involved! But as Daniel Berry wrote in his paper, "The Inevitable Pain of Software Development, Including Extreme Programming, Caused by Requirements Volatility," there is no silver bullet to eliminate all the pain. Little of our pain came from working with a partner or using Subversion; most of it came from debugging, testing, and modifying our code. Some pain is pretty much inevitable. But that just comes with being a programmer. Of course, as good optimistic programmers, we learn to deal with it, and maybe even learn to enjoy it.