Building up with confidence


Much of the time, your code is not flawless the first time you write it. Neither is mine. Neither is a programming veteran's. To make matters worse, sometimes we don't even try out the new little sections of code we write, especially trivial functions, because we assume there was no way we could have messed them up. Fast forward to having written a lot of code and deciding to test how it all works, and we are shocked to see failed tests. Development time then increases because we have to trace through the code and debug it to find where things went wrong, and the cause may stay hidden until we dig into the lowest levels of the code.

Luckily, we can use unit tests to avoid situations like these while legitimately boosting confidence in our code. Typos and logic errors can find their way into anyone's code without being noticed. Even if the program compiles, that says nothing about whether the software behaves correctly. Only when you have tested even the simplest code can you be sure it was written correctly and build on top of it with confidence in its functionality. Unit testing is intended to test the small, low-level functionality of our programs, and a couple of guidelines should be met to get the most use out of it.

Guideline one: unit tests should answer questions along the lines of "does function x do what it is supposed to do?" Once testing begins on an entire class, component, or the entire system, we are no longer in the realm of unit testing. Take, for example, some unit testing my partner and I performed early on while working on our translator project. A function called scan took in a character array and output a linked list of C structs representing the tokens in the array. In a short time, we wrote four small tests that passed varying simple input to scan and checked that the proper tokens were in the linked list. Had we written a test that included the file input phase, the scanning phase and the token parsing phase, we would have been testing at the level of integration testing or higher.
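
A unit test at this granularity might look like the sketch below. The scanner here is a hypothetical stand-in, since the real token set of the translator is not shown in this post: it only splits text on whitespace and labels each piece NUMBER or WORD. The names Token, TokenType and freeTokens are likewise assumptions made for illustration. What matters is the shape of the tests: each one feeds scan a small input and checks the resulting linked list, and nothing else.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical token kinds; the real translator's tokens are not shown.
enum TokenType { NUMBER, WORD };

// One node in the linked list of tokens that scan() returns.
struct Token {
    TokenType type;
    std::string lexeme;
    Token* next;
};

// Stand-in scanner: splits the text on whitespace and labels each piece
// NUMBER (all digits) or WORD (anything else).
// Returns the head of the list, or nullptr when there are no tokens.
Token* scan(const char* text) {
    Token* head = nullptr;
    Token** tail = &head;
    const char* p = text;
    while (*p != '\0') {
        if (std::isspace(static_cast<unsigned char>(*p))) { ++p; continue; }
        const char* start = p;
        bool allDigits = true;
        while (*p != '\0' && !std::isspace(static_cast<unsigned char>(*p))) {
            if (!std::isdigit(static_cast<unsigned char>(*p))) allDigits = false;
            ++p;
        }
        *tail = new Token{allDigits ? NUMBER : WORD, std::string(start, p), nullptr};
        tail = &(*tail)->next;
    }
    return head;
}

// Releases every node of a token list.
void freeTokens(Token* head) {
    while (head != nullptr) {
        Token* next = head->next;
        delete head;
        head = next;
    }
}

// Each test asks one question about scan() and nothing else.
void testEmptyInputGivesNoTokens() {
    assert(scan("") == nullptr);
}

void testMixedInputGivesTokensInOrder() {
    Token* t = scan("x 42");
    assert(t != nullptr && t->type == WORD && t->lexeme == "x");
    assert(t->next != nullptr && t->next->type == NUMBER && t->next->lexeme == "42");
    assert(t->next->next == nullptr);
    freeTokens(t);
}
```

Neither test opens a file or parses anything, which is what keeps them unit tests rather than integration tests.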

Guideline two: test appropriately. If only average-case tests are used, or if tests are written long after the small section of code they cover, you are not taking full advantage of unit testing. The justification is that when you are designing or have just implemented a unit of code, you are more familiar with how it should behave. Using this knowledge, tests can be made to cover not only the intended use cases but also reasonable boundary cases and those that lead to errors. Coming back to a function a week or two after having designed or written it, you might not be able to come up with all of the special cases required to ensure the unit is fully functional. Let's look at a simple example of proper boundary coverage. Another function we used in the translator was one called readInput, which took a filename and output a character array holding the contents of that file. A few cases we tested for were passing in filenames of files that did and didn't exist, as well as empty filenames. Such tests ensure that the suite covers not only the intended case of reading input from an existing file but also the boundary cases where the function must behave correctly when it has no file to process.
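
The three cases above could be written roughly as follows. This readInput is a hypothetical stand-in with an assumed signature (it returns false on failure and fills a std::string rather than a character array), and the temporary filenames are made up for the example; the real function in the translator may report errors differently.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical stand-in for readInput: returns true and fills `out` with the
// file's contents, or returns false (leaving `out` untouched) when the
// filename is empty or the file cannot be opened.
bool readInput(const std::string& filename, std::string& out) {
    if (filename.empty()) return false;
    std::ifstream in(filename.c_str(), std::ios::binary);
    if (!in) return false;
    out.assign(std::istreambuf_iterator<char>(in), std::istreambuf_iterator<char>());
    return true;
}

// Intended case: the file exists and its contents come back unchanged.
void testReadsExistingFile() {
    const char* name = "readinput_sample.tmp";  // assumed scratch filename
    {
        std::ofstream f(name, std::ios::binary);
        f << "let x = 1";
    }
    std::string text;
    bool ok = readInput(name, text);
    std::remove(name);
    assert(ok);
    assert(text == "let x = 1");
}

// Boundary case: the file does not exist.
void testMissingFileIsReported() {
    std::string text;
    bool ok = readInput("readinput_missing.tmp", text);
    assert(!ok);
    assert(text.empty());
}

// Boundary case: there is no filename at all.
void testEmptyFilenameIsReported() {
    std::string text;
    bool ok = readInput("", text);
    assert(!ok);
    assert(text.empty());
}
```

Writing the two failure cases right after the function itself is the point of the guideline: that is when it is still obvious what "no file to process" should mean.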

But how does one do unit tests? Hard code a bunch of print statements and eyeball them to make sure they all output what you wanted? Going down that route removes any way of automatically checking for success and can hinder development when output code must constantly be commented out and back in. Instead, use a testing framework. With a testing framework, you write the tests separately from the program and update the tests only when need be. Then, whenever a change is made to the program, just recompile the tests via an automated build and run them. In a short time, you should have a good idea of the progress made thus far or an early warning of what to fix. My partner and I use CxxTest for our translator project. For each suite of tests we write, we start by creating a header file for the unit or group of units to test. Within this header file, we make a subclass of a CxxTest class called TestSuite and write a method prefixed with "test", body included, for each test we want performed. CxxTest includes many assertion macros meant to be used in the body of these methods to verify correct execution, such as TS_ASSERT(expr). Next, we call a Makefile target that runs a Perl script included with the framework to generate a .cpp file for our test suite and then compiles that .cpp file. We can then run the executable to see how the pieces of our project are looking and can quickly rebuild the test executables when our code changes. Whenever we run a test and an assertion fails, the exact line of the test header file is output to tell us what went wrong, as well as the number of failed tests out of the total tests from that same file. Otherwise, when all tests pass, the test executable tells us "OK."
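
The difference between eyeballing print statements and letting the program judge the result can be sketched without any framework. The code below is not CxxTest. It is a minimal hand-rolled runner, with a hypothetical add function as the unit under test and invented names (TestCase, Report, runAll), that only illustrates the idea of named checks, a count of failures, and an "OK" when everything passes. In CxxTest those roles are played by the TestSuite subclass, the methods prefixed with "test", and macros such as TS_ASSERT(expr).

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical unit under test.
int add(int a, int b) { return a + b; }

// A named check that returns true when the unit behaved as expected.
struct TestCase {
    std::string name;
    bool (*run)();
};

bool testAddPositive() { return add(2, 3) == 5; }
bool testAddNegative() { return add(-2, -3) == -5; }

// Outcome of a run: how many tests failed, plus the text a runner would print.
struct Report {
    int failed;
    std::string text;
};

// Runs every test, names each failure, and reports "OK" only if all passed.
Report runAll(const std::vector<TestCase>& tests) {
    std::ostringstream out;
    int failed = 0;
    for (const TestCase& t : tests) {
        assert(t.run != nullptr);  // every entry must carry a test function
        if (!t.run()) {
            ++failed;
            out << "FAILED: " << t.name << "\n";
        }
    }
    if (failed == 0) {
        out << "OK\n";
    } else {
        out << failed << " of " << tests.size() << " tests failed\n";
    }
    return Report{failed, out.str()};
}
```

Because the verdict is computed rather than read off the screen, the same run can be repeated after every change without touching the program's own output code.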

As you can see, unit testing is intended to be simple and to steady your development cycle. Stay true to testing small sections of code that you have written, test that code in a timely and wide-ranging fashion, and you'll be coding on top of those sections with confidence in their functionality. Testing frameworks, such as CxxTest, allow you to create tests easily and keep them separate from your actual code. Using these frameworks makes testing fast as well as easy to automate. No one's code is perfect at first, but through unit testing you can at least avoid building on top of buggy code.

Get Control


Writing code can be a difficult process, and figuring out how to handle all of that code can be difficult as well. Luckily, there is a solution to the latter in the form of source code control, or version control. Different source control systems exist, but one popular system I've used is called Subversion. Two of the best benefits of source control are version record keeping and the sharing and merging of code. The former lets you travel back through the development process to a stable point before problems arose, and it records descriptions of the changes made to controlled code over time. The latter lets teams work together on projects easily by distributing and updating code for each team member in a simple manner. Opponents of source control may say that it can lead to more headaches than benefits, but these headaches aren't always the fault of the source control system. Before we go into the benefits of and opposition to source control, let's see exactly what the system does and how to use it.

Source code control works by setting up a server with a repository where team members can share their code and other related materials with each other. In addition to having the files on the server, team members have a local copy of the repository to work with on their own computers. Each team member can do various tasks with the repository server, including retrieving new files, updating local copies of files to match what is new in the repository, pushing updated or new files onto the server for others to retrieve, and viewing the history of a file in the repository. Subversion users can execute these tasks through simple commands in a terminal. They can "checkout" a repository when first putting a project under source control to initialize their workspace, or local copy of the repository. From there, Subversion users can "update" within a directory under version control in their workspace to obtain updated files from the server. Users can also "commit" changes they have made to any files under source control in their workspace to the server, which increments the revision number used to represent the version of the repository. (Users must first "add" any new files and directories to Subversion before they can be committed to the server.) But of course, why have version control if you can't determine the differences between various versions of files? Users can ask the repository server to display a "log" of the revisions of an individual file, a directory, or even the whole project. Each log entry includes the revision number, who committed the changes, the date and time of the commit, and any comments written by the team member who made the changes. Users can even invoke a "diff" to determine what changed between different revisions of a file, or "revert" a file in their working copy to a different version that had been previously committed. Any of these useful tasks can be performed with just one command in a terminal, which makes Subversion easy to learn. With all of these commands available, it's no wonder that Subversion and other source control systems offer great benefits.

One benefit of source control systems is their time traveling quality. There are two reasons for wanting this benefit. The first is that people make mistakes, and we computer scientists are just as vulnerable to this as anyone else. The second is that requirements change. Take, for example, a situation that happened to me recently while working on my current project for CS 3081, where both of these reasons came into play. My partner and I had implemented a certain data structure before it became a requirement for our project. Unfortunately, the way we had originally implemented this structure was different from what our new requirements wanted. Having made many changes to our code to conform to the requirements, and having run out of time to work in the computer lab during discussion, we committed our work from one machine onto the server to continue later. However, when testing our new code a short while later, we discovered a few fatal errors. By using the "log" and "revert" commands of Subversion, we were able to quickly restore our files to an earlier revision and begin our work again on a different path that led us to success. (A note to new users of Subversion: use descriptive comments when committing changes so that the log kept by the server is helpful in showing you which revision is the best one to revert to.) Of course, this record keeping is not the only benefit that users of source control experience.

Another great benefit of source control is the ease of collaborating with a team, thanks to easy sharing and merging of code. As I've already mentioned, retrieving updates from the repository server is just a one-line command, and so is committing changes to it. In the past, I've used some methods of sharing files that look horrid in comparison, such as emailing new files to a partner. With Subversion, I simply commit changes onto the server and let my partner know to update his copy of the files when he gets a chance. But what if multiple team members work on the same file and all commit those changes to the server? That is no problem and should even be encouraged! Pulling from my own experience during the first iteration of the CS 3081 project, I'll show how useful Subversion can be in this case. While developing a large function, my partner and I split the work. We worked from the same revision of the file and each committed our changes. Subversion caught on to what we had changed and merged our changes together into a single file, even though what each of us committed looked different from the other's. Now, instead of us simply adding lines to a file, what if we both changed the same line of code? Well, this actually happened once, and Subversion was not able to merge the files on its own. When the second person went to commit their change, Subversion pointed out our conflicting file contents and gave us several options for resolving the conflict. Once we chose the resolution we needed, Subversion completed the merge and the result was committed to the server's repository. Opponents of source control may cite the problem we experienced as a reason for not using such a system.
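
Why different lines merge cleanly while the same line conflicts can be illustrated with a deliberately simplified three-way merge. This is not Subversion's actual algorithm: the sketch assumes all three versions have the same number of lines (so it handles edits to existing lines but not insertions or deletions), and the names MergeResult and mergeLines are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Result of a merge: the combined text, plus the indices of any lines that
// both sides changed in different ways.
struct MergeResult {
    std::vector<std::string> lines;      // base line is kept where a conflict occurred
    std::vector<std::size_t> conflicts;  // zero-based indices of conflicting lines
};

// Simplified line-by-line three-way merge.
// `base` is the revision both people started from; `mine` and `theirs` are
// the two edited copies. Precondition: all three have the same line count.
MergeResult mergeLines(const std::vector<std::string>& base,
                       const std::vector<std::string>& mine,
                       const std::vector<std::string>& theirs) {
    assert(base.size() == mine.size() && base.size() == theirs.size());
    MergeResult result;
    for (std::size_t i = 0; i < base.size(); ++i) {
        if (mine[i] == theirs[i]) {
            result.lines.push_back(mine[i]);    // untouched, or same change on both sides
        } else if (mine[i] == base[i]) {
            result.lines.push_back(theirs[i]);  // only the other person changed it
        } else if (theirs[i] == base[i]) {
            result.lines.push_back(mine[i]);    // only I changed it
        } else {
            result.lines.push_back(base[i]);    // both changed it differently
            result.conflicts.push_back(i);
        }
    }
    return result;
}
```

Knowing the common starting revision is what lets the tool tell "only one of us touched this line" apart from "we both did", and only the second case needs a human decision.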

Due to the extensive automation performed by source control systems and the learning curve involved in using them properly, opponents of such systems feel they are a bad solution for managing code. I'll admit that early on in using Subversion, I didn't trust its merging automation, and I even used the merge conflict mentioned before as backing for that belief. I had blamed Subversion for the inability to merge, and it wasn't until after examining the cause of the conflict that I realized the problem was human error, not source control error. My partner and I realized that our miscommunication about what each of us was to edit was the root of the conflict. In retrospect, I'm glad Subversion discovered this, because it exposed a flaw in our development process. In addition, to use a source control system successfully, the user must learn many aspects of the system, including what revisions represent, how to first get started with the system, how to add and commit files, how to update files, and how to get in the habit of keeping their workspace up to date. Learning all of these aspects is not difficult if you use an easy version control system like Subversion, but I will agree that forming habits of proper workspace maintenance can be difficult. However, these problems lie with the user, not the source control system. Someone who can't get into the habit of committing new code to the repository likely isn't in the habit of manually copying the files they have worked on and distributing them to team members whenever they make changes, either. Once it is clear that the human element should not be considered a strike against source control systems, there is little reason left not to use them.

As I've shown, source code control systems, such as Subversion, make the lives of programmers easier. These systems allow teams to track changes during development and to turn back the clock when problems arise, whether due to mistakes or to changes in requirements. In addition, version control eases the distribution of code for a team by providing a single server to commit changes to and pull updates from, as well as by intelligently merging changes to files. Some programmers dislike source control systems, but a closer look at their reasons shows that the human factor is often to blame, not the system. I'd highly recommend the use of source control systems to any serious programmer.

Two Heads Are Better Than One


The first iteration of our project, called Where's The Forest, has been completed and development is going well. During iteration one, we were tasked with creating a scanner for a pretend programming language and printing out a list of the tokens found within a file written in that language. If iteration one of WTF has taught me anything, it is that working in a small team makes development life easier. Being the start of the project, iteration one had some repetitive code that was easier to deal with given an extra set of fingers, which freed up time for developing the meat and potatoes of the iteration. On top of that, the different knowledge and experiences that my partner and I have combined to make code that will lay a solid foundation for the iterations to come, including methods we might not have implemented if we were in a single-programmer environment. We did, however, run into a small challenge with code sharing that isn't a problem for single-programmer projects. This challenge is easily outweighed by benefits of teamwork such as the division of boring labor.

One benefit that arose from working in a group of two people was the reduction in tedious tasks. During our implementation of iteration one, we created over 30 different regular expressions to match the different possible strings within the file to translate, and each of these regular expressions took four lines of code to initialize. Dealing with this single part of the code required over 120 lines of repetitive code that took little to no thinking once the regex pattern for each was created. (I would like to quickly point out to any critics who might say "Loops are perfect for repetitive code" that each of these regexes was being stored in an individual variable for possible later use. With that design, using a loop was not an option.) Having to write this over and over again (or even a continuous copy, paste and edit) takes away from the enjoyment of creating a good algorithm and can annoy the programmer, especially when working solo. Luckily, since I was working with a partner, we were able to divide up the code that needed to be written so that development could quickly proceed past that section. To a seasoned programmer, 120 lines might sound like a small gripe, but what about when the code gets to be 200 lines, or how about 300? I would gladly take the help writing such code to move on to better things. In the future, I hope we will be able to better plan for when such sections of code will arise, especially if that code will be larger than what we had to write during iteration one. With the number of different types of tokens required for the language in WTF, I expect more repetitive code will show up soon. Getting these lackluster sections of code completed will bring us faster to the pieces of development that will require both of us to draw on our respective experiences.
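
For comparison, here is one possible alternative design, offered only as a sketch and not as what the project did: if each regex is named by an enum value instead of living in its own variable, the patterns can sit in a table and be compiled in a loop while every regex stays individually addressable. The token names and patterns are made up, and the example uses C++'s std::regex rather than whatever regex library the project used.

```cpp
#include <array>
#include <cassert>
#include <regex>
#include <string>

// Hypothetical token kinds; each enum value doubles as the regex's "name".
enum TokenKind { INT_CONST, IDENTIFIER, WHITESPACE, TOKEN_KIND_COUNT };

// One pattern per token kind, listed in enum order (patterns are illustrative).
const char* const kPatterns[TOKEN_KIND_COUNT] = {
    "[0-9]+",                  // INT_CONST
    "[A-Za-z_][A-Za-z0-9_]*",  // IDENTIFIER
    "[ \t\n]+",                // WHITESPACE
};

// Compiles every pattern in a single loop; adding a token kind means adding
// one enum value and one table row, not another block of setup code.
std::array<std::regex, TOKEN_KIND_COUNT> buildRegexes() {
    std::array<std::regex, TOKEN_KIND_COUNT> regexes;
    for (int i = 0; i < TOKEN_KIND_COUNT; ++i) {
        regexes[i] = std::regex(kPatterns[i]);
    }
    return regexes;
}

// True when the whole of `text` matches the given regex.
bool matchesWhole(const std::regex& re, const std::string& text) {
    return std::regex_match(text, re);
}
```

A lookup such as regexes[IDENTIFIER] then reads much like a named variable would. Whether this fits depends on how the regexes are used later, which this post does not spell out.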

Another benefit of working in a team that we came across was that diversity of knowledge can help overcome problems. My partner and I have not taken exactly the same classes, nor have we had the same programming experience. I've had an internship as a software engineer where I tackled real-world problems, and he is further along in the Computer Science curriculum than I am. Our differences bring a variety of views to our project. Two good examples come to mind. First, I have little experience with C and have only begun using it this semester, but he used it during the spring. Because of this, he was able to see opportunities to construct structs that were helpful in keeping track of data. If I had been working alone, I would have wasted time trying to figure out what I can do in C rather than actually getting further in the iteration. Second, with my experience on large projects, I was able to modularize our code to make it easier to read and more maintainable. If my partner had not decided to develop iteration one that way when working alone, he would have run the risk of developing code that would be a hassle to change come later iterations of WTF. Then again, had he been working alone, he would have saved the time spent worrying about sharing code with a partner.

Yet even with benefits such as these, some people still find that team projects can be more of a hassle than working alone because of the sharing of code. During iteration one, we ran into this challenge when completing the repetitive code mentioned earlier. When we had both completed our portions of the repetitive code and tried merging them together, our version control system, Subversion, gave us an error message. As it turned out, I had changed a single line of code before beginning on the repetitive code, while he had used a text editor to do a find/replace that accidentally replaced a word in that same line. Obviously, the version control system had no way of properly merging these copies of the file. Luckily for us, Subversion was able to show us exactly where the problem was, and we were able to get it resolved quickly. As long as you use a version control system that offers that capability, keep good track of your changes, and communicate those changes before sharing them with your team, few problems of this nature should ever arise.

With the second iteration soon to come, I can only be thankful for working in a team. My partner and I will plan our second iteration better to take advantage of dividing up code that is repetitive or that can take away from the enjoyment of developing WTF. We will also be sure to seek out any opportunity to use the distinct skill sets we have gained to develop WTF better than either of us could separately. Unfortunately, we will have to be wary of sharing code that both of us have edited concurrently. However, if we keep each other better informed of changes and track them better, these challenges will rarely show up. Working in a group is a great advantage for programmers, and working out the kinks now will be a benefit for any project to come.

First Post - Looking ahead at class topics


One of the topics for 3081 that I feel I'm capable of is designing a useful class hierarchy even though I only first needed one when I had an internship this past summer. The project I worked on was extensive, and required me to find, parse, and store information from XML files into a database on a backend application, then display that information on a website in a frontend application. To keep the project manageable, I had to create classes that were related to each other in a meaningful way and were used for a specific purpose. For example, on the web side of the project, I used objects from an API I was using (Wicket for those of you interested) to create abstract classes geared towards my application, then created subclasses of those. This not only allowed me to create additional methods and fields for objects that could be commonly inherited, but also enabled me to better tailor an object to a specific task without rewriting a bunch of code from similar objects. For example, I created an abstract ProjectWebPage class that extended the framework's WebPage class that all web pages for the application would subclass. The abstract class contained methods that would allow for greater and common functionality among web pages, such as getting access to a database. I feel that having used the core OOP tool of inheritance is one such way of showing my ability to create a useful class hierarchy.
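
The shape of that hierarchy can be sketched as follows. The original was Java built on Wicket; this is only a loose C++ analogue in which WebPage stands in for the framework class, and Database, ReportPage and the render method are hypothetical names invented for the example.

```cpp
#include <cassert>
#include <string>

// Stand-in for the framework's page class (Wicket's WebPage in the post).
class WebPage {
public:
    virtual ~WebPage() {}
    virtual std::string render() const = 0;
};

// Hypothetical database handle shared by every page in the application.
class Database {
public:
    std::string lookup(const std::string& key) const { return "value-for-" + key; }
};

// Abstract application-level base class: functionality common to all of the
// application's pages, such as database access, lives here exactly once.
class ProjectWebPage : public WebPage {
public:
    explicit ProjectWebPage(const Database& db) : db_(db) {}

protected:
    const Database& database() const { return db_; }

private:
    const Database& db_;
};

// A concrete page inherits the shared behavior and adds only what is
// specific to its own task.
class ReportPage : public ProjectWebPage {
public:
    explicit ReportPage(const Database& db) : ProjectWebPage(db) {}

    std::string render() const override {
        return "<p>" + database().lookup("report") + "</p>";
    }
};
```

Each additional page then needs only its own render logic, since database access comes from the shared base class.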

One topic I look forward to is writing good tests. As it stands, I use very simple methods of testing code such as outputting the state of variables or the return of a function within a main method or the like. However, constantly adding and removing lines for outputting variables can be time consuming when working with many of them or when working across several different pieces of code, such as different classes. Furthermore, the way I go about writing tests can be very time consuming because I will often have to change a line of code to call a method with a different value and repeatedly do that for the range of appropriate values. I hope to learn from this topic how to write more thorough tests and possibly how to automate them.

Another topic of the course I will appreciate going over is writing effective comments, getting into the habit of writing them DURING coding, and maintaining them efficiently. Yes, the topic itself could very possibly be a bore, but the real-world advantages of comments are significant. As it stands, I try my best to actively comment sections of code after writing them. This was not the case only several months ago. During my internship, I would avoid doing any proper documentation because, to be honest, it can feel like a waste of time, especially when you are used to small, basic projects that you and you alone work on. But after a couple of months of very minimal documentation (having inserted comments mainly for my own benefit, to understand why I had to use an API a certain way and what it accomplished), I looked at my 70+ Java classes and thought that whoever inherits this tool after I leave will be in a world of pain without documentation. Then I had trouble trying to decide what needed commenting and what didn't, and how best to describe the algorithm used without writing a novel. More pain came about when requirements changed after I did the massive documentation, which affected the algorithms and classes used and, in turn, their respective comments. It is because of this painful experience that the topic of effective comments appeals to me. Being able to clearly communicate an algorithm through a comment to someone you can't talk to directly is a very important skill for any computer scientist.
