Sunday, December 11, 2011

Extending an open-source software system

In this blog entry I describe my experience extending the Hale-aloha-cli-grads open-source software system.

Introduction and Background

In my last post I reviewed and evaluated the open-source software system named Hale-aloha-cli-grads in the context of the three prime directives of open-source software development, issue-driven project management, and continuous integration. In this post I will relate my experience adding functionality to this system and collaborating with my partner, Zach Tomaszewski.

Here I re-introduce some background information to set the context for this blog post.

Hale-aloha-cli-grads
The Hale-aloha-cli-grads system under review is an open-source project that implements a command line interface (CLI) for WattDepot, created by Andrea Connell, Leo DeCandia, and Sergey Negrashov. It retrieves and displays energy and power data for the four Hale Aloha residential towers at the University of Hawaii Manoa campus.

Hale-aloha-cli-jz
The Hale-aloha-cli-jz system is an open-source project that also implements a command line interface (CLI) for WattDepot, created by my partner Zach Tomaszewski and me. It implements the same specifications as Hale-aloha-cli-grads, and for this endeavor each group is extending the other's system.

WattDepot
The wattdepot open-source project, developed at the University of Hawaii, is a collection of technologies and tools for collecting and storing data from electricity meters for smart grid research and experimentation.


The 3 Prime Directives of open-source Software Development
To refresh your memory, and for the purposes of relating my experience, the three prime directives of open-source software development are summarized as follows:

  1. Does the system accomplish a useful task?
  2. Can an external user successfully install and use the system?
  3. Can an external developer successfully understand and enhance the system?
By extending Hale-aloha-cli-grads, I hope to determine more accurately whether or not it satisfies prime directive 3. In my last post I downloaded, installed, and reviewed the code. Actually extending the code should now provide a better-grounded answer.

Issue-Driven Project Management and Continuous Integration
Please see my previous blog entry to gain an understanding of Issue-Driven Project Management and Continuous Integration.

Adding Functionality to the System with 3 New Commands
I will relate important aspects of our experience extending an open-source software system: the build process, developer contribution, documentation, testing, and software quality.  
My partner and I added the new functionality outlined below. In all commands below,  [tower | lounge] defines a source.


Commands and command line argument specification

set-baseline [tower | lounge] [date]

  • This command uses date as the "baseline" day for the source.  
  • The date is optional, in YYYY-MM-DD format, and defaults to yesterday. 
  • When run, the system obtains and saves the amount of energy the source used during each of the 24 hourly periods of that day, defining a baseline for that source.

monitor-power [tower | lounge] [interval]

  • This command prints out a timestamp and the current power for the source every interval seconds.  
  • Interval is optional and defaults to 10 seconds. 
  • Entering a carriage return stops the monitoring process and returns the user to the CLI command loop.  

monitor-goal [tower | lounge] goal [interval]

  • When run, this command prints out a timestamp, the current power being consumed by the source, and whether or not the source is meeting its power conservation goal.  
  • Goal is an integer between 0 and 99 that defaults to 5.  It defines the percentage reduction from the baseline for this source at this point in time.
  • Interval is optional and defaults to 10 seconds, and defines how often to retrieve energy data and display it to the user.
  • If the monitor-goal command is invoked without a prior set-baseline command for that source, it is an error.
  • Entering a carriage return stops the monitoring process and returns the user to the CLI command loop.
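
The monitoring behavior described above (print a reading every interval seconds until the user presses Enter) can be sketched roughly as follows. This is a minimal illustration, not code from either project: the class MonitorLoop, its monitor method, and the Supplier that stands in for the WattDepot query are hypothetical names, and the sketch assumes Java 8 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.time.LocalTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical sketch; names are illustrative, not from the project sources. */
final class MonitorLoop {

  private MonitorLoop() { }

  /**
   * Prints reading.get() once per interval until a newline (or end of input)
   * is read from in, then stops the timer and returns to the caller.
   */
  static void monitor(Supplier<String> reading, int intervalSeconds,
      InputStream in, PrintStream out) {
    ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
    try {
      // Scheduled with no initial delay, then once per interval.
      timer.scheduleAtFixedRate(() -> out.println(reading.get()),
          0, intervalSeconds, TimeUnit.SECONDS);
      int c;
      do {
        c = in.read(); // blocks until the user types something
      } while (c != '\n' && c != -1);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } finally {
      timer.shutdownNow(); // stop printing; control goes back to the command loop
    }
  }

  public static void main(String[] args) {
    // Interval is optional and defaults to 10 seconds, as in the specification.
    int interval = args.length > 0 ? Integer.parseInt(args[0]) : 10;
    // Placeholder reading; a real command would query WattDepot here.
    monitor(() -> LocalTime.now().withNano(0) + "  (power reading goes here)",
        interval, System.in, System.out);
  }
}
```

Passing the input and output streams as parameters is a design choice made for this sketch so that the loop can be exercised in a JUnit test without a real console.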
Some example runs and output for new commands:

set-baseline Ilima

Setting baseline values for Ilima on 2011-12-11
hour  energy
0     32.2 kWh
1     30.2 kWh
2     27.0 kWh
3     25.3 kWh
4     24.2 kWh
5     23.6 kWh
6     22.7 kWh
7     20.4 kWh
8     20.2 kWh
9     20.8 kWh
10    21.6 kWh
11    22.2 kWh
12    22.8 kWh
13    24.3 kWh
14    25.5 kWh
15    26.1 kWh
16    26.7 kWh
17    29.3 kWh
18    31.1 kWh
19    33.0 kWh
20    34.8 kWh
21    36.4 kWh
22    36.6 kWh
23    36.2 kWh

monitor-power Mokihana 4


Querying Mokihana for current power every 4 seconds. 
(Press Enter to stop.)


Most Recent Data     Power Consumption (kW) 
 16:51:20                 25.21
 16:51:20                 25.21
 16:51:20                 25.21
 16:51:20                 25.21
 16:51:20                 25.21
 16:51:20                 25.21

monitor-goal Ilima 5 3


Querying Ilima for current power every 3 seconds. 
(Press Enter to stop.)


Timestamp  | Power (kW) | Baseline (kW) | Base Hour  | Goal (kW)  | Goal met? 
 16:47:12     27.77        26.75           16-17        25.41        no        
 16:47:12     27.77        26.75           16-17        25.41        no        
 16:47:12     27.77        26.75           16-17        25.41        no        
 16:47:12     27.77        26.75           16-17        25.41        no 
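
The Goal (kW) column in the run above follows from the baseline and the goal percentage: 26.75 kW reduced by 5% is about 25.41 kW, and since the current power of 27.77 kW is above that, the goal is not met. A minimal sketch of that arithmetic is shown below. GoalCheck and its methods are hypothetical names rather than the classes we actually wrote, and treating "exactly at the target" as meeting the goal is an assumption.

```java
import java.util.Locale;

/** Hypothetical helper; not taken from the project sources. */
final class GoalCheck {

  private GoalCheck() { }

  /** Target power in kW: the baseline reduced by goalPercent percent. */
  static double goalPower(double baselineKw, int goalPercent) {
    if (goalPercent < 0 || goalPercent > 99) {
      throw new IllegalArgumentException("goal must be between 0 and 99");
    }
    return baselineKw * (100 - goalPercent) / 100.0;
  }

  /** True when the current power is at or below the target (an assumption). */
  static boolean goalMet(double currentKw, double baselineKw, int goalPercent) {
    return currentKw <= goalPower(baselineKw, goalPercent);
  }

  public static void main(String[] args) {
    double baseline = 26.75; // baseline for hour 16-17 in the sample run
    double current = 27.77;  // current power in the sample run
    System.out.println(String.format(Locale.US, "Goal (kW): %.2f  Goal met? %s",
        goalPower(baseline, 5), goalMet(current, baseline, 5) ? "yes" : "no"));
  }
}
```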

Extending the System

Issue-Driven Project Management and Continuous Integration

Using Google Project Hosting Issues, Zach and I were able to plan and manage our efforts effectively, following the practices of issue-driven project management. We broke the work down into tasks and committed frequently. We worked on different tasks at the same time, different tasks at different times, and, to a lesser extent, similar tasks at the same time (for example, each of us contributing to the same Java class or JUnit test case).

As for continuous integration with Jenkins, we had two failed builds because I forgot to make sure the system passed the verify target before committing.

In addition to using Issues in Google Project Hosting, we made use of informal emails and phone conferencing to collaborate and take the place of face-to-face meetings. We had no physical meetings, yet easily accomplished our goals. There was no confusion about tasking and no commit conflicts.

Adding new Commands

Once we got a handle on the existing code, we felt it was inferior to our own implementation by a good margin. Our main complaints were (very) poor method naming, overly complex interaction both between and within methods, and not-so-great test cases.
Although we originally planned for me to implement SetBaseline and MonitorPower, it went the other way: Zach implemented MonitorPower and MonitorGoal, and I implemented SetBaseline.

Adding new Test Cases


The existing test cases printed a lot of output to the console, and many of them relied on "fake" objects. Since we added our own tests for the new commands, I did not spend much time analyzing the existing tests; I simply reviewed them.


Cleaning up the User Interface

The user interface did not meet the specifications outlined for the original project. In particular:

  • The help message is printed after every command
  • There is not enough white space and command output is difficult to read since it is mixed in with the help text
  • Help and quit are not displayed in the list of available commands
  • Pressing the return key causes an “Invalid Command” message
  • Every run of the system causes a “looking for commands” message and displays the commands found
  • There is no command line prompt, just a blinking cursor displayed on an empty line
Taken together, these behaviors made the system 'clumsy' to use. It was rather easy to locate these behaviors in the code and improve upon them. In my opinion, the running system is now easier to use: there is a command line prompt, it is easier to see the output of each command, and it more closely follows the principle of "least astonishment" for any user of the system. Now, help is displayed when you ask for it, the help text makes good use of white space, and the return key (alone) displays the system prompt again, instead of an error message.

Finishing up


I modified the existing wiki pages and added a new wiki page covering the new commands. We created a distribution using "ant -f dist.build.xml" and added it to the downloads page of the project's Google Project Hosting site. This zip file can be downloaded to evaluate prime directives 1 and 2. I made it clear in the wikis that there are two downloads, one for the previous version and one for the latest version.

Conclusion

Although in my last blog post I concluded from review and inspection that the system satisfies prime directive 3, extending it was more of a struggle than I anticipated. I learned that actually extending a system gives a more accurate assessment of whether it meets prime directive 3 than inspection alone. I was a lot more 'attached' to our own implementation (Hale-aloha-cli-jz); however, I was able to get past that feeling and get on with the task at hand.

Given our collaboration constraints, time constraints, and other commitments, I am happy with the outcome and quality of the software we developed in a relatively short time.

Wednesday, November 30, 2011

Technical Review and Evaluation of the Hale-aloha-cli-grads Open-Source Software Project


In this blog entry I describe my experience reviewing and evaluating the open-source software system named Hale-aloha-cli-grads.

Introduction and Background

To make both the system and the evaluation criteria clear, I will first summarize the background information needed to understand the issues examined in this review.

Hale-aloha-cli-grads



The Hale-aloha-cli-grads system under review is an open-source project that implements a command line interface (CLI) for WattDepot, created by Andrea Connell, Leo DeCandia, and Sergey Negrashov. It retrieves and displays energy and power data for the four Hale Aloha residential towers at the University of Hawaii Manoa campus.

WattDepot

The wattdepot open-source project developed at the University of Hawaii is a collection of technologies and tools for collecting and storing data from electricity meters for smart grid research and experimentation. From the wattdepot Google hosting site: “WattDepot is an open source, RESTful web service that collects electricity data (such as current power utilization or cumulative power utilization) from meters and stores it in a database. The data can then be retrieved by other tools for visualization and analysis. It is designed to provide infrastructure for experimentation and development of "Smart Grid" applications.”




The 3 Prime Directives of open-source Software Development


For the purposes of this evaluation, the three prime directives of open-source software development are summarized as follows:

  1. Does the system accomplish a useful task? 
  2. Can an external user successfully install and use the system?
  3. Can an external developer successfully understand and enhance the system? 

Issue-Driven Project Management


The basics of Issue Driven Project Management are summarized as follows:
  • Divides work into tasks
  • Tasks take no longer than 2 days
  • Tasks are specified by issues
  • Every person always has an open task
  • Every commit specifies an issue in its log comment 
  • Each person has a task
  • Each person knows what it is
  • Each person knows what the next task is
  • Each person is rarely blocked
  • Completing a task brings the project closer to completion
  • Project state is visible at all times to all team members
  • Breakdowns in PM are recognized by everyone, quickly

Continuous Integration



The basic concepts of Continuous Integration are:
  • The system must be able to be built and tested automatically
  • The system must be under configuration management (CM)
  • Everyone commits every day or two
  • Upon commit, the system is immediately and automatically compiled and tested, and developers are notified

Review and Evaluation



My review and evaluation will cover important aspects of open-source software development, as well as answer detailed questions about the system, including its build process, developer contribution, documentation, testing, QA, process, software quality, extensibility, and integration. These elements will be grouped into three main sections that correspond to the three prime directives of open-source software development.


Prime Directive 1: Does the system accomplish a useful task? 

Following the directions which were 'spread out'  among the home page, installation guide, and user guide, I downloaded, installed, and ran the system under Windows 7. 
I exercised the system using various commands and command line arguments.

The functionality that is present in the system corresponds to what is documented on the project home page, and the messages displayed for bad commands and command arguments are informative.

I was surprised to see the entire help menu printed after every command, and the help menu does not specify the valid tower and lounge names. 
The help command does not show that 'help' and 'quit' are available commands. 
Also, there is no command line prompt, just a blinking cursor, and as mentioned above, I don't think 3 pages are necessary to get the information required to download, install and run the system.


However, I conclude that prime directive #1 is satisfied because a useful task is performed and the system basically behaved as outlined in the documentation. There were no difficulties in downloading, installing, and running the system, and it displayed expected output based on input parameters.


Prime Directive 2: Can an external user successfully install and use the system?

I carefully reviewed the project site, including the home page, user guide wiki, and downloads page.

The home page provides a clear understanding of what the system is supposed to accomplish, and is accompanied by a list of user commands recognized by the system. The user guide has a screenshot of the program in action, showing both sample input and output. The user guide also explains that both physical and virtual ‘sources’ are supported. However, it does not indicate what version of the Java JRE is required to run the system.

The user guide links to an installation page that provides clear details on how to download, install, and execute the system from the perspective of a user, but does not include a link to the downloadable zip file. The downloadable zip file is stamped with a version number, and the zip contains an executable jar file. I did not have to compile and build the system in order to use it. I exercised the system under both valid and invalid inputs as documented below.

Valid commands and command line arguments:

energy-since Ilima 2011-11-28
Total energy consumption by Ilima from 2011-11-28 00:00:00 to 2011-11-30 14:42:26 is: 1616.9 kWh

energy-since Mokihana 2011-11-28
Total energy consumption by Mokihana from 2011-11-28 00:00:00 to 2011-11-30 14:43:01 is: 1616.7 kWh

current-power Ilima
Ilima's power as of 2011-11-30 14:45:12 was 26.0kW

daily-energy Ilima 2011-11-28
Ilima's energy consumption for 2011-11-28 was: 611.7 kWh

daily-energy Ilima-A 2011-11-28
Ilima-A's energy consumption for 2011-11-28 was: 104.7 kWh

rank-towers  2011-11-28 2011-11-30
For the interval 2011-11-28 to 2011-11-30, energy consumption by tower was:
Ilima                            1255 kWh
Mokihana                         1261 kWh
Lehua                            1268 kWh
Lokelani                         1468 kWh

rank-towers  2011-11-24 2011-11-30
For the interval 2011-11-24 to 2011-11-30, energy consumption by tower was:
Lehua                            3364 kWh
Mokihana                         3379 kWh
Ilima                            3464 kWh
Lokelani                         3942 kWh

help
--Available commands are:
energy-since: [tower | lounge] [Start]
Returns the energy used since the date (yyyy-mm-dd) to now.
current-power [tower | lounge]
Returns the current power in kW for the associated tower or lounge.
daily-energy: [tower | lounge] [Date]
Returns the energy in kWh used by the tower or lounge for the specified date (yyyy-mm-dd).
rank-towers:  [start date] [end date]
Returns a list in sorted order from least to most energy consumed between the [start] and [end] date (yyyy-mm-dd)
--Enter a command (type 'quit' to exit):

quit
quitting...


Invalid commands and command line arguments:

"Just pressing the return key"
‘ ‘ is not a valid command

energy-since Ilima 2011-11
Bad xml 400: Range extends beyond sensor data, startTime 2011-11-01T00:00:00.000-10:00, endTime 2011-11-30T15:01:54.660-10:00:   Request: GET http://server.wattdepot.org:8190/wattdepot/sources/Ilima/energy/?startTime=2011-11-01T00:00:00.000-10:00&endTime=2011-11-30T15:01:54.660-10:00&samplingInterval=15

energy-since Mokulua
Not enough arguments.

current-power
Invalid arguments for current-power.


current-power Mokulua
Mokulua is not a valid source name.

daily-energy
Not enough arguments.

daily-energy Ilima 2011-12-99
Argument "2011-12-99" is invalid :Invalid value 99 for Day field.

daily-energy Mokulua
Not enough arguments.

rank-towers  2011-11-28
Invalid arguments for rank-towers.

rank-towers  2011-11-24 2011-11-99
Argument "2011-11-99" is invalid.

rank-towers  2011-11-30 2011-11-24
End date must be greater than start date.


exit
'exit' is not a command!


With a few exceptions, the system responded in a useful and helpful way to both valid and invalid commands and command line arguments. My conclusion is that the system satisfies prime directive 2: it provides clear, easy-to-understand instructions and documentation that is basically correct, and it runs as expected. Note that I have 'beautified' the output shown above. In reality, the help message is displayed after every command, valid or invalid, which makes it very difficult to pick out the command output or even the error messages among all the help text.
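
The date checks exercised above can be sketched in a few lines. This is a hedged illustration rather than the project's actual code: DateArgs, parseDate, and checkRange are hypothetical names, and the sketch uses java.time (Java 8 or later), which the 2011 code base could not have used. Under this approach, a partial date such as 2011-11 would be rejected up front instead of being passed through to the server.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

/** Hypothetical argument checks; not taken from the project sources. */
final class DateArgs {

  private DateArgs() { }

  /** Parses a strict yyyy-mm-dd argument, or throws with a readable message. */
  static LocalDate parseDate(String arg) {
    try {
      return LocalDate.parse(arg); // ISO format, for example 2011-11-28
    } catch (DateTimeParseException e) {
      throw new IllegalArgumentException("Argument \"" + arg + "\" is invalid.", e);
    }
  }

  /** Rejects ranges whose end date is not strictly after the start date. */
  static void checkRange(LocalDate start, LocalDate end) {
    if (!end.isAfter(start)) {
      throw new IllegalArgumentException("End date must be greater than start date.");
    }
  }

  public static void main(String[] args) {
    String[] samples = {"2011-11-28", "2011-12-99", "2011-11"};
    for (String sample : samples) {
      try {
        System.out.println(sample + " -> " + parseDate(sample));
      } catch (IllegalArgumentException e) {
        System.out.println(sample + " -> " + e.getMessage());
      }
    }
  }
}
```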


Prime Directive 3: Can an external developer successfully understand and enhance the system?

To answer this question, I began by carefully reviewing the developer's guide wiki. It states that coding standards and quality assurance standards should be followed for this project, and briefly describes tasks a new developer can use. However, it says to use an xml file format for development without noting that the xml file is just an Eclipse code format file. It says to use the Elements of Java Style without saying what it is.

The developer wiki also says to use Issue-Driven Project Management and includes a brief explanation, even though this is clearly optional, just as Eclipse is optional. It says to check the CI build and includes a link, indicating the process followed by the developers.  The developer wiki does not explain how to generate JavaDoc documentation, and although it provides clear instructions on how to verify the system, detailed instructions for obtaining and building the system from source are not explicitly provided.
I checked out the sources from SVN (read only) using Tortoise SVN, generated and reviewed all the JavaDoc pages. They are well-written and informative, providing a basic understanding of the system architecture. Additionally, the names of the system packages, classes, methods, and fields are well chosen, and indicate their underlying purpose.  

Building the system from sources without errors was simple: it took a single command, ant. However, verifying the system (ant -f verify.build.xml) failed about half the time while running the JUnit tests. The JUnit tests also printed a lot of text to the console, suggesting that the tests were probably being checked manually.
I generated coverage information for the system using Jacoco, and it reported 80%, 60%, and 0% for the three packages in the system, for a total of 69% coverage. The test cases leave 1,018 of 3,321 lines of code unexercised.

The test case source code should show how the developers are assuring the correctness of the functionality of the system.  Even though documentation is present, I found that in 50% of the test cases, it is difficult to understand exactly what functionality is being tested, and/or the methodology used. I believe more comments, or even better tests, would help. There are also spelling and grammar errors.

Although coverage is 69%, the command and processor test cases clearly do test the system under both valid and invalid input. I believe that in most cases, these existing test cases would prevent a new developer from making changes that break the existing code. Of course this depends on the level of programmer expertise and understanding of how to extend the current system.

A thorough examination of the source code shows that coding standards are followed and the code is commented appropriately.  The code is relatively easy to understand, and although the amount of commenting seems appropriate, some comments are clearly superfluous, and in other cases, insufficient. There are some classes such as FakeCommand and ReallyFakeCommand that seem to have been built for testing purposes. Not only are these the most confusing part of the system, they have the poorest documentation as well.

Examining the issues pages on Google Project Hosting clearly shows which parts of the system were built by each developer on the team, so it could be used to determine which developer would be the best person to ask about a particular component. It also reveals that, roughly speaking, an equal amount of work (7, 10, and 12 issues) from each of the developers went into the system, although issue counts do not account for the amount of work each issue required.

By inspecting the project page on the Jenkins CI server, it appears that any build failures were corrected promptly; the longest failure between successful builds was only 20 minutes.

From the Google Project Hosting source changes page, I can see that only 5 of about 50 commits were not associated with a specific issue; the total number of commits was 58, including edits to wiki pages. By examining the commit log messages, issues page, and CI server, I can observe that the system was built in a consistent, methodical way.

My conclusion is that the system does satisfy prime directive 3, and a new external developer could easily and successfully understand the current system in its current state, and successfully enhance the system in the future.

Sunday, November 20, 2011

Issue Based Project Management


In this blog entry I summarize an experience with issue-based project management, using and evaluating Google project hosting (GPH) and the Jenkins continuous integration server.

Google Project Hosting

For open source projects, Google Project Hosting gives anyone a free collaborative development environment. It offers fine-grained control and configuration, a choice of repository, an issue tracker, wiki pages, downloads, and more. I found it relatively simple and very reliable, and it reportedly scales well.

Jenkins

Jenkins is a leading open-source continuous integration server. It monitors executions of repeated jobs, such as building a software project; in our case, it runs an Ant script that verifies the system. The script compiles the code, builds the Javadoc, runs our JUnit tests, and then runs the QA tools Checkstyle, PMD, and FindBugs. Every time we commit, Jenkins checks that the system still builds by doing all of this automatically and reporting failures.

The Process

My partner and I worked in different locations, using the GPH issue management system to effectively create, maintain, and evolve tasks and issues for implementing and managing  this project. It is like a global “white-board”, or can be used that way - as it is very flexible and customizable. Each commit (I used Tortoise SVN for Windows) can be linked to issue(s) using syntactic metadata in the commit log message. There are so many useful features in GPH I cannot possibly cover them all here.
Figure 1. Google Project Hosting Issues Page hale-aloha-cli-jz


Our basic workflow for each issue/task was to apply/change labels at the appropriate time as: new (ownerless), accepted (owner), started (implementing), fixed (finished), and verified (other team member looks at code and agrees). Even non-coding tasks were easy to capture using GPH. It is so effective, we rarely had to email back and forth as you can imagine might be required for collaborating on a software development project. This process really worked well for us. From inception and specification, through implementation, and testing, and verification, it all went smoothly.

The Project

Wattdepot

The wattdepot open-source project developed at the University of Hawaii is a collection of technologies and tools for collecting and storing data from electricity meters for smart grid research and experimentation. From the wattdepot Google hosting site: “WattDepot is an open source, RESTful web service that collects electricity data (such as current power utilization or cumulative power utilization) from meters and stores it in a database. The data can then be retrieved by other tools for visualization and analysis. It is designed to provide infrastructure for experimentation and development of "Smart Grid" applications.”

Hale-aloha-cli-jz

Our project, hale-aloha-cli-jz, is a command line interface program written in the Java programming language that interacts with wattdepot. Using hale-aloha-cli-jz, anyone interested can explore various aspects of energy use without having to write code against the wattdepot API.

Functionality Implemented

The hale-aloha-cli-jz command line interface provides an interface to a subset of the wattdepot energy and power API.


Available Commands
  • current-power displays the current power in kW for the requested tower or lounge.
  • daily-energy displays the energy in kWh used by the requested tower or lounge for the specified date.
  • energy-since displays the energy used from date specified to now.
  • rank-towers displays a list in sorted order from least to most energy consumed between the start and end dates specified, for the four hale-aloha towers (buildings).
  • quit terminates the program.
  • help displays the available commands and command line options.
The Command Syntax


current-power [tower | lounge]
daily-energy [tower | lounge] [date]
energy-since [tower | lounge] [date]
rank-towers [start date] [end date]
quit
help



Overall quality of software  

I feel the quality of the code we developed is good, and in fact better than it would have been without Google Project Hosting and Jenkins. More importantly, I believe we developed the code faster, and with fewer conflicts and errors, than we would have without these project management tools.

Links to more information

Hale-aloha-cli-jz               http://code.google.com/p/hale-aloha-cli-jz/
Google Project Hosting          http://code.google.com/hosting/
Jenkins Continuous Integration  http://jenkins-ci.org
WattDepot                       http://code.google.com/p/wattdepot/
Ant                             http://ant.apache.org/
Checkstyle                      http://checkstyle.sourceforge.net/
Findbugs                        http://findbugs.sourceforge.net/
PMD                             http://pmd.sourceforge.net/
Tortoise SVN                    http://tortoisesvn.net/
Eclipse                         http://www.eclipse.org/
Jacoco Code coverage tool       http://www.eclemma.org/jacoco/