Documenting a Software System
Preface
A poorly documented system is not worth much, however well it once worked. For small, unimportant programs that are only used for a short period of time, a few comments in the code may be enough. But for most programs, if the only documentation is the code itself, the program rapidly becomes obsolete and unmaintainable. A surprise to most novices is that a small amount of effort on documentation is rewarded even within the confines of a small project.
Unless you are infallible and live in a world where nothing ever changes, you will find yourself returning to code that you have already written, and you'll question decisions you made earlier in the development. If you don't document your decisions, you'll find yourself repeating the same mistakes or puzzling out what you once could have easily described. Not only does lack of documentation create extra work, but it also tends to hurt the quality of the code. If you don't have a clear characterization of the problem, for example, you're unlikely to develop a clean solution.
Learning how to document software is hard, and it requires mature engineering judgment. Documenting too little is a common mistake, but the other extreme can be just as bad: If you document too much, the documentation will be overwhelming to a reader, and a burden to maintain. It's vital to document only the right things. The documentation is no help to anyone if its length discourages people from actually reading it.
Novices are often tempted to focus their efforts on the easy issues, since these are easier to document. But that's a waste of time; you don't learn anything from the effort, and you end up with documentation that is worse than useless. Novices also tend to be reluctant to document problems. This is short-sighted: if you know that some aspect of your design is not quite right, that some part of the problem has not been clarified, or that some code is likely to be buggy, then say so! You'll spare the reader time puzzling over something that appears to be wrong, you'll remember where to look yourself if you run into problems, and you'll end up with a more honest and useful document.
Another issue is when to document. Although it sometimes makes sense to postpone documentation while performing experiments, experienced developers tend to document systematically even temporary code, initial problem analyses, and draft designs. They find that this makes experimentation more productive. Furthermore, since they've established habits of documentation, it feels natural to document as they go along.
This handout gives you guidelines on how to document a software system. It gives an outline structure and some required elements, but it leaves much leeway in the details for your own judgment. It is crucial that you don't treat documentation as a dull, rote affair; if you do, your documentation will be useless, painful to read, and painful to write. So document consciously: ask yourself as you do it why you're doing it, and whether you're spending your time most effectively. Your document should have the structure given in this template. Rough sizes in pages are provided for a typical CS3217 project; these are guides, not requirements.
Requirements
The requirements section describes the problem being solved, as well as the solution. This section of the document is of interest to users as well as implementers; it should not contain details about the particular implementation strategy. Other parts of the system documentation will not be of interest to users, only to implementers, maintainers, and the like.
Overview (up to 1 page). An explanation of the purpose of the system and the functionality it provides.
Revised Specification. If you were given detailed specifications for the behavior of the system, you may find certain portions of the system under-specified or unclear. In this section you should make clear any assumptions you made about the meaning of the requirements, as well as any extensions or modifications you made to them.
User Manual (1 - 5 pages). A detailed description of how the user can use the system, what operations the user can perform, what the command line arguments are, etc. Detailed specifications of formats should be relegated to the Appendix. Any environmental assumptions should be made explicit here: for instance, note if the program only runs on certain platforms, assumes a certain directory hierarchy is present, assumes certain other applications are present, etc. Along with the overview, this manual should provide all the information needed by a user of the system.
Performance (0.5 page). What resources does the system require for normal operation, and what space and time can it be expected to consume?
You may find use cases helpful in writing the revised specification and/or the user manual. A use case is a specific goal and a list of the actions that a user performs in order to achieve the goal. Among other things, a client can examine the list of actions to decide whether the user interface is reasonable. If the collection of use cases covers all desired user goals, then the client can have some confidence that the system will fulfill its objective.
Design
The design section of your documentation gives a high-level picture of your implementation strategy.
Overview (0.5 - 3 pages). An overview of the design: top-level organization, particularly interesting design issues, use of libraries and other third party modules, and pointers to any aspects that are unsettled or likely to change. Also include problems with the design: decisions that may turn out to be wrong and tradeoffs between flexibility and performance that may turn out to be ill-judged.
Runtime Structure (1 - 5 pages). Representations of data types should be explained (along with their abstraction functions and rep invariants) if those representations are unusual, particularly complex, or crucial to the overall design. Note that abstraction functions and rep invariants should still appear in their natural place in the code itself.
Module Structure (1 - 5 pages). A description of the syntactic structure of the program text, expressed as a module dependency diagram (MDD). The diagram should include the package structure and should show Java interfaces as well as classes. It is not necessary to show dependences on Java API classes. Your MDD should be neatly laid out for readability. Explain why the particular syntactic structure was chosen (e.g., introduction of interfaces for decoupling: what they decouple and why), and how particular design patterns were used. To explain the decomposition and other design decisions, argue that they contribute to simplicity, extensibility (ease of adding new features), partitionability (different team members can work on different parts of the design without communicating constantly), or similar software engineering goals.
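As a small illustration of "introduction of interfaces for decoupling", the sketch below shows the kind of structure such an explanation might refer to. All names here (ReportSink, ReportGenerator, InMemorySink) are hypothetical and invented for this example; they are not part of any course project.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical interface: decouples report generation from the output destination. */
interface ReportSink {
    void write(String line);
}

/** Depends only on ReportSink, never on a concrete console, file, or GUI class. */
final class ReportGenerator {
    private final ReportSink sink;

    ReportGenerator(ReportSink sink) {
        this.sink = sink;
    }

    /** effects: writes one summary line for passed and one for failed to the sink */
    void summarize(int passed, int failed) {
        sink.write("passed: " + passed);
        sink.write("failed: " + failed);
    }
}

/** One implementation; a test can use it in place of a file- or GUI-backed sink. */
final class InMemorySink implements ReportSink {
    final List<String> lines = new ArrayList<>();

    @Override
    public void write(String line) {
        lines.add(line);
    }
}
```

In the accompanying text you would then state what is decoupled (report generation from output) and why (for instance, so the generator can be tested without touching the file system, and so a new output target can be added without editing ReportGenerator).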
Documenting Code
Specification-level Comments
Abstract data types. Every abstract data type (class or interface) should have:
An overview section that gives a one or two line explanation of what objects of the type represent and whether they are mutable.
A list of specification fields. There might be only one; for example, a set may have the field elems representing the set of elements. Each field should have a name, a type, and a short explanation. You may find it useful to define extra derived fields that make it easier to write the specifications of methods; for each of these, you should indicate that it is derived and say how it is obtained from the other fields. There may be specification invariants that constrain the possible values of the specification fields; if so, you should specify them.
Method Specifications. All public methods of classes should have specifications; tricky private methods should also be specified. Method specifications should follow the requires, modifies, throws, effects, returns structure described in the specifications handout and in class. Note that for CS3217, you may assume arguments are non-null unless otherwise specified.
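As an illustration of these conventions, here is a minimal sketch of a specified type. The class IntSet, its specification fields elems and size, and all of its methods are hypothetical and invented for this example. The clause layout is one plausible rendering of the requires, modifies, throws, effects, returns structure, not necessarily the exact format of the specifications handout.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.NoSuchElementException;

/**
 * Overview: an IntSet is a mutable, unbounded set of integers.
 *
 * Specification fields:
 *   elems : set of int  // the integers currently in the set
 * Derived specification fields:
 *   size : int          // derived: the number of elements in elems
 * Specification invariant:
 *   size >= 0
 */
final class IntSet {
    // Hypothetical representation; documenting it is an implementation-level
    // concern and is not part of the specification.
    private final List<Integer> items = new ArrayList<>();

    /**
     * requires: nothing
     * modifies: this
     * effects:  adds x to this.elems
     * returns:  true iff x was not in this.elems before the call
     */
    boolean add(int x) {
        if (items.contains(x)) {
            return false;
        }
        items.add(x);
        return true;
    }

    /**
     * requires: nothing
     * modifies: this
     * effects:  removes x from this.elems, if present
     * returns:  true iff x was in this.elems before the call
     */
    boolean remove(int x) {
        // Integer.valueOf selects remove(Object) rather than remove(int index).
        return items.remove(Integer.valueOf(x));
    }

    /**
     * requires: nothing
     * throws:   NoSuchElementException if this.elems is empty
     * returns:  the smallest element of this.elems
     */
    int min() {
        if (items.isEmpty()) {
            throw new NoSuchElementException("min of empty IntSet");
        }
        return Collections.min(items);
    }

    /** returns: this.size */
    int size() {
        return items.size();
    }
}
```

Note that every clause is phrased in terms of the specification fields (elems, size), never in terms of the private list; a client can read the specification without knowing how the set is stored.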
Implementation-level Comments
Implementation notes. Class comments should include the following elements (which, for notable classes, appear also in the Runtime Structure section of the design documentation):
An abstraction function that defines each specification field in terms of the representation fields. Abstraction functions are only required for classes which are abstract data types, and not for classes like exceptions or some GUI widgets.
A representation invariant. RIs are required for any class that has a representation (e.g., not most exceptions). We strongly recommend that you test invariants in a checkRep() method where feasible. Take care to include in your invariants assumptions about what can and cannot be null.
For classes with complex representations, a note explaining the choice of representation (also called the representation rationale): what tradeoffs were made and what alternatives were considered and rejected (and why).
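The three elements above might look as follows in a class comment. This is a minimal sketch under stated assumptions: the class CharSet, its representation, and the rationale text are hypothetical and invented for this example. The checkRep() shown here throws IllegalStateException so that the check runs whether or not Java assertions are enabled; that is one possible choice, not a course requirement.

```java
/**
 * Overview: a CharSet is a mutable set of characters.
 * Specification fields:
 *   elems : set of char  // the characters currently in the set
 */
final class CharSet {
    // Abstraction function:
    //   AF(r) = { r.chars.charAt(i) | 0 <= i < r.chars.length() }
    //
    // Representation invariant:
    //   chars != null, and
    //   chars.charAt(i) < chars.charAt(i + 1) for all 0 <= i < chars.length() - 1
    //   (strictly increasing, hence sorted and free of duplicates)
    //
    // Representation rationale (hypothetical):
    //   A sorted string makes the invariant easy to state and check, and
    //   makes size() a constant-time length lookup. A hash set would give
    //   faster insertion but was rejected here to keep the sketch short.
    private String chars = "";

    /** Checks the representation invariant; throws if it is violated. */
    private void checkRep() {
        if (chars == null) {
            throw new IllegalStateException("chars is null");
        }
        for (int i = 0; i + 1 < chars.length(); i++) {
            if (chars.charAt(i) >= chars.charAt(i + 1)) {
                throw new IllegalStateException("chars not strictly increasing at index " + i);
            }
        }
    }

    /**
     * modifies: this
     * effects:  adds c to this.elems
     */
    void insert(char c) {
        // Find the first position whose character is not smaller than c.
        int i = 0;
        while (i < chars.length() && chars.charAt(i) < c) {
            i++;
        }
        // Insert only if c is not already present at that position.
        if (i == chars.length() || chars.charAt(i) != c) {
            chars = chars.substring(0, i) + c + chars.substring(i);
        }
        checkRep();
    }

    /** returns: true iff c is in this.elems */
    boolean contains(char c) {
        return chars.indexOf(c) >= 0;
    }

    /** returns: the number of elements in this.elems */
    int size() {
        return chars.length();
    }
}
```

Calling checkRep() at the end of every mutator means a bug that breaks the invariant is reported at the point where it occurs, rather than much later in some unrelated method.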
Runtime assertions. These should be used judiciously, as explained in lecture. For a longer discussion of how runtime assertions can improve the quality of your code, see Writing Solid Code by Steve Maguire, Microsoft Press, 1995.
Comments. Your code should be commented carefully and tastefully. For an excellent discussion of commenting style and for much good advice in general about programming, see The Practice of Programming by Brian W. Kernighan and Rob Pike, Addison-Wesley, Inc., 1999.
Testing
The testing section of your documentation indicates the approach you have taken to verifying and validating your system. (For a real system, this might include user tests to determine the system's suitability as a solution to the problem described in the requirements section, as well as running test suites to verify the algorithmic correctness of the code.) Just as you should not convey the design of your system by presenting the code or even listing the classes, you should not merely list the tests performed. Rather, discuss how tests were selected, why they are sufficient, why a reader should believe that no important tests were omitted, and why the reader should believe that the system will really operate as desired when fielded.
Strategy (1 - 2 pages). An explanation of the overall strategy for testing: Black box and/or glass box, top down and/or bottom up, kinds of test beds or test drivers used, sources of test data, test suites, coverage metrics, compile-time checks vs. run-time assertions, reasoning about your code, etc. You might want to use different techniques (or combinations of techniques) in different parts of the program. In each case, justify your decisions. Explain what classes of errors you expect to find (and not to find!) with your strategy. Discuss what aspects of the design make it hard or easy to validate.
Test results (0.5 - 2 pages). Summary of what testing has been accomplished and what, if any, remains: Which modules have been tested, and how thoroughly? Indicate degree of confidence in the code: What kinds of fault have been eliminated? What kinds might remain?
Reflection
The reflection (more commonly called "post mortem") section of the document is where you can generalize from specific failures or successes to rules that you or others can use in future software development. What surprised you most? What do you wish you knew when you started? How could you have avoided problems that you encountered during development?
Evaluation (0.5 - 1 pages). What you regard as the successes and failures of the development: unresolved design problems, performance problems, etc. Identify which features of your design are the important ones. Point out design or implementation techniques that you are particularly proud of. Discuss what mistakes you made in your design, and the problems that they caused.
Lessons (0.2 - 1 pages). What lessons you learned from the experience: how you might do it differently a second time round, and how the faults of the design and implementation may be corrected. Describe factors that caused problems such as missed milestones, or that led to the known bugs and limitations.
Known Bugs and Limitations. In what ways does your implementation fall short of the specification? Be precise. Although you will lose points for bugs and missing features, you will receive partial credit for accurately identifying those errors, and the source of the problem.
Appendix
The appendix contains low-level details about the system that are not necessary in order to understand it at a high level, but are required in order to use it in practice or verify claims made elsewhere in the document.
Formats. A description of all formats assumed or guaranteed by the program: for file I/O, command line arguments, user dialogs, message formats for network communications, etc. These should be broken down into user-visible formats, which are conceptually part of the user-visible requirements and user manual, and internal formats that are conceptually part of other components of your documentation.
Module Specifications. You should extract the specifications from your code and present them separately here. The specification of an abstract type should include its overview, specification fields, and abstract invariants (specification constraints). The abstraction function and rep invariant are not part of a type's specification.
Test cases. Ideally, your testbed reads tests from a file of test cases in a format that is convenient to read and write. You need not include very large test cases; for example, you might just note the size of a random input generated for stress testing, and provide the program that generated the tests. Indicate for each group of tests what they are for (e.g., "stress tests, huge inputs", "partition tests, all combinations of +/-/0 for integer args").
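A testbed of this kind can be quite small. The sketch below is one possible shape, under stated assumptions: the class TestBed, the method run, and the line format ("a b expected", with '#' lines naming a group) are all hypothetical and invented for this example. It reads from a Reader so that the same code works on a file (via FileReader) or on an in-memory string.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntBinaryOperator;

final class TestBed {
    /**
     * Runs every test case read from in against op.
     *
     * Format (hypothetical): one case per line, "a b expected", separated by
     * whitespace. Blank lines are skipped. A line starting with '#' names the
     * group that the following cases belong to.
     *
     * returns: one message per failing case; empty if every case passes
     */
    static List<String> run(Reader in, IntBinaryOperator op) {
        List<String> failures = new ArrayList<>();
        String group = "(ungrouped)";
        try (BufferedReader reader = new BufferedReader(in)) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue;
                }
                if (line.startsWith("#")) {
                    group = line.substring(1).trim();
                    continue;
                }
                String[] fields = line.split("\\s+");
                int a = Integer.parseInt(fields[0]);
                int b = Integer.parseInt(fields[1]);
                int expected = Integer.parseInt(fields[2]);
                int actual = op.applyAsInt(a, b);
                if (actual != expected) {
                    failures.add(group + ": " + a + " " + b
                            + " expected " + expected + " but got " + actual);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return failures;
    }
}
```

A test file for a multiplication routine might then begin with a line such as "# partition tests, all combinations of +/-/0 for integer args", followed by cases like "2 3 6", "-2 3 -6", and "0 5 0". Because each failure message carries its group name, the report itself shows which class of test went wrong.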
Acknowledgement
This document was adapted from the course documents for MIT 6.170.