Software Design Patterns

Florian Rappl, Fakultät für Physik, Universität Regensburg

Introduction to modern software architecture

Best practices

Introduction

In this section best practices will be shown for:
- object-oriented design,
- code transformations (refactoring),
- robustness (test driven development),
- performance (device independent) and
- readability.
The content is complementary to the previous chapters

Levels of design

Creating software is much more than just programming
A lot of time needs to be taken for designing an application
There are various levels of architecture:
- System
- Packages (e.g. business rules, UI, DB, dependencies on the system)
- Classes
- Routines
- Logic / algorithm

Desirable characteristics of a design

Minimal complexity
Ease of maintenance
Loose coupling (good abstraction, information hiding)
Extensibility and reusability
High fan-in (large number of classes that use a given class)
Low fan-out (a given class should not use too many other classes)
Portability
Leanness, i.e. no extra parts, backward-compatible
Stratification: The system must be consistent at any level

Defensive programming

Protecting from invalid input
Unit testing
Error-handling (e.g. return neutral value, error code, same answer as last time, ...)
Robustness and correctness
Exceptions are raised when invalid input is encountered
Not being too defensive is key (remove trivial error checking from production code, ...)
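
Two of the error-handling strategies listed above (raising an exception and returning a neutral value) can be sketched as follows. This is only a minimal illustration; the function names `parse_percentage` and `parse_percentage_or_default` are hypothetical and not part of the lecture.

```python
def parse_percentage(text: str) -> float:
    """Convert user input to a percentage; invalid input raises an exception."""
    try:
        value = float(text)
    except ValueError:
        raise ValueError(f"not a number: {text!r}") from None
    if not 0.0 <= value <= 100.0:
        raise ValueError(f"percentage out of range: {value}")
    return value


def parse_percentage_or_default(text: str, default: float = 0.0) -> float:
    """Same check, but return a neutral value instead of raising."""
    try:
        return parse_percentage(text)
    except ValueError:
        return default
```

Which strategy fits depends on whether correctness (fail loudly) or robustness (keep running) matters more for the caller.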

Common errors

Test-driven development

How to ensure that software is robust? We need tests!
But software complexity usually grows exponentially
TDD tries to give us a plan for automated tests
In the end our software is able to inform us about bugs before we experience them by running the application
The basic concept is to get some rapid feedback during development
The risk of change is controlled by having a sufficient number of tests
We are able to detect problems in the specification

TDD cycle

A TDD cycle consists of three phases
1. Red
2. Green
3. Refactor
In the first phase (red) we ensure that everything compiles, but the test fails (since the method just returns a dummy value)
In the second phase (green) we create an implementation that makes the test succeed
In the third phase (refactor) we improve the implementation of the method while keeping the test green
Any scenario that needs to be covered by the method has to be tested
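
The three phases can be sketched with Python's built-in `unittest` module. The routine `is_leap_year` is a hypothetical example chosen for illustration; the dummy version from the red phase is kept as a comment.

```python
import unittest


# Red phase: the first version only returned a dummy value, so the module
# imported fine but the two leap-year tests below failed:
#
#     def is_leap_year(year):
#         return False
#
# Green / refactor phase: an implementation that makes every test pass.
def is_leap_year(year: int) -> bool:
    """Gregorian leap-year rule."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


class LeapYearTest(unittest.TestCase):
    # One test per scenario, without any logic inside the test itself.
    def test_year_divisible_by_4_is_leap(self):
        self.assertTrue(is_leap_year(2024))

    def test_century_is_not_leap(self):
        self.assertFalse(is_leap_year(1900))

    def test_year_divisible_by_400_is_leap(self):
        self.assertTrue(is_leap_year(2000))

    def test_ordinary_year_is_not_leap(self):
        self.assertFalse(is_leap_year(2023))
```

Assuming the code is saved as a module, the tests can be run with `python -m unittest`.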

Red-Green-Refactor

The red phase is key, since it tries to ensure that the test is bug-free
One should first see the test failing (if it should) before succeeding
So overall the process is
- Add a test and run all tests (new one should fail)
- Implement the method and run all tests (should be green now)
- Refactor the code and rerun all tests (should still be green)
The test itself should be as simple as possible
No logic, and following a certain pattern

Test structure

Remarks

Every test consists of creating a test class, performing some setup, invoking the test method and a final cleanup step
TDD is an important part of any agile development process
The KISS (Keep It Simple Stupid) and YAGNI (You Aren't Gonna Need It) principles are usually followed
This means that small, extensible units are built that only have one responsibility
The focus lies on the desired job (project goal)
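
The four steps of a test named at the top of this slide (test class, setup, invocation, cleanup) might look like this with `unittest`. The routine `count_lines` is a hypothetical example, and the temporary file is only there to give the cleanup step something to do.

```python
import os
import tempfile
import unittest


def count_lines(path: str) -> int:
    """Routine under test (hypothetical): count the lines of a text file."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for _ in handle)


class CountLinesTest(unittest.TestCase):        # 1. test class
    def setUp(self):                            # 2. setup
        fd, self.path = tempfile.mkstemp(suffix=".txt")
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write("first\nsecond\nthird\n")

    def test_counts_every_line(self):           # 3. invoke the routine
        self.assertEqual(count_lines(self.path), 3)

    def tearDown(self):                         # 4. cleanup
        os.remove(self.path)
```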

Shortcomings

UI, any external resources (databases, filesystems, network, ...) and others are hard to test (require functional tests)
Writing tests is time-consuming
Blind spots are more likely if the tests are written by the same developer who writes the code
Integration and compliance testing might be reduced due to a false sense of security
The tests need to be maintained as well
Big changes in the architecture might result in a time-consuming update of all tests

Code refactoring

In short, refactoring is a "disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior."

Therefore refactoring should improve code readability by reducing complexity
Also the code should be made maintenance friendly, yet extensible

Refactoring strategies

Refactor when adding a routine
Refactor when adding a class
Refactor when fixing a bug
Target error-prone modules
Target high-complexity modules
Improve the parts that are touched
Define an interface between clean and ugly code

Reasons to refactor (1)

Code is duplicated
A routine is too long
A loop is too long or too deeply nested
A class has poor cohesion
A class interface does not provide a consistent level of abstraction
A parameter list has too many parameters
Changes within a class tend to be compartmentalized
Changes require parallel modifications to multiple classes

Reasons to refactor (2)

Inheritance hierarchies have to be modified in parallel
Case statements have to be modified in parallel
Related data items that are used together are not organized into classes
A routine uses more features of another class than its own class
A primitive data type is overloaded
A class doesn't do very much
A chain of routines passes tramp data
A middleman object isn't doing anything

Reasons to refactor (3)

One class is overly intimate with another
A routine has a poor name
Data members are public
A subclass uses only a small percentage of its parents' routines
Comments are used to explain difficult code
Global variables are used
A routine uses setup code before a routine call or takedown code after a routine call

Kinds of refactorings

Data-Level (e.g. hide class fields)
Statement-Level (e.g. simplify ifs)
Routine-Level (e.g. extract method)
Class implementation (e.g. value to reference)
Class interface (e.g. SRP)
System-Level (e.g. factory pattern)

Refactoring and TDD

Data-Level refactorings

Replace a magic number with a named constant
Rename a variable with a clearer or more informative name
Move an expression inline or replace an expression with a routine
Introduce an intermediate variable
Convert a multi-use variable to multiple single-use variables
Use a local variable for local purposes rather than a parameter
Convert a data primitive to a class
Convert a set of type codes to a class or an enumeration
Change an array to an object / encapsulate a collection
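
Three of the refactorings above (named constant, enumeration instead of type codes, intermediate variables) are combined in the following sketch. All names (`MAX_RETRIES`, `Status`, `should_give_up`) are hypothetical.

```python
from enum import Enum

# Before: if status == 2 and retries > 3: ...

MAX_RETRIES = 3  # named constant replaces the magic number 3


class Status(Enum):  # enumeration replaces the type codes 0, 1, 2
    PENDING = 0
    RUNNING = 1
    FAILED = 2


def should_give_up(status: Status, retries: int) -> bool:
    """Intermediate variables document each part of the condition."""
    has_failed = status is Status.FAILED
    retries_exhausted = retries > MAX_RETRIES
    return has_failed and retries_exhausted
```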

Statement-Level refactorings

Decompose a Boolean expression
Move a complex Boolean expression into a well-named Boolean function
Consolidate fragments that are duplicated within different parts of a conditional
Use break or return instead of a loop control variable
Return as soon as you know the answer instead of assigning a return value within nested if-then-else statements
Replace conditionals with polymorphism
Create and use null objects instead of testing for null values
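
As an illustration of "return as soon as you know the answer" and of moving a test into a well-named Boolean function, here is a before/after sketch. The shipping rules and all names are invented for the example.

```python
def shipping_cost_nested(weight_kg: float, is_member: bool) -> float:
    # Before: the result is assigned inside nested if-then-else statements.
    if weight_kg > 0:
        if is_member:
            cost = 0.0
        else:
            if weight_kg > 20:
                cost = 15.0
            else:
                cost = 5.0
    else:
        cost = 0.0
    return cost


def is_heavy(weight_kg: float) -> bool:
    """Well-named Boolean function for the weight test."""
    return weight_kg > 20


def shipping_cost(weight_kg: float, is_member: bool) -> float:
    # After: return as soon as the answer is known.
    if weight_kg <= 0 or is_member:
        return 0.0
    if is_heavy(weight_kg):
        return 15.0
    return 5.0
```

Both versions return the same values; only the control flow changes, which is what makes this a refactoring.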

Routine-Level refactorings

Extract routine / extract method
Move a routine's code inline
Convert a long routine to a class
Substitute a simple algorithm for a complex algorithm
Add or remove a parameter
Separate query operations from modification operations
Combine similar routines by parametrization
Separate routines whose behavior depends on parameters passed in
Pass a whole object rather than specific fields or vice versa
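
Separating query operations from modification operations could look like the sketch below. The `Account` class is hypothetical; the point is that `can_withdraw` has no side effects, while `withdraw` changes state and returns nothing.

```python
class Account:
    def __init__(self, balance: float = 0.0):
        self.balance = balance

    # Before: one routine both changed the state and returned a value.
    #     def withdraw_and_get_balance(self, amount): ...

    def can_withdraw(self, amount: float) -> bool:
        """Query: no side effects."""
        return 0 < amount <= self.balance

    def withdraw(self, amount: float) -> None:
        """Modification: changes state, returns nothing."""
        if not self.can_withdraw(amount):
            raise ValueError("invalid amount")
        self.balance -= amount
```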

Class implementation refactorings

Change value objects to reference objects
Change reference objects to value objects
Replace virtual routines with data initialization
Change member routine or data placement
Extract specialized code into a subclass
Combine similar code into a superclass

Class interface refactorings

Move a routine to another class
Convert one class to two or vice versa
Hide a delegate or remove a middleman
Replace inheritance with delegation or vice versa
Introduce a foreign routine or extension class
Encapsulate an exposed member variable
Hide routines that are not supposed to be used outside the class
Collapse a superclass and subclass if their implementations are very similar
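
A sketch of replacing inheritance with delegation, which also encapsulates the member variable and hides routines that should not be used from outside. The `Stack` class is a hypothetical example.

```python
# Before: class Stack(list) exposed every list routine (insert, sort, ...).

class Stack:
    """After: the list is a hidden delegate; only stack routines are public."""

    def __init__(self):
        self._items = []  # encapsulated member variable

    def push(self, item) -> None:
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def __len__(self) -> int:
        return len(self._items)
```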

System-Level refactorings

Create a defined reference source for data that is beyond our control
Change unidirectional class associations to bidirectional class associations
Change bidirectional class associations to unidirectional class associations
Provide a factory method rather than a simple constructor
Replace error codes with exceptions or vice versa
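
The last two items can be illustrated together. In this sketch (the `Temperature` class is hypothetical) a factory method gives the construction a descriptive name, and invalid input raises an exception instead of yielding an error code.

```python
class Temperature:
    def __init__(self, kelvin: float):
        if kelvin < 0:
            # Exception instead of an error code such as a returned -1.
            raise ValueError("temperature below absolute zero")
        self.kelvin = kelvin

    @classmethod
    def from_celsius(cls, celsius: float) -> "Temperature":
        """Factory method: the name states the unit of the argument."""
        return cls(celsius + 273.15)
```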

About subroutines

Historically two kinds of operations have been established:
1. A function (does some computation and returns the result)
2. A procedure (modifies something and has no result)
Now we mostly talk about methods (implemented operations of a class)
There are several questions concerning these operations and their parameters
Specifically when to extract methods, how to name them and how to structure output and input parameters

Why another routine?

Reduce complexity
Avoid duplicate code
Support subclassing
Hide sequences or pointer operations
Improve portability
Simplify complicated boolean tests
Improve performance
However: NOT to ensure that all routines are small!

Naming routines

Describe what the routine does
Avoid meaningless verbs
Don't use numbers for differentiation
Make routine names as long as necessary (for variable names, 9-15 characters is a good length)
For a procedure: Use a strong verb followed by an object
For a function: Use a description of the return value
Otherwise use opposites precisely (open/close, add/remove, create/destroy)
Establish connections

Routine parameters

Parameters should always be as general as possible
The opposite holds for the type of the return value
This type should be as specific as possible
We call incoming parameters contra-variant
Outgoing parameters (return values) are co-variant
Such a style increases flexibility by allowing methods to be used with more types
Also the return value is then a lot more useful
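
A minimal sketch of this rule using Python type hints: the parameter is declared as the general `Iterable`, the return value as the specific `List`. The function `sorted_unique` is a hypothetical example.

```python
from typing import Iterable, List


def sorted_unique(values: Iterable[int]) -> List[int]:
    """Accept any iterable (general input), return a concrete list (specific output)."""
    return sorted(set(values))
```

The routine can be called with a list, a tuple, a set or a generator, while the caller receives a list that supports indexing, slicing and `len`.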

Using parameters

Follow the order: Input-Modify-Output
Consider creating your own input and output keywords if the language allows it
Otherwise create or use conventions like in*, out* for names
Use all parameters
Put status or error variables last
Don't use routine parameters as working variables
Limit routine parameters to max. 7
Use named parameters if possible
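
Some of these guidelines can be sketched in Python, where keyword-only arguments (everything after `*`) play the role of named parameters. The function `append_scaled` and its parameter names are hypothetical.

```python
from typing import List


def append_scaled(values: List[float], target: List[float], *, factor: float = 1.0) -> None:
    """Input first (values), then the modified argument (target); options are keyword-only."""
    for value in values:
        scaled = value * factor  # local working variable, the parameters stay untouched
        target.append(scaled)
```

A call then reads `append_scaled(source, result, factor=2.0)`; passing the factor positionally is rejected with a `TypeError`.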

Code optimization

Not trivial, since code should be readable and follow our conventions
Nevertheless sometimes parts of the application are performance critical
Problem: Most optimizations should have been integrated in the design
But "premature performance optimization is the root of all evil"
Only solution: Try to maximize performance and change design if still not good enough

Performance analysis

Common techniques

Substitute table lookups for complicated logic
Jam loops
Use integer instead of floating point variables when possible
Initialize data at compile time
Use constants of the correct type
Precompute results
Eliminate common subexpressions
Translate key routines to a low-level language
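
Substituting a table lookup for branching logic, combined with precomputed results, might look as follows. The example (days per month in a non-leap year) is invented for illustration; whether the lookup is actually faster in a given program has to be measured.

```python
def days_in_month_logic(month: int) -> int:
    # Before: branching logic (non-leap year, month in 1..12).
    if month == 2:
        return 28
    if month in (4, 6, 9, 11):
        return 30
    return 31


# After: the results are precomputed once in a lookup table.
DAYS_IN_MONTH = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)


def days_in_month(month: int) -> int:
    """Table lookup (non-leap year, month in 1..12)."""
    return DAYS_IN_MONTH[month - 1]
```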

Quick improvements (1)

Order tests (switch-case, if-else) by frequency
Compare performance of similar logic structures
Use lazy evaluation
Unswitch loops that contain if tests
Unroll loops
Minimize work performed in loops
Put the busiest loop on the inside of nested loops
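
Unswitching a loop and minimizing the work performed inside it can be shown in one before/after pair. The functions `scale_before` and `scale_after` are hypothetical; both compute the same values.

```python
import math
from typing import List


def scale_before(values: List[float], use_log: bool, base: float) -> List[float]:
    result = []
    for v in values:
        if use_log:  # the same test is repeated in every iteration
            result.append(math.log(v) / math.log(base))  # log(base) recomputed each time
        else:
            result.append(v / base)
    return result


def scale_after(values: List[float], use_log: bool, base: float) -> List[float]:
    if use_log:  # unswitched: the decision is made once
        log_base = math.log(base)  # loop-invariant work moved out of the loop
        return [math.log(v) / log_base for v in values]
    return [v / base for v in values]
```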

Quick improvements (2)

Change multi-dimensional to one-dimensional array
Minimize array references
Augment data types with indices
Cache frequently used variables
Exploit algebraic identities
Reduce strength in logical and mathematical expressions
Rewrite routines inline
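
Changing a two-dimensional array into a one-dimensional one, and caching a value that would otherwise be recomputed, can be sketched as follows. The grid dimensions and the helper names `get` and `row_sum` are assumptions for the example.

```python
ROWS, COLS = 3, 4

# Two-dimensional layout: a list of rows.
grid_2d = [[r * COLS + c for c in range(COLS)] for r in range(ROWS)]

# One-dimensional layout: element (r, c) lives at index r * COLS + c.
grid_1d = [value for row in grid_2d for value in row]


def get(grid, r: int, c: int) -> int:
    """Access element (r, c) of the flattened grid."""
    return grid[r * COLS + c]


def row_sum(grid, r: int) -> int:
    """Cache the row offset instead of recomputing r * COLS for each element."""
    start = r * COLS
    return sum(grid[start:start + COLS])
```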

References

Literature

McConnell, Steve (2004). Design in Construction.
Sedgewick, Robert (1984). Algorithms.
Kerievsky, Joshua (2004). Refactoring to Patterns.
Fowler, Martin (1999). Refactoring: Improving the design of existing code.
Weisfeld, Matt (2004). The Object-Oriented Thought Process.
Beck, Kent (2003). Test-Driven Development by Example.