when a working set is too large and to understand how it can be modified. The
programmer can control the working set size by choosing functions, objects, and
algorithms that minimize or localize memory references.
Programs that do not scatter their memory references widely in virtual memory
are said to possess locality of reference. For example, a program has a much
greater locality of reference if it accesses the elements of an array linearly
through memory, rather than accessing the elements in a different order.
Algorithm Design
Successful algorithm design must take into account paging as well as runtime
considerations. The most obvious (yet often overlooked) aspect of algorithm design
that affects paging is choosing proper data structures. Data structures that are
optimal for paging are often not optimal for runtime; tradeoffs can be made
between paging and runtime performance to optimize overall performance. See
[Andre 86] for a more detailed discussion of the relation between runtime
complexity and paging complexity of programs.
It is important to be aware of this tradeoff at the initial development phase.
During development, programs are typically run using very small data sets, where
paging performance is not an important factor. Since paging is not perceived to
be a problem, little thought is given to improving paging performance.
Unfortunately, in the production phase the program operates on larger data sets,
and the paging overhead can dominate performance.
A reasonable design strategy is to achieve improvements in paging complexity at
the expense of runtime performance. Considering the runtime performance of
hash lookups, it is better to search on a hash miss by computing a new hash than
to search linearly through the hash table from the point of the hash miss. For
large hash tables, random (hashed) references to the table will likely cause a page
fault, whereas linear references will not. Therefore, in applications where hash
tables are very large, it can be advantageous to use a linear search on hash
misses.
The following two Common Lisp functions illustrate another way that the
algorithm design affects paging performance. Each function computes the average
of the products of each column's elements in a square matrix: