A major problem for us was that ATOM-instrumented code runs very slowly. This is expected: the original programs are already time-consuming, and our instrumentation routines add further overhead to every memory access. After some thought, we realized that we did not need to execute the entire program each time we modified our cache policy. All that mattered was the trace of its data accesses. Given a stored trace, a stand-alone driver could replay it through our cache replacement strategies in an essentially platform-independent way, without rerunning the instrumented program. Collecting the program traces was a non-trivial exercise, mostly because large programs required huge amounts of storage. For example, one of the SPEC benchmarks (the JPEG program) generated about 8 billion addresses. In such cases it was better to run the instrumented program, albeit slowly, than to use a stored trace. There was therefore a trade-off between running time and disk space. We used the trace method for the smaller programs and for debugging purposes.
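The trace-driven approach described above can be illustrated with a minimal sketch. It is not our actual driver: the function names `simulate_lru` and `read_trace`, the cache parameters, the choice of LRU as the replacement strategy, and the trace format (one hexadecimal address per line) are all assumptions made for illustration.

```python
from collections import OrderedDict
from typing import Iterable, Iterator, Tuple


def simulate_lru(trace: Iterable[int], num_sets: int = 64, ways: int = 4,
                 line_size: int = 32) -> Tuple[int, int]:
    """Replay an address trace through a set-associative LRU cache.

    Returns (hits, misses). All parameters are illustrative defaults.
    """
    # One ordered dictionary per set; insertion order tracks recency,
    # with the least recently used tag at the front.
    sets = [OrderedDict() for _ in range(num_sets)]
    hits = misses = 0
    for addr in trace:
        block = addr // line_size   # cache-line number of this address
        index = block % num_sets    # which set the line maps to
        tag = block // num_sets     # identifies the line within its set
        lines = sets[index]
        if tag in lines:
            hits += 1
            lines.move_to_end(tag)  # mark as most recently used
        else:
            misses += 1
            if len(lines) >= ways:
                lines.popitem(last=False)  # evict least recently used
            lines[tag] = True
    return hits, misses


def read_trace(path: str) -> Iterator[int]:
    """Yield addresses from a text file, one hexadecimal address per line
    (a hypothetical trace format)."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:
                yield int(line, 16)
```

Because the trace is consumed as a stream, a call such as `simulate_lru(read_trace("trace.txt"))` never holds the whole trace in memory, and a different policy can be tried by changing only the eviction step. The storage cost remains: assuming 8 bytes per address, a trace of about 8 billion addresses would occupy roughly 64 GB uncompressed.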