Accuracy of MARSS

IPC Comparison of MARSS, AMD Athlon X2 and Intel Core-2 Duo

The chart above shows the accuracy of the MARSS simulator by comparing the IPCs (instructions committed per cycle) obtained from simulation against the IPCs measured when executing the same SPEC 2006 benchmark programs on two real machines, an AMD Athlon X2 and an Intel Core-2 Duo. These IPC values cover user-space execution only and do not include kernel execution, because the kernel execution paths differ between the MARSS VM and the test machines used to gather these statistics. All benchmarks were run from start to completion, both in simulation and on the real machines. We used the Linux kernel's performance counters to obtain the IPCs on the actual hardware.
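As a minimal sketch of the comparison described above, the following Python snippet derives an IPC value from a pair of user-space counter readings (committed instructions and cycles) and expresses the gap between a simulated and a measured IPC as a relative difference. The function names and all numbers are hypothetical and chosen only for illustration; they are not results from MARSS or from the test machines. On Linux, such user-space counts could be collected with, for example, `perf stat -e instructions:u,cycles:u`, though the exact procedure used for the chart is not stated here.

```python
def ipc(instructions: int, cycles: int) -> float:
    """Instructions committed per cycle for one benchmark run."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return instructions / cycles


def relative_difference(simulated: float, measured: float) -> float:
    """Signed difference of the simulated IPC relative to the measured IPC."""
    if measured == 0:
        raise ValueError("measured IPC must be non-zero")
    return (simulated - measured) / measured


# Hypothetical user-space counter values (not real measurements).
hw_ipc = ipc(instructions=1_500_000, cycles=1_000_000)   # real machine
sim_ipc = ipc(instructions=1_500_000, cycles=1_200_000)  # simulated core

print(f"hardware IPC:  {hw_ipc:.2f}")
print(f"simulated IPC: {sim_ipc:.2f}")
print(f"relative difference: {relative_difference(sim_ipc, hw_ipc):+.1%}")
```

A negative relative difference means the simulated core commits fewer instructions per cycle than the real machine for that benchmark.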

The simulated IPC values are generally very close to the real-machine IPC values. The discrepancies between the results on the real machines and MARSS arise because we do not have all of the specific hardware details of the real processors, such as ROB size, LSQ size, issue width, number of functional units, and latencies. The simulated results move closer to those measured on the actual hardware as more accurate hardware-level details are used to configure the simulated core.

Performance Scalability of MARSS

Scalability of MARSS from 2-cores to 8-cores

The chart above shows the simulation speed of MARSSx86. We show the per-core instruction commit rates achieved when simulating some of the Parsec 2.1 benchmarks, changing the configuration of the simulated multicore chip from 2 to 4 cores and then to 8 cores. The host platform has two quad-core Intel Xeon E5450 processors with 16 GB of RAM and was running three independent simulations in parallel. As the chart shows, the instructions per second (IPS) achieved per core decrease linearly with the number of simulated cores, due to the overhead of inter-core activity on the coherent caches, the interconnect, synchronization, and so on.
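To make the metric concrete, here is a minimal sketch of how a per-core commit rate could be computed from one simulation run. It assumes that the per-core rate is the total number of committed instructions divided by the host wall-clock time and by the number of simulated cores; that definition, the function name, and the sample numbers are assumptions for illustration, not values taken from the chart.

```python
def per_core_commit_rate(total_committed: int,
                         wall_seconds: float,
                         cores: int) -> float:
    """Committed instructions per second of host time, per simulated core.

    Assumed definition: total commits / wall-clock seconds / core count.
    """
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    if cores <= 0:
        raise ValueError("cores must be positive")
    return total_committed / wall_seconds / cores


# Hypothetical runs: the same total work simulated on 2, 4 and 8 cores.
# These totals and timings are invented for illustration only.
runs = {2: (8_000_000, 10.0), 4: (8_000_000, 16.0), 8: (8_000_000, 25.0)}
for cores, (committed, seconds) in runs.items():
    rate = per_core_commit_rate(committed, seconds, cores)
    print(f"{cores} cores: {rate:,.0f} instructions/s per core")
```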

