Profiling an Architectural Simulator
Overview
gem5 is a state-of-the-art software-based architectural simulator with widespread use in both academia and industry. We set out to profile gem5 on different platforms and evaluate its performance. Our observations show that gem5 runs 1.7x to 3.02x faster on a MacBook Pro with an M1 than on a Dell server with an Intel Xeon Gold. We therefore use FireSim to validate our hypothesis that gem5's performance is largely affected by the cache sizes of the platform it runs on. Useful statistics such as cache misses, branch mispredictions, and CPU utilization are collected by reading performance counters on these platforms. In this documentation, we describe the steps for running gem5 as a workload on FireSim.
Details
Running gem5 on FireSim
The main idea is to execute gem5 as a workload on FireSim to validate our hypothesis that gem5 is largely impacted by the size of the L1 cache. To do this, the user must prepare the gem5 workload (Sieve of Eratosthenes), prepare the FireSim workload (in this case, the gem5 simulator itself), and finally launch the FireSim simulation. The general steps are listed below:
Steps to run gem5 on FireSim
Set up the AWS FireSim environment
Build the gem5 binary for RISC-V ISA
Prepare gem5 workload and transfer it to the instance
Create FireSim workload using FireMarshal
Build the target design and modify its parameters
Set up the AWS FireSim environment
We use a z1d.2xlarge FireSim manager instance. See the FireSim documentation for more details: https://docs.fires.im/en/stable/Initial-Setup/index.html
mosh --ssh="ssh -i firesim.pem" username@ip_addr # username is centos; ip_addr is dynamically assigned to the manager instance upon initialization
Build the gem5 binary for RISC-V ISA
Use QEMU to emulate a RISC-V machine for building the gem5 binary and installing its dependencies.
Test the compiled binary on RISC-V hardware. We use a SiFive HiFive Unleashed development board, which natively runs Ubuntu.
Prepare gem5 workload and transfer it to the instance
In this step, compile your workload binary (we used the Sieve of Eratosthenes) for the gem5 target ISA.
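For readers unfamiliar with the benchmark, the Sieve of Eratosthenes can be sketched as follows. This Python version is only meant to illustrate the algorithm. The workload actually run under gem5 has to be a binary compiled for the gem5 target ISA, and the function name `sieve` and the limit used here are illustrative assumptions rather than the authors' code.

```python
def sieve(limit):
    """Return all primes <= limit using the Sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            # Cross out every multiple of p, starting at p * p.
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
        p += 1
    return [n for n, flag in enumerate(is_prime) if flag]


if __name__ == "__main__":
    print(sieve(30))
```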
Next, transfer your compiled binary to the AWS EC2 F1 instance. We used sftp like this:
sudo sftp -i firesim.pem "username@ip_addr"
put <filename> # this applies to any file
Create FireSim workload using FireMarshal
FireSim requires a .json input file to define the workloads (e.g. gem5) that will run on the target design. FireMarshal is used to manage this process. See the FireMarshal documentation for more details: https://firemarshal.readthedocs.io/en/latest/index.html
This produces the following .json file in the /home/centos/firesim/deploy/workloads directory, which defines the gem5 workload as well as its outputs:
{
  "benchmark_name": "gem5-workload",
  "common_simulation_outputs": [ "uartlog" ],
  "workloads": [
    {
      "name": "gem5-workload-gem5",
      "bootbinary": "../../../target-design/chipyard/software/firemarshal/images/gem5-workload-gem5-bin",
      "rootfs": "../../../target-design/chipyard/software/firemarshal/images/gem5-workload-gem5.img",
      "outputs": [ "/root/sim-environment/m5out" ]
    }
  ]
}
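Before launching a simulation, it can help to sanity-check the generated workload file. The sketch below is a minimal example that assumes the file has the shape shown above. The helper name `check_workload` and the set of keys it requires are illustrative choices, not part of FireSim or FireMarshal.

```python
import json

# Example content in the shape shown above (wrapped in braces so it is valid JSON).
EXAMPLE = """
{
  "benchmark_name": "gem5-workload",
  "common_simulation_outputs": ["uartlog"],
  "workloads": [
    {
      "name": "gem5-workload-gem5",
      "bootbinary": "../../../target-design/chipyard/software/firemarshal/images/gem5-workload-gem5-bin",
      "rootfs": "../../../target-design/chipyard/software/firemarshal/images/gem5-workload-gem5.img",
      "outputs": ["/root/sim-environment/m5out"]
    }
  ]
}
"""


def check_workload(text):
    """Parse a workload description and return a list of problems found."""
    try:
        spec = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err}"]
    if not isinstance(spec, dict):
        return ["top level is not a JSON object"]
    problems = []
    for key in ("benchmark_name", "common_simulation_outputs", "workloads"):
        if key not in spec:
            problems.append(f"missing top-level key: {key}")
    for i, job in enumerate(spec.get("workloads", [])):
        for key in ("name", "bootbinary", "rootfs"):
            if key not in job:
                problems.append(f"workloads[{i}] missing key: {key}")
    return problems


if __name__ == "__main__":
    print(check_workload(EXAMPLE))
```

An empty list means none of the checked keys are missing; it does not verify that the referenced image files exist.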
Build the target design and modify its parameters
To build your target design on FireSim, you can use any of Chipyard's included RTL generators (e.g. Rocket Chip).
Our base configuration is a quad-core Rocket Chip with a 16KB 2-way set associative icache and dcache, and a 512KB L2 cache.
To change the base system configuration, we specify new design parameters in the TargetConfigs.scala file at the following path:
/home/centos/firesim/target-design/chipyard/generators/firechip/src/main/scala/TargetConfigs.scala
An example of creating a target design with 64KB L1I and L1D Caches
We specify a quad-core Rocket Chip with a 64KB L1 icache and dcache in the TargetConfigs.scala file. Parameter precedence goes from the bottom up: fragments are applied starting from the last line, so fragments listed earlier override those listed later. Note that the default block size is 64 bytes.
class FireSimGem5ConfigQuadRocketConfig extends Config(
  new freechips.rocketchip.subsystem.WithL1ICacheWays(16) ++ // change rocket I$
  new freechips.rocketchip.subsystem.WithL1ICacheSets(64) ++ // change rocket I$
  new freechips.rocketchip.subsystem.WithL1DCacheWays(16) ++ // change rocket D$
  new freechips.rocketchip.subsystem.WithL1DCacheSets(64) ++ // change rocket D$
  new WithDefaultFireSimBridges ++
  new WithDefaultMemModel ++
  new WithFireSimConfigTweaks ++
  new chipyard.QuadRocketConfig)
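The 64KB capacity follows from multiplying sets, ways, and block size. The short sketch below works through that arithmetic for the values in the configuration above, assuming the 64-byte default block size. The helper name `cache_size_bytes` is hypothetical and used only for illustration.

```python
def cache_size_bytes(sets, ways, block_bytes=64):
    """Capacity of a set-associative cache: sets * ways * block size."""
    return sets * ways * block_bytes


# Values from the config fragments above: 64 sets, 16 ways, 64-byte blocks.
l1_bytes = cache_size_bytes(sets=64, ways=16)
print(l1_bytes // 1024, "KB")  # 64 KB
```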
Modify the config_build_recipes.yaml, config_build.yaml, and config_runtime.yaml files by adding the following lines.
config_build_recipes.yaml
firesim_rocket_quadcore_gem5_config: # This can be any name specified by the user
    DESIGN: FireSim
    TARGET_CONFIG: DDR3FRFCFSLLC4MB_WithDefaultFireSimBridges_WithFireSimTestChipConfigTweaks_FireSimGem5ConfigQuadRocketConfig
    PLATFORM_CONFIG: WithAutoILA_F140MHz_BaseF1Config
    deploy_triplet: null
    post_build_hook: null
    metasim_customruntimeconfig: null
    bit_builder_recipe: bit-builder-recipes/f1.yaml
config_build.yaml
builds_to_run:
    - firesim_rocket_quadcore_gem5_config # This name must match the name specified in config_build_recipes.yaml
config_runtime.yaml
run_farm:
    # run farm hosts to spawn: a mapping from a spec below (which is an EC2
    # instance type) to the number of instances of the given type that you
    # want in your runfarm.
    run_farm_hosts_to_use:
        - f1.16xlarge: 0
        - f1.4xlarge: 0
        - f1.2xlarge: 1 # we want to use f1.2xlarge as the runfarm instance
        - m4.16xlarge: 0
        - z1d.3xlarge: 0
        - z1d.6xlarge: 0
        - z1d.12xlarge: 0
target_config:
    topology: no_net_config
    no_net_num_nodes: 1
    link_latency: 6405
    switching_latency: 10
    net_bandwidth: 200
    profile_interval: -1
    # This references a section from config_hwdb.yaml for fpga-accelerated simulation
    # or from config_build_recipes.yaml for metasimulation
    # In homogeneous configurations, use this to set the hardware config deployed
    # for all simulators
    default_hw_config: firesim_rocket_quadcore_gem5_config
workload:
    workload_name: gem5-workload.json
Next, we use the Golden Gate compiler to generate Verilog from the Chisel-generated RTL for the AWS AGFI build process.
To move to the Golden Gate compiler directory, run:
cd /home/centos/firesim/sim/
Run make
make DESIGN=FireSim TARGET_CONFIG=DDR3FRFCFSLLC4MB_WithDefaultFireSimBridges_WithFireSimTestChipConfigTweaks_FireSimGem5ConfigQuadRocketConfig PLATFORM_CONFIG=WithAutoILA_F140MHz_BaseF1Config f1
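Because the TARGET_CONFIG value is a long underscore-joined list of config names, a stray space or line break is easy to introduce when copying it. The sketch below assembles the make arguments from a list and rejects names containing whitespace. The function name `make_args` is hypothetical, and the sketch assumes the names are joined with underscores exactly as in the command above.

```python
def make_args(design, target_fragments, platform_config, goal="f1"):
    """Build the argument list for `make`, joining config names with '_'."""
    for name in [design, platform_config, *target_fragments]:
        if not name or any(ch.isspace() for ch in name):
            raise ValueError(f"invalid config name: {name!r}")
    target_config = "_".join(target_fragments)
    return [
        "make",
        f"DESIGN={design}",
        f"TARGET_CONFIG={target_config}",
        f"PLATFORM_CONFIG={platform_config}",
        goal,
    ]


args = make_args(
    "FireSim",
    [
        "DDR3FRFCFSLLC4MB",
        "WithDefaultFireSimBridges",
        "WithFireSimTestChipConfigTweaks",
        "FireSimGem5ConfigQuadRocketConfig",
    ],
    "WithAutoILA_F140MHz_BaseF1Config",
)
print(" ".join(args))
```

The resulting list could then be handed to something like `subprocess.run(args, cwd="/home/centos/firesim/sim")`, which avoids shell quoting issues altogether.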
Build the AWS FPGA Image by executing:
firesim buildbitstream
After a successful build, update config_hwdb.yaml with the AGFI info.
firesim_rocket_quadcore_gem5_config: # Add your AGFI info to config_hwdb.yaml, so it can be deployed during simulation
    agfi: agfi-06e876ba9378cc9ff
    deploy_triplet_override: null
    custom_runtime_config: null
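The same recipe name has to line up across all four YAML files. A minimal sketch of that cross-check is shown below, using plain dictionaries in place of the parsed files. The function `check_names` and the dictionary layout are illustrative assumptions, and loading the real files (for example with a YAML parser) is left out.

```python
def check_names(recipes, build, runtime, hwdb):
    """Return a list of naming mismatches between the four config files."""
    problems = []
    for name in build.get("builds_to_run", []):
        if name not in recipes:
            problems.append(
                f"{name}: in config_build.yaml but not in config_build_recipes.yaml"
            )
    hw = runtime.get("target_config", {}).get("default_hw_config")
    if hw not in hwdb:
        problems.append(f"{hw}: default_hw_config has no entry in config_hwdb.yaml")
    return problems


NAME = "firesim_rocket_quadcore_gem5_config"
recipes = {NAME: {"DESIGN": "FireSim"}}
build = {"builds_to_run": [NAME]}
runtime = {"target_config": {"default_hw_config": NAME}}
hwdb = {NAME: {"agfi": "agfi-06e876ba9378cc9ff"}}
print(check_names(recipes, build, runtime, hwdb))  # []
```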
Then, launch the run farm instance, set up the simulation infrastructure, and run your FireSim simulation.
firesim launchrunfarm && firesim infrasetup && firesim runworkload
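When scripting these three commands, it is worth stopping at the first failure so that a workload is never started on a run farm that did not come up. The following is a minimal sketch, assuming the `firesim` command is on the PATH. The `run_steps` helper and its injectable `runner` argument are hypothetical and exist only to make the sketch easy to test.

```python
import subprocess

STEPS = ["launchrunfarm", "infrasetup", "runworkload"]


def run_steps(steps=STEPS, runner=subprocess.run):
    """Run each `firesim <step>` in order and stop at the first failure."""
    completed = []
    for step in steps:
        result = runner(["firesim", step])
        if result.returncode != 0:
            raise RuntimeError(
                f"firesim {step} failed with exit code {result.returncode}"
            )
        completed.append(step)
    return completed
```

Calling `run_steps()` with no arguments runs the real commands; passing a stub as `runner` lets the control flow be exercised without touching AWS.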
Finally, results can be collected from the following directory.
cd /home/centos/firesim/deploy/results-workload/
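To locate the output of the most recent run programmatically, something like the following can be used. It assumes that each run gets its own subdirectory under results-workload whose name starts with a timestamp (so that sorting by name orders runs chronologically) and that the uartlog listed in the workload file sits somewhere beneath it. Both the layout and the helper name `latest_uartlogs` are assumptions to verify against your installation.

```python
from pathlib import Path


def latest_uartlogs(results_root):
    """Return uartlog paths from the newest run directory, or [] if none."""
    root = Path(results_root)
    if not root.is_dir():
        return []
    # Assumption: run directories sort chronologically by name.
    runs = sorted(p for p in root.iterdir() if p.is_dir())
    if not runs:
        return []
    return sorted(runs[-1].rglob("uartlog"))
```

Pass the path of your results-workload directory as `results_root`; the function returns an empty list when the directory is missing or contains no runs.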
Publications
Johnson Umeike, Neel Patel, Alex Manley, Amin Mamandipoor, Heechul Yun, Mohammad Alian, “Profiling gem5 Simulator,” ISPASS 2023 [paper] [slides]
- FireSim and Chipyard User and Developer Workshop at ASPLOS 2023 [website]
Title: Profiling an Architectural Simulator (Using Firesim to Profile gem5) [presentation]