What tips can you give for profiling on devices that have thermal throttling? What margin would you target to avoid thermal throttling (i.e., targeting 20 ms instead of 33 ms)?

Arm: Optimizing for frame time can be misleading on Android because devices constantly adjust clock frequency to optimize energy usage, which makes frame time an incomplete measure by itself. Preferably, monitor CPU and GPU cycles per frame, as well as GPU memory bandwidth per frame, to get a measure that is independent of frequency. The cycle target you need will depend on each device’s chip design, so you’ll need to experiment.
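To illustrate why cycles per frame are independent of frequency, here is a minimal Python sketch. It assumes you can sample an average clock frequency and a busy time for each frame; how you obtain those depends on your profiler, and the function name and numbers are hypothetical.

```python
def cycles_per_frame(freq_mhz: int, busy_time_us: int) -> int:
    """Approximate cycles spent in one frame.

    MHz multiplied by microseconds gives cycles directly
    (1e6 cycles/s * 1e-6 s = 1 cycle).
    """
    return freq_mhz * busy_time_us


# The same workload at two clock speeds (illustrative numbers, not measurements).
at_full_clock = cycles_per_frame(freq_mhz=2000, busy_time_us=10_000)  # 10 ms busy
at_half_clock = cycles_per_frame(freq_mhz=1000, busy_time_us=20_000)  # 20 ms busy

# Frame time doubled, but the cycle count is unchanged.
print(at_full_clock, at_half_clock)
```

Under these assumptions the frame time doubles when the clock halves, while the cycle count stays the same, which is what makes it the more stable number to track.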

Any optimization helps when it comes to managing power consumption, even if it doesn’t directly improve frame rate. For example, reducing CPU cycles will reduce thermal load even if the CPU isn’t the critical path for your game.

Beyond that, optimizing memory bandwidth is one of the biggest savings you can make. Accessing DRAM is orders of magnitude more expensive than accessing local data on-chip, so watch your triangle budget and keep data types in memory as small as possible.
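As a rough back-of-the-envelope sketch of the data-size point, the Python below estimates how much vertex position data is read per second. It assumes every vertex is fetched from DRAM once per frame and ignores caches, compression, and tiling, so treat the result as an upper-bound illustration; the function name and mesh size are hypothetical.

```python
def vertex_bytes_per_second(vertex_count: int, bytes_per_vertex: int, fps: int) -> int:
    """Bytes of vertex data read per second, assuming one full read per frame."""
    return vertex_count * bytes_per_vertex * fps


VERTICES = 100_000  # hypothetical scene size
FPS = 30

# Positions stored as three 32-bit floats (12 bytes) versus three 16-bit floats (6 bytes).
full_precision = vertex_bytes_per_second(VERTICES, 12, FPS)
half_precision = vertex_bytes_per_second(VERTICES, 6, FPS)

print(full_precision, half_precision)
```

In this simplified model, halving the size of the data type halves the traffic, and so does halving the vertex count.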

Unity: To limit the impact of CPU clock frequency on the performance metrics, we recommend trying to run at a consistent temperature. There are a couple of approaches for doing this:

  • Run warm: Run the device for a while so that it reaches a stable warm state before profiling.
  • Run cool: Leave the device to cool for a while before profiling. This strategy reduces confusion and inconsistency between profiling sessions because the captures are unlikely to be thermally throttled. However, such captures always represent the best-case performance a user will see, rather than what they might actually see after long play sessions. It also lengthens the time between profiling runs, because you need to wait for the device to cool each time.
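One way to decide when a device has reached a stable state, warm or cool, is to wait until recent temperature readings stop drifting. The Python sketch below assumes you already have a list of temperature samples in degrees Celsius from some device-specific source; the function name, window size, and tolerance are hypothetical and would need tuning per device.

```python
from typing import Sequence


def is_thermally_stable(
    readings_c: Sequence[float],
    window: int = 5,
    tolerance_c: float = 1.0,
) -> bool:
    """Return True if the last `window` readings vary by at most `tolerance_c`."""
    if len(readings_c) < window:
        return False  # not enough samples to judge yet
    recent = readings_c[-window:]
    return max(recent) - min(recent) <= tolerance_c


# Illustrative samples: the device heats up, then levels off.
warming_up = [30.0, 33.0, 36.0, 38.0, 39.5]
levelled_off = warming_up + [40.0, 40.2, 40.1, 40.3, 40.2]

print(is_thermally_stable(warming_up))    # still drifting
print(is_thermally_stable(levelled_off))  # recent readings are steady
```

The same check works for either approach: start capturing only once it reports a steady state.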

With some hardware, you can fix the clock frequency for more stable performance metrics. However, this is not representative of most devices your users will be using, and will not report accurate real-world performance. It is mainly a handy technique if you are using a continuous integration setup to check for performance changes in your codebase over time.
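For the continuous integration case, a check along the following lines could flag performance changes between builds. This is a minimal Python sketch, assuming frame times in milliseconds were captured at a fixed clock frequency; the function name and the 5% tolerance are hypothetical choices, not a recommendation from the source.

```python
from statistics import median
from typing import Sequence


def has_regressed(
    baseline_ms: Sequence[float],
    current_ms: Sequence[float],
    tolerance: float = 0.05,
) -> bool:
    """Return True if the median frame time grew by more than `tolerance`."""
    return median(current_ms) > median(baseline_ms) * (1.0 + tolerance)


# Illustrative frame times, not measurements.
baseline = [16.0, 16.2, 15.9]
within_noise = [16.1, 16.0, 16.3]
slower_build = [18.0, 18.2, 17.9]

print(has_regressed(baseline, within_noise))
print(has_regressed(baseline, slower_build))
```

The median is used here because it is less sensitive than the mean to the occasional outlier frame.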

Source: Unity Technologies Blog