summaryrefslogtreecommitdiff
path: root/util/exynos/fixed_cksum.py
diff options
context:
space:
mode:
authorSubrata Banik <subratabanik@google.com>2023-05-29 13:53:57 +0530
committerSubrata Banik <subratabanik@google.com>2023-06-01 07:52:12 +0000
commit36faccfec51dc5314193f048ff77f23a0304e0b1 (patch)
treec26f0867def3266ed2d6c22f5dba3c818e259236 /util/exynos/fixed_cksum.py
parentdffb1c8933c09c041b10e1f6a666449e7a7531da (diff)
include/cpu/x86: Skip `wbinvd` on CPUs with cache self-snooping (SS)
This patch refers and backport some of previous work from Linux Kernel (https://lore.kernel.org/all/1561689337-19390-3-git-send-email-ricardo. neri-calderon@linux.intel.com/T/#u) that optimizes the MTRR register programming in multi-processor systems by relying on the CPUID (self-snoop feature supported). Refer to the details below: Programming MTRR registers in multi-processor systems is a rather lengthy process as it involves flushing caches. As a result, the process may take a considerable amount of time. Furthermore, all processors must program these registers serially. `wbinvd` instruction is used to invalidate the cache line to ensure that all modified data is written back to memory. All logical processors are stopped from executing until after the write-back and invalidate operation is completed. The amount of time or cycles for WBINVD to complete will vary due to the size of different cache hierarchies and other factors. As a consequence, the use of the WBINVD instruction can have an impact on response time. As per measurements, around 98% of the time needed by the procedure to program MTRRs in multi-processor systems is spent flushing caches with wbinvd(). As per the Section 11.11.8 of the Intel 64 and IA 32 Architectures Software Developer's Manual, it is not necessary to flush caches if the CPU supports cache self-snooping (ss). "Flush all caches using the WBINVD instructions. Note on a processor that supports self-snooping, CPUID feature flag bit 27, this step is unnecessary." Thus, skipping the cache flushes can reduce by several tens of milliseconds the time needed to complete the programming of the MTRR registers: Platform Before After 12-core (14 Threads) MeteorLake 35ms 1ms BUG=b:260455826 TEST=Able to build and boot google/rex. Change-Id: I83cac2b1e1707bbb1bc1bba82cf3073984e9768f Signed-off-by: Subrata Banik <subratabanik@google.com> Reviewed-on: https://review.coreboot.org/c/coreboot/+/75511 Tested-by: build bot (Jenkins) <no-reply@coreboot.org> Reviewed-by: Jérémy Compostella <jeremy.compostella@intel.com> Reviewed-by: Lean Sheng Tan <sheng.tan@9elements.com> Reviewed-by: Himanshu Sahdev <himanshu.sahdev@intel.com> Reviewed-by: Tarun Tuli <taruntuli@google.com>
Diffstat (limited to 'util/exynos/fixed_cksum.py')
0 files changed, 0 insertions, 0 deletions