author: Subrata Banik <subratabanik@google.com> 2023-05-29 13:53:57 +0530
committer: Subrata Banik <subratabanik@google.com> 2023-06-01 07:52:12 +0000
commit: 36faccfec51dc5314193f048ff77f23a0304e0b1
tree: c26f0867def3266ed2d6c22f5dba3c818e259236
parent: dffb1c8933c09c041b10e1f6a666449e7a7531da
include/cpu/x86: Skip `wbinvd` on CPUs with cache self-snooping (SS)
This patch refers to and backports earlier work from the Linux kernel
(https://lore.kernel.org/all/1561689337-19390-3-git-send-email-ricardo.neri-calderon@linux.intel.com/T/#u)
that optimizes MTRR register programming in multi-processor systems by
checking the CPUID self-snoop feature flag. Refer to the details below:

Programming MTRR registers in multi-processor systems is a rather lengthy
process because it involves flushing caches. As a result, the process may
take a considerable amount of time. Furthermore, all processors must
program these registers serially.

The `wbinvd` instruction writes all modified cache lines back to memory
and invalidates the caches. All logical processors are stopped from
executing until the write-back and invalidate operation has completed.
The time (in cycles) that WBINVD needs to complete varies with the size
of the cache hierarchy and other factors. As a consequence, the use of
the WBINVD instruction can have an impact on response time.

As per measurements, around 98% of the time needed by the procedure to
program MTRRs in multi-processor systems is spent flushing caches with
wbinvd(). As per Section 11.11.8 of the Intel 64 and IA-32 Architectures
Software Developer's Manual, it is not necessary to flush caches if the
CPU supports cache self-snooping (SS):

"Flush all caches using the WBINVD instructions. Note on a processor
that supports self-snooping, CPUID feature flag bit 27, this step is
unnecessary."

Thus, skipping the cache flushes can reduce the time needed to program
the MTRR registers by several tens of milliseconds:

Platform                           Before    After
12-core (14 Threads) MeteorLake    35ms      1ms

BUG=b:260455826
TEST=Able to build and boot google/rex.
Change-Id: I83cac2b1e1707bbb1bc1bba82cf3073984e9768f
Signed-off-by: Subrata Banik <subratabanik@google.com>
Reviewed-on: https://review.coreboot.org/c/coreboot/+/75511
Tested-by: build bot (Jenkins) <no-reply@coreboot.org>
Reviewed-by: Jérémy Compostella <jeremy.compostella@intel.com>
Reviewed-by: Lean Sheng Tan <sheng.tan@9elements.com>
Reviewed-by: Himanshu Sahdev <himanshu.sahdev@intel.com>
Reviewed-by: Tarun Tuli <taruntuli@google.com>
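
The feature test described in the commit message can be tried outside
coreboot. The following is a minimal host-side sketch, not the coreboot
implementation: it assumes a GCC or Clang toolchain that ships <cpuid.h>,
and the helper names edx_has_self_snoop() and host_has_self_snoop() are
hypothetical, introduced here only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>	/* GCC/Clang CPUID helpers; assumed to be available */
#endif

/* Bit position of the self-snoop (SS) flag in CPUID.01H:EDX, as in the patch. */
#define CPUID_FEATURE_SELF_SNOOP_BIT 27

/* Hypothetical helper: test the SS bit in an already-fetched EDX value. */
static inline bool edx_has_self_snoop(uint32_t edx)
{
	return (edx >> CPUID_FEATURE_SELF_SNOOP_BIT) & 1;
}

/* Hypothetical helper: query the running CPU; reports false on non-x86 builds. */
static inline bool host_has_self_snoop(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid() returns 0 when the requested leaf is not supported. */
	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		return false;
	return edx_has_self_snoop(edx);
#else
	return false;
#endif
}
```

Splitting the bit test from the CPUID query keeps the bit arithmetic
checkable with fixed values, independent of the machine it runs on.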
Diffstat (limited to 'src/include')
-rw-r--r-- src/include/cpu/x86/cache.h 15
1 files changed, 14 insertions, 1 deletions
diff --git a/src/include/cpu/x86/cache.h b/src/include/cpu/x86/cache.h
index 4143d972f5..d4d9160252 100644
--- a/src/include/cpu/x86/cache.h
+++ b/src/include/cpu/x86/cache.h
@@ -9,9 +9,11 @@
 #define CR0_NoWriteThrough (CR0_NW)
 
 #define CPUID_FEATURE_CLFLUSH_BIT 19
+#define CPUID_FEATURE_SELF_SNOOP_BIT 27
 
 #if !defined(__ASSEMBLER__)
 
+#include <arch/cpuid.h>
 #include <stdbool.h>
 #include <stddef.h>
 
@@ -51,6 +53,16 @@ static __always_inline void enable_cache(void)
 	write_cr0(cr0);
 }
 
+/*
+ * Cache flushing is the most time-consuming step when programming the MTRRs.
+ * However, if the processor supports cache self-snooping (ss), we can skip
+ * this step and save time.
+ */
+static __always_inline bool self_snooping_supported(void)
+{
+	return (cpuid_edx(1) >> CPUID_FEATURE_SELF_SNOOP_BIT) & 1;
+}
+
 static __always_inline void disable_cache(void)
 {
 	/* Disable and write back the cache */
@@ -58,7 +70,8 @@ static __always_inline void disable_cache(void)
 	cr0 = read_cr0();
 	cr0 |= CR0_CD;
 	write_cr0(cr0);
-	wbinvd();
+	if (!self_snooping_supported())
+		wbinvd();
 }
 
 #endif /* !__ASSEMBLER__ */
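
The control flow of the patched disable_cache() can be exercised on a
host machine with the hardware accesses replaced by plain data. The
sketch below is a model under stated assumptions, not coreboot code:
struct fake_cpu and model_disable_cache() are hypothetical names, and a
counter increment stands in for the privileged wbinvd instruction.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_CD (1u << 30)	/* CR0 cache-disable bit */

/* Hypothetical stand-in for the CPU state that disable_cache() touches. */
struct fake_cpu {
	uint32_t cr0;		/* simulated CR0 register */
	bool self_snoop;	/* value CPUID.01H:EDX[27] would report */
	unsigned int flushes;	/* number of simulated wbinvd executions */
};

/*
 * Same decision as the patched disable_cache(): set CR0.CD, then flush
 * only when the CPU does not advertise self-snooping.
 */
static void model_disable_cache(struct fake_cpu *cpu)
{
	cpu->cr0 |= CR0_CD;
	if (!cpu->self_snoop)
		cpu->flushes++;	/* stands in for wbinvd() */
}
```

Running the model once with self_snoop set and once with it clear shows
that CR0.CD is set in both cases, while the flush counter only advances
for the CPU without self-snooping.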