OSDN Git Service

x86/asm: Use CC_SET/CC_OUT in percpu_cmpxchg8b_double() to micro-optimize code generation
authorUros Bizjak <ubizjak@gmail.com>
Tue, 5 Jun 2018 16:39:09 +0000 (18:39 +0200)
committerIngo Molnar <mingo@kernel.org>
Thu, 21 Jun 2018 13:21:47 +0000 (15:21 +0200)
Use CC_SET(z)/CC_OUT(z) instead of explicit SETZ instruction.

Using these two defines, the compiler that supports generation of
condition code outputs from inline assembly flags generates e.g.:

  cmpxchg8b %fs:(%esi)
  jne    172255 <__kmalloc+0x65>

instead of:

  cmpxchg8b %fs:(%esi)
  sete   %al
  test   %al,%al
  je     172255 <__kmalloc+0x65>

Note that older compilers now generate:

  cmpxchg8b %fs:(%esi)
  sete   %cl
  test   %cl,%cl
  je     173a85 <__kmalloc+0x65>

since we have to mark that cmpxchg8b instruction outputs to %eax
register and this way clobbers the value in the register.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/lkml/20180605163910.13015-1-ubizjak@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
arch/x86/include/asm/percpu.h

index a06b073..e9202a0 100644 (file)
@@ -450,9 +450,10 @@ do {                                                                       \
        bool __ret;                                                     \
        typeof(pcp1) __o1 = (o1), __n1 = (n1);                          \
        typeof(pcp2) __o2 = (o2), __n2 = (n2);                          \
-       asm volatile("cmpxchg8b "__percpu_arg(1)"\n\tsetz %0\n\t"       \
-                   : "=a" (__ret), "+m" (pcp1), "+m" (pcp2), "+d" (__o2) \
-                   :  "b" (__n1), "c" (__n2), "a" (__o1));             \
+       asm volatile("cmpxchg8b "__percpu_arg(1)                        \
+                    CC_SET(z)                                          \
+                    : CC_OUT(z) (__ret), "+m" (pcp1), "+m" (pcp2), "+a" (__o1), "+d" (__o2) \
+                    : "b" (__n1), "c" (__n2));                         \
        __ret;                                                          \
 })