[x86, MemCmpExpansion] allow 2 pairs of loads per block (PR33325)
author Sanjay Patel <spatel@rotateright.com>
Sat, 6 Jan 2018 16:16:04 +0000 (16:16 +0000)
committer Sanjay Patel <spatel@rotateright.com>
Sat, 6 Jan 2018 16:16:04 +0000 (16:16 +0000)
commit df87029dcfc6278f4753e9f1aa42b433c517c6b2
tree 6f36f764a8353aab11cf1f097057a864a3b0a79c
parent 0179ddc7fa1ccf9ece5298a159ed0ecf24d4af09

This is the last step needed to fix PR33325:
https://bugs.llvm.org/show_bug.cgi?id=33325

We're trading branches and compares for loads and logic ops.
This makes the code smaller and hopefully faster in most cases.
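
As a rough illustration of that trade, here is a C sketch of a 16-byte
equality check written both ways. The function names eq16_branchy and
eq16_merged are hypothetical and exist only for this example; the real
transform operates on LLVM IR in ExpandMemCmp, not on C source.

```c
#include <stdint.h>
#include <string.h>

/* One pair of loads per block: each 8-byte chunk gets its own
   compare and its own branch. */
static int eq16_branchy(const void *a, const void *b) {
    uint64_t a0, b0, a1, b1;
    memcpy(&a0, a, 8);
    memcpy(&b0, b, 8);
    if (a0 != b0)
        return 0;                       /* early exit on the first chunk */
    memcpy(&a1, (const char *)a + 8, 8);
    memcpy(&b1, (const char *)b + 8, 8);
    return a1 == b1;
}

/* Two pairs of loads per block: xor each pair, or the differences
   together, and test the combined value once. */
static int eq16_merged(const void *a, const void *b) {
    uint64_t a0, b0, a1, b1;
    memcpy(&a0, a, 8);
    memcpy(&b0, b, 8);
    memcpy(&a1, (const char *)a + 8, 8);
    memcpy(&b1, (const char *)b + 8, 8);
    return ((a0 ^ b0) | (a1 ^ b1)) == 0; /* zero only if both pairs match */
}
```

The merged form always performs all four loads, but it needs only one
compare and no intermediate branch.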

The 24-byte test shows an interesting construct: we load the trailing scalar
elements into vector registers and generate the same pcmpeq+movmsk code that
we expected for a pair of full vector elements (see the 32- and 64-byte tests).
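
The 24-byte shape can be approximated by hand with SSE2 intrinsics. This is
a sketch under assumptions, not the compiler's output: eq24 is a
hypothetical name, and the plain memcmp fallback is there only so the
example still builds on targets without SSE2.

```c
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Returns 1 if the first 24 bytes of a and b are equal, else 0. */
static int eq24(const void *a, const void *b) {
#if defined(__SSE2__)
    const char *pa = (const char *)a;
    const char *pb = (const char *)b;
    /* Bytes 0..15: full unaligned vector loads. */
    __m128i va0 = _mm_loadu_si128((const __m128i *)pa);
    __m128i vb0 = _mm_loadu_si128((const __m128i *)pb);
    /* Bytes 16..23: 8-byte scalar loads into vector registers;
       the upper halves are zeroed, so they always compare equal. */
    __m128i va1 = _mm_loadl_epi64((const __m128i *)(pa + 16));
    __m128i vb1 = _mm_loadl_epi64((const __m128i *)(pb + 16));
    /* pcmpeqb on each pair, then and the two byte masks together. */
    __m128i eq = _mm_and_si128(_mm_cmpeq_epi8(va0, vb0),
                               _mm_cmpeq_epi8(va1, vb1));
    /* pmovmskb: all 16 mask bits set means every byte matched. */
    return _mm_movemask_epi8(eq) == 0xFFFF;
#else
    return memcmp(a, b, 24) == 0;
#endif
}
```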

Differential Revision: https://reviews.llvm.org/D41714

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321934 91177308-0d34-0410-b5e6-96231b3b80d8
lib/CodeGen/ExpandMemCmp.cpp
lib/Target/X86/X86ISelLowering.h
test/CodeGen/X86/memcmp-optsize.ll
test/CodeGen/X86/memcmp.ll
test/Transforms/ExpandMemCmp/X86/memcmp.ll