git.osdn.net Git - android-x86/external-llvm.git/commit

[AMDGPU] Collapse adjacent SI_END_CF

Add a pass to remove redundant S_OR_B64 instructions that re-enable lanes
in exec. If two SI_END_CF instructions (lowered as S_OR_B64) occur back to
back with no vector instructions between them, only the outer SI_END_CF
needs to be kept, given that the CFG is structured and the exec bits
restored by the outer end statement are always a superset of those restored
by the inner one.
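
The collapse rule above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the code of the pass added by this commit: `Kind`, `Inst`, `collapseEndCF` and `runExec` are hypothetical names, exec is modelled as a single 64-bit mask, and the model follows the commit message literally by treating only vector instructions as blocking the collapse.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy instruction kinds (hypothetical; these are not LLVM opcodes).
enum class Kind { EndCF, Scalar, Vector };

struct Inst {
  Kind kind;
  uint64_t savedMask; // for EndCF: lanes OR-ed back into exec; unused otherwise
};

// Drop an EndCF when another EndCF follows it with no Vector instruction
// in between. The later (outer) EndCF is the one that is kept.
std::vector<Inst> collapseEndCF(const std::vector<Inst> &in) {
  std::vector<Inst> out;
  for (std::size_t i = 0; i < in.size(); ++i) {
    if (in[i].kind == Kind::EndCF) {
      bool redundant = false;
      for (std::size_t j = i + 1; j < in.size(); ++j) {
        if (in[j].kind == Kind::Vector)
          break; // a vector instruction would observe the inner exec value
        if (in[j].kind == Kind::EndCF) {
          redundant = true; // the outer EndCF restores these lanes anyway
          break;
        }
      }
      if (redundant)
        continue;
    }
    out.push_back(in[i]);
  }
  return out;
}

// Model the effect on exec: each EndCF performs exec |= savedMask.
uint64_t runExec(uint64_t exec, const std::vector<Inst> &prog) {
  for (const Inst &I : prog)
    if (I.kind == Kind::EndCF)
      exec |= I.savedMask;
  return exec;
}
```

In this model, removing the inner EndCF leaves the final exec value unchanged whenever the inner saved mask is a subset of the outer one, because (exec | inner) | outer equals exec | outer in that case. That subset relation is the structured-CFG assumption stated in the commit message.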

This needs to be done before register allocation, to eliminate the registers
holding saved exec bits, but after the register coalescer, so that no vector
register copies remain between different end cf statements.

Differential Revision: https://reviews.llvm.org/D35967

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309762 91177308-0d34-0410-b5e6-96231b3b80d8

author	Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>
	Tue, 1 Aug 2017 23:14:32 +0000 (23:14 +0000)
committer	Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>
	Tue, 1 Aug 2017 23:14:32 +0000 (23:14 +0000)
commit	5b53ac928df501548823370b8f0ee5c656ee6491
tree	ff27e6a2625f9a225f4f6751b4b02aee75a501ce
parent	6aacb6c808d82a8cd5018ee04080f8d92635832d

lib/Target/AMDGPU/AMDGPU.h
lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
lib/Target/AMDGPU/CMakeLists.txt
lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp	[new file with mode: 0644]
test/CodeGen/AMDGPU/collapse-endcf.ll	[new file with mode: 0644]