git.osdn.net Git - android-x86/external-llvm.git/commit

author	Hal Finkel <hfinkel@anl.gov>
	Fri, 4 Apr 2014 23:51:18 +0000 (23:51 +0000)
committer	Hal Finkel <hfinkel@anl.gov>
	Fri, 4 Apr 2014 23:51:18 +0000 (23:51 +0000)
commit	e6a5b33e6e3e9626634f08f2dab8cbc0866e30b5
tree	48a6115dc1972a6184a8c04732dd3b94235975e8
parent	cef9f7ef271e9d95c8151ed8e683ef272b6b8c18

[PowerPC] Adjust load/store costs in PPCTTI

This provides more realistic costs for the insertelement/extractelement
instructions (which are load/store pairs), accounts for the cheap unaligned
Altivec load sequence, and accounts for unaligned VSX loads and stores.
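
The three adjustments described above can be illustrated with a small,
standalone sketch. It is not the code from PPCTargetTransformInfo.cpp: the
struct, the function names, and every numeric cost below are hypothetical,
chosen only to show the shape of such a cost model (element access priced as a
load/store pair, a cheap unaligned Altivec load, and a separate case for
unaligned VSX accesses).

```cpp
// Hypothetical, simplified cost model. All names and numbers here are
// illustrative assumptions, not the values used by the actual PPCTTI code.
enum class MemOp { Load, Store };

struct VecAccess {
  MemOp op;
  unsigned numRegs;     // vector registers touched after legalization
  unsigned regBytes;    // bytes per vector register (e.g. 16)
  unsigned alignBytes;  // known alignment of the access, at least 1
  bool isAltivecType;   // type is handled by Altivec
  bool hasVSX;          // VSX unaligned loads/stores are available
};

// Assumed extra cost for going through the stack: a store followed by a
// dependent load (or the reverse).
constexpr unsigned kLoadStorePairPenalty = 2;

// insertelement/extractelement modeled as a load/store pair on top of the
// base instruction cost.
unsigned elementAccessCost(unsigned baseCost) {
  return baseCost + kLoadStorePairPenalty;
}

unsigned memoryOpCost(const VecAccess &a) {
  unsigned cost = a.numRegs;  // aligned case: one operation per register
  if (a.alignBytes >= a.regBytes)
    return cost;

  // Unaligned Altivec load: assume one extra permute per register.
  if (a.op == MemOp::Load && a.isAltivecType)
    return cost + a.numRegs;

  // Unaligned VSX load/store: assume twice the aligned cost.
  if (a.hasVSX)
    return cost * 2;

  // Otherwise assume the access is split into alignment-sized pieces.
  return cost + a.numRegs * (a.regBytes / a.alignBytes - 1);
}
```

With these assumed numbers, a 16-byte Altivec load at 4-byte alignment costs 2,
while the same unaligned store without VSX costs 4, which is the kind of
asymmetry the vectorizer then has to weigh against the scalar loop.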

Bad news:
MultiSource/Applications/sgefa/sgefa - 35% slowdown (this will require more investigation)
SingleSource/Benchmarks/McGill/queens - 20% slowdown (we no longer vectorize this, but it was a constant store that was scalarized)
MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2 - 2% slowdown

Good news:
SingleSource/Benchmarks/Shootout/ary3 - 54% speedup
SingleSource/Benchmarks/Shootout-C++/ary - 40% speedup
MultiSource/Benchmarks/Ptrdist/ks/ks - 35% speedup
MultiSource/Benchmarks/FreeBench/neural/neural - 30% speedup
MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt - 20% speedup

Unfortunately, estimating the costs of the stack-based scalarization sequences
is hard, and adjusting these costs is like a game of whac-a-mole :( I'll
revisit this again after we have better codegen for vector extloads,
truncstores, and unaligned loads/stores.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205658 91177308-0d34-0410-b5e6-96231b3b80d8

lib/Target/PowerPC/PPCTargetTransformInfo.cpp
test/Analysis/CostModel/PowerPC/ext.ll
test/Analysis/CostModel/PowerPC/insert_extract.ll
test/Analysis/CostModel/PowerPC/load_store.ll