git.osdn.net Git - android-x86/external-llvm-project.git/commit

author	Changpeng Fang <changpeng.fang@gmail.com>
	Fri, 24 Jan 2020 00:57:43 +0000 (16:57 -0800)
committer	Changpeng Fang <changpeng.fang@gmail.com>
	Fri, 24 Jan 2020 00:57:43 +0000 (16:57 -0800)
commit	2531535984ad989ce88aeee23cb92a827da6686e
tree	70cc36b82c2a6c75b86a1ea1106c164397333bf9
parent	7ad17e008b0abec9b791f17de2f75f9112510d9d

AMDGPU: Implement FDIV optimizations in AMDGPUCodeGenPrepare

    Summary:
      RCP has limited accuracy. If an FDIV's !fpmath metadata requires high
    accuracy, rcp may not meet the requirement. However, fpmath information is
    lost in DAG lowering, so we may generate either inaccurate rcp-based
    computation or slow code for fdiv.

    This patch implements the fdiv optimizations in AMDGPUCodeGenPrepare, where
    the !fpmath metadata is still available.

     FastUnsafeRcpLegal: We determine whether it is legal to use rcp based on
                         unsafe-fp-math, fast math flags, denormals and fpmath
                         accuracy request.

     RCP Optimizations:
       1/x -> rcp(x) when fast unsafe rcp is legal or fpmath >= 2.5ULP with
                                                      denormals flushed.
       a/b -> a*rcp(b) when fast unsafe rcp is legal.

     Use fdiv.fast:
       a/b -> fdiv.fast(a, b) when RCP optimization is not performed and
                              fpmath >= 2.5ULP with denormals flushed.

       1/x -> fdiv.fast(1,x)  when RCP optimization is not performed and
                              fpmath >= 2.5ULP with denormals enabled.
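
The rules above can be read as a small decision procedure. The following is a minimal standalone sketch of that reading, not the code from the patch: `FDivInfo`, `FDivLowering`, and `chooseLowering` are hypothetical names introduced for illustration, the `FastUnsafeRcpLegal` result is taken as a precomputed input, and "with denormals" in the last rule is assumed to mean that denormals are not flushed.

```cpp
#include <cassert>

// Hypothetical summary of what the pass knows about one f32 fdiv.
// These names are illustrative only; they are not taken from the patch.
struct FDivInfo {
  bool numeratorIsOne;      // the division has the form 1/x
  bool fastUnsafeRcpLegal;  // outcome of the FastUnsafeRcpLegal check
  float fpmathULP;          // error allowed by !fpmath in ULP, 0 if absent
  bool denormalsFlushed;    // denormals are flushed for this operation
};

enum class FDivLowering { KeepFDiv, Rcp, MulRcp, FDivFast };

FDivLowering chooseLowering(const FDivInfo &info) {
  // "fpmath >= 2.5ULP": the instruction tolerates at least 2.5 ULP of error.
  const bool lowAccuracyOK = info.fpmathULP >= 2.5f;

  // RCP optimizations.
  if (info.numeratorIsOne) {
    // 1/x -> rcp(x)
    if (info.fastUnsafeRcpLegal || (lowAccuracyOK && info.denormalsFlushed))
      return FDivLowering::Rcp;
  } else if (info.fastUnsafeRcpLegal) {
    // a/b -> a*rcp(b)
    return FDivLowering::MulRcp;
  }

  // fdiv.fast, considered only when no RCP optimization was performed.
  if (lowAccuracyOK) {
    // a/b -> fdiv.fast(a, b) needs denormals flushed.
    if (!info.numeratorIsOne && info.denormalsFlushed)
      return FDivLowering::FDivFast;
    // 1/x -> fdiv.fast(1, x) with denormals (assumed: not flushed).
    if (info.numeratorIsOne && !info.denormalsFlushed)
      return FDivLowering::FDivFast;
  }

  // Otherwise leave the fdiv for the regular, accurate lowering.
  return FDivLowering::KeepFDiv;
}
```

Under this reading, a 1/x with `!fpmath` of 2.5 ULP becomes `rcp(x)` when denormals are flushed and `fdiv.fast(1, x)` when they are not, while a general a/b with the same metadata and denormals not flushed is left untouched.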

    Reviewers:
      arsenm

    Differential Revision:
      https://reviews.llvm.org/D71293

llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-fdiv.ll
llvm/test/CodeGen/AMDGPU/fdiv.ll
llvm/test/CodeGen/AMDGPU/fdiv32-to-rcp-folding.ll
llvm/test/CodeGen/AMDGPU/fneg-combines.ll
llvm/test/CodeGen/AMDGPU/known-never-snan.ll
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.rcp.ll
llvm/test/CodeGen/AMDGPU/mul24-pass-ordering.ll
llvm/test/CodeGen/AMDGPU/rcp-pattern.ll
llvm/test/CodeGen/AMDGPU/rcp_iflag.ll
llvm/test/CodeGen/AMDGPU/rsq.ll