git.osdn.net Git - android-x86/external-llvm.git/log

[InstCombine] Allow (fptrunc (fpext X)) to be reduced to a single fpext/fptrunc

If we are only truncating bits from the extend we should be able to just use a smaller extend.

If we are truncating more than the extend we should be able to just use a fptrunc since the presence of the fpextend shouldn't affect rounding.
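
The two cases above can be sketched as a small decision function over the bit widths involved. This is only an illustration, not the InstCombine code: `CastKind`, `foldTruncOfExt` and `sameAsDirectExtend` are hypothetical names, and identifying a floating-point type by its width alone is a simplifying assumption.

```cpp
#include <cassert>

// Hypothetical names for illustration; this is not LLVM's API.
enum class CastKind { None, FPExt, FPTrunc };

// Decide which single cast can replace fptrunc(fpext X), given the bit widths
// of X (srcBits), of the extended type (midBits) and of the final type
// (dstBits). Assumes each width identifies exactly one format (16, 32, 64...).
inline CastKind foldTruncOfExt(unsigned srcBits, unsigned midBits,
                               unsigned dstBits) {
  assert(srcBits < midBits && dstBits < midBits && "expected fptrunc(fpext X)");
  if (dstBits > srcBits)
    return CastKind::FPExt;   // only bits added by the extend are dropped
  if (dstBits < srcBits)
    return CastKind::FPTrunc; // more than the extend is dropped
  return CastKind::None;      // back at the original type
}

// Concrete check of the first case with C++ types: float -> long double ->
// double yields the same value as float -> double. Not meaningful for NaN
// inputs, because NaN never compares equal.
inline bool sameAsDirectExtend(float x) {
  double viaWide = static_cast<double>(static_cast<long double>(x));
  double direct = static_cast<double>(x);
  return viaWide == direct;
}
```

For example, a half extended to double and truncated to float corresponds to `foldTruncOfExt(16, 64, 32)`, which selects a single extend.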

Differential Revision: https://reviews.llvm.org/D43970

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326595 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][x32] Save callee-save register used as base pointer for x32 ABI

For the x32 ABI, since the base pointer register (EBX) is a callee save register
it should be saved before use.

This fixes https://bugs.llvm.org/show_bug.cgi?id=36011

Differential Revision: https://reviews.llvm.org/D42358

Patch by Pratik Bhatu

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326593 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Fold variable into assert.

Avoids unused variable warnings in Release mode.
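
A minimal sketch of the general pattern, assuming a variable whose only use is inside `assert`. The function `firstValue` and its contents are hypothetical and unrelated to the ARM code touched by this commit.

```cpp
#include <cassert>
#include <vector>

// Before (shown as a comment): `NonEmpty` is only read inside assert(), so a
// build with NDEBUG defined sees a variable that is never used.
//
//   bool NonEmpty = !Values.empty();
//   assert(NonEmpty && "expected at least one value");
//
// After: the condition lives inside the assert, so nothing is left over when
// assertions are compiled out. Only fold expressions without side effects,
// because the whole expression disappears under NDEBUG.
inline int firstValue(const std::vector<int> &Values) {
  assert(!Values.empty() && "expected at least one value");
  return Values.front();
}
```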

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326592 91177308-0d34-0410-b5e6-96231b3b80d8

[utils] Add utils/update_cc_test_checks.py

A utility to update LLVM IR in C/C++ FileCheck test files.

Example RUN lines in .c/.cc test files:

// RUN: %clang -S -Os -DXX %s -o - | FileCheck %s
// RUN: %clangxx -S -Os %s -o - | FileCheck -check-prefix=IR %s

Usage:

% utils/update_cc_test_checks.py --llvm-bin=release/bin test/a.cc
% utils/update_cc_test_checks.py --c-index-test=release/bin/c-index-test --clang=release/bin/clang /tmp/c/a.cc

    // NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
    // RUN: %clang -emit-llvm -S -Os -DXX %s -o - | FileCheck -check-prefix=AA %s
    // RUN: %clangxx -emit-llvm -S -Os %s -o - | FileCheck -check-prefix=BB %s
    using T =
    #ifdef XX
        int __attribute__((vector_size(16)))
    #else
        short __attribute__((vector_size(16)))
    #endif
        ;

    // AA-LABEL: _Z3fooDv4_i:
    // AA:       entry:
    // AA-NEXT:    %add = shl <4 x i32> %a, <i32 1, i32 1, i32 1, i32 1>
    // AA-NEXT:    ret <4 x i32> %add
    //
    // BB-LABEL: _Z3fooDv8_s:
    // BB:       entry:
    // BB-NEXT:    %add = shl <8 x i16> %a, <i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1>
    // BB-NEXT:    ret <8 x i16> %add
    T foo(T a) {
      return a + a;
    }

Differential Revision: https://reviews.llvm.org/D42712

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326591 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: InstrMapping for G_ZEXT

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326589 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: InstrMapping for G_TRUNC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326588 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Define InstrMappings for G_FCMP

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326587 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Define instruction mapping for @llvm.minnum

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326586 91177308-0d34-0410-b5e6-96231b3b80d8

LoopUnroll: respect pragma unroll when AllowRemainder is disabled

Currently when AllowRemainder is disabled, the pragma unroll count is not
respected even though there is no remainder. This bug causes a loop to be
fully unrolled in many cases even though the user specifies an unroll
count. It especially affects OpenCL/CUDA, since in many cases a loop
contains convergent instructions and AllowRemainder is currently
disabled for such loops.

Differential Revision: https://reviews.llvm.org/D43826

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326585 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Fix access to stack arguments when re-aligning SP in Armv6m

When an Armv6m function dynamically re-aligns the stack, access to incoming
stack arguments (and to stack area, allocated for register varargs) is done via
SP, which is incorrect, as the SP is offset by an unknown amount relative to the
value of SP upon function entry.

This patch fixes it by making access to "fixed" frame objects be done via FP
when the function needs stack re-alignment. It also changes the access to
"fixed" frame objects to be done via FP (instead of using R6/BP) for the case
when the stack frame contains variable sized objects. This should allow more
objects to fit within the immediate offset of the load instruction.

All of the above via a small refactoring to reuse the existing
`ARMFrameLowering::ResolveFrameIndexReference`.

Differential Revision: https://reviews.llvm.org/D43566

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326584 91177308-0d34-0410-b5e6-96231b3b80d8

[MergeICmps] Revert accidentally submitted failing test case.

Reverts r326574.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326582 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9] Add missing instructions to the Power 9 scheduler

Adding more instructions using InstRW so that we can move away from ItinRW
and ultimately have a complete Power 9 scheduler.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326578 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Check function type indexes

Also update tests containing invalid Wasm files, exposed by the check

Differential Revision: https://reviews.llvm.org/D43954

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326577 91177308-0d34-0410-b5e6-96231b3b80d8

[Docs] Add LLVM for Grad Students to Contributing page.

Adrian Sampson's blog post provides a good and relatively up-to-date
introduction to LLVM. I think this post could be helpful for people wanting
to get started with LLVM.

Reviewers: asb, tonic, silvas, probinson, kristof.beyls, rengolin

Reviewed By: rengolin

Differential Revision: https://reviews.llvm.org/D42904

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326576 91177308-0d34-0410-b5e6-96231b3b80d8

[MergeICmps] Revert 324317 "Enable the MergeICmps Pass by default."

While working on PR36557.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326575 91177308-0d34-0410-b5e6-96231b3b80d8

[MergeICmps] Add the test case from PR36557.

Summary: See PR36557.

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44009

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326574 91177308-0d34-0410-b5e6-96231b3b80d8

Test commit: Remove an extraneous space. NFC

Test commit access.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326573 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[WebAssembly] More uses of uint8_t" and "[WebAssembly] Update tests"

This reverts commits r326541 and r326571.

The tests were correct, and were updated with incorrect expectations.
The original commit was broken and should be reverted to get things back
to a working state.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326572 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Update tests after r326541

r326541 slightly increased the size of WebAssembly object files
and it broke test/MC/WebAssembly/global-ctor-dtor.ll.

This commit updates the test to unbreak it. I have also mentioned this to the
author of the original commit in case they don't want it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326571 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Fix codegen for VLD3/VLD4/VST3/VST4 with WB

Code generation of VLD3, VLD4, VST3 and VST4 with register writeback is
broken due to 2 separate bugs:

1) VLD1d64TPseudoWB_register and VLD1d64QPseudoWB_register are missing
   rules to expand them to non pseudo MIR. These are selected for
   ARMISD::VLD3_UPD/VLD4_UPD with v1i64 vectors in SelectVLD.

2) Selection of the right VLD/VST instruction is broken for load and
   store of 3 and 4 v1i64 vectors. SelectVLD and SelectVST are called
   with MIR opcode for fixed writeback (i.e. increment is access size)
   and call getVLDSTRegisterUpdateOpcode() to select an opcode with
   register writeback if base register update is of a different size.
   Since getVLDSTRegisterUpdateOpcode() only knows about
   VLD1/VLD2/VST1/VST2 the call is currently conditional on the number
   of elements in the vector.

   However, VLD1/VST1 is selected by SelectVLD/SelectVST's caller for
   load and stores of 3 or 4 v1i64 vectors. Therefore the opcode is not
   updated, which later leads to a fixed writeback instruction being
   constructed with an extra operand for the register writeback.

This patch addresses the two issues as follows:
- it adds the necessary mapping from VLD1d64TPseudoWB_register and
  VLD1d64QPseudoWB_register to VLD1d64Twb_register and
  VLD1d64Qwb_register respectively. Like for the existing _fixed
  variants, the cost of these is bumped for unaligned access.
- it changes the logic in SelectVLD and SelectVST to call isVLDfixed
  and isVSTfixed respectively to decide whether the opcode should be
  updated. It also reworks the logic and comments for pushing the
  writeback offset operand and r0 operand to clarify the logic:
  writeback offset needs to be pushed if it's a register writeback,
  r0 needs to be pushed if not and the instruction is a
  VLD1/VLD2/VST1/VST2.

Reviewers: rengolin, t.p.northover, samparker

Reviewed By: samparker

Patch by Thomas Preud'homme <thomas.preudhomme@arm.com>

Differential Revision: https://reviews.llvm.org/D42970

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326570 91177308-0d34-0410-b5e6-96231b3b80d8

[LV][CFG] Add irreducible CFG detection for outer loops

This patch adds support for detecting outer loops with irreducible control
flow in LV. Current detection uses SCCs and only works for innermost loops.
This patch adds a utility function that works on any CFG, given its RPO
traversal and its LoopInfoBase. This function is a generalization
of isIrreducibleCFG from lib/CodeGen/ShrinkWrap.cpp. The code in
lib/CodeGen/ShrinkWrap.cpp is also updated to use the new generic utility
function.
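
The idea of checking a CFG from its RPO traversal plus loop information can be sketched as follows. This is a simplified illustration, not the utility added by the patch: `LoopInfoSketch` is a hypothetical stand-in for LoopInfoBase, nodes are plain integers, and the caller is assumed to supply a valid reverse post-order.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for LoopInfoBase: for every node, the header of the
// innermost natural loop containing it (-1 if none), and for every header,
// the header of the enclosing loop (-1 if none).
struct LoopInfoSketch {
  std::vector<int> innermostHeader;
  std::vector<int> parentHeader;
};

// An edge src -> dst that goes backwards in RPO is a proper back edge only if
// dst is the header of some loop that contains src.
inline bool isProperBackedge(int src, int dst, const LoopInfoSketch &li) {
  if (src == dst)
    return true; // self-loop
  for (int h = li.innermostHeader[src]; h != -1; h = li.parentHeader[h])
    if (h == dst)
      return true;
  return false;
}

// succs[n] lists the successors of node n; rpo is a reverse post-order
// traversal of the same graph starting at the entry node.
inline bool containsIrreducibleCFG(const std::vector<std::vector<int>> &succs,
                                   const std::vector<int> &rpo,
                                   const LoopInfoSketch &li) {
  assert(succs.size() == li.innermostHeader.size() && "one entry per node");
  std::vector<bool> visited(succs.size(), false);
  for (int node : rpo) {
    visited[node] = true;
    for (int succ : succs[node]) {
      if (!visited[succ])
        continue; // edge goes forward in RPO, nothing to check
      if (!isProperBackedge(node, succ, li))
        return true; // backward edge that does not target a loop header
    }
  }
  return false;
}
```

A diamond whose two arms jump into each other (0->1, 0->2, 1->2, 2->1) has no natural loop, so the backward edge 2->1 is reported as irreducible.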

Patch by Diego Caballero <diego.caballero@intel.com>

Differential Revision: https://reviews.llvm.org/D40874

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326568 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Define instruction mapping for @llvm.maxnum

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326567 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove old UNIMPLEMENTED list

All of these are implemented and have appropriate test coverage

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326553 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] More uses of uint8_t for single byte values

Summary: It looks like this was missing from D43921.

Reviewers: sbc100

Subscribers: jfb, dschuff, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D43991

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326541 91177308-0d34-0410-b5e6-96231b3b80d8

[ThinLTO] Added a couple of C LTO API interfaces to control the cache policy.
- thinlto_codegen_set_cache_size_bytes to control the absolute size of the cache directory.
- thinlto_codegen_set_cache_size_files to control the number of files in the cache directory.
These functions have been supported in the C++ LTO API for a long time, but were absent from the C LTO API.

Differential Revision: https://reviews.llvm.org/D42446

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326537 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GCN: Promote i16 ctpop

i16 capable ASICs do not support i16 operands for this instruction.
Add tablegen pattern to merge chained i16 additions.

Differential Revision: https://reviews.llvm.org/D43985

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326535 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Define instruction mapping for G_FPTOSI

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326534 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Define instruction mapping for G_FPTOUI

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326533 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Define instruction mapping for G_FMUL

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326532 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Add more test cases to fpextend.ll.

This includes the test cases from D43970 and additional tests for combining (fptrunc (binop (fpext), (fpext))) where the pre-extended types don't match the trunc and therefore can't be completely removed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326528 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Define instruction mapping for G_FADD

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326526 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Define instruction mapping for G_SHL

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326525 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Define instruction mapping for G_XOR

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326524 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Define instruction mapping for G_AND

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326523 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Gather EH instructions in one place. NFC.

Summary:
- Gather EH instructions in one place for easy tracking (more will be
added later)
- Variable name change

Reviewers: dschuff

Subscribers: jfb, sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D43742

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326522 91177308-0d34-0410-b5e6-96231b3b80d8

[ArgumentPromotion] don't break musttail invariant PR36543

Summary:
Do not break musttail invariant by promoting arguments of musttail
callee or caller.

Reviewers: sanjoy, dberlin, hfinkel, george.burgess.iv, fhahn, rnk

Reviewed By: rnk

Subscribers: rnk, llvm-commits

Differential Revision: https://reviews.llvm.org/D43926

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326521 91177308-0d34-0410-b5e6-96231b3b80d8

Utility functions for checked arithmetic

Provide checkedAdd and checkedMul functions, providing checked
arithmetic on signed integers.
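
A minimal sketch of what checked signed arithmetic looks like, assuming 32-bit operands and a `std::optional` return value. The function names mirror the commit message, but the signatures and the widening approach are assumptions made for illustration, not the code added to LLVM.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper: return the value if it fits in int32_t, otherwise
// report overflow with an empty optional.
inline std::optional<int32_t> narrowChecked(int64_t wide) {
  if (wide < std::numeric_limits<int32_t>::min() ||
      wide > std::numeric_limits<int32_t>::max())
    return std::nullopt;
  return static_cast<int32_t>(wide);
}

// The sum of two 32-bit values always fits in 64 bits, so compute it wide and
// then check the range.
inline std::optional<int32_t> checkedAdd(int32_t a, int32_t b) {
  return narrowChecked(static_cast<int64_t>(a) + b);
}

// The product of two 32-bit values has magnitude at most 2^62, so it also
// fits in 64 bits.
inline std::optional<int32_t> checkedMul(int32_t a, int32_t b) {
  return narrowChecked(static_cast<int64_t>(a) * b);
}
```

A caller tests the result before using it, for example `if (auto sum = checkedAdd(x, y)) use(*sum);`, where `use` stands for whatever consumes the value.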

Differential Revision: https://reviews.llvm.org/D43704

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326516 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Simplify test cases by removing loads/stores that aren't required for what is being tested.

The loads and stores were getting the data and storing the results. There's no reason we can't just use function arguments and return.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326515 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] allow fmul fold with less than 'fast'

This is a retry of r326502 with updates to the reassociate
test file that I missed the first time.

@test15_reassoc in the supposed -reassociate test file
(except that it tests 2 other passes too...) shows that
there's no clear responsibility for reassociation transforms.

Instcombine now gets that case, but only because the
constant values are identical. Otherwise, it would still
miss that pattern.

Reassociate doesn't get that case because it hasn't been
updated to use less than 'fast' FMF.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326513 91177308-0d34-0410-b5e6-96231b3b80d8

[Reassociate] regenerate checks; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326511 91177308-0d34-0410-b5e6-96231b3b80d8

revert r326502: [InstCombine] allow fmul fold with less than 'fast'

I forgot that I added tests for 'reassoc' to -reassociate, but
surprisingly that file calls -instcombine too, so it is affected.
I'll update that file and try again.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326510 91177308-0d34-0410-b5e6-96231b3b80d8

bpf: introduce -mattr=dwarfris to disable DwarfUsesRelocationsAcrossSections

Commit e4507fb8c94b ("bpf: disable DwarfUsesRelocationsAcrossSections")
disables MCAsmInfo DwarfUsesRelocationsAcrossSections unconditionally
so that dwarf will not use cross section (between dwarf and symbol table)
relocations. This new debug format enables pahole to dump structures
correctly as libdwarves.so does not have BPF backend support yet.

This new debug format, however, breaks bcc (https://github.com/iovisor/bcc)
source debug output as llvm in-memory Dwarf support has some issues to
handle it. More specifically, with DwarfUsesRelocationsAcrossSections
disabled, JIT compiler does not generate .debug_abbrev and Dwarf
DIE (debug info entry) processing is not happy about this.

This patch introduces a new flag -mattr=dwarfris
(dwarf relocation in section) to disable DwarfUsesRelocationsAcrossSections.
DwarfUsesRelocationsAcrossSections is true by default.

Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326505 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] allow fmul fold with less than 'fast'

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326502 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] When combining zero_extend of a truncate, only mask before extending for vectors.

Masking first prevents the extend from being combined with loads. It's also interfering with some vXi1 extraction code.
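
The scalar identity behind this combine can be sketched as below: a zero-extend of a truncate equals a mask, and the mask can be applied either before or after the width change. The function names are hypothetical and the 16/8/32-bit widths are an arbitrary choice for the example.

```cpp
#include <cstdint>

// zext(trunc x): truncate 16 bits to 8, then zero-extend to 32.
inline uint32_t zextOfTrunc(uint16_t x) {
  return static_cast<uint32_t>(static_cast<uint8_t>(x));
}

// Mask in the source type first, then extend.
inline uint32_t maskThenExtend(uint16_t x) {
  return static_cast<uint32_t>(static_cast<uint16_t>(x & 0xFFu));
}

// Extend first, then mask in the destination type.
inline uint32_t extendThenMask(uint16_t x) {
  return static_cast<uint32_t>(x) & 0xFFu;
}

// Exhaustively confirm that the three forms agree for every 16-bit input.
inline bool allFormsAgree() {
  for (uint32_t v = 0; v <= 0xFFFFu; ++v) {
    const uint16_t x = static_cast<uint16_t>(v);
    if (zextOfTrunc(x) != maskThenExtend(x) ||
        zextOfTrunc(x) != extendThenMask(x))
      return false;
  }
  return true;
}
```

Since the results are identical, the choice between the two orders is purely about which form later combines handle better.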

Differential Revision: https://reviews.llvm.org/D42679

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326500 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][MMX] Improve handling of 64-bit MMX constants

64-bit MMX constant generation usually ends up lowering into SSE instructions before being spilled/reloaded as a MMX type.

This patch bitcasts the constant to a double value to allow correct loading directly to the MMX register.

I've added MMX constant asm comment support to improve testing. It's better to always print the double values as hex constants, as MMX is mainly an integer unit (and even with 3DNow! it's just floats).

Differential Revision: https://reviews.llvm.org/D43616

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326497 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Support some SimplifySetCC cases for comparing against vector splats of constants.

This supports things like

(setcc ugt X, 0) -> (setcc ne X, 0)

I've restricted this to only make changes to vectors before legalize ops, because I doubt all targets have accurate condition code legality information for vectors given how little we did before.
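
The example fold holds lane by lane because an unsigned value is greater than zero exactly when it is not zero. A small sketch with a hypothetical four-lane vector type (`Vec4`, `Mask4` and both function names are assumptions for illustration):

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<uint32_t, 4>;
using Mask4 = std::array<bool, 4>;

// Lane-wise (setcc ugt X, splat(0)).
inline Mask4 setccUgtZero(const Vec4 &x) {
  Mask4 m{};
  for (std::size_t i = 0; i < x.size(); ++i)
    m[i] = x[i] > 0u;
  return m;
}

// Lane-wise (setcc ne X, splat(0)).
inline Mask4 setccNeZero(const Vec4 &x) {
  Mask4 m{};
  for (std::size_t i = 0; i < x.size(); ++i)
    m[i] = x[i] != 0u;
  return m;
}
```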

Differential Revision: https://reviews.llvm.org/D42948

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326495 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX] Add v2f32 <-> v2i8/v2i16/v2i32 vector tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326494 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Add trap1 instruction

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326492 91177308-0d34-0410-b5e6-96231b3b80d8

Add an llc testcase analogous to test/LTO/X86/strip-debug-info.ll

rdar://problem/37963669

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326491 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Define instruction mapping for @llvm.amdgcn.cvt.pkrtz

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326490 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Define instruction mapping for G_OR

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326489 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Regenerate float to/from i8/i16 vector tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326488 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Remove default register mapping

This crashes for some opcodes, which prevents the SelectionDAG
fallback from working.

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326487 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Clean up code (NFC)

Clean up a couple of functions in `AArch64TargetLowering` by removing
redundant statements.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326486 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Regenerate odd sized sext/zext tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326484 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Use a more correct getValueMapping

This was finding the wrong size registers for anything with
more than 2 components.

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326483 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Define instruction mapping for G_BITCAST

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326482 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Mark i32->i64 zext as legal

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326481 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Add support for secrel add/load/store relocations for COFF

Differential Revision: https://reviews.llvm.org/D43288

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326480 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: InstrMapping for llvm.amdgcn.exp.compr

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326479 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Define instruction mapping for @llvm.amdgcn.exp

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326477 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyLibCalls] Update an obviously copy and pasted header comment to match this file. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326475 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Auto-generate complete checks. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326474 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Define InstrMappings for G_ICMP

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326472 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Make i32 mul legal

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326471 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Define instruction mapping for G_IMPLICIT_DEF

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326470 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Define instruction mapping for G_FCONSTANT

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326468 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Add copyCost for VGPR->SGPR copies

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326467 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Make i32 xor legal

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326466 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Mark 32/64-bit G_FCMP as legal

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326465 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Mark 32-bit G_FPTOSI as legal

Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326464 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Fix broken gcc build after rL326454

The gcc builders were broken by rL326454
See: https://reviews.llvm.org/D43921

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326460 91177308-0d34-0410-b5e6-96231b3b80d8

[NVPTX] use pattern matching to lower int_nvvm_match_all_sync*.

Now that patterns can handle intrinsics returning multiple results,
use tablegen'ed pattern matching instead of custom lowering.

Differential Revision: https://reviews.llvm.org/D43890

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326457 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Use uint8_t for single byte values to match the spec

The original BinaryEncoding.md document used to specify that
these values were `varint7`, but the official spec lists them
explicitly as single byte values and not LEB.
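
To illustrate why the two descriptions produce the same bytes for in-range values, the sketch below encodes a signed LEB128 value and compares it with a raw byte. `encodeSLEB128` here is a local illustration (not LLVM's helper of the same name) and `asSingleByte` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Standard signed LEB128 encoding. Relies on >> of a negative value being an
// arithmetic shift (guaranteed from C++20, and the behaviour of mainstream
// compilers before that).
inline std::vector<uint8_t> encodeSLEB128(int64_t value) {
  std::vector<uint8_t> out;
  bool more = true;
  while (more) {
    uint8_t byte = static_cast<uint8_t>(value & 0x7f);
    value >>= 7;
    const bool signBit = (byte & 0x40) != 0;
    more = !((value == 0 && !signBit) || (value == -1 && signBit));
    if (more)
      byte |= 0x80; // continuation bit
    out.push_back(byte);
  }
  return out;
}

// Hypothetical helper: a 7-bit signed value written as one raw byte.
inline uint8_t asSingleByte(int8_t varint7) {
  assert(varint7 >= -64 && varint7 <= 63 && "value must fit in 7 bits");
  return static_cast<uint8_t>(varint7 & 0x7f);
}
```

Any value in the 7-bit signed range encodes to exactly one LEB byte, identical to the raw byte (for instance -1 becomes 0x7f), so treating these fields as `uint8_t` does not change the bytes for valid inputs.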

A similar change for wabt is in flight:
https://github.com/WebAssembly/wabt/pull/782

Differential Revision: https://reviews.llvm.org/D43921

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326454 91177308-0d34-0410-b5e6-96231b3b80d8

[PDB] Defer writing the build id until the rest of the PDB is written.

For now this is NFC, but this small refactor opens the door to
letting us embed a hash of the PDB in the build id field of the
PDB.

Differential Revision: https://reviews.llvm.org/D43913

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326453 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Fix the crash in SIRegisterInfo when the register class is not found

Differential revision: https://reviews.llvm.org./D43334

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326451 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Add guest registers

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326450 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] remove stale comments for tests; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326448 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9] Add missing instructions to the Power 9 scheduler

Adding more instructions using InstRW so that we can move away from ItinRW
and ultimately have a complete Power 9 scheduler.

Differential Revision: https://reviews.llvm.org/D43899

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326447 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Update pre-generated test files to match latest llc output. NFC.

The ordering of llc's output was changed in rL326334.

Differential Revision: https://reviews.llvm.org/D43941

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326445 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] simplify code for (X*Y) * X => (X*X) * Y ; NFCI

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326444 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] generate vuzp instead of mov

When a BUILD_VECTOR is created out of a sequence of EXTRACT_VECTOR_ELT with a
specific pattern sequence, either <0, 2, 4, ...> or <1, 3, 5, ...>, replace the
BUILD_VECTOR with either vuzp1 or vuzp2.

With this patch LLVM generates the following code for the first function fun1 in the testcase:
adrp x8, .LCPI0_0
ldr  q0, [x8, :lo12:.LCPI0_0]
tbl  v0.16b, { v0.16b }, v0.16b
ext  v1.16b, v0.16b, v0.16b, #8
uzp1 v0.8b, v0.8b, v1.8b
str  d0, [x8]
ret

Without this patch LLVM currently generates this code:
adrp    x8, .LCPI0_0
ldr     q0, [x8, :lo12:.LCPI0_0]
tbl     v0.16b, { v0.16b }, v0.16b
mov     v1.16b, v0.16b
mov     v1.b[1], v0.b[2]
mov     v1.b[2], v0.b[4]
mov     v1.b[3], v0.b[6]
mov     v1.b[4], v0.b[8]
mov     v1.b[5], v0.b[10]
mov     v1.b[6], v0.b[12]
mov     v1.b[7], v0.b[14]
str     d1, [x8]
ret
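
The index pattern being matched can be sketched as a small classifier over the extracted element indices. `UzpKind` and `classifyUzpMask` are hypothetical names, and the sketch looks only at the indices, not at the vector types or source operands the real lowering has to check.

```cpp
#include <cstddef>
#include <vector>

enum class UzpKind { None, Uzp1, Uzp2 };

// Returns Uzp1 for <0, 2, 4, ...>, Uzp2 for <1, 3, 5, ...>, None otherwise.
inline UzpKind classifyUzpMask(const std::vector<int> &indices) {
  if (indices.empty())
    return UzpKind::None;
  const int start = indices[0];
  if (start != 0 && start != 1)
    return UzpKind::None;
  for (std::size_t i = 0; i < indices.size(); ++i)
    if (indices[i] != start + 2 * static_cast<int>(i))
      return UzpKind::None;
  return start == 0 ? UzpKind::Uzp1 : UzpKind::Uzp2;
}
```

The `mov` sequence in the listing above extracts lanes 0, 2, 4, ..., 14, which this classifier reports as `Uzp1`.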

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326443 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] move/add tests for fmul reassociation; NFC

This transform may be out-of-scope for instcombine,
but this is only documenting the current behavior.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326442 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] auto-generate full checks; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326440 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[DEBUGINFO] Add flag for DWARF2 or less to use sections as references."

This reverts commit r326328 to remove checks for emission of certain
sections after discussion with Eric Christopher.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326436 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeGen] fix argument attribute in lowering statepoint/patchpoint

Summary:
Use the correct loop index variable, ArgI, to retrieve attributes.

Reviewers: thanm, sanjoy, rnk

Reviewed By: rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43832

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326433 91177308-0d34-0410-b5e6-96231b3b80d8

[SCCP] Fix unused variable warning in release builds.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326429 91177308-0d34-0410-b5e6-96231b3b80d8

[dsymutil] Move string pool into its own implementation file. NFC.

The DwarfLinker implementation is already relatively large with over 4k
LOC. This commit moves the implementation of NonRelocatableStringpool
into a separate cpp file.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326425 91177308-0d34-0410-b5e6-96231b3b80d8

[SCEV] Smart range calculation for SCEVUnknown Phis

The range of SCEVUnknown Phi which merges values `X1, X2, ..., XN`
can be evaluated as `U(Range(X1), Range(X2), ..., Range(XN))`.
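
A minimal sketch of the union formula, assuming simple closed, non-wrapping integer intervals. `Range`, `unionHull` and `phiRange` are hypothetical; the ranges SCEV works with can wrap around, which this sketch deliberately ignores.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Closed interval [lo, hi]; a simplification of a real constant range.
struct Range {
  int64_t lo;
  int64_t hi;
};

// Smallest single interval that covers both inputs.
inline Range unionHull(Range a, Range b) {
  return {std::min(a.lo, b.lo), std::max(a.hi, b.hi)};
}

// Range of a phi merging values X1..XN, given Range(X1)..Range(XN).
inline Range phiRange(const std::vector<Range> &incoming) {
  assert(!incoming.empty() && "a phi has at least one incoming value");
  Range result = incoming[0];
  for (std::size_t i = 1; i < incoming.size(); ++i)
    result = unionHull(result, incoming[i]);
  return result;
}
```

The result is conservative: it may include values that no incoming value can take, but it never excludes one that can occur.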

Reviewed By: sanjoy
Differential Revision: https://reviews.llvm.org/D43810

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326418 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Stop passing two arguments by reference. NFC

I think these used to be out parameters, but they haven't been for a while.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326417 91177308-0d34-0410-b5e6-96231b3b80d8

[Support] Fix comments for handleAllErrors: it calls llvm_unreachable if the
contract is violated, not report_fatal_error.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326413 91177308-0d34-0410-b5e6-96231b3b80d8

[XRay] cache symbolized function names for a repeatedly queried function ID

Summary:
Processing 2 GB XRay traces with "llvm-xray convert -symbolize" needs to
go over each trace record and symbolize the function name referred to by
its ID. Currently this happens by asking the LLVM symbolizer code every
single time. A simple cache can save around 30 minutes of processing of
that trace.

llvm-xray's resident memory usage increased negligibly with this cache.
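
The caching idea can be sketched as a memoizing wrapper around an expensive lookup. `CachingSymbolizer` and its callback type are hypothetical and do not reflect how llvm-xray structures its code; the callback stands in for the LLVM symbolizer query.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

class CachingSymbolizer {
public:
  using SlowLookup = std::function<std::string(int32_t)>;

  explicit CachingSymbolizer(SlowLookup lookup) : lookup_(std::move(lookup)) {}

  // Returns the cached name for funcId, calling the slow lookup only the
  // first time a given ID is seen.
  const std::string &symbolize(int32_t funcId) {
    auto it = cache_.find(funcId);
    if (it == cache_.end())
      it = cache_.emplace(funcId, lookup_(funcId)).first;
    return it->second;
  }

  std::size_t cachedCount() const { return cache_.size(); }

private:
  SlowLookup lookup_;
  std::unordered_map<int32_t, std::string> cache_;
};
```

Because a trace typically refers to the same function IDs many times, the cache holds one entry per distinct ID, which fits the small memory increase reported above.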

Reviewers: dberris

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43896

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326407 91177308-0d34-0410-b5e6-96231b3b80d8

[RuntimeDyld][MachO] Fix assertion in encodeAddend, add missing directive to
test case.

r326290 fixed the assertion for decodeAddend, but not encodeAddend. The
regression test failed to catch this because it was missing the
subsections_via_symbols flag, so the desired relocation was not applied.

This patch also fixes the formatting of the assertion from r326290.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326406 91177308-0d34-0410-b5e6-96231b3b80d8

[IPSCCP] do not break musttail invariant (PR36485)

Do not replace results of `musttail` calls with a constant if the
call itself can't be removed.

Do not zap returns of `musttail` callees, if the call site can't be
removed and replaced with a constant.

Do not zap returns of `musttail`-calling blocks, this breaks
invariant too.

Patch by Fedor Indutny

Differential Revision: https://reviews.llvm.org/D43695

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326404 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel][AArch64] Adding -disable-gisel-legality-check CL option

Currently it's impossible to test InstructionSelect pass with MIR which
is considered illegal by the Legalizer in Assert builds. In early stages
of porting an existing backend from SelectionDAG ISel to GlobalISel,
however, we would have very basic CallLowering, Legalizer, and
RegBankSelect implementations, but rather functional Instruction Select
with quite a few patterns selectable due to the semi-automatic porting
process borrowing them from SelectionDAG ISel.

As we are trying to define legality as a property of being selectable by
the instruction selector, it would be nice to be able to easily check
what the selector can do in its current state w/o the legality check
provided by the Legalizer getting in the way.

It also seems beneficial to have a regression testing set up that would
not allow the selector to silently regress in its support of the MIR not
supported yet by the previous passes in the GlobalISel pipeline.

This commit adds -disable-gisel-legality-check command line option to
llc that disables those legality checks in RegBankSelect and
InstructionSelect passes.

It also adds quite a few MIR test cases for AArch64's Instruction
Selector. Every one of them would fail on the legality check at the
moment, but will select just fine if the check is disabled. Every test
MachineFunction is intended to exercise a specific selection rule and
that rule only, encoded in the MachineFunction's name by the rule's
number, ID, and index of its GIM_Try opcode in TableGen'erated
MatchTable (-optimize-match-table=false).

Reviewers: ab, dsanders, qcolombet, rovka

Reviewed By: bogner

Subscribers: kristof.beyls, volkan, aditya_nandakumar, aemerson,
rengolin, t.p.northover, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D42886

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326396 91177308-0d34-0410-b5e6-96231b3b80d8

[DWARF] Emit a split line table only if there are split type units.

A .debug_info.dwo section doesn't use the .debug_line.dwo section.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326395 91177308-0d34-0410-b5e6-96231b3b80d8

[DAE] don't remove args of musttail target/caller

`musttail` requires identical signatures of caller and callee. Removing
arguments breaks `musttail` semantics.

PR36441

Patch by Fedor Indutny

Differential Revision: https://reviews.llvm.org/D43708

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326394 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Make sure we don't combine (fneg (fma X, Y, Z)) to a target specific node when there are no FMA instructions.

This would cause a 'cannot select' error at isel when we should have emitted a lib call and an xor.

Fixes PR36553.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326393 91177308-0d34-0410-b5e6-96231b3b80d8

[NVPTX] Lower loads from global constants using ld.global.nc (aka LDG).

Summary:
After D43914, loads from global variables in addrspace(1) happen with
ld.global. But since they're constants, even better would be to use
ld.global.nc, aka ldg.

Reviewers: tra

Subscribers: jholewinski, sanjoy, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D43915

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326390 91177308-0d34-0410-b5e6-96231b3b80d8

[NVPTX] Use addrspacecast instead of target-specific intrinsics in NVPTXGenericToNVVM.

Summary:
NVPTXGenericToNVVM was using target-specific intrinsics to do address
space casts. Using the addrspacecast instruction is (a lot) simpler.
But it also has the advantage of being understandable to other passes.
In particular, InferAddrSpaces is able to understand these address space
casts and remove them in most cases.

Reviewers: tra

Subscribers: jholewinski, sanjoy, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D43914

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326389 91177308-0d34-0410-b5e6-96231b3b80d8