git.osdn.net Git - android-x86/external-llvm.git/log

git-llvm: Fix incremental population of svn tree.

"svn update --depth=..." is, annoyingly, not a specification of the
desired depth, but rather a _limit_ added on top of the "sticky" depth
in the working-directory. However, if the directory doesn't exist yet,
then it sets the sticky depth of the new directory entries.

Unfortunately, the svn command-line has no way of expanding the depth
of a directory from "empty" to "files", without also removing any
already-expanded subdirectories. The way you're supposed to increase
the depth of an existing directory is via --set-depth, but
--set-depth=files will also remove any subdirs which were already
requested.

This change avoids getting into the state of ever needing to increase
the depth of an existing directory from "empty" to "files" in the
first place, by:

1. Use svn update --depth=files, not --depth=immediates.

The latter has the effect of checking out the subdirectories and
marking them as depth=empty. The former excludes sub-directories from
the list of entries, which avoids the problem.

2. Explicitly populate missing parent directories.

Using --parents seemed nice and easy, but it marks the parent dirs as
depth=empty. Instead, check out parents explicitly if they're missing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347883 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] auto-generate complete checks; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347882 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] auto-generate complete checks; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347881 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Add and update scalar instructions

This patch adds support for S_ANDN2, S_ORN2 32-bit and 64-bit instructions and adds splits to move them to the vector unit (for which there is no equivalent instruction). It modifies the way that the more complex scalar instructions are lowered to vector instructions by first breaking them down to sequences of simpler scalar instructions which are then lowered through the existing code paths. The pattern for S_XNOR has also been updated to apply inversion to one input rather than the output of the XOR as the result is equivalent and may allow leaving the NOT instruction on the scalar unit.

A new tests for NAND, NOR, ANDN2 and ORN2 have been added, and existing tests now hit the new instructions (and have been modified accordingly).

Differential: https://reviews.llvm.org/D54714

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347877 91177308-0d34-0410-b5e6-96231b3b80d8

Fix: Add support for TFE/LWE in image intrinsic

My change svn-id: 347871 caused a buildbot failure due to an unused
variable def (used in an assert).

Change-Id: Ia882d18bb6fa79b4d7bbfda422b9ea5d23eab336

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347876 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r347823 "[TextAPI] Switch back to a custom Platform enum."

It broke the Windows buildbots, e.g.
http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/21829/steps/test/logs/stdio

This also reverts the follow-ups: r347824, r347827, and r347836.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347874 91177308-0d34-0410-b5e6-96231b3b80d8

[CallSiteSplitting] Report edge deletion to DomTreeUpdater

Summary:
When splitting musttail calls, the split blocks' original terminators
get removed; inform the DTU when this happens.

Also add a testcase that fails an assertion in the DTU without this fix.

Reviewers: fhahn, junbuml

Reviewed By: fhahn

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D55027

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347872 91177308-0d34-0410-b5e6-96231b3b80d8

Add support for TFE/LWE in image intrinsics

TFE and LWE support requires extra result registers that are written in the
event of a failure in order to detect that failure case.
The specific use-case that initiated these changes is sparse texture support.

This means that if image intrinsics are used with either option turned on, the
programmer must ensure that the return type can contain all of the expected
results. This can result in redundant registers since the vector size must be a
power-of-2.

This change takes roughly 6 parts:
1. Modify the instruction defs in tablegen to add new instruction variants that
can accomodate the extra return values.
2. Updates to lowerImage in SIISelLowering.cpp to accomodate setting TFE or LWE
(where the bulk of the work for these instruction types is now done)
3. Extra verification code to catch cases where intrinsics have been used but
insufficient return registers are used.
4. Modification to the adjustWritemask optimisation to account for TFE/LWE being
enabled (requires extra registers to be maintained for error return value).
5. An extra pass to zero initialize the error value return - this is because if
the error does not occur, the register is not written and thus must be zeroed
before use. Also added a new (on by default) option to ensure ALL return values
are zero-initialized that is required for sparse texture support.
6. Disable the inst_combine optimization in the presence of tfe/lwe (later TODO
for this to re-enable and handle correctly).

There's an additional fix now to avoid a dmask=0

For an image intrinsic with tfe where all result channels except tfe
were unused, I was getting an image instruction with dmask=0 and only a
single vgpr result for tfe. That is incorrect because the hardware
assumes there is at least one vgpr result, plus the one for tfe.

Fixed by forcing dmask to 1, which gives the desired two vgpr result
with tfe in the second one.

The TFE or LWE result is returned from the intrinsics using an aggregate
type. Look in the test code provided to see how this works, but in essence IR
code to invoke the intrinsic looks as follows:

%v = call {<4 x float>,i32} @llvm.amdgcn.image.load.1d.v4f32i32.i32(i32 15,
i32 %s, <8 x i32> %rsrc, i32 1, i32 0)
%v.vec = extractvalue {<4 x float>, i32} %v, 0
%v.err = extractvalue {<4 x float>, i32} %v, 1

Differential revision: https://reviews.llvm.org/D48826

Change-Id: If222bc03642e76cf98059a6bef5d5bffeda38dda

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347871 91177308-0d34-0410-b5e6-96231b3b80d8

[CVP] tidy processCmp(); NFC

1. The variables were confusing: 'C' typically refers to a constant, but here it was the Cmp.
2. Formatting violations.
3. Simplify code to return true/false constant.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347868 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[LICM] Enable control flow hoisting by default" and "[LICM] Reapply r347190 "Make LICM able to hoist phis" with fix"

This reverts commits r347776 and r347778.

The first one, r347776, caused significant compile time regressions
for certain input files, see PR39836 for details.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347867 91177308-0d34-0410-b5e6-96231b3b80d8

[CVP] auto-generate complete test checks; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347866 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r347596 "Support for inserting profile-directed cache prefetches"

It causes asserts building BoringSSL. See https://crbug.com/91009#c3 for
repro.

This also reverts the follow-ups:
Revert r347724 "Do not insert prefetches with unsupported memory operands."
Revert r347606 "[X86] Add dependency from X86 to ProfileData after rL347596"
Revert r347607 "Add new passes to X86 pipeline tests"

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347864 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel] Fix insertion of stack-protector epilogue

* Tell the StackProtector pass to generate the epilogue instrumentation
  when GlobalISel is enabled because GISel currently does not implement
  the same deferred epilogue insertion as SelectionDAG.
* Update StackProtector::InsertStackProtectors() to find a stack guard
  slot by searching for the llvm.stackprotector intrinsic when the
  prologue was not created by StackProtector itself but the pass still
  needs to generate the epilogue instrumentation. This fixes a problem
  when the pass would abort because the stack guard AllocInst pointer
  was null when generating the epilogue -- test
  CodeGen/AArch64/GlobalISel/arm64-irtranslator-stackprotect.ll.

Differential Revision: https://reviews.llvm.org/D54518

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347862 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel] Make EnableGlobalISel always set when GISel is enabled

Change meaning of TargetOptions::EnableGlobalISel. The flag was
previously set only when a target switched on GlobalISel but it is now
always set when the GlobalISel pipeline is enabled. This makes the flag
consistent with TargetOptions::EnableFastISel and allows its use in
other parts of the compiler to determine when GlobalISel is enabled.

The EnableGlobalISel flag had previouly only one use in
TargetPassConfig::isGlobalISelAbortEnabled(). The method used its value
to determine if GlobalISel was enabled by a target and returned false in
such a case. To preserve the current behaviour, a new flag
TargetOptions::GlobalISelAbort is introduced to separately record the
abort behaviour.

Differential Revision: https://reviews.llvm.org/D54518

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347861 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-rc] Support EXSTYLE statement.

Patch by Jacek Caban!

Differential Revision: https://reviews.llvm.org/D55020

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347858 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca][MC] Add the ability to declare which processor resources model load/store queues (PR36666).

This patch adds the ability to specify via tablegen which processor resources
are load/store queue resources.

A new tablegen class named MemoryQueue can be optionally used to mark resources
that model load/store queues.  Information about the load/store queue is
collected at 'CodeGenSchedule' stage, and analyzed by the 'SubtargetEmitter' to
initialize two new fields in struct MCExtraProcessorInfo named `LoadQueueID` and
`StoreQueueID`.  Those two fields are identifiers for buffered resources used to
describe the load queue and the store queue.
Field `BufferSize` is interpreted as the number of entries in the queue, while
the number of units is a throughput indicator (i.e. number of available pickers
for loads/stores).

At construction time, LSUnit in llvm-mca checks for the presence of extra
processor information (i.e. MCExtraProcessorInfo) in the scheduling model.  If
that information is available, and fields LoadQueueID and StoreQueueID are set
to a value different than zero (i.e. the invalid processor resource index), then
LSUnit initializes its LoadQueue/StoreQueue based on the BufferSize value
declared by the two processor resources.

With this patch, we more accurately track dynamic dispatch stalls caused by the
lack of LS tokens (i.e. load/store queue full). This is also shown by the
differences in two BdVer2 tests. Stalls that were previously classified as
generic SCHEDULER FULL stalls, are not correctly classified either as "load
queue full" or "store queue full".

About the differences in the -scheduler-stats view: those differences are
expected, because entries in the load/store queue are not released at
instruction issue stage. Instead, those are released at instruction executed
stage.  This is the main reason why for the modified tests, the load/store
queues gets full before PdEx is full.

Differential Revision: https://reviews.llvm.org/D54957

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347857 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/InsertWaitcnts: Remove the dependence on MachineLoopInfo

Summary:
MachineLoopInfo cannot be relied on for correctness, because it cannot
properly recognize loops in irreducible control flow which can be
introduced by late machine basic block optimization passes. See the new
test case for the reduced form of an example that occurred in practice.

Use a simple fixpoint iteration instead.

In order to facilitate this change, refactor WaitcntBrackets so that it
only tracks pending events and registers, rather than also maintaining
state that is relevant for the high-level algorithm. Various accessor
methods can be removed or made private as a consequence.

Affects (in radv):
- dEQP-VK.glsl.loops.special.{for,while}_uniform_iterations.select_iteration_count_{fragment,vertex}

Fixes: r345719 ("AMDGPU: Rewrite SILowerI1Copies to always stay on SALU")

Reviewers: msearles, rampitec, scott.linder, kanarayan

Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits, hakzsam

Differential Revision: https://reviews.llvm.org/D54231

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347853 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/InsertWaitcnt: Consistently use uint32_t for scores / time points

Summary:
There is one obsolete reference to using -1 as an indication of "unknown",
but this isn't actually used anywhere.

Using unsigned makes robust wrapping checks easier.

Reviewers: msearles, rampitec, scott.linder, kanarayan

Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, llvm-commits, tpr, t-tye, hakzsam

Differential Revision: https://reviews.llvm.org/D54230

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347852 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/InsertWaitcnt: Remove unused WaitAtBeginning

Reviewers: msearles, rampitec, scott.linder, kanarayan

Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits, hakzsam

Differential Revision: https://reviews.llvm.org/D54229

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347851 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/InsertWaitcnts: Simplify pending events tracking

Summary:
Instead of storing the "score" (last time point) of the various relevant
events, only store whether an event is pending or not.

This is sufficient, because whenever only one event of a count type is
pending, its last time point is naturally the upper bound of all time
points of this count type, and when multiple event types are pending,
the count type has gone out of order and an s_waitcnt to 0 is required
to clear any pending event type (and will then clear all pending event
types for that count type).

This also removes the special handling of GDS_GPR_LOCK and EXP_GPR_LOCK.
I do not understand what this special handling ever attempted to achieve.
It has existed ever since the original port from an internal code base,
so my best guess is that it solved a problem related to EXEC handling in
that internal code base.

Reviewers: msearles, rampitec, scott.linder, kanarayan

Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits, hakzsam

Differential Revision: https://reviews.llvm.org/D54228

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347850 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/InsertWaitcnts: Use foreach loops for inst and wait event types

Summary:
It hides the type casting ugliness, and I happened to have to add a new
such loop (in a later patch).

Reviewers: msearles, rampitec, scott.linder, kanarayan

Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits, hakzsam

Differential Revision: https://reviews.llvm.org/D54227

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347849 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/InsertWaitcnts: Untangle some semi-global state

Summary:
Reduce the statefulness of the algorithm in two ways:

1. More clearly split generateWaitcntInstBefore into two phases: the
   first one which determines the required wait, if any, without changing
   the ScoreBrackets, and the second one which actually inserts the wait
   and updates the brackets.

2. Communicate pre-existing s_waitcnt instructions using an argument to
   generateWaitcntInstBefore instead of through the ScoreBrackets.

To simplify these changes, a Waitcnt structure is introduced which carries
the counts of an s_waitcnt instruction in decoded form.

There are some functional changes:

1. The FIXME for the VCCZ bug workaround was implemented: we only wait for
   SMEM instructions as required instead of waiting on all counters.

2. We now properly track pre-existing waitcnt's in all cases, which leads
   to less conservative waitcnts being emitted in some cases.

     s_load_dword ...
     s_waitcnt lgkmcnt(0)    <-- pre-existing wait count
     ds_read_b32 v0, ...
     ds_read_b32 v1, ...
     s_waitcnt lgkmcnt(0)    <-- this is too conservative
     use(v0)
     more code
     use(v1)

   This increases code size a bit, but the reduced latency should still be a
   win in basically all cases. The worst code size regressions in my shader-db
   are:

WORST REGRESSIONS - Code Size
Before After     Delta Percentage
   1724  1736        12    0.70 %   shaders/private/f1-2015/1334.shader_test [0]
   2276  2284         8    0.35 %   shaders/private/f1-2015/1306.shader_test [0]
   4632  4640         8    0.17 %   shaders/private/ue4_elemental/62.shader_test [0]
   2376  2384         8    0.34 %   shaders/private/f1-2015/1308.shader_test [0]
   3284  3292         8    0.24 %   shaders/private/talos_principle/1955.shader_test [0]

Reviewers: msearles, rampitec, scott.linder, kanarayan

Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits, hakzsam

Differential Revision: https://reviews.llvm.org/D54226

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347848 91177308-0d34-0410-b5e6-96231b3b80d8

[CODE_OWNERS] Add myself as code owner for MinGW

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347847 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Add two XFAIL tests from PR39783

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347845 91177308-0d34-0410-b5e6-96231b3b80d8

Disable TermFolding in LoopSimplifyCFG until PR39783 is fixed

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347844 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopStrengthReduce] ComplexityLimit as an option

Convert ComplexityLimit into a command line value.

Differential Revision: https://reviews.llvm.org/D54899

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347843 91177308-0d34-0410-b5e6-96231b3b80d8

[Inliner] Modify the merging of min-legal-vector-width attribute to better handle when the caller or callee don't have the attribute.

Lack of an attribute means that the function hasn't been checked for what vector width it requires. So if the caller or the callee doesn't have the attribute we should make sure the combined function after inlining does not have the attribute.

If the caller already doesn't have the attribute we can just avoid adding it. Otherwise if the callee doesn't have the attribute just remove the caller's attribute.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347841 91177308-0d34-0410-b5e6-96231b3b80d8

[Inliner] Add test for merging of min-legal-vector-width function attribute.

This should have been added in r337844, but apparently was I failed to 'git add' the file.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347840 91177308-0d34-0410-b5e6-96231b3b80d8

[CGP] Improve compile time for complex addressing mode

This is a fix for PR39625 with improvement the compile time
by reducing the number of intermediate Phi nodes created.

Reviewers: john.brawn, reames
Reviewed By: john.brawn
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D54932

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347839 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[TextAPI] Fix a memory leak in the TBD reader."

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347838 91177308-0d34-0410-b5e6-96231b3b80d8

[TextAPI] Fix a memory leak in the TBD reader.

This fixes an issue where we were leaking the YAML document if there was a
parsing error.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347837 91177308-0d34-0410-b5e6-96231b3b80d8

[TextAPI] Switch back to a custom Platform enum.

Moving to PlatformType from BinaryFormat had some UB fallout when handing
unknown platforms or malformed input files.

This should fix the sanitizer bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347836 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Correct comment. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347835 91177308-0d34-0410-b5e6-96231b3b80d8

Add Hurd target to LLVMSupport (1/2)

Add the required target triples to LLVMSupport to support Hurd
in LLVM (formally `pc-hurd-gnu`).

Patch by sthibaul (Samuel Thibault)

Differential Revision: https://reviews.llvm.org/D54378

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347832 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Fix a conversion is not considered when the ISD::BR_CC node making the instruction selection

Summary:
A signed comparison of i1 values produces the opposite result to an unsigned one if the condition code
includes less-than or greater-than. This is so because 1 is the most negative signed i1 number and the
most positive unsigned i1 number. The CR-logical operations used for such comparisons are non-commutative
so for signed comparisons vs. unsigned ones, the input operands just need to be swapped.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D54825

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347831 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] [NFC] Add test cases to the ISD::BR_CC node in the instruction selection
Add the following test case for the ISD::BR_CC node in the instruction selection
define i64 @testi64slt(i64 %c1, i64 %c2, i64 %c3, i64 %c4, i64 %a1, i64 %a2) #0 {
entry:
  %cmp1 = icmp eq i64 %c3, %c4
  %cmp3tmp = icmp eq i64 %c1, %c2
  %cmp3 = icmp slt i1 %cmp3tmp, %cmp1
  br i1 %cmp3, label %iftrue, label %iffalse
iftrue:
  ret i64 %a1
iffalse:
  ret i64 %a2
}
The data type i64 can be replaced by i32, i64, float, double 
And condition codes can be replaced by: SETEQ, SETEN, SELT, SETLE, SETGT, SETGE,SETULT, SETULE, SSETGT, and SETUGE

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D54824

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347828 91177308-0d34-0410-b5e6-96231b3b80d8

[TextAPI] TBD Reader/Writer (bot fixes: take 2)

Replace the tuple with a struct to work around an explicit constructor bug.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347827 91177308-0d34-0410-b5e6-96231b3b80d8

NFC. Use unsigned type for uses counter in CaptureTracking

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347826 91177308-0d34-0410-b5e6-96231b3b80d8

[TextAPI] TBD Reader/Writer (bot fixes)

Trying if switching from a vector to an array will appeas the bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347824 91177308-0d34-0410-b5e6-96231b3b80d8

[TextAPI] TBD Reader/Writer

Add basic infrastructure for reading and writting TBD files (version 1 - 3).

The TextAPI library is not used by anything yet (besides the unit tests). Tool
support will be added in a separate commit.

The TBD format is currently documented in the implementation file (TextStub.cpp).

https://reviews.llvm.org/D53945

Update: This contains changes to fix issues discovered by the bots:
- add parentheses to silence warnings.
- rename variables
- use PlatformType from BinaryFormat

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347823 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] try select simplification for target-specific nodes

This failed to select (which might be a separate bug) in
X86ISelDAGToDAG because we try to create a select node
that can be simplified away after rL347227.

This change avoids the problem by simplifying the SHRUNKBLEND
node sooner. In the test case, we manage to realize that the
true/false values of the select (SHRUNKBLEND) are the same thing,
so it simplifies away completely.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347818 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[TextAPI] TBD Reader/Writer"

Reverting to unbreak bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347809 91177308-0d34-0410-b5e6-96231b3b80d8

[TextAPI] TBD Reader/Writer

Add basic infrastructure for reading and writting TBD files (version 1 - 3).

The TextAPI library is not used by anything yet (besides the unit tests). Tool
support will be added in a separate commit.

The TBD format is currently documented in the implementation file (TextStub.cpp).

https://reviews.llvm.org/D53945

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347808 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] IR/Bitcode changes for DISubprogram flags.

Packing the flags into one bitcode word will save effort in
adding new flags in the future.

Differential Revision: https://reviews.llvm.org/D54755

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347806 91177308-0d34-0410-b5e6-96231b3b80d8

Reapply "[llvm-mca] Return the total number of cycles from method Pipeline::run()."

This reapplies r347767 (originally reviewed at: https://reviews.llvm.org/D55000)
with a fix for the missing std::move of the Error returned by the call to
Pipeline::runCycle().

Below is the original commit message from r347767.

If a user only cares about the overall latency, then the best/quickest way is to
change method Pipeline::run() so that it returns the total number of cycles to
the caller.

When the simulation pipeline is run, the number of cycles (or an error) is
returned from method Pipeline::run().
The advantage is that no hardware event listener is needed for computing that
latency. So, the whole process should be faster (and simpler - at least for that
particular use case).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347795 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Make X86TTIImpl::getCastInstrCost properly handle the case where AVX512 is enabled, but 512-bit vectors aren't legal.

Unlike most cost model functions this code makes a lot of table lookups without using the results from getTypeLegalizationCost. This means 512-bit vectors can be looked up even when the type isn't legal.

This patch adds a check around the two tables that contain 512-bit types to make sure that neither of the types would be split by type legalization. Meaning 512 bit types are illegal. I wanted to write this in a somewhat generic way that uses type legalization query hooks. But if prefered, I can switch to just using is512BitVector and the subtarget feature.

Differential Revision: https://reviews.llvm.org/D54984

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347786 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add some cost model entries for sext/zext for avx512bw

This fixes some of scalarization costs reported for sext/zext using avx512bw. This does not fix all scalarization costs being reported. Just the worst.

I've restricted this only to combinations of types that are legal with avx512bw like v32i1/v64i1/v32i16/v64i8 and conversions between vXi1 and vXi8/vXi16 with legal vXi8/vXi16 result types.

Differential Revision: https://reviews.llvm.org/D54979

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347785 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add a combine for back to back VSRAI instructions

Expansion of SIGN_EXTEND_INREG can create a VSRAI instruction. If there is already a VSRAI after it, we should combine them into a larger VSRAI

Differential Revision: https://reviews.llvm.org/D54959

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347784 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Give inlinable calls DILocs (PR39807)

In PR39807 we incorrectly handle circumstances where calls are common'd
from conditional blocks into the parent BB. Calls that can be inlined
must always have DebugLocs, however we strip them during commoning, which
the IR verifier asserts on.

Fix this by using applyMergedLocation: it will perform the same DebugLoc
stripping of conditional Locs, but will also generate an unknown location
DebugLoc that satisfies the requirement for inlinable calls to always have
locations.

Some of the prior logic for selecting a DebugLoc is now likely redundant;
I'll generate a follow-up to remove it (involves editing more regression
tests).

Differential Revision: https://reviews.llvm.org/D54997

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347782 91177308-0d34-0410-b5e6-96231b3b80d8

[LICM] Enable control flow hoisting by default

Differential Revision: https://reviews.llvm.org/D54949

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347778 91177308-0d34-0410-b5e6-96231b3b80d8

[LICM] Reapply r347190 "Make LICM able to hoist phis" with fix

This commit caused failures because it failed to correctly handle cases where
we hoist a phi, then hoist a use of that phi, then have to rehoist that use. We
need to make sure that we rehoist the use to _after_ the hoisted phi, which we
do by always rehoisting to the immediate dominator instead of just rehoisting
everything to the original preheader.

An option is also added to control whether control flow is hoisted, which is
off in this commit but will be turned on in a subsequent commit.

Differential Revision: https://reviews.llvm.org/D52827

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347776 91177308-0d34-0410-b5e6-96231b3b80d8

Revert [llvm-mca] Return the total number of cycles from method Pipeline::run().

This reverts commits 347767.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347775 91177308-0d34-0410-b5e6-96231b3b80d8

[RISCV] Support .option push and .option pop

This adds support in the RISCVAsmParser the storing of Subtarget feature bits to a stack so that they can be pushed/popped to enable/disable multiple features at once.

Differential Revision: https://reviews.llvm.org/D46424
Patch by Lewis Revill.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347774 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Combine saturating add/sub with constant operands

Combine
sat(sat(X + C1) + C2) -> sat(X + (C1+C2))
and
sat(sat(X - C1) - C2) -> sat(X - (C1+C2))
if the sign of C1 and C2 matches.

In the unsigned case we can compute C1+C2 with saturating arithmetic,
and InstSimplify will reduce this just to the saturation value. For
the signed case, we cannot perform the simplification if the result
of the addition overflows.

This change is part of https://reviews.llvm.org/D54534.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347773 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Canonicalize ssub.sat to sadd.sat

Canonicalize ssub.sat(X, C) to ssub.sat(X, -C) if C is constant and
not signed minimum. This will help further optimizations to apply.

This change is part of https://reviews.llvm.org/D54534.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347772 91177308-0d34-0410-b5e6-96231b3b80d8

[ValueTracking] Determine always-overflow condition for unsigned sub

Always-overflow was already determined for unsigned addition, but
not subtraction. This patch establishes parity.

This allows us to perform some additional simplifications for
signed saturating subtractions.

This change is part of https://reviews.llvm.org/D54534.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347771 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Use known overflow information for saturating add/sub

If ValueTracking can determine that the add/sub can newer overflow,
replace it with the corresponding nuw/nsw add/sub.

Additionally, for the unsigned case, if ValueTracking determines
that the add/sub always overflows, replace the result with the
saturation value.

This change is part of https://reviews.llvm.org/D54534.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347770 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Canonicalize const arg for saturating adds

If a saturating add intrinsic has one constant argument, make sure
it is on the RHS. This will simplify further transformations.

This change is part of https://reviews.llvm.org/D54534.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347769 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Add missing flags to ELF YAMLIO

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347768 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] Return the total number of cycles from method Pipeline::run().

If a user only cares about the overall latency, then the best/quickest way is to
change method Pipeline::run() so that it returns the total number of cycles to
the caller.

When the simulation pipeline is run, the number of cycles (or an error) is
returned from method Pipeline::run().
The advantage is that no hardware event listener is needed for computing that
latency. So, the whole process should be faster (and simpler - at least for that
particular use case).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347767 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-git: More tweaks.

On python3, use bytes for reading and applying the patch file, rather
than str. This fixes encoding issues when applying patches with
python3.X (reported by zturner).

Also, simplify and speed up "svn update" via svn's "--parents"
argument, instead of manually computing and supplying the list of
parent directories to update.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347766 91177308-0d34-0410-b5e6-96231b3b80d8

Fix DynamicLibraryTests build on Windows when LLVM_EXPORT_SYMBOLS_FOR_PLUGINS is ON

extract_symbols.py (introduced in D18826) expects all of its library arguments
to be in the same directory - typically <config>/lib. DynamicLibraryLib.lib is
instead to be found in lib/<config>.
This patch intended to make DynamicLibraryLib.lib be created in <config>/lib
alongside most of the other libraries.

I previously tried passing absolute paths to extract_symbols.py but this
generated command lines that were too long for Visual Studio 2015: D54587

Differential Revision: https://reviews.llvm.org/D54701

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347764 91177308-0d34-0410-b5e6-96231b3b80d8

[ThinLTO] Correct linkonce_any function import linkage. NFC.

Summary:
This is a NFC as we do not import non-odr vague linkage when computing
for import list for a module.

Reviewers: tejohnson, pcc

Subscribers: inglorion, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D54928

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347763 91177308-0d34-0410-b5e6-96231b3b80d8

Fix build error due to missing cctype include
in ARMTargetParser.cpp.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347762 91177308-0d34-0410-b5e6-96231b3b80d8

[SLP]Fix PR39774: Set ReductionRoot if the original instruction is vectorized.

Summary:
If the original reduction root instruction was vectorized, it might be
removed from the tree. It means that the insertion point may become
invalidated and the whole vectorization of the reduction leads to the
incorrect output result.
The ReductionRoot instruction must be marked as externally used so it
could not be removed. Otherwise it might cause inconsistency with the
cost model and we may end up with too optimistic optimization.

Reviewers: RKSimon, spatel, hfinkel, mkuper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D54955

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347759 91177308-0d34-0410-b5e6-96231b3b80d8

Fix -Winfinite-recursion compile error.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347749 91177308-0d34-0410-b5e6-96231b3b80d8

Fix build of r347741 by adding missing vector
include to ARMTargetParser.h.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347748 91177308-0d34-0410-b5e6-96231b3b80d8

[MachineScheduler] Add support for clustering mem ops with FI base operands

Before this patch, the following stores in `merge_fail` would fail to be
merged, while they would get merged in `merge_ok`:

```
void use(unsigned long long *);
void merge_fail(unsigned key, unsigned index)
{
  unsigned long long args[8];
  args[0] = key;
  args[1] = index;
  use(args);
}
void merge_ok(unsigned long long *dst, unsigned a, unsigned b)
{
  dst[0] = a;
  dst[1] = b;
}
```

The reason is that `getMemOpBaseImmOfs` would return false for FI base
operands.

This adds support for this.

Differential Revision: https://reviews.llvm.org/D54847

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347747 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeGen][NFC] Make `TII::getMemOpBaseImmOfs` return a base operand

Currently, instructions doing memory accesses through a base operand that is
not a register can not be analyzed using `TII::getMemOpBaseRegImmOfs`.

This means that functions such as `TII::shouldClusterMemOps` will bail
out on instructions using an FI as a base instead of a register.

The goal of this patch is to refactor all this to return a base
operand instead of a base register.

Then in a separate patch, I will add FI support to the mem op clustering
in the MachineScheduler.

Differential Revision: https://reviews.llvm.org/D54846

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347746 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Rename EmitDebugThreadLocal back to EmitDebugValue. NFC

This reverts r294500. DwarfCompileUnit::addAddressExpr uses DIEExpr
for PCOffset. In that case the expression is unrelated to thread locals
and so emitting a value of the DIEExpr does not have to always mean
emit-debug-thread-local.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347744 91177308-0d34-0410-b5e6-96231b3b80d8

[TableGen] Better error checking for TIED_TO constraints.

There are quite strong constraints on how you can use the TIED_TO
constraint between MC operands, many of which are currently not
checked until compiler run time.

MachineVerifier enforces that operands can only be tied together in
pairs (no three-way ties), and MachineInstr::tieOperands enforces that
one of the tied operands must be an output operand (def) and the other
must be an input operand (use).

Now we check these at TableGen time, so that if you violate any of
them in a new instruction definition, you find out immediately,
instead of having to wait until you compile something that makes code
generation hit one of those assertions.

Also in this commit, all the error reports in ParseConstraint now
include the name and source location of the def where the problem
happened, so that if you do trigger any of these errors, it's easier
to find the part of your TableGen input where you made the mistake.

The trunk sources already build successfully with this additional
error check, so I think no in-tree target has any of these problems.

Reviewers: fhahn, lhames, nhaehnle, MatzeB

Reviewed By: MatzeB

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53815

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347743 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM, AArch64] Move ARM/AArch64 target parsers into
separate files to enable future changes.

This moves ARM and AArch64 target parsing into their
own files. They are still accessible through
TargetParser.h as before.

Several functions in AArch64 which were just forwarders to ARM
have been removed. All except AArch64::getFPUName were unused,
and that was only used in a test. Which itself was overlapping
one in ARM, so it has also been removed.

Differential revision: https://reviews.llvm.org/D53980

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347741 91177308-0d34-0410-b5e6-96231b3b80d8

[SystemZ::TTI] Improve cost for compare of i64 with extended i32 load

CGF/CLGF compares an i64 register with a sign/zero extended loaded i32 value
in memory.

This patch makes such a load considered foldable and so gets a 0 cost.

Review: Ulrich Weigand
https://reviews.llvm.org/D54944

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347735 91177308-0d34-0410-b5e6-96231b3b80d8

[SystemZ::TTI] Improve costs for i16 add, sub and mul against memory.

AH, SH and MH costs are already covered in the cases where LHS is 32 bits and
RHS is 16 bits of memory sign-extended to i32.

As these instructions are also used when LHS is i16, this patch recognizes
that the loads will get folded then as well.

Review: Ulrich Weigand
https://reviews.llvm.org/D54940

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347734 91177308-0d34-0410-b5e6-96231b3b80d8

[SystemZ::TTI] Improved cost values for comparison against memory.

Single instructions exist for i8 and i16 comparisons of memory against a
small immediate.

This patch makes sure that if the load in these cases has a single user (the
ICmp), it gets a 0 cost (folded), and also that the ICmp gets a cost of 1.

Review: Ulrich Weigand
https://reviews.llvm.org/D54897

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347733 91177308-0d34-0410-b5e6-96231b3b80d8

[SystemZ::TTI] Return zero cost for scalar load/store connected with a bswap.

Since byte-swapping loads and stores are supported, a 'load -> bswap' or
'bswap -> store' sequence should have the cost of one.

Review: Ulrich Weigand
https://reviews.llvm.org/D54870

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347732 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] Hook up the -V alias to --version, output "GNU strip"

This allows libtool to detect the presence of llvm-strip and use
it with the options --strip-debug and --strip-unneeded.

Also hook up the -V alias for objcopy.

Differential Revision: https://reviews.llvm.org/D54936

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347731 91177308-0d34-0410-b5e6-96231b3b80d8

Do not insert prefetches with unsupported memory operands.

Summary:
Ignore advices where the memory operand of the 'anchor' instruction
uses unsupported register types.

Reviewers: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D54983

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347724 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add test cases to show that we don't properly take -mprefer-vector-width=256 and -min-legal-vector-width=256 into account when costing sext/zext.

The check lines marked AVX256 in the zext256/sext256 functions should be closer to the AVX values which would take into account a splitting cost.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347722 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add exhaustive cost model testing for sext/zext for all vector types we reasonably support. Add cost model tests for truncating to vXi1.

Our sext/zext cost modeling was somewhat incomplete. And had no coverage for the fact that avx512bw v32i16/v64i8 types return a scalarization cost.

Truncates are a whole different mess because isTruncateFree is returning true for vectors when it shouldn't and that's the fall back for anything not in the tables.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347719 91177308-0d34-0410-b5e6-96231b3b80d8

[TableGen] Improve readability of generated code (NFC)

Improve the readability of the generated code for `MCOpcodeSwitchStatement`.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347707 91177308-0d34-0410-b5e6-96231b3b80d8

[TableGen] Refactor macro names (NFC)

Make the names for the macros for `TargetInstrInfo` uniform.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347706 91177308-0d34-0410-b5e6-96231b3b80d8

[yaml2obj] Treat COFF/ARM64 as a 64 bit architecture

Differential Revision: https://reviews.llvm.org/D54935

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347703 91177308-0d34-0410-b5e6-96231b3b80d8

[gn build] Add enough build files to be able to build llvm-tblgen.

Adds build files for:

- llvm/lib/DebugInfo/CodeView
- llvm/lib/DebugInfo/MSF
- llvm/lib/MC
- llvm/lib/TableGen
- llvm/utils/TableGen

All the build files just list sources and deps and are uninteresting.

Differential Revision: https://reviews.llvm.org/D54931

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347702 91177308-0d34-0410-b5e6-96231b3b80d8

[clang][slh] add attribute for speculative load hardening

Summary:
Resubmit this with no changes because I think the build was broken
by a different diff.
-----
The prior diff had to be reverted because there were two tests
that failed. I updated the two tests in this diff

clang/test/Misc/pragma-attribute-supported-attributes-list.test
clang/test/SemaCXX/attr-speculative-load-hardening.cpp

----- Summary from Previous Diff (Still Accurate) -----

LLVM IR already has an attribute for speculative_load_hardening. Before
this commit, when a user passed the -mspeculative-load-hardening flag to
Clang, every function would have this attribute added to it. This Clang
attribute will allow users to opt into SLH on a function by function basis.

This can be applied to functions and Objective C methods.

Reviewers: chandlerc, echristo, kristof.beyls, aaron.ballman

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D54915

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347701 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Add tests for saturating add/sub; NFC

These are baseline tests for D54534.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347700 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add cost model tests for experimental.vector.reduce.* with -x86-experimental-vector-widening-legalization

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347697 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add cost model test for masked load an store with -x86-experimental-vector-widening-legalization

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347696 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add cost model tests for fp_to_int/int_to_fp with -x86-experimental-vector-widening-legalization

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347695 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add cost model tests for shifts with -x86-experimental-vector-widening-legalization.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347694 91177308-0d34-0410-b5e6-96231b3b80d8

[lit] Pass more environment variables through to child processes.

This arose when I was trying to have a substitution which invoked a
python script P, and that python script tried to invoke clang-cl (or
even cl). Since we invoke P with a custom environment, it doesn't
inherit the environment of the parent, and then when we go to invoke
clang-cl, it's unable to find the MSVC installation directory. There
were many more I could have passed through which are set by vcvarsall,
but I tried to keep it simple and only pass through the important ones.

Differential Revision: https://reviews.llvm.org/D54963

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347691 91177308-0d34-0410-b5e6-96231b3b80d8

Add missing error checking code intended for r347687

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347690 91177308-0d34-0410-b5e6-96231b3b80d8

[PDB] Add symbol records in bulk

Summary:
This speeds up linking clang.exe/pdb with /DEBUG:GHASH by 31%, from
12.9s to 9.8s.

Symbol records are typically small (16.7 bytes on average), but we
processed them one at a time. CVSymbol is a relatively "large" type. It
wraps an ArrayRef<uint8_t> with a kind an optional 32-bit hash, which we
don't need. Before this change, each DbiModuleDescriptorBuilder would
maintain an array of CVSymbols, and would write them individually with a
BinaryItemStream.

With this change, we now add symbols that happen to appear contiguously
in bulk. For each .debug$S section (roughly one per function), we
allocate two copies, one for relocation, and one for realignment
purposes. For runs of symbols that go in the module stream, which is
most symbols, we now add them as a single ArrayRef<uint8_t>, so the
vector DbiModuleDescriptorBuilder is roughly linear in the number of
.debug$S sections (O(# funcs)) instead of the number of symbol records
(very large).

Some stats on symbol sizes for the curious:
  PDB size: 507M
  sym bytes: 316,508,016
  sym count:  18,954,971
  sym byte avg: 16.7

As future work, we may be able to skip copying symbol records in the
linker for realignment purposes if we make LLVM write them aligned into
the object file. We need to double check that such symbol records are
still compatible with link.exe, but if so, it's definitely worth doing,
since my profile shows we spend 500ms in memcpy in the symbol merging
code. We could potentially cut that in half by saving a copy.
Alternatively, we could apply the relocations *after* we iterate the
symbols. This would require some careful re-engineering of the
relocation processing code, though.

Reviewers: zturner, aganea, ruiu

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D54554

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347687 91177308-0d34-0410-b5e6-96231b3b80d8

[TableGen] Preprocessing support

Differential Revision: https://reviews.llvm.org/D54926

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347686 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Replace an APInt that is guaranteed to be 8-bits with just an 'unsigned'

We're already mixing this APInt with other 'unsigned' variables. This allows us to use regular comparison operators instead of needing to use APInt::ult or APInt::uge. And it removes a later conversion from APInt to unsigned.

I might be adding another combine to this function and this will probably simplify the logic required for that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347684 91177308-0d34-0410-b5e6-96231b3b80d8

[PartialInliner] Make PHIs free in cost computation.

InlineCost also treats them as free and the current implementation
can cause assertion failures if PHI nodes are moved outside the region
from entry BBs to the region.

It also updates the code to use the instructionsWithoutDebug iterator.

Reviewers: davidxl, davide, vsk, graham-yiu-huawei

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D54748

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347683 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add cascade lake arch in X86 target.

This is skylake-avx512 with the addition of avx512vnni ISA.

Patch by Jianping Chen

Differential Revision: https://reviews.llvm.org/D54785

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347681 91177308-0d34-0410-b5e6-96231b3b80d8

Documentation: add \file markup as needed.

This makes Doxygen correctly associate the doc comment with the current
file rather than adding to the documentation for namespace llvm.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347679 91177308-0d34-0410-b5e6-96231b3b80d8

[Demangle] remove itaniumFindTypesInMangledName

Summary:
This (very specialized) function was added to enable an LLDB use case.
Now that a more generic interface (overriding of parser functions -
D52992) is available, and LLDB has been converted to use that (D54074),
the function is unused and can be removed.

Reviewers: erik.pilkington, sgraenitz, rsmith

Subscribers: mgorny, hiraditya, christof, libcxx-commits, llvm-commits

Differential Revision: https://reviews.llvm.org/D54893

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347670 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] pass -dispatch-stats flag to a couple of tests. NFC

This change is in preparation for a patch that fixes PR36666.

llvm-mca currently doesn't know if a buffered processor resource describes a
load or store queue. So, any dynamic dispatch stall caused by the lack of
load/store queue entries is normally reported as a generic SCHEDULER stall. See for
example the -dispatch-stats output from the two tests modified by this patch.

In future, processor models will be able to tag processor resources that are
used to describe load/store queues. That information would then be used by
llvm-mca to correctly classify dynamic dispatch stalls caused by the lack of
tokens in the LS.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347662 91177308-0d34-0410-b5e6-96231b3b80d8