git.osdn.net Git - android-x86/external-swiftshader.git/log

Subzero: Add a new document describing the register allocator.

There's some good work in Subzero's register allocation, which deserves to be captured in a single place.

BUG= none
R=kschimpf@google.com

Review URL: https://codereview.chromium.org/2277493003 .

Optimize single-vector shuffling.

Change-Id: Id3d40a72cb74c75ef4431e6af8855e08bde2bb5c
Reviewed-on: https://chromium-review.googlesource.com/433329
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>

Perform address optimization of sub-vector load/store.

Change-Id: I3459f9a5472aba37e1b7016b27403094e17bb9f7
Reviewed-on: https://chromium-review.googlesource.com/433372
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>

Optimize x86-64 addressing with a 32-bit unsigned index.

Change-Id: I65aff3da87dfb9c3e5db58621a1a02944a6065e8
Reviewed-on: https://chromium-review.googlesource.com/433365
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>

[SubZero] Fix code generation of AtomicCmpxchg

The patch fixes a code generation issue that occurred in the second iteration of the LL-SC loop.

R=stichnot@chromium.org

Patch from Jaydeep Patil <jaydeep.patil@imgtec.com>.

Review-Url: https://codereview.chromium.org/2656723003 .

Work around empty set default parameter compilation issue.

Older versions of Clang don't allow using an empty set as the default
parameter of a class with an explicit constructor.
"error: chosen constructor is explicit in copy-initialization"

BUG=chromium:630728

Change-Id: I580073788ce3346d1ecffab336a0fcee210b2e0f
Reviewed-on: https://chromium-review.googlesource.com/431080
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>
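
The empty-set default parameter issue above can be illustrated with a minimal sketch. RegisterSet and countRegs are hypothetical names chosen for illustration and are not taken from the Subzero sources; the sketch only assumes a class whose default constructor is declared explicit.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical stand-in for a set-like class with an explicit default
// constructor.
class RegisterSet {
public:
  explicit RegisterSet() {}
  void insert(int Reg) { Regs.insert(Reg); }
  std::size_t size() const { return Regs.size(); }

private:
  std::set<int> Regs;
};

// Rejected with "chosen constructor is explicit in copy-initialization",
// because "= {}" is copy-list-initialization:
//   std::size_t countRegs(RegisterSet Excluded = {});

// Workaround: spell out the type so the temporary is direct-initialized.
std::size_t countRegs(RegisterSet Excluded = RegisterSet()) {
  return Excluded.size();
}
```

Calling countRegs() with no argument then uses an empty set, and the explicit constructor is never selected by copy-initialization.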

Fix signed/unsigned comparison warning.

Change-Id: Idf81fb96dd32df8f96b5bc688bdce290265ff372
Reviewed-on: https://chromium-review.googlesource.com/430230
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>

Implement an intrinsic for nearbyint.

The round intrinsic gets translated to roundps on x86, which is SSE4.1
only. cvtps2dq + cvtdq2ps can be used as an SSE2 fallback. cvtps2dq
also corresponds to LLVM's nearbyint intrinsic.
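
A scalar sketch of the convert-to-integer-and-back fallback, assuming the default round-to-nearest-even mode. nearbyintViaInt is a hypothetical name; std::lrint stands in for the packed float-to-int32 conversion and the cast stands in for cvtdq2ps. The result only matches a true nearbyint for inputs that fit in a 32-bit integer.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Scalar model of the two-instruction SSE2 sequence (assumption: inputs are
// within the int32 range, as the packed conversion requires).
float nearbyintViaInt(float X) {
  assert(X > -2147483648.0f && X < 2147483648.0f);
  // Step 1: float -> integer, rounding according to the current mode.
  int32_t I = static_cast<int32_t>(std::lrint(X));
  // Step 2: integer -> float (models cvtdq2ps).
  return static_cast<float>(I);
}
```

Under the default rounding mode, ties go to the even neighbour, so 2.5f becomes 2.0f and 3.5f becomes 4.0f.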

BUG=swiftshader:20

Change-Id: I8b5896c443f202a5b25125b4e5049b0b3d3a11b0
Reviewed-on: https://chromium-review.googlesource.com/428491
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>

Support 16-byte aligned stack on 32-bit Windows.

On Windows x86-32, the ABI only guarantees the stack to be 4-byte
aligned. We therefore need the stack pointer to be explicitly
aligned when using vectors. This demands using a frame pointer (to
access function arguments). Also, spilled variables are now accessed
relative to the stack pointer instead of the frame pointer so that they
are also aligned. This change does not affect PNaCl. Projects using
the Microsoft ABI should define SUBZERO_USE_MICROSOFT_ABI.

BUG=swiftshader:29

Change-Id: I186ce9435244d6fa9494ec514a91122b6be130b3
Reviewed-on: https://chromium-review.googlesource.com/427348
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>

Always align the stack to the fixed alloca requirements.

Local variables that use a fixed alloca stack slot are assigned offsets
starting at 0, before the prolog is written. Therefore the stack pointer
needs to be aligned to the alloca's maximum alignment requirement. This
required the following changes:
- Add FixedAllocaSizeBytes to SpillAreaSizeBytes before aligning it.
- Compute the maximum alignment requirement from FixedAllocaAlignBytes
  and SpillAreaAlignmentBytes, and prior NeedsStackAlignment uses.
- Always align the stack pointer to this maximum.
- Affected lit tests have been rebased. Note that in some cases the
  frame size is now bigger than necessary. This is due to
  FixedAllocaSizeBytes being padded to a multiple of the alignment.
  This isn't strictly necessary since the spill areas take care of their
  own alignment.
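
The frame size computation described in the list above can be sketched as follows. The parameter names mirror the quantities named in the commit message, but applyAlignment, FrameLayout and computeFrame are hypothetical helpers, and the handling of prior NeedsStackAlignment uses is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Round Value up to the next multiple of Align (Align must be a power of 2).
uint32_t applyAlignment(uint32_t Value, uint32_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0);
  return (Value + Align - 1) & ~(Align - 1);
}

struct FrameLayout {
  uint32_t AlignBytes; // alignment the stack pointer must satisfy
  uint32_t SizeBytes;  // total bytes reserved for fixed allocas and spills
};

FrameLayout computeFrame(uint32_t FixedAllocaSizeBytes,
                         uint32_t FixedAllocaAlignBytes,
                         uint32_t SpillAreaSizeBytes,
                         uint32_t SpillAreaAlignmentBytes) {
  // The stack pointer is aligned to the strictest requirement.
  uint32_t MaxAlign = std::max(FixedAllocaAlignBytes, SpillAreaAlignmentBytes);
  // The fixed alloca area is added to the spill area before aligning.
  uint32_t Total = FixedAllocaSizeBytes + SpillAreaSizeBytes;
  return {MaxAlign, applyAlignment(Total, MaxAlign)};
}
```

For example, 8 bytes of fixed allocas aligned to 8 plus 4 bytes of spills aligned to 16 yields a 16-byte frame aligned to 16.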

BUG=swiftshader:29

Change-Id: Ief30acda91c958d072528b8b59c2e933f68adbb1
Reviewed-on: https://chromium-review.googlesource.com/419816
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>

Subzero, MIPS32: Atomic intrinsics fixes

This patch introduces changes to the MIPS32 intrinsic functions to
comply with PNaCl smoke tests.

Also made a change regarding addressing relative to frame pointer,
since it differs in MIPS compared to ARM and x86.

R=stichnot@chromium.org

Patch from Stefan Maksimovic <makdstefan@gmail.com>.

Review-Url: https://codereview.chromium.org/2619363003 .

[SubZero] Fix code generation issues that occurred in Cross-test and PNaCl smoke-tests

The patch fixes various code generation issues found during testing with the Cross-test and PNaCl smoke-test frameworks.

     1) To keep track of branches to the same label, the relative position of each branch from the previous branch is used.
     2) Fixed encoding of conditional mov instructions
     3) Added MovFP64ToI64 instruction for f64 to i64 move
     4) Handled vector-types in Phi nodes
     5) Fixed alignment of spilled vector arguments on stack
     6) Save-restore FP registers
     7) Fixed code generation for Zext and Sext operations
     8) Fixed InsertElement for vi16x8 type

R=stichnot@chromium.org

Patch from Jaydeep Patil <jaydeep.patil@imgtec.com>.

Review-Url: https://codereview.chromium.org/2619943003 .

Fix Mac OS compilation.

Change-Id: I140c17b1b48156ae5dd5ca6cf4ef41f3bc03f16a
Reviewed-on: https://chromium-review.googlesource.com/425780
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>

Fix inadvertent use of the Microsoft x86-64 calling convention.

Change-Id: I3d60dd5d2020b0fe234587810bdddbf855fa9e4a
Reviewed-on: https://chromium-review.googlesource.com/425829
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>

Subzero: Fix a build error against LLVM trunk.

LLVM trunk commit a5479197371e169bc676e0e570f2a994329f8773 split include/llvm/Bitcode/ReaderWriter.h into two separate header files. This CL hacks it back, in a sense.

With this change (along with the rest of the recent fixes), Subzero builds against the latest LLVM trunk - 759dd39154f0bbf1adc87bf613c95f9564f64aa8 at this writing.

BUG= none
R=kschimpf@google.com

Review-Url: https://codereview.chromium.org/2604753002 .

Subzero: Fix build problem against LLVM trunk.

LLVM commit 4ac35e82723d4bdbc094f73b2d33aaaa86a00d6a removed some code that Subzero still needs, specifically StreamingMemoryObject and its parents.

Deal with this by adding those files back to the Subzero repo.

This allows Subzero to build through LLVM c75e3c2fd2bb9ca010f4e1c32acbd142adc32c7f.

BUG= none
R=kschimpf@google.com

Review-Url: https://codereview.chromium.org/2605653002 .

Subzero: Fix some build problems against LLVM trunk.

This allows Subzero to build through LLVM e8516587a2604386a8faaab28c410663c2ec884c.

1. Mirror an internal change to the llvm::cl implementation.

2. Something changed in llvm::format() that now requires an explicit conversion from RegNumT.

BUG= none
R=kschimpf@google.com

Review-Url: https://codereview.chromium.org/2602713002 .

Subzero: Fix a build issue against LLVM trunk.

The existing code gives a build error against LLVM trunk.

Actually, the latest LLVM trunk causes a lot more Subzero build problems, but if we roll LLVM back to 1d79fff9e65e77f84bf80c2cf4f0155bd167c90d, everything builds except for this one issue.

BUG= none
R=kschimpf@google.com

Review-Url: https://codereview.chromium.org/2598263002 .

Subzero: Fix multiply defined symbols in Windows/g++ build.

Based on build failure messages in https://build.chromium.org/p/tryserver.nacl/builders/nacl-toolchain-win7-pnacl-x86_64/builds/3609/steps/llvm_i686_w64_mingw32%20%28build%29/logs/stdio .

BUG= none
R=jpp@chromium.org

Review URL: https://codereview.chromium.org/2601653002 .

Subzero: Legalize the movzx argument.

The movzx operand must be a register or memory operand. An immediate operand is not allowed.

BUG= https://bugs.chromium.org/p/nativeclient/issues/detail?id=4384
R=jpp@chromium.org

Review URL: https://codereview.chromium.org/2598153002 .

Generalize vector shuffling to accept any operand.

The arguments get legalized to Reg or Mem, so we can allow constants
as well (including undef values). This change makes all instruction's
source arguments Ice::Operands.

BUG=swiftshader:24

Change-Id: I1659cdfdb1b8a12c4acc7c473211d8a67bfd5868
Reviewed-on: https://chromium-review.googlesource.com/418504
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>

Ensure that the sub-vector load destination is a register.

BUG=swiftshader:15

Change-Id: I7e10342fa1ef9bce22bc8c445240fc34a68e8f47
Reviewed-on: https://chromium-review.googlesource.com/414992
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>

Fix skipping deleted instructions before replacing operands.

Fixes hitting a (benign) assert in replaceSource().

Change-Id: I7f984d484133e619717d004f20cd671a54473185
Reviewed-on: https://chromium-review.googlesource.com/414490
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>

[SubZero] Fix size of arguments on stack

This patch fixes the size of arguments on the stack.

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2533563002 .

Patch from Jaydeep Patil <jaydeep.patil@imgtec.com>.

Match sub-vector load/store operand order to regular load/store.

BUG=swiftshader:15

Change-Id: If608ab4903d97daa0ad342d02f496ac3fa6471d9
Reviewed-on: https://chromium-review.googlesource.com/414389
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>

Fix offset adjustment in x86 address optimization.

Change-Id: I469a7ddaa658d79fc491112b63972bd9b056689d
Reviewed-on: https://chromium-review.googlesource.com/414186
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>
Reviewed-by: Nicolas Capens <nicolascapens@google.com>

Subzero, MIPS32: Changes for improving sandbox crosstest results

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2519863002 .

Patch from Stefan Maksimovic <makdstefan@gmail.com>.

[Subzero][MIPS] Implements atomic intrinsics for MIPS32

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2504253002 .

Patch from Sagar Thakur <sagar.thakur@imgtec.com>.

Subzero, MIPS32: Sandbox initial patch

Initial patch regarding the Subzero MIPS32 sandboxing stage.
At the moment, the results of the crosstests with vector tests
disabled are as follows:

ASM mode:
19 passing / 5 failing
test_bitmanip: O2
test_calling_conv: Om1, O2
test_sync_atomic: Om1, O2

ELF mode:
15 passing / 9 failing
test_bitmanip: O2
test_calling_conv: Om1, O2
test_global: Om1, O2
test_stacksave: Om1, O2
test_sync_atomic: Om1, O2

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2482123002 .

Patch from Stefan Maksimovic <makdstefan@gmail.com>.

Support 64-bit jump tables with LP64 data model.

BUG=swiftshader:9

Change-Id: I779abfe7775632e1108e9d608bf21a63c8cefe9e
Reviewed-on: https://chromium-review.googlesource.com/407882
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>
Tested-by: Nicolas Capens <nicolascapens@google.com>

[SubZero] Utilize instructions with immediate operands

This patch optimizes code generation of instructions with 16-bit immediate operands

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2478113003 .

Patch from Jaydeep Patil <jaydeep.patil@imgtec.com>.

[SubZero] Generate MIPS.abiflags section

The patch generates the MIPS.abiflags section. This section contains a
versioned data structure with essential information required for the
loader to determine the requirements of the application.

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2471883005 .

Patch from Jaydeep Patil <jaydeep.patil@imgtec.com>.

Implement floating-point rounding intrinsic.

BUG=swiftshader:15

Change-Id: I8e53f2fdb8208f8be0f4cdff3241b4a5efe9bc8a
Reviewed-on: https://chromium-review.googlesource.com/404352
Tested-by: Nicolas Capens <nicolascapens@google.com>
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>

Subzero, MIPS32: Stacksave/Stackrestore implementation

Implements Stacksave/Stackrestore; test_stacksave runs successfully
when the jal implementation is present, both in forceasm and in
elf mode.

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2455933002 .

Patch from Stefan Maksimovic <makdstefan@gmail.com>.

[SubZero] Fix code generation for vector type

The patch fixes legalizeToReg issues in vector code generation.
The patch also generates JALR for pointer to function and corrects encoding of FP conditional move instruction.

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2468133002 .

Patch from Jaydeep Patil <jaydeep.patil@imgtec.com>.

Implement saturated vector add/subtract.

BUG=swiftshader:15

Change-Id: Ic120eddd1761e33b7d76bf3ed8ec5ca74634f958
Reviewed-on: https://chromium-review.googlesource.com/403477
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>
Tested-by: Nicolas Capens <nicolascapens@google.com>

Implement integer vector multiply intrinsics.

BUG=swiftshader:15

Change-Id: Ib822b50c0a14e5ebc114db9759cbeecbb9f7a3c1
Reviewed-on: https://chromium-review.googlesource.com/403472
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>
Tested-by: Nicolas Capens <nicolascapens@google.com>

Generalize the Sqrt intrinsic to process vectors.

BUG=swiftshader:15

Change-Id: Ib89d628c85696c20a249b8810cd357a292d10402
Reviewed-on: https://chromium-review.googlesource.com/405293
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>
Tested-by: Nicolas Capens <nicolascapens@google.com>

This patch enables running a couple more lit tests for MIPS32

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2448193008 .

Patch from Stefan Maksimovic <makdstefan@gmail.com>.

Preserve rsi and rdi when using Microsoft x86-64 calling convention.

Also, their priority is lowered so that registers which are scratch on both
Unix and Windows are preferred by the register allocator.
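
A small sketch of the difference between the two x86-64 conventions that motivates this change. isCalleeSaved and the ABI enum are hypothetical names, and only general-purpose registers other than rsp are modeled.

```cpp
#include <cassert>
#include <string>

enum class ABI { SystemV, Microsoft };

// Returns true if a general-purpose register must be preserved across calls.
bool isCalleeSaved(const std::string &Reg, ABI Abi) {
  // Preserved in both conventions.
  for (const char *R : {"rbx", "rbp", "r12", "r13", "r14", "r15"})
    if (Reg == R)
      return true;
  // rsi and rdi are scratch under System V but preserved under Microsoft.
  if (Abi == ABI::Microsoft && (Reg == "rsi" || Reg == "rdi"))
    return true;
  return false;
}
```

Registers such as rax, rcx, rdx and r8 to r11 are scratch under both conventions, which is why an allocator that wants to avoid save/restore code on either platform would prefer them over rsi and rdi.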

BUG=swiftshader:22

Change-Id: Id55d8c8b8c106947e3041a082099069d7c6c6ed0
Reviewed-on: https://chromium-review.googlesource.com/404503
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>
Tested-by: Jim Stichnoth <stichnot@chromium.org>

[SubZero] Generate relocations for MIPS

The patch generates Hi, Lo, Jal and data relocations. Instruction encodings for instructions such as ldc1 and sdc1 have been added.

The following tests from the cross-test framework passed: (non-vector, OBJ mode, Om1, O2)

mem_intrin
TotalTests=114300 Passes=114300 Failures=0

simple_loop
TotalTests=102 Passes=102 Failures=0

test_arith
TotalTests=49489704 Passes=49489704 Failures=0

test_bitmanip
TotalTests=1200 Passes=1200 Failures=0

test_cast
TotalTests=7444 Passes=7444 Failures=0

test_fcmp
TotalTests=123904 Passes=123904 Failures=0

test_global
TotalTests=270 Passes=270 Failures=0

test_icmp
TotalTests=3341520 Passes=3341520 Failures=0

test_strengthreduce
TotalTests=240 Passes=240 Failures=0

Following tests are disabled as they are either all-vectors or contain unimplemented intrinsic lowering:

test_calling_conv
test_select
test_stacksave
test_sync_atomic
test_vector_ops

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2446273003 .

Patch from Jaydeep Patil <jaydeep.patil@imgtec.com>.

Subzero, MIPS32: Remove --skip-unimplemented from lit tests

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2433243003 .

Patch from Stefan Maksimovic <makdstefan@gmail.com>.

[SubZero] Fix f64 to/from i64 moves

The allocation of the Hi/Lo parts of i64 on the stack has been corrected as per the MIPS32 ABI. The patch also fixes ZEXT issues that occurred while lowering unsigned operations.
The following tests from the cross-test framework passed: (non-vector, ASM mode, Om1, O2)

mem_intrin
TotalTests=114300 Passes=114300 Failures=0

simple_loop
TotalTests=102 Passes=102 Failures=0

test_arith
TotalTests=49489704 Passes=49489704 Failures=0

test_bitmanip
TotalTests=1200 Passes=1200 Failures=0

test_cast
TotalTests=3722 Passes=3722 Failures=0

test_fcmp
TotalTests=123904 Passes=123904 Failures=0

test_global
TotalTests=270 Passes=270 Failures=0

test_icmp
TotalTests=3341520 Passes=3341520 Failures=0

test_strengthreduce
TotalTests=240 Passes=240 Failures=0

Following tests are disabled as they are either all-vectors or contain unimplemented intrinsic lowering:

test_calling_conv
test_select
test_stacksave
test_sync_atomic
test_vector_ops

There are a couple of fixes to ARM32- and X86-specific files to resolve compile-time errors.

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2432373002 .

Patch from Jaydeep Patil <jaydeep.patil@imgtec.com>.

Fix two-vector unpack case.

BUG=swiftshader:15

Change-Id: I351268b44491091c271d6c7c5b644cd21ffb623b
Reviewed-on: https://chromium-review.googlesource.com/403409
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>
Tested-by: Nicolas Capens <nicolascapens@google.com>

Fix unit tests.

Change-Id: I70899be0455958aaad6af8d8218f1db50591beae
Reviewed-on: https://chromium-review.googlesource.com/401385
Tested-by: Nicolas Capens <nicolascapens@google.com>
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>

Remove verified asserts.

BUG=swiftshader:15

Change-Id: I3c3314f3787d42835a9483c7b797dc1dbdc0b76a
Reviewed-on: https://chromium-review.googlesource.com/400663
Tested-by: Nicolas Capens <nicolascapens@google.com>
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>

Assert that PNaCl bitcode only uses 128-bit vector casts.

Change-Id: I5aee2c998842f95ccc44d5c0fed90aa289bdf67b
Reviewed-on: https://chromium-review.googlesource.com/401639
Tested-by: Nicolas Capens <nicolascapens@google.com>
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>

Implement vector sign mask intrinsic.

BUG=swiftshader:15

Change-Id: I7fec56518a5b4e07d2189ab01a0a223b038564c1

Implement bitcast between i32 and (emulated) v4i8.

BUG=swiftshader:15

Change-Id: Ic795def8a914508ab0d850c846b73b343ace45de

Implement vector packing intrinsics.

BUG=swiftshader:15

Change-Id: Id95a08f82c47ec20bb958358c01f389b6fb5565b

Fix 64-bit pointer type for non-x32 ABIs.

BUG=swiftshader:9

Change-Id: Ife06416736d47acba4f2cff1ea8b17be61134752

Subzero: Fix compiler warnings.

src/IceTargetLoweringX86BaseImpl.h:6093:13: error: unused variable 'Src1RM' [-Werror,-Wunused-variable]
      auto *Src1RM = legalize(Src1, Legal_Reg | Legal_Mem);
            ^

src/IceTargetLoweringX86BaseImpl.h:4007:3: error: default label in switch which covers all enumeration values [-Werror,-Wcovered-switch-default]
  default:
  ^

src/IceTargetLoweringMIPS32.cpp:4065:3: error: default label in switch which covers all enumeration values [-Werror,-Wcovered-switch-default]
  default:
  ^

src/IceTargetLoweringARM32.cpp:4975:3: error: default label in switch which covers all enumeration values [-Werror,-Wcovered-switch-default]
  default:
  ^

BUG= none
R=capn@chromium.org

Review URL: https://codereview.chromium.org/2434643002 .

Optimize shuffles corresponding to x86 punpckh instructions.

BUG=swiftshader:15

Change-Id: I04a7c4206f3936c604ec623e43834c2a153fd3cb
Reviewed-on: https://chromium-review.googlesource.com/399379
Tested-by: Nicolas Capens <nicolascapens@google.com>
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>

[Subzero][MIPS32] Account for variable alloca alignment bytes in addProlog

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2425673002 .

Patch from Sagar Thakur <sagar.thakur@imgtec.com>.

Generate error on unexpected intrinsics.

Change-Id: I5a02aee156a64f48baca356f0a5263123f570741
Reviewed-on: https://chromium-review.googlesource.com/399590
Tested-by: Nicolas Capens <nicolascapens@google.com>
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>

Add x86 vector packing instructions.

BUG=swiftshader:15

Change-Id: I0d40fab6287130143693e8e4752859b7142a503d
Reviewed-on: https://chromium-review.googlesource.com/394007
Tested-by: Nicolas Capens <nicolascapens@google.com>
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>

Fix unpacking from a single vector.

Both vector arguments were being used in a punpckl instruction, while
the shuffle mask repeats elements from just the first vector.
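
The distinction between the two cases can be sketched for an 8-element (word) shuffle as follows. classifyUnpackLow and UnpackKind are hypothetical names; the sketch assumes mask indices 0 to 7 select from the first vector and 8 to 15 from the second.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

enum class UnpackKind { None, SingleVector, TwoVectors };

// Decide whether a shuffle mask matches a low-half unpack, and if so whether
// it interleaves the first vector with itself or with the second vector.
UnpackKind classifyUnpackLow(const std::array<uint8_t, 8> &Mask) {
  bool Single = true;
  bool Two = true;
  for (unsigned I = 0; I < 8; ++I) {
    unsigned Lane = I / 2;
    // Single-vector form: a0 a0 a1 a1 a2 a2 a3 a3
    Single = Single && Mask[I] == Lane;
    // Two-vector form: a0 b0 a1 b1 a2 b2 a3 b3
    unsigned Expected = (I % 2 == 0) ? Lane : Lane + 8;
    Two = Two && Mask[I] == Expected;
  }
  if (Single)
    return UnpackKind::SingleVector;
  if (Two)
    return UnpackKind::TwoVectors;
  return UnpackKind::None;
}
```

In the single-vector case, the unpack instruction would take the first vector as both of its operands.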

BUG=swiftshader:15

Change-Id: I8e29c252ee4957692c4949e724ae67253b423e89
Reviewed-on: https://chromium-review.googlesource.com/399419
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>
Tested-by: Nicolas Capens <nicolascapens@google.com>

[SubZero] Implement Fcmp, ICmp, Cast and Select for vector type

The patch scalarizes Fcmp, ICmp, Cast and Select for operands of vector type.

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2412053002 .

Patch from Jaydeep Patil <jaydeep.patil@imgtec.com>.

[SubZero] Handle relocatable constants for MIPS

The patch generates HI/LO modifiers for relocatable constants.

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2420033002 .

Patch from Jaydeep Patil <jaydeep.patil@imgtec.com>.

[Subzero][MIPS32] Fix alloca alignment and offset for Om1 and O2 optimization

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2417233002 .

Patch from Sagar Thakur <sagar.thakur@imgtec.com>.

[SubZero] Legalize load, store for MIPS post lower

This patch legalizes load, store instructions post lowering.

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2411193003 .

Patch from Jaydeep Patil <jaydeep.patil@imgtec.com>.

[Subzero][MIPS32] Implement bitcast operation for both 32-bit and 64-bit operands

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2404803002 .

Patch from Sagar Thakur <sagar.thakur@imgtec.com>.

Support running unit tests on Windows.

BUG=swiftshader:7

Change-Id: I83e51a3256365700dbaf550ed4b50c2352612f7d
Reviewed-on: https://chromium-review.googlesource.com/394887
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>
Tested-by: Nicolas Capens <nicolascapens@google.com>

Optimize x86 vector shift by constant.

BUG=swiftshader:15

Change-Id: I4b7b97f3de18c201a502d0bc38a2c845a1caf278
Reviewed-on: https://chromium-review.googlesource.com/392627
Tested-by: Nicolas Capens <nicolascapens@google.com>
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>

Optimize lowering of x86 byte and word vector unpack.

BUG=swiftshader:15

Change-Id: Id0d3bed46d00336fc31501c41a26ebe2d4ddd697
Reviewed-on: https://chromium-review.googlesource.com/392626
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>
Tested-by: Nicolas Capens <nicolascapens@google.com>

Implement intrinsics for loading/storing subvectors.

This enables emulating 64-bit and 32-bit vectors using 128-bit
vectors internally (x86 only for now). Note that these intrinsics
are not part of the PNaCl specification.

BUG=swiftshader:15

Change-Id: I61a666243832c2856e60eb477d42a72dec07d01d
Reviewed-on: https://chromium-review.googlesource.com/392246
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>
Tested-by: Nicolas Capens <nicolascapens@google.com>

Subzero: Fix "make -f Makefile.standalone check-lit FORCEASM=1".

https://codereview.chromium.org/2384983002/ apparently didn't test the full "make presubmit", otherwise this would have been caught.

BUG= none
TBR=jpp@chromium.org

Review URL: https://codereview.chromium.org/2399873003 .

Subzero, MIPS32: Fix conditional mov instructions

This patch implements the changes needed for conditional mov instructions
to fix a failing crosstest and invalid register allocation. The problem
is visible in the icmp test examples, which cause the icmp crosstest
to fail. E.g.:

Incorrect, before this change:
674: 00653026 xor a2,v1,a1
678: 00a3182b sltu v1,a1,v1
67c: 0082102b sltu v0,a0,v0
680: 0043180a movz v1,v0,v0

Correct, after this change:
674: 00653026 xor a2,v1,a1
678: 00a3182b sltu v1,a1,v1
67c: 0082102b sltu v0,a0,v0
680: 0046180a movz v1,v0,a2
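
A sketch of why the condition operand of movz matters, modeled on the shape of the corrected sequence. The movz helper follows the MIPS semantics (rd = rs if rt is zero, otherwise rd is unchanged); lessThanU64 and the mapping of registers to 64-bit operand halves are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// MIPS "movz rd, rs, rt": rd receives rs only when rt is zero.
uint32_t movz(uint32_t Rd, uint32_t Rs, uint32_t Rt) {
  return Rt == 0 ? Rs : Rd;
}

// 64-bit unsigned "A < B" computed from 32-bit halves.
bool lessThanU64(uint32_t ALo, uint32_t AHi, uint32_t BLo, uint32_t BHi) {
  uint32_t HiDiff = AHi ^ BHi;         // xor: zero when high words are equal
  uint32_t HiLess = AHi < BHi ? 1 : 0; // sltu on the high words
  uint32_t LoLess = ALo < BLo ? 1 : 0; // sltu on the low words
  // Use the low-word result only when the high words are equal.
  return movz(HiLess, LoLess, HiDiff) != 0;
}
```

If the condition operand is the same register as the source, as in the incorrect listing, the move can only take effect when the source is zero, so the high-word comparison is not what selects the result.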

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2394773004 .

Patch from Stefan Maksimovic <makdstefan@gmail.com>.

Subzero: Remove --skip-unimplemented from ARM lit tests.

ARM support is complete, so clean up some of the lit tests:

1. Remove --skip-unimplemented
2. Use --filetype=obj instead of =asm, and remove --assemble
3. Remove --need=allow_dump requirement
4. Remove related TODOs.
5. Fix some CHECK lines because objdump output is slightly different from filetype=asm output.

BUG= none
R=jpp@chromium.org

Review URL: https://codereview.chromium.org/2384983002 .

[SubZero] Vector types support for MIPS

This patch implements vector operations on MIPS32 using the VariableVecOn32 method (along the lines of Variable64On32).
Vector operations are scalarized prior to lowering. Each variable of vector type is split into 4 containers.
For MIPS32, four GP/FP registers are used to hold a vector variable. Arguments are passed in GP registers irrespective of the type of the vector variable.
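
As a rough illustration of scalarization, a 4 x i32 vector add can be modeled as four independent 32-bit adds over the containers. VecOn32 and addV4I32 are hypothetical names and do not reflect the actual VariableVecOn32 interface.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of a 128-bit vector variable held in four 32-bit
// containers, one per register.
struct VecOn32 {
  std::array<uint32_t, 4> Containers;
};

// A vector add becomes one scalar add per container.
VecOn32 addV4I32(const VecOn32 &A, const VecOn32 &B) {
  VecOn32 Result{};
  for (std::size_t I = 0; I < Result.Containers.size(); ++I)
    Result.Containers[I] = A.Containers[I] + B.Containers[I];
  return Result;
}
```

Element types narrower than 32 bits would need extra packing and unpacking within each container, which this sketch does not cover.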

Lit test vector-mips.ll has been added to test this implementation.

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2380023002 .

Patch from Jaydeep Patil <jaydeep.patil@imgtec.com>.

[Subzero][MIPS] Implement conditional branches with 64-bit integer compares

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2384433002 .

Patch from Sagar Thakur <sagar.thakur@imgtec.com>.

[Subzero][MIPS] Add RUN command line with -Om1 in test 64bit.pnacl.ll

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2376233004 .

Patch from Sagar Thakur <sagar.thakur@imgtec.com>.

Don't emit address size prefixes for native x86-64 ABI.

Address size prefixes are used in 64-bit x86 for PNaCl's use of the
x32 ABI with ILP32 data model. Don't emit them for any other ABI.
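
A minimal sketch of the rule, using the encoding of a 32-bit load into eax from the address in ebx/rbx as an example. encodeLoadEax and DataModel are hypothetical names; 0x67 is the x86 address-size override prefix.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class DataModel { ILP32, LP64 };

// Encode "mov eax, [ebx]" (ILP32) or "mov eax, [rbx]" (LP64) for 64-bit mode.
std::vector<uint8_t> encodeLoadEax(DataModel Model) {
  std::vector<uint8_t> Bytes;
  // Only 32-bit pointers in 64-bit mode need the address-size override.
  if (Model == DataModel::ILP32)
    Bytes.push_back(0x67);
  Bytes.push_back(0x8B); // MOV r32, r/m32
  Bytes.push_back(0x03); // ModRM: mod=00, reg=eax, rm=rbx/ebx
  return Bytes;
}
```

With native 64-bit pointers the prefix is omitted and the full rbx register supplies the address.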

BUG=swiftshader:9

Change-Id: I1351db086d44ce4b144b3428866a54e84637b9a4
Reviewed-on: https://chromium-review.googlesource.com/390409
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>
Tested-by: Nicolas Capens <nicolascapens@google.com>

Subzero, MIPS32: SRAV instruction encoding

Implements SRAV instruction encoding

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2375923002 .

Patch from Srdjan Obucina <Srdjan.Obucina@imgtec.com>.

[Subzero][MIPS] Implement 64-bit integer compare operations

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2369323002 .

Patch from Sagar Thakur <sagar.thakur@imgtec.com>.

Subzero, MIPS32: MOVZ instruction encoding

Implements MOVZ instruction encoding

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2377783002 .

Patch from Srdjan Obucina <Srdjan.Obucina@imgtec.com>.

Subzero, MIPS32: DIVU instruction encoding

Implements DIVU instruction encoding

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2377733002 .

Patch from Srdjan Obucina <Srdjan.Obucina@imgtec.com>.

Subzero, MIPS32: Intrinsic call Bswap for i16, i32 and i64

Implements intrinsic call llvm.bswap for i16, i32 and i64

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2368343003 .

Patch from Srdjan Obucina <Srdjan.Obucina@imgtec.com>.

Subzero, MIPS32: Intrinsic calls Ctlz and Cttz for i64

Implements intrinsic calls llvm.ctlz and llvm.cttz for i64.
Also adds test cases for constant operands.

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2364093002 .

Patch from Srdjan Obucina <Srdjan.Obucina@imgtec.com>.

Subzero, MIPS32: Binding intrablock labels, unconditional branch

This patch was supposed to be a part of patch with instruction encodings.

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2367743004 .

Patch from Jaydeep Patil <jaydeep.patil@imgtec.com>.

Author: Jaydeep Patil

Subzero, MIPS32: Filling missing bits from genTargetHelperCallFor

Implements missing calls to runtime libraries, covering mostly
data casting.

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2363333002 .

Patch from Srdjan Obucina <Srdjan.Obucina@imgtec.com>.

Subzero, MIPS32: Cross-testing enabled for MIPS32

Enables running crosstests for MIPS32 target.

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2085303002 .

Patch from Srdjan Obucina <Srdjan.Obucina@imgtec.com>.

Subzero, MIPS32: lowerSelect for i64

Implements lowerSelect for i64.

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2364143002 .

Patch from Srdjan Obucina <Srdjan.Obucina@imgtec.com>.

[Subzero][MIPS32] Implements 64-bit shl, lshr, ashr for MIPS

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2359713003 .

Patch from Sagar Thakur <sagar.thakur@imgtec.com>.

Subzero, MIPS32: Intrinsic call Cttz for i32

Implements intrinsic call llvm.cttz for i32

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2358393004 .

Patch from Srdjan Obucina <Srdjan.Obucina@imgtec.com>.

Subzero, MIPS32: Fix floating point comparison crosstest

Floating point comparison crosstest was failing in
filetype=obj mode because of missing breaks in load
encoding functions. With this patch, crosstest
generator, with vector tests disabled, for parameters

--filetype=obj --include=test_fcmp,mips32,native,Om1,base

returns

TotalTests=123904 Passes=123904 Failures=0

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2355413008 .

Patch from Srdjan Obucina <Srdjan.Obucina@imgtec.com>.

Subzero, MIPS32: Intrinsic call Ctlz for i32

Implements intrinsic call llvm.ctlz for i32

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2354293002 .

Patch from Srdjan Obucina <Srdjan.Obucina@imgtec.com>.

Subzero, MIPS32: Instruction NOR, pseudoinstruction NOT

These two are prerequisites for some intrinsic calls and
bitwise operations.

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2356293002 .

Patch from Srdjan Obucina <Srdjan.Obucina@imgtec.com>.

Subzero, MIPS32: Remove duplicate functionalities

Removes functionality from the lowering functions that is already covered by
genTargetHelperCallFor. Adds error messages for appropriate cases.

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2358123002 .

Patch from Srdjan Obucina <Srdjan.Obucina@imgtec.com>.

[SubZero] Fix floating-point comparison for MIPS

The patch fixes code generation and encoding of floating-point comparison.
All floating-point comparison related tests in the test_fcmp cross test pass (after removing vector related tests):
TotalTests=123904 Passes=123904 Failures=0

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2357143002 .

Patch from Jaydeep Patil <jaydeep.patil@imgtec.com>.

Subzero, MIPS32: Intrinsic call Trap

Implements intrinsic call llvm.trap.

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2351893004 .

Patch from Srdjan Obucina <Srdjan.Obucina@imgtec.com>.

Subzero, MIPS32: Encoding of FP comparison instructions

Patch implements encoding for instructions used for floating point number comparison.

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2350833002 .

Patch from Srdjan Obucina <Srdjan.Obucina@imgtec.com>.

[SubZero] lower float and double constants for MIPS

The patch emits constant pool for float and double constants.

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2351583002 .

Patch from Jaydeep Patil <jaydeep.patil@imgtec.com>.

Subzero, MIPS32: lowerUnreachable

Patch implements lowerUnreachable and encoding for teq
instruction. To avoid duplicated code, class describing
trap instruction is borrowed from
https://codereview.chromium.org/2339323004/

Review URL: https://codereview.chromium.org/2350903002 .

Patch from Srdjan Obucina <Srdjan.Obucina@imgtec.com>.

[SubZero] Use DIV instruction instead of TargetHelperCall

Use the DIV/DIVU instructions provided by the MIPS32 ISA instead of calling a target
helper function (__divsi3 etc.). These instructions place the 32-bit quotient and
remainder in the special 32-bit LO and HI registers respectively. An additional
instruction to check for divide-by-zero (trap if equal) is emitted after the
DIV/DIVU instructions.
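
The behaviour of the emitted sequence can be modeled in software, here for the unsigned DIVU case. divuWithTrap and HiLo are hypothetical names, and the trap is modeled as a C++ exception. Unlike the emitted code, the model checks the divisor before dividing, because dividing by zero is undefined behaviour in C++.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Model of the special LO/HI register pair written by DIVU.
struct HiLo {
  uint32_t Lo; // quotient
  uint32_t Hi; // remainder
};

HiLo divuWithTrap(uint32_t Dividend, uint32_t Divisor) {
  // Models the "trap if equal" check of the divisor against zero.
  if (Divisor == 0)
    throw std::runtime_error("trap: divide by zero");
  return {Dividend / Divisor, Dividend % Divisor};
}
```

A lowering would then read the quotient from LO for a division or the remainder from HI for a remainder operation.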

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2339323004 .

Patch from Jaydeep Patil <jaydeep.patil@imgtec.com>.

Implement Microsoft x86-64 calling convention support.

BUG=swiftshader:9

Change-Id: Ie58412c13991143c1ee39f3a122475bf93ead242
Reviewed-on: https://chromium-review.googlesource.com/385117
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>
Tested-by: Nicolas Capens <nicolascapens@google.com>

Allow 64-bit code to be stored as ELF64.

Previously all unsandboxed 64-bit code was assumed to use ILP32 and
be stored in ELF32 format using the x32 ABI.

BUG=swiftshader:9

Change-Id: I2476a09d1f0af60b1ac6f8807ee9ed37d54a99d4
Reviewed-on: https://chromium-review.googlesource.com/385277
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>
Tested-by: Nicolas Capens <nicolascapens@google.com>

Subzero, MIPS32: Floating point support in ELF output

Patch implements improvements and instruction encodings for many COP1 instructions for handling floating point values.

Patch covers load, store, basic arithmetic, data movement for FPR<->FPR, GPR<->FPR, FPR<->GPR, and format conversion instructions.

Added instruction encodings:
Load: lb, lh, lwc1, ldc1
Store: sb, sh, swc1, sdc1
FP arith: abs_d, abs_s, add_d, add_s, div_d, div_s, mul_d, mul_s, sqrt_d, sqrt_s, sub_d, sub_s
FP movs: mfc1, mov_d, mov_s, movn_d, movn_s, movz_d, movz_s, mtc1
Conversion: cvt_d_l, cvt_d_s, cvt_d_w, cvt_s_d, cvt_s_l, cvt_s_w, trunc_l_d, trunc_l_s, trunc_w_d, trunc_w_s

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2341713003 .

Patch from Srdjan Obucina <Srdjan.Obucina@imgtec.com>.

Subzero, MIPS32: nacl-other-intrinsics-mips merged to original file

With the fix related to the increase in stack alignment bytes, it is
possible to return the MIPS tests from nacl-other-intrinsics-mips
to their original place. However, with the existing vector test, the O2 test
had to be turned off. This does not affect anything important,
because it only tested one case (test_sqrt_ignored).

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2342083003 .

Patch from Srdjan Obucina <Srdjan.Obucina@imgtec.com>.

Abstract the ELFStreamer class.

This enables other implementations, such as streaming to memory
instead of a file.

BUG=swiftshader:9

Change-Id: I2a780ee67e9bccd157c120b7a0895d9764117464
Reviewed-on: https://chromium-review.googlesource.com/384911
Reviewed-by: Jim Stichnoth <stichnot@chromium.org>
Tested-by: Nicolas Capens <nicolascapens@google.com>