7 years agoMergedLoadStoreMotion pass
Gerolf Hoflehner [Fri, 18 Jul 2014 19:13:09 +0000 (19:13 +0000)]
MergedLoadStoreMotion pass

Merges equivalent loads on both sides of a hammock/diamond
and hoists into into the header.
Merges equivalent stores on both sides of a hammock/diamond
and sinks it to the footer.
Can enable if conversion and tolerate better load misses
and store operand latencies.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213396 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoReapply "DebugInfo: Ensure that all debug location scope chains from instructions...
David Blaikie [Fri, 18 Jul 2014 17:49:10 +0000 (17:49 +0000)]
Reapply "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself."""

Recommits 212776 which was reverted in r212793. This has been committed
and recommitted a few times as I try to test it harder and find/fix more
issues. The most recent revert was due to an asan bot failure which I
can't seem to reproduce locally, though I believe I'm following all the
steps the buildbot does.

So I'm going to recommit this in the hopes of investigating the failure
on the buildbot itself... apologies in advance for the bot noise. If
anyone sees failures with this /please/ provide me with any
reproductions, etc.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213391 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoFix build failure on windows
David Peixotto [Fri, 18 Jul 2014 16:41:58 +0000 (16:41 +0000)]
Fix build failure on windows

Add explicit constructor to struct instead of using brace initialization.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213389 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoMC: support different sized constants in constant pools
David Peixotto [Fri, 18 Jul 2014 16:05:14 +0000 (16:05 +0000)]
MC: support different sized constants in constant pools

On AArch64 the pseudo instruction ldr <reg>, =... supports both
32-bit and 64-bit constants. Add support for 64 bit constants for
the pools to support the pseudo instruction fully.

Changes the AArch64 ldr-pseudo tests to use 32-bit registers and
adds tests with 64-bit registers.

Patch by Janne Grunau!

Differential Revision: http://reviews.llvm.org/D4279

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213387 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoAdd a dereferenceable attribute
Hal Finkel [Fri, 18 Jul 2014 15:51:28 +0000 (15:51 +0000)]
Add a dereferenceable attribute

This attribute indicates that the parameter or return pointer is
dereferenceable. Practically speaking, loads from such a pointer within the
associated byte range are safe to speculatively execute. Such pointer
parameters are common in source languages (C++ references, for example).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213385 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoAdd MIPS Technologies to the vendors in llvm::Triple.
Daniel Sanders [Fri, 18 Jul 2014 14:28:19 +0000 (14:28 +0000)]
Add MIPS Technologies to the vendors in llvm::Triple.

This is a prerequisite for checking for 'mti' and 'img' in a consistent way in
clang. Previously 'img' could use Triple::getVendor() but 'mti' could only use

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213381 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoAArch64: implement efficient f16 bitcasts
Tim Northover [Fri, 18 Jul 2014 13:07:05 +0000 (13:07 +0000)]
AArch64: implement efficient f16 bitcasts

Because i16 is illegal, there's no native DAG method to
represent a bitcast to or from an f16 type. This meant LLVM was
inserting a stack store/load pair which is really not ideal.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213378 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoNVPTX: support fpext/fptrunc to and from f16.
Tim Northover [Fri, 18 Jul 2014 13:01:43 +0000 (13:01 +0000)]
NVPTX: support fpext/fptrunc to and from f16.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213377 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoR600: support fpext/fptrunc operations to and from f16.
Tim Northover [Fri, 18 Jul 2014 13:01:37 +0000 (13:01 +0000)]
R600: support fpext/fptrunc operations to and from f16.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213376 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoAArch64: support f16 extend/trunc operations.
Tim Northover [Fri, 18 Jul 2014 13:01:31 +0000 (13:01 +0000)]
AArch64: support f16 extend/trunc operations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213375 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoX86: support fpext/fptrunc operations to and from 16-bit floats.
Tim Northover [Fri, 18 Jul 2014 13:01:25 +0000 (13:01 +0000)]
X86: support fpext/fptrunc operations to and from 16-bit floats.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213374 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoARM: support legalisation of "fptrunc ... to half" operations.
Tim Northover [Fri, 18 Jul 2014 13:01:19 +0000 (13:01 +0000)]
ARM: support legalisation of "fptrunc ... to half" operations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213373 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoCodeGen: soften f16 type by default instead of marking legal.
Tim Northover [Fri, 18 Jul 2014 12:41:46 +0000 (12:41 +0000)]
CodeGen: soften f16 type by default instead of marking legal.

Actual support for softening f16 operations is still limited, and can be added
when it's needed.  But Soften is much closer to being a useful thing to try
than keeping it Legal when no registers can actually hold such values.

Longer term, we probably want something between Soften and Promote semantics
for most targets, it'll be more efficient to promote the 4 basic operations to
f32 than libcall them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213372 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoSuppress 'not handled in switch' warning
Renato Golin [Fri, 18 Jul 2014 12:13:04 +0000 (12:13 +0000)]
Suppress 'not handled in switch' warning

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213371 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[ARM] Add earlyclobber constraint to pre/post-indexed ARM STR instructions.
Tilmann Scheller [Fri, 18 Jul 2014 12:05:49 +0000 (12:05 +0000)]
[ARM] Add earlyclobber constraint to pre/post-indexed ARM STR instructions.

The post-indexed instructions were missing the constraint, causing unpredictable STR instructions to be emitted.

The earlyclobber constraint on the pre-indexed STR instructions is not strictly necessary, as the instruction selection for pre-indexed STR instructions goes through an additional layer of pseudo instructions which have the constraint defined, however it doesn't hurt to specify the constraint directly on the pre-indexed instructions as well, since at some point someone might create instances of them programmatically and then the constraint is definitely needed.

This fixes PR20323.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213369 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoRefactor ARM subarchitecture parsing
Renato Golin [Fri, 18 Jul 2014 12:00:48 +0000 (12:00 +0000)]
Refactor ARM subarchitecture parsing

Re-commit of a patch to rework the triple parsing on ARM to a more sane

Patch by Gabor Ballabas.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213367 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoextracting swapStruct into include/llvm/Support/MachO.h (no functional change)
Artyom Skrobov [Fri, 18 Jul 2014 09:26:16 +0000 (09:26 +0000)]
extracting swapStruct into include/llvm/Support/MachO.h (no functional change)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213361 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoR600: rename misleading fp16 test.
Tim Northover [Fri, 18 Jul 2014 08:43:30 +0000 (08:43 +0000)]
R600: rename misleading fp16 test.

This test is actually going in the opposite direction to what the
filename and function name suggested.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213358 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoR600: support f16 -> f64 conversion intrinsic.
Tim Northover [Fri, 18 Jul 2014 08:43:24 +0000 (08:43 +0000)]
R600: support f16 -> f64 conversion intrinsic.

Unfortunately, we don't seem to have a direct truncation, but the
extension can be legally split into two operations so we should
support that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213357 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoNVPTX: support direct f16 <-> f64 conversions via intrinsics.
Tim Northover [Fri, 18 Jul 2014 08:30:10 +0000 (08:30 +0000)]
NVPTX: support direct f16 <-> f64 conversions via intrinsics.

Clang may well start emitting these soon, and while it may not be
directly relevant for OpenCL or GLSL, the instructions were just
sitting there waiting to be used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213356 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoRename AlignAttribute to IntAttribute
Hal Finkel [Fri, 18 Jul 2014 06:51:55 +0000 (06:51 +0000)]
Rename AlignAttribute to IntAttribute

Currently the only kind of integer IR attributes that we have are alignment
attributes, and so the attribute kind that takes an integer parameter is called
AlignAttr, but that will change (we'll soon be adding a dereferenceable
attribute that also takes an integer value). Accordingly, rename AlignAttribute
to IntAttribute (class names, enums, etc.).

No functionality change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213352 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoR600: Implement TTI:getPopcntSupport
Matt Arsenault [Fri, 18 Jul 2014 06:07:13 +0000 (06:07 +0000)]
R600: Implement TTI:getPopcntSupport

The test is just copied from X86, and I don't know of a better
way to test it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213351 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoX86: Constant fold converting vector setcc results to float.
Jim Grosbach [Fri, 18 Jul 2014 00:40:56 +0000 (00:40 +0000)]
X86: Constant fold converting vector setcc results to float.

Since the result of a SETCC for X86 is 0 or -1 in each lane, we can
move unary operations, in this case [su]int_to_fp through the mask
operation and constant fold the operation away. Generally speaking:
  UNARYOP(AND(VECTOR_CMP(x,y), constant))
      --> AND(VECTOR_CMP(x,y), constant2)
where constant2 is UNARYOP(constant).

This implements the transform where UNARYOP is [su]int_to_fp.

For example, consider the simple function:
define <4 x float> @foo(<4 x float> %val, <4 x float> %test) nounwind {
  %cmp = fcmp oeq <4 x float> %val, %test
  %ext = zext <4 x i1> %cmp to <4 x i32>
  %result = sitofp <4 x i32> %ext to <4 x float>
  ret <4 x float> %result

Before this change, the SSE code is generated as:
  .long 1                       ## 0x1
  .long 1                       ## 0x1
  .long 1                       ## 0x1
  .long 1                       ## 0x1
  .section  __TEXT,__text,regular,pure_instructions
  .globl  _foo
  .align  4, 0x90
_foo:                                   ## @foo
  cmpeqps %xmm1, %xmm0
  andps LCPI0_0(%rip), %xmm0
  cvtdq2ps  %xmm0, %xmm0

After, the code is improved to:
  .long 1065353216              ## float 1.000000e+00
  .long 1065353216              ## float 1.000000e+00
  .long 1065353216              ## float 1.000000e+00
  .long 1065353216              ## float 1.000000e+00
  .section  __TEXT,__text,regular,pure_instructions
  .globl  _foo
  .align  4, 0x90
_foo:                                   ## @foo
  cmpeqps %xmm1, %xmm0
  andps LCPI0_0(%rip), %xmm0

The cvtdq2ps has been constant folded away and the floating point 1.0f
vector lanes are materialized directly via the ModRM operand of andps.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213342 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoAArch64: Constant fold converting vector setcc results to float.
Jim Grosbach [Fri, 18 Jul 2014 00:40:52 +0000 (00:40 +0000)]
AArch64: Constant fold converting vector setcc results to float.

Since the result of a SETCC for AArch64 is 0 or -1 in each lane, we can
move unary operations, in this case [su]int_to_fp through the mask
operation and constant fold the operation away. Generally speaking:
  UNARYOP(AND(VECTOR_CMP(x,y), constant))
      --> AND(VECTOR_CMP(x,y), constant2)
where constant2 is UNARYOP(constant).

This implements the transform where UNARYOP is [su]int_to_fp.

For example, consider the simple function:
define <4 x float> @foo(<4 x float> %val, <4 x float> %test) nounwind {
  %cmp = fcmp oeq <4 x float> %val, %test
  %ext = zext <4 x i1> %cmp to <4 x i32>
  %result = sitofp <4 x i32> %ext to <4 x float>
  ret <4 x float> %result

Before this change, the code is generated as:
  fcmeq.4s  v0, v0, v1
  movi.4s v1, #0x1        // Integer splat value.
  and.16b v0, v0, v1      // Mask lanes based on the comparison.
  scvtf.4s  v0, v0        // Convert each lane to f32.

After, the code is improved to:
  fcmeq.4s  v0, v0, v1
  fmov.4s v1, #1.00000000 // f32 splat value.
  and.16b v0, v0, v1      // Mask lanes based on the comparison.

The svvtf.4s has been constant folded away and the floating point 1.0f
vector lanes are materialized directly via fmov.4s.

Rather than do the folding manually in the target code, teach getNode()
in the generic SelectionDAG to handle folding constant operands of
vector [su]int_to_fp nodes. It is reasonable (as noted in a FIXME) to do
additional constant folding there as well, but I don't have test cases
for those operations, so leaving them for another time when it becomes


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213341 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoRevert "[x86] Fold extract_vector_elt of a load into the Load's address computation."
Michael J. Spencer [Fri, 18 Jul 2014 00:15:50 +0000 (00:15 +0000)]
Revert "[x86] Fold extract_vector_elt of a load into the Load's address computation."

There's a bug where this can create cycles in the DAG. It will take a bit
to fix, so I'm backing it out for now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213339 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoReset the Subtarget in the AsmPrinter for each machine function
Eric Christopher [Fri, 18 Jul 2014 00:08:53 +0000 (00:08 +0000)]
Reset the Subtarget in the AsmPrinter for each machine function
and add explanatory comment about dual initialization. Fix
use of the Subtarget to grab the information off of the target machine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213336 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoAvoid resetting the UseSoftFloat and FloatABIType on the TargetMachine
Eric Christopher [Fri, 18 Jul 2014 00:08:50 +0000 (00:08 +0000)]
Avoid resetting the UseSoftFloat and FloatABIType on the TargetMachine
Options struct and move the comment to inMips16HardFloat. Use the
fact that we now know whether or not we cared about soft float to
set the libcalls.
Accordingly rename mipsSEUsesSoftFloat to abiUsesSoftFloat and
propagate since it's no longer CPU specific.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213335 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[MCJIT] Fix the alignment requirements for ARM and AArch64 which were mistakenly
Lang Hames [Thu, 17 Jul 2014 23:11:30 +0000 (23:11 +0000)]
[MCJIT] Fix the alignment requirements for ARM and AArch64 which were mistakenly
relaxed in the big RuntimeDyldMachO cleanup of r213293.

No test case yet - this was found via inspection and there's no easy way to test
GOT alignment in RuntimeDyldChecker at the moment. I'm working on adding support
for this now, and hope to have a test case for this soon.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213331 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoTweak formating to match what clang-format would be for llvm-nm.cpp .
Kevin Enderby [Thu, 17 Jul 2014 22:56:27 +0000 (22:56 +0000)]
Tweak formating to match what clang-format would be for llvm-nm.cpp .
No functional change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213330 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoAdd printing of Mach-O stabs in llvm-nm.
Kevin Enderby [Thu, 17 Jul 2014 22:47:16 +0000 (22:47 +0000)]
Add printing of Mach-O stabs in llvm-nm.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213327 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoRemove rules against std::function from the programmer's manual
Reid Kleckner [Thu, 17 Jul 2014 22:43:00 +0000 (22:43 +0000)]
Remove rules against std::function from the programmer's manual

Clarify that llvm::function_ref is like StringRef for callables.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213326 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoms inline asm: Don't add x86 segment registers to the clobber list.
Nico Weber [Thu, 17 Jul 2014 20:24:55 +0000 (20:24 +0000)]
ms inline asm: Don't add x86 segment registers to the clobber list.

Clang tries to check the clobber list but doesn't list segment registers in its
x86 register list. This fixes PR20343.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213303 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoMake myself code owner of MCJIT.
Lang Hames [Thu, 17 Jul 2014 20:23:31 +0000 (20:23 +0000)]
Make myself code owner of MCJIT.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213302 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoDrop the udis86 wrapper from llvm::sys
Alp Toker [Thu, 17 Jul 2014 20:05:29 +0000 (20:05 +0000)]
Drop the udis86 wrapper from llvm::sys

This optional dependency on the udis86 library was added some time back to aid
JIT development, but doesn't make much sense to link into LLVM binaries these

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213300 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoTableGen: Add 'static' to a large array to avoid a huge stack allocation
Reid Kleckner [Thu, 17 Jul 2014 19:43:40 +0000 (19:43 +0000)]
TableGen: Add 'static' to a large array to avoid a huge stack allocation

Speculative fix for a -Wframe-larger-than warning from gcc.  Clang will
implicitly promote such constant arrays to globals, so in theory it
won't hit this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213298 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[AArch64] Cleanup AsmParser: no need to use dyn_cast + assert. cast does it for us.
Arnaud A. de Grandmaison [Thu, 17 Jul 2014 19:08:14 +0000 (19:08 +0000)]
[AArch64] Cleanup AsmParser: no need to use dyn_cast + assert. cast does it for us.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213296 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoRectify r213231. Use proper version of 'ComputeNumSignBits'.
Suyog Sarda [Thu, 17 Jul 2014 19:07:00 +0000 (19:07 +0000)]
Rectify r213231. Use proper version of 'ComputeNumSignBits'.

Earlier when the code was in InstCombine, we were calling the version of ComputeNumSignBits in InstCombine.h
that automatically added the DataLayout* before calling into ValueTracking.
When the code moved to InstSimplify, we are calling into ValueTracking directly without passing in the DataLayout*.
This patch rectifies the same by passing DataLayout in ComputeNumSignBits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213295 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[MCJIT] Significantly refactor the RuntimeDyldMachO class.
Lang Hames [Thu, 17 Jul 2014 18:54:50 +0000 (18:54 +0000)]
[MCJIT] Significantly refactor the RuntimeDyldMachO class.

The previous implementation of RuntimeDyldMachO mixed logic for all targets
within a single class, creating problems for readability, maintainability, and
performance. To address these issues, this patch strips the RuntimeDyldMachO
class down to just target-independent functionality, and moves all
target-specific functionality into target-specific subclasses RuntimeDyldMachO.

The new class hierarchy is as follows:

class RuntimeDyldMachO
Implemented in RuntimeDyldMachO.{h,cpp}
Contains logic that is completely independent of the target. This consists
mostly of MachO helper utilities which the derived classes use to get their
work done.

template <typename Impl>
class RuntimeDyldMachOCRTPBase<Impl> : public RuntimeDyldMachO
Implemented in RuntimeDyldMachO.h
Contains generic MachO algorithms/data structures that defer to the Impl class
for target-specific behaviors.

RuntimeDyldMachOARM : public RuntimeDyldMachOCRTPBase<RuntimeDyldMachOARM>
RuntimeDyldMachOARM64 : public RuntimeDyldMachOCRTPBase<RuntimeDyldMachOARM64>
RuntimeDyldMachOI386 : public RuntimeDyldMachOCRTPBase<RuntimeDyldMachOI386>
RuntimeDyldMachOX86_64 : public RuntimeDyldMachOCRTPBase<RuntimeDyldMachOX86_64>
Implemented in their respective *.h files in lib/ExecutionEngine/RuntimeDyld/MachOTargets
Each of these contains the relocation logic specific to their target architecture.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213293 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[ASan] Don't instrument load/stores with !nosanitize metadata.
Alexey Samsonov [Thu, 17 Jul 2014 18:48:12 +0000 (18:48 +0000)]
[ASan] Don't instrument load/stores with !nosanitize metadata.

This is used to avoid instrumentation of instructions added by UBSan
in Clang frontend (see r213291). This fixes PR20085.

Reviewed in http://reviews.llvm.org/D4544.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213292 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoTypo: exists -> exits
Hans Wennborg [Thu, 17 Jul 2014 18:33:44 +0000 (18:33 +0000)]
Typo: exists -> exits

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213290 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[NVPTX] Improve handling of FP fusion
Justin Holewinski [Thu, 17 Jul 2014 18:10:09 +0000 (18:10 +0000)]
[NVPTX] Improve handling of FP fusion

We now consider the FPOpFusion flag when determining whether
to fuse ops.  We also explicitly emit add.rn when fusion is
disabled to prevent ptxas from fusing the operations on its

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213287 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoFix typos
Matt Arsenault [Thu, 17 Jul 2014 17:50:22 +0000 (17:50 +0000)]
Fix typos

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213285 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[BUG] Due to a typo introduced in r199933 and r200027 two tests for FMA were never...
Zinovy Nis [Thu, 17 Jul 2014 17:14:35 +0000 (17:14 +0000)]
[BUG] Due to a typo introduced in r199933  and r200027 two tests for FMA were never even started.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213283 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[X86] AVX512: Add disassembler support for compressed displacement
Adam Nemet [Thu, 17 Jul 2014 17:04:56 +0000 (17:04 +0000)]
[X86] AVX512: Add disassembler support for compressed displacement

There are two parts here.  First is to modify tablegen to adjust the encoding
type ENCODING_RM with the scaling factor.

The second is to use the new encoding types to compute the correct
displacement in the decoder.

Fixes <rdar://problem/17608489>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213281 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[X86] AVX512: Rename EVEX_CD8V to CD8_Form
Adam Nemet [Thu, 17 Jul 2014 17:04:52 +0000 (17:04 +0000)]
[X86] AVX512: Rename EVEX_CD8V to CD8_Form

This is to match the naming of CD8_EltSize, CD8_Scale, etc.

No functional change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213280 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[X86] AVX512: Use the TD version of CD8_Scale in the assembler
Adam Nemet [Thu, 17 Jul 2014 17:04:50 +0000 (17:04 +0000)]
[X86] AVX512: Use the TD version of CD8_Scale in the assembler

Passes the computed scaling factor in TSFlags rather than the old attributes.

Also removes the C++ version of computing the scaling factor (MemObjSize)
along with the asserts added by the previous patch.

No functional change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213279 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[X86] AVX512: Move compressed displacement logic to TD
Adam Nemet [Thu, 17 Jul 2014 17:04:34 +0000 (17:04 +0000)]
[X86] AVX512: Move compressed displacement logic to TD

This does not actually move the logic yet but reimplements it in the Tablegen
language.  Then asserts that the new implementation results in the same value.

The next patch will remove the assert and the temporary use of the TSFlags and
remove the C++ implementation.

The formula requires a limited form of the logical left and right operators.
I implemented these with the bit-extract/insert operator (i.e. blah{bits}).

No functional change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213278 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[TableGen] Allow shift operators to take bits<n>
Adam Nemet [Thu, 17 Jul 2014 17:04:27 +0000 (17:04 +0000)]
[TableGen] Allow shift operators to take bits<n>

Convert the operand to int if possible, i.e. if the value is properly
initialized.  (I suppose there is further room for improvement here to also
peform the shift if the uninitialized bits are shifted out.)

With this little change we can now compute the scaling factor for compressed
displacement with pure tablegen code in the X86 backend.  This is useful
because both the X86-disassembler-specific part of tablegen and the assembler
need this and TD is the natural sharing place.

The patch also adds the missing documentation for the shift and add operator.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213277 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[NVPTX] Add missing .v4 qualifier on vector store instruction
Justin Holewinski [Thu, 17 Jul 2014 16:58:56 +0000 (16:58 +0000)]
[NVPTX] Add missing .v4 qualifier on vector store instruction

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213276 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoMC: correct DWARF header for PE/COFF assembly input
Saleem Abdulrasool [Thu, 17 Jul 2014 16:27:44 +0000 (16:27 +0000)]
MC: correct DWARF header for PE/COFF assembly input

The header contains an offset to the DWARF abbreviations for the CU.  The offset
must be section relative for COFF and absolute for others.  The non-assembly
code path for the DWARF header generation already had the correct emission for
the headers.  This corrects just the assembly path.  Due to the invalid
relocation, processing of the debug information would halt previously on the
first assembly input as the associated abbreviations would be out of range as
they would have the location increased by image base and the section offset.

This address PR20332.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213275 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoMC: fix MCAsmInfo usage for windows-itanium
Saleem Abdulrasool [Thu, 17 Jul 2014 16:27:40 +0000 (16:27 +0000)]
MC: fix MCAsmInfo usage for windows-itanium

Windows itanium uses the GNUCOFF assmebly format, not ELF.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213274 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoMC: collapse emission of producer
Saleem Abdulrasool [Thu, 17 Jul 2014 16:27:35 +0000 (16:27 +0000)]
MC: collapse emission of producer

Rather than use three EmitBytes, concatenate the string at compile time,
constructing a single StringRef and emitting the data in one shot.  This also
creates nicer assembly output.  NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213273 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[NVPTX] Flag surface/texture query instructions with IsTexSurfQuery
Justin Holewinski [Thu, 17 Jul 2014 14:51:33 +0000 (14:51 +0000)]
[NVPTX] Flag surface/texture query instructions with IsTexSurfQuery

Also, add some tests to make sure we can handle surface/texture
queries on both Fermi and Kepler+.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213268 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[NVPTX] Add more surface/texture intrinsics, including CUDA unified texture fetch
Justin Holewinski [Thu, 17 Jul 2014 11:59:04 +0000 (11:59 +0000)]
[NVPTX] Add more surface/texture intrinsics, including CUDA unified texture fetch

This also uses TSFlags to mark machine instructions that are surface/texture
accesses, as well as the vector width for surface operations.  This is used
to simplify some of the switch statements that need to detect surface/texture

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213256 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoARM: support direct f16 <-> f64 conversions
Tim Northover [Thu, 17 Jul 2014 11:27:04 +0000 (11:27 +0000)]
ARM: support direct f16 <-> f64 conversions

ARMv8 has instructions to handle it, otherwise a libcall is needed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213254 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[TABLEGEN] Do not crash on intrinsics with names longer than 40 characters
Justin Holewinski [Thu, 17 Jul 2014 11:23:29 +0000 (11:23 +0000)]
[TABLEGEN] Do not crash on intrinsics with names longer than 40 characters

Differential Revision: http://reviews.llvm.org/D4537

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213253 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoCodeGen: generate single libcall for fptrunc -> f16 operations.
Tim Northover [Thu, 17 Jul 2014 11:12:12 +0000 (11:12 +0000)]
CodeGen: generate single libcall for fptrunc -> f16 operations.

Previously we asserted on this code. Currently compiler-rt doesn't
actually implement any of these new libcalls, but external help is
pretty much the only viable option for LLVM.

I've followed the much more generic "__truncST2" naming, as opposed to
the odd name for f32 -> f16 truncation. This can obviously be changed
later, or overridden by any targets that need to.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213252 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoX86: support double extension of f16 type.
Tim Northover [Thu, 17 Jul 2014 11:04:04 +0000 (11:04 +0000)]
X86: support double extension of f16 type.

x86 has no native ability to extend an f16 to f64, but the same result
is obtained if we expand it into two separate extensions: f16 -> f32
-> f64.

Unfortunately the same is not true for truncate, so that still results
in a compilation failure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213251 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoCodeGen: extend f16 conversions to permit types > float.
Tim Northover [Thu, 17 Jul 2014 10:51:23 +0000 (10:51 +0000)]
CodeGen: extend f16 conversions to permit types > float.

This makes the two intrinsics @llvm.convert.from.f16 and
@llvm.convert.to.f16 accept types other than simple "float". This is
only strictly needed for the truncate operation, since otherwise
double rounding occurs and there's no way to represent the strict IEEE
conversion. However, for symmetry we allow larger types in the extend

During legalization, we can expand an "fp16_to_double" operation into
two extends for convenience, but abort when the truncate isn't legal. A new
libcall is probably needed here.

Even after this commit, various target tweaks are needed to actually use the
extended intrinsics. I've put these into separate commits for clarity, so there
are no actual tests of f64 conversion here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213248 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoPort memory barriers intrinsics to AArch64
Yi Kong [Thu, 17 Jul 2014 10:50:20 +0000 (10:50 +0000)]
Port memory barriers intrinsics to AArch64

Memory barrier __builtin_arm_[dmb, dsb, isb] intrinsics are required to
implement their corresponding ACLE and MSVC intrinsics.

This patch ports ARM dmb, dsb, isb intrinsic to AArch64.

Differential Revision: http://reviews.llvm.org/D4520

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213247 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[mips] .reginfo is 8 byte aligned on N32.
Daniel Sanders [Thu, 17 Jul 2014 10:10:04 +0000 (10:10 +0000)]
[mips] .reginfo is 8 byte aligned on N32.

Differential Revision: http://reviews.llvm.org/D4540

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213246 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[mips] Correct ELF e_flags for the N32 ABI when using a mips-* triple rather than...
Daniel Sanders [Thu, 17 Jul 2014 10:02:08 +0000 (10:02 +0000)]
[mips] Correct ELF e_flags for the N32 ABI when using a mips-* triple rather than a mips64-* triple

Generally speaking, mips-* vs mips64-* should not be used to make decisions
about the content or format of the ELF. This should be based on the ABI
and CPU in use. For example, `mips-linux-gnu-clang -mips64r2 -mabi=64`
should produce an ELF64 as should `mips64-linux-gnu-clang -mabi=64`.
Conversely, `mips64-linux-gnu-clang -mabi=n32` should produce an ELF32 as
should `mips-linux-gnu-clang -mips64r2 -mabi=n32`.

This patch fixes the e_flags but leaves the ELF32 vs ELF64 issue for now
since there is no apparent way to base this decision on the ABI and CPU.

Differential Revision: http://reviews.llvm.org/D4539

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213244 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[mips] Correct .MIPS.abiflags for -mfpxx on MIPS32r6
Daniel Sanders [Thu, 17 Jul 2014 09:57:23 +0000 (09:57 +0000)]
[mips] Correct .MIPS.abiflags for -mfpxx on MIPS32r6

The cpr1_size field describes the minimum register width to run the program
rather than the size of the registers on the target. MIPS32r6 was acting
as if -mfp64 has been given because it starts off with 64-bit FPU registers.

Differential Revision: http://reviews.llvm.org/D4538

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213243 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[mips] Fix ELF e_flags related to -mabicalls and -mplt.
Daniel Sanders [Thu, 17 Jul 2014 09:52:56 +0000 (09:52 +0000)]
[mips] Fix ELF e_flags related to -mabicalls and -mplt.

These options are not implemented yet but we act as if they are always

The integrated assembler is driven by the clang driver so the e_flag test
cases should match the e_flags emitted by GCC+GAS rather than GAS
by itself.

Differential Revision: http://reviews.llvm.org/D4536

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213242 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoFix the prefix for arm64 triple
Yi Kong [Thu, 17 Jul 2014 09:43:27 +0000 (09:43 +0000)]
Fix the prefix for arm64 triple

Triple.cpp still returns "arm64" as prefix for arm64 triple, causing Clang not
being able to select the correct GCCBuiltin IR.

This patch changes the value to correct prefix "aarch64". Regression test will
be added in the coming patch.

Differential Revision: http://reviews.llvm.org/D4516

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213240 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[msan] Avoid redundant origin stores.
Evgeniy Stepanov [Thu, 17 Jul 2014 09:10:37 +0000 (09:10 +0000)]
[msan] Avoid redundant origin stores.

Origin is meaningless for fully initialized values. Avoid
storing origin for function arguments that are known to
be always initialized (i.e. shadow is a compile-time null

This is not about correctness, but purely an optimization.
Seems to affect compilation time of blacklisted functions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213239 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoMove ashr optimization from InstCombineShift to InstSimplify.
Suyog Sarda [Thu, 17 Jul 2014 06:28:15 +0000 (06:28 +0000)]
Move ashr optimization from InstCombineShift to InstSimplify.
Refactor code, no functionality change, test case moved from instcombine to instsimplify.

Differential Revision: http://reviews.llvm.org/D4102

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213231 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoUse range for
Matt Arsenault [Thu, 17 Jul 2014 06:19:06 +0000 (06:19 +0000)]
Use range for

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213230 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoR600: Short circuit alloca check if address space isn't private.
Matt Arsenault [Thu, 17 Jul 2014 06:13:41 +0000 (06:13 +0000)]
R600: Short circuit alloca check if address space isn't private.

Skip calling GetUnderlyingObject in cases where it obviously
isn't from an alloca. This should only be a compile time improvement.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213229 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoFix Typo (first commit to test commit access)
Suyog Sarda [Thu, 17 Jul 2014 06:09:34 +0000 (06:09 +0000)]
Fix Typo (first commit to test commit access)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213228 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[lit] Add --show-unsupported flag to LIT
Eric Fiselier [Thu, 17 Jul 2014 05:53:00 +0000 (05:53 +0000)]
[lit] Add --show-unsupported flag to LIT

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213227 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoMC: make WinEH opcode an opaque value
Saleem Abdulrasool [Thu, 17 Jul 2014 03:08:50 +0000 (03:08 +0000)]
MC: make WinEH opcode an opaque value

This makes the opcode an opaque value (unsigned int) rather than the
enumeration.  This permits the use of target specific operands.

Split out the generic type into a MCWinEH header and add a supporting
MCWin64EH::Instruction to abstract out the selection of the opcode and
construction of the actual instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213221 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoImprove BasicAA CS-CS queries (redux)
Hal Finkel [Thu, 17 Jul 2014 01:28:25 +0000 (01:28 +0000)]
Improve BasicAA CS-CS queries (redux)

This reverts, "r213024 - Revert r212572 "improve BasicAA CS-CS queries", it
causes PR20303." with a fix for the bug in pr20303. As it turned out, the
relevant code was both wrong and over-conservative (because, as with the code
it replaced, it would return the overall ModRef mask even if just Ref had been
implied by the argument aliasing results). Hopefully, this correctly fixes both

Thanks to Nick Lewycky for reducing the test case for pr20303 (which I've
cleaned up a little and added in DSE's test directory). The BasicAA test has
also been updated to check for this error.

Original commit message:

BasicAA contains knowledge of certain intrinsics, such as memcpy and memset,
and uses that information to form more-accurate answers to CallSite vs. Loc
ModRef queries. Unfortunately, it did not use this information when answering
CallSite vs. CallSite queries.

Generically, when an intrinsic takes one or more pointers and the intrinsic is
marked only to read/write from its arguments, the offset/size is unknown. As a
result, the generic code that answers CallSite vs. CallSite (and CallSite vs.
Loc) queries in AA uses UnknownSize when forming Locs from an intrinsic's
arguments. While BasicAA's CallSite vs. Loc override could use more-accurate
size information for some intrinsics, it did not do the same for CallSite vs.
CallSite queries.

This change refactors the intrinsic-specific logic in BasicAA into a generic AA
query function: getArgLocation, which is overridden by BasicAA to supply the
intrinsic-specific knowledge, and used by AA's generic implementation. This
allows the intrinsic-specific knowledge to be used by both CallSite vs. Loc and
CallSite vs. CallSite queries, and simplifies the BasicAA implementation.

Currently, only one function, Mac's memset_pattern16, is handled by BasicAA
(all the rest are intrinsics). As a side-effect of this refactoring, BasicAA's
getModRefBehavior override now also returns OnlyAccessesArgumentPointees for
this function (which is an improvement).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213219 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoPartially revert r210444 due to performance regression
Jingyue Wu [Wed, 16 Jul 2014 23:25:00 +0000 (23:25 +0000)]
Partially revert r210444 due to performance regression

Converting outermost zext(a) to sext(a) causes worse code when the
computation of zext(a) could be reused. For example, after converting

... = array[zext(a)]
... = array[zext(a) + 1]


... = array[sext(a)]
... = array[zext(a) + 1],

the program computes sext(a), which is actually unnecessary. I added one
test in split-gep-and-gvn.ll to illustrate this scenario.

Also, with r211281 and r211084, we annotate more "nuw" tags to
computation involving CUDA intrinsics such as threadIdx.x. These
annotations help with splitting GEP a lot, rendering the benefit we get
from this reverted optimization only marginal.

Test Plan: make check-all

Reviewers: eliben, meheff

Reviewed By: meheff

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D4542

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213209 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoFixed formatting, removed bug reference, renamed testcase
Sanjay Patel [Wed, 16 Jul 2014 22:40:28 +0000 (22:40 +0000)]
Fixed formatting, removed bug reference, renamed testcase

Thanks to Duncan Exon Smith for reviewing and cleanup suggestions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213205 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[FastISel] Local values shouldn't be alive across an inline asm call with side effects.
Juergen Ributzka [Wed, 16 Jul 2014 22:20:51 +0000 (22:20 +0000)]
[FastISel] Local values shouldn't be alive across an inline asm call with side effects.

This fixes an issue where a local value is defined before and used after an
inline asm call with side effects.

This fix simply flushes the local value map, which updates the insertion point
for the inline asm call to be above any previously defined local values.

This fixes <rdar://problem/17694203>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213203 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[MCJIT] Improve a RuntimeDyldChecker diagnostic.
Lang Hames [Wed, 16 Jul 2014 22:02:20 +0000 (22:02 +0000)]
[MCJIT] Improve a RuntimeDyldChecker diagnostic.

When a RuntimeDyldChecker test requests an invalid operand for an instruction,
print the decoded instruction to aid diagnosis.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213202 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoFix a typo in the inalloca description
Hal Finkel [Wed, 16 Jul 2014 21:22:46 +0000 (21:22 +0000)]
Fix a typo in the inalloca description

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213200 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agotrivial fix for PR20314
Sanjay Patel [Wed, 16 Jul 2014 21:08:10 +0000 (21:08 +0000)]
trivial fix for PR20314

Make sure that the AddrInst is an Instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213197 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoRemove Atom references in description.
Sanjay Patel [Wed, 16 Jul 2014 20:18:49 +0000 (20:18 +0000)]
Remove Atom references in description.

Any CPU can run this pass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213190 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoUtilize CastInst::CreatePointerBitCastOrAddrSpaceCast here.
Manuel Jacob [Wed, 16 Jul 2014 20:13:45 +0000 (20:13 +0000)]
Utilize CastInst::CreatePointerBitCastOrAddrSpaceCast here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213189 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[RegisterCoalescer] Moving the RegisterCoalescer subtarget hook onto the TargetRegist...
Chris Bieneman [Wed, 16 Jul 2014 20:13:31 +0000 (20:13 +0000)]
[RegisterCoalescer] Moving the RegisterCoalescer subtarget hook onto the TargetRegisterInfo instead of the TargetSubtargetInfo.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213188 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[NVPTX] Honor alignment on vector loads/stores
Justin Holewinski [Wed, 16 Jul 2014 19:45:35 +0000 (19:45 +0000)]
[NVPTX] Honor alignment on vector loads/stores

We were not considering the stated alignment on vector loads/stores,
leading us to generate vector instructions even when we do not have
sufficient alignment.

Now, for IR like:

  %1 = load <4 x float>, <4 x float>* %ptr, align 4

we will generate correct, conservative PTX like:

  ld.f32 ... [%ptr]
  ld.f32 ... [%ptr+4]
  ld.f32 ... [%ptr+8]
  ld.f32 ... [%ptr+12]

Or if we have an alignment of 8 (for example), we can
generate code like:

  ld.v2.f32 ... [%ptr]
  ld.v2.f32 ... [%ptr+8]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213186 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoCHECK-LABEL-ize one test
Alexey Samsonov [Wed, 16 Jul 2014 18:11:31 +0000 (18:11 +0000)]
CHECK-LABEL-ize one test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213177 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoAdd the "-x" flag to llvm-nm for Mach-O files that prints the fields of a symbol...
Kevin Enderby [Wed, 16 Jul 2014 17:38:26 +0000 (17:38 +0000)]
Add the "-x" flag to llvm-nm for Mach-O files that prints the fields of a symbol in hex.
(generally use for debugging the tools).  This is same functionality as darwin’s
nm(1) "-x" flag.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213176 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoRemove unnecessary/redundant std::move
David Blaikie [Wed, 16 Jul 2014 17:09:21 +0000 (17:09 +0000)]
Remove unnecessary/redundant std::move

(run returns unique_ptr by value already)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213174 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoTrack clang r213171
Alp Toker [Wed, 16 Jul 2014 16:50:34 +0000 (16:50 +0000)]
Track clang r213171

The clang rewriter is now a core facility.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213173 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoAdded documentation for SizeMultiplier in the ARM subtarget hook for register coalesc...
Chris Bieneman [Wed, 16 Jul 2014 16:27:31 +0000 (16:27 +0000)]
Added documentation for SizeMultiplier in the ARM subtarget hook for register coalescing. Also fixed some 80 col violations.

No functional code changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213169 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[NVPTX] Rename registers %fl -> %fd and %rl -> %rd
Justin Holewinski [Wed, 16 Jul 2014 16:26:58 +0000 (16:26 +0000)]
[NVPTX] Rename registers %fl -> %fd and %rl -> %rd

This matches the internal behavior of NVIDIA tools like libnvvm.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213168 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoCodeGen: don't form illegail EXTLOAD operations.
Tim Northover [Wed, 16 Jul 2014 15:37:24 +0000 (15:37 +0000)]
CodeGen: don't form illegail EXTLOAD operations.

It turns out that in most cases (the main exception being i1-related
types) once these operations are formed we cannot separate them and
the targets end up having to deal with them whether they want to or

This is not a good situation, and a more reasonable default can be
formed by ackowledging this and having targets leave them as Legal.
Only x86 seems to be affected (other targets don't even try marking
the operation Expand).

Mostly there's no visible change here yet, but it will be useful to
have truly expanded EXTLOADS for MVT::f16 softening support.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213162 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoConvert test to CHECK-LABEL
Tim Northover [Wed, 16 Jul 2014 15:37:08 +0000 (15:37 +0000)]
Convert test to CHECK-LABEL

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213161 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[mips][fp64a] Temporarily disable odd-numbered double-precision registers when using...
Daniel Sanders [Wed, 16 Jul 2014 15:34:07 +0000 (15:34 +0000)]
[mips][fp64a] Temporarily disable odd-numbered double-precision registers when using the FP64A ABI.

A few instructions (mostly cvt.d.w and similar) are causing problems with
-mfp64 and -mno-odd-spreg and it looks like fixing it properly may
take several weeks. In the meantime, let's disable the odd-numbered
double-precision registers so that the generated code is at least valid.

The problem is that instructions like cvt.d.w read from the 32-bit low
subregister of a double-precision FPU register. This often leads to the compiler
to inserting moves to transfer a GPR32 to a FGR32 using mtc1. Such moves
violate the rules against 32-bit writes to odd-numbered FPU registers imposed
by -mno-odd-spreg. By disabling the odd-numbered double-precision registers, it
becomes impossible for the 32-bit low subregister to be odd-numbered.

This fixes numerous test-suite failures when compiling for the FP64A ABI
('-mfp64 -mno-odd-spreg'). There is no LLVM test case because it's difficult to
test that odd-numbered FPU registers are not allocatable. Instead, we depend on
the assembler (GAS and -fintegrated-as) raising errors when the rules are

Differential Revision: http://reviews.llvm.org/D4532

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213160 91177308-0d34-0410-b5e6-96231b3b80d8

7 years ago[X86] Add a check for 'isMOVHLPSMask' within method 'isShuffleMaskLegal'.
Andrea Di Biagio [Wed, 16 Jul 2014 11:29:39 +0000 (11:29 +0000)]
[X86] Add a check for 'isMOVHLPSMask' within method 'isShuffleMaskLegal'.

Before this change, method 'isShuffleMaskLegal' didn't know that shuffles
implementing a 'movhlps' operation were perfectly legal for SSE targets.

This patch adds the missing check for 'isMOVHLPSMask' inside method
'isShuffleMaskLegal' to fix the problem.

The reason why it is important to do this is because the DAGCombiner
conservatively avoids combining a pair of shuffles if the resulting shuffle
node has an illegal mask. Before this patch, shuffles with a MOVHLPS mask were
wrongly considered not to be legal. This was the root cause of some poor-code
generation bugs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213137 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agounittests: Actually test reverse iterators in Path tests
Justin Bogner [Wed, 16 Jul 2014 08:18:58 +0000 (08:18 +0000)]
unittests: Actually test reverse iterators in Path tests

This re-enables some #if 0'd code (since 2010) in the Path unittests
and makes at least a weak effort at testing sys::path's rbegin/rend.

This change was inspired by some test failures near uses of rbegin and
rend here:


The "valgrind was whining" comment looked promising in terms of a
simpler to debug case of the same errors. However, it appears that the
valgrind complaints the comment was referring to are distinct from the
ones in the frontend, since this updated test isn't complaining for me
under valgrind.

In any case, the disabled tests weren't helping anybody.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213125 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoRoundtrip the inalloca bit on allocas through bitcode
Reid Kleckner [Wed, 16 Jul 2014 01:34:27 +0000 (01:34 +0000)]
Roundtrip the inalloca bit on allocas through bitcode

This was an oversight in the original support.  As it is, I stuffed this
bit into the alignment.  The alignment is stored in log2 form, so it
doesn't need more than 5 bits, given that Value::MaximumAlignment is 1
<< 29.

Reviewers: nicholas

Differential Revision: http://reviews.llvm.org/D3943

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213118 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoFix comment in InstCombiner::visitAddrSpaceCast.
Manuel Jacob [Wed, 16 Jul 2014 01:34:21 +0000 (01:34 +0000)]
Fix comment in InstCombiner::visitAddrSpaceCast.

In the original version of the patch the behaviour was like described in
the comment.  This behaviour was changed before committing it without
updating the comment.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213117 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoPerform wildcard expansion in Process::GetArgumentVector on Windows (PR17098)
Hans Wennborg [Wed, 16 Jul 2014 00:52:11 +0000 (00:52 +0000)]
Perform wildcard expansion in Process::GetArgumentVector on Windows (PR17098)

On Windows, wildcard expansion isn't performed by the shell, but left to the
program itself. The common way to do this is to link with setargv.obj, which
performs the expansion on argc/argv before main is entered. However, we don't
use argv in Clang on Windows, but instead call GetCommandLineW so we can handle
unicode arguments. This means we have to do wildcard expansion ourselves.

A test case will be added on the Clang side.

Differential Revision: http://reviews.llvm.org/D4529

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213114 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoEmit warnings if vectorization is forced and fails.
Tyler Nowicki [Wed, 16 Jul 2014 00:36:00 +0000 (00:36 +0000)]
Emit warnings if vectorization is forced and fails.

This patch modifies the existing DiagnosticInfo system to create a generic base
class that is inherited to produce diagnostic-based warnings. This is used by
the loop vectorizer to trigger a warning when vectorization is forced and
fails. Several tests have been added to verify this behavior.

Reviewed by: Arnold Schwaighofer

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213110 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoRemove TLI from isInTailCallPosition's arguments. NFC.
Juergen Ributzka [Wed, 16 Jul 2014 00:01:22 +0000 (00:01 +0000)]
Remove TLI from isInTailCallPosition's arguments. NFC.

There is no need to pass on TLI separately to the function. As Eric pointed out
the Target Machine already provides everything we need.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213108 91177308-0d34-0410-b5e6-96231b3b80d8

7 years agoR600/SI: Allow using f32 rcp / rsq when denormals not handled.
Matt Arsenault [Tue, 15 Jul 2014 23:50:10 +0000 (23:50 +0000)]
R600/SI: Allow using f32 rcp / rsq when denormals not handled.

These are precise enough to use for OpenCL unless denormals
are handled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213107 91177308-0d34-0410-b5e6-96231b3b80d8