Since BSD cmp(1) does not support the --ignore-initial option, use the
more portable 3rd and 4th arguments to skip the first 16 bytes during
the comparison of Phase2 and Phase3 objects.
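
For illustration only: `cmp file1 file2 16 16` starts the comparison 16 bytes
into each file. A rough C++ sketch of the same check on in-memory buffers
follows; the helper name equalAfterSkip is hypothetical and not part of the
release script.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if the two buffers are identical once the
// first `skip` bytes of each are ignored, mirroring `cmp a b skip skip`.
bool equalAfterSkip(const std::string &a, const std::string &b,
                    std::size_t skip) {
  // Clamp the offsets so a buffer shorter than `skip` compares as empty.
  std::size_t offA = std::min(skip, a.size());
  std::size_t offB = std::min(skip, b.size());
  return a.compare(offA, std::string::npos, b, offB, std::string::npos) == 0;
}
```

Two objects that differ only in their first 16 bytes compare equal with
skip = 16, which is the property the Phase2/Phase3 comparison relies on.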

Add 'const' to a few more functions in MachineFrameInfo

Reviewer: Eric Christopher

Fix comment typo (test commit). NFC

Simplify now that we can iterate backwards. NFC.

[ARM] Refactor the prologue/epilogue emission to be more robust.

This is the first step toward supporting shrink-wrapping for this target.

The changes could be summarized by these items:
- Expand the tail-call return as part of the expand pseudo pass.
- Get rid of the assumptions that the epilogue is the exit block:
  * Do not assume which registers are free in the epilogue. (This indirectly
    improves the lowering of the code for the segmented stacks, see the test
  * Take into account that the basic block can be empty.

Related to <rdar://problem/20821730>

[NVPTX] make load on global readonly memory to use ldg

As described in [1], ld.global.nc may be used to load memory by nvcc when
__restrict__ is used and the compiler can detect whether read-only data cache
is safe to use.

This patch will try to check whether ldg is safe to use and use it to
replace ld.global when possible. This change can improve the performance
by 18~29% on affected kernels (ratt*_kernel and rwdot*_kernel) in
S3D benchmark of shoc [2].

Patched by Xuetian Weng.

[1] http://docs.nvidia.com/cuda/kepler-tuning-guide/#read-only-data-cache
[2] https://github.com/vetter/shoc

Test Plan: test/CodeGen/NVPTX/load-with-non-coherent-cache.ll

Reviewers: jholewinski, jingyue

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D11314

Simplify iterating over the dynamic section and report broken ones.

[Hexagon] Generate MUX from conditional transfers when dot-new not possible

Suppress two warnings from MSVC 2015 that are triggered under /W4. Since we turn off exceptions in the code base, C4577 is moot. C4091 triggers on system headers and is a benign construct.

MIR Serialization: Initial serialization of machine constant pools.

This commit implements the initial serialization of machine constant pools and
the constant pool index machine operands. The constant pool is serialized using
a YAML sequence of YAML mappings that represent the constant values.
The target-specific constant pool items aren't serialized by this commit.

Reviewers: Duncan P. N. Exon Smith

test-release.sh: don't include /usr/local prefix in the tarball

[CMake] Cleanup tools/CMakeLists.txt to take advantage of the auto-registration that was already partially working.

Re-landing r242059 which re-landed r241621... I'm really bad at this.

Summary (r242059):
This change re-lands r241621, with an additional fix that was required to allow tool sources to live outside the llvm checkout. It also no longer renames LLVM_EXTERNAL_*_SOURCE_DIR. This change was reverted in r241663, because it renamed several variables of the format LLVM_EXTERNAL_*_* to LLVM_TOOL_*_*.

Summary (r241621):
The tools CMakeLists file already had implicit tool registration, but there were a few things off about it that needed to be altered to make it work. This change addresses all that. The changes in this patch are:

* factored out canonicalizing tool names from paths to CMake variables
* removed the LLVM_IMPLICIT_PROJECT_IGNORE mechanism in favor of LLVM_EXTERNAL_${nameUPPER}_BUILD which I renamed to LLVM_TOOL_${nameUPPER}_BUILD because it applies to internal and external tools
* removed ignore_llvm_tool_subdirectory() in favor of just setting LLVM_TOOL_${nameUPPER}_BUILD to Off
* Added create_llvm_tool_options() to resolve a bug in add_llvm_external_project() - the old LLVM_EXTERNAL_${nameUPPER}_BUILD would not work on a clean CMake directory because the option could be created after it was set in code.
* Removed all but the minimum required calls to add_llvm_external_project from tools/CMakeLists.txt

Differential Revision: http://reviews.llvm.org/D10665

[ImplicitNullChecks] Work with implicit defs.

This change generalizes the implicit null checks pass to work with
instructions that don't have any explicit register defs.  This lets us
use X86's `cmp` against memory as faulting load instructions.

Reviewers: reames, JosephTremoulet

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11286

MIR Parser: Add support for quoted named global value operands.

This commit extends the machine instruction lexer and implements support for
the quoted global value tokens. With this change the syntax for the global value
identifier tokens becomes identical to the syntax for the global identifier
tokens from the LLVM's assembly language.

Reviewers: Duncan P. N. Exon Smith

Remove Elf_Rela_Iter and Elf_Rel_Iter.

Use just the pointers and check for invalid relocation sections.

[lit] Implement 'env' in the internal shell

The MSys 2 version of 'env' cannot be used to set 'TZ' in the
environment due to some portability hacks in the process spawning
compatibility layer[1]. This affects test/Object/archive-toc.test, which
tries to set TZ in the environment.

Other than that, this saves a subprocess invocation of a small unix
utility, which makes the tests faster.

The internal shell does not support shell variable expansion, so this
idiom in the ASan tests isn't supported yet:

[1] https://github.com/Alexpux/MSYS2-packages/issues/294

Differential Revision: http://reviews.llvm.org/D11350

[AArch64] Change EON pattern to match more often.

Phabricator: http://reviews.llvm.org/D11359
Patch by Geoff Berry <gberry@codeaurora.org>

Miscellaneous Fixes for SparseBitVector


1. Fix return value in `SparseBitVector::operator&=`.
2. Add checks for whether the SBV being assigned is the invoking SBV.
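
A minimal sketch of the two points above, using a hypothetical TinyBitSet
rather than the real SparseBitVector. It assumes `operator&=` is meant to
report whether the set changed, and that the added checks guard against an
object being combined with or assigned to itself.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for a sparse bit vector: stores indices of set bits.
class TinyBitSet {
  std::set<unsigned> Bits;

public:
  TinyBitSet() = default;
  TinyBitSet(const TinyBitSet &) = default;

  void set(unsigned Idx) { Bits.insert(Idx); }
  bool test(unsigned Idx) const { return Bits.count(Idx) != 0; }

  // Copy assignment with an explicit self-assignment check: assigning an
  // object to itself leaves it untouched.
  TinyBitSet &operator=(const TinyBitSet &RHS) {
    if (this == &RHS)
      return *this;
    Bits = RHS.Bits;
    return *this;
  }

  // Intersect with RHS. Returns true only if this set actually changed.
  bool operator&=(const TinyBitSet &RHS) {
    if (this == &RHS)
      return false; // x &= x never changes x.
    bool Changed = false;
    for (auto I = Bits.begin(); I != Bits.end();) {
      if (!RHS.test(*I)) {
        I = Bits.erase(I); // erase returns the next valid iterator
        Changed = true;
      } else {
        ++I;
      }
    }
    return Changed;
  }
};
```

Callers that iterate to a fixed point depend on the returned flag being
accurate, so a wrong return value can either loop forever or stop too early.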

Reviewers: dberlin

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11342

Committed on behalf of sl@

Add missing test for r242296 (vec_sld)

Report errors on invalid virtual addresses.

Remove unnecessary code.

We were locating the dynamic string table via both the section and segment

AMDGPU/SI: Add VI patterns to select FLAT instructions for global memory ops

The MUBUF addr64 bit has been removed on VI, so we must use FLAT
instructions when the pointer is stored in VGPRs.

Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11067

Simplify the search for which segment has a virtual address. NFC.

Simplify iterating over program headers and detect corrupt ones.

We now use a simple pointer and have range loops.

[mips] Added support for the ERETNC instruction.

Summary: This required adding the instruction predicate HasMips32r5.

Patch by Scott Egerton.

Reviewers: dsanders, vkalintiris

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11136

llvm-readobj: Handle invalid references to the string table.

Move CHECKs closer to the RUN line.

llvm-readobj: call exit(1) on error.

llvm-readobj exists for testing llvm. We can safely stop the program
the first time we know the input is corrupted.

This is in preparation for making it handle a few more broken files.

Refactor duplicated code. NFC.

Revert "MergeFuncs: Transfer the function parameter attributes to the call site"

It is okay to not transfer parameter attributes.

This reverts commit r242558.

[X86][SSE] Tidied up vector CTLZ/CTTZ. NFCI.

Narrow Callee scope, suggestion from David Blaikie.

[X86][SSE] Reordered cast vectorization costs. NFCI.

Reordered the data tables at the top and placed the lookups after. The first stage in the yak shaving necessary to get more accurate costs for a variety of targets given the recent improvements to SINT_TO_FP/UINT_TO_FP/SIGN_EXTEND vector lowering.

De-duplicate CS.getCalledFunction() expression.

Not sure if the optimizer will save the call as getCalledFunction()
is not a trivial access function but the code is clearer this way.

[DAGCombiner] Fixed minor typo that was missed in D9097.

We don't bitcast the UNDEFs - that is done in visitVECTOR_SHUFFLE, and the getValueType should come from the operand's SDValue not the SDNode.

[X86] Add support for tbyte memory operand size for Intel-syntax x86 assembly

Differential Revision: http://reviews.llvm.org/D11257
Patch by: marina.yatsina@intel.com

Remove TargetInstrInfo::canFoldMemoryOperand

canFoldMemoryOperand is not actually used anywhere in the codebase - all existing users instead call foldMemoryOperand directly when they wish to fold and can correctly deduce what they need from the return value.

This patch removes the canFoldMemoryOperand base function and the target implementations; only x86 had a real (bit-rotted) implementation, although AMDGPU had a preparatory stub that had never needed to be completed.

Differential Revision: http://reviews.llvm.org/D11331

AVX-512: Floating point conversions for SKX - DAG Lowering.
SKX supports conversion for all FP types. Integer types include doublewords and quadwords.
I added "Legal" status for these nodes and a bunch of tests.
I added "NoVLX" for AVX DAG selection to force VLX instructions selection when VLX is supported.

Differential Revision: http://reviews.llvm.org/D11255

Use SDValue bool check. NFCI.

[LIT] Allow for executeCommand to take the stdin input.

Summary: This patch allows executeCommand to pass a string to the process's stdin.

Reviewers: ddunbar, jroelofs

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11332

[X86][SSE] Updated SHL/LSHR i64 vectorization costs.

This was missed in D8416.

[AggressiveAntiDepBreaker] Use range loops for multimap access.

No functionality change intended.

Rangify for loops in GlobalDCE, NFC.

[Hexagon] Use composition instead of inheritance from STL types

The standard containers are not designed to be inherited from, as
illustrated by the MSVC hacks for NodeOrdering. No functional change
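
As a rough illustration of the pattern (the class and member names here are
hypothetical and are not the Hexagon code), a wrapper can hold the container
as a private member and expose only the operations callers need, instead of
deriving from std::map:

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical example. Instead of `struct NodeOrder : std::map<int, unsigned>`,
// keep the map private and forward the few operations that are used.
class NodeOrder {
  std::map<int, unsigned> Map;

public:
  // Record the position of a node, overwriting any previous entry.
  void insert(int Node, unsigned Pos) { Map[Node] = Pos; }

  // Look up a node's position; the node must have been inserted.
  unsigned operator[](int Node) const {
    auto It = Map.find(Node);
    assert(It != Map.end() && "node has no recorded order");
    return It->second;
  }

  std::size_t size() const { return Map.size(); }
};
```

The wrapper keeps the public interface small and avoids relying on the
standard container as a base class, which it was not designed to be.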

[X86][SSE] Added additional fp/int tests.

Demonstrates some shortfalls in subvector(cvt(x)) compared to cvt(subvector(x)) patterns - especially on AVX/AVX2 targets.

Refreshed tests.

Refreshed tests and reordered in descending integer size.

Tidyup shufflevector calls - don't repeat inputs if you can avoid it.

[PM/AA] Remove the addEscapingUse update API that won't be easy to
directly model in the new PM.

This also was an incredibly brittle and expensive update API that was
never fully utilized by all the passes that claimed to preserve AA, nor
could it reasonably have been extended to all of them. Any number of
places add uses of values. If we ever wanted to reliably instrument
this, we would want a callback hook much like we have with ValueHandles,
but doing this for every use addition seems *extremely* expensive in
terms of compile time.

The only user of this update mechanism is GlobalsModRef. The idea of
using this to keep it up to date doesn't really work anyways as its
analysis requires a symmetric analysis of two different memory
locations. It would be very hard to make updates be sufficiently
rigorous to *guarantee* symmetric analysis in this way, and it pretty
certainly isn't true today.

However, folks have been using GMR with this update for a long time and
seem to not be hitting the issues. The reported issue that the update
hook fixes isn't even a problem any more as other changes to
GetUnderlyingObject worked around it, and that issue stemmed from *many*
years ago. As a consequence, a prior patch provided a flag to control
the unsafe behavior of GMR, and this patch removes the update mechanism
that has questionable compile-time tradeoffs and is causing problems
with moving to the new pass manager. Note the lack of test updates --
not one test in tree actually requires this update, even for a contrived

All of this was extensively discussed on the dev list, this patch will
just enact what that discussion decides on. I'm sending it for review in
part to show what I'm planning, and in part to show the *amazing* amount
of work this avoids. Every call to the AA here is something like three
to six indirect function calls, which in the non-LTO pipeline never do
any work! =[

Differential Revision: http://reviews.llvm.org/D11214

[libFuzzer] require the files and directories passed to the fuzzer to exist

[asan] Fix shadow mapping on Android/AArch64.

Instrumentation and the runtime library were in disagreement about
ASan shadow offset on Android/AArch64.

This fixes a large number of existing tests on Android/AArch64.

ARM: Enable MachineScheduler and disable PostRAScheduler for swift.

Reapply r242500 now that the swift schedmodel includes LDRLIT.

This is mostly done to disable the PostRAScheduler which optimizes for
instruction latencies which isn't a good fit for out-of-order
architectures. This also allows leaving out the itinerary table in
swift in favor of the SchedModel ones.

This change leads to performance improvements/regressions by as much as
10% in some benchmarks; in fact we lose 0.4% performance over the
llvm-testsuite for reasons that appear to be unknown or out of the
compiler's control. rdar://20803802 documents the investigation of
these effects.

While it is probably a good idea to perform the same switch for the
other ARM out-of-order CPUs, I limited this change to swift as I cannot
perform the benchmark verification on the other CPUs.

Differential Revision: http://reviews.llvm.org/D10513

ARM: Add scheduling information for LDRLIT instructions to swift scheduling model

These pseudo instructions are only lowered after register allocation and
are therefore still present when the machine scheduler runs.
Add a run: line to a testcase that uses the uncommon flags necessary to
actually produce a LDRLIT instruction on swift.

[RAGreedy] Add an experimental deferred spilling feature.

The idea of deferred spilling is to delay the insertion of spill code until the
very end of the allocation. A variable that is a "candidate" for spilling might
not need to be spilled after all, because of other evictions that happened after
this decision was taken. The spirit is similar to the optimistic coloring
strategy implemented in the Preston and Briggs graph coloring algorithm.

For now, this feature is highly experimental. Although correct, it would require
much more modification to properly model the effect of spilling.

Anyway, this early patch helps prototyping this feature.

Note: The test case cannot unfortunately be reduced and is probably fragile.

MIR Parser: Allow the dollar characters in all of the identifier tokens.

This commit modifies the machine instruction lexer so that it now accepts the
'$' characters in identifier tokens.

This change makes the syntax for unquoted global value tokens consistent with
the syntax for the global identifier tokens in the LLVM's assembly language.
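
A small sketch of what accepting '$' means for a lexer. The helper names are
hypothetical, and the character set is taken from the LLVM IR identifier
pattern [-a-zA-Z$._][-a-zA-Z$._0-9]*; whether the MIR lexer matches it exactly
is an assumption here.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: is C allowed in an unquoted identifier? Digits are
// accepted everywhere except in the first position.
bool isIdentifierChar(char C, bool IsFirst) {
  unsigned char U = static_cast<unsigned char>(C);
  if (std::isalpha(U) || C == '-' || C == '$' || C == '.' || C == '_')
    return true;
  return !IsFirst && std::isdigit(U);
}

// Returns true if the whole string is a valid unquoted identifier name.
bool isUnquotedIdentifier(const std::string &Name) {
  if (Name.empty())
    return false;
  for (std::string::size_type I = 0; I < Name.size(); ++I)
    if (!isIdentifierChar(Name[I], I == 0))
      return false;
  return true;
}
```

Names that fail this check would still need the quoted form.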

AsmParser: Add a function to parse a standalone constant value.

This commit extends the interface provided by the AsmParser library by adding a
function that allows the user to parse a standalone constant value.

This change is useful for MIR serialization, as it will allow the MIR Parser to
parse the constant values in a machine constant pool.

Reviewers: Duncan P. N. Exon Smith

Differential Revision: http://reviews.llvm.org/D10280

[asan] Add a comment explaining why non-instrumented allocas are moved.

Addition to r242510.

MergeFuncs: Transfer the function parameter attributes to the call site


Start adding documentation for llvm-lib.

Revert "ARM: Enable MachineScheduler and disable PostRAScheduler for swift."

This reverts commit r242500.

It broke some internal tests and Matthias asked me to revert it while he
is investigating.

Use llvm_unreachable() instead of report_fatal_error() if the machine model is incomplete

This error is for developers only so it makes sense to abort and get a

[OCaml] Do not use -warn-error in tests.

This -warn-error flag invariably gets into release tarballs
and breaks builds on distributions that run tests as a part
of release process. The OCaml binding tests are especially
critical, since they often expose lingering toolchain bugs,
and so it is replaced with -w +A (equivalent to -Wall).

[ARM] Use [SU]ABSDIFF nodes instead of intrinsics for VABD/VABA

No functional change, but it preps codegen for the future when SABSDIFF
will start getting generated in anger.

[AArch64] Use [SU]ABSDIFF nodes instead of intrinsics for ABD/ABA

No functional change, but it preps codegen for the future when SABSDIFF
will start getting generated in anger.

Add libunwind to the release scripts

Use inbounds GEPs for memcpy and memset lowering

Follow-up on discussion in http://reviews.llvm.org/D11220

Add support for producing thin archives in llvm-lib.

I will send an entry in docs/CommandGuide for review today.

Edited the CPUNames table of TargetParser
- Changed the default FPU of cortex-m4.
- Removed "cortex-m4f" entry. Currently not supported.

Change-Id: I73121e358aa9e7ba68eb001c2143df390ff2352a
Phabricator: http://reviews.llvm.org/D11100

Make global aliases have symbol size equal to their type

This is mainly for the benefit of GlobalMerge, so that an alias into a
MergedGlobals variable has the same size as the original non-merged

Differential Revision: http://reviews.llvm.org/D10837

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242520 91177308-0d34-0410-b5e6-96231b3b80d8

test-release.sh: Add ability to do a test build using the trunk or branches.

Adds '--svn-path BRANCH' that causes the script to export the specified path
from each project. Otherwise the tag specified by -release, -rc, etc. will be
used. The version portion of the package name will be 'test-$path' (any forward
slashes in the branch name are replaced with underscores), for example:
  -svn-path trunk => clang+llvm-test-trunk-mips-linux-gnu.tar.xz
  -svn-path branches/release_35 => clang+llvm-test-branches_release_35-mips-linux-gnu.tar.xz

This is primarily useful for bringing new release packages up to standard
without needing to create and maintain a tag for the purpose.

Reviewers: tstellarAMD, hans

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6563

[PM/AA] Disable the core unsafe aspect of GlobalsModRef in the face of
basic changes to the IR such as folding pointers through PHIs, Selects,
integer casts, store/load pairs, or outlining.

This leaves the feature available behind a flag. This flag's default
could be flipped if necessary, but the real-world performance impact of
this particular feature of GMR may not be sufficiently significant for
many folks to want to run the risk.

Currently, the risk here is somewhat mitigated by half-hearted attempts
to update GlobalsModRef when the rest of the optimizer changes
something. However, I am currently trying to remove that update
mechanism as it makes migrating the AA infrastructure to a form that can
be readily shared between new and old pass managers very challenging.
Without this update mechanism, it is possible that this still unlikely
failure mode will start to trip people, and so I wanted to try to
proactively avoid that.

There is a lengthy discussion on the mailing list about why the core
approach here is flawed, and likely would need to look totally different
to be both reasonably effective and resilient to basic IR changes
occurring. This patch is essentially the first of two which will enact
the result of that discussion. The next patch will remove the current
update mechanism.

Thanks to lots of folks that helped look at this from different angles.
Especial thanks to Michael Zolotukhin for doing some very preliminary
benchmarking of LTO without GlobalsModRef to get a rough idea of the
impact we could be facing here. So far, it looks very small, but there
are some concerns lingering from other benchmarking. The default here
may get flipped if performance results end up pointing at this as a more
significant issue.

Also thanks to Pete and Gerolf for reviewing!

Differential Revision: http://reviews.llvm.org/D11213

[OCaml] Use a nicer style for documentation than OCaml default.

In particular, it's much easier to read, as it doesn't expand all
the way on wide-screen displays.

CSS committed under LLVM license with explicit permission from
Daniel Bünzli <daniel.buenzli@erratique.ch>.

[asan] Fix invalid debug info for promotable allocas

Since r230724 ("Skip promotable allocas to improve performance at -O0"), there is a regression in the generated debug info for those non-instrumented variables. When inspecting such a variable's value in LLDB, you often get garbage instead of the actual value. ASan instrumentation is inserted before the creation of the non-instrumented alloca. The only allocas that are considered standard stack variables are the ones declared in the first basic-block, but the initial instrumentation setup in the function breaks that invariant.

This patch makes sure uninstrumented allocas stay in the first BB.

Differential Revision: http://reviews.llvm.org/D11179

[llvm-cxxdump] Don't rely on global state

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242509 91177308-0d34-0410-b5e6-96231b3b80d8

Tim Northover [Fri, 17 Jul 2015 03:31:50 +0000 (03:31 +0000)]
AArch64: add comment missed out from earlier patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242503 91177308-0d34-0410-b5e6-96231b3b80d8

Matthias Braun [Fri, 17 Jul 2015 01:44:31 +0000 (01:44 +0000)]
ARM: Enable MachineScheduler and disable PostRAScheduler for swift.

This is mostly done to disable the PostRAScheduler which optimizes for
instruction latencies which isn't a good fit for out-of-order
architectures. This also allows leaving out the itinerary table in
swift in favor of the SchedModel ones.

This change leads to performance improvements/regressions by as much as
10% in some benchmarks; in fact we lose 0.4% performance over the
llvm-testsuite for reasons that appear to be unknown or out of the
compiler's control. rdar://20803802 documents the investigation of
these effects.

While it is probably a good idea to perform the same switch for the
other ARM out-of-order CPUs, I limited this change to swift as I cannot
perform the benchmark verification on the other CPUs.

Differential Revision: http://reviews.llvm.org/D10513

Only do fmul (fadd x, x), c combine if the fadd only has one use

This was increasing the instruction count if the fadd has multiple uses.
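
To make the instruction-count argument concrete, here is a sketch that assumes
the combine rewrites (x + x) * c as x * (2 * c). With a single use, the fadd
disappears and one fmul remains. With several uses, the fadd must stay for
the other users, so the rewrite removes nothing and the function names below
are purely illustrative.

```cpp
#include <cassert>

// Original form: one fadd feeding one fmul.
double mulOfDouble(double X, double C) { return (X + X) * C; }

// Combined form: the fadd is folded into the constant, leaving a single fmul
// once 2.0 * C has been constant-folded.
double mulByScaledConst(double X, double C) { return X * (2.0 * C); }
```

Both forms compute the same value for ordinary inputs, since doubling a
floating-point number is exact barring overflow.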

Use small encodings for constants when possible.

MIR Serialization: Serialize the frame setup machine instruction flag.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242491 91177308-0d34-0410-b5e6-96231b3b80d8

Alex Lorenz [Thu, 16 Jul 2015 23:37:45 +0000 (23:37 +0000)]
MIR Serialization: Serialize the frame index machine operands.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242487 91177308-0d34-0410-b5e6-96231b3b80d8

Cong Hou [Thu, 16 Jul 2015 23:23:35 +0000 (23:23 +0000)]
Those new constructors make it more natural to construct an object for a function. For example, previously to build a LoopInfo for a function, we need four statements:

DominatorTree DT;
LoopInfo LI;

Now we only need one statement:

LoopInfo LI(DominatorTree(F));


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242486 91177308-0d34-0410-b5e6-96231b3b80d8

Matthias Braun [Thu, 16 Jul 2015 22:34:20 +0000 (22:34 +0000)]
Arm: Don't define a label twice with two setjmps in a function.

Constructing a name based on the function name didn't give us a unique
symbol if we had more than one setjmp in a function. Using
MCContext::createTempSymbol() always gives us a unique name.
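
The difference can be sketched with a hypothetical miniature context (this is
not the MCContext API, and the name prefixes are made up): a name derived only
from the function repeats for the second setjmp, while a counter-based
temporary name is fresh on every request.

```cpp
#include <cassert>
#include <string>

// Hypothetical miniature of a symbol context.
class SymbolContext {
  unsigned NextTempId = 0;

public:
  // Name based on the function alone: identical for every call site in it.
  static std::string nameFromFunction(const std::string &Func) {
    return "setjmp_" + Func;
  }

  // Counter-based temporary name: unique per call on this context.
  std::string createTempName() { return "tmp" + std::to_string(NextTempId++); }
};
```

Two setjmp calls in the same function get the same label under the first
scheme and distinct labels under the second.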

Differential Revision: http://reviews.llvm.org/D9314

Fix __builtin_setjmp in combination with sjlj exception handling.

llvm.eh.sjlj.setjmp was used as part of the SjLj exception handling
style but is also used in clang to implement __builtin_setjmp.  The ARM
backend needs to output additional dispatch tables for the SjLj
exception handling style, these tables however can't be emitted if
llvm.eh.sjlj.setjmp is simply used for __builtin_setjmp and no actual
landing pad blocks exist.

To solve this issue a new llvm.eh.sjlj.setup_dispatch intrinsic is
introduced which is used instead of llvm.eh.sjlj.setjmp in the SjLj
exception handling lowering, so we can differentiate between the case
where we actually need to setup a dispatch table and the case where we
just need the __builtin_setjmp semantic.

Differential Revision: http://reviews.llvm.org/D9313

Fix ffiInvoke() use of DataLayout, broken in 242414

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242456 91177308-0d34-0410-b5e6-96231b3b80d8

Sanjoy Das [Thu, 16 Jul 2015 22:08:37 +0000 (22:08 +0000)]
[SCEV][NFC] Use triple-slash (///) for comment.

Makes the comments for proveNoWrapByVaryingStart consistent with the
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242451 91177308-0d34-0410-b5e6-96231b3b80d8

Simon Pilgrim [Thu, 16 Jul 2015 21:44:53 +0000 (21:44 +0000)]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242448 91177308-0d34-0410-b5e6-96231b3b80d8

Tim Northover [Thu, 16 Jul 2015 21:30:21 +0000 (21:30 +0000)]
AArch64: make inexact signalling on round Darwin-specific

C11 leaves the choice on whether round-to-integer operations set the inexact
flag implementation-defined. Darwin does expect it to be set, but this seems to
be against the intent of the IEEE document and slower to implement anyway. So
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242446 91177308-0d34-0410-b5e6-96231b3b80d8

Simon Pilgrim [Thu, 16 Jul 2015 21:14:26 +0000 (21:14 +0000)]
[X86][SSE] Added nounwind attribute to vector shift tests.

Stop i686 codegen from generating cfi directives.

[PowerPC] v4i32 is a VSRCRegClass

I was looking at some vector code generation and kept seeing
unnecessary vector copies into the Altivec half of the VSX registers.
I discovered that we overlooked v4i32 when adding the register classes
for VSX; we only added v4f32 and v2f64.  This means that anything that
canonicalizes into v4i32 (which is a LOT of stuff) ends up being
forced into VRRC on its way to VSRC.

The fix is one line.  The rest of the patch is fixing up some test
cases whose code generation has changed as a result.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242442 91177308-0d34-0410-b5e6-96231b3b80d8

Philip Reames [Thu, 16 Jul 2015 21:10:46 +0000 (21:10 +0000)]
List supported architectures for StackMap section and related intrinsics

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242441 91177308-0d34-0410-b5e6-96231b3b80d8

Simon Pilgrim [Thu, 16 Jul 2015 21:00:57 +0000 (21:00 +0000)]
[X86][SSE] Updated vector conversion test names.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242440 91177308-0d34-0410-b5e6-96231b3b80d8

Eli Bendersky [Thu, 16 Jul 2015 20:42:38 +0000 (20:42 +0000)]
Streamline the coding style in NVPTXLowerAggrCopies

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242439 91177308-0d34-0410-b5e6-96231b3b80d8

Matthias Braun [Thu, 16 Jul 2015 20:27:01 +0000 (20:27 +0000)]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242438 91177308-0d34-0410-b5e6-96231b3b80d8

Jingyue Wu [Thu, 16 Jul 2015 20:13:48 +0000 (20:13 +0000)]
[NVPTX] enable SpeculativeExecution in NVPTX

SpeculativeExecution enables a series of straight-line optimizations (such
as SLSR and NaryReassociate) on conditional code. For example,

  if (...)
    ... b * s ...
  if (...)
    ... (b + 1) * s ...

speculative execution can hoist b * s and (b + 1) * s from then-blocks,
so that we have

  ... b * s ...
  if (...)
    ...
  ... (b + 1) * s ...
  if (...)
    ...

Then, SLSR can rewrite (b + 1) * s to (b * s + s) because after
speculative execution b * s dominates (b + 1) * s.
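
The effect can be written out as a source-level sketch. The function names and
the way the results are consumed are made up for illustration; the real
transformation happens on IR, not on C++.

```cpp
#include <cassert>

// Before: each product is computed inside its own conditional block.
int before(int B, int S, bool P, bool Q) {
  int Sum = 0;
  if (P)
    Sum += B * S;
  if (Q)
    Sum += (B + 1) * S;
  return Sum;
}

// After speculative execution and SLSR: B * S is hoisted above the first
// branch, so (B + 1) * S can be rewritten as an addition on top of it.
int after(int B, int S, bool P, bool Q) {
  int Sum = 0;
  int BS = B * S; // hoisted out of the first then-block
  if (P)
    Sum += BS;
  int B1S = BS + S; // (B + 1) * S rewritten as B * S + S
  if (Q)
    Sum += B1S;
  return Sum;
}
```

The second multiplication is replaced by an add, which is only possible
because the hoisted B * S is now available on every path.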

The performance impact of this change is significant. It speeds up the
benchmarks running EigenFloatContractionKernelInternal16x16
by roughly 2%. Some internal benchmarks that have the above code pattern
are improved by up to 40%. No significant slowdowns are observed on
Eigen CUDA microbenchmarks.

Reviewers: jholewinski, broune, eliben

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D11201

AArch64: Implement conditional compare sequence matching.

This is a new iteration of the reverted r238793 /
http://reviews.llvm.org/D8232 which wrongly assumed that any and/or
trees can be represented by conditional compare sequences, however there
are some restrictions to that. This version fixes this and adds comments
that explain exactly what types of and/or trees can actually be
implemented as conditional compare sequences.

Related to http://llvm.org/PR20927, rdar://18326194

Differential Revision: http://reviews.llvm.org/D10579

AMDGPU/SI: Negative offsets aren't allowed in MUBUF's vaddr operand

Reviewers: arsenm

Subscribers: llvm-commits

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242434 91177308-0d34-0410-b5e6-96231b3b80d8

Tom Stellard [Thu, 16 Jul 2015 19:40:07 +0000 (19:40 +0000)]
AMDGPU/SI: Use AssertZext node to mask high bit for scratch offsets

We can safely assume that the high bit of scratch offsets will never
be set, because this would require at least 128 GB of GPU memory.

Reviewers: arsenm

Subscribers: llvm-commits

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242433 91177308-0d34-0410-b5e6-96231b3b80d8

Matthias Braun [Thu, 16 Jul 2015 18:55:35 +0000 (18:55 +0000)]
LiveInterval: Document and enforce rules about empty subranges.

Empty subranges are not allowed in a LiveInterval and must be removed
instead: Check this in the verifiers, put a reminder for this in the
comment of the shrinkToUses variant for a single lane and make it
automatic for the shrinkToUses variant for a LiveInterval.

Do not duplicate method name in comment, remove duplicate comment

Delete an unused function.

Patch by Xan López!

