6 years ago[Kaleidoscope][Orc] Fix the fully_lazy Orc Kaleidoscope example.
[Kaleidoscope][Orc] Fix the fully_lazy Orc Kaleidoscope example.

r251933 changed the Orc compile callbacks API, which broke this.

6 years agoFix PR25372 - teach replaceCongruentPHIs to handle cases where SE evaluates a PHI to a SCEVConstant
Fix PR25372 - teach replaceCongruentPHIs to handle cases where SE evaluates a PHI to a SCEVConstant

Since Scalar Evolution can now create non-add rec expressions for PHI
nodes, it can also create SCEVConstant expressions. This will confuse
replaceCongruentPHIs, which previously relied on the fact that SCEV
could not produce constants in this case.

We will now replace the node with a constant in these cases - or avoid
processing the Phi in case of a type mismatch.

Reviewers: sanjoy

Subscribers: llvm-commits, majnemer

Differential Revision: http://reviews.llvm.org/D14230

6 years agoRevert "[Orc] Directly emit machine code for the x86 resolver block and trampolines."
Revert "[Orc] Directly emit machine code for the x86 resolver block and trampolines."

This reverts commit r251933.

It broke the build of examples/Kaleidoscope/Orc/fully_lazy/toy.cpp.

6 years agoKaleidoscope-ch2: Remove the dependence on LLVM by cloning make_unique into this project
Kaleidoscope-ch2: Remove the dependence on LLVM by cloning make_unique into this project

6 years ago[Orc] Directly emit machine code for the x86 resolver block and trampolines.
[Orc] Directly emit machine code for the x86 resolver block and trampolines.

Bypassing LLVM for this has a number of benefits:

1) Laziness support becomes asm-syntax agnostic (previously lazy jitting didn't
   work on Windows as the resolver block was in Darwin asm).

2) For cross-process JITs, it allows resolver blocks and trampolines to be
   emitted directly in the target process, reducing cross process traffic.

3) It should be marginally faster.

6 years agoMove metadata linking after lazy global materialization/linking.
Move metadata linking after lazy global materialization/linking.

Currently, named metadata is linked before the LazilyLinkGlobalValues
list is walked and materialized/linked. As a result, references
from DISubprogram and DIGlobalVariable metadata to yet unmaterialized
functions and variables cause them to be added to the lazy linking
list and their definitions are materialized and linked.

This makes the llvm-link -only-needed option not have the intended
effect when debug information is present, as the otherwise unneeded
functions/variables are still linked in.

Additionally, for ThinLTO I have implemented a mechanism to only link
in debug metadata needed by imported functions. Moving named metadata
linking after lazy GV linking will facilitate applying this mechanism
to the LTO and "llvm-link -only-needed" cases as well.

Reviewers: dexonsmith, tra, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14195

6 years agoPass enum instead of bool to new linkInModule call in llvm-link
Pass enum instead of bool to new linkInModule call in llvm-link

A new call I added to linkInModule from llvm-link in r251866
was still passing in a boolean for an argument that was changed to an
enum in r246561. I didn't catch this in my merge since the bool false
matched the flag value it mapped to.
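
The kind of mismatch described above can be sketched in a few lines. This is an illustration only: the enum, its enumerators, and the functions below are hypothetical stand-ins, not the actual linkInModule signature. It assumes the flag parameter is a plain unsigned bit mask, which is what lets a bool slip through silently.

```cpp
// Hypothetical flag set; names are illustrative, not the real linker API.
enum LinkFlags : unsigned {
  None = 0,
  OverrideFromSrc = 1u << 0,
  LinkOnlyNeeded = 1u << 1,
};

// Assumption: the callee takes a plain unsigned bit mask, so a bool argument
// converts implicitly and the compiler does not complain.
unsigned effectiveFlags(unsigned Flags) { return Flags; }

// Stale call site: `false` converts to 0, which happens to equal None.
unsigned legacyCall() { return effectiveFlags(false); }

// Updated call site: the intent is spelled out with the enumerator.
unsigned fixedCall() { return effectiveFlags(LinkFlags::None); }
```

Both call sites produce the same value, which is why a merge can miss the stale one. Passing `true` would have silently selected the first flag bit instead.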

6 years agoDon't assert if materializing before seeing any function bodies
Filipe Cabecinhas [Tue, 3 Nov 2015 13:48:26 +0000 (13:48 +0000)]
Don't assert if materializing before seeing any function bodies

This assert was reachable from user input. A minimized test case (no
FUNCTION_BLOCK_ID record) is attached.

Bug found with afl-fuzz

6 years agoDon't use Twine objects after their lifetimes end.
Don't use Twine objects after their lifetimes end.

No test, since it would depend on what the compiler can optimize/reuse.
My next commit made this bug visible on Linux Release compiles with some
versions of gcc.
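
As a rough illustration of this class of bug, the sketch below uses a hypothetical `LazyConcat` class that, like a Twine, stores references to its operands rather than copies. It is not LLVM's Twine; it only shows why the result must be materialized before the referenced temporaries are destroyed.

```cpp
#include <string>

// Hypothetical Twine-like helper: it keeps references, not copies.
class LazyConcat {
  const std::string &LHS;
  const std::string &RHS;

public:
  LazyConcat(const std::string &L, const std::string &R) : LHS(L), RHS(R) {}
  // Materialize the concatenation into an owning string.
  std::string str() const { return LHS + RHS; }
};

std::string makeLabel(const std::string &Base, int N) {
  // Unsafe pattern (not compiled here):
  //   LazyConcat C(Base, std::to_string(N));  // the temporary string dies here
  //   return C.str();                         // reads a dangling reference
  //
  // Safe pattern: build and consume the object within one full expression,
  // so the temporary from std::to_string is still alive when str() runs.
  return LazyConcat(Base, std::to_string(N)).str();
}
```

Whether the unsafe form visibly misbehaves depends on how the compiler reuses stack slots, which matches the note above that a test would depend on optimization choices.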

6 years agoLoopVectorizer - skip 'bitcast' between GEP and load.
LoopVectorizer - skip 'bitcast' between GEP and load.

Skipping 'bitcast' in this case allows the load to be vectorized:

  %arrayidx = getelementptr inbounds double*, double** %in, i64 %indvars.iv
  %tmp53 = bitcast double** %arrayidx to i64*
  %tmp54 = load i64, i64* %tmp53, align 8

Differential Revision: http://reviews.llvm.org/D14112

6 years ago[X86] Generate .cfi_adjust_cfa_offset correctly when pushing arguments
[X86] Generate .cfi_adjust_cfa_offset correctly when pushing arguments

When push instructions are being used to pass function arguments on
the stack, and either EH or debugging are enabled, we need to generate
.cfi_adjust_cfa_offset directives appropriately. For (synch) EH, it is
enough for the CFA offset to be correct at every call site, while
for debugging we want to be correct after every push.

Darwin does not support this well, so don't use pushes whenever it
would be required.

Differential Revision: http://reviews.llvm.org/D13767

6 years agoAVX512: add encoding tests for vmovq/d instructions.
AVX512: add encoding tests for vmovq/d instructions.

6 years agoRevert "[IndVarSimplify] Rewrite loop exit values with their initial values from loop preheader"
Revert "[IndVarSimplify] Rewrite loop exit values with their initial values from loop preheader"

Commit 251839 triggers miscompiles on some bots:


(The commit is listed in 13722, but due to an existing failure introduced in
13721 and reverted in 13723 the failure is only visible in 13723)

To verify r251839 is indeed the only change that triggered the buildbot failures
and to ensure the buildbots remain green while investigating I temporarily
revert this commit. At the current state it is unclear if this commit introduced
some miscompile or if it only exposed code to Polly that is subsequently
miscompiled by Polly.

6 years agoFix build problem introduced in r251883
Fix build problem introduced in r251883

6 years agoRegisterPressure: Improve assert message
RegisterPressure: Improve assert message

6 years agoRegisterPressure: Slightly nicer pressure diff dumping
RegisterPressure: Slightly nicer pressure diff dumping

6 years agoScheduleDAGInstrs: Remove IsPostRA flag; NFC
ScheduleDAGInstrs: Remove IsPostRA flag; NFC

ScheduleDAGInstrs doesn't behave differently before or after register
allocation. It was only used in a method of MachineSchedulerBase which
behaved differently in MachineScheduler/PostMachineScheduler. Change
this to let MachineScheduler/PostMachineScheduler just pass in a
parameter to that function.

The order of the LiveIntervals* and bool RemoveKillFlags parameters has
been switched to make out-of-tree code fail instead of unintentionally
passing a value intended for the IsPostRA flag to the
(following and default initialized) RemoveKillFlags.

Differential Revision: http://reviews.llvm.org/D14245

6 years agoDon't implicitly construct a Archive::child_iterator.
Don't implicitly construct a Archive::child_iterator.

6 years agoThis never returns end(), simplify to use Child instead of iterator. NFC.
This never returns end(), simplify to use Child instead of iterator. NFC.

6 years agollvm-pdbdump: Simplify. NFC.
llvm-pdbdump: Simplify. NFC.

6 years ago[Hexagon] Fixing mistaken case fallthrough.
[Hexagon] Fixing mistaken case fallthrough.

6 years agoRestore "Support for ThinLTO function importing and symbol linking."
Restore "Support for ThinLTO function importing and symbol linking."

This restores commit r251837, with the new library dependence added to
llvm-link/Makefile to address bot failures.

6 years agoAllow llvm-nm’s single letter command line flags to be grouped.
Allow llvm-nm’s single letter command line flags to be grouped.
This is needed if we want to replace darwin’s nm(1) with llvm-nm,
as there are many uses of grouped flags. The added test case is
one specific case that is in real use.
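
What flag grouping means can be sketched as a small argument expander. This is a hedged illustration, not how llvm-nm implements it: the function name is hypothetical, and it assumes every single-dash argument longer than two characters is a group of one-letter flags.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Expand grouped single-letter options ("-gU") into separate ones ("-g", "-U").
// Arguments starting with "--", two-character options, and non-option
// arguments are passed through unchanged.
std::vector<std::string>
expandGroupedFlags(const std::vector<std::string> &Args) {
  std::vector<std::string> Out;
  for (const std::string &Arg : Args) {
    bool IsGroup = Arg.size() > 2 && Arg[0] == '-' && Arg[1] != '-';
    if (!IsGroup) {
      Out.push_back(Arg);
      continue;
    }
    for (std::size_t I = 1; I < Arg.size(); ++I)
      Out.push_back(std::string("-") + Arg[I]);
  }
  return Out;
}
```

For example, `{"-gU", "a.o"}` expands to `{"-g", "-U", "a.o"}`. A real tool also has to distinguish groups from multi-letter single-dash options, which this sketch ignores.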


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251864 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoAMDGPU: Stop assuming vreg for build_vector
AMDGPU: Stop assuming vreg for build_vector

This was causing a variety of test failures when v2i64
is added as a legal type.

SIFixSGPRCopies should correctly handle the case of vector inputs
to a scalar reg_sequence, so this isn't necessary anymore. This
was hiding some deficiencies in how reg_sequence is handled later,
but this shouldn't be a problem anymore since the register class
copy of a reg_sequence is now done before the reg_sequence.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251860 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[WebAssembly] Make WebAssemblyCodeGen depend on WebAssemblyAsmPrinter
[WebAssembly] Make WebAssemblyCodeGen depend on WebAssemblyAsmPrinter

6 years agoAMDGPU: Error on graphics shaders with HSA
AMDGPU: Error on graphics shaders with HSA

I've found myself pointlessly debugging problems from running
graphics tests with an HSA triple a few times, so stop this from
happening again.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251858 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[CGP] widen switch condition and case constants to target's register width (2nd try)
[CGP] widen switch condition and case constants to target's register width (2nd try)

This is a redo of r251849 except the tests have been split into arch-specific folders
to hopefully make the bots happy.

This is a follow-up from the discussion in D12965. The block-at-a-time limitation of
SelectionDAG also came up in D13297.

Without the InstCombine change from D12965, I don't expect this patch to make any
difference in the real world because InstCombine does not shrink cases like this in
visitSwitchInst(). But we need to have this CGP safety harness in place before
proceeding with any shrinkage in D12965, so we won't generate extra extends for compares.

I've opted for IR regression tests in the patch because that seems like a clearer way to
test the transform, but PowerPC CodeGen for an i16 widening test is shown below. x86
will need more work to solve: https://llvm.org/bugs/show_bug.cgi?id=22473

  mr 4, 3
  extsh. 3, 4
  ble 0, .LBB0_5
  cmpwi  3, 99
  bgt    0, .LBB0_9
  rlwinm 4, 4, 0, 16, 31      <--- 32-bit mask/extend
  li 3, 0
  cmplwi         4, 1
  beqlr 0
  cmplwi         4, 10
  bne    0, .LBB0_12
  li 3, 1
  rlwinm 3, 4, 0, 16, 31      <--- 32-bit mask/extend
  cmplwi         3, 65436
  beq    0, .LBB0_13
  cmplwi         3, 65526
  beq    0, .LBB0_15
  cmplwi         3, 65535
  bne    0, .LBB0_12
  li 3, 4
  rlwinm 3, 4, 0, 16, 31      <--- 32-bit mask/extend
  cmplwi         3, 100
  beq    0, .LBB0_14

  rlwinm 4, 3, 0, 16, 31      <--- mask/extend to 32-bit and then use that for comparisons
  cmpwi  4, 999
  ble 0, .LBB0_5
  lis 3, 0
  ori 3, 3, 65525
  cmpw   4, 3
  bgt    0, .LBB0_9
  cmplwi         4, 1000
  beq    0, .LBB0_14
  cmplwi         4, 65436
  bne    0, .LBB0_13
  li 3, 6
  li 3, 0
  cmplwi         4, 1
  beqlr 0
  cmplwi         4, 10
  beq    0, .LBB0_12
  cmplwi         4, 100
  bne    0, .LBB0_13
  li 3, 2
  cmplwi         4, 65526
  beq    0, .LBB0_15
  cmplwi         4, 65535
  bne    0, .LBB0_13

Differential Revision: http://reviews.llvm.org/D13532

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251857 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoAMDGPU: Un XFAIL a test
Matt Arsenault [Mon, 2 Nov 2015 23:15:46 +0000 (23:15 +0000)]

This should probably be merged with one of the other private memory
tests, but it fails on r600.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251856 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoAMDGPU: Distribute SGPR->VGPR copies of REG_SEQUENCE
Matt Arsenault [Mon, 2 Nov 2015 23:15:42 +0000 (23:15 +0000)]
AMDGPU: Distribute SGPR->VGPR copies of REG_SEQUENCE

Make the REG_SEQUENCE be a VGPR, and do the register class
copy first.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251855 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoFix the build I just broke
Fix the build I just broke

6 years agoOrc: Drop some else-after-return, reflow a few spots, and avoid use of pointee types
Orc: Drop some else-after-return, reflow a few spots, and avoid use of pointee types

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251853 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[SimplifyLibCalls] Remove variables that are not used. NFC.
[SimplifyLibCalls] Remove variables that are not used. NFC.

6 years agorevert r251849; need to move tests to arch-specific folders
revert r251849; need to move tests to arch-specific folders

6 years agoAdd a flag vectorizer-maximize-bandwidth in loop vectorizer to enable using larger...
Add a flag vectorizer-maximize-bandwidth in loop vectorizer to enable using larger vectorization factor.

To be able to maximize the bandwidth during vectorization, this patch provides a new flag vectorizer-maximize-bandwidth. When it is turned on, the vectorizer will determine the vectorization factor (VF) using the smallest instead of widest type in the loop. To avoid increasing register pressure too much, estimates of the register usage for different VFs are calculated so that we only choose a VF when its register usage doesn't exceed the number of available registers.
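
The selection rule described above can be sketched roughly as follows. All names are hypothetical and the register estimate is supplied by the caller; the real cost model in the vectorizer is more involved, so treat this as a simplified reading of the description rather than the patch's algorithm.

```cpp
#include <cassert>
#include <functional>

// Start from the VF implied by the smallest type in the loop, then halve it
// until the estimated register usage fits the available registers.
unsigned chooseVF(unsigned VectorRegBits, unsigned SmallestTypeBits,
                  unsigned NumAvailableRegs,
                  const std::function<unsigned(unsigned)> &EstimateRegUsage) {
  assert(SmallestTypeBits != 0 && SmallestTypeBits <= VectorRegBits);
  unsigned VF = VectorRegBits / SmallestTypeBits;
  while (VF > 1 && EstimateRegUsage(VF) > NumAvailableRegs)
    VF /= 2;
  return VF; // VF == 1 means no vectorization at this width
}
```

With 128-bit registers and an 8-bit smallest type the starting point is VF=16. If the estimate says that needs more registers than are available, the sketch falls back to 8, then 4, and so on.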

This is the second attempt to submit this patch. The first attempt got a test failure on ARM. This patch is updated to try to fix the failure (more specifically, by handling the case when VF=1).

Differential revision: http://reviews.llvm.org/D8943

6 years ago[CGP] widen switch condition and case constants to target's register width
[CGP] widen switch condition and case constants to target's register width

This is a follow-up from the discussion in D12965. The block-at-a-time limitation of
SelectionDAG also came up in D13297.

Without the InstCombine change from D12965, I don't expect this patch to make any
difference in the real world because InstCombine does not shrink cases like this in
visitSwitchInst(). But we need to have this CGP safety harness in place before
proceeding with any shrinkage in D12965, so we won't generate extra extends for compares.

I've opted for IR regression tests in the patch because that seems like a clearer way to
test the transform, but PowerPC CodeGen for an i16 widening test is shown below. x86
will need more work to solve: https://llvm.org/bugs/show_bug.cgi?id=22473

  mr 4, 3
  extsh. 3, 4
  ble 0, .LBB0_5
  cmpwi  3, 99
  bgt  0, .LBB0_9
  rlwinm 4, 4, 0, 16, 31      <--- 32-bit mask/extend
  li 3, 0
  cmplwi  4, 1
  beqlr 0
  cmplwi  4, 10
  bne  0, .LBB0_12
  li 3, 1
  rlwinm 3, 4, 0, 16, 31      <--- 32-bit mask/extend
  cmplwi  3, 65436
  beq  0, .LBB0_13
  cmplwi  3, 65526
  beq  0, .LBB0_15
  cmplwi  3, 65535
  bne  0, .LBB0_12
  li 3, 4
  rlwinm 3, 4, 0, 16, 31      <--- 32-bit mask/extend
  cmplwi  3, 100
  beq  0, .LBB0_14

  rlwinm 4, 3, 0, 16, 31      <--- mask/extend to 32-bit and then use that for comparisons
  cmpwi  4, 999
  ble 0, .LBB0_5
  lis 3, 0
  ori 3, 3, 65525
  cmpw  4, 3
  bgt  0, .LBB0_9
  cmplwi  4, 1000
  beq  0, .LBB0_14
  cmplwi  4, 65436
  bne  0, .LBB0_13
  li 3, 6
  li 3, 0
  cmplwi  4, 1
  beqlr 0
  cmplwi  4, 10
  beq  0, .LBB0_12
  cmplwi  4, 100
  bne  0, .LBB0_13
  li 3, 2
  cmplwi  4, 65526
  beq  0, .LBB0_15
  cmplwi  4, 65535
  bne  0, .LBB0_13

Differential Revision: http://reviews.llvm.org/D13532

6 years ago[PPC64LE] Properly initialize instr-info in PPCVSXSwapRemoval pass
[PPC64LE] Properly initialize instr-info in PPCVSXSwapRemoval pass

Replace some hacky code with the proper way to get at this data.

No functional change.

6 years agodon't repeat function names in comments; NFC
don't repeat function names in comments; NFC

6 years ago[SimplifyLibCalls] Merge two if statements. NFC.
[SimplifyLibCalls] Merge two if statements. NFC.

6 years agoRevert "Support for ThinLTO function importing and symbol linking."
Revert "Support for ThinLTO function importing and symbol linking."

This reverts commit r251837, due to a number of bot failures of the form:

loadIndex(llvm::LLVMContext&, llvm::Module const*): error: undefined
reference to
llvm::LLVMContext&, llvm::Module const*, bool)'
loadIndex(llvm::LLVMContext&, llvm::Module const*): error: undefined
reference to 'llvm::object::FunctionIndexObjectFile::takeIndex()'

I'm not sure why these are happening - I added Object to the required
libraries in tools/llvm-link/LLVMBuild.txt and the LLVM_LINK_COMPONENTS
in tools/llvm-link/CMakeLists.txt. Confirmed for my build that these
symbols come out of libLLVMObject.a. What am I missing?

6 years ago[IndVarSimplify] Rewrite loop exit values with their initial values from loop preheader
[IndVarSimplify] Rewrite loop exit values with their initial values from loop preheader

This patch adds support to check if a loop has loop invariant conditions which lead to loop exits. If so, we know that if the exit path is taken, it is at the first loop iteration. If there is an induction variable used in that exit path whose value has not been updated, it will keep the initial value passed in from the loop preheader. We can therefore rewrite the exit value with its initial value. This will help remove phis created by LCSSA and enable other optimizations like loop unswitch.
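
A source-level view of the idea, as a sketch only: the pass works on LLVM IR, and the two functions below are hypothetical. The exit is guarded by a condition that does not change inside the loop, so if it is ever taken, it is taken before the induction variable has been incremented.

```cpp
// Before: the early exit returns the induction variable.
int scan(int Start, int N, bool StopEarly) {
  for (int I = Start; I < N; ++I) {
    if (StopEarly) // loop-invariant exit condition
      return I;    // only reachable on the first iteration
  }
  return -1;
}

// After: on that exit path I still equals Start, so the exit value can be
// rewritten to the value coming from the preheader.
int scanRewritten(int Start, int N, bool StopEarly) {
  for (int I = Start; I < N; ++I) {
    if (StopEarly)
      return Start;
  }
  return -1;
}
```

The two functions agree for every input, and the second no longer needs the induction variable on the exit path.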

Reviewers: sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13974

6 years agoSupport for ThinLTO function importing and symbol linking.
Support for ThinLTO function importing and symbol linking.

Support for necessary linkage changes and symbol renaming during
ThinLTO function importing.

Also includes llvm-link support for manually importing functions
and associated llvm-link based tests.

Note that this does not include support for intelligently importing
metadata, which is currently imported duplicate times. That support will
be in the follow-on patch, and currently is ignored by the tests.

Reviewers: dexonsmith, joker.eph, davidxl

Subscribers: tobiasvk, tejohnson, llvm-commits

Differential Revision: http://reviews.llvm.org/D13515

6 years agoMachO: support tvOS and watchOS version min commands in llvm-objdump
MachO: support tvOS and watchOS version min commands in llvm-objdump

6 years agoIn MachineBlockPlacement, filter cold blocks off the loop chain when profile data...
In MachineBlockPlacement, filter cold blocks off the loop chain when profile data is available.

In the current BB placement algorithm, a loop chain always contains all loop blocks. This has a drawback that cold blocks in the loop may be inserted on a hot function path, hence increasing branch cost and also reducing icache locality.

Consider a simple example shown below:


When B->C is quite cold, the best BB-layout should be A,B,D,C. But the current implementation produces A,C,B,D.

This patch filters those cold blocks off from the loop chain by comparing the ratio:

LoopBBFreq / LoopFreq

to 20%: if it is less than 20%, we don't include this BB in the loop chain. Here LoopFreq is the frequency of the loop when we reduce the loop into a single node. In general we have more cold blocks when the loop has few iterations, and vice versa.
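
The ratio test can be written without floating point. The sketch below assumes frequencies are plain unsigned integers and uses hypothetical names. Only the 20% threshold comes from the description above.

```cpp
#include <cassert>
#include <cstdint>

// Threshold named in the commit message.
constexpr std::uint64_t ColdRatioPercent = 20;

// Returns true if the block should stay in the loop chain, i.e. if
// LoopBBFreq / LoopFreq is at least 20%.
// Assumption: the products below do not overflow 64 bits.
bool keepInLoopChain(std::uint64_t LoopBBFreq, std::uint64_t LoopFreq) {
  assert(LoopFreq != 0 && "loop frequency must be non-zero");
  return LoopBBFreq * 100 >= ColdRatioPercent * LoopFreq;
}
```

For a loop with frequency 100, a block with frequency 19 is filtered off the chain, while a block with frequency 20 is kept.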

Differential revision: http://reviews.llvm.org/D11662

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251833 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[Support] Assert that reported key+data lengths match reality
[Support] Assert that reported key+data lengths match reality

This found a bug in Clang's PTH implementation.

6 years agoFix use-after-free in function index merging code.
Fix use-after-free in function index merging code.

This was flagged by ASAN when using a test case I will be committing
along with D13515.

6 years agoRevert parts accidentally included in r251823
Revert parts accidentally included in r251823

6 years agoStringRef-ify DiagnosticInfoSampleProfile::Filename
StringRef-ify DiagnosticInfoSampleProfile::Filename

6 years agoELF can handle some relocations of the form -sym + constant.
ELF can handle some relocations of the form -sym + constant.

Remove code that was assuming that this would never work.

Thanks to Colin LeMahie for finding and diagnosing the bug.

6 years agoConvert tabs to spaces.
Convert tabs to spaces.

6 years agoFix two issues in MergeConsecutiveStores:
Fix two issues in MergeConsecutiveStores:

1) PR25154. This is basically a repeat of PR18102, which was fixed in
r200201, and broken again by r234430. The latter changed which of the
store nodes was merged into from the first to the last. Thus, we now
also need to prefer merging a later store at a given address into the
target node, instead of an earlier one.

2) While investigating that, I also realized I'd introduced a bug in
r236850. There, I removed a check for alignment -- not realizing that
nothing except the alignment check was ensuring that none of the stores
were overlapping! This is a really bogus way to ensure there's no
aliased stores.

A better solution to both of these issues is likely to always use the
code added in the 'if (UseAA)' branches which rearrange the chain based
on a more principled analysis. I'll look into whether that can be used
always, but in the interest of getting things back to working, I think a
minimal change makes sense.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251816 91177308-0d34-0410-b5e6-96231b3b80d8

Tim Northover [Mon, 2 Nov 2015 18:33:35 +0000 (18:33 +0000)]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251815 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoWatchOS: update default CPU for triple after t2dsp -> dsp rename
WatchOS: update default CPU for triple after t2dsp -> dsp rename

6 years agoClang format a few prior patches (NFC)
Clang format a few prior patches (NFC)

I had clang formatted my earlier patches using the wrong style.
Reformatted with the LLVM style.

6 years agoTvOS: add missing support for some libcalls.
TvOS: add missing support for some libcalls.

6 years agoPreserve load alignment and dereferenceable metadata during some transformations
Artur Pilipenko [Mon, 2 Nov 2015 17:53:51 +0000 (17:53 +0000)]
Reviewed By: hfinkel

Differential Revision: http://reviews.llvm.org/D13953

6 years agolit: Add '-a' option to display commands+output of all tests
lit: Add '-a' option to display commands+output of all tests

The existing -v option only displays commands and outputs for failed
tests, the newly introduced -a displays it for all executed tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251806 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoAdd missing override statements in ScalarEvolution.h. NFC
Add missing override statements in ScalarEvolution.h. NFC

6 years agoUse static instead of anonymous namespace for helper functions. NFC.
Pawel Bylica [Mon, 2 Nov 2015 14:57:24 +0000 (14:57 +0000)]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251801 91177308-0d34-0410-b5e6-96231b3b80d8

Silviu Baranga [Mon, 2 Nov 2015 14:41:02 +0000 (14:41 +0000)]
[SCEV][LV] Add SCEV Predicates and use them to re-implement stride versioning

SCEV Predicates represent conditions that typically cannot be derived from
static analysis, but can be used to reduce SCEV expressions to forms which are
usable for different optimizers.

ScalarEvolution now has the rewriteUsingPredicate method which can simplify a
SCEV expression using a SCEVPredicateSet. The normal workflow of a pass using
SCEVPredicates would be to hold a SCEVPredicateSet and every time assumptions
need to be made a new SCEV Predicate would be created and added to the set.
Each time after calling getSCEV, the user will call the rewriteUsingPredicate
method.

We add two types of predicates:
SCEVPredicateSet - implements a set of predicates
SCEVEqualPredicate - tests for equality between two SCEV expressions

We use the SCEVEqualPredicate to re-implement stride versioning. Every time we
version a stride, we will add a SCEVEqualPredicate to the context.
Instead of adding specific stride checks, LoopVectorize now adds a more
generic SCEV check.

We only need to add support for this in the LoopVectorizer since this is the
only pass that will do stride versioning.
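
Stride versioning can be pictured at the source level as a runtime check that selects between two copies of a loop. This sketch is illustrative only: it does not use the SCEV API, and the function name is hypothetical. The `Stride == 1` check plays the role of an equality predicate that the fast version is allowed to assume.

```cpp
#include <cstddef>
#include <vector>

// Sums N elements of A taken Stride apart. The caller must ensure that
// (N - 1) * Stride is a valid index when N > 0.
long sumStrided(const std::vector<long> &A, std::size_t N,
                std::size_t Stride) {
  long Sum = 0;
  if (Stride == 1) {
    // Versioned loop: under the predicate Stride == 1 the accesses are
    // contiguous, which is the form a vectorizer can handle.
    for (std::size_t I = 0; I < N; ++I)
      Sum += A[I];
  } else {
    // Fallback loop: original code with the symbolic stride.
    for (std::size_t I = 0; I < N; ++I)
      Sum += A[I * Stride];
  }
  return Sum;
}
```

Both versions compute the same result. The check only decides which copy runs.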

Reviewers: mzolotukhin, anemet, hfinkel, sanjoy

Subscribers: sanjoy, hfinkel, rengolin, jmolloy, llvm-commits

Differential Revision: http://reviews.llvm.org/D13595

6 years agoFix for bootstrap bug introduced in r244921
Fix for bootstrap bug introduced in r244921

This revision introduced an issue that only affects a bootstrapped compiler
when it is printing the ASM. It turns out that the new code path taken due to
legalizing a scalar_to_vector of i64 -> v2i64 exposes a missing check in a
micro optimization to change a load followed by a scalar_to_vector into a
load and splat instruction on PPC.

6 years agoThis doesn't need a object::Archive::child_iterator.
This doesn't need a object::Archive::child_iterator.

6 years agoAvoid implicitly constructing a Archive::child_iterator.
Avoid implicitly constructing a Archive::child_iterator.

6 years ago[PatternMatch] Switch to use ValueTracking::matchSelectPattern
[PatternMatch] Switch to use ValueTracking::matchSelectPattern

Instead of rolling our own min/max matching code (which is notoriously
hard to get completely right), use ValueTracking's instead.

6 years ago[Support] Extend sys::path with user_cache_directory function.
[Support] Extend sys::path with user_cache_directory function.

The new function sys::path::user_cache_directory tries to discover
a directory suitable for cache storage for the current system user.

On Windows and Darwin it returns the path to the system-specific user cache directory.

On Linux it follows the XDG Base Directory Specification, that is:
- use non-empty $XDG_CACHE_HOME env var,
- use $HOME/.cache.
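
The Linux lookup order listed above can be sketched as a small helper. This is not the sys::path implementation: the function name and signature are hypothetical, the environment values are passed in as arguments so the logic is easy to exercise, and treating an empty or missing HOME as failure is an assumption.

```cpp
#include <string>

// Fills Result with the cache directory and returns true on success.
// XdgCacheHome and Home correspond to $XDG_CACHE_HOME and $HOME and may be
// null when the variable is unset.
bool xdgCacheDirectory(const char *XdgCacheHome, const char *Home,
                       std::string &Result) {
  // 1. A non-empty $XDG_CACHE_HOME wins.
  if (XdgCacheHome && *XdgCacheHome) {
    Result = XdgCacheHome;
    return true;
  }
  // 2. Otherwise fall back to $HOME/.cache.
  if (Home && *Home) {
    Result = std::string(Home) + "/.cache";
    return true;
  }
  return false; // Assumption: no usable directory without HOME.
}
```

A caller would typically pass `std::getenv("XDG_CACHE_HOME")` and `std::getenv("HOME")` from `<cstdlib>`.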

Reviewers: chapuni, aaron.ballman, rafael

Subscribers: rafael, aaron.ballman, llvm-commits

Differential Revision: http://reviews.llvm.org/D13801

6 years agoAVX512: Implemented encoding and intrinsics for VBROADCASTI32x2 and VBROADCASTF32x2...
AVX512: Implemented encoding and intrinsics for VBROADCASTI32x2 and VBROADCASTF32x2 instructions.

Differential Revision: http://reviews.llvm.org/D14216

6 years ago[X86] Remove assertions that check for valid scale values on scatter/gather intrinsic...
[X86] Remove assertions that check for valid scale values on scatter/gather intrinsics. Nothing upstream prevented illegal values from getting here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251780 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[X86] Don't pass a scale value of 0 to scatter/gather intrinsics. This causes the...
Craig Topper [Mon, 2 Nov 2015 07:24:37 +0000 (07:24 +0000)]
[X86] Don't pass a scale value of 0 to scatter/gather intrinsics. This causes the code emitter to throw an assertion if we try to encode it. Need to add a check to fail isel for this, but for now avoid testing it.

6 years ago[X86] Fold 'if' followed by just an llvm_unreachable into an assert.
[X86] Fold 'if' followed by just an llvm_unreachable into an assert.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251778 91177308-0d34-0410-b5e6-96231b3b80d8

Craig Topper [Mon, 2 Nov 2015 07:24:32 +0000 (07:24 +0000)]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251777 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[X86] Remove some llvm_unreachables after switches that already have an unreachable...
[X86] Remove some llvm_unreachables after switches that already have an unreachable in their default case.

6 years ago[X86] Remove a 'break' after an llvm_unreachable.
Craig Topper [Mon, 2 Nov 2015 07:24:27 +0000 (07:24 +0000)]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251775 91177308-0d34-0410-b5e6-96231b3b80d8

Craig Topper [Mon, 2 Nov 2015 07:24:25 +0000 (07:24 +0000)]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251774 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoFix a -Wpessimizing-move warning.
Fix a -Wpessimizing-move warning.

6 years ago[X86] Use MVT instead of EVT when the type is known to be simple. NFC
Craig Topper [Mon, 2 Nov 2015 05:24:22 +0000 (05:24 +0000)]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251772 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[PGO] Value profiling (index format) code cleanup and testing
[PGO] Value profiling (index format) code cleanup and testing

 1. Added a set of public interfaces in InstrProfRecord
    class to access (read/write) value profile data.
 2. Changed IndexedProfile reader and writer code to
    use the newly defined interfaces and hide implementation
 3. Added a couple of unittests for value profiling:
   - Test new interfaces to get and set value profile data
   - Test value profile data merging with various scenarios.

 No functional change is expected. The new interfaces will also
 make it possible to change on-disk format of value prof data
 to be more compact (to be submitted).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251771 91177308-0d34-0410-b5e6-96231b3b80d8

6 years ago[SCEV] Fix PR25369
[SCEV] Fix PR25369

Have `getConstantEvolutionLoopExitValue` work correctly with multiple
entry loops.

As far as I can tell, `getConstantEvolutionLoopExitValue` never did the
right thing for multiple entry loops; and before r249712 it would
silently return an incorrect answer.  r249712 changed SCEV to fail an
assert on a multiple entry loop, and this change fixes the underlying

6 years agoUntabify.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251769 91177308-0d34-0410-b5e6-96231b3b80d8

Davide Italiano [Sun, 1 Nov 2015 17:00:13 +0000 (17:00 +0000)]
The latter might go away (anytime soon).

6 years agoAVX-512: Optimized SIMD truncate operations for AVX512F set.
AVX-512: Optimized SIMD truncate operations for AVX512F set.
Optimized <8 x i32> to <8 x i16>
<4 x i64> to <4 x i32>
<16 x i16> to <16 x i8>
All these operations now use the AVX512F set (KNL). Before this change they were implemented with the AVX2 set.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251764 91177308-0d34-0410-b5e6-96231b3b80d8

6 years agoRuntimeDyld: add COFF i386 support
RuntimeDyld: add COFF i386 support

This adds support for COFF I386.  This is sufficient for code execution in a
32-bit JIT, though imported symbols need to be custom lowered for the redirection.

6 years agoMake a few definitions static. NFC.
Make a few definitions static. NFC.

6 years agoUse Child instead of child_iterator in the archive writer.
Use Child instead of child_iterator in the archive writer.

We never need to pass end(). This will also remove some complication
once we start adding error checking.

6 years agoSimplify a check. NFC.
Simplify a check. NFC.

6 years ago[SCEV] Don't create SCEV expressions that break LCSSA
[SCEV] Don't create SCEV expressions that break LCSSA

Prevent `createNodeFromSelectLikePHI` from creating SCEV expressions
that break LCSSA.

A better fix for the same issue is to teach SCEVExpander to not break
LCSSA by inserting PHI nodes at appropriate places.  That's planned for
the future.

Fixes PR25360.

6 years ago[SCEV] Use auto and range for; NFC
[SCEV] Use auto and range for; NFC

6 years ago[SimplifyLibCalls] Factor out other common code.
[SimplifyLibCalls] Factor out other common code.

6 years agoThis can take a const reference. NFC.
This can take a const reference. NFC.

6 years agoSamplePGO - Count sample records in embedded profiles when computing coverage.
SamplePGO - Count sample records in embedded profiles when computing coverage.

The initial coverage checking code for sample records failed to count
records inside inlined profiles. This change fixes the oversight.
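
The oversight and its fix can be illustrated with a recursive count over a nested profile. The data structure below is a hypothetical simplification, not the sample-profile classes. It only shows that records attached to inlined callsites must be included in the total.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical profile node: its own sample records plus the profiles of
// callees that were inlined into it.
struct FunctionProfile {
  std::size_t NumBodySamples = 0;
  std::vector<FunctionProfile> InlinedCallsites;
};

// Count records in this profile and, recursively, in every inlined profile.
std::size_t countRecords(const FunctionProfile &P) {
  std::size_t Total = P.NumBodySamples;
  for (const FunctionProfile &Callee : P.InlinedCallsites)
    Total += countRecords(Callee);
  return Total;
}
```

Counting only `NumBodySamples` of the outermost profile would understate the total whenever inlined callsites carry records of their own.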

6 years ago[X86] Replace getScalarType with getVectorElementType when the type is already known...
[X86] Replace getScalarType with getVectorElementType when the type is already known to be a vector. This should result in slightly less code. NFC

6 years agoDon't store a Child to the first regular member.
Rafael Espindola [Sat, 31 Oct 2015 21:44:42 +0000 (21:44 +0000)]
This is a bit ugly, but has a few advantages:
* Archive is now easy to copy since there is no Archive -> Child -> Archive
* It makes it clear that we already checked for errors when finding the Child

6 years agoDelete dead code.
Delete dead code.

6 years agoSimplify handling of archive Symbol tables.
Simplify handling of archive Symbol tables.

We only need to store a StringRef.

6 years ago[SimplifyLibCalls] Add test to ensure transform is not executed if fast-math
[SimplifyLibCalls] Add test to ensure transform is not executed if fast-math
attribute is not present.

During my refactor in r251595 I changed the behavior of optimizeSqrt(),
skipping the transformation if the function wasn't marked with unsafe-fp-math
attribute. This fixed a bug, as confirmed by Sanjay (before the optimization
was silently executed anyway), although it wasn't my primary aim.
This commit adds a test to ensure the code doesn't break again.

Reported by: Marcello Maggioni
Discussed with: Sanjay Patel

6 years agoSimplify the handling of the archive string table.
Simplify the handling of the archive string table.

We only need to store a StringRef

6 years ago[X86] Convert to MVT instead of calling EVT functions since we already know the type...
[X86] Convert to MVT instead of calling EVT functions since we already know the type is simple. NFC

6 years ago[X86] Call getScalarSizeInBits() instead of getScalarType().getScalarSizeInBits(...
[X86] Call getScalarSizeInBits() instead of getScalarType().getScalarSizeInBits(). NFC

6 years ago[X86] Remove two const references to the return value of a constructor and just use...
[X86] Remove two const references to the return value of a constructor and just use normal object creation syntax. NFC

6 years ago[X86] Replace EVT with MVT in some more places. NFC
[X86] Replace EVT with MVT in some more places. NFC

6 years ago[X86] Fix indentation of case statements in switch. NFC
[X86] Fix indentation of case statements in switch. NFC

6 years ago[X86] Reduce math for index calculation for inserting and extracting subvectors and...
[X86] Reduce math for index calculation for inserting and extracting subvectors and elements by exploiting the fact that all supported vector types have a power 2 number of elements.
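
The arithmetic shortcut mentioned above can be sketched with two helpers. The names are hypothetical and the code is not taken from the X86 backend. It relies only on the stated fact that the element count per chunk is a power of two.

```cpp
#include <cassert>

// True if N is a non-zero power of two.
static bool isPowerOfTwo(unsigned N) { return N != 0 && (N & (N - 1)) == 0; }

// Index of the first element of the chunk containing Idx.
// Equivalent to (Idx / ElemsPerChunk) * ElemsPerChunk.
unsigned chunkStart(unsigned Idx, unsigned ElemsPerChunk) {
  assert(isPowerOfTwo(ElemsPerChunk));
  return Idx & ~(ElemsPerChunk - 1);
}

// Position of Idx within its chunk. Equivalent to Idx % ElemsPerChunk.
unsigned indexInChunk(unsigned Idx, unsigned ElemsPerChunk) {
  assert(isPowerOfTwo(ElemsPerChunk));
  return Idx & (ElemsPerChunk - 1);
}
```

With four elements per chunk, index 13 belongs to the chunk starting at 12 and sits at position 1 within it, matching the division and modulo forms.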

