Change memcpy/memset/memmove to have dest and source alignments.
author Pete Cooper <peter_cooper@apple.com>
Wed, 18 Nov 2015 22:17:24 +0000 (22:17 +0000)
committer Pete Cooper <peter_cooper@apple.com>
Wed, 18 Nov 2015 22:17:24 +0000 (22:17 +0000)
commit 8b170f7f290843dc3849eaa75b6f74a87a7a2de6
tree 698943dc126191d3c6399f9ee17050d56d3aea6b
parent c4bfc2d61dff2bb6b904d12b756f5b2ebda4ea35

Note: this was reviewed in, and more details are available at, http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

These intrinsics currently have an explicit alignment argument which is
required to be a constant integer.  It represents the alignment of the
source and dest, and so must be the minimum of those.

This change allows source and dest to each have their own alignments
by using the alignment attribute on their arguments.  The alignment
argument itself is removed.

A few places in the code still need to be checked by an expert to
confirm that using only the source or only the dest alignment is safe.
Until then, those places take the minimum of the src/dest alignments,
which matches the current behaviour.

For example, code which used to read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false)
will now read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false)
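
The relationship between the old single alignment and the new per-pointer alignments can be sketched in a few lines of C++. This is only an illustration: MemTransferAlign, legacyAlignment and fromLegacy are hypothetical names and are not part of this commit.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical record of the two alignments carried by one memcpy/memmove call.
struct MemTransferAlign {
  unsigned Dest;
  unsigned Src;
};

// The old single alignment argument had to be valid for both pointers,
// so it could be no larger than the minimum of the two alignments.
unsigned legacyAlignment(const MemTransferAlign &A) {
  return std::min(A.Dest, A.Src);
}

// When an old-style call is rewritten, the shared alignment is all that is
// known, so both pointers receive it (i32 8 becomes align 8 on each).
MemTransferAlign fromLegacy(unsigned Align) {
  MemTransferAlign A;
  A.Dest = Align;
  A.Src = Align;
  return A;
}
```

Code that is not yet known to be safe with separate alignments can keep calling something like legacyAlignment, which is the behaviour described above for the places awaiting expert review.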

For owners of out-of-tree code, I was able to strip the alignment from calls using sed by replacing:
  (call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\)
with:
  $1i1 false)

and similarly for memmove and memcpy.

I then added back in alignment to test cases which needed it.
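
For trees where a C++ helper is more convenient than sed, the same substitution can be sketched with std::regex. This is a minimal sketch, not a tool shipped with this commit: stripAlignArg is a hypothetical name, the pattern is the one quoted above with the intrinsic name made a parameter, and like the original it only matches calls ending in "i1 false".

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: removes the trailing "i32 <align>, " argument from one
// line of textual IR that calls the named intrinsic (memset, memcpy, memmove).
// Lines that do not match the pattern are returned unchanged.
std::string stripAlignArg(const std::string &Line,
                          const std::string &Intrinsic) {
  assert(Intrinsic == "memset" || Intrinsic == "memcpy" ||
         Intrinsic == "memmove");
  // Group 1 captures everything up to the alignment argument. The greedy
  // ".*" ensures it is the last i32 operand (the alignment) that is dropped.
  const std::regex Pattern("(call.*llvm\\." + Intrinsic +
                           ".*)i32 [0-9]*, i1 false\\)");
  // "$1" re-emits group 1, followed by the unchanged isVolatile operand.
  return std::regex_replace(Line, Pattern, "$1i1 false)");
}
```

The result carries no alignment at all, so any test that depends on a particular alignment still needs the align attributes added back by hand.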

A similar commit will be made to clang, where many calls do end up with differing alignments,
since IRBuilder can now generate different source and dest alignments on calls.

In IRBuilder itself, a new argument was added.  Instead of calling:
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false)
you now call:
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false)

A temporary class (IntegerAlignment) is used as the type of the source alignment parameter; it
rejects implicit conversion from bool.  This prevents the isVolatile argument of an old-style
call from being silently accepted as the source alignment.
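
One way such a guard can be written is sketched below. This is an assumption about the technique, not the code in IRBuilder.h: IntegerAlignmentSketch and sourceAlignOfCopy are hypothetical names, and the real IntegerAlignment class may be implemented differently.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for the temporary IntegerAlignment class.
class IntegerAlignmentSketch {
  unsigned Value;

public:
  // Accept any integral type except bool. A bool argument matches no
  // constructor, so a call that passes isVolatile in the source alignment
  // position fails to compile instead of converting false/true to 0/1.
  template <typename T,
            typename = typename std::enable_if<
                std::is_integral<T>::value &&
                !std::is_same<T, bool>::value>::type>
  IntegerAlignmentSketch(T V) : Value(static_cast<unsigned>(V)) {}

  unsigned value() const { return Value; }
};

// Hypothetical function using the same parameter order as the new
// CreateMemCpy: dest alignment, source alignment, then isVolatile.
unsigned sourceAlignOfCopy(unsigned DstAlign, IntegerAlignmentSketch SrcAlign,
                           bool IsVolatile = false) {
  (void)DstAlign;
  (void)IsVolatile;
  return SrcAlign.value();
}
```

With this shape, sourceAlignOfCopy(8, 4) compiles, while an old-style sourceAlignOfCopy(8, false) is rejected at compile time, which is the failure mode a migration wants.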

Note: this makes future changes to codegen possible.  I didn't change anything there, but this
change should enable better memcpy code sequences.

Reviewed by Hal Finkel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253511 91177308-0d34-0410-b5e6-96231b3b80d8
294 files changed:
include/llvm/IR/IRBuilder.h
include/llvm/IR/Instructions.h
include/llvm/IR/IntrinsicInst.h
include/llvm/IR/Intrinsics.td
lib/Analysis/Lint.cpp
lib/CodeGen/CodeGenPrepare.cpp
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
lib/IR/Attributes.cpp
lib/IR/AutoUpgrade.cpp
lib/IR/IRBuilder.cpp
lib/IR/Verifier.cpp
lib/Target/AArch64/AArch64FastISel.cpp
lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
lib/Target/ARM/ARMFastISel.cpp
lib/Target/Mips/MipsFastISel.cpp
lib/Target/X86/X86FastISel.cpp
lib/Transforms/InstCombine/InstCombineCalls.cpp
lib/Transforms/InstCombine/InstCombineInternal.h
lib/Transforms/Instrumentation/DataFlowSanitizer.cpp
lib/Transforms/Instrumentation/MemorySanitizer.cpp
lib/Transforms/Scalar/AlignmentFromAssumptions.cpp
lib/Transforms/Scalar/DeadStoreElimination.cpp
lib/Transforms/Scalar/LoopIdiomRecognize.cpp
lib/Transforms/Scalar/MemCpyOptimizer.cpp
lib/Transforms/Scalar/SROA.cpp
lib/Transforms/Scalar/ScalarReplAggregates.cpp
lib/Transforms/Utils/InlineFunction.cpp
lib/Transforms/Utils/SimplifyLibCalls.cpp
test/Analysis/BasicAA/assume.ll
test/Analysis/BasicAA/cs-cs.ll
test/Analysis/BasicAA/getmodrefinfo-cs-cs.ll
test/Analysis/BasicAA/modref.ll
test/Analysis/CallGraph/no-intrinsics.ll
test/Analysis/DependenceAnalysis/Preliminary.ll
test/Analysis/GlobalsModRef/pr12351.ll
test/Analysis/GlobalsModRef/volatile-instrs.ll
test/Analysis/ScalarEvolution/avoid-smax-1.ll
test/Analysis/ScalarEvolution/trip-count.ll
test/Analysis/ScalarEvolution/trip-count3.ll
test/Analysis/TypeBasedAliasAnalysis/functionattrs.ll
test/Analysis/TypeBasedAliasAnalysis/memcpyopt.ll
test/Bitcode/memintrinsics.3.7.ll [new file with mode: 0644]
test/Bitcode/memintrinsics.3.7.ll.bc [new file with mode: 0644]
test/Bitcode/standardCIntrinsic.3.2.ll
test/CodeGen/AArch64/PBQP-csr.ll
test/CodeGen/AArch64/aarch64-deferred-spilling.ll
test/CodeGen/AArch64/arm64-2012-05-07-MemcpyAlignBug.ll
test/CodeGen/AArch64/arm64-abi-varargs.ll
test/CodeGen/AArch64/arm64-abi_align.ll
test/CodeGen/AArch64/arm64-fast-isel-intrinsic.ll
test/CodeGen/AArch64/arm64-memcpy-inline.ll
test/CodeGen/AArch64/arm64-memset-inline.ll
test/CodeGen/AArch64/arm64-memset-to-bzero.ll
test/CodeGen/AArch64/arm64-misaligned-memcpy-inline.ll
test/CodeGen/AArch64/arm64-misched-basic-A53.ll
test/CodeGen/AArch64/arm64-misched-basic-A57.ll
test/CodeGen/AArch64/arm64-stur.ll
test/CodeGen/AArch64/arm64-virtual_base.ll
test/CodeGen/AArch64/fast-isel-memcpy.ll
test/CodeGen/AArch64/func-argpassing.ll
test/CodeGen/AArch64/memcpy-f128.ll
test/CodeGen/AArch64/tailcall-mem-intrinsics.ll
test/CodeGen/AMDGPU/llvm.memcpy.ll
test/CodeGen/ARM/2009-03-07-SpillerBug.ll
test/CodeGen/ARM/2011-03-10-DAGCombineCrash.ll
test/CodeGen/ARM/2011-10-26-memset-inline.ll
test/CodeGen/ARM/2011-10-26-memset-with-neon.ll
test/CodeGen/ARM/2012-04-24-SplitEHCriticalEdge.ll
test/CodeGen/ARM/Windows/memset.ll
test/CodeGen/ARM/Windows/no-aeabi.ll
test/CodeGen/ARM/crash-O0.ll
test/CodeGen/ARM/debug-info-blocks.ll
test/CodeGen/ARM/dyn-stackalloc.ll
test/CodeGen/ARM/fast-isel-intrinsic.ll
test/CodeGen/ARM/machine-cse-cmp.ll
test/CodeGen/ARM/memcpy-inline.ll
test/CodeGen/ARM/memfunc.ll
test/CodeGen/ARM/memset-inline.ll
test/CodeGen/ARM/stack-protector-bmovpcb_call.ll
test/CodeGen/ARM/struct-byval-frame-index.ll
test/CodeGen/BPF/byval.ll
test/CodeGen/BPF/ex1.ll
test/CodeGen/BPF/sanity.ll
test/CodeGen/Generic/ForceStackAlign.ll
test/CodeGen/Generic/invalid-memcpy.ll
test/CodeGen/Hexagon/mem-fi-add.ll
test/CodeGen/Hexagon/tail-call-mem-intrinsics.ll
test/CodeGen/MSP430/memset.ll
test/CodeGen/Mips/2012-12-12-ExpandMemcpy.ll
test/CodeGen/Mips/Fast-ISel/memtest1.ll
test/CodeGen/Mips/biggot.ll
test/CodeGen/Mips/cconv/arguments-small-structures-bigger-than-32bits.ll
test/CodeGen/Mips/cconv/arguments-varargs-small-structs-byte.ll
test/CodeGen/Mips/cconv/arguments-varargs-small-structs-combinations.ll
test/CodeGen/Mips/cconv/return-struct.ll
test/CodeGen/Mips/largeimmprinting.ll
test/CodeGen/Mips/memcpy.ll
test/CodeGen/Mips/tailcall.ll
test/CodeGen/NVPTX/lower-aggr-copies.ll
test/CodeGen/PowerPC/2011-12-05-NoSpillDupCR.ll
test/CodeGen/PowerPC/2011-12-06-SpillAndRestoreCR.ll
test/CodeGen/PowerPC/ctrloop-reg.ll
test/CodeGen/PowerPC/emptystruct.ll
test/CodeGen/PowerPC/fsl-e500mc.ll
test/CodeGen/PowerPC/fsl-e5500.ll
test/CodeGen/PowerPC/glob-comp-aa-crash.ll
test/CodeGen/PowerPC/isel-rc-nox0.ll
test/CodeGen/PowerPC/memcpy-vec.ll
test/CodeGen/PowerPC/memset-nc-le.ll
test/CodeGen/PowerPC/memset-nc.ll
test/CodeGen/PowerPC/ppc-empty-fs.ll
test/CodeGen/PowerPC/resolvefi-basereg.ll
test/CodeGen/PowerPC/resolvefi-disp.ll
test/CodeGen/PowerPC/structsinmem.ll
test/CodeGen/PowerPC/structsinregs.ll
test/CodeGen/PowerPC/stwu8.ll
test/CodeGen/PowerPC/toc-load-sched-bug.ll
test/CodeGen/SystemZ/memcpy-01.ll
test/CodeGen/SystemZ/memset-01.ll
test/CodeGen/SystemZ/memset-02.ll
test/CodeGen/SystemZ/memset-03.ll
test/CodeGen/SystemZ/memset-04.ll
test/CodeGen/SystemZ/tail-call-mem-intrinsics.ll
test/CodeGen/Thumb/2011-05-11-DAGLegalizer.ll
test/CodeGen/Thumb/dyn-stackalloc.ll
test/CodeGen/Thumb/ldm-stm-base-materialization.ll
test/CodeGen/Thumb/stack-coloring-without-frame-ptr.ll
test/CodeGen/Thumb2/2009-08-04-SubregLoweringBug.ll
test/CodeGen/Thumb2/2012-01-13-CBNZBug.ll
test/CodeGen/X86/2009-01-25-NoSSE.ll
test/CodeGen/X86/2009-11-16-UnfoldMemOpBug.ll
test/CodeGen/X86/2010-04-08-CoalescerBug.ll
test/CodeGen/X86/2010-04-21-CoalescerBug.ll
test/CodeGen/X86/2010-06-25-CoalescerSubRegDefDead.ll
test/CodeGen/X86/2010-09-17-SideEffectsInChain.ll
test/CodeGen/X86/2012-01-10-UndefExceptionEdge.ll
test/CodeGen/X86/alignment-2.ll
test/CodeGen/X86/darwin-bzero.ll
test/CodeGen/X86/fast-isel-call.ll
test/CodeGen/X86/fast-isel-x86-64.ll
test/CodeGen/X86/force-align-stack-alloca.ll
test/CodeGen/X86/immediate_merging.ll
test/CodeGen/X86/load-slice.ll
test/CodeGen/X86/lsr-normalization.ll
test/CodeGen/X86/mem-intrin-base-reg.ll
test/CodeGen/X86/memcpy-2.ll
test/CodeGen/X86/memcpy.ll
test/CodeGen/X86/memset-2.ll
test/CodeGen/X86/memset-3.ll
test/CodeGen/X86/memset-sse-stack-realignment.ll
test/CodeGen/X86/memset.ll
test/CodeGen/X86/memset64-on-x86-32.ll
test/CodeGen/X86/misaligned-memset.ll
test/CodeGen/X86/misched-new.ll
test/CodeGen/X86/optimize-max-0.ll
test/CodeGen/X86/pr11985.ll
test/CodeGen/X86/pr14333.ll
test/CodeGen/X86/ragreedy-hoist-spill.ll
test/CodeGen/X86/remat-fold-load.ll
test/CodeGen/X86/small-byval-memcpy.ll
test/CodeGen/X86/stack-protector.ll
test/CodeGen/X86/tailcall-mem-intrinsics.ll
test/CodeGen/X86/tlv-1.ll
test/CodeGen/X86/unaligned-load.ll
test/CodeGen/X86/unwindraise.ll
test/CodeGen/X86/variable-sized-darwin-bzero.ll
test/CodeGen/X86/x86-64-static-relo-movl.ll
test/CodeGen/XCore/memcpy.ll
test/DebugInfo/AArch64/frameindices.ll
test/DebugInfo/X86/array.ll
test/DebugInfo/X86/array2.ll
test/DebugInfo/X86/debug-ranges-offset.ll
test/DebugInfo/X86/pieces-2.ll
test/DebugInfo/X86/pieces-3.ll
test/DebugInfo/X86/sroasplit-1.ll
test/DebugInfo/X86/sroasplit-2.ll
test/DebugInfo/X86/sroasplit-4.ll
test/DebugInfo/X86/sroasplit-5.ll
test/Instrumentation/AddressSanitizer/basic.ll
test/Instrumentation/DataFlowSanitizer/memset.ll
test/Instrumentation/MemorySanitizer/byval-alignment.ll
test/Instrumentation/MemorySanitizer/check_access_address.ll
test/Instrumentation/MemorySanitizer/msan_basic.ll
test/Instrumentation/ThreadSanitizer/tsan_basic.ll
test/Linker/type-unique-simple2-a.ll
test/Linker/type-unique-type-array-a.ll
test/Linker/type-unique-type-array-b.ll
test/Object/mangle-ir.ll
test/Other/lint.ll
test/Transforms/AlignmentFromAssumptions/simple.ll
test/Transforms/AlignmentFromAssumptions/simple32.ll
test/Transforms/BBVectorize/X86/wr-aliases.ll
test/Transforms/CodeGenPrepare/X86/memset_chk-simplify-nobuiltin.ll
test/Transforms/CorrelatedValuePropagation/non-null.ll
test/Transforms/DeadStoreElimination/2011-09-06-MemCpy.ll
test/Transforms/DeadStoreElimination/OverwriteStoreEnd.ll
test/Transforms/DeadStoreElimination/crash.ll
test/Transforms/DeadStoreElimination/cs-cs-aliasing.ll
test/Transforms/DeadStoreElimination/lifetime.ll
test/Transforms/DeadStoreElimination/memintrinsics.ll
test/Transforms/DeadStoreElimination/no-targetdata.ll
test/Transforms/DeadStoreElimination/pr11390.ll
test/Transforms/DeadStoreElimination/simple.ll
test/Transforms/GVN/nonescaping-malloc.ll
test/Transforms/GVN/pr17732.ll
test/Transforms/GVN/rle.ll
test/Transforms/GlobalOpt/crash.ll
test/Transforms/GlobalOpt/memcpy.ll
test/Transforms/GlobalOpt/memset-null.ll
test/Transforms/GlobalOpt/memset.ll
test/Transforms/Inline/alloca-dbgdeclare.ll
test/Transforms/Inline/inline-invoke-tail.ll
test/Transforms/Inline/inline-vla.ll
test/Transforms/Inline/noalias-calls.ll
test/Transforms/InstCombine/2007-10-10-EliminateMemCpy.ll
test/Transforms/InstCombine/2009-02-20-InstCombine-SROA.ll
test/Transforms/InstCombine/addrspacecast.ll
test/Transforms/InstCombine/align-addr.ll
test/Transforms/InstCombine/alloca.ll
test/Transforms/InstCombine/call-intrinsics.ll
test/Transforms/InstCombine/malloc-free-delete.ll
test/Transforms/InstCombine/memcpy-from-global.ll
test/Transforms/InstCombine/memcpy-to-load.ll
test/Transforms/InstCombine/memcpy.ll
test/Transforms/InstCombine/memcpy_chk-1.ll
test/Transforms/InstCombine/memmove.ll
test/Transforms/InstCombine/memmove_chk-1.ll
test/Transforms/InstCombine/memset.ll
test/Transforms/InstCombine/memset2.ll
test/Transforms/InstCombine/memset_chk-1.ll
test/Transforms/InstCombine/objsize.ll
test/Transforms/InstCombine/simplify-libcalls.ll
test/Transforms/InstCombine/sprintf-1.ll
test/Transforms/InstCombine/stack-overalign.ll
test/Transforms/InstCombine/stpcpy_chk-1.ll
test/Transforms/InstCombine/strcpy_chk-1.ll
test/Transforms/InstCombine/strncpy_chk-1.ll
test/Transforms/InstCombine/struct-assign-tbaa.ll
test/Transforms/LoopIdiom/basic-address-space.ll
test/Transforms/LoopIdiom/basic.ll
test/Transforms/MemCpyOpt/2008-02-24-MultipleUseofSRet.ll
test/Transforms/MemCpyOpt/2008-03-13-ReturnSlotBitcast.ll
test/Transforms/MemCpyOpt/align.ll
test/Transforms/MemCpyOpt/atomic.ll
test/Transforms/MemCpyOpt/callslot_aa.ll
test/Transforms/MemCpyOpt/callslot_deref.ll
test/Transforms/MemCpyOpt/capturing-func.ll
test/Transforms/MemCpyOpt/form-memset.ll
test/Transforms/MemCpyOpt/memcpy-to-memset-with-lifetimes.ll
test/Transforms/MemCpyOpt/memcpy-to-memset.ll
test/Transforms/MemCpyOpt/memcpy-undef.ll
test/Transforms/MemCpyOpt/memcpy.ll
test/Transforms/MemCpyOpt/memmove.ll
test/Transforms/MemCpyOpt/memset-memcpy-redundant-memset.ll
test/Transforms/MemCpyOpt/memset-memcpy-to-2x-memset.ll
test/Transforms/MemCpyOpt/smaller.ll
test/Transforms/MemCpyOpt/sret.ll
test/Transforms/MergeFunc/vector.ll
test/Transforms/MetaRenamer/metarenamer.ll
test/Transforms/ObjCARC/nested.ll
test/Transforms/PlaceSafepoints/memset.ll
test/Transforms/SROA/address-spaces.ll
test/Transforms/SROA/alignment.ll
test/Transforms/SROA/basictest.ll
test/Transforms/SROA/big-endian.ll
test/Transforms/SROA/slice-order-independence.ll
test/Transforms/SROA/slice-width.ll
test/Transforms/SROA/vector-promotion.ll
test/Transforms/ScalarRepl/2007-05-29-MemcpyPreserve.ll
test/Transforms/ScalarRepl/2008-06-22-LargeArray.ll
test/Transforms/ScalarRepl/2008-08-22-out-of-range-array-promote.ll
test/Transforms/ScalarRepl/2008-09-22-vector-gep.ll
test/Transforms/ScalarRepl/2009-03-04-MemCpyAlign.ll
test/Transforms/ScalarRepl/2009-12-11-NeonTypes.ll
test/Transforms/ScalarRepl/2010-01-18-SelfCopy.ll
test/Transforms/ScalarRepl/2011-05-06-CapturedAlloca.ll
test/Transforms/ScalarRepl/2011-06-17-VectorPartialMemset.ll
test/Transforms/ScalarRepl/2011-10-11-VectorMemset.ll
test/Transforms/ScalarRepl/2011-11-11-EmptyStruct.ll
test/Transforms/ScalarRepl/address-space.ll
test/Transforms/ScalarRepl/badarray.ll
test/Transforms/ScalarRepl/copy-aggregate.ll
test/Transforms/ScalarRepl/crash.ll
test/Transforms/ScalarRepl/inline-vector.ll
test/Transforms/ScalarRepl/memcpy-align.ll
test/Transforms/ScalarRepl/memset-aggregate-byte-leader.ll
test/Transforms/ScalarRepl/memset-aggregate.ll
test/Transforms/ScalarRepl/negative-memset.ll
test/Transforms/ScalarRepl/only-memcpy-uses.ll
test/Transforms/ScalarRepl/vector_memcpy.ll
test/Transforms/Util/combine-alias-scope-metadata.ll
test/Verifier/2006-12-12-IntrinsicDefine.ll
test/Verifier/2008-08-22-MemCpyAlignment.ll [deleted file]
test/Verifier/memcpy.ll