[AArch64] Improve codegen of store lane instructions by avoiding GPR usage.