[X86] Avoid introducing extra shuffles when lowering packed vector shifts.