[X86, AVX2] Replace inserti128 and extracti128 intrinsics with generic shuffles