X86-FMA3: Improved/enabled the memory folding optimization for scalar loads