For AArch64 NEON, simplify the scalar DUP-by-lane form for floating-point types when the lane index is 0.
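
A hedged illustration of the source pattern this change appears to concern. On AArch64 the scalar FP registers (Sn, Dn) alias the low bits of the vector register Vn, so reading lane 0 as a scalar should not require moving any data. The sketch uses the ACLE intrinsic `vdups_lane_f32`. The helper name `first_lane` and the non-AArch64 fallback are assumptions added for illustration, and the exact instruction sequence the compiler emits is not stated here.

```c
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: return lane 0 of a 2 x f32 vector as a scalar. */
float first_lane(const float in[2]) {
#if defined(__aarch64__)
    float32x2_t v = vld1_f32(in);   /* load two floats into a 64-bit vector */
    /* Scalar dup by lane 0: the result already sits in the low 32 bits
       of the vector register, so no lane shuffle is logically needed. */
    return vdups_lane_f32(v, 0);
#else
    /* Fallback so the sketch still builds on other targets. */
    return in[0];
#endif
}
```

Calling `first_lane` on `{1.5f, -2.0f}` returns the first element on either path. Comparing the generated assembly before and after the change would show whether an explicit DUP remains.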