AArch64: use ldxp/stxp pair to implement 128-bit atomic loads.