Stop using std::copy and std::copy_backward; the code now runs 10% faster.
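
The message does not say what replaces the two calls. As an illustration only, here is a minimal sketch of one possible replacement, assuming the elements are trivially copyable and stored contiguously. The helper name move_elements is hypothetical and does not come from the original change. A single std::memmove covers both the forward (std::copy) and backward (std::copy_backward) cases because it tolerates overlapping ranges.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical helper, not taken from the original change.
// Copies n elements from src to dst. The two ranges may overlap in
// either direction, because std::memmove handles overlap correctly.
template <typename T>
void move_elements(T* dst, const T* src, std::size_t n) {
    // Byte-wise copying is only valid for trivially copyable types.
    static_assert(std::is_trivially_copyable<T>::value,
                  "move_elements requires a trivially copyable element type");
    if (n == 0) return;  // never hand possibly-null pointers to memmove
    assert(dst != nullptr && src != nullptr);
    std::memmove(dst, src, n * sizeof(T));
}
```

Many standard library implementations already lower std::copy to memmove for trivially copyable types, so whether a hand-written replacement like this helps depends on the compiler, the library, and the call site; the figure should be confirmed by measurement on the actual workload.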