The first approach one might consider is to synchronize the write position using a std::atomic<size_t> and its fetch_add() member function, as covered in Chapter 10, Concurrency. Whenever a thread wants to write a new element, it atomically fetches the current index and increments it by one, so each value is written to a unique index.
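To illustrate the idea in isolation, here is a minimal sketch in which several threads claim slots in a shared buffer through fetch_add(). The helper name claim_slots() and its parameters are hypothetical and used only for this illustration.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: n_threads threads each claim per_thread slots
// in a shared buffer and mark every slot they claim.
std::vector<int> claim_slots(std::size_t n_threads, std::size_t per_thread) {
  auto slots = std::vector<int>(n_threads * per_thread, 0);
  auto write_idx = std::atomic<std::size_t>{0};
  auto threads = std::vector<std::thread>{};
  for (std::size_t t = 0; t < n_threads; ++t) {
    threads.emplace_back([&] {
      for (std::size_t i = 0; i < per_thread; ++i) {
        // fetch_add() returns the value held *before* the increment,
        // so no two threads can ever receive the same index
        const auto idx = write_idx.fetch_add(1);
        slots[idx] += 1;
      }
    });
  }
  for (auto& th : threads) {
    th.join();
  }
  return slots;
}
```

If the indices are indeed unique, every slot in the returned buffer has been incremented exactly once. Note that the order in which threads claim indices is not deterministic.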
In code, we split the algorithm into two functions: an outer function and an inner function. The outer function defines the atomic write index, and the inner function, which we call _inner_par_copy_if_sync(), contains the actual implementation.