A few questions that might be lingering:

Q: Why is 2 the minimum? Why not 1 or 0?

If the program completed, we know that each thread wrote something. We also know that all of those writes were >= 1, because every write stores a loaded value plus one and the counter starts at 0. So we know that 0 isn't possible.

We also know that any iteration after a thread's first must have read a value written by some previous iteration (though not necessarily one from the same thread). So all later iterations read a 1 or higher, and can only write 2 or higher. The final value is whichever write lands last, and that is always some thread's last iteration, so the result is at least 2.

Q: What happens in C or C++ if the program doesn't use explicit atomic loads/stores?

As far as I'm aware, in C11/C++11 this is a data race, which is undefined behavior, so your program could do anything at all. Earlier standards did not discuss threads or memory races at all, so racy access to memory was not ruled out by the language, and what you got depended on your compiler and platform. However, if you write a loop incrementing an integer five times, the optimizer will probably fold the individual increments away, and each thread would do `x += 5` instead. But if you disable that optimization you can build the program described here using something like:

    int x = 0;  /* shared, deliberately not atomic */

    void func() {
        for (int i = 0; i < 5; i++) {
            x++;
        }
    }

(and then add the thread-spawning bits)

Q: I tried it and it always prints 25! What's wrong?

The scheduler is being nice to you today. If you want to increase your chances of seeing odd results, run more iterations, and maybe more threads. Or you can add `thread::yield_now()` (Rust) or `sched_yield()` (C) in between the load and the store, to encourage the scheduler to explore other timelines.