Duplicate UUIDs are generated after forking

Open oschwald opened this issue 7 years ago • 1 comment

As demonstrated in this blog post, it appears duplicate UUIDs can be generated after forking:

$ perl -MData::UUID -E'$u = Data::UUID->new(); my $parent = $$; $parent == $$ && fork for 1..shift; say $u->create_str' 1000 | sort | uniq -c | grep -v '^\s*1\s'
      2 02B8AAEE-9513-11E8-ABC7-22474D6EA9B9
      2 02B8FA58-9513-11E8-93A0-22474D6EA9B9
      2 02BDCE98-9513-11E8-BF6D-22474D6EA9B9
      2 02BDDCB2-9513-11E8-BFBF-22474D6EA9B9
      2 02BF0C54-9513-11E8-98FC-22474D6EA9B9
      2 02C08688-9513-11E8-84BD-22474D6EA9B9
      2 02C0E2EA-9513-11E8-8793-22474D6EA9B9
      2 02C29D24-9513-11E8-A059-22474D6EA9B9
      2 02C5DE26-9513-11E8-A9C4-22474D6EA9B9

I take no credit for discovering this. I am just opening the issue as I didn't see one already.

oschwald, Jul 31 '18 23:07

Just some thoughts on this:

  • The same problem seems to have been found already for state loaded from a file: the constructor contains code that modifies the internal state using the current pid (see https://github.com/eserte/Data-UUID/blob/49bada021afbff62f2aef6b6981a1ff222f77913/UUID.xs#L370 ). Without this modification, independent processes that share the same /tmp/.UUID_STATE could generate the same UUIDs.
  • Unfortunately, Perl has no concept of fork callbacks, which could otherwise be used to fix this (e.g. by changing the internal Data::UUID state in the forked child).
  • The problem can probably be fixed by always using the current pid when calculating the internal state. This means an additional getpid() call on every UUID creation (on my systems that's an overhead of 5µs to 25µs per call).
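
To make the last idea concrete, here is a rough C sketch of what "always use the current pid" could look like. It is only an illustration under stated assumptions: `toy_gen`, `toy_next`, `toy_next_for_pid` and `ids_differ_after_fork` are made-up names, the two-field state is a toy stand-in and not the real Data::UUID context, and XOR-ing the pid into a salt is just one possible way to mix it in.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Toy generator state (hypothetical; NOT the layout used in UUID.xs). */
typedef struct {
    uint32_t base_salt; /* identical in every process that inherits the state */
    uint32_t counter;   /* stand-in for the timestamp / clock sequence */
} toy_gen;

/* Derive the next ID from the state and an explicit pid. Taking the pid as
 * a parameter keeps the mixing step easy to check without forking. */
uint64_t toy_next_for_pid(toy_gen *g, pid_t pid)
{
    assert(g != NULL);
    uint32_t salt = g->base_salt ^ (uint32_t)pid; /* pid folded in on every call */
    g->counter++;
    return ((uint64_t)salt << 32) | g->counter;
}

/* What a caller would use: one getpid() per generated ID. */
uint64_t toy_next(toy_gen *g)
{
    return toy_next_for_pid(g, getpid());
}

/* Fork once, generate one ID on each side from the same inherited state,
 * and report whether they differ: 1 = differ, 0 = equal, -1 = error. */
int ids_differ_after_fork(void)
{
    toy_gen g = { 0x22474D6Eu, 0 };
    int fds[2];

    if (pipe(fds) != 0)
        return -1;

    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        /* Child: same state as the parent, but a different pid. */
        uint64_t id = toy_next(&g);
        ssize_t n = write(fds[1], &id, sizeof id);
        _exit(n == (ssize_t)sizeof id ? 0 : 1);
    }

    uint64_t parent_id = toy_next(&g);
    uint64_t child_id = 0;

    close(fds[1]);
    ssize_t n = read(fds[0], &child_id, sizeof child_id);
    close(fds[0]);
    waitpid(child, NULL, 0);

    if (n != (ssize_t)sizeof child_id)
        return -1;
    return parent_id != child_id;
}
```

Calling `ids_differ_after_fork()` from a small `main` should return 1 on a POSIX system, since parent and child share the state but not the pid. The extra `getpid()` on every call is the per-UUID overhead mentioned above; whether that cost is acceptable for Data::UUID is a separate question this sketch does not answer.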

eserte, Sep 08 '20 07:09