opentelemetry-python icon indicating copy to clipboard operation
opentelemetry-python copied to clipboard

Allow passing a custom random number generator to trace.RandomIdGenerator

Open agcom opened this issue 7 months ago • 3 comments

Description

I am using OpenTelemetry to trace gRPC calls in my distributed machine learning application, and I fixed the global random seed in all replicas via random.seed(42) to make the application's result reproducible; but, it causes duplicate trace and span ids rendering the traces corrupt.

This PR adds the possibility to create/use an instance of trace.RandomIdGenerator that utilizes a custom random number generator instance (e.g. random.Random()); It preserves backward compatibility and previous users expectations by defaulting to the global random instance as before.

Fixes #4376.

Type of change

  • [x] New feature (non-breaking change which adds functionality)

How Has This Been Tested?

  • [x] Ran the existing unit tests.

To reproduce my use-case, write a Python script with the following pseudo-algorithm:

  • Fix the global random seed.
  • Instrument the gRPC server and client (using the default id generator), exporting to an OpenTelemetry Collector instance which in-turn optionally exports them to a Jaeger instance.
  • Spin up a gRPC server with a simple NOP function.
  • Every x seconds, randomly choose a peer (assume running multiple instances of the script with pre-determined addresses) and call its function.
  • Run multiple instances of the script.
  • Find the duplicate ids in the output traces (can be done through the Jaeger UI which visibly warns about duplicate ids).

Does This PR Require a Contrib Repo Change?

  • [ ] Yes.
  • [x] No.

Checklist:

  • [x] Followed the style guidelines of this project.
  • [x] Change-logs have been updated.
  • [ ] Unit tests have been added.
  • [x] Documentation has been updated.

agcom avatar May 07 '25 19:05 agcom

CLA Signed

The committers listed above are authorized under a signed CLA.

  • :white_check_mark: login: agcom / name: Alireza Ghasemi (9fedfbc3ac25839e0e2680ce4ca2d84c349a7715, 4ab2049d9d650ca65c76a73dfa001acb7de26d7c, dc5133bf4c6ce784f95988e979dd8069d64cd01c, 914a30fd447bc222e5bcc9e266ada206e7ed4db3, 7af0a739880cb02db0435a55e8f049416ab0fc6a)
  • :white_check_mark: login: lzchen / name: Leighton Chen (0bc5a8ba97f05257fca3c2da585cb07a46d71708)

Ideally, the default RNG should be a fresh Random() (which picks a high-entropy seed through combining /dev/urandom, the current time, and the process identifier) and not the global random instance; but, that would be a breaking change/update, affecting the previous codes and users expectations relying on that behavior (unlikely, but yet possible).

agcom avatar May 07 '25 20:05 agcom

I am not sure of what unit tests to add if they are mandatory for this change. In the tests/trace/test_trace.py file, there are lines where an instance of RandomIdGenerator is created and used to generate a few ids; I could change them all to RandomIdGenerator(Random()).

agcom avatar May 07 '25 20:05 agcom