skywalking icon indicating copy to clipboard operation
skywalking copied to clipboard

[Feature] Refactor Python agent UUID generation

Open Superskyyy opened this issue 1 year ago • 9 comments

Search before asking

  • [X] I had searched in the issues and found no similar feature requirement.

Description

Refactoring the uuid generator shows a 30% thoroughput increase in small transactions. UUID in Python agent previously was implemented as a ID class that creates a new UUID1 instance every single time when required, a huge number of uuids is added up as a costly bottleneck that consume large portion of CPU time. New method will replicate the Java agent logic and use a global PROCESSID (uuid4) + ThreadID + Timestamp + (1-9999) as the ids plus other python-specific optimizations.

  • [x] - UUID generation change
  • [x] - Consider forking behavior that is different from Java applications
  • [ ] - Optimize other parts in the tracing core.

Use case

Reduce performance cost of introducing Python agent.

This probably also leads to a new release along with several smaller enhancements.

Related issues

No response

Are you willing to submit a PR?

  • [X] Yes I am willing to submit a PR!

Code of Conduct

Superskyyy avatar May 23 '23 17:05 Superskyyy

Timstamp + rolling increasing number may face conflict in high concurrency cases. Be careful of that. Java agent includes thread ID to avoid that.

wu-sheng avatar May 23 '23 18:05 wu-sheng

Timstamp + rolling increasing number may face conflict in high concurrency cases. Be careful of that. Java agent includes thread ID to avoid that.

Thanks for reminding! I forgot to mention the threadID, it's also included in the new implementation to sync with Java agent + forking mechanism that is rather frequent in Python.

It will be in the form of f'{GlobalIdGenerator.PROCESS_ID}.{threading.get_ident()}.{GlobalIdGenerator.THREAD_ID_SEQUENCE.context.next_seq()}' - same as Java side.

Superskyyy avatar May 23 '23 19:05 Superskyyy

Good to see. I am aware of UUID cost in Java. Good to know the same in Python. @mrproliu I think it is worth to check on Go agent as well.

wu-sheng avatar May 23 '23 20:05 wu-sheng

Good to see. I am aware of UUID cost in Java. Good to know the same in Python. @mrproliu I think it is worth to check on Go agent as well.

This issue applies to most agents, including skywalking-go (I think)

Superskyyy avatar May 23 '23 22:05 Superskyyy

Good to see. I am aware of UUID cost in Java. Good to know the same in Python. @mrproliu I think it is worth to check on Go agent as well.

This issue applies to most agents, including skywalking-go (I think)

Sure, it looks like a simple version of the snowflake id generator, Am I right? I need to do some performance testing in skywalking-go.

mrproliu avatar May 24 '23 00:05 mrproliu

Yes, it was inspired by that but without coordinator to assign the range. Instead, we tradeoff ID length for a better performance.

wu-sheng avatar May 24 '23 00:05 wu-sheng

Done on Go and Python side, do other agents need an update? @wu-sheng should i keep this issue open as a reminder.

Superskyyy avatar May 24 '23 18:05 Superskyyy

It is fine. If people care, they would follow up. Feel free to close once your side is done.

wu-sheng avatar May 24 '23 20:05 wu-sheng

It is fine. If people care, they would follow up. Feel free to close once your side is done.

Okay!

Superskyyy avatar May 25 '23 00:05 Superskyyy