powertools-lambda-python
powertools-lambda-python copied to clipboard
feat(logger): add thread safe logging keys
Issue number: #991
Summary
Changes
- New logger methods:
append_thread_local_keys: add key to thread-local ContextVarget_current_thread_keys: get all keys from thread-local ContextVarremove_thread_local_keys: remove list of keys from thread-local ContextVarclear_thread_local_keys: remove all keys from thread-local ContextVar
- Add new logic to
_extract_log_keysmethod to handle error when an additional logging key's value is not a string and the name conflicts with a reserved keys clear_stateininject_lambda_contextalso clears thread-local keys
User experience
Before:
import concurrent.futures
from aws_lambda_powertools import Logger
logger = Logger(service="payment")
def process_payments(payment_id):
logger.info("Start processing payment", extra={"payment_id": payment_id})
...
logger.info("End processing payment", extra={"payment_id": payment_id})
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
futures = [executor.submit(process_payments, payment_id) for payment_id in range(10)]
for future in concurrent.futures.as_completed(futures):
future.result()
After:
import concurrent.futures
from aws_lambda_powertools import Logger
logger = Logger(service="payment")
def process_payments(payment_id):
logger.clear_thread_local_keys()
logger.append_thread_local_keys(payment_id=payment_id)
logger.info("Start processing payment")
...
logger.info("End processing payment")
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
futures = [executor.submit(process_payments, payment_id) for payment_id in range(10)]
for future in concurrent.futures.as_completed(futures):
future.result()
Checklist
If your change doesn't seem to apply, please leave them unchecked.
- [x] Meet tenets criteria
- [x] I have performed a self-review of this change
- [x] Changes have been tested
- [x] Changes are documented
- [x] PR title follows conventional commit semantics
Is this a breaking change?: No, existing functionality isn't broken.
RFC issue number:Checklist:
- [ ] Migration process documented
- [ ] Implement warnings (if it can live side by side)
Acknowledgment
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Disclaimer: We value your time and bandwidth. As such, any pull requests created on non-triaged issues might not be successful.
Thanks a lot for your first contribution! Please check out our contributing guidelines and don't hesitate to ask whatever you need. In the meantime, check out the #python channel on our Powertools for AWS Lambda Discord: Invite link
Assigned to @sthulb.
@sthulb I have made some changes to my PR since the last batch of tests failed due to type checking and | not being supported in Python versions < 3.9. Can you please allow the additional test cases to run?
@SimonBFrank We have paused all development and merging in to develop/v3 branches until v3 has gone GA, which is currently scheduled for the 23rd now.
We will review and merge after that. Thank you for this contribution.
This is fine, just got to rebase it into v3
I've done the work to rebase @SimonBFrank's commits on to the v3 branch. This is now fine to merge (assuming tests pass)
Codecov Report
Attention: Patch coverage is 85.10638% with 7 lines in your changes missing coverage. Please review.
Project coverage is 96.15%. Comparing base (
0ccd6b3) to head (9a7ed53). Report is 11 commits behind head on develop.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| aws_lambda_powertools/logging/formatter.py | 81.57% | 5 Missing and 2 partials :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## develop #5141 +/- ##
===========================================
- Coverage 96.20% 96.15% -0.05%
===========================================
Files 229 229
Lines 10761 10806 +45
Branches 2001 2006 +5
===========================================
+ Hits 10353 10391 +38
- Misses 322 327 +5
- Partials 86 88 +2
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Hello @leandrodamascena, thanks for the detailed review. What is the release process for new code? Once it is merged and released, is it imediately accessible or will it get added to a batch or changes?
2 - I'm thinking of the name
...._thread_local...is that the best name? Maybe...._thread_...? I see you usedget_current_thread_keysand notget_current_thread_local_keys, typo or was it intentional? Because I really like having justthreadin the name.
I am not attached to the naming convention. I find thead_local a little weird. I just followed the initial method signatures in the issue. I think replacing thread_local with thread_safe or just thread would totally be fine.
3 - The changes in the documentation are amazing, the examples are very real and someone can understand them easily. But I think this is a bit confusing for the customer who is trying to understand how to use keys with multithreading and not have a specific section for this. I'm thinking about adding a specific section about thread-safe keys and adding this new documentation there. This way we can create a callout at the beginning of the page talking about this and redirect the warnings about "non-thread safe" methods to this new section. I'll work on this on Monday.
Sounds good! 👍🏻 Hopefully my initial work is useful for you.
4 - We need to add additional tests for non-covered area like the change you did in the
_extract_log_keysmethod. BTW, I really like that, it's improve the developer experience.
If I get time this weekend, I will add some more test cases.
5 - Even though this is working perfectly locally, we need to add some
e2e teststo make sure we are having the same local behavior in the Lambda environment. We know that the concept of multithreading is universal and works in any environment, but it's better to be safe and having tests directly with a real function.
Same here. If I get time this weekend, I will add some more test cases.
Hello @leandrodamascena, thanks for the detailed review. What is the release process for new code? Once it is merged and released, is it imediately accessible or will it get added to a batch or changes?
Hey @SimonBFrank our release process is as follows:
1 - We use Trunk-based development and we merge every new change into develop
2 - Every day (except weekends) at 8AM UTC we release an alpha version (check pypi) of this branch. So, customers can tests changes that were merged the day before and anticipate any test or problem.
3 - If we don't have any critical bug, regression or security issue that forces us to release immediately, we release a new production version of this branch every 15 days. This is not set in stone, but we try to maintain this consistency as much as possible.
2 - I'm thinking of the name
...._thread_local...is that the best name? Maybe...._thread_...? I see you usedget_current_thread_keysand notget_current_thread_local_keys, typo or was it intentional? Because I really like having justthreadin the name.I am not attached to the naming convention. I find
thead_locala little weird. I just followed the initial method signatures in the issue. I think replacingthread_localwiththread_safeor justthreadwould totally be fine.
I like the *_thread_safe_* name.
4 - We need to add additional tests for non-covered area like the change you did in the
_extract_log_keysmethod. BTW, I really like that, it's improve the developer experience.If I get time this weekend, I will add some more test cases.
Take your time and try to enjoy the weekend off. No rush to add these new tests (functional and e2e), I can help with that and we can try to include this change in our next release next week.
Thanks.
Hi @SimonBFrank! I've been thinking about our customer experience all weekend, both in terms of our documentation (which we agree needs improvement) and our default behavior. I have some thoughts on this PR.
We're creating new methods that are thread-safe and calling them "local" or whatever name we call, but we're also keeping the old ones. This reminds me of the concept of function coloring in sync/async space. In our case, we're creating a similar situation with our thread-safe "local" methods and the older, non-thread-safe versions. This might make it tricky for developers to use them together, just like mixing "safe" and "non-safe" functions can be challenging.
This morning, I had a meeting with @dreamorosi. He asked some interesting questions:
- Why don't we make the thread-safe behavior the default?
- How is this slower than the non-thread-safe method?
- What possible breaking changes could this cause for our customers?
These are good questions to think about. Let's break them down:
Why don't we make the thread-safe behavior the default?
Making thread-safe behavior the default might seem like a good idea, but it can have drawbacks. I didn't inspect the code yet, but it seems possible to do. The Python logging library is thread-safe by default, so the places we inherit from there like extra_keys (like logger.info and others) this is thread-safe by default.
How is this slower than the non-thread-safe method?
Thread-safe methods can be slower because they require extra steps to ensure they work correctly when multiple threads use them at the same time, and I created a test to test this (the code is below - just example). In the tests I didn't notice any regression regarding the performance, so, I don't this this is a problem.
Non-thread safe REPORT RequestId: e01e70e8-2948-4061-80ed-aff592bea6a0 Duration: 105267.26 ms Billed Duration: 105268 ms Memory Size: 1024 MB Max Memory Used: 784 MB
import json
import io
from aws_lambda_powertools import Logger
my_logger = Logger(stream=io.StringIO())
large_object = {
'prop1': 'prop',
'prop2': 'prop',
'prop3': 'prop',
'prop4': 'prop',
'prop5': 'prop',
'prop6': 'prop',
'prop7': 'prop',
'prop8': 'prop',
'prop9': 'prop',
'prop10': 'prop',
'prop11': 'prop',
'prop12': 'prop',
'prop13': 'prop',
'prop14': 'prop',
'prop15': 'prop',
'prop16': 'prop',
'prop17': 'prop',
'prop18': 'prop',
'prop19': 'prop',
'prop20': 'prop',
'prop21': 'prop',
'prop22': 'prop',
'prop23': 'prop',
'prop24': 'prop',
'prop25': 'prop',
'prop26': 'prop',
'prop27': 'prop',
'prop28': 'prop',
}
def lambda_handler(event, context):
for i in range(1,100000):
my_logger.append_keys(**large_object)
my_logger.info("test")
my_logger.info("test")
my_logger.info("test")
my_logger.info("test")
my_logger.info("test")
my_logger.info("test")
my_logger.remove_keys(list(large_object.keys()))
Thread safe REPORT RequestId: 163aaa0d-b1e2-42ed-be31-40f848f07e01 Duration: 106993.14 ms Billed Duration: 106994 ms Memory Size: 1024 MB Max Memory Used: 790 MB
import json
import io
from aws_lambda_powertools import Logger
my_logger = Logger(stream=io.StringIO())
large_object = {
'prop1': 'prop',
'prop2': 'prop',
'prop3': 'prop',
'prop4': 'prop',
'prop5': 'prop',
'prop6': 'prop',
'prop7': 'prop',
'prop8': 'prop',
'prop9': 'prop',
'prop10': 'prop',
'prop11': 'prop',
'prop12': 'prop',
'prop13': 'prop',
'prop14': 'prop',
'prop15': 'prop',
'prop16': 'prop',
'prop17': 'prop',
'prop18': 'prop',
'prop19': 'prop',
'prop20': 'prop',
'prop21': 'prop',
'prop22': 'prop',
'prop23': 'prop',
'prop24': 'prop',
'prop25': 'prop',
'prop26': 'prop',
'prop27': 'prop',
'prop28': 'prop',
}
def lambda_handler(event, context):
# TODO implement
for i in range(1,100000):
my_logger.append_thread_local_keys(**large_object)
my_logger.info("test")
my_logger.info("test")
my_logger.info("test")
my_logger.info("test")
my_logger.info("test")
my_logger.info("test")
my_logger.remove_thread_local_keys(list(large_object.keys()))
How is this slower than the non-thread-safe method?
Changing to thread-safe methods could break some customers' existing code. For example, if they rely on the current behavior or performance of our non-thread-safe methods, switching to thread-safe versions might cause unexpected problems or slowdowns in their programs. We've seen that performance isn't an issue we can expect, but I don't know about behavior. One thing we can look at is our tests, if we don't need to make any changes to our tests, that means we're potentially not breaking anything.
What do you think about this? Do you have time to invest in some testing to make thread-safe the default and only behavior?
Again, thank you so much for investing time into this and I would love to hear your thoughts.
@leandrodamascena Hey, thanks for the feedback. My responses may be delayed for the next month as my personal life is getting busy.
I think making the logger thread safe by default would not be what I would want as a feature. Let me explain through an example.
I have a Lambda that processes an order that has multiple items. Each item within an order is processed in a separate thread. When I start my function I used append_keys to add the order_id. Then, at the beginning of each thread I use append_thread_local_keys to add the item_id. Example code below:
Code:
import threading
from typing import List
from aws_lambda_powertools import Logger
from aws_lambda_powertools.utilities.typing import LambdaContext
logger = Logger()
def threaded_func(item_id: str):
logger.append_thread_local_keys(item_id=item_id)
logger.info("Collecting payment")
def lambda_handler(event: dict, context: LambdaContext) -> str:
order_id: str = event["order_id"]
item_ids: List[str] = event["item_ids"]
logger.append_keys(order_id=order_id)
for item in item_ids:
threading.Thread(target=threaded_func, args=(item,)).start()
return "Success"
Input:
lambda_handler({"order_id": "abc123", "item_ids": ["A1", "A2"]}, {})
Output:
{"level":"INFO","location":"threaded_func:12","message":"Collecting payment","timestamp":"2024-10-02 22:46:16,678-0400","service":"service_undefined","order_id":"abc123","item_id":"A1"}
{"level":"INFO","location":"threaded_func:12","message":"Collecting payment","timestamp":"2024-10-02 22:46:16,678-0400","service":"service_undefined","order_id":"abc123","item_id":"A2"}
This is great because I only need to add the order_id once in the beginning of the Lambda call and will be used everywhere. I can use append_keys as "global" logging keys. But, thread specific information (item_id) can be added to each thread. If we were to make append_keys be thread specific, then we would have to add it in every thread. While this is possible, I find this a lot less user friendly. Additionally, I would imagine making append_keys thread safe would break existing behavior which opens a whole new can of worms. Let me know your thoughts.
Hi @SimonBFrank! Thanks for the feedback! Last week was very busy and I'll check it out this week to merge the PR.
Hey @leandrodamascena, any updates on getting this reviewed and merged?
Hey @leandrodamascena, any updates on getting this reviewed and merged?
Hey @SimonBFrank! I've had to deal with some unexpected internal reprioritizations this past week. I dropped the ball a bit here, but tomorrow I'll push some small changes and tests for us to merge, ok?
Thanks
Update 21/10/2024:
- New logger methods:
thread_safe_append_keys: add key to thread-local ContextVarthread_safe_get_current_keys: get all keys from thread-local ContextVarthread_safe_remove_keys: remove list of keys from thread-local ContextVarthread_safe_clear_keys: remove all keys from thread-local ContextVar
- Add new logic to
_extract_log_keysmethod to handle error when an additional logging key's value is not a string and the name conflicts with a reserved keys clear_stateininject_lambda_contextalso clears thread-local keys
Quality Gate passed
Issues
1 New issue
0 Accepted issues
Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code
Awesome work, congrats on your first merged pull request and thank you for helping improve everyone's experience!
Hi @SimonBFrank! Thanks for your patience and hard work on this PR. I sent a commit with some small changes, but basically I renamed the methods and refactored the documentation to make it more specific.
Please let me know if you have any observation before I ask the @sthulb and merge it.
Looks amazing to me. I will let me team know that this is ready to use. Thanks for the help!