Feature request: Add persistent keys option to Logger
Use case
Add custom keys to logs that persist across every Lambda invocation and are NOT cleared when using clear_state.
This functionality already exists in the TypeScript version of powertools and can be useful for Python as well.
Solution/User Experience
Just like in the TypeScript version, add a new optional parameter to the Logger initializer. For example, the new option can be called persistent_keys.
This parameter could be a simple flat dict (or a similar type?) and, if not provided, would behave like an empty dict.
The value of this new parameter could either be added to self._default_log_keys (since whatever is in there already persists through a state clearing) or be maintained separately but treated just like self._default_log_keys.
Example init:
import os
from aws_lambda_powertools import Logger
logger = Logger(persistent_keys={"role": os.getenv("MY_ROLE", "MY_ROLE")})
Example storing keys:
class Logger:
    ...
    def __init__(
        ...
        persistent_keys: dict[str, Any] | None = None,  # None is treated as an empty dict
        ...
    ) -> None:
        ...
        self._default_log_keys = {"service": self.service, "sampling_rate": self.sampling_rate}
        if persistent_keys:
            self._default_log_keys.update(persistent_keys)
        ...
Alternative solutions
Acknowledgment
- [x] This feature request meets Powertools for AWS Lambda (Python) Tenets
- [x] Should this be considered in other Powertools for AWS Lambda languages? i.e. Java, TypeScript, and .NET
Thanks for opening your first issue here! We'll come back to you as soon as we can. In the meantime, check out the #python channel on our Powertools for AWS Lambda Discord: Invite link
Hi @jth08527! Thanks a lot for opening this issue! I'll need to do some research on the impact of this change because I'm not sure whether having both append_keys and persistent_keys would make the API confusing. I'm adding this to our backlog and aiming to take a look at it later this month.
Hey @leandrodamascena! 👋 I've done a comprehensive analysis of the potential API confusion between append_keys and persistent_keys that you mentioned in issue #6002. Here's my detailed investigation and recommendations.
🔍 Issue Summary
The request is to add a persistent_keys parameter to Logger (similar to TypeScript version) that would persist even when clear_state() is called, unlike the current append_keys() which gets cleared.
🎯 Current Python Implementation Analysis
Detailed Code Flow Analysis:
1. Logger Constructor (__init__)
def __init__(self, service=None, sampling_rate=None, **kwargs):
    self.service = resolve_env_var_choice(choice=service, env=os.getenv(constants.SERVICE_NAME_ENV, "service_undefined"))
    self.sampling_rate = resolve_env_var_choice(choice=sampling_rate, env=os.getenv(constants.LOGGER_LOG_SAMPLING_RATE))
    # These keys ALWAYS persist through clear_state()
    self._default_log_keys = {"service": self.service, "sampling_rate": self.sampling_rate}
2. append_keys() Method Chain:
# Logger.append_keys() - delegates to formatter
def append_keys(self, **additional_keys: object) -> None:
    self.registered_formatter.append_keys(**additional_keys)

# LambdaPowertoolsFormatter.append_keys() - actual implementation
def append_keys(self, **additional_keys) -> None:
    self.log_format.update(additional_keys)  # Direct dict update
3. clear_state() Method Chain:
# Logger.clear_state() - resets and restores defaults
def clear_state(self) -> None:
    self.registered_formatter.clear_state()  # Clear formatter state
    self.structure_logs(**self._default_log_keys)  # Restore defaults

# LambdaPowertoolsFormatter.clear_state() - actual clearing
def clear_state(self) -> None:
    self.log_format = dict.fromkeys(self.log_record_order)  # Reset structure
    self.log_format.update(**self.keys_combined)  # Restore constructor keys
4. structure_logs() Method (Key Restoration):
def structure_logs(self, append: bool = False, formatter_options: dict | None = None, **keys) -> None:
    log_keys = {**self._default_log_keys, **keys}  # Merge defaults with new keys
    # Mode 3: Clear existing and add new keys (used by clear_state)
    if not append:
        self.registered_formatter.clear_state()
        self.registered_formatter.thread_safe_clear_keys()
        self.registered_formatter.append_keys(**log_keys)
Critical Current Behaviors:
- append_keys() → Directly updates formatter.log_format dict (temporary keys)
- clear_state() → Resets formatter, then restores _default_log_keys via structure_logs()
- _default_log_keys → {"service": self.service, "sampling_rate": self.sampling_rate} - ALWAYS survive
- Keys precedence → Later keys override earlier ones (simple dict update)
- Thread safety → Uses ContextVar for thread-local temporary keys
Current State Management:
- Logger Level: Tracks _default_log_keys only
- Formatter Level: Manages all keys in log_format dict + thread-local ContextVar
- No separation: All non-default keys treated as temporary
- Clear behavior: Nuclear reset + selective restoration of defaults
Current Flow Example:
logger = Logger(service="payment") # _default_log_keys = {"service": "payment", "sampling_rate": None}
logger.append_keys(user_id="123") # formatter.log_format = {..., "user_id": "123"}
logger.append_keys(session="abc") # formatter.log_format = {..., "user_id": "123", "session": "abc"}
logger.clear_state() # Reset everything, restore only service/sampling_rate
# Result: Only service="payment" and sampling_rate persist
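To make the flow above easy to reproduce without installing anything, here is a stdlib-only model of the delegation between Logger and formatter. ModelLogger and ModelFormatter are hypothetical stand-ins written for this comment, not the Powertools classes, and they only mimic the append/clear behaviour described above.

```python
from typing import Any, Dict, Optional


class ModelFormatter:
    """Toy stand-in for the formatter: owns the dict that ends up in each log line."""

    def __init__(self) -> None:
        self.log_format: Dict[str, Any] = {}

    def append_keys(self, **additional_keys: Any) -> None:
        self.log_format.update(additional_keys)

    def clear_state(self) -> None:
        # "Nuclear reset": the formatter forgets every key it was given.
        self.log_format = {}


class ModelLogger:
    """Toy stand-in for Logger: remembers only the default keys."""

    def __init__(self, service: str, sampling_rate: Optional[float] = None) -> None:
        self.formatter = ModelFormatter()
        self._default_log_keys = {"service": service, "sampling_rate": sampling_rate}
        self.formatter.append_keys(**self._default_log_keys)

    def append_keys(self, **additional_keys: Any) -> None:
        # Pure delegation: the logger keeps no record of these keys.
        self.formatter.append_keys(**additional_keys)

    def clear_state(self) -> None:
        # Reset the formatter, then restore the defaults and nothing else.
        self.formatter.clear_state()
        self.formatter.append_keys(**self._default_log_keys)


logger = ModelLogger(service="payment")
logger.append_keys(user_id="123")
logger.append_keys(session="abc")
before = dict(logger.formatter.log_format)
logger.clear_state()
after = dict(logger.formatter.log_format)

print(before)  # includes user_id and session
print(after)   # {'service': 'payment', 'sampling_rate': None}
```

Because the model logger never records which keys were appended, it has nothing to restore besides the defaults, which is the gap this feature request is about.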
🚨 API Confusion Concerns (Validated!)
1. Naming & Semantic Confusion
# This is confusing for developers:
logger.append_keys(user_id="123") # Temporary key
logger.persistent_keys = {"env": "prod"} # Persistent key
logger.clear_state() # Only clears user_id, keeps env
# Users won't intuitively understand the difference!
2. TypeScript vs Python Inconsistency
- TypeScript: resetKeys() vs Python: clear_state()
- TypeScript: Has both appendKeys() AND appendPersistentKeys()
- Python: Only has append_keys() currently
3. Multiple Ways to Set Persistent Data
# With the proposal, both of these would persist through clear_state():
logger = Logger(service="payment") # Via _default_log_keys
logger.persistent_keys = {"env": "prod"} # New proposed way
# This creates confusion about what persists and why
🔧 Technical Implementation Challenges
1. State Management Complexity
class Logger:
    def __init__(self, persistent_keys=None):
        self._default_log_keys = {"service": self.service, "sampling_rate": self.sampling_rate}
        self._persistent_keys = persistent_keys or {}  # NEW
        self._temporary_keys = {}  # NEW - need to track separately
2. Key Conflict Resolution
# What happens here?
logger.append_keys(environment="staging") # Temporary
logger.persistent_keys = {"environment": "prod"} # Persistent
logger.info("test") # Which environment value wins?
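Nothing in this thread decides which value should win, so the sketch below only shows that the answer comes down to merge order. merge_keys and its temporary_wins flag are hypothetical names used for illustration; they are not part of any Powertools API.

```python
from typing import Any, Dict, Mapping


def merge_keys(
    persistent: Mapping[str, Any],
    temporary: Mapping[str, Any],
    temporary_wins: bool = True,
) -> Dict[str, Any]:
    """Combine both key sets; the mapping unpacked last overrides the other."""
    if temporary_wins:
        return {**persistent, **temporary}
    return {**temporary, **persistent}


persistent = {"environment": "prod"}
temporary = {"environment": "staging", "user_id": "123"}

temp_first = merge_keys(persistent, temporary)
persistent_first = merge_keys(persistent, temporary, temporary_wins=False)

print(temp_first["environment"])        # staging
print(persistent_first["environment"])  # prod
```

Whichever policy is chosen, it should be stated in the docs, because the order of the append_keys() and persistent-key calls would otherwise look like it matters.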
3. Clear State Behavior
def clear_state(self):
    # Need to preserve BOTH _default_log_keys AND _persistent_keys
    # But clear only temporary keys - complex logic needed
    ...
💡 Discovered Issues from Code Analysis
1. Formatter vs Logger Responsibility
- Current: Logger delegates to formatter.clear_state()
- Problem: Formatter doesn't know about Logger's persistent keys concept
- Solution: Need coordination between Logger and Formatter
2. structure_logs() Method Overloading
- Currently handles both initialization AND key appending
- Adding persistent keys would make this method even more complex
- Risk of breaking existing behavior
3. Thread Safety with Context Variables
The formatter uses ContextVar for thread-local keys:
def thread_safe_append_keys(self, **additional_keys) -> None:
    set_context_keys(**additional_keys)
Persistent keys would need similar thread-safe handling.
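As a rough illustration of what ContextVar-backed keys give you, here is a stdlib-only sketch. The module-level variable and the two helper functions are assumptions written for this comment; they imitate the idea of context-local keys and are not the formatter's actual code.

```python
import contextvars
import threading
from typing import Any, Dict

# Hypothetical context variable; None means "no keys set in this context yet".
_context_keys = contextvars.ContextVar("context_keys", default=None)


def thread_safe_append_keys(**additional_keys: Any) -> None:
    current = _context_keys.get() or {}
    # Store a fresh dict rather than mutating, so contexts never share one object.
    _context_keys.set({**current, **additional_keys})


def get_context_keys() -> Dict[str, Any]:
    return dict(_context_keys.get() or {})


results: Dict[str, Dict[str, Any]] = {}


def worker(name: str) -> None:
    # Each thread writes its own value; other threads never observe it.
    thread_safe_append_keys(request_id=name)
    results[name] = get_context_keys()


thread_safe_append_keys(request_id="main")
threads = [threading.Thread(target=worker, args=(f"worker-{i}",)) for i in range(2)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
results["main"] = get_context_keys()

for name in sorted(results):
    print(name, results[name])
```

Every thread ends up seeing only the request_id it set itself. If persistent keys lived in a plain dict on the Logger instead, they would be shared across threads, which may actually be the desired behaviour for values like environment or version.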
🎯 Recommendations
Option 1: Follow TypeScript API Exactly ⭐ RECOMMENDED
logger = Logger(
    service="payment",
    persistent_keys={"environment": "prod", "version": "1.0"},  # Constructor
)
# Runtime methods (matching TypeScript)
logger.append_persistent_keys(region="us-east-1")
logger.remove_persistent_keys(["version"])
logger.append_keys(user_id="123") # Temporary (existing)
logger.clear_state() # Clears only temporary keys
Pros:
- ✅ Consistent with TypeScript version
- ✅ Clear semantic distinction
- ✅ Explicit method names reduce confusion
Option 2: Enhance Current API with Clear Naming
logger = Logger(service="payment", permanent_log_keys={"env": "prod"})
logger.append_temporary_keys(user_id="123") # Rename existing method
logger.append_permanent_keys(version="1.0") # New method
logger.clear_temporary_keys() # Rename existing method
Pros:
- ✅ Very clear semantic meaning
- ✅ Backwards compatible (with deprecation)
Option 3: Single Method with Scope Parameter
logger.append_keys(user_id="123", scope="temporary") # Default
logger.append_keys(env="prod", scope="persistent")
logger.clear_keys(scope="temporary") # Default
logger.clear_keys(scope="persistent")
logger.clear_keys(scope="all")
Pros:
- ✅ Single consistent API
Cons:
- ❌ More complex parameter handling
🚧 Implementation Strategy (Option 1)
Phase 1: Internal Refactoring
class Logger:
    def __init__(self, persistent_keys=None, **kwargs):
        self._default_log_keys = {"service": self.service, "sampling_rate": self.sampling_rate}
        self._persistent_keys = persistent_keys or {}
        self._all_persistent = {**self._default_log_keys, **self._persistent_keys}

    def clear_state(self):
        self.registered_formatter.clear_temporary_keys()  # NEW method
        self.structure_logs(**self._all_persistent)  # Restore all persistent
Phase 2: Add New Methods
def append_persistent_keys(self, **keys):
    self._persistent_keys.update(keys)
    self._all_persistent.update(keys)
    self.registered_formatter.update_persistent_keys(**keys)

def remove_persistent_keys(self, keys: list[str]):
    for key in keys:
        self._persistent_keys.pop(key, None)
        self._all_persistent.pop(key, None)
    self.registered_formatter.remove_persistent_keys(keys)
Phase 3: Formatter Updates
class LambdaPowertoolsFormatter:
    def __init__(self, **kwargs):
        self._persistent_keys = {}
        self._temporary_keys = {}
        # Existing logic...

    def clear_temporary_keys(self):  # NEW
        self.log_format = dict.fromkeys(self.log_record_order)
        self.log_format.update(**self.keys_combined)  # Existing
        self.log_format.update(**self._persistent_keys)  # NEW
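For anyone who wants to poke at the Option 1 semantics before a real prototype exists, here is a self-contained sketch. It is deliberately simpler than the three phases above: SketchLogger is a hypothetical class that keeps three separate dicts and merges them on read, instead of coordinating with a formatter. The method names follow the proposal in this thread, and the precedence (defaults < persistent < temporary) is an assumption.

```python
from typing import Any, Dict, Iterable, Optional


class SketchLogger:
    """Stdlib-only model of Option 1; not the Powertools Logger."""

    def __init__(self, service: str, persistent_keys: Optional[Dict[str, Any]] = None) -> None:
        self._default_log_keys: Dict[str, Any] = {"service": service}
        # Copy so that later changes to the caller's dict do not leak in.
        self._persistent_keys: Dict[str, Any] = dict(persistent_keys or {})
        self._temporary_keys: Dict[str, Any] = {}

    def append_keys(self, **keys: Any) -> None:
        self._temporary_keys.update(keys)

    def append_persistent_keys(self, **keys: Any) -> None:
        self._persistent_keys.update(keys)

    def remove_persistent_keys(self, keys: Iterable[str]) -> None:
        for key in keys:
            self._persistent_keys.pop(key, None)  # unknown keys are ignored

    def clear_state(self) -> None:
        # Only temporary keys are dropped; defaults and persistent keys stay.
        self._temporary_keys.clear()

    def current_keys(self) -> Dict[str, Any]:
        # Assumed precedence: defaults < persistent < temporary.
        return {**self._default_log_keys, **self._persistent_keys, **self._temporary_keys}


logger = SketchLogger("payment", persistent_keys={"environment": "prod", "version": "1.0"})
logger.append_persistent_keys(region="us-east-1")
logger.remove_persistent_keys(["version"])
logger.append_keys(user_id="123")
with_temporary = logger.current_keys()
logger.clear_state()
after_clear = logger.current_keys()

print(with_temporary)
print(after_clear)  # {'service': 'payment', 'environment': 'prod', 'region': 'us-east-1'}
```

Merging on read keeps clear_state() trivial, at the cost of rebuilding the dict for every log line. The phased plan above avoids that cost by pushing the persistent keys into the formatter.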
⚠️ Breaking Change Considerations
Backwards Compatibility Strategy:
- Constructor: persistent_keys=None (optional, no breaking change)
- Methods: Keep existing append_keys() and clear_state() working exactly as before
- Deprecation: Optionally deprecate in favor of explicit append_temporary_keys()
Migration Path:
# V2 (Current) - Still works
logger.append_keys(user_id="123")
logger.clear_state()
# V3 (New) - Recommended
logger.append_keys(user_id="123") # Temporary (unchanged)
logger.append_persistent_keys(env="prod") # Persistent (new)
logger.clear_state() # Clears only temporary (unchanged behavior)
🔍 Edge Cases to Test
- Key Conflicts: Same key set as both temporary and persistent
- Clear State Timing: Multiple invocations with Lambda context reuse
- Thread Safety: Concurrent access to persistent vs temporary keys
- Memory: Large persistent key dictionaries across invocations
- Serialization: Ensure persistent keys don't break JSON serialization
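The serialization case is easy to demonstrate with the standard library alone. The sketch below assumes persistent keys may hold values such as datetime or Decimal, and uses default=str as one possible fallback. That fallback is an illustration, not a statement about how the Powertools formatter serializes records.

```python
import json
from datetime import datetime, timezone
from decimal import Decimal
from typing import Any, Dict

persistent_keys: Dict[str, Any] = {
    "environment": "prod",
    "deployed_at": datetime(2024, 1, 1, tzinfo=timezone.utc),
    "cost_center": Decimal("42.5"),
}


def serialize(keys: Dict[str, Any]) -> str:
    # default=str is a blunt fallback: any value json cannot encode becomes its str().
    return json.dumps(keys, default=str)


try:
    json.dumps(persistent_keys)  # strict encoding, no fallback
    strict_ok = True
except TypeError:
    strict_ok = False

line = serialize(persistent_keys)
print(strict_ok)  # False
print(line)
```

Since persistent keys would be attached to every log line for the lifetime of the execution environment, a value that cannot be serialized would fail on every call rather than once, so validating at the point where the keys are set may be worth considering.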
📊 Risk Assessment
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| API Confusion | HIGH | MEDIUM | Clear documentation + TypeScript consistency |
| Breaking Changes | LOW | HIGH | Careful backwards compatibility |
| Performance | LOW | LOW | Minimal overhead for key tracking |
| Memory Leaks | MEDIUM | MEDIUM | Proper cleanup in Lambda context |
🎯 Next Steps
- Decision: Choose API approach (recommend Option 1 for TypeScript consistency)
- Prototype: Implement minimal version for validation
- Testing: Extensive edge case testing
- Documentation: Clear examples showing temporary vs persistent distinction
- Community Feedback: Get input on API design before implementation
💭 Final Thoughts
Your concern about API confusion is 100% valid! The distinction between temporary and persistent keys isn't immediately obvious. Following the TypeScript approach with explicit method names (append_persistent_keys, remove_persistent_keys) would provide the clearest API while maintaining consistency across Powertools languages.
The technical implementation is definitely feasible, but requires careful coordination between Logger and Formatter classes to maintain backwards compatibility while adding the new persistent behavior.
Ready to dive deeper into any specific aspect! 🚀
Wow, thank you for the incredibly detailed analysis and write-up, @dcabib!
For what it's worth, my 2 cents as a user is that the different languages should be as similar as possible. The fewer differences there are, the easier it is to use this library if users ever have to hop between projects in different languages.
Hi both, feature parity and keeping the different versions of Powertools for AWS as similar as possible to each other is definitely important to us.
For this specific item we're not yet ready to make a decision because we're working on some items that we'll share over the coming weeks and that might impact this area of the code.
For now I'll put this issue on hold, but we'll revisit it before the end of the year.