python-sdk
Resolving Standard Input Encoding Issues: Wrapping sys.stdin with UTF-8
This change ensures that the standard input stream (`sys.stdin`) is read with UTF-8 encoding by re-wrapping it using `io.TextIOWrapper`. This addresses potential encoding issues where the default system encoding might not be UTF-8 (e.g., GBK on some systems), leading to incorrect character interpretation.
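The re-wrapping described above can be sketched roughly as follows. This is a minimal illustration, not the code from the diff itself: the helper name `utf8_stdin` and its optional `stream` parameter are hypothetical, and it assumes the stream exposes a binary `.buffer` attribute, as the real `sys.stdin` normally does.

```python
import io
import sys


def utf8_stdin(stream=None):
    """Return a text stream that decodes the given stream's bytes as UTF-8.

    `stream` defaults to sys.stdin. The helper name and parameter are
    hypothetical and exist only for this illustration.
    """
    stream = sys.stdin if stream is None else stream
    # Assumption: the stream has a binary `.buffer`. The new wrapper reads the
    # same underlying bytes but ignores the locale-derived default encoding.
    return io.TextIOWrapper(stream.buffer, encoding="utf-8")
```

An application could then call `stdin = utf8_stdin()` once at startup and read from `stdin` instead of `sys.stdin`.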
Motivation and Context
In certain environments, the default encoding for `sys.stdin` is something other than UTF-8 (GBK, for example). When the application expects UTF-8 encoded input, this mismatch can raise `UnicodeDecodeError` or silently produce the wrong characters. This change treats the input stream as UTF-8 regardless of the system's default locale, since UTF-8 is the more universal and recommended encoding for modern applications. It fixes a bug where the application might fail or behave unexpectedly when receiving non-ASCII characters through standard input in such environments.
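To make the failure mode concrete, the snippet below decodes the same UTF-8 bytes with two different codecs. It is only an illustration of the mismatch; the helper `try_decode` is a hypothetical name and is not part of the SDK.

```python
def try_decode(data: bytes, encoding: str):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


payload = "中文".encode("utf-8")

as_utf8 = try_decode(payload, "utf-8")  # round-trips to the original text
as_gbk = try_decode(payload, "gbk")     # garbled text or None, never the original
```

Either outcome for `as_gbk` is a problem: garbled text is accepted silently, while a decode error stops the read entirely.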
How Has This Been Tested?
This change has been tested by:
- Manually testing with input containing non-ASCII characters (e.g., 中文) in an environment where the default locale is set to GBK.
- Verifying that the application correctly reads and processes these characters without encoding errors.
- Confirming that the change does not negatively impact environments where the default locale is already UTF-8.
Ideally, more comprehensive testing would involve setting up CI jobs with different locales to ensure consistent behavior across various environments.
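Short of dedicated CI jobs per locale, the behavior can be approximated in-process by simulating a stdin whose default encoding differs from UTF-8. The sketch below is an assumption about how such a check might look, not a test from this PR; `simulated_stdin`, `read_rewrapped`, and `SAMPLE` are hypothetical names.

```python
import io

SAMPLE = "中文 input\n"


def simulated_stdin(default_encoding):
    """Build a stand-in for sys.stdin: UTF-8 bytes behind a wrapper that
    defaults to `default_encoding`, as a non-UTF-8 locale would."""
    raw = io.BytesIO(SAMPLE.encode("utf-8"))
    return io.TextIOWrapper(raw, encoding=default_encoding)


def read_rewrapped(stream):
    """Re-wrap the stream's binary buffer as UTF-8 and read everything."""
    return io.TextIOWrapper(stream.buffer, encoding="utf-8").read()


# Each simulated default encoding should yield the same text after re-wrapping.
results = {
    enc: read_rewrapped(simulated_stdin(enc))
    for enc in ("gbk", "utf-8", "latin-1")
}
```

This only exercises the decoding path; it does not replace running the application under a real GBK locale.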
Breaking Changes
No, this is a non-breaking change. It addresses a potential issue with encoding and makes the application more robust. Users do not need to update their code or configurations.
Types of changes
- [x] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] Documentation update
Checklist
- [x] I have read the MCP Documentation
- [x] My code follows the repository's style guidelines
- [x] New and existing tests pass locally
- [x] I have added appropriate error handling
- [x] I have added or updated documentation as needed
Additional context
The decision to re-wrap `sys.stdin` with `io.TextIOWrapper` was made to ensure consistent UTF-8 decoding without modifying the underlying file descriptor or relying on environment variables. This approach is generally considered a safe and effective way to handle encoding issues with standard input in Python. Note that the fix only controls how the application decodes its input: the sender must still emit UTF-8 encoded data for the text to be read correctly.