[FLINK-37125] [conf] fixed env.log.max java.lang.NumberFormatException
What is the purpose of the change
Fix for NumberFormatException in Log Configuration This PR resolves an issue where the log.max parameter in config.yaml was treated as a string, causing a java.lang.NumberFormatException when parsed by the DefaultRolloverStrategy plugin.
Changes: Updated the config.sh script to clean up MAX_LOG_FILE_NUMBER by removing leading and trailing single quotes before parsing.
Testing: Verified correct parsing of config env.log.max in a real Hadoop 3.3.6 cluster. Confirmed the history server starts without errors after the change.
Verifying this change
This change is a trivial rework / code cleanup without any test coverage.
Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): (no)
- The public API, i.e., is any changed class annotated with
@Public(Evolving): (no) - The serializers: (no)
- The runtime per-record code paths (performance sensitive): (no)
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
- The S3 file system connector: (no)
Documentation
- Does this pull request introduce a new feature? (no)
- If yes, how is the feature documented? (not applicable)
CI report:
- 7a21dd39851cc83976c1b0ca24409e67966021e1 Azure: FAILURE
Bot commands
The @flinkbot bot supports the following commands:@flinkbot run azurere-run the last Azure build
Reviewed by Chi on 23/01/2025 Go back to the submitter with review comments.
@davidradl The reason for tolerating strings is that ConfigurationParserUtils.loadAndModifyConfiguration inherently processes all YAML configuration values as strings due to how Configuration.toFileWritableMap serializes data using YamlParserUtils.toYAMLString. This results in numeric values being wrapped as strings (e.g., '10').
This fix handles this behaviour by removing unnecessary quotes to ensure the value is interpreted correctly. Adding an error check here might result in false positives, as numeric values are always converted to strings in config.yaml
Hey @davidradl, Just checking in on this PR—are there any updates from the review, or any changes you'd like me to make?
Hi @dawidwys , Could you review and merge this PR for [FLINK-37125]? It fixes a NumberFormatException in env.log.max by ensuring correct parsing. The PR is already approved by @davidradl .
Thanks!
This PR is being marked as stale since it has not had any activity in the last 90 days. If you would like to keep this PR alive, please leave a comment asking for a review. If the PR has merge conflicts, update it with the latest from the base branch.
If you are having difficulty finding a reviewer, please reach out to the community, contact details can be found here: https://flink.apache.org/what-is-flink/community/
If this PR is no longer valid or desired, please feel free to close it. If no activity occurs in the next 30 days, it will be automatically closed.
@ferenc-csaky If you see these tickets, you'll see that similar issues have been handled in a targeted manner before, such as in FLINK-4040, FLINK-33721, FLINK-32290
In each case, the fix was localized to avoid widespread changes and potential regressions, I considered updating Configuration.toFileWritableMap to handle type conversions globally. However, this would require significant changes to downstream methods and could introduce unintended side effects.
The change has been tested in a Hadoop 3.3.6 cluster and verified to resolve the error, while not breaking any other functionality, so I went forward with it
@flinkbot run azure
Hi @ferenc-csaky,
Most of the CI tests are now passing, and the PR has received the necessary approvals. Do you think it’s good to merge at this point, or is there anything else you’d like me to address?
@sehajsandhu123 I'd rather not merge this without a successful CI run, can you rebase your commit to the latest master ant force push the PR branch pls? I think that should most probably fix it.
@flinkbot run azure
I’ve just rebased the branch onto the latest master and force-pushed the updates. Could you please trigger a new CI run when you get a chance? Thanks for your help
It's happening after every commited change automatically, but it takes some time to get triggered. Already running for now though. I'll keep an eye on it.
The E2E run in the CI failed because of an Ubuntu package manager network issue, today I had networking probelms with apt-get commands myself, so completely unrelated to this change, hence merging this.