invalid_percent not working as expected with soda-core-spark-df = 3.5.1
The rule does not seem to work with invalid_percent, while the same rule works with invalid_count. Details below.

soda-core-spark-df = 3.5.1
Rule tested = NON_NEGATIVE

Check =
checks for DATASET_20250701':
  invalid_percent(SCORE):
    valid_min: 0
    warn: when > 0
    name: daily_score_check

Scan Summary =
daily_score_check [/workspace/checks.yml] [PASSED]
  check_value: 0.0
  row_count: 10
  invalid_count: 2
However, the same rule works as expected when configured with invalid_count.
Check =
checks for DATASET_20250701':
  invalid_count(SCORE):
    valid_min: 0
    warn: when > 0
    name: daily_score_check

Scan Summary =
daily_score_check [/workspace/checks.yml] [WARNED]
  check_value: 2
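
For reference, the values one would expect from the 10-row scan above can be worked out by hand. The sketch below is plain Python, not Soda code: the helper names `invalid_percent` and `outcome` are hypothetical, and it assumes the metric is defined as invalid rows divided by total rows, times 100, with the `warn: when > 0` threshold applied to that value.

```python
def invalid_percent(invalid_count: int, row_count: int) -> float:
    """Percentage of invalid rows (assumed definition); 0.0 for an empty dataset."""
    if row_count == 0:
        return 0.0
    return 100.0 * invalid_count / row_count


def outcome(check_value: float, warn_above: float = 0.0) -> str:
    """Mimic a 'warn: when > N' threshold (hypothetical helper)."""
    return "WARNED" if check_value > warn_above else "PASSED"


# Numbers taken from the scan summary in this report.
row_count, invalid_count = 10, 2

pct = invalid_percent(invalid_count, row_count)
print(pct, outcome(pct))                      # 20.0 WARNED
print(invalid_count, outcome(invalid_count))  # 2 WARNED
```

Under these assumptions both checks should warn, so the reported check_value of 0.0 and the PASSED outcome for invalid_percent do not match the row_count and invalid_count in the same summary.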
CLOUD-9194
I noticed that the check_value for invalid_percent is reported as 0.0 even when the actual percentage is a very small non-zero value. For example:

row_count = 1864458
invalid_count = 3

The expected invalid_percent is approximately 0.00016 (3 / 1864458 * 100), but the check_value returned after the scan is 0.0. This causes confusion and misreporting in quality checks. It appears that the value is being rounded or truncated during result formatting or serialization in soda-core-spark-df.
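
The arithmetic for this example can be sketched in plain Python. The two-decimal rounding step is purely an assumption used to illustrate the suspected behaviour; it has not been confirmed against the soda-core-spark-df source.

```python
# Numbers taken from the example in this comment.
row_count = 1_864_458
invalid_count = 3

# Exact percentage of invalid rows.
exact = 100.0 * invalid_count / row_count

# ASSUMPTION: the result is rounded to two decimal places somewhere
# before the threshold is evaluated or the value is reported.
rounded = round(exact, 2)

print(f"{exact:.6f}")  # 0.000161
print(rounded)         # 0.0

# A 'warn: when > 0' threshold behaves differently on the two values.
print(exact > 0, rounded > 0)  # True False
```

If rounding of this kind is the cause, it would explain the large-dataset case but not the 10-row scan in the original description, where 2 invalid rows out of 10 is 20 percent and would survive any reasonable rounding.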