bug: source should never drop events for missing/redundant fields
According to @tabVersion, if an input JSON message misses some fields defined in the source definition (i.e., the CREATE SOURCE statement), or contains fields that are not defined there, the entire JSON message is dropped.
This behavior is quite dangerous. I think we should absorb the incoming data in a best-effort way: for example, leave NULL for missing columns and ignore the additional columns.
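A minimal sketch of what such best-effort handling could look like, using serde_json; the `Column` type, `Datum` alias, and `json_to_row` helper here are made-up names for illustration, not the actual parser code:

```rust
use serde_json::Value;

/// A declared column from the CREATE SOURCE statement (illustrative only).
struct Column {
    name: &'static str,
}

/// A NULL-able cell value (illustrative stand-in for a real datum type).
type Datum = Option<Value>;

/// Map one JSON object onto the declared schema:
/// - missing fields become NULL (None) instead of dropping the event,
/// - fields not declared in the schema are silently ignored.
fn json_to_row(schema: &[Column], payload: &Value) -> Vec<Datum> {
    schema
        .iter()
        .map(|col| payload.get(col.name).cloned())
        .collect()
}

fn main() {
    let schema = [Column { name: "id" }, Column { name: "name" }];
    // "name" is missing and "extra" is not declared in the schema.
    let payload: Value = serde_json::from_str(r#"{"id": 1, "extra": true}"#).unwrap();
    let row = json_to_row(&schema, &payload);
    assert_eq!(row[0], Some(Value::from(1)));
    assert_eq!(row[1], None); // missing field -> NULL, event is kept
    println!("{:?}", row);
}
```

With this approach an event is only ever lost when the bytes cannot be parsed as JSON at all, which is the DLQ question below.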
And a DLQ for corrupted bytes that are not even JSON?
JSON deserialization is quite deterministic, so I think it is okay to drop corrupted messages.
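For illustration, a rough sketch assuming a hypothetical dead-letter path that RisingWave does not currently have: since deserialization fails deterministically, a corrupted payload can simply be dropped, or its raw bytes handed off for inspection.

```rust
use serde_json::Value;

/// Outcome of parsing one raw message (illustrative only).
enum ParseOutcome {
    /// The payload was valid JSON and can be mapped onto the schema.
    Row(Value),
    /// The payload was not JSON at all; keep the raw bytes for a DLQ, or drop.
    Rejected(Vec<u8>),
}

fn parse_message(bytes: &[u8]) -> ParseOutcome {
    match serde_json::from_slice::<Value>(bytes) {
        Ok(json) => ParseOutcome::Row(json),
        // Deserialization errors are deterministic, so retrying is pointless:
        // either drop the event here or hand the bytes to a dead-letter queue.
        Err(_) => ParseOutcome::Rejected(bytes.to_vec()),
    }
}

fn main() {
    let inputs: [&[u8]; 2] = [br#"{"id": 1}"#, b"not json at all"];
    for raw in inputs {
        match parse_message(raw) {
            ParseOutcome::Row(json) => println!("accepted: {json}"),
            ParseOutcome::Rejected(bytes) => {
                println!("rejected {} bytes (drop, or send to a DLQ)", bytes.len())
            }
        }
    }
}
```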
As shown in #4626, the behavior of the JSON parser is as expected. I believe I was confusing JSON with Debezium JSON before.
The JSON parser can insert NULL for missing fields and ignore redundant fields. Since we have no NOT NULL constraint now, I think there is nothing to change.
We can close this issue.