aws-lambda-events icon indicating copy to clipboard operation
aws-lambda-events copied to clipboard

What is the rationale for converting empty strings to None?

Open Veetaha opened this issue 4 years ago • 5 comments

Why not converting only null to None and leaving empty strings empty? The problem is loosing the information about whether the value was actually null or an empty string, neither of string-null-empty nor string-null-none preserve this knowledge

Veetaha avatar Jul 03 '20 14:07 Veetaha

Mainly because the go library this generates from doesn't really have the distinction...go has default values and the default value of a string is "", so a straight string field gets "" on json null.

I've been debating switching to generating the defs from a language with a better type system but the only compelling one is TS and it is maintained by a third party.

Apparently Amazon has some internal representation of these events but refuses to make them public or generate code from them (the go code is written by hand).

LegNeato avatar Jul 04 '20 07:07 LegNeato

I tried to think what would be the most idiomatic / least surprising in most cases but sadly the Go defs don't give me enough semantic information to cover all the cases. So, we assume "" is most likely an underlying json null and map it to None. In most cases, you likely want to check if a value is there or not and usually an empty string is not a value.

Of course, this doesn't work for the case where an empty string is a value you care about vs None...do you have an example?

LegNeato avatar Jul 04 '20 07:07 LegNeato

do you have an example

I don't have one yet, but I care about cases where empty string is a valid value for me and I don't want for it to be converted to None. What is on top of my head: it is not stated explicitly, but it seems that both None (absent key) and empty string are valid for SNS event Subject field.

we assume "" is most likely an underlying json null and map it to None

This wrong, empty string is empty string, null is null and absent key is absent key. The problem is that we not only make a special case distinction on the length of a string (following current logic: why not considering only-whitespace string as None too, huh?), but we try to encode these 4 states (which better be only 3) in Option which has only 2 states.

Why do we make any kind of assumption at all? Since the conversion "" -> None happens on a real json event we don't need to modify it at all.

I propose the following mapping:

  • "any string, even empty one" -> Some(string_unmodified)
  • null -> None
  • absent key -> None

@LegNeato putting it other way, what cons/pros do you see in removing this function? https://github.com/LegNeato/aws-lambda-events/blob/0212d44826b76d9a3698398beb3715da165cd989/aws_lambda_events/src/custom_serde.rs#L115

Veetaha avatar Jul 04 '20 09:07 Veetaha

The initial problem with Go is that we don't have information about strings nullability so apparently we have to leave the decision about whether the json string can be absent/null to the users of this crate. Generally, Option<T> is a superset of T, so we kinda upcast both required string fields and optional string fields to the common base type (Option<String>) leaving the job for downcasting (via .unwrap()) back to required String (if really requied according to aws docs) to the users. But we don't need to special case empty strings here at all.

Veetaha avatar Jul 04 '20 09:07 Veetaha

Yeah, we do have to special case things / have to pick some stance because Go may not declare a field as nullable and then silently coerce a null to "". Rust would throw an error deserializing null on a non-nullable field.

Blame go's typesystem as it can't express what Rust can and the default values cannot be separated from an underlying json value.

Perhaps I've missed something, happy to take a PR to tweak behavior.

LegNeato avatar Jul 05 '20 01:07 LegNeato