juniper icon indicating copy to clipboard operation
juniper copied to clipboard

Implement HashMap support the same way as Vec (PR #1070)

Open rimutaka opened this issue 2 years ago • 4 comments

I have a struct where some members are Vec and some are HashSet. They both should convert to/from a GraphQL list.

pub struct Keywords {
    pub full_list: Vec<String>,
    pub unique_list: HashSet<String>,
}

There is a built-in support for Vec<T>, but not for HashSet<T>. I would think that adding HashSet support would make life easier in cases like this.

I copied the impl code for Vec to enable HashMap support in PR #1070. So far it worked fine.

  1. Is there a reason not to do it?
  2. Is there a better way of doing it?

If the idea is sound, would you consider that PR?
I will need to add input support, tests and see if docs need updating before it is ready. Just wanted to check with the maintainers (@tyranron , @ilslv, @LegNeato ?) if I'm on the right track here.

rimutaka avatar May 31 '22 03:05 rimutaka

@rimutaka as an output type HashSet should work just fine, but I'm not sure what the behaviour should be on input. Should HashSet just remove duplicates silently or error?

ilslv avatar May 31 '22 04:05 ilslv

@rimutaka HashSet and GraphQL List have different sematics. The later is explicitly stated to be ordered:

GraphQL services must return an ordered list as the result of a list type.

I'm unsure what implications we will have. At first glance this seems to be OK for output, as with usual Vecs the user still can do any ordering. For input, though, the validation is happened before we resolve input value into HashSet, so the error reporting should be preserved, so seems to be OK too.


@ilslv

Should HashSet just remove duplicates silently or error?

I think error, as we do for arrays. If the user uses HashSet in schema, he wants to expose its invariants there. If he doesn't, nothing stops him from using Vec and then reduce it to HashSet.

tyranron avatar May 31 '22 08:05 tyranron

@ilslv , @tyranron , thanks for your prompt replies! You raised some good points.

  1. HashSet stores and outputs elements in random order. Not sure if this is a blocker. In the end, if the user chooses a HashSet over a Vec then the order is probably not important.

  2. serde quietly dedupes the input and HashMap::insert() returns bool, not Error, so the default seems to be to ignore the duplicates. Here is a simple example with multiple 3s:

println!("{:?}", serde_json::from_str::<HashSet<i32>>("[1,2,3,3,3]").unwrap());

prints {3, 2, 1}

It seems ignoring duplicates is the default with HashSets. I am not opposed to an error.

  1. I havn't got to the point of checking what HashSet looks like in the schema or trying it with GraphQL input.

Let me investigate a bit more before taking it any further.

rimutaka avatar May 31 '22 09:05 rimutaka

@rimutaka

In the end, if the user chooses a HashSet over a Vec then the order is probably not important.

Yup, from the server side. I wonder if it may have some implications on a client side.

  1. I havn't got to the point of checking what HashSet looks like in the schema or trying it with GraphQL input.

Well, on the schema level HashSet will look like a regular GraphQL List for both input and output. Fully similar to how Vec looks like.

It seems ignoring duplicates is the default with HashSets. I am not opposed to an error.

I'm unsure about duplicates, despite serde_json does it. GraphQL differs from JSON in the way it allows to define custom semantics. Imagine we have a query ListLength(list: [Int]!): Int!, then the implementor may accidentally broke its semantics by choosing a HashSet as an argument which silently eats elements.

But It seems reasonable that library users may want HashSet eating duplicates because they don't bother.

It's hard to say for sure which case is more common. But I'd better go with strict defaults. We still may provide a transparent wrapper for a different behaviour:

#[derive(Clone, Debug, Deref, DerefMut, From, Into, RefCast)]
#[transparent]
struct DedupHashSet<T>(HashSet<T>);

and using it

#[derive(GraphQLInputObject)]
pub struct Keywords {
    pub full_list: Vec<String>,
    pub unique_list: juniper::types::DedupHashSet<String>,
}

will allow to use .unique_list field almost the same as it was HashSet.

tyranron avatar May 31 '22 10:05 tyranron