form-serialize
Avoid sparse arrays and serialize as hash unless we have sequential arrays
The upgrade to 0.7 and the new hash support for arrays broke our usage, because we use numeric ids as part of arrays, e.g. `<input name="stuff[1203843]">`. We then end up with very large sparse arrays that cause UI locks further down, as they end up being iterated over and trigger millions of iterations over undefined values. With objects we can iterate over the keys just fine and nothing breaks.
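To illustrate the problem (this is not the module's code, just a standalone demo): a single input with a large numeric index produces an array whose reported length is the index plus one, so naive iteration visits over a million holes.

```javascript
// A single numeric-keyed field like stuff[1203843] as an array index:
var asArray = [];
asArray[1203843] = 'value';
// The array reports a length of 1203844, so any naive for-loop
// iterates over more than a million undefined slots.
console.log(asArray.length); // 1203844

// The same field stored as an object key: only one key to visit.
var asObject = {};
asObject[1203843] = 'value';
console.log(Object.keys(asObject).length); // 1
```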
I tried to limit the changes, but obviously it's impossible to fix this use case without changing others.
/cc @jxson @phillipj because they have worked on this feature in the past and might have input.
Part of me understands how this broke for you, but another part says that we have a bad API around the array syntax. Maybe it was wrong to allow specifying the array index by number, and we should only allow the bracket notation to specify an array with `[]` or a property name with `[foo]`, and nothing else. Basically, remove the ability to specify the index into the array at a certain id.
Yup, the only safe way to handle arrays IMO would be to check all the form names first to see whether all the `foo[*]` keys form a contiguous set of array indices; if not, treat it as an object. But then it's kinda weird too, because depending on your keys you might accidentally get an array or an object, which is not very predictable.
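A minimal sketch of that contiguity check (the function name is hypothetical, not part of form-serialize): given all the bracket keys collected for one field, decide whether they form exactly `0..n-1`.

```javascript
// Hypothetical helper: return true only if the keys are exactly 0..n-1,
// i.e. safe to serialize as a dense array; anything else => hash.
function isContiguousArrayKeys(keys) {
  var nums = keys.map(Number);
  if (nums.some(isNaN)) return false; // any non-numeric key => object
  nums.sort(function (a, b) { return a - b; });
  return nums.every(function (n, i) { return n === i; });
}

isContiguousArrayKeys(['0', '1', '2']);  // true  -> serialize as array
isContiguousArrayKeys(['1', '1203843']); // false -> serialize as hash
```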
Would like to get feedback from the folks previously involved before looking at making a breaking change here to avoid rushing it.
FWIW I don't have a strong preference as long as there is a very clearly defined contract for the syntax.
Keeping that in mind, I do think the expectation for having numbers represent the index of an array is very common. It's been a while but I seem to remember that this is the way Rails & some PHP frameworks would deal with lists of small arrays (things like tags in CMS systems for instance).
When I refactored the hashing code I tried to stick closely to what the server-side query parsers would expect. Specifically, I used the `qs` module as a reference, but didn't pay attention to handling sparse arrays.
Having the array index be an object key is probably a bad idea. I think a better approach would be to create a sparse array that preserves order (similar to `qs`). For example:

```
a[1]="foo"&a[100]="bar"
```

Should parse to:

```js
[ "foo", "bar" ]
```
Note that this preserves the order without creating a massive array with a bunch of null values (like @Seldaek pointed out).
This would get around the current bad/buggy behavior and seems like a reasonable compromise when index arrays are defined with very large and separated indexes.
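The compromise described above could be sketched roughly like this (a standalone illustration, not the module's implementation): keep the numeric indices only for ordering, then compact them away.

```javascript
// Sketch of the proposed compromise: given values keyed by sparse
// numeric indices, sort by index and compact into a dense array,
// so a[1]=foo&a[100]=bar yields ["foo", "bar"].
function compactSparse(indexed) {
  return Object.keys(indexed)
    .map(Number)
    .sort(function (a, b) { return a - b; })
    .map(function (i) { return indexed[i]; });
}

compactSparse({ 1: 'foo', 100: 'bar' }); // [ 'foo', 'bar' ]
```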
The indices (in my array at least) are very much relevant, and should not be stripped away like you suggest IMO.
PHP at least parses that to a hashmap with `1` and `100` as keys: https://3v4l.org/k30eb. I can't tell for Ruby and others, but I would be surprised if any implementation actually treats `a[5]=xx` as `["xx"]`. https://coderwall.com/p/uh8kiw/pass-arrays-objects-via-querystring-the-rack-rails-way seems to indicate Rails does the same as PHP.
@Seldaek Hmm yeah. So you are effectively wanting to use the array indexes as object property values.
I think it might be okay to remove the array-index-by-number feature. One question off the top of my head: if that happens, how would you specify an array with multiple values?
For example, currently this works:
```html
<form>
  <input name="todos[0][name]" value="Get milk" />
  <input name="todos[0][status]" value="done" />
  <input name="todos[1][name]" value="Get eggs" />
  <input name="todos[1][status]" value="pending" />
</form>
```
And will parse to:
```js
{ todos:
  [ { name: 'Get milk', status: 'done' },
    { name: 'Get eggs', status: 'pending' } ] }
```
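For context, a minimal sketch of how such bracketed names could build that nested structure (this is not form-serialize's actual code; `assignDeep` is a hypothetical helper):

```javascript
// Hypothetical sketch: assign a bracketed field name like
// "todos[0][name]" into a nested structure, creating arrays for
// numeric keys and objects otherwise.
function assignDeep(target, name, value) {
  // "todos[0][name]" -> ["todos", "0", "name"]
  var keys = name.replace(/\]/g, '').split('[');
  var node = target;
  for (var i = 0; i < keys.length - 1; i++) {
    var key = keys[i];
    var nextIsIndex = /^\d+$/.test(keys[i + 1]);
    if (node[key] === undefined) node[key] = nextIsIndex ? [] : {};
    node = node[key];
  }
  node[keys[keys.length - 1]] = value;
  return target;
}

var out = {};
assignDeep(out, 'todos[0][name]', 'Get milk');
assignDeep(out, 'todos[0][status]', 'done');
assignDeep(out, 'todos[1][name]', 'Get eggs');
assignDeep(out, 'todos[1][status]', 'pending');
// out.todos is now an array of two objects, matching the example above.
```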
@defunctzombie Another option would be to remove the hash feature altogether and provide quick documentation on how to parse the query string using other modules (qs, querystring). It would reduce the surface area of this module and make the form-to-JS-object conversion more flexible, since people could choose whichever module aligns with their expectations. Additionally, consumers of this module that need form-to-object conversion would lean on more widely tested and reliable methods for parsing query-string syntax.
Not sure I got your question correctly, but my understanding is that if you have the same index twice then it goes into the same object, so `todos[0][name]` creates `{0: {name: "..."}}`, and then you assign `[0][status]` to it and get `{0: {name: "...", status: "..."}}`. It's very much like you'd have in JS:
```js
todos = {};
todos['0'] = {};
todos['0']['name'] = "...";
todos['0']['status'] = "...";
todos['1'] = {};
todos['1']['name'] = "...";
todos['1']['status'] = "...";
```
Oh sorry, now I get it: you're saying it currently works and parses to an array. Yes, that's the one case that doesn't work well if we treat all keys as strings. We can't know at parse time what's coming, but we could check after the fact where we have objects and, if all their keys are sequential numbers, convert them to arrays. That sounds kinda hard for what you get, though.
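That "after the fact" pass might look something like this (a rough sketch under the assumptions above, not an actual proposal for the codebase):

```javascript
// Sketch: walk a parsed hash and turn any object whose keys are
// exactly "0".."n-1" into a dense array, leaving everything else alone.
// (Per the JS spec, integer-like keys enumerate in ascending order.)
function arrayify(value) {
  if (typeof value !== 'object' || value === null) return value;
  var keys = Object.keys(value);
  keys.forEach(function (k) { value[k] = arrayify(value[k]); });
  var sequential = keys.length > 0 &&
    keys.every(function (k, i) { return String(i) === k; });
  return sequential
    ? keys.map(function (k) { return value[k]; })
    : value;
}

arrayify({ todos: { 0: { name: 'Get milk' }, 1: { name: 'Get eggs' } } });
// -> { todos: [ { name: 'Get milk' }, { name: 'Get eggs' } ] }

arrayify({ stuff: { 1203843: 'value' } });
// -> { stuff: { 1203843: 'value' } }  (non-sequential: stays an object)
```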
I wonder how other libs do this. In PHP it's quite simple, as the language does it for you and has only "arrays", which encompass both hashmaps and actual arrays. It also doesn't have sparse arrays like JS does: the array length always depends on what's in it, not on which indices are filled.