db icon indicating copy to clipboard operation
db copied to clipboard

Batchinsert and associative arrays

Open SamMousa opened this issue 5 years ago • 11 comments

Clarification

There is more at play here than the "order of elements" argument:

  1. Associative arrays are conceptually like dictionaries. While they have an order in PHP, it is not intuitive to use to depend on it.
  2. In this bug specifically, if you use associative data rows then type casting will not working unless your columns array also uses the same indexes; this is even less intuitive. The columns parameter is the ordered list of columns, their indexes shouldn't matter.

Regardless of the argument that you can construct all kinds of arrays in PHP I propose documenting a preferred structure and actually checking if the code conforms...

What steps will reproduce the problem?

Create code like this:

$x->batchInsert('table1', ['a', 'b'], ['b' => 123, 'a' => 345]);

What is the expected result?

I expect it to work properly, ie create a query like this:

insert into `table1` (`a`, `b`) values (345, 123);

What do you get instead?

insert into `table1` (`a`, `b`) values (123, 345);

Solution

The thing that is wrong here is not so much the code as my assumption: batch insert does not support associative arrays; this means things like column order and also column type casting will go wrong when passing in associative arrays.

My proposal is to actively check each key in the row data and throw a hard exception if it's not numerical. Since we're already iterating we can do this without a significant performance cost; the only lines we'd add are something like this:

if (!is_int($i)) {
    throw new SomeKindOfException('Data rows must be numerically indexed');
}

Additional info

Q A
Yii version latest
PHP version N/A
Operating system N/A
Related issues #14608

SamMousa avatar Aug 10 '18 08:08 SamMousa

The problem is not associative array, but invalid order of elements. If you pass associative array with correct order, everything will work fine:

$x->batchInsert('table1', ['a', 'b'], ['a' => 345, 'b' => 123]);

Throwing exception will break perfectly fine working code.

rob006 avatar Aug 11 '18 11:08 rob006

Associative arrays tend not to be ordered; I know they are but when using them you don't want to depend on the order of associative elements...

SamMousa avatar Aug 11 '18 12:08 SamMousa

And how this is different from arrays indexed by integers? You still can have [2 => 2, 1 => 1, 3 => 3].

rob006 avatar Aug 11 '18 12:08 rob006

It's not, both of those are dictionaries; however php doesn't have lists so they are also dictionaries... These are conceptually different. My point not related to the order of elements BTW, in the correct order things like typecasting still don't work. Check the source if you don't believe me...

SamMousa avatar Aug 11 '18 12:08 SamMousa

I had similar issue and as simple solution I used keys from array as argument.

$array = ['b' => 123, 'a' => 345];
$x->batchInsert('table1', array_keys(reset($array)), $array);

This does not cover different keys order in every array but when every array has same keys order this should work. Also when you change/add/delete one of key this should still work.

paweljankowiak06 avatar Aug 11 '18 13:08 paweljankowiak06

Currently associative arrays have no meaning in batch insert. Keys are ignored. What's the use case for naming columns explicitly in each element of a huge data array?

samdark avatar Aug 19 '18 21:08 samdark

I don't have one, I just happened to get the data with keys. Then I found out the hard way that typecasting was broken because of it.

SamMousa avatar Aug 20 '18 05:08 SamMousa

Well, then we can conclude that the format supplied wasn't supported and it was said nowhere in the docs that it would work. I see no reason to support such format in the future.

samdark avatar Aug 20 '18 10:08 samdark

@samdark You misunderstand me; I'm not proposing adding support, i'm proposing to make failure more obvious:

My proposal is to actively check each key in the row data and throw a hard exception if it's not numerical.

Since we're already iterating it's a cheap thing to do and will make non-apparent failures more obvious.

SamMousa avatar Aug 20 '18 11:08 SamMousa

Ah, good to go then.

samdark avatar Aug 20 '18 22:08 samdark

https://github.com/yiisoft/yii2/pull/17066#issuecomment-471283293

rob006 avatar Mar 10 '19 22:03 rob006