tarantool-php icon indicating copy to clipboard operation
tarantool-php copied to clipboard

Straightforward distinguishing between arrays and maps

Open zaglex opened this issue 8 years ago • 14 comments

Conversion from php-array to lua-table may be little bit tricky.

Consider the case:

$data[0] = 'a';
$data[1] = 'b';
$data[2] = 'c';
$tarantool->call('myfunc', [$data]);

On lua side you will se such a table:

data[0] == nil
data[1] == 'a'
data[2] == 'b'
data[3] == 'c'

But then you do this on php-side:

unset($data[1]);
$tarantool->call('myfunc', [$data]);

And on lua side you will se this:

data[0] == 'a'
data[1] == 'nil'
data[2] == 'c'

I understand why it happens, but anyway this may be confusing and may cause unexpected errors.

Is there any way to force tarantool client to trait array either as "real array" or as "map"? Would be great to have special classes like MessagePackArray and MessagePackMap, or even LuaArray and LuaTable for this purpose.

zaglex avatar Jun 16 '16 11:06 zaglex

From what i see, [0 => 'a', 1 => 'b', 2 => 'c'] is packed to MP array and [0 => 'a', 2 => 'c'] to MP map and on Lua side these structures are converted to Lua array and Lua table accordingly. Why it's confusing and what result you expect to see? Could you please elaborate on this a bit further?

rybakit avatar Jun 16 '16 12:06 rybakit

Imagine space where tuples consist of 2 fields: NUM primary key and TABLE "payload". Payload is a variable-length map: { fieldId => fieldValue}, where fieldId is integer >= 0. When you pass from php to lua something like this:

$fields = [];
$fields[0] = 'field0Val';
$fields[5] = 'field5Val';
$field[7] = 'field7Val';
  • it is treated as map and everything is ok. But if in some case you have sequential fieldIds in some set:
$fields = [];
$fields[0] = 'field0Val';
$fields[1] = 'field1Val';
$field[2] = 'field2Val';
  • then it will be unexpectedly treated as array.

Of course you can disallow to use 0 as field identifier on the app level, so it doesn't look like a critical issue, or may be it is not an issue at all - just bad choice of field identifier in someone's app)

zaglex avatar Jun 17 '16 08:06 zaglex

Ah, I see. I have the similar issue with the msgpack php extension ;) In my own msgpack implementation I use MP ext type to pack arbitrary types (see https://github.com/rybakit/msgpack.php#custom-types) but, afaik, Tarantool's msgpack doesn't support it yet.

rybakit avatar Jun 17 '16 09:06 rybakit

I make the issue https://github.com/rtsisyk/msgpuck/issues/10 at 3 month ago. The part code of the implenentation MP ext type in the hhvm_msgpack: https://github.com/akalend/hhvm-msgpack/blob/hhvm-v-3.12/msgpuck.h

akalend avatar Jun 17 '16 12:06 akalend

@rybakit As I get it - there's no connection between MP_EXT and this issue. @zaglex want a way to enforce something to be Map or Array. That's why some problems with indexing are happened.

bigbes avatar Jun 17 '16 14:06 bigbes

@zaglex what'll happened if you'll do something like this?

$data[1] = NULL;
$tarantool->call('myfunc', [$data]);

bigbes avatar Jun 17 '16 14:06 bigbes

Yes, @bigbes, you are right, I'm speaking about enforcing Map or Array. As for your example, when I do this:

$data[1] = NULL;
$tarantool->call('myfunc', [$data]);

I get this:

data[0]  - nil
data[1]  - cdata<void *>: NULL

zaglex avatar Jun 17 '16 14:06 zaglex

Ah, you mean that this solves problem with unset from my first example. Yes, it looks like a workaround.

zaglex avatar Jun 17 '16 15:06 zaglex

@bigbes I don't see any other sane way to force php array to be packed as MP map. And with ext it's quite easy to implement. It's already possible with pure client, so I can do something like:

// register your own map transformer 
$coll = new Collection([new MapTransformer(5)]);
$packer->setTransformers($coll);

...

// send a map object directly, MapTransformer will take care about the rest
$client->call('myfunc', [new Map(['a', 'b', 'c'])]); // <- will be packed as MP map

The only problem atm is that on Lua side you will not be able to unpack this extension.

rybakit avatar Jun 17 '16 15:06 rybakit

@rybakit it was said - you're creating MessagePackMap/MessagePackArray instance (that inherits ArrayObject) and use it like array. Maybe you can create object, that will store Array. Every other thing will happen in connector (type checking and applying right encoding procedure)

bigbes avatar Jun 17 '16 19:06 bigbes

@bigbes Of course, you can create the classes and do some checks it in the connector, but:

  1. The connector should know about those classes to apply different packing/unpacking rules based on the class name (or other marker, e.g. interface).

  2. Note, that the problem is a bit wider than only map/array distinguishing. The same issue is with MP str/bin format family. Sometimes you need to pack/unpack php string as MP binary and vise versa.

I'm thinking out loud now, but you may consider to add the Tarantool\Ext class:

$tnt->call('myfunct', new Ext(1, ['a', 'b', 'c'])); // we know that 1 means "array"
$tnt->call('myfunct', new Ext(2, ['a', 'b', 'c'])); // and 2 means "map" in our app

And Tarantool can even have predefined ext types for common cases:

Tarantool::EXT_ARRAY = 1
Tarantool::EXT_MAP = 2
Tarantool::EXT_UTF8 = 3
Tarantool::EXT_BIN = 4
...
(we can reserve 0-16 values for future types)

So instead of creating custom MessagePackMap / MessagePackArray / MessagePackUtf8 / MessagePackBin classes you will have one generic Ext class which will do the same. And you can decide if you want to pack/unpack those structures directly in the connector or forward them to Tarantool as MP ext and let him do packing/unpacking (but then, Tarantool should support MP ext out of the box).

rybakit avatar Jun 17 '16 20:06 rybakit

  1. It's what we discuss here - we need those classes or not
  2. Yes we can, but that needs changing of server - and that's more fundamental task, than just rewrite part of connector. It's overingeneering.

bigbes avatar Jun 18 '16 08:06 bigbes

There is a related issue regarding bin/utf8 distinguishing: https://github.com/msgpack/msgpack-php/issues/89.

rybakit avatar Jun 30 '16 20:06 rybakit

For the record, here is an example of what I was talking above regarding the custom Map objects: https://github.com/rybakit/msgpack.php/blob/64adceae711932b7c55f658884c110f5c2e18e50/examples/map.php#L13-L14. The only difference is that the new implementation doesn't use the MessagePack extension, but directly packs the wrapped array into MP map. Though I'm not sure how this functionality can be ported to the pecl connector.

rybakit avatar Apr 14 '18 22:04 rybakit