msgpack-tools

msgpack2json: parse error: mpack_error_io (2)

Open Obsecurus opened this issue 6 years ago • 3 comments

msgpack2json: parse error: mpack_error_io (2)

msgpack2json -v
msgpack2json version 0.6
MPack version 0.9dev -- https://github.com/ludocode/mpack
RapidJSON version 1.0.2 -- http://rapidjson.org/
libb64 version 1.2.1 -- http://libb64.sourceforge.net/

I'm using GNU parallel with multiple commands that look like:

zcat input/b59afd25-91af-4593-af01-523465b300aa.gz | msgpack2json -c | gzip > output/b59afd25-91af-4593-af01-523465b300aa.gz

Once I hit the error I removed GNU parallel from the equation and ran the same list of commands directly, exactly as above, but I still get the same error. Any help would be greatly appreciated!

Obsecurus, Jul 16 '19 14:07

Workaround: for now I wrote my own converter using the Python msgpack library, which is working.

#! /usr/bin/python3

import msgpack
import sys
import json

with open(sys.argv[1], 'rb') as f:
    unpacker = msgpack.Unpacker(f, encoding='utf-8')
    for unpacked in unpacker:
        sys.stdout.write(json.dumps(unpacked))

Obsecurus, Jul 16 '19 15:07

Hmm, that's really strange. I don't see any bugs with continuous mode or any unusual interactions with gzip/zcat, as long as there is data in the file:

$ echo '{"hello": "world", "numbers": [5, 3, 7]}' | json2msgpack > a
$ echo '[1, "two", 3.3, 4e4]' | json2msgpack >> a
$ echo '"hello world!"' | json2msgpack >> a
$ hexdump -C a
00000000  82 a5 68 65 6c 6c 6f a5  77 6f 72 6c 64 a7 6e 75  |..hello.world.nu|
00000010  6d 62 65 72 73 93 05 03  07 94 01 a3 74 77 6f cb  |mbers.......two.|
00000020  40 0a 66 66 66 66 66 66  cb 40 e3 88 00 00 00 00  |@.ffffff.@......|
00000030  00 ac 68 65 6c 6c 6f 20  77 6f 72 6c 64 21        |..hello world!|
0000003e
$ cat a | msgpack2json -c
{"hello":"world","numbers":[5,3,7]}[1,"two",3.3,40000.0]"hello world!"
$ gzip a
$ zcat a.gz | msgpack2json -c
{"hello":"world","numbers":[5,3,7]}[1,"two",3.3,40000.0]"hello world!"
$ zcat a.gz | msgpack2json -c | gzip > output.gz
$ zcat output.gz
{"hello":"world","numbers":[5,3,7]}[1,"two",3.3,40000.0]"hello world!"

I did notice that msgpack2json in continuous mode gives an error on empty input, but your Python script does not:

$ zcat a.gz | ./test.py /dev/stdin
./test.py:8: DeprecationWarning: encoding is deprecated, Use raw=False instead.
  unpacker = msgpack.Unpacker(f, encoding='utf-8')
{"hello": "world", "numbers": [5, 3, 7]}[1, "two", 3.3, 40000.0]"hello world!"
$ echo -n '' > b
$ gzip b
$ zcat b.gz | msgpack2json -c
msgpack2json: parse error: mpack_error_io (2)
$ zcat b.gz | ./test.py /dev/stdin
./test.py:8: DeprecationWarning: encoding is deprecated, Use raw=False instead.
  unpacker = msgpack.Unpacker(f, encoding='utf-8')
$ echo $?
0

Could it be that one of the files decompresses to nothing? Try this and see if it prints 0:

zcat input/b59afd25-91af-4593-af01-523465b300aa.gz | wc -c
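If there are many input files, the same check can be scripted. The following is only a minimal sketch: the helper names `decompressed_size` and `find_empty` are hypothetical, and it assumes the inputs are gzip files with a `.gz` suffix sitting directly in one directory.

```python
import gzip
from pathlib import Path


def decompressed_size(path):
    """Return the number of bytes the gzip file at `path` decompresses to."""
    total = 0
    with gzip.open(path, "rb") as f:
        # Read in chunks so large files are never held in memory at once.
        for chunk in iter(lambda: f.read(1 << 16), b""):
            total += len(chunk)
    return total


def find_empty(directory):
    """Return the sorted .gz paths in `directory` that decompress to 0 bytes."""
    return [
        path
        for path in sorted(Path(directory).glob("*.gz"))
        if decompressed_size(path) == 0
    ]
```

Calling `find_empty("input")` would list the files for which `zcat ... | wc -c` prints 0, which are the ones expected to trigger the error in continuous mode.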

msgpack2json raises an error if there is no message, so continuous mode really means "one or more messages". It might make sense to allow no messages without error in continuous mode, so instead it would mean "any number of messages". Let me know if this is the cause of the bug.
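The difference between "one or more messages" and "any number of messages" can be sketched in Python. This is not how msgpack2json is implemented; `write_json_stream` and its `require_one` flag are hypothetical names used only for illustration.

```python
import json


def write_json_stream(messages, out, require_one=False):
    """Write each message in `messages` to `out` as one line of JSON.

    With require_one=True, an empty input is an error ("one or more
    messages"). With the default, an empty input simply writes nothing
    ("any number of messages"). Returns the number of messages written.
    """
    count = 0
    for message in messages:
        out.write(json.dumps(message) + "\n")
        count += 1
    if require_one and count == 0:
        raise ValueError("no messages in input")
    return count
```

With the Python msgpack library installed, this could be driven by `write_json_stream(msgpack.Unpacker(sys.stdin.buffer, raw=False), sys.stdout)`, where `raw=False` is the replacement that the deprecation warning above suggests for `encoding='utf-8'`. Messages containing binary values would need extra handling, since `json.dumps` cannot serialize `bytes`.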

ludocode, Jul 17 '19 00:07

Hi, I'm facing the same error. After I found this post I tried the command "zcat input/b59afd25-91af-4593-af01-523465b300aa.gz | wc -c", and it prints "0". What does that mean?

ana-ra, Apr 29 '22 01:04