pbc
Using the Lua decode API in pbc produces garbled Unicode output
Hello 云风, I ran into a decode failure while using your pbc library. My setup is as follows:
- proto file:

    package Cookie;

    message CookieValue {
        message Segments {
            optional int32 s_type = 1;
            optional int32 s_id = 2;
            optional int32 s_ct = 3;
            optional int32 s_ttl = 4;
        }
        repeated Segments seg = 1;

        //last bid info
        message Pbid {
            optional int32 p_id = 1;
            optional int32 p_value = 2;
        }
        repeated Pbid pbid = 2;

        //frequency daily
        message Daily {
            optional int32 daily_id = 1;
            message DailyImp {
                optional int32 d_id = 1;
                optional int32 d_value = 2;
            }
            repeated DailyImp daily_imp = 2;
        }
        repeated Daily daily = 3;

        //frequency total
        message Total {
            optional int32 t_id = 1;
            optional int32 t_value = 2;
        }
        repeated Total total = 4;

        message Imp {
            optional int32 imp_c_id = 1;
            optional int32 win = 2;
            optional int32 bid = 3;
            optional int32 imp_time = 4;
        }
        repeated Imp imp = 5;

        message Clk {
            optional int32 clk_id = 1;
            optional int32 clk_time = 2;
        }
        repeated Clk clk = 6;
    }

- Lua code:

    local protobuf = require "protobuf"
    local cjson = require "cjson"

    addr = io.open("../../build/cookie.pb","rb")
    buffer = addr:read "*a"
    addr:close()
    protobuf.register(buffer)

    local CookieValue={
        seg={
            {s_type=1,s_ct=1415384160,s_ttl=1415384160,s_id=100001}
        },
        total={
            {t_value=1,t_id=100188}
        },
        pbid={
            {p_value=3,p_id=100188},
            {p_value=3,p_id=100189},
            {p_value=3,p_id=100190}
        },
        daily={
            {daily_id=1125,daily_imp={{d_value=1,d_id=100188}}},
            {daily_id=1126,daily_imp={{d_value=1,d_id=100118}}}
        },
        imp={
            {win=2,bid=3,imp_time=8160,imp_c_id=100188},
            {win=2,bid=3,imp_time=8155,imp_c_id=100182},
            {win=2,bid=3,imp_time=8157,imp_c_id=100181},
        }
    }

    local buffer = protobuf.encode("Cookie.CookieValue", CookieValue)
    print('buffer: ',buffer)
    print('the len of buffer: ',#buffer)
    local t = protobuf.decode("Cookie.CookieValue", buffer)
    print('cjson(t): ',cjson.encode(t))

    for k,v in pairs(t) do
        if type(k) == "string" then
            print(k,v)
        end
    end

    for k,v in pairs(t.daily[1]) do
        print(k,v)
    end

    for k,v in pairs(t.pbid[1]) do
        print(k,v,type(k),type(v))
    end
- Printed output:

    buffer: ݎގ �" ����������* �* �֎* �Վ
    the len of buffer: 117
    cjson(t): {"pbid":[["Cookie.CookieValue.Pbid","\b\u0006\u0010\u0003"],["Cookie.CookieValue.Pbid","\bݎ\u0006\u0010\u0003"],["Cookie.CookieValue.Pbid","\bގ\u0006\u0010\u0003"]],"daily":[["Cookie.CookieValue.Daily","\b�\b\u0012\u0006\b\u0006\u0010\u0001"],["Cookie.CookieValue.Daily","\b�\b\u0012\u0006\b�u0006\u0010\u0001"]],"total":[["Cookie.CookieValue.Total","\b\u0006\u0010\u0001"]],"seg":[["Cookie.CookieValue.Segments"," ����\u0005\u0010��\u0006\b\u0001\u0018����\u0005"]],"imp":[["Cookie.CookieValue.Imp","\u0010\u0002\u0018\u0003 �?\b\u0006"],["Cookie.CookieValue.Imp","\u0010\u0002\u0018\u0003 �?\b֎\u0006"],["Cookie.CookieValue.Imp","\u0010\u0002\u0018\u0003 �?\bՎ\u0006"]]}
    pbid table: 0x4005ed38
    daily table: 0x4005ef30
    total table: 0x4005f030
    seg table: 0x4005e708
    imp table: 0x4005e7b8
    1 Cookie.CookieValue.Daily
    2
    1 Cookie.CookieValue.Pbid number string
    2 number string
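- Question: As shown above, the binary produced by the encode API has the same length as the binary generated by Google's official implementation. The decode API, however, produces garbled Unicode output. I built both pbc.a and protobuf.so against LuaJIT and added the -fPIC option, as follows:

    gcc -O2 -Wall --shared -fPIC -o protobuf.so -I../.. -I/usr/local/include/luajit-2.0 -L/usr/local/bin -L../../build pbc-lua.c -lpbc -lluajit-5.1

I don't know whether this affects the problem. We built it this way because we have high performance requirements. Thanks.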
The table returned by pbc's Lua decode is expanded lazily. It does not support iteration.
cjson.encode(t) is incorrect usage.
On December 23, 2014, at 10:08 AM, XavierXia [email protected] wrote:
cjson.encode(t))
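To illustrate the lazy expansion described in this reply, here is a minimal pure-Lua sketch. It is not pbc's actual implementation: the helpers `parse`, `expand`, `lazy`, and `raw_keys`, and the toy "k=v" wire format, are assumptions made up for the example. It shows why code that walks a decoded sub-message with raw traversal (which is what the cjson output in the question suggests) sees only a type name and an undecoded buffer, while reading a field by name triggers the expansion.

```lua
-- Toy stand-in for a real decoder: turns "a=1;b=2" into { a = 1, b = 2 }.
local function parse(buffer)
  local fields = {}
  for k, v in buffer:gmatch("([%w_]+)=(%d+)") do
    fields[k] = tonumber(v)
  end
  return fields
end

local lazy_meta = {}

-- Replace the { typename, buffer } pair with the decoded fields.
local function expand(tbl)
  local buffer = rawget(tbl, 2)
  rawset(tbl, 1, nil)
  rawset(tbl, 2, nil)
  setmetatable(tbl, nil)
  for k, v in pairs(parse(buffer)) do
    tbl[k] = v
  end
end

-- The first read of a missing key expands the message, then answers the read.
function lazy_meta.__index(tbl, key)
  expand(tbl)
  return rawget(tbl, key)
end

-- Hypothetical constructor for an unexpanded sub-message.
local function lazy(typename, buffer)
  return setmetatable({ typename, buffer }, lazy_meta)
end

-- List keys using raw traversal, which never invokes metamethods.
local function raw_keys(tbl)
  local keys = {}
  for k in next, tbl do
    keys[#keys + 1] = tostring(k)
  end
  table.sort(keys)
  return table.concat(keys, ",")
end

local msg = lazy("Cookie.CookieValue.Pbid", "p_id=100188;p_value=3")
local before = raw_keys(msg)  -- only slots 1 and 2 exist: typename and buffer
local p_id = msg.p_id         -- field access triggers expansion
local after = raw_keys(msg)   -- now the real field names are present

print(before, p_id, after)
```

With pbc itself the same idea should apply: read the fields you need by name (for example `t.pbid[1].p_id`) and build a plain table from them before handing it to a JSON encoder, rather than serializing the decoded table directly.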
Understood, thanks. I also found a related article on your blog, and I plan to try your sproto.
----- parse/decode              size (bytes)   time (s)            iterations
protobuf (binary to table)      117            9.6896359920502     1M
cjson (json str to json)        437            14.513430833817
pbc-lua (decode)                nil            7.8595468997955

----- seri/encode               size (bytes)   time (s)            iterations
protobuf (table to binary)      117            6.0948710441589     1M
cjson (json to json str)        437            23.61823201179
pbc-lua (encode)                117            9.381726026535
The test data is as follows:
{"segments":[{"type":1,"ct":1415384160,"ttl":1415384160,"id":100001}],"total":[{"value":1,"id":100188}],"pbid":[{"value":3,"id":100188},{"value":3,"id":100189},{"value":3,"id":100190}],"daily":[{"daily_id":1125,"date_id":[{"value":1,"id":100188}]},{"daily_id":1126,"date_id":[{"value":1,"id":100188}]}],"imp":[{"win":2,"bid":3,"time":8160,"id":100188},{"win":2,"bid":3,"time":8160,"id":100189},{"win":2,"bid":3,"time":8160,"id":100190}]}
I also ran a quick benchmark along the way. Both pbc-lua and the Lua part of protobuf were built with LuaJIT, and your library performs very well.
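Timings like those in the table above could be collected with a small harness along these lines. This is only a sketch: `bench` and the placeholder workload are hypothetical, and it is not the script that produced the numbers in this thread. Note that `os.clock` reports CPU time, not wall-clock time.

```lua
-- Hypothetical helper: run fn `iterations` times and return elapsed CPU seconds.
local function bench(iterations, fn)
  local start = os.clock()
  for _ = 1, iterations do
    fn()
  end
  return os.clock() - start
end

-- Placeholder workload; swap in the encode or decode call being measured.
local calls = 0
local elapsed = bench(100000, function()
  calls = calls + 1
end)

print(string.format("%d calls in %.6f s", calls, elapsed))
```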
@XavierXia How did you eventually solve this problem? I'm using rapidjson and ran into the same issue.