Extend PigToThrift to support thrift maps whose keys are not strings

Open sagemintblue opened this issue 12 years ago • 3 comments

We currently punt on conversion of pig maps to thrift maps when the thrift map key type is not string or enum. This could be improved with some changes to PigToThrift such that pig key strings are converted to the appropriate thrift key type.

sagemintblue avatar Oct 10 '12 23:10 sagemintblue

How would such a conversion work?

dvryaboy avatar Oct 11 '12 05:10 dvryaboy

We could pretty easily support conversion from string keys to other simple output key types like i32 or double: just add logic to apply the appropriate string-to-X conversion within PigToThrift.

Today I fielded a question from someone trying to convert a pig map whose keys were stringified ints to a thrift map<i32, ...>. That led me to walk them through the PigToThrift code and point out this limitation, along with the fact that it doesn't fail fast on the front end when an unsupported input schema is detected (though I think I still have a pull request out for that one).
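
To make the string-to-X idea concrete, here is a minimal sketch of what such key conversion might look like. It is not the PigToThrift code: the `MapKeySketch` class, its `KeyType` enum, and both methods are hypothetical names invented for illustration, and a real change would presumably switch on the thrift field metadata instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MapKeySketch {
    /** Hypothetical stand-in for simple thrift key types; not a Thrift or elephant-bird type. */
    public enum KeyType { STRING, I16, I32, I64, DOUBLE, BOOL }

    /** Parses a pig map key (always a string) into the Java type matching the target key type. */
    public static Object convertKey(String key, KeyType type) {
        switch (type) {
            case STRING: return key;
            case I16:    return Short.valueOf(key);   // throws NumberFormatException on bad input
            case I32:    return Integer.valueOf(key);
            case I64:    return Long.valueOf(key);
            case DOUBLE: return Double.valueOf(key);
            case BOOL:   return Boolean.valueOf(key); // anything other than "true" becomes false
            default:
                throw new IllegalArgumentException("Unsupported key type: " + type);
        }
    }

    /** Builds a new map with converted keys; values are passed through untouched. */
    public static Map<Object, Object> convertKeys(Map<String, ?> pigMap, KeyType type) {
        Map<Object, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, ?> e : pigMap.entrySet()) {
            out.put(convertKey(e.getKey(), type), e.getValue());
        }
        return out;
    }
}
```

One design question this leaves open is what to do with a key that fails to parse: the sketch lets the `NumberFormatException` propagate, but dropping the entry or nulling the field would be equally plausible policies.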

sagemintblue avatar Oct 11 '12 06:10 sagemintblue

Yep, we need to add a schema check for Lzo Thrift/Protobuf storage, similar to the check in SequenceFileStorage.

We do toString() on the key in the loader, so we could do the reverse while storing.
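
As a rough illustration of the fail-fast schema check, the sketch below rejects an unsupported map key type up front instead of silently skipping the conversion later. Everything in it is hypothetical: `KeyTypeCheckSketch`, its `KeyType` enum, and the choice of which types count as convertible are assumptions for illustration, not the check in SequenceFileStorage or any existing elephant-bird code.

```java
import java.util.EnumSet;
import java.util.Set;

public class KeyTypeCheckSketch {
    /** Hypothetical stand-in for thrift key types; not a Thrift or elephant-bird type. */
    public enum KeyType { STRING, ENUM, I16, I32, I64, DOUBLE, BOOL, STRUCT, LIST, SET, MAP }

    /** Assumed set of key types that can be rebuilt from a pig string key. */
    static final Set<KeyType> CONVERTIBLE = EnumSet.of(
        KeyType.STRING, KeyType.ENUM, KeyType.I16, KeyType.I32,
        KeyType.I64, KeyType.DOUBLE, KeyType.BOOL);

    /** Throws immediately if the map field's key type cannot be produced from a string. */
    public static void checkKeyType(String fieldName, KeyType type) {
        if (!CONVERTIBLE.contains(type)) {
            throw new IllegalArgumentException(
                "Map field '" + fieldName + "' has unsupported key type " + type);
        }
    }
}
```

Running a check like this for every map field when the schema is first seen would surface the problem before any data is written.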

rangadi avatar Oct 11 '12 17:10 rangadi