elephant-bird
Extend PigToThrift to support thrift maps whose keys are not strings
We currently punt on conversion of pig maps to thrift maps when the thrift map key type is not string or enum. This could be improved with some changes to PigToThrift such that pig key strings are converted to the appropriate thrift key type.
How would such a conversion work?
We could pretty easily support conversion from a string key to other simple output key types such as i32 or double: just add logic to apply the appropriate string-to-X conversion within PigToThrift.
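A minimal sketch of what such a string-to-X step might look like, written as standalone Java rather than against the actual PigToThrift code. The class `ThriftKeyConversion`, the `KeyType` enum, and `convertKey` are hypothetical names introduced for illustration only; a real change would presumably switch on the thrift field's own type metadata instead.

```java
/** Illustrative only: not part of elephant-bird. */
final class ThriftKeyConversion {

    /** Hypothetical stand-in for the simple thrift key types. */
    enum KeyType { BOOL, BYTE, I16, I32, I64, DOUBLE, STRING }

    /** Converts a pig map key (always a String) to the boxed Java type for the thrift key. */
    static Object convertKey(String key, KeyType type) {
        if (key == null) {
            throw new IllegalArgumentException("map key must not be null");
        }
        try {
            switch (type) {
                case BOOL:   return parseBool(key);
                case BYTE:   return Byte.valueOf(key);
                case I16:    return Short.valueOf(key);
                case I32:    return Integer.valueOf(key);
                case I64:    return Long.valueOf(key);
                case DOUBLE: return Double.valueOf(key);
                case STRING: return key;
                default:
                    throw new IllegalArgumentException("unsupported key type: " + type);
            }
        } catch (NumberFormatException e) {
            // Surface the offending key instead of silently dropping the entry.
            throw new IllegalArgumentException(
                "cannot convert key '" + key + "' to " + type, e);
        }
    }

    /** Strict boolean parsing: Boolean.parseBoolean would map any junk to false. */
    private static Boolean parseBool(String key) {
        if ("true".equalsIgnoreCase(key)) return Boolean.TRUE;
        if ("false".equalsIgnoreCase(key)) return Boolean.FALSE;
        throw new IllegalArgumentException("cannot convert key '" + key + "' to BOOL");
    }
}
```

Throwing on an unparseable key is a design choice for this sketch; whether bad keys should fail the job or be skipped is left open here.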
Today I fielded a question from someone trying to convert a pig map whose keys were stringified ints to a thrift map<i32, ...>. That led to my walking them through the PigToThrift code and pointing out this limitation (along with the fact that it doesn't fail fast on the front end when an unsupported input schema is detected, though I think I still have a pull request out for that one).
On Oct 10, 2012, at 10:23 PM, dvryaboy wrote:
How would such a conversion work?
Yep, we need to add a schema check for the Lzo Thrift/Protobuf storage, similar to the check in SequenceFileStorage.
We do toString() on the key in the loader, so we could do the reverse while storing.
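The round trip described here could be sketched roughly as follows, again as standalone Java and not the actual loader or storage code. `MapKeyRoundTrip`, `stringifyKeys`, and `restoreKeys` are hypothetical names; the parser passed to `restoreKeys` is an assumption standing in for whatever conversion the thrift key type requires.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Illustrative only: not part of elephant-bird. */
final class MapKeyRoundTrip {

    /** Load direction: typed keys become strings via toString(). */
    static <K, V> Map<String, V> stringifyKeys(Map<K, V> in) {
        Map<String, V> out = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : in.entrySet()) {
            out.put(String.valueOf(e.getKey()), e.getValue());
        }
        return out;
    }

    /** Store direction: string keys are parsed back into the typed key. */
    static <K, V> Map<K, V> restoreKeys(Map<String, V> in, Function<String, K> parser) {
        Map<K, V> out = new LinkedHashMap<>();
        for (Map.Entry<String, V> e : in.entrySet()) {
            K key = parser.apply(e.getKey());
            // Distinct strings such as "1" and "01" can parse to the same key.
            if (out.containsKey(key)) {
                throw new IllegalArgumentException(
                    "keys collide after conversion: '" + e.getKey() + "' -> " + key);
            }
            out.put(key, e.getValue());
        }
        return out;
    }
}
```

For example, `restoreKeys(stringifyKeys(m), Integer::valueOf)` should give back a map equal to `m` for a `Map<Integer, V>`. The collision check matters only for maps built by hand in pig, since keys produced by toString() on distinct ints are already distinct.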