Need to read sequence files nicely
I would like to be able to specify two Writable classes to receive the contents of a sequence file.
This is similar to the way that Avro naturally reads when we use strings() or integers() or such. There would be a new kind of PType that describes which Writables are being used to do the reading.
Have you seen the support for SequenceFiles added to Avro 1.4.1?
https://issues.apache.org/jira/browse/AVRO-662
http://avro.apache.org/docs/current/api/java/org/apache/avro/mapred/SequenceFileInputFormat.html
This uses reflection to infer the Avro schema used when writing the Writables, so non-static, non-transient fields are recursively written, which works well for most Writables.
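To make the field-selection rule concrete, here is a minimal sketch using only java.lang.reflect. It is not Avro's implementation; FooWritable, its fields, and serializedFieldNames are hypothetical names chosen for illustration. It inspects a single class and does not recurse into field types or superclasses.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

public class ReflectedFields {
    // Hypothetical stand-in for a Writable value class (not a real Hadoop type).
    public static class FooWritable {
        static final int VERSION = 1;   // static: would be skipped
        transient long cachedHash;      // transient: would be skipped
        long id;                        // kept
        String label;                   // kept
    }

    // Returns the names of the fields a reflection-based schema would include:
    // every declared field that is neither static nor transient.
    public static List<String> serializedFieldNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            int mods = f.getModifiers();
            if (Modifier.isStatic(mods) || Modifier.isTransient(mods) || f.isSynthetic()) {
                continue;
            }
            names.add(f.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Prints the names of the fields that survive the filter.
        System.out.println(serializedFieldNames(FooWritable.class));
    }
}
```

With this hypothetical class, only id and label pass the filter. A Writable that keeps its real state in static or transient fields would lose that state under this rule.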
This permits one to have SequenceFiles as inputs but not as outputs. Does that suffice?
It probably doesn't suffice to just read data, at least over the long term, but this could get me much further down the road in the short term.
I tried this. What I did was to put a reference to a writable into an Avro schema thus:
{ "type": "record", "name": "foo", "fields": [ {"name": "a", "type": "long"}, {"name": "b", "type": {"type":"FooWritable"}} ]}
Avro's schema parser barfed on this. It would be nice if a schema like this could be used on a SequenceFile with long keys and FooWritable values.
The intent is that FooWritable's reflected schema would be inlined above. Do you have a code example/test case of what you're trying to do?