pandavro
allow for process_record() while reading in avro
Feature request: could you allow an optional process_record function to be applied to each record while reading in Avro? Here is a suggestion.
import fastavro
import pandas as pd

def __file_to_dataframe(f, schema, process_record=None, **kwargs):
    reader = fastavro.reader(f, reader_schema=schema)
    if process_record:
        # Apply the caller-supplied function to each record as it is read
        records = [process_record(r) for r in reader]
    else:
        records = list(reader)
    return pd.DataFrame.from_records(records, **kwargs)
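To illustrate the kind of callback I have in mind, here is a minimal sketch. The name `flatten_record` and the sample data are hypothetical, and plain dicts stand in for the records that `fastavro.reader` would yield; nothing here is part of pandavro itself.

```python
def flatten_record(record, parent_key="", sep="."):
    """Hypothetical process_record callback: flatten nested dicts into one level."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested records, prefixing keys with the parent name
            flat.update(flatten_record(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Plain dicts standing in for decoded Avro records (assumed sample data)
sample_records = [
    {"id": 1, "user": {"name": "a", "geo": {"lat": 1.5}}},
    {"id": 2, "user": {"name": "b", "geo": {"lat": 2.5}}},
]

# This mirrors the list comprehension in the suggested change
processed = [flatten_record(r) for r in sample_records]
```

Under the suggestion above, this would be passed as `process_record=flatten_record`, so the resulting DataFrame would get one column per flattened key instead of a single column of nested dicts.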