tensorflow-recorder

'CsvCoder' object has no attribute 'decode'

Status: Open. lucas-fern opened this issue 3 years ago • 0 comments

Describe the bug: Calling df.tensorflow.to_tfr() raises an AttributeError: 'CsvCoder' object has no attribute 'decode'

To Reproduce: Define a schema and call df.tensorflow.to_tfr() on the DataFrame. Every column in the schema is one of types.SplitKey, types.IntegerLabel, or types.IntegerInput.

Expected behavior: The DataFrame is written out as a TFRecord without errors.

System:

  • OS: Ubuntu-20.04 via WSL2
  • Python Version: 3.8.1
  • TensorFlow Version: 2.3.1
  • TensorFlow Transform Version: 1.2.0

Additional context

AttributeError: 'CsvCoder' object has no attribute 'decode'
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-8-95f4c12ae7c5> in <module>
----> 1 sample.tensorflow.to_tfr(output_dir=f'{PROCESSED_DATA_DIR}/df-trips+weather_split_1', schema=schema)

~/.local/lib/python3.8/site-packages/tfrecorder/accessor.py in to_tfr(self, output_dir, schema, runner, project, region, tfrecorder_wheel, dataflow_options, job_label, compression, num_shards)
     87             '<b>Logging output to /tmp/{} </b>'.format(constants.LOGFILE)))
     88 
---> 89     r = converter.convert(
     90         self._df,
     91         output_dir=output_dir,

~/.local/lib/python3.8/site-packages/tfrecorder/converter.py in convert(source, output_dir, schema, header, names, runner, project, region, tfrecorder_wheel, dataflow_options, job_label, compression, num_shards)
    309   job_dir = _get_job_dir(output_dir, job_name)
    310 
--> 311   p = beam_pipeline.build_pipeline(
    312       df,
    313       job_dir=job_dir,

~/.local/lib/python3.8/site-packages/tfrecorder/beam_pipeline.py in build_pipeline(df, job_dir, runner, project, region, compression, num_shards, schema, tfrecorder_wheel, dataflow_options)
    251         | 'ReadFromDataFrame' >> beam.Create(df.values.tolist())
    252         | 'ToCSVRows' >> beam.ParDo(flatten_rows)
--> 253         | 'DecodeCSV' >> beam.Map(converter.decode)
    254     )
    255 

AttributeError: 'CsvCoder' object has no attribute 'decode'
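
The traceback shows that the line `beam.Map(converter.decode)` fails on the attribute lookup itself, while the pipeline is still being built. The coder object in the installed environment therefore has no `decode` method at all. A minimal diagnostic sketch for confirming this is given here. The helper names (`installed_version`, `missing_methods`) and the `StandInCoder` class are hypothetical and introduced only for illustration. They are not part of tfrecorder or TensorFlow Transform.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def missing_methods(obj, names):
    """Return the names in `names` that `obj` does not expose as callables."""
    return [n for n in names if not callable(getattr(obj, n, None))]


class StandInCoder:
    """Hypothetical stand-in: like the coder in the traceback, it has no decode()."""

    def encode(self, row):
        return ",".join(str(v) for v in row)


if __name__ == "__main__":
    # Which of the expected methods are absent on the stand-in?
    print(missing_methods(StandInCoder(), ["encode", "decode"]))
    # Report what is installed; None means the distribution was not found.
    for pkg in ("tensorflow-transform", "tfrecorder", "apache-beam"):
        print(pkg, installed_version(pkg))
```

In the affected environment, the same check could be pointed at the real class, for example `missing_methods(tft.coders.CsvCoder, ["encode", "decode"])` after `import tensorflow_transform as tft`. Reading the result next to the reported package versions should show whether the installed TensorFlow Transform release matches what tfrecorder expects.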

lucas-fern • Aug 08 '21 07:08