
Colab: FileNotFoundError: File b'tpu_train/metadata/results.csv' does not exist

Open stevewells20 opened this issue 5 years ago • 10 comments

The stock Google Colab link in the README.md isn't working correctly. I added a line to download the titanic.csv, then hit run all. Full stack trace below:



Solving a binary_classification problem, maximizing accuracy using tensorflow.

Modeling with field specifications:
PassengerId: numeric
Pclass: categorical
Name: ignore
Sex: categorical
Age: numeric
SibSp: categorical
Parch: categorical
Ticket: ignore
Fare: numeric
Cabin: categorical
Embarked: categorical

0% 0/100 [00:00<?, ?trial/s]
0% 0/20 [00:00<?, ?epoch/s]

---------------------------------------------------------------------------

FileNotFoundError                         Traceback (most recent call last)

<ipython-input-5-17dc9e2d602c> in <module>()
      2                    target_field='Survived',
      3                    model_name='tpu',
----> 4                    tpu_address = tpu_address)

/usr/local/lib/python3.6/dist-packages/automl_gs/automl_gs.py in automl_grid_search(csv_path, target_field, target_metric, framework, model_name, context, num_trials, split, num_epochs, col_types, gpu, tpu_address)
     85         # and append to the metrics CSV.
     86         results = pd.read_csv(os.path.join(train_folder, 
---> 87                                         "metadata", "results.csv"))
     88         results = results.assign(**params)
     89         results.insert(0, 'trial_id', uuid.uuid4())

/usr/local/lib/python3.6/dist-packages/pandas/io/parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, escapechar, comment, encoding, dialect, tupleize_cols, error_bad_lines, warn_bad_lines, skipfooter, skip_footer, doublequote, delim_whitespace, as_recarray, compact_ints, use_unsigned, low_memory, buffer_lines, memory_map, float_precision)
    707                     skip_blank_lines=skip_blank_lines)
    708 
--> 709         return _read(filepath_or_buffer, kwds)
    710 
    711     parser_f.__name__ = name

/usr/local/lib/python3.6/dist-packages/pandas/io/parsers.py in _read(filepath_or_buffer, kwds)
    447 
    448     # Create the parser.
--> 449     parser = TextFileReader(filepath_or_buffer, **kwds)
    450 
    451     if chunksize or iterator:

/usr/local/lib/python3.6/dist-packages/pandas/io/parsers.py in __init__(self, f, engine, **kwds)
    816             self.options['has_index_names'] = kwds['has_index_names']
    817 
--> 818         self._make_engine(self.engine)
    819 
    820     def close(self):

/usr/local/lib/python3.6/dist-packages/pandas/io/parsers.py in _make_engine(self, engine)
   1047     def _make_engine(self, engine='c'):
   1048         if engine == 'c':
-> 1049             self._engine = CParserWrapper(self.f, **self.options)
   1050         else:
   1051             if engine == 'python':

/usr/local/lib/python3.6/dist-packages/pandas/io/parsers.py in __init__(self, src, **kwds)
   1693         kwds['allow_leading_cols'] = self.index_col is not False
   1694 
-> 1695         self._reader = parsers.TextReader(src, **kwds)
   1696 
   1697         # XXX

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.__cinit__()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._setup_parser_source()

FileNotFoundError: File b'tpu_train/metadata/results.csv' does not exist


stevewells20 avatar Mar 27 '19 00:03 stevewells20

In that case, the subprocess script errored and never completed. Will look into it.

minimaxir avatar Mar 27 '19 00:03 minimaxir

Same issue on my end. It looks like it's attempting to read a CSV that it hasn't created yet: `results = pd.read_csv(os.path.join(train_folder, "metadata", "results.csv"))`
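A defensive check would make this failure mode clearer. This is only a sketch of what could wrap that `pd.read_csv` call (the `read_trial_results` helper name is hypothetical, not part of automl-gs), so a missing file points at the real cause — a crashed training subprocess — instead of a pandas traceback:

```python
import os

import pandas as pd


def read_trial_results(train_folder):
    """Read the per-trial results CSV, failing with a clearer message
    if the training subprocess never wrote it."""
    results_path = os.path.join(train_folder, "metadata", "results.csv")
    if not os.path.exists(results_path):
        raise RuntimeError(
            "{} was not written; the generated training script likely "
            "crashed before saving metrics. Check the Jupyter/Colab "
            "console logs for the underlying error.".format(results_path)
        )
    return pd.read_csv(results_path)
```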

PlazaAndMainSt avatar Mar 27 '19 12:03 PlazaAndMainSt

I'm having the same problem, using xgboost to train a text classification model. It learns the encodings, but fails when it actually needs to train.

robotenique avatar Apr 07 '19 20:04 robotenique

+1. The same issue here.

agigao avatar Apr 20 '19 05:04 agigao

I have created a StackOverflow question [https://stackoverflow.com/q/55959256/1330974] related to this issue. If anyone finds a solution to this, please share. Thank you!

plartoo avatar May 02 '19 19:05 plartoo

The FileNotFoundError is most likely a red herring. You need to look in the Jupyter logs to find the true error, as it's the subprocess that's crashing. See #14.
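For anyone digging into this: the generated training script runs as a child process, so its traceback never reaches the notebook cell. A minimal sketch of how a child's stderr can be surfaced (simulating a crashing script here, not the real generated `model.py`):

```python
import subprocess
import sys

# Stand-in for the generated training script: a child process that
# crashes. Its traceback goes to stderr, not to the parent notebook cell.
proc = subprocess.run(
    [sys.executable, "-c", "raise ValueError('the real cause')"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,  # Python 3.6-compatible text mode
)

if proc.returncode != 0:
    # This is where the genuine error is hiding.
    print(proc.stderr)
```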

harvitronix avatar Jun 05 '19 17:06 harvitronix

I have the same problem. Has anyone found a solution?

AlexandraMakarova avatar Jul 03 '19 13:07 AlexandraMakarova

Still no solution?

alexandre-xn avatar Jan 27 '20 16:01 alexandre-xn

Guess not; there are a lot of open issues.

robotenique avatar Jan 27 '20 17:01 robotenique

I tried adding `%tensorflow_version 1.x` to the Colab notebook, but now I get a different error: an `IndexError` on the line `train_results = results.tail(1).to_dict('records')[0]`. My guess is that `tensorflow.train.cosine_decay` does not exist in the current version of TensorFlow.

EDIT: I just switched away from TPU while keeping the TensorFlow version pin, and it's working!
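For reference, the working combination described above would look roughly like this as a Colab cell. This is a sketch, not a tested fix: the argument names come from the `automl_grid_search` signature shown in the traceback, and `titanic.csv` is the example dataset from the README.

```python
# Colab cell: pin TensorFlow 1.x *before* importing automl_gs,
# and omit the tpu_address argument to avoid the TPU code path.
%tensorflow_version 1.x

from automl_gs import automl_grid_search

automl_grid_search(csv_path='titanic.csv',
                   target_field='Survived',
                   model_name='titanic')
```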

thicccatto avatar Sep 01 '20 02:09 thicccatto