absolute path for Zoo model and log path in Databricks

Open jack1981 opened this issue 3 years ago • 3 comments

When we use the Estimator.from_keras and save_keras_model APIs, save_to_remote_dir has to be an absolute DBFS path (with the dbfs:// prefix), but log_dir has to be a non-absolute path (it cannot include dbfs://):

est = Estimator.from_keras(tm117_model.tm117_4_tfoptimizer(), model_dir=log_dir)
est.save_keras_model(save_to_remote_dir)

Could Zoo support a consistent path strategy for both the model and the logs (checkpoints)?
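
Until both arguments accept the same form, one possible workaround is to derive the two paths from a single base path. The following is only a minimal sketch based on the behavior described above; `dbfs_path_pair` is a hypothetical helper, not part of the Zoo API, and the exact path forms Databricks accepts are an assumption.

```python
from typing import Tuple

DBFS_SCHEME = "dbfs://"  # prefix as written in this issue


def dbfs_path_pair(base: str) -> Tuple[str, str]:
    """Hypothetical helper: return (save_to_remote_dir, log_dir) for one base path.

    save_to_remote_dir carries the dbfs:// prefix, log_dir does not.
    """
    # Drop the scheme if the caller already included it.
    bare = base[len(DBFS_SCHEME):] if base.startswith(DBFS_SCHEME) else base
    return DBFS_SCHEME + bare, bare


save_to_remote_dir, log_dir = dbfs_path_pair("dbfs://models/tm117")
# est = Estimator.from_keras(..., model_dir=log_dir)
# est.save_keras_model(save_to_remote_dir)
```

This keeps the path defined in one place, so the model and log locations cannot drift apart.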

jack1981 avatar Sep 22 '21 17:09 jack1981

On HDFS there is no such issue. For log_dir, I think the BigDL RecordWriter only checks whether the file path starts with "hdfs"; if it does not, it writes with Java FileOutputStream, which cannot handle "dbfs://" paths.
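
The prefix check described above can be sketched roughly as follows. This is an illustration in Python of the suspected behavior, not BigDL's actual code; `pick_writer` and its return labels are hypothetical names.

```python
def pick_writer(path: str) -> str:
    """Hypothetical sketch of the suspected dispatch in RecordWriter."""
    # Only paths starting with "hdfs" go through the Hadoop file system.
    if path.startswith("hdfs"):
        return "hadoop"
    # Everything else, including "dbfs://...", is treated as a local file
    # and would be opened with a plain file output stream.
    return "local"


for p in ("hdfs://namenode/logs", "dbfs://logs/run1", "logs/run1"):
    print(p, "->", pick_writer(p))
```

Under this assumption a "dbfs://" path falls into the local branch, which would explain why it fails for log_dir while a scheme-less path works.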

jenniew avatar Sep 27 '21 21:09 jenniew

Why do the model path and the log path behave differently?

jason-dai avatar Sep 28 '21 14:09 jason-dai

The save model API uses the Hadoop file system to copy files. For TensorBoard logging, however, a comment in the BigDL code says that FSDataOutputStream (the Hadoop FSDataOutputStream) cannot flush data to the local file system in time, so reading the summaries would throw an exception.

jenniew avatar Oct 01 '21 01:10 jenniew