yogadl
Submitting a dataset to GCS storage does not store all the data in GCS
In one of my Jupyter notebooks:
import yogadl.storage  # so that the yogadl.storage names used below resolve

fs_config = yogadl.storage.GCSConfigurations(
    bucket="mybucket",
    bucket_directory_path="yogadl_cache",
    url="ws://localhost:10050",
    local_cache_dir="/tmp/",
)
storage = yogadl.storage.GCSStorage(fs_config)
# Submit val_ds under the dataset name "dl_a2_val", version "1.0".
storage.submit(val_ds, "dl_a2_val", "1.0")
In another Jupyter notebook on another machine:
import yogadl
# Get the DataRef.
dataref = storage.fetch("dl_a2_val", "1.0")
I got the following error:
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
<ipython-input-10-9b0b8157fa6d> in <module>()
2
3 # Get the DataRef.
----> 4 dataref = storage.fetch("dl_a2_samples", "1.0")
5
6 # Tell the DataRef how to stream the dataset.
2 frames
/usr/local/lib/python3.7/dist-packages/google/cloud/storage/blob.py in download_to_filename(self, filename, client, start, end)
662 """
663 try:
--> 664 with open(filename, "wb") as file_obj:
665 self.download_to_file(file_obj, client=client, start=start, end=end)
666 except resumable_media.DataCorruption:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/yogadl_local_cache/dl_a2_samples/1.0/cache.mdb'
In GCS, I only see a chache.bdb file. Why doesn't submitting a dataset to storage store all the data in GCS?
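A note on the traceback itself: `open(filename, "wb")` raises `FileNotFoundError` when the parent directory is missing, not when the file itself is missing. So the error may mean that the local cache directory `/tmp/yogadl_local_cache/dl_a2_samples/1.0/` does not exist on the second machine, rather than that data is missing from GCS. The sketch below reproduces that behavior with plain Python only. The helper `open_for_download` and the temporary path are hypothetical and are not part of yogadl or google-cloud-storage.

```python
import os
import tempfile


def open_for_download(filename):
    """Hypothetical helper: create the parent directory, then open for binary writing."""
    os.makedirs(os.path.dirname(filename), exist_ok=True)
    return open(filename, "wb")


# Scratch path that mirrors the directory layout seen in the traceback.
base = tempfile.mkdtemp()
target = os.path.join(base, "yogadl_local_cache", "dl_a2_samples", "1.0", "cache.mdb")

# Opening for writing fails while the parent directories do not exist.
try:
    with open(target, "wb"):
        pass
    raised = False
except FileNotFoundError:
    raised = True

# After creating the parent directories, the same open succeeds.
with open_for_download(target) as f:
    f.write(b"")
created = os.path.exists(target)

print(raised, created)
```

If this is the cause here, creating the directory by hand on the second machine before calling `storage.fetch` would be a quick way to check it.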