
[BUG] Cannot create an OBJECT array from memory buffer

Open davidbuniat opened this issue 3 years ago • 3 comments

🐛 Bug Report

⚗️ Current Behavior

Cannot create an OBJECT array from memory buffer. The dataset was created from numpy object arrays, so the tensor's dtype is `object`.

Input Code

> ds = hub.load("hub://activeloop/abalone_full_dataset")
> ds.Sex.numpy()
ValueError: cannot create an OBJECT array from memory buffer
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-29-f439a6f63b90> in <module>
----> 1 ds.Sex.numpy()
~/.local/lib/python3.6/site-packages/hub/core/tensor.py in numpy(self, aslist)
    463         """
    464 
--> 465         return self.chunk_engine.numpy(self.index, aslist=aslist)
    466 
    467     def __str__(self):
~/.local/lib/python3.6/site-packages/hub/core/chunk_engine.py in numpy(self, index, aslist, use_data_cache)
    777                         global_sample_index
    778                     )
--> 779                     sample = chunk.read_sample(local_sample_index)[
    780                         tuple(entry.value for entry in index.values[1:])
    781                     ]
~/.local/lib/python3.6/site-packages/hub/core/chunk/uncompressed_chunk.py in read_sample(self, local_index, cast, copy)
     78             return bytes_to_text(buffer, self.htype)
     79         buffer = bytes(buffer) if copy else buffer
---> 80         return np.frombuffer(buffer, dtype=self.dtype).reshape(shape)
     81 
     82     def update_sample(self, local_index: int, sample: InputSample):
ValueError: cannot create an OBJECT array from memory buffer
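The root cause is reproducible with numpy alone: `np.frombuffer` cannot materialize a `dtype=object` array, because object arrays hold Python object pointers rather than raw values that could live in a byte buffer. A minimal repro, independent of hub:

```python
import numpy as np

buffer = b"\x00" * 8  # any raw bytes

try:
    # object dtype cannot be constructed from a memory buffer
    np.frombuffer(buffer, dtype=object)
except ValueError as e:
    print(e)  # ValueError: cannot create an OBJECT array from memory buffer
```

So any tensor whose stored dtype resolved to `object` will fail at exactly the `read_sample` line in the traceback above.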

⚙️ Environment

  • Python version(s): Python 3.6.9
  • OS: Ubuntu 18.04.6 LTS
  • IDE: VS-Code

🧰 Possible Solution

Either disallow uploading numpy object arrays in the first place, or handle the `object` dtype properly in `numpy()`.

davidbuniat avatar Jan 15 '22 18:01 davidbuniat

I am interested to work on this issue.

rajdeepdas2000 avatar Sep 06 '22 18:09 rajdeepdas2000

Hi @rajdeepdas2000 ! Sure, go for it:)

tatevikh avatar Sep 06 '22 18:09 tatevikh

@davidbuniat I would like to work on this issue. Kindly assign it to me.

h20200051 avatar Sep 12 '22 09:09 h20200051

The abalone_full_dataset is not available in hub (deeplake). What is the way to import a dataset from a CSV file?

rajdeepdas2000 avatar Oct 14 '22 15:10 rajdeepdas2000

@rajdeepdas2000 @h20200051 I really appreciate your willingness to contribute here, but I just checked and this problem has been fixed. I will close it for now; it can be reopened if the problem persists.

davidbuniat avatar Oct 14 '22 16:10 davidbuniat

Alright. But can you tell us how to import datasets into deeplake from our local machine?

rajdeepdas2000 avatar Oct 14 '22 16:10 rajdeepdas2000

You can use a local path such as ./path/to/your/local/dataset when creating or loading a dataset.

import deeplake

# create a new, empty dataset at a local path
ds = deeplake.empty("./path/to/your/local/dataset")
...

and then access it

ds = deeplake.load("./path/to/your/local/dataset")

davidbuniat avatar Oct 14 '22 16:10 davidbuniat
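Tying the CSV question back to the original error: columns read from a CSV with pandas often come back as `dtype=object` (string columns in particular), which is exactly the dtype that triggered the `np.frombuffer` failure above. A minimal sketch of preparing such columns with concrete dtypes before appending them to tensors, using pandas and numpy only (the column names mirror the abalone dataset; the deeplake calls are shown as comments and are assumptions, not executed here):

```python
import io

import numpy as np
import pandas as pd

# Stand-in for reading a local CSV file with pd.read_csv("abalone.csv")
csv_text = "Sex,Length\nM,0.455\nF,0.530\n"
df = pd.read_csv(io.StringIO(csv_text))

# String columns arrive as dtype=object; cast them to a fixed-width
# unicode dtype (or store them in deeplake via an htype="text" tensor)
sex = df["Sex"].to_numpy().astype("U1")        # dtype '<U1', not object
length = df["Length"].to_numpy(dtype=np.float64)

# Hypothetical deeplake usage (not executed in this sketch):
#   ds = deeplake.empty("./abalone_local")
#   ds.create_tensor("Sex", htype="text")
#   ds.create_tensor("Length", dtype=np.float64)
#   then append the prepared column values to each tensor
```

The key point is that everything handed to a tensor has a concrete numpy dtype, so reading it back never hits the `object`-dtype code path.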