deeplake
[FEATURE] Add tensor.pytorch and tensor.tensorflow methods
🚨🚨 Feature Request
- [ ] Related to an existing Issue
- [x] A new implementation (Improvement, Extension)
If your feature will improve HUB
Currently, only Datasets can be used to create PyTorch DataLoaders and TensorFlow Datasets. We need to add functionality to convert individual Hub tensors to a PyTorch/TensorFlow-compatible format.
Difficulty: Medium
Hey @kristinagrig06 @tatevikh, very cool project! Can I collaborate on this issue? I am new to open source, but I have experience in software development (including in a team). I also have experience with PyTorch and TensorFlow.
Hey @MoritzWillmann, welcome to Hub and Hacktoberfest, and thank you for your willingness to contribute! Do you have a proposed solution in mind for this issue?
Hey @dhiganthrao, I checked out some of your code for .numpy() yesterday and I think I'll first implement it similarly. I read in the PyTorch forums, though, that transforming a buffer to torch.Tensor directly can be slow, so I'd benchmark it against a call to torch.from_numpy(np.frombuffer(...)). Do you have any thoughts on this? I haven't looked into TensorFlow yet, but I think it'll be similar.
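For illustration, the comparison described above could be sketched roughly as follows. This is not Hub's code: the sample buffer, its shape and dtype, and the helper names via_numpy, direct and compare are made up for the example. The direct path assumes torch.frombuffer is available (PyTorch 1.10 or later), and the torch imports are deferred so the module loads without PyTorch installed.

```python
import timeit

import numpy as np

# Made-up sample data standing in for a chunk of raw tensor bytes.
# A bytearray is used so the buffer is writable for both code paths.
SHAPE = (64, 64)
RAW = bytearray(np.arange(64 * 64, dtype=np.float32).tobytes())


def via_numpy(raw, shape, dtype=np.float32):
    """Buffer -> NumPy array -> torch.Tensor (shares memory with `raw`)."""
    import torch  # deferred so this module imports without PyTorch

    return torch.from_numpy(np.frombuffer(raw, dtype=dtype).reshape(shape))


def direct(raw, shape):
    """Buffer -> torch.Tensor without NumPy; assumes PyTorch >= 1.10."""
    import torch

    return torch.frombuffer(raw, dtype=torch.float32).reshape(shape)


def compare(repeats=1000):
    """Return the total time in seconds for `repeats` calls of each path."""
    return {
        "via_numpy": timeit.timeit(lambda: via_numpy(RAW, SHAPE), number=repeats),
        "direct": timeit.timeit(lambda: direct(RAW, SHAPE), number=repeats),
    }
```

Calling compare() returns a dict of timings for the two paths. Absolute numbers depend on the machine, the PyTorch version and the buffer size, so it is worth repeating the measurement with realistic chunk sizes.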
I think for now .pytorch() and .tensorflow() can call .numpy() underneath. @AbhinavTuli ?
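A minimal sketch of that suggestion, using a stand-in class rather than Hub's real Tensor: SketchTensor, its constructor arguments and its byte-buffer storage are all hypothetical. The only real library calls are np.frombuffer, torch.from_numpy and tf.convert_to_tensor.

```python
import numpy as np


class SketchTensor:
    """Hypothetical stand-in for a Hub tensor backed by raw bytes."""

    def __init__(self, raw, shape, dtype):
        self._raw = raw
        self._shape = tuple(shape)
        self._dtype = np.dtype(dtype)

    def numpy(self):
        # np.frombuffer over immutable bytes gives a read-only array, so
        # copy it to get a writable array that other frameworks accept
        # without warnings.
        view = np.frombuffer(self._raw, dtype=self._dtype)
        return view.reshape(self._shape).copy()

    def pytorch(self):
        import torch  # imported lazily so PyTorch stays optional

        return torch.from_numpy(self.numpy())

    def tensorflow(self):
        import tensorflow as tf  # imported lazily so TensorFlow stays optional

        return tf.convert_to_tensor(self.numpy())


# Example data: a 2x3 float32 array serialized to bytes.
data = np.arange(6, dtype=np.float32).reshape(2, 3)
tensor = SketchTensor(data.tobytes(), data.shape, data.dtype)
```

With this layout, dtype handling lives in one place (.numpy()), and the framework-specific methods stay one line each. The cost is one extra copy per conversion.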
Hey @farizrahman4u, I was just about to suggest that. I implemented it both ways yesterday and there's no noticeable performance difference. The "no numpy" version needs much more implementation work, though, because of the datatypes. Also, it seems there will soon be a major change in how PyTorch handles storage for different datatypes. It should get easier then...
@MoritzWillmann sounds good.
Hey, I want to solve this issue. Please assign it to me.