Adding hub to the awesome list

Open sparkingdark opened this issue 4 years ago • 0 comments

What is this Python project?

Hub - Fastest unstructured dataset management for TensorFlow/PyTorch by activeloop.ai. Stream & version-control data. Converts large data into a single numpy-like array on the cloud, accessible on any machine.

Describe features.

  • Store and retrieve large datasets with version-control
  • Collaborate as in Google Docs: Multiple data scientists working on the same data in sync with no interruptions
  • Access from multiple machines simultaneously
  • Deploy anywhere - locally, on Google Cloud, S3, Azure, and Activeloop (by default - and for free!)
  • Integrate with your ML tools like Numpy, Dask, Ray, PyTorch, or TensorFlow
  • Create arrays as big as you want. You can store images as big as 100k by 100k!
  • Keep the shape of each sample dynamic, so small and large arrays can be stored together as one array.
  • Visualize any slice of the data in a matter of seconds without redundant manipulations
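
To make the dynamic-shape feature above more concrete, here is a minimal sketch in plain NumPy of how differently shaped samples could sit in one flat array and still be read back with their original shapes. This is not Hub's API or its actual storage layout; the class name `RaggedStore` and its methods are hypothetical and exist only for illustration.

```python
import numpy as np


class RaggedStore:
    """Hypothetical illustration (not Hub's API): samples of different
    shapes kept in one flat buffer, with shapes and offsets stored alongside."""

    def __init__(self, dtype=np.float32):
        self._flat = np.empty(0, dtype=dtype)  # one contiguous buffer for all samples
        self._shapes = []                      # original shape of each sample
        self._offsets = [0]                    # start index of each sample in the buffer

    def append(self, sample):
        arr = np.asarray(sample, dtype=self._flat.dtype)
        # Copying on every append keeps the sketch short; a real system
        # would more likely write fixed-size chunks instead.
        self._flat = np.concatenate([self._flat, arr.ravel()])
        self._shapes.append(arr.shape)
        self._offsets.append(self._offsets[-1] + arr.size)

    def __len__(self):
        return len(self._shapes)

    def __getitem__(self, index):
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError("sample index out of range")
        start, stop = self._offsets[index], self._offsets[index + 1]
        # Slice the flat buffer and restore the sample's original shape.
        return self._flat[start:stop].reshape(self._shapes[index])


store = RaggedStore()
store.append(np.zeros((2, 3)))     # a small sample
store.append(np.ones((5, 5, 3)))   # a larger sample with a different rank
for i in range(len(store)):
    print(i, store[i].shape)
```

The design choice is to keep the data in a single flat array and store only a shape and an offset per sample, which is one common way to present ragged data behind a single array-like interface.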

What's the difference between this Python project and similar ones?

Enumerate comparisons.

It is more oriented toward deep learning and machine learning workflows than similar projects, and it makes handling the data easier.

Anyone who agrees with this pull request could submit an Approve review to it.

sparkingdark, Apr 01 '21 06:04