
Create options for downloads.

Open jieguangzhou opened this issue 1 year ago • 6 comments

We create an S3 DataType with a parameter, pre_download.

The logic when pre_download is True and the data is encoded as a File:

During encoding:

Download the file from S3 and create a FileEncodable that points to the specified download path. When the data is saved, the file/folder is stored in the artifact store.

During decoding:

Retrieve the file/folder from the artifact store.

This is similar to the existing file encodable logic, which also retrieves files from the artifact store.
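The pre-download flow above can be sketched roughly as follows. This is a minimal illustration and not the superduperdb implementation: `LocalArtifactStore`, `encode_predownload`, `decode_predownload`, and `fake_fetch` are hypothetical names, and the download is replaced by a stand-in so the sketch runs offline.

```python
import shutil
import tempfile
from pathlib import Path


class LocalArtifactStore:
    """Hypothetical stand-in for an artifact store backed by a local directory."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, path):
        # Copy the downloaded file into the store and return its identifier.
        target = self.root / Path(path).name
        shutil.copy(path, target)
        return target.name

    def get(self, file_id):
        return self.root / file_id


def encode_predownload(uri, fetch, store):
    """Encoding: download immediately, then keep only an artifact reference."""
    workdir = Path(tempfile.mkdtemp())
    local_path = fetch(uri, workdir)  # an S3 download in the real proposal
    return {"file_id": store.put(local_path)}


def decode_predownload(record, store):
    """Decoding: read from the artifact store; the remote source is not contacted."""
    return store.get(record["file_id"])


def fake_fetch(uri, workdir):
    """Stand-in for a real download (assumption, for illustration only)."""
    path = workdir / uri.rsplit("/", 1)[-1]
    path.write_text(f"contents of {uri}")
    return path


store = LocalArtifactStore(tempfile.mkdtemp())
record = encode_predownload("s3://bucket/report.txt", fake_fetch, store)
restored = decode_predownload(record, store)
```

The point of this variant is that the stored record no longer depends on the remote source being reachable at read time.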

The logic when pre_download is False:

During encoding:

Return the original S3 path.

During decoding:

Download the data from S3. RemoteData calls a download module, which provides the logic for loading remote files/URIs.
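For contrast, the pre_download=False flow might look like the sketch below. The names `encode_lazy`, `decode_lazy`, and `fake_fetch` are hypothetical, and the fetch function is a stand-in for the download module so that nothing touches the network.

```python
import tempfile
from pathlib import Path


def encode_lazy(uri):
    """Encoding: store the original URI unchanged; nothing is downloaded."""
    return {"uri": uri}


def decode_lazy(record, fetch):
    """Decoding: download on demand and return the local path."""
    workdir = Path(tempfile.mkdtemp())
    return fetch(record["uri"], workdir)


calls = []  # records every download so the timing is visible


def fake_fetch(uri, workdir):
    """Stand-in for the download module (assumption, for illustration only)."""
    calls.append(uri)
    path = workdir / "payload.txt"
    path.write_text("payload")
    return path


record = encode_lazy("s3://bucket/data.txt")
calls_after_encode = len(calls)  # still 0: encoding did not download anything
local_path = decode_lazy(record, fake_fetch)
```

Here the trade-off is reversed: inserts are cheap, but every read depends on the remote source still being available.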

RemoteData

class RemoteData(_BaseEncodable):
    type: str  # one of "s3" or "http"
    x: xxxxx

download module

def load_from_s3(url, **kwargs):
    ...

def load_html(): ...

def load_file(): ...
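One way RemoteData could pick a loader from the download module is to dispatch on the URI scheme. The sketch below is only an assumption about how that might be wired up: the `LOADERS` table and the `load` function are hypothetical, the loaders are given a `url` parameter for uniformity, and their bodies return placeholder tuples instead of downloading anything.

```python
from urllib.parse import urlparse


def load_from_s3(url, **kwargs):
    return ("s3", url)  # placeholder for a real S3 download


def load_html(url, **kwargs):
    return ("html", url)  # placeholder for a real HTTP fetch


def load_file(url, **kwargs):
    return ("file", url)  # placeholder for a local file read


# Hypothetical mapping from URI scheme to loader function.
LOADERS = {
    "s3": load_from_s3,
    "http": load_html,
    "https": load_html,
    "file": load_file,
}


def load(url, **kwargs):
    """Select a loader based on the scheme of the URI."""
    scheme = urlparse(url).scheme
    try:
        loader = LOADERS[scheme]
    except KeyError:
        raise ValueError(f"No loader registered for scheme {scheme!r}") from None
    return loader(url, **kwargs)


result = load("s3://bucket/key")
```

A table like this keeps RemoteData itself free of protocol-specific code, so a new source only needs a new loader and one extra entry.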

Example

Pre Download

from superduperdb.components.datatype import HttpPredownload

data = HttpPredownload("https://superduperdb.com/xxx")

# 1. download the data to /tmp/xxx
# 2. Save the data to artifact store
# 3. Create a file encodable.
db['documents'].insert_one({"data": data})

# 1. Load the encodable
# 2. Init the encodable and pull the file from the artifact store
# 3. Return file.x via file.unpack()
db['documents'].find_one()  # This inits the file encodable; the real file is available via file.x (xxxx.html)

No Pre Download

from superduperdb.components.datatype import Http

data = Http("https://superduperdb.com/xxx")


# save the "https://superduperdb.com/xxx" as x
db['documents'].insert_one({"data": data})

# 1. download the data to /tmp/xxx
# 2. return the path /tmp/xxx
db['documents'].find_one() # That will download the HTML to a file and return the path

jieguangzhou, Jun 11 '24 09:06