superduper
Create options for downloads.
We create an S3 DataType with a parameter, pre_download.
The logic when pre_download is True and the data is encodable as a File:
During encoding:
Download the file from S3 and create a FileEncodable with the specified download path. Once the data is saved, the file/folder is stored in the artifact store.
During decoding:
Retrieve the file/folder from the artifact store.
This logic is similar to the file encodable logic that retrieves files from artifacts.
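The pre_download=True flow can be sketched roughly as follows. This is a minimal illustration, not the superduperdb implementation: `LocalArtifactStore`, `encode_predownload`, `decode_predownload`, and the `fetch` callable (standing in for the real S3 download) are all hypothetical names introduced here.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


class LocalArtifactStore:
    """Hypothetical stand-in for an artifact store: copies files into a directory."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, local_path: Path) -> str:
        # Copy the file into the store and return an identifier for later lookup.
        shutil.copy(local_path, self.root / local_path.name)
        return local_path.name

    def get(self, file_id: str) -> Path:
        return self.root / file_id


def encode_predownload(
    uri: str,
    fetch: Callable[[str], bytes],  # assumption: abstracts the actual S3 download
    store: LocalArtifactStore,
    download_dir: Path,
) -> dict:
    # Encoding: download immediately, then hand the local file to the artifact store.
    local_path = download_dir / uri.rstrip("/").split("/")[-1]
    local_path.write_bytes(fetch(uri))
    return {"file_id": store.put(local_path)}


def decode_predownload(encoded: dict, store: LocalArtifactStore) -> Path:
    # Decoding: no network access, the file is looked up in the artifact store.
    return store.get(encoded["file_id"])
```

With this shape, the remote source is only contacted at insert time; reads depend solely on the artifact store.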
The logic when pre_download is False:
During encoding:
Return the original S3 path.
During decoding:
Download the data from S3. RemoteData calls a download module, which provides the logic for loading remote files/URIs.
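The pre_download=False flow defers the download to decode time. A minimal sketch, assuming a hypothetical `fetch` callable in place of the real S3 client; `encode_lazy` and `decode_lazy` are illustrative names, not part of superduperdb:

```python
import tempfile
from pathlib import Path
from typing import Callable


def encode_lazy(uri: str) -> dict:
    # Encoding: nothing is downloaded; only the original URI is stored.
    return {"uri": uri}


def decode_lazy(
    encoded: dict,
    fetch: Callable[[str], bytes],  # assumption: abstracts the actual S3 download
    download_dir: Path,
) -> Path:
    # Decoding: fetch the remote bytes on demand and return the local path.
    uri = encoded["uri"]
    local_path = download_dir / uri.rstrip("/").split("/")[-1]
    local_path.write_bytes(fetch(uri))
    return local_path
```

The trade-off is that inserts stay cheap and the database holds only a reference, but every read depends on the remote source still being reachable.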
RemoteData
class RemoteData(_BaseEncodable):
    type: str  # "s3" or "http"
    x: xxxxx
download module
def load_from_s3(url, **kwargs):
    ...
def load_html(): ...
def load_file(): ...
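One way the download module could pick a loader is to dispatch on the URI scheme. The sketch below is an assumption about a possible design, not the proposed API: `LOADERS`, `register_loader`, and `load` are hypothetical names, and the actual loaders (such as `load_from_s3`) would be registered by the module itself.

```python
from typing import Callable, Dict
from urllib.parse import urlparse

# Hypothetical registry mapping a URI scheme to a loader function.
LOADERS: Dict[str, Callable[..., bytes]] = {}


def register_loader(*schemes: str):
    """Register the decorated function as the loader for the given schemes."""

    def decorator(fn: Callable[..., bytes]) -> Callable[..., bytes]:
        for scheme in schemes:
            LOADERS[scheme] = fn
        return fn

    return decorator


def load(uri: str, **kwargs) -> bytes:
    # A URI without a scheme (a plain path) is treated as a local file.
    scheme = urlparse(uri).scheme or "file"
    try:
        loader = LOADERS[scheme]
    except KeyError:
        raise ValueError(f"No loader registered for scheme {scheme!r}") from None
    return loader(uri, **kwargs)
```

A registry keeps RemoteData unaware of individual backends: supporting a new source means registering one more function.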
Example
Pre Download
from superduperdb.components.datatype import HttpPredownload
data = HttpPredownload("https://superduperdb.com/xxx")
# 1. download the data to /tmp/xxx
# 2. Save the data to the artifact store
# 3. Create a file encodable.
db['documents'].insert_one({"data": data})
# 1. load the encodable
# 2. Init the encodable and pull the file from the artifact store
# 3. return the file.x via file.unpack()
db['documents'].find_one() # This initializes the file encodable; the real file is available via file.x (xxxx.html)
No Pre Download
from superduperdb.components.datatype import Http
data = Http("https://superduperdb.com/xxx")
# save the "https://superduperdb.com/xxx" as x
db['documents'].insert_one({"data": data})
# 1. download the data to /tmp/xxx
# 2. return the path /tmp/xxx
db['documents'].find_one() # This downloads the HTML to a file and returns the path