wfdb-python
New interface for specifying different data sources for read/write
Looking at the current rdrecord, for example, there are two parameters used to specify the location of the record:
- record_name : str
- pn_dir : str
The current package supports reading files locally and from the global database index URL, which defaults to PhysioNet, as specified in download.py.
There are several things that we should aim to support:
- Reading/writing from more types of data sources, such as S3 and GCS.
- Having more than one remote source configured at a time.
One proposal might be to have a new DataSource class, and a global config dictionary with key:value pairs of ds_name (str) : ds (DataSource), i.e.:
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class DataSourceType(Enum):
    LOCAL = 1  # Not sure if this is necessary?
    HTTP = 2
    GCS = 3
    S3 = 4

@dataclass
class DataSource:
    ds_type: DataSourceType
    # Other type-specific params here, e.g. the base URL for HTTP sources
    base_url: Optional[str] = None

_physionet_ds = DataSource(ds_type=DataSourceType.HTTP, base_url="https://physionet.org/content/")
data_sources = {"physionet": _physionet_ds}
And the read/write functions could use these params:
- record_name : str
- data_source : str | DataSource - The key of the data source in the global data sources map, or a DataSource object.
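As a rough sketch of how a read function might resolve the data_source argument, something like the following could work. This is only an illustration of the idea: resolve_data_source, record_location, the base_url field, and the "my_mirror" entry are hypothetical names and values, not anything that exists in wfdb today, and GCS/S3 handling is left out.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional, Union


class DataSourceType(Enum):
    LOCAL = 1
    HTTP = 2
    GCS = 3
    S3 = 4


@dataclass
class DataSource:
    ds_type: DataSourceType
    base_url: Optional[str] = None  # assumed field, only used for HTTP here


# Global registry keyed by data source name, as in the proposal.
data_sources: Dict[str, DataSource] = {
    "physionet": DataSource(
        DataSourceType.HTTP, base_url="https://physionet.org/content/"
    ),
}


def resolve_data_source(data_source: Union[str, DataSource]) -> DataSource:
    """Accept either a registry key or a DataSource object (hypothetical helper)."""
    if isinstance(data_source, DataSource):
        return data_source
    try:
        return data_sources[data_source]
    except KeyError:
        raise ValueError(f"Unknown data source: {data_source!r}") from None


def record_location(record_name: str, data_source: Union[str, DataSource]) -> str:
    """Build the location a reader would fetch from (HTTP and LOCAL only)."""
    ds = resolve_data_source(data_source)
    if ds.ds_type is DataSourceType.HTTP:
        if ds.base_url is None:
            raise ValueError("HTTP data sources need a base_url")
        return ds.base_url.rstrip("/") + "/" + record_name
    if ds.ds_type is DataSourceType.LOCAL:
        return record_name
    raise NotImplementedError(f"{ds.ds_type.name} is not handled in this sketch")


# More than one remote source can be configured at the same time.
# "my_mirror" and its URL are placeholders.
data_sources["my_mirror"] = DataSource(
    DataSourceType.HTTP, base_url="https://example.org/wfdb"
)
```

With this shape, a caller could pass either the key ("physionet") or a DataSource object directly, and an unknown key fails early with a ValueError instead of silently falling back to a default.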
This would be much more explicit. Thoughts?