pfio
IO library to access various filesystems with unified API
PFIO
PFIO is an I/O abstraction library developed by PFN and optimized for deep learning training, with batteries included. It supports:
- Filesystem API abstraction with unified error semantics,
- Explicit user-land caching system,
- IO performance tracing and metrics stats, and
- Fileset container utilities to save metadata.
Dependencies
- HDFS client and libhdfs for HDFS access
- CPython >= 3.8
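The interpreter requirement above can be checked before installing. The following is a minimal sketch using only the standard library; the helper name `meets_requirement` is hypothetical and not part of PFIO.

```python
import platform
import sys

# Minimum version taken from the dependency list above.
MIN_VERSION = (3, 8)


def meets_requirement(version_info=None, implementation=None):
    """Return True if the interpreter is CPython at or above MIN_VERSION.

    Both arguments default to the running interpreter; they are
    parameters only so the check can be exercised with other values.
    """
    if version_info is None:
        version_info = sys.version_info
    if implementation is None:
        implementation = platform.python_implementation()
    return implementation == "CPython" and tuple(version_info[:2]) >= MIN_VERSION


if __name__ == "__main__":
    print("OK" if meets_requirement() else "unsupported interpreter")
```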
Installation and documentation build
Installation
$ git clone https://github.com/pfnet/pfio.git
$ cd pfio
$ pip install .
Documentation
$ cd pfio/docs
$ make html
$ open build/html/index.html
How to use
Please refer to the official documentation for more information about usage.
Release
Check the official documentation for the latest release procedure.
Run tests locally:
$ pip install tox
$ tox
Bump version numbers in pfio/version.py.
Push and open a pull request to invoke CI. Once CI has passed and the pull request has been merged, tag a release:
$ git tag -s X.Y.Z
$ git push --tags
Build:
$ rm -rf dist
$ pip3 install --user build
$ python3 -m build
Release to PyPI (the command below uploads to the testpypi repository):
$ python3 -m pip install --user --upgrade twine
$ python3 -m twine upload --repository testpypi dist/*
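Before tagging, it can help to confirm that the tag matches the version recorded in pfio/version.py. The sketch below is one way to do that with the standard library. It assumes the file defines a line such as `__version__ = "X.Y.Z"`, which the steps above do not confirm, and the helper names are hypothetical.

```python
import re

# Assumption: the version file contains a line like __version__ = "1.2.3".
VERSION_LINE = re.compile(
    r"""^__version__\s*=\s*["']([^"']+)["']""", re.MULTILINE
)
# Release tags in the steps above have the form X.Y.Z.
RELEASE_FORMAT = re.compile(r"\d+\.\d+\.\d+")


def read_version(source: str) -> str:
    """Extract the version string from the text of a version file."""
    match = VERSION_LINE.search(source)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


def is_release_version(version: str) -> bool:
    """Return True if the string has exactly the form X.Y.Z."""
    return RELEASE_FORMAT.fullmatch(version) is not None


def tag_matches(tag: str, source: str) -> bool:
    """Return True if the tag is X.Y.Z and equals the recorded version."""
    return is_release_version(tag) and tag == read_version(source)
```

For example, `tag_matches("X.Y.Z", open("pfio/version.py").read())`, with the intended tag in place of `X.Y.Z`, would return `False` if the version bump was forgotten.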