SVTAS
End to End Streaming Video Temporal Segmentation
Important!
Warning
- The main branch of this repo is under development and has not been fully tested, so expect bugs.
Note
- To reproduce the results from the paper list, check out the svtas-paper branch.
Paper List
- Streaming Video Temporal Action Segmentation In Real Time, status: accepted by ISKE2023
- End-to-End Streaming Video Temporal Action Segmentation with Reinforce Learning, status: under review
Streaming Video Temporal Action Segmentation

Our framework integrates training, inference, and deployment services for streaming video temporal action segmentation. The goal is to provide an AI infrastructure framework for this task.
Installation
See the SVTAS installation guide to install from pip, or build from source.
To install the current release:
python setup.py install .
To update SVTAS to the latest version, add the --upgrade flag to the command above.
Framework Features
| Training | Inference | Serving | |
| Supports |
| Model Zoo | Tutorials | Services Component | |
| Algorithms |
Environment Preparation
- Linux Ubuntu 22.04+
- Python 3.10+
- PyTorch 2.1.0+
- CUDA 12.2+
- Pillow-SIMD (optional): install it with the commands below.
- FFmpeg 4.3.1+ (optional): for extracting optical flow and visualizing video CAM
conda uninstall -y --force pillow pil jpeg libtiff libjpeg-turbo
pip uninstall -y pillow pil jpeg libtiff libjpeg-turbo
conda install -yc conda-forge libjpeg-turbo
CFLAGS="${CFLAGS} -mavx2" pip install --upgrade --no-cache-dir --force-reinstall --no-binary :all: --compile pillow-simd
conda install -y jpeg libtiff
- Use pip to install the environment:
conda create -n torch python=3.10
conda activate torch
python -m pip install --upgrade pip
pip install -r requirements/requirements_base.txt
- If you get an error that the correlation_cuda package is not found, read Install.
- If you want to extract motion vectors and residual images from video, install FFmpeg. For example, on Ubuntu:
sudo apt install ffmpeg
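The version requirements listed in this section can be checked before training with a short script. The following is a minimal sketch and is not part of SVTAS: the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical, and the minimum versions are copied from the list above. Note that `torch.version.cuda` reports the CUDA version PyTorch was built against, which may differ from the system toolkit.

```python
import importlib
import re
import shutil
import sys


def parse_version(text):
    """Turn a version string such as '2.1.0+cu121' into a tuple of ints (2, 1, 0)."""
    numbers = re.findall(r"\d+", text.split("+")[0])
    parts = [int(n) for n in numbers[:3]]
    # Pad short versions ('3.10' -> (3, 10, 0)) so tuple comparison behaves.
    return tuple(parts + [0] * (3 - len(parts)))


def meets_minimum(found, minimum):
    """Return True when version string `found` is at least `minimum`."""
    return parse_version(found) >= parse_version(minimum)


def check_environment():
    """Collect (name, found, ok) rows for the requirements listed in this README."""
    rows = []
    python_version = ".".join(str(p) for p in sys.version_info[:3])
    rows.append(("Python", python_version, meets_minimum(python_version, "3.10")))
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        rows.append(("PyTorch", "not installed", False))
        rows.append(("CUDA", "unknown", False))
    else:
        torch_version = str(torch.__version__)
        rows.append(("PyTorch", torch_version, meets_minimum(torch_version, "2.1.0")))
        cuda = torch.version.cuda  # None for CPU-only builds
        cuda_ok = bool(cuda) and meets_minimum(cuda, "12.2")
        rows.append(("CUDA", cuda or "not available", cuda_ok))
    # FFmpeg is optional, so only report whether it is on PATH.
    ffmpeg = shutil.which("ffmpeg")
    rows.append(("FFmpeg (optional)", ffmpeg or "not found", ffmpeg is not None))
    return rows


if __name__ == "__main__":
    for name, found, ok in check_environment():
        print(f"{name:18} {found:32} {'OK' if ok else 'CHECK'}")
```

Running the script prints one row per requirement; a CHECK in the last column marks a component that is missing or older than the listed minimum.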
Documentation Directory
- Prepare Dataset
- Usage
- Model Zoo
- Tools Usage
- Infer Guideline
- Add Test Case Guideline
Citation
@misc{2209.13808,
Author = {Wujun Wen and Yunheng Li and Zhuben Dong and Lin Feng and Wanxiao Yang and Shenlan Liu},
Title = {Streaming Video Temporal Action Segmentation In Real Time},
Year = {2022},
Eprint = {arXiv:2209.13808},
}
@article{wen2023end,
title={End-to-End Streaming Video Temporal Action Segmentation with Reinforce Learning},
author={Wen, Wujun and Zhang, Jinrong and Liu, Shenglan and Li, Yunheng and Li, Qifeng and Feng, Lin},
journal={arXiv preprint arXiv:2309.15683},
year={2023}
}
Acknowledgement
This repo borrows code from many great open source libraries. Thanks to their authors for their selfless dedication.
License
The entire codebase is under the Apache 2.0 license.