deployml
Scripts for "Deploy ML to production" workshop
Contents
- Pre-Requirements
- Environment SetUp
  - Case 1
  - Case 1.1
  - Case 2
  - Case 3
- Frameworks comparison
- Tensorflow optimization methods
- Training optimization approaches
  - Pruning
  - XNOR nets
  - Knowledge distillation
- Simple servers
- Testing
  - Preprocessing and code testing
- Profiling
- Routines automation
- Converting weights to tensorflow
- Conclusion
Pre-Requirements
- Docker. You may get it here
- Update the docker memory limit if necessary
- Git, Python >= 3.5
Environment SetUp
Case 1
- Clone the workshop repository:
git clone git@github.com:ikhlestov/deployml.git && cd deployml
- Create a virtualenv:
python3.6 -m venv .venv && source .venv/bin/activate
- Install the corresponding requirements:
pip install -r requirements/dev_mac.txt
or
pip install -r requirements/dev_ubuntu_cpu.txt
Note: the requirements are based on python3.6. If you have another python version, you should change the link to the pytorch wheel in the requirements file; you may get the wheel here
Case 1.1
Additionally, clone the tensorflow source code next to the workshop repository:
git clone https://github.com/tensorflow/tensorflow.git -b v1.6.0
Case 2
Pull the small docker container:
docker pull ikhlestov/deployml_dev_small
or pull the large docker container (in case of a really good Internet connection):
docker pull ikhlestov/deployml_dev
Case 3
Build your own docker container:
- Clone the workshop repository:
git clone git@github.com:ikhlestov/deployml.git && cd deployml
- Check the docker containers defined in the dockers folder
- Run the build commands:
docker build -f dockers/Dev . -t ikhlestov/deployml_dev (for the workshop you should build only this image)
docker build -f dockers/Dev_small . -t ikhlestov/deployml_dev_small
docker build -f dockers/Prod . -t ikhlestov/deployml_prod
- Compare their sizes:
docker images | grep "deployml_dev\|deployml_dev_small\|deployml_prod"
Notes:
- Don't forget about the .dockerignore file.
- Try to organize your docker files to make use of the build cache.
- Optimize your docker containers.
- Try to release with smaller base distributions.
- You may use multistage builds.
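A multistage build can be sketched like this (a minimal illustration with assumed paths and base images, not the workshop's actual dockerfiles): heavy build dependencies stay in the first stage, and only the built wheels reach the final slim image.

```dockerfile
# Stage 1: build wheels in a full image (compilers and headers available)
FROM python:3.6 AS builder
COPY requirements/dev_ubuntu_cpu.txt /tmp/requirements.txt
RUN pip wheel --wheel-dir /wheels -r /tmp/requirements.txt

# Stage 2: copy only the prebuilt wheels into a slim runtime image
FROM python:3.6-slim
COPY --from=builder /wheels /wheels
RUN pip install --no-index --find-links /wheels /wheels/*.whl
```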
Frameworks comparison
- Check the models defined in the models folder
- Run the docker container with a mounted directory:
docker run -v $(pwd):/deployml -p 6060:6060 -p 8080:8080 -it ikhlestov/deployml_dev /bin/bash
- Run the time measurements inside docker:
python benchmarks/compare_frameworks.py
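The core of such a timing comparison can be sketched with the standard library (the two callables below are stand-ins for real model inference, not the workshop's models):

```python
import timeit

def mean_inference_time(fn, n_runs=100):
    """Average seconds per call, with one warm-up call excluded."""
    fn()  # warm-up: graph building, JIT, caches should not be measured
    return timeit.timeit(fn, number=n_runs) / n_runs

# stand-in "models": any zero-argument callables can be compared this way
small_model = lambda: sum(range(1_000))
large_model = lambda: sum(range(100_000))

for name, fn in [("small", small_model), ("large", large_model)]:
    print(name, mean_inference_time(fn))
```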
Tensorflow optimization methods
1. Save our tensorflow model:
python optimizers/save_tensorflow_model.py
1.1 Import the saved model to tensorboard:
python misc/import_pb_to_tensorboard.py --model_dir saves/tensorflow/usual_model.pbtxt --log_dir saves/tensorboard/usual_model --graph_type PbTxt
1.2 Run tensorboard in the background:
tensorboard --logdir saves/tensorboard --port 6060 --host=0.0.0.0 &
If you encounter an error such as "ModuleNotFoundError: No module named 'html5lib.filters.base'", please install another version of html5lib:
pip uninstall -y html5lib && pip install html5lib --no-cache
2. Build a frozen graph. You may read more about it here:
python optimizers/get_frozen_graph.py
python misc/import_pb_to_tensorboard.py --model_dir saves/tensorflow/constant_graph.pb --log_dir saves/tensorboard/constant_graph
3. Build an optimized frozen graph:
python optimizers/get_optimized_frozen_graph.py
python misc/import_pb_to_tensorboard.py --model_dir saves/tensorflow/optimized_graph.pb --log_dir saves/tensorboard/optimized_graph
4. Get the quantized graph:
4.1 With plain python (link to script):
python /tensorflow/tensorflow/tools/quantization/quantize_graph.py \
  --input=saves/tensorflow/optimized_graph.pb \
  --output=saves/tensorflow/quantized_graph_python.pb \
  --output_node_names="output" \
  --mode=weights
4.2 With bazel (tensorflow tutorial):
../tensorflow/bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
  --in_graph=`pwd`/saves/tensorflow/optimized_graph.pb \
  --out_graph=`pwd`/saves/tensorflow/quantized_graph_bazel.pb \
  --inputs="input:0" \
  --outputs="output:0" \
  --transforms='quantize_weights'
4.3 Note: tf.contrib.quantize provides only simulated quantization.
4.4 Import the quantized models to tensorboard:
python misc/import_pb_to_tensorboard.py \
  --model_dir saves/tensorflow/quantized_graph_bazel.pb \
  --log_dir saves/tensorboard/quantized_graph_bazel
python misc/import_pb_to_tensorboard.py \
  --model_dir saves/tensorflow/quantized_graph_python.pb \
  --log_dir saves/tensorboard/quantized_graph_python
5. Compare the resulting graphs:
5.1 Sizes:
ls -l saves/tensorflow/
5.2 Architecture in tensorboard
5.3 Performance:
python benchmarks/compare_tf_optimizations.py
6. Try various restrictions:
6.1 CPU restriction:
docker run -v $(pwd):/deployml -it --cpus="1.0" ikhlestov/deployml_dev /bin/bash
6.2 Memory restriction:
docker run -v $(pwd):/deployml -it --memory=1g ikhlestov/deployml_dev /bin/bash
6.3 Use GPUs:
docker run --runtime=nvidia -v $(pwd):/deployml -it ikhlestov/deployml_dev /bin/bash
6.4 Try to run two models on two different CPUs.
6.5 Try to run two models on two CPUs simultaneously.
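The weight quantization used above (the quantize_graph.py --mode=weights script and the quantize_weights transform) boils down to storing each float32 tensor as 8-bit integers plus a scale and offset. A numpy illustration of the idea (not the converters' actual code):

```python
import numpy as np

def quantize_weights(w, bits=8):
    """Map float weights to uint8 plus (min, scale) for dequantization."""
    levels = 2 ** bits - 1
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / levels or 1.0  # guard against constant tensors
    q = np.round((w - w_min) / scale).astype(np.uint8)
    return q, w_min, scale

def dequantize_weights(q, w_min, scale):
    return q.astype(np.float32) * scale + w_min

w = np.random.randn(256, 256).astype(np.float32)
q, w_min, scale = quantize_weights(w)
w_restored = dequantize_weights(q, w_min, scale)

# 4x smaller on disk (uint8 vs float32), at the cost of a small rounding error
assert q.nbytes == w.nbytes // 4
assert np.abs(w - w_restored).max() <= scale
```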
Training optimization approaches
You may also take a look at other methods (list of resources) like:
- Pruning
- XNOR nets
- Knowledge distillation
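Of these, magnitude pruning is the simplest to sketch: the smallest-magnitude weights are zeroed out, and the resulting sparsity can then be exploited by sparse formats or hardware. A toy numpy illustration (not tied to any framework's pruning API):

```python
import numpy as np

def prune_by_magnitude(w, sparsity=0.9):
    """Zero out the smallest-magnitude weights until `sparsity` fraction is zero."""
    k = int(w.size * sparsity)
    # k-th smallest absolute value over the flattened tensor
    threshold = np.sort(np.abs(w), axis=None)[k - 1]
    mask = np.abs(w) > threshold
    return w * mask, mask

w = np.random.randn(100, 100)
pruned, mask = prune_by_magnitude(w, sparsity=0.9)
# ~90% of the weights are now exactly zero
assert (pruned == 0).mean() >= 0.9
```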
Simple servers
- One-to-one server (servers/simple_server.py)
- Scaling with multiprocessing (servers/processes_server.py)
You may start servers (not simultaneously) as:
python servers/simple_server.py
or
python servers/processes_server.py
and test them with:
python servers/tester.py
- Queue based (Kafka, RabbitMQ, etc.)
- Serving with tf-serving
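The pattern behind servers/processes_server.py is a worker pool in front of the model; a minimal sketch (the predict function is a stand-in, not the workshop's model):

```python
from multiprocessing import Pool

def predict(x):
    """Stand-in for model inference: any CPU-bound per-request work."""
    return x * 2

def serve(requests, processes=4):
    """Each worker process handles requests independently,
    so CPU-bound inference scales with the number of cores."""
    with Pool(processes=processes) as pool:
        return pool.map(predict, list(requests))

if __name__ == "__main__":
    print(serve(range(8)))  # [0, 2, 4, 6, 8, 10, 12, 14]
```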
Testing
Preprocessing and code testing
Q: Where should data preprocessing be done? On the CPU, the GPU, or even on another host?
- Enter the preprocessing directory:
cd preprocessing
- Run the various resizer benchmarks:
python benchmark.py
Note: opencv may be installed from PyPi for python3
- Check the unified resizer in image_preproc.py
- Try to run the tests for it (they will fail):
pytest test_preproc.py
- Fix the resizer
- Run the tests again:
pytest test_preproc.py
What else should be tested (really, as much as possible):
- General network inference
- Model loading/saving
- New model deploys
- Any preprocessing
- Corrupted inputs: NaN, Inf, zeros
- Deterministic output
- Input ranges/distributions
- Output ranges/distributions
- Test that the model fails in known cases
- ...
- Just check this video :)
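Several of the items above translate directly into pytest-style checks. A sketch against a stand-in preprocessing function (the function and its contract are assumed for illustration, not the repo's actual image_preproc.py API):

```python
import numpy as np

def preprocess(image):
    """Stand-in preprocessing: scale a uint8 image to [0, 1] floats."""
    return image.astype(np.float32) / 255.0

def test_output_range():
    image = np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8)
    out = preprocess(image)
    assert out.min() >= 0.0 and out.max() <= 1.0

def test_deterministic_output():
    image = np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8)
    assert np.array_equal(preprocess(image), preprocess(image))

def test_corrupted_input():
    # corrupted inputs should be caught, not silently propagated:
    # here we at least verify that NaNs do not disappear unnoticed
    bad = np.full((32, 32, 3), np.nan)
    assert np.isnan(preprocess(bad)).any()
```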
You may run the tests:
- At the various docker containers
- Under the tox
Profiling
- Code
- Tensorflow
- CPU/GPU:
  - nvidia-smi
  - gpustat
  - psutil
  - nvidia profiler
- Lifetime benchmark: airspeed velocity
Routines automation
- Continuous integration:
  - Jenkins
  - Travis
  - TeamCity
  - CircleCI
- Clusters:
  - Kubernetes
  - Mesos
  - Docker swarm
- Configuration management:
  - Terraform
  - Ansible
  - Chef
  - Puppet
  - SaltStack
Converting weights to tensorflow
- Converting from keras to tensorflow:
  - Get the keras saved model:
    python converters/save_keras_model.py
  - Convert the keras model to the tensorflow save format:
    python converters/convert_keras_to_tf.py
- Converting from PyTorch to tensorflow:
  - Through keras - converter
  - Manually
In any case you should know about the internal differences between the frameworks.
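One framework difference worth knowing for manual conversion: PyTorch's nn.Linear stores its weight as (out_features, in_features), while a tensorflow/keras Dense kernel is (in_features, out_features), so the weight matrix has to be transposed. A numpy sketch of the idea:

```python
import numpy as np

# PyTorch nn.Linear(4, 3) stores its weight as (out_features, in_features)
torch_weight = np.random.randn(3, 4).astype(np.float32)

# a tensorflow/keras Dense(3) kernel is (in_features, out_features): transpose
tf_kernel = torch_weight.T

# PyTorch computes y = x @ W.T, tensorflow computes y = x @ kernel;
# after the transpose both produce the same result
x = np.random.randn(1, 4).astype(np.float32)
assert np.allclose(x @ torch_weight.T, x @ tf_kernel)
```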
Conclusion
I'm grateful to Alexandr Onbysh, Aleksandr Obednikov, and Kyryl Truskovskyi for the cool ideas, and to Ring Ukraine overall.
Take a look at the checklist.
Thank you for reading!