finn-examples
Error building Mobilenet-v1
Hi,
I'm trying to build the mobilenet-v1 model. I followed all the steps in the README of the repository. However I get this error:
File "/opt/conda/lib/python3.8/site-packages/sigtools/_signatures.py", line 83, in <module>
@attr.define(eq=False)
AttributeError: module 'attr' has no attribute 'define'
Do you know how to solve it? Thanks for your help
After some research I solved the issue.
The problem is in the get-finn.sh file, in particular in the REPO_COMMIT variable.
The file pins the commit 96c0f5e3678abd7b1eaab2a2b4f8e937ac1f48b8. However, this clones a version of the finn repository that has the following requirements.txt:
bitstring==3.1.7
clize==4.1.1
dataclasses-json==0.5.7
docrep==0.2.7
future==0.18.2
gspread==3.6.0
numpy==1.22.0
onnx==1.11.0
onnxoptimizer
onnxruntime==1.11.1
pre-commit==2.9.2
protobuf==3.20.1
pyscaffold==3.2.1
scipy==1.5.2
setupext-janitor>=1.1.2
toposort==1.5
vcdvcd==1.0.5
wget==3.2
However, one package is missing: sigtools. The correct commit is the latest one in the finn repository at the time of writing. Its SHA is abc500078692f7dec1f67aa7af4dead879eb1513.
In this version, the requirements.txt is the following, and this is the correct one:
bitstring==3.1.7
clize==4.1.1
dataclasses-json==0.5.7
docrep==0.2.7
future==0.18.2
gspread==3.6.0
numpy==1.22.0
onnx==1.11.0
onnxoptimizer
onnxruntime==1.11.1
pre-commit==2.9.2
protobuf==3.20.1
pyscaffold==3.2.1
scipy==1.5.2
setupext-janitor>=1.1.2
sigtools==2.0.3
toposort==1.5
vcdvcd==1.0.5
wget==3.2
PR #41 contains the fix for this problem.
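To make the difference between the two requirements.txt listings above easier to spot, here is a minimal sketch that compares the package names in two requirements files. The helper name `package_names` is hypothetical and not part of FINN; the sample strings are shortened excerpts of the two listings quoted above.

```python
import re

def package_names(requirements_text):
    """Return the set of package names found in a requirements.txt body."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        # Keep only the part before the first version specifier or marker.
        name = re.split(r"[=<>!~;\[\s]", line, maxsplit=1)[0]
        names.add(name.lower())
    return names

# Shortened excerpts of the two listings quoted in this thread.
old_requirements = "scipy==1.5.2\nsetupext-janitor>=1.1.2\ntoposort==1.5\n"
new_requirements = "scipy==1.5.2\nsetupext-janitor>=1.1.2\nsigtools==2.0.3\ntoposort==1.5\n"

missing = sorted(package_names(new_requirements) - package_names(old_requirements))
print("Packages missing from the older file:", missing)
```

Running the same comparison on the full files should point at sigtools as the only difference, which matches what is described above.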
I ran into this exact problem with MobileNet-v1 and tried your REPO_COMMIT fix from PR #41. That got me past the attr error, but unfortunately led to a different problem:
Traceback (most recent call last):
File "/data/verderog-projects/xilinx/finn-examples/build/finn/src/finn/builder/build_dataflow.py", line 166, in build_dataflow_cfg
model = transform_step(model, cfg)
File "/data/verderog-projects/xilinx/finn-examples/build/finn/src/finn/builder/build_dataflow_steps.py", line 426, in step_hls_codegen
model = model.transform(
File "/data/verderog-projects/xilinx/finn-examples/build/finn/deps/qonnx/src/qonnx/core/modelwrapper.py", line 140, in transform
(transformed_model, model_was_changed) = transformation.apply(transformed_model)
File "/data/verderog-projects/xilinx/finn-examples/build/finn/src/finn/transformation/fpgadataflow/prepare_ip.py", line 88, in apply
_codegen_single_node(node, model, self.fpgapart, self.clk)
File "/data/verderog-projects/xilinx/finn-examples/build/finn/src/finn/transformation/fpgadataflow/prepare_ip.py", line 55, in _codegen_single_node
inst.code_generation_ipgen(model, fpgapart, clk)
File "/data/verderog-projects/xilinx/finn-examples/build/finn/src/finn/custom_op/fpgadataflow/hlscustomop.py", line 271, in code_generation_ipgen
self.generate_params(model, path)
File "/data/verderog-projects/xilinx/finn-examples/build/finn/src/finn/custom_op/fpgadataflow/thresholding_batch.py", line 462, in generate_params
self.make_weight_file(thresholds, "hls_header", weight_filename)
File "/data/verderog-projects/xilinx/finn-examples/build/finn/src/finn/custom_op/fpgadataflow/thresholding_batch.py", line 372, in make_weight_file
thresholds_hls_code = numpy_to_hls_code(
File "/data/verderog-projects/xilinx/finn-examples/build/finn/src/finn/util/data_packing.py", line 278, in numpy_to_hls_code
strarr = np.array2string(ndarray, separator=", ", formatter={"all": elem2str})
File "<__array_function__ internals>", line 200, in array2string
File "/opt/conda/lib/python3.8/site-packages/numpy/core/arrayprint.py", line 736, in array2string
return _array2string(a, options, separator, prefix)
File "/opt/conda/lib/python3.8/site-packages/numpy/core/arrayprint.py", line 513, in wrapper
return f(self, *args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/numpy/core/arrayprint.py", line 546, in _array2string
lst = _formatArray(a, format_function, options['linewidth'],
File "/opt/conda/lib/python3.8/site-packages/numpy/core/arrayprint.py", line 889, in _formatArray
return recurser(index=(),
File "/opt/conda/lib/python3.8/site-packages/numpy/core/arrayprint.py", line 880, in recurser
nested = recurser(index + (-1,), next_hanging_indent, next_width)
File "/opt/conda/lib/python3.8/site-packages/numpy/core/arrayprint.py", line 880, in recurser
nested = recurser(index + (-1,), next_hanging_indent, next_width)
File "/opt/conda/lib/python3.8/site-packages/numpy/core/arrayprint.py", line 876, in recurser
nested = recurser(index + (-i,), next_hanging_indent,
File "/opt/conda/lib/python3.8/site-packages/numpy/core/arrayprint.py", line 845, in recurser
word = recurser(index + (-i,), next_hanging_indent, next_width)
File "/opt/conda/lib/python3.8/site-packages/numpy/core/arrayprint.py", line 799, in recurser
return format_function(a[index])
File "/data/verderog-projects/xilinx/finn-examples/build/finn/src/finn/util/data_packing.py", line 268, in elem2str
if type(x) == str or type(x) == np.str_ or type(x) == np.str:
File "/opt/conda/lib/python3.8/site-packages/numpy/__init__.py", line 284, in __getattr__
raise AttributeError("module {!r} has no attribute "
AttributeError: module 'numpy' has no attribute 'str'
> /opt/conda/lib/python3.8/site-packages/numpy/__init__.py(284)__getattr__()
-> raise AttributeError("module {!r} has no attribute "
I checked and my build/finn/requirements.txt looks identical to the one you posted:
bitstring==3.1.7
clize==4.1.1
dataclasses-json==0.5.7
docrep==0.2.7
future==0.18.2
gspread==3.6.0
numpy==1.22.0
onnx==1.11.0
onnxoptimizer
onnxruntime==1.11.1
pre-commit==2.9.2
protobuf==3.20.1
pyscaffold==3.2.1
scipy==1.5.2
setupext-janitor>=1.1.2
sigtools==2.0.3
toposort==1.5
vcdvcd==1.0.5
wget==3.2
Are you running the out-of-the-box example, or have you made any kind of modification? If yes, what have you changed? I ask so that I can try to replicate your setup!
@giop98 Much thanks for the response! This is an out-of-the-box example -- I am just trying to rebuild things from scratch to understand more about how FINN works to eventually use it for a test model I have. The only modification I made was to the get-finn.sh script to match the hash you posted above. My target is a ZCU104 dev board but I'm not doing anything special with the config to target that.
I tried running things from scratch (completely removed the finn-examples repo and started over from square one.) Same results.
I'm running Ubuntu 20.04:
~/projects/xilinx/finn-examples/build/finn$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.4 LTS
Release: 20.04
Codename: focal
I've got the following variables configured before I do anything:
export FINN_XILINX_PATH=/opt/Xilinx
export FINN_XILINX_VERSION=2021.2
export VIVADO_PATH=/opt/Xilinx/Vivado/2021.2
Here is my exact sequence of steps:
- git clone https://github.com/Xilinx/finn-examples.git
- cd finn-examples/build/
- edit get-finn.sh to update REPO_COMMIT to "abc500078692f7dec1f67aa7af4dead879eb1513"
- ./get-finn.sh
  - Note that my docker version: Docker version 20.10.12, build 20.10.12-0ubuntu2~20.04.1
- cd mobilenet-v1/models/
- ./download-model.sh
- cd ../../..
- export FINN_EXAMPLES=$PWD
- cd $FINN_EXAMPLES/build/finn
- ./run-docker.sh build_custom $FINN_EXAMPLES/build/mobilenet-v1
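The manual edit of get-finn.sh in the steps above could also be scripted. This is only a sketch: it assumes the script contains exactly one line of the form `REPO_COMMIT=<sha>`, and both the helper name `set_repo_commit` and the sample script text are hypothetical, not copied from the repository.

```python
import re

def set_repo_commit(script_text, sha):
    """Return script_text with its REPO_COMMIT assignment pointing at sha."""
    pattern = re.compile(r"^(\s*REPO_COMMIT=).*$", re.MULTILINE)
    new_text, count = pattern.subn(lambda m: m.group(1) + sha, script_text)
    if count != 1:
        # Refuse to guess if the assumed layout does not match.
        raise ValueError(f"expected exactly one REPO_COMMIT line, found {count}")
    return new_text

# Hypothetical stand-in for the contents of get-finn.sh.
sample_script = "#!/bin/bash\nREPO_COMMIT=96c0f5e3678abd7b1eaab2a2b4f8e937ac1f48b8\n"
patched = set_repo_commit(sample_script, "abc500078692f7dec1f67aa7af4dead879eb1513")
print(patched)
```

Raising an error when the pattern does not match exactly once keeps the helper from silently leaving the old commit in place if the script layout differs from the assumption.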
The build fails at step 7/13:
Running step: step_mobilenet_streamline [1/13]
Running step: step_mobilenet_lower_convs [2/13]
Running step: step_mobilenet_convert_to_hls_layers_separate_th [3/13]
Running step: step_create_dataflow_partition [4/13]
Running step: step_apply_folding_config [5/13]
Running step: step_generate_estimate_reports [6/13]
Running step: step_hls_codegen [7/13]
Traceback (most recent call last):
The traceback error is identical to the one I copied in my post above.
It appears that it might be related to the numpy version inside the docker container. I can launch an interactive docker container via ./run-docker.sh. If I then launch python3 and import numpy, I see numpy.__version__ report 1.24.0. If I try to reference numpy.str I get the same error as my traceback:
projects/xilinx/finn-examples/build/finn$ python3
Python 3.8.5 (default, Sep 4 2020, 07:30:14)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> numpy.__version__
'1.24.0'
>>> numpy.str
<stdin>:1: FutureWarning: In the future `np.str` will be defined as the corresponding NumPy scalar. (This may have returned Python scalars in past versions.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/opt/conda/lib/python3.8/site-packages/numpy/__init__.py", line 284, in __getattr__
raise AttributeError("module {!r} has no attribute "
AttributeError: module 'numpy' has no attribute 'str'
>>>
Running python3 outside of the container on my dev machine and importing numpy shows that I have 1.21.2 installed. If I reference numpy.str there, I get the following warning about deprecation:
~/projects/xilinx/finn-examples/build/finn$ python3
Python 3.8.10 (default, Sep 15 2021, 10:14:58)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> numpy.__version__
'1.21.2'
>>> numpy.str
<stdin>:1: DeprecationWarning: `np.str` is a deprecated alias for the builtin `str`. To silence this warning, use `str` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.str_` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
<class 'str'>
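For reference, the two sessions above are consistent with the `np.str` alias being deprecated in NumPy 1.20 and removed in 1.24. Below is a minimal sketch of a string check that does not rely on the alias. It is not the actual FINN fix, and both function names are hypothetical. Note that it uses `isinstance`, which also accepts subclasses, whereas the check in the traceback compares exact types.

```python
def is_string_element(x):
    """Return True for Python strings and NumPy string scalars."""
    # numpy.str_ subclasses the built-in str, so one isinstance check
    # covers both without touching the removed np.str alias.
    return isinstance(x, str)

def np_str_alias_available(numpy_version):
    """Return True if this NumPy version still exposes the np.str alias."""
    major, minor = (int(part) for part in numpy_version.split(".")[:2])
    return (major, minor) < (1, 24)

for version in ("1.21.2", "1.24.0"):
    print(version, "np.str available:", np_str_alias_available(version))
```

With the two versions from this thread, the helper reports the alias as available for 1.21.2 and unavailable for 1.24.0, matching the warning and the AttributeError shown above.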
I can progress a little further if I comment out part of line 80 from docker/Dockerfile.finn:
from:
RUN pip install matplotlib==3.3.1 --ignore-installed
to:
RUN pip install matplotlib==3.3.1 # --ignore-installed
After that change, I run into this error:
ERROR: [HLS 200-101] 'config_rtl': Unknown option '-deadlock_detection'.
Looking at the Xilinx docs, "deadlock_detection" is a new option present in 2022.1 tools, but not 2021.2 which is what I am using.
UG1399 2021.2: https://docs.xilinx.com/r/2021.2-English/ug1399-vitis-hls/config_rtl
UG1399 2022.1: https://docs.xilinx.com/r/2022.1-English/ug1399-vitis-hls/config_rtl
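A small pre-flight check along these lines could catch the mismatch before a long build. This is a sketch only: the 2022.1 threshold comes from my reading of the two UG1399 pages above, the function names are hypothetical, and the "2021.2" fallback is just a demonstration default for when FINN_XILINX_VERSION is unset.

```python
import os

def parse_tool_version(version):
    """Turn a version string such as '2021.2' into a comparable tuple."""
    year, release = version.split(".")[:2]
    return int(year), int(release)

def supports_deadlock_detection(version):
    """Assumption: the config_rtl option first appears in the 2022.1 tools."""
    return parse_tool_version(version) >= (2022, 1)

# Fallback value is for demonstration only.
configured = os.environ.get("FINN_XILINX_VERSION", "2021.2")
if supports_deadlock_detection(configured):
    print(f"{configured}: config_rtl -deadlock_detection should be accepted")
else:
    print(f"{configured}: config_rtl -deadlock_detection is likely to be rejected")
```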
Thanks for all the clear explanations. Tomorrow I will try to replicate your setup (note that I have the 2022.1 Xilinx tools) and see what happens. I will let you know and then we can think of a fix!
I decided to take a step back and checked out the v0.0.5 tag of this repo. The build fails with the original "attr" error reported above:
AttributeError: module 'attr' has no attribute 'define'
I was hoping that this at least would build without modification.
SUCCESS! I'm finally able to build finn-examples hash 7123fa53b73fcba0f80be009de2e83e0d48995f0. I have no clue if it works, but I was at least able to get through the build without error.
Here are the customizations I performed:
- Updated build/get-finn.sh with the hash "abc500078692f7dec1f67aa7af4dead879eb1513" @giop98 identified above.
- Updated build/finn/docker/Dockerfile.finn to comment out "--ignore-installed" for matplotlib
- Updated build/mobilenet-v1/build.py so it only referenced the ZCU104 (no need to build for ZCU102 or Alveo U250)
- Installed Vivado 2022.2 and referenced that with my environment variables. Of course, you need the licensing configured in order to complete synthesis.
I'm glad you fixed the issue. I was trying to replicate it, but you beat me to it. So the problem seems to be related to the Xilinx tools version 2021.x? Let me know if you are then able to run on the board.
Update--
I'm able to successfully run the rebuilt MobileNet-v1 model against the ImageNet validation data set and get identical accuracy results to the pre-built version.
Observations:
- The FINNExampleOverlay class produced by the current workflow has some differences from the pre-built version. Namely, the ishape_packed, oshape_packed, and ishape_normal parameters are now referenced via methods instead of properties. This was causing a benchmarking tool I wrote that referenced those to error out vs. the pre-built.
- The io_shape_dict includes some new parameters that are not present in the pre-built model. So, you can't directly use the _imagenet_top5inds_io_shape_dict definition from the finn_examples models.py module.
- My re-built model is running slightly slower than the pre-built version. Using the 50000 ImageNet validation set, it takes around 77 seconds longer to execute the entire data set vs the pre-built (~1.5ms longer per image).
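For anyone hitting the same method-versus-property difference in their own tooling, a small accessor can handle both styles. This is a sketch under assumptions: the two classes are hypothetical stand-ins rather than the real FINNExampleOverlay, the shape values are placeholders, and it assumes the method form can be called without arguments.

```python
def read_shape(overlay, name):
    """Return a shape whether it is exposed as an attribute or as a method."""
    value = getattr(overlay, name)
    # Call it only if it is a method; plain attributes are returned as-is.
    return value() if callable(value) else value

class PrebuiltStyleOverlay:
    """Hypothetical stand-in: shape exposed as a plain attribute."""
    ishape_packed = (1, 2, 3)  # placeholder value, not the real shape

class RebuiltStyleOverlay:
    """Hypothetical stand-in: shape exposed as a method."""
    def ishape_packed(self):
        return (1, 2, 3)  # placeholder value, not the real shape

for overlay in (PrebuiltStyleOverlay(), RebuiltStyleOverlay()):
    print(type(overlay).__name__, read_shape(overlay, "ishape_packed"))
```

The same helper can be used for oshape_packed and ishape_normal, so a benchmarking script does not need separate code paths for the pre-built and re-built drivers.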
Hi @giop98, sorry for the delayed response. Please check out the latest FINN-examples release, which should resolve your issue. However, to build the model for the ZCU104 board, we are forced to move some resources to URAM so that the model fits on the board. This requires initializing the weights at runtime. Unfortunately, we have seen some issues with runtime-writeable weights that essentially lead to an accuracy drop. Until we have resolved the issue, our advice would be to either target a U250 with Pynq v3.0.1 / FINN-examples v0.0.6 (since that particular model does not utilize URAM), or a ZCU104 with Pynq v2.6.1 (i.e. FINN-examples v0.0.5).
Dear @mmrahorovic, thanks for the help. I was able to build mobilenet correctly. However, I'm having some issues in testing the throughput. When I launch the script to test the throughput, it gets stuck indefinitely. I'm still on a ZCU104. Could this be because you are forced to move some resources to URAM? Thanks for the help
EDIT: Seems I solved my issue. I was setting the batchsize to a value that was too big. Now everything seems to work correctly. The maximum value of batch size I reached is 100, when jumping to 1000 the throughput test got stuck.
Dear @mmrahorovic. I tested the model, and I noticed an accuracy drop. On hardware, I obtained an accuracy of nearly 20%. Is that reasonable, or is it too low? You said that the problem with runtime-writeable weights leads to an accuracy drop. What is the order of magnitude of this accuracy degradation? Thanks
Hi @giop98, can you check whether this patch solves your problem: https://github.com/Xilinx/finn-examples/pull/55/commits/11a1cb340612f3ce8e8910cf352517092670ebb2 ? The accuracy drop you're seeing is so large because the weights don't get loaded correctly with the driver.py on main. We're working on the solution, see #55
Hi, @auphelia. Thanks for the patch. I think this patch is intended for people using finn-examples v0.0.6 and Pynq 3.0.1. However, I don't have access to the new version of Pynq, so I cannot test it on the board.
Just a few minutes ago, I was able to execute the model on hardware and obtain the expected accuracy. This is the setup with all the details:
- Finn v0.81
- Pynq v2.7
- finn-examples v0.0.5
To obtain the expected accuracy, I took this script one-to-one and adapted it to work with the FINN driver. The only difference was the removal of the normalize method, since it is already inserted in the ONNX model.
In addition to that, the dataset was prepared with this script, as the README suggests.
Over the next few days I will clean up the script and upload it here on the issue, so other people can use it to test the model on hardware, since I did not find any existing script online.