finn
High mvau_wwidth_max value causes step_hw_ipgen to fail
Prerequisites
dev branch: e188b4c50955105717b223862c4e26e4777852ea
Quick summary
A high mvau_wwidth_max value causes step_hw_ipgen to fail, so only very low values of mvau_wwidth_max are valid for this configuration option.
Details
I have a simple MNIST CNN model, and it is impossible to build it with a high-performance configuration. I confirmed that its resource requirements are satisfied by checking the estimation report.
Below is the detailed stack trace:
Building dataflow accelerator from brevitas_cnn2.onnx
Intermediate outputs will be generated in /tmp/finn_dev_pbk
Final outputs will be generated in output_final
Build log is at output_final/build_dataflow.log
Running step: custom_step_add_pre_proc [1/21]
Running step: custom_step_add_post_proc [2/21]
Running step: step_qonnx_to_finn [3/21]
Running step: step_tidy_up [4/21]
Running step: step_streamline [5/21]
Running step: step_convert_to_hw [6/21]
Running step: step_create_dataflow_partition [7/21]
Running step: step_specialize_layers [8/21]
Running step: step_target_fps_parallelization [9/21]
Running step: step_apply_folding_config [10/21]
Running step: step_minimize_bit_width [11/21]
Running step: step_generate_estimate_reports [12/21]
Running step: step_hw_codegen [13/21]
Running step: step_hw_ipgen [14/21]
Traceback (most recent call last):
File "/home/pbk/git-projects/embedded-social-infra/finn/src/finn/builder/build_dataflow.py", line 158, in build_dataflow_cfg
model = transform_step(model, cfg)
File "/home/pbk/git-projects/embedded-social-infra/finn/src/finn/builder/build_dataflow_steps.py", line 573, in step_set_fifo_depths
model = model.transform(
File "/home/pbk/git-projects/embedded-social-infra/finn/deps/qonnx/src/qonnx/core/modelwrapper.py", line 140, in transform
(transformed_model, model_was_changed) = transformation.apply(transformed_model)
File "/home/pbk/git-projects/embedded-social-infra/finn/src/finn/transformation/fpgadataflow/set_fifo_depths.py", line 301, in apply
model = model.transform(InsertFIFO(create_shallow_fifos=True))
File "/home/pbk/git-projects/embedded-social-infra/finn/deps/qonnx/src/qonnx/core/modelwrapper.py", line 140, in transform
(transformed_model, model_was_changed) = transformation.apply(transformed_model)
File "/home/pbk/git-projects/embedded-social-infra/finn/src/finn/transformation/fpgadataflow/insert_fifo.py", line 115, in apply
fld_shape = n0.get_folded_output_shape()
File "/home/pbk/git-projects/embedded-social-infra/finn/src/finn/custom_op/fpgadataflow/streamingdatawidthconverter.py", line 124, in get_folded_output_shape
dummy_t = dummy_t.reshape(new_shape)
ValueError: cannot reshape array of size 784 into shape (1,7,7,0,28)
Running step: step_set_fifo_depths [15/21]
> /home/pbk/git-projects/embedded-social-infra/finn/src/finn/custom_op/fpgadataflow/streamingdatawidthconverter.py(124)get_folded_output_shape()
122 new_shape.append(int(ochannels // oelems))
123 new_shape.append(oelems)
--> 124 dummy_t = dummy_t.reshape(new_shape)
125
126 return dummy_t.shape
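The failing reshape can be reproduced in isolation. The sketch below is a hypothetical simplification of the folded-shape computation (not FINN's exact code): the channel dimension is split into (channels // elems, elems), so if the chosen stream element count exceeds the channel count, integer division yields a 0 dimension and NumPy rejects the reshape — matching the error above, since 1*7*7*16 = 784 and 16 // 28 = 0.

```python
import numpy as np

# Hypothetical simplification of get_folded_output_shape: split the
# channel dim into (channels // elems, elems). If elems > channels,
# the integer division produces a 0 dimension.
def folded_shape(shape, elems):
    *outer, channels = shape
    return outer + [channels // elems, elems]

dummy = np.zeros((1, 7, 7, 16))           # 784 elements, 16 channels
print(folded_shape([1, 7, 7, 16], 8))     # fine: [1, 7, 7, 2, 8]
bad = folded_shape([1, 7, 7, 16], 28)     # elems=28 > 16 channels -> 0 dim
print(bad)                                # [1, 7, 7, 0, 28]
try:
    dummy.reshape(bad)
except ValueError as e:
    print(e)  # cannot reshape array of size 784 into shape (1,7,7,0,28)
```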
Steps to Reproduce
import os
import shutil

import torch
from qonnx.core.datatype import DataType
from qonnx.core.modelwrapper import ModelWrapper
from qonnx.transformation.insert_topk import InsertTopK
from qonnx.transformation.merge_onnx_models import MergeONNXModels

import finn.builder.build_dataflow as build
import finn.builder.build_dataflow_config as build_cfg
from finn.util.pytorch import ToTensor

# note: `bo` was not imported in the original snippet; depending on the
# Brevitas version this is `import brevitas.onnx as bo` or
# `from brevitas.export import export_qonnx`
import brevitas.onnx as bo


def custom_step_add_pre_proc(model: ModelWrapper, cfg: build.DataflowBuildConfig):
    ishape = model.get_tensor_shape(model.graph.input[0].name)
    # preprocessing: torchvision's ToTensor divides uint8 inputs by 255
    preproc = ToTensor()
    bo.export_qonnx(preproc, torch.randn(ishape), "preproc.onnx", opset_version=12)
    preproc_model = ModelWrapper("preproc.onnx")
    # set input finn datatype to UINT8
    preproc_model.set_tensor_datatype(preproc_model.graph.input[0].name, DataType["UINT8"])
    # merge pre-processing onnx model with cnv model (passed as input argument)
    model = model.transform(MergeONNXModels(preproc_model))
    return model


def custom_step_add_post_proc(model: ModelWrapper, cfg: build.DataflowBuildConfig):
    model = model.transform(InsertTopK(k=1))
    return model


model_file = "brevitas_cnn2.onnx"
final_output_dir = "output_final"

# delete previous run results if they exist
if os.path.exists(final_output_dir):
    shutil.rmtree(final_output_dir)
    print("Previous run results deleted!")

cfg = build.DataflowBuildConfig(
    output_dir=final_output_dir,
    mvau_wwidth_max=10000,
    target_fps=1000000,
    synth_clk_period_ns=10.0,
    board="Pynq-Z1",
    shell_flow_type=build_cfg.ShellFlowType.VIVADO_ZYNQ,
    steps=[custom_step_add_pre_proc, custom_step_add_post_proc]
    + build_cfg.default_build_dataflow_steps,
    generate_outputs=[
        build_cfg.DataflowOutputType.BITFILE,
        build_cfg.DataflowOutputType.PYNQ_DRIVER,
        build_cfg.DataflowOutputType.DEPLOYMENT_PACKAGE,
    ],
)
Finally, run this inside the FINN Jupyter environment:
%%time
build.build_dataflow_cfg(model_file, cfg)
Expected behavior
I confirmed that the resource requirements are satisfied, so the build should not fail at this step, or at least a more detailed error should be raised.
Actual behavior
StreamingDataWidthConverter's get_folded_output_shape fails with ValueError: cannot reshape array of size 784 into shape (1,7,7,0,28).
Possible fix
If I set mvau_wwidth_max to a fairly low value (like 24), the build works without error.
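As I understand it (an assumption, not FINN's exact code), mvau_wwidth_max caps the MVAU weight stream width, roughly PE * SIMD * weight bitwidth, during automatic parallelization. A tiny cap like 24 keeps PE/SIMD small enough that neighbouring layers stay compatible, while a huge cap lets them grow past what the datawidth converters can handle:

```python
# Sketch of the assumed constraint: the auto-folder can only raise SIMD
# while PE * SIMD * weight_bits stays at or below mvau_wwidth_max.
def max_simd(pe, weight_bits, mvau_wwidth_max):
    return mvau_wwidth_max // (pe * weight_bits)

print(max_simd(pe=4, weight_bits=4, mvau_wwidth_max=24))     # -> 1
print(max_simd(pe=4, weight_bits=4, mvau_wwidth_max=10000))  # -> 625
```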
Additional context
ONNX model trained on the MNIST dataset: brevitas_cnn2.zip
Hi, generally, errors like this can occur because the current automatic folding transformation is not perfect and might produce an illegal configuration where the SIMD and PE settings between layers do not meet all requirements (e.g. the PE of one layer must be datawidth-convertible to the SIMD of the next layer).
You could try a manual folding config, or dig deeper into how the shape (1,7,7,0,28) comes about. Especially the 0 dimension is very odd.
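For reference, a manual folding config is a JSON file passed via DataflowBuildConfig's folding_config_file option. The sketch below uses hypothetical node names ("MVAU_hls_0" etc.); the real names and attributes must match your intermediate model, and the final_hw_config.json written by a previous run is a good starting template:

```python
import json

# Hypothetical folding config sketch: a "Defaults" section plus one
# entry per node, keyed by the node names in the intermediate model.
folding = {
    "Defaults": {},
    "MVAU_hls_0": {"PE": 2, "SIMD": 9, "ram_style": "auto"},
    "MVAU_hls_1": {"PE": 4, "SIMD": 2, "ram_style": "auto"},
}
with open("folding_config.json", "w") as f:
    json.dump(folding, f, indent=2)

# then: cfg = build.DataflowBuildConfig(...,
#           folding_config_file="folding_config.json")
```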
I have the same issue with the automatic (basic) folding. I use the notebook with a custom ONNX file: a CNV network generated by Brevitas. I am sure it does not exceed the Pynq-Z2 HW resources.
I wrote this message just to inform you, and I'm following the issue.
As you recommend, I'll try it with a custom folding configuration.
Edit: I have solved the issue by using a custom folding configuration. You should check the HW resource usage from the estimation step and the folding constraints in the documentation before compiling the stitched IP. @pbk20191
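To make that resource check concrete, here is a small sketch for comparing the estimation report against board limits. It assumes the usual build_dataflow output layout, where report/estimate_layer_resources.json carries a "total" entry; the limit figures are approximate Zynq-7020 (Pynq-Z2) numbers:

```python
import json

PYNQ_Z2_LIMITS = {"LUT": 53200, "BRAM_18K": 280}  # approx. Zynq-7020 figures

def over_budget(report, limits=PYNQ_Z2_LIMITS):
    """Return the resources whose estimated total exceeds the board limit."""
    total = report.get("total", {})
    return {r: total.get(r, 0) for r, lim in limits.items() if total.get(r, 0) > lim}

# usage with a real run:
#   report = json.load(open("output_final/report/estimate_layer_resources.json"))
example = {"total": {"LUT": 61000, "BRAM_18K": 120}}
print(over_budget(example))  # {'LUT': 61000}
```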
Closing this issue for now, please feel free to re-open if your problem isn't solved.