
Dataflow partition fails

Open jacopoabramo opened this issue 3 years ago • 9 comments

Greetings everyone,

I'm trying to port a Brevitas model for Iris classification to an FPGA using FINN. The model I'm using is shown here.

I'm using a custom notebook based on the tfc_end2end_example in the repository. I'm able to come up with a reasonable model up to the HLS conversion step. This is the model I have so far:

[Netron screenshot: Iris_Classification_hls_layers.onnx]

When performing the dataflow partition step though, FINN throws the following exception:

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-8-8e79fe16bf71> in <module>
      2 
      3 model = ModelWrapper(build_dir + model_name + "_hls_layers.onnx")
----> 4 parent_model = model.transform(CreateDataflowPartition())
      5 parent_model.save(build_dir + model_name + "_dataflow_parent.onnx")
      6 showInNetron(build_dir + model_name + "_dataflow_parent.onnx")

/workspace/finn-base/src/finn/core/modelwrapper.py in transform(self, transformation, make_deepcopy, cleanup, fix_float64)
    139         model_was_changed = True
    140         while model_was_changed:
--> 141             (transformed_model, model_was_changed) = transformation.apply(
    142                 transformed_model
    143             )

/workspace/finn/src/finn/transformation/fpgadataflow/create_dataflow_partition.py in apply(self, model)
     76 
     77         # first, use the generic partitioning functionality to split up the graph
---> 78         parent_model = model.transform(
     79             PartitionFromLambda(
     80                 partitioning=assign_partition_id, partition_dir=self.partition_model_dir

/workspace/finn-base/src/finn/core/modelwrapper.py in transform(self, transformation, make_deepcopy, cleanup, fix_float64)
    139         model_was_changed = True
    140         while model_was_changed:
--> 141             (transformed_model, model_was_changed) = transformation.apply(
    142                 transformed_model
    143             )

/workspace/finn-base/src/finn/transformation/create_generic_partitions.py in apply(self, model)
    122                 for node in to_check:
    123                     if node is not None:
--> 124                         assert (
    125                             self.partitioning(node) != partition_id
    126                         ), """cycle-free graph violated: partition depends on itself"""

AssertionError: cycle-free graph violated: partition depends on itself

I don't exactly understand what this means, so any input is appreciated. In case you need more information, I'll happily provide it. Thank you.

My setup is:

  • OS: Ubuntu 20.04.4 LTS (running on Windows 10 with WSL2)
  • Python: 3.8.10
  • FINN branch: main

jacopoabramo avatar Mar 25 '22 10:03 jacopoabramo

Hi, dataflow partitioning fails because not all nodes were converted to their respective HLS Custom Op in the prior steps. Specifically, the MatMul nodes should have been converted to StreamingFCLayer_Batch nodes.
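To make the assertion concrete, here is a toy, plain-Python illustration (not FINN code; the helper `partition_is_cycle_free` and the node names are hypothetical). In a linear graph, an unconverted node sitting between two HLS nodes stays outside the partition, so the partition's output flows through it and back into the partition, which is the "partition depends on itself" cycle the transformation rejects:

```python
def partition_is_cycle_free(nodes, partition_of, pid):
    """Check that partition `pid` occupies a contiguous run of this linear
    graph; a gap means the partition feeds a foreign node whose output
    feeds back into the partition, i.e. a cycle between partitions."""
    members = [i for i, n in enumerate(nodes) if partition_of[n] == pid]
    first, last = members[0], members[-1]
    return all(partition_of[nodes[i]] == pid for i in range(first, last + 1))

# MatMul_0 was not converted to an HLS op, so it stays outside every partition (-1):
nodes = ["StreamingFCLayer_0", "MatMul_0", "StreamingFCLayer_1"]
broken = {"StreamingFCLayer_0": 0, "MatMul_0": -1, "StreamingFCLayer_1": 0}
print(partition_is_cycle_free(nodes, broken, 0))   # False -> AssertionError in FINN

# After the MatMul is also converted to an HLS op, partition 0 is contiguous:
nodes_fixed = ["StreamingFCLayer_0", "StreamingFCLayer_Batch_1", "StreamingFCLayer_1"]
fixed = {n: 0 for n in nodes_fixed}
print(partition_is_cycle_free(nodes_fixed, fixed, 0))  # True -> partitioning succeeds
```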

Do you apply the InferQuantizedStreamingFCLayer() transformation as part of the HLS conversion?

fpjentzsch avatar Apr 07 '22 11:04 fpjentzsch

Hi @fpjentzsch , thank you for your reply. Yes, adding the transformation to the HLS conversion step did the trick. I'll close the issue.

jacopoabramo avatar Apr 11 '22 16:04 jacopoabramo

Greetings,

I'm reopening this issue for the following reason: I changed the FINN version (I checked out the "dev" branch, commit hash 281af25500cfbfb8e7abd23c956e4599851fdda8) and tried re-running my notebook to regenerate the IP core for my model. I noticed that some of the functionality was moved to the qonnx module, but apart from that everything was the same. My problem is in the pre-HLS conversion step. The model I have is the following:

[Netron screenshot: iris_onnx_streamline.onnx]

Afterwards I apply the following transformations:

from qonnx.transformation.bipolar_to_xnor import ConvertBipolarMatMulToXnorPopcount
from finn.transformation.streamline.round_thresholds import RoundAndClipThresholds
from qonnx.transformation.infer_data_layouts import InferDataLayouts
from qonnx.transformation.general import RemoveUnusedTensors

chkpt_name = build_dir + model_name + pre_hls + ".onnx"
model = model.transform(ConvertBipolarMatMulToXnorPopcount())
model = model.transform(absorb.AbsorbAddIntoMultiThreshold())
model = model.transform(absorb.AbsorbMulIntoMultiThreshold())

# absorb final add-mul nodes into TopK
model = model.transform(absorb.AbsorbScalarMulAddIntoTopK())
model = model.transform(RoundAndClipThresholds())

# bit of tidy-up
model = model.transform(InferDataLayouts())
model = model.transform(RemoveUnusedTensors())

model.save(chkpt_name)
showInNetron(chkpt_name)

This produces the following model:

[Netron screenshot: iris_onnx_pre_hls.onnx]

Then I perform the HLS conversion as follows:

import finn.transformation.fpgadataflow.convert_to_hls_layers as to_hls

chkpt_name = build_dir + model_name + pre_hls + ".onnx"
model = ModelWrapper(chkpt_name)
model = model.transform(to_hls.InferBinaryMatrixVectorActivation(mem_mode=mem_mode))
# TopK to LabelSelect
model = model.transform(to_hls.InferLabelSelectLayer())
# input quantization (if any) to standalone thresholding
model = model.transform(to_hls.InferThresholdingLayer())

chkpt_name = build_dir + model_name + hls_layers + ".onnx"

model.save(chkpt_name)
showInNetron(chkpt_name)

[Netron screenshot: iris_onnx_hls_layers.onnx]

At this point, though, I should no longer be seeing the MatMul layers, as these should have been collapsed into the MultiThreshold layers. I'm not exactly sure I understand what's going on. At any rate, I then try to create a dataflow partition as follows:

from finn.transformation.fpgadataflow.create_dataflow_partition import CreateDataflowPartition

chkpt_name = build_dir + model_name + hls_layers + ".onnx"
model = ModelWrapper(chkpt_name)
parent_model = model.transform(CreateDataflowPartition())

chkpt_name = build_dir + model_name + dataflow_part + ".onnx"
parent_model.save(chkpt_name)
showInNetron(chkpt_name)

Which then fails:

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-9-9b65553d6b9e> in <module>
      3 chkpt_name = build_dir + model_name + hls_layers + ".onnx"
      4 model = ModelWrapper(chkpt_name)
----> 5 parent_model = model.transform(CreateDataflowPartition())
      6 
      7 chkpt_name = build_dir + model_name + dataflow_part + ".onnx"

/home/jacopo/git/finn/deps/qonnx/src/qonnx/core/modelwrapper.py in transform(self, transformation, make_deepcopy, cleanup)
    138         model_was_changed = True
    139         while model_was_changed:
--> 140             (transformed_model, model_was_changed) = transformation.apply(transformed_model)
    141         if cleanup:
    142             transformed_model.cleanup()

/home/jacopo/git/finn/src/finn/transformation/fpgadataflow/create_dataflow_partition.py in apply(self, model)
     78 
     79         # first, use the generic partitioning functionality to split up the graph
---> 80         parent_model = model.transform(
     81             PartitionFromLambda(
     82                 partitioning=assign_partition_id, partition_dir=self.partition_model_dir

/home/jacopo/git/finn/deps/qonnx/src/qonnx/core/modelwrapper.py in transform(self, transformation, make_deepcopy, cleanup)
    138         model_was_changed = True
    139         while model_was_changed:
--> 140             (transformed_model, model_was_changed) = transformation.apply(transformed_model)
    141         if cleanup:
    142             transformed_model.cleanup()

/home/jacopo/git/finn/deps/qonnx/src/qonnx/transformation/create_generic_partitions.py in apply(self, model)
    116                 for node in to_check:
    117                     if node is not None:
--> 118                         assert (
    119                             self.partitioning(node) != partition_id
    120                         ), """cycle-free graph violated: partition depends on itself"""

AssertionError: cycle-free graph violated: partition depends on itself

jacopoabramo avatar Jul 11 '22 11:07 jacopoabramo

Hi @jacopoabramo , I see that you are using the InferBinaryMatrixVectorActivation() function. Is your network binary? I had a brief look at the model you sent at the beginning of the conversation and it looked to me like you're using 4-bit quantization. Is that the same model you're still using? If so, you can try replacing that function with InferQuantizedMatrixVectorActivation(). You can now also use the current main branch to run your model.
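For reference, a minimal sketch of the suggested change to the HLS conversion cell (this assumes a FINN dev environment and that the rest of the notebook cell stays as posted above; `mem_mode` is the same variable already used there):

```python
import finn.transformation.fpgadataflow.convert_to_hls_layers as to_hls

# InferBinaryMatrixVectorActivation only matches XNOR-popcount (1-bit) MatMuls;
# for a multi-bit (e.g. 4-bit) network, use the quantized variant instead:
model = model.transform(to_hls.InferQuantizedMatrixVectorActivation(mem_mode=mem_mode))
model = model.transform(to_hls.InferLabelSelectLayer())
model = model.transform(to_hls.InferThresholdingLayer())
```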

auphelia avatar Jul 20 '22 16:07 auphelia

Hi @auphelia , thanks for the reply. That did the trick. Now I'm having an issue because apparently Vitis HLS is not installed (which makes sense, since I only installed Vivado). I'll keep the issue open in case I run into more problems.

jacopoabramo avatar Jul 21 '22 10:07 jacopoabramo

Hello again,

I now get the following output when trying to build the model with Vitis HLS. The build code is shown below (the target is a PYNQ-Z2):

from finn.util.basic import pynq_part_map
from finn.transformation.fpgadataflow.make_zynq_proj import ZynqBuild

# set the correct part map
fpga_part = pynq_part_map[target]
target_clk_ns = 10

chkpt_name = build_dir + model_name + folding + ".onnx"
model = ModelWrapper(chkpt_name)
model = model.transform(ZynqBuild(platform = target, period_ns = target_clk_ns))

chkpt_name = build_dir + model_name + post_synth + ".onnx"
model.save(chkpt_name)
showInNetron(chkpt_name)

This is what I get as traceback:

---------------------------------------------------------------------------
RemoteTraceback                           Traceback (most recent call last)
RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/opt/conda/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/home/jacopo/git/finn/src/finn/transformation/fpgadataflow/hlssynth_ip.py", line 69, in applyNodeLocal
    inst.ipgen_singlenode_code()
  File "/home/jacopo/git/finn/src/finn/custom_op/fpgadataflow/hlscustomop.py", line 339, in ipgen_singlenode_code
    assert os.path.isdir(
AssertionError: IPGen failed: /tmp/finn_dev_jacopo/code_gen_ipgen_StreamingDataflowPartition_0_IODMA_0_sryta8e_/project_StreamingDataflowPartition_0_IODMA_0/sol1/impl/ip not found. Check log under /tmp/finn_dev_jacopo/code_gen_ipgen_StreamingDataflowPartition_0_IODMA_0_sryta8e_
"""

The above exception was the direct cause of the following exception:

AssertionError                            Traceback (most recent call last)
<ipython-input-12-bfbbda2f9c4e> in <module>
      8 chkpt_name = build_dir + model_name + folding + ".onnx"
      9 model = ModelWrapper(chkpt_name)
---> 10 model = model.transform(ZynqBuild(platform = target, period_ns = target_clk_ns))
     11 
     12 chkpt_name = build_dir + model_name + post_synth + ".onnx"

/home/jacopo/git/finn/deps/qonnx/src/qonnx/core/modelwrapper.py in transform(self, transformation, make_deepcopy, cleanup)
    138         model_was_changed = True
    139         while model_was_changed:
--> 140             (transformed_model, model_was_changed) = transformation.apply(transformed_model)
    141         if cleanup:
    142             transformed_model.cleanup()

/home/jacopo/git/finn/src/finn/transformation/fpgadataflow/make_zynq_proj.py in apply(self, model)
    353                 PrepareIP(self.fpga_part, self.period_ns)
    354             )
--> 355             kernel_model = kernel_model.transform(HLSSynthIP())
    356             kernel_model = kernel_model.transform(
    357                 CreateStitchedIP(

/home/jacopo/git/finn/deps/qonnx/src/qonnx/core/modelwrapper.py in transform(self, transformation, make_deepcopy, cleanup)
    138         model_was_changed = True
    139         while model_was_changed:
--> 140             (transformed_model, model_was_changed) = transformation.apply(transformed_model)
    141         if cleanup:
    142             transformed_model.cleanup()

/home/jacopo/git/finn/deps/qonnx/src/qonnx/transformation/base.py in apply(self, model)
    103         # Execute transformation in parallel
    104         with mp.Pool(self._num_workers) as p:
--> 105             new_nodes_and_bool = p.map(self.applyNodeLocal, old_nodes, chunksize=1)
    106 
    107         # extract nodes and check if the transformation needs to run again

/opt/conda/lib/python3.8/multiprocessing/pool.py in map(self, func, iterable, chunksize)
    362         in a list that is returned.
    363         '''
--> 364         return self._map_async(func, iterable, mapstar, chunksize).get()
    365 
    366     def starmap(self, func, iterable, chunksize=None):

/opt/conda/lib/python3.8/multiprocessing/pool.py in get(self, timeout)
    769             return self._value
    770         else:
--> 771             raise self._value
    772 
    773     def _set(self, i, obj):

AssertionError: IPGen failed: /tmp/finn_dev_jacopo/code_gen_ipgen_StreamingDataflowPartition_0_IODMA_0_sryta8e_/project_StreamingDataflowPartition_0_IODMA_0/sol1/impl/ip not found. Check log under /tmp/finn_dev_jacopo/code_gen_ipgen_StreamingDataflowPartition_0_IODMA_0_sryta8e_

This is the log content:

****** Vitis HLS - High-Level Synthesis from C, C++ and OpenCL v2022.1 (64-bit)
  **** SW Build 3526262 on Mon Apr 18 15:47:01 MDT 2022
  **** IP Build 3524634 on Mon Apr 18 20:55:01 MDT 2022
    ** Copyright 1986-2022 Xilinx, Inc. All Rights Reserved.

source /tools/Xilinx/Vitis_HLS/2022.1/scripts/vitis_hls/hls.tcl -notrace
INFO: [HLS 200-10] Running '/tools/Xilinx/Vitis_HLS/2022.1/bin/unwrapped/lnx64.o/vitis_hls'
/tools/Xilinx/Vitis_HLS/2022.1/tps/tcl/tcl8.5/tzdata/Europe/Dublin can't be opened.
INFO: [HLS 200-10] For user 'jacopo' on host 'finn_dev_jacopo' (Linux_x86_64 version 5.10.16.3-microsoft-standard-WSL2) on Thu Jul 21 13:35:22 +0000 2022
INFO: [HLS 200-10] On os Ubuntu 18.04.5 LTS
INFO: [HLS 200-10] In directory '/tmp/finn_dev_jacopo/code_gen_ipgen_StreamingDataflowPartition_0_IODMA_0_holym8ev'
Sourcing Tcl script '/tmp/finn_dev_jacopo/code_gen_ipgen_StreamingDataflowPartition_0_IODMA_0_holym8ev/hls_syn_StreamingDataflowPartition_0_IODMA_0.tcl'
INFO: [HLS 200-1510] Running: source /tmp/finn_dev_jacopo/code_gen_ipgen_StreamingDataflowPartition_0_IODMA_0_holym8ev/hls_syn_StreamingDataflowPartition_0_IODMA_0.tcl
HLS project: project_StreamingDataflowPartition_0_IODMA_0
HW source dir: /tmp/finn_dev_jacopo/code_gen_ipgen_StreamingDataflowPartition_0_IODMA_0_holym8ev
finn-hlslib dir: /home/jacopo/git/finn/deps/finn-hlslib
custom HLS dir: /home/jacopo/git/finn/custom_hls
INFO: [HLS 200-1510] Running: open_project project_StreamingDataflowPartition_0_IODMA_0
INFO: [HLS 200-10] Creating and opening project '/tmp/finn_dev_jacopo/code_gen_ipgen_StreamingDataflowPartition_0_IODMA_0_holym8ev/project_StreamingDataflowPartition_0_IODMA_0'.
INFO: [HLS 200-1510] Running: add_files /tmp/finn_dev_jacopo/code_gen_ipgen_StreamingDataflowPartition_0_IODMA_0_holym8ev/top_StreamingDataflowPartition_0_IODMA_0.cpp -cflags -std=c++14 -I/home/jacopo/git/finn/deps/finn-hlslib -I/home/jacopo/git/finn/custom_hls
INFO: [HLS 200-10] Adding design file '/tmp/finn_dev_jacopo/code_gen_ipgen_StreamingDataflowPartition_0_IODMA_0_holym8ev/top_StreamingDataflowPartition_0_IODMA_0.cpp' to the project
INFO: [HLS 200-1510] Running: set_top StreamingDataflowPartition_0_IODMA_0
INFO: [HLS 200-1510] Running: open_solution sol1
INFO: [HLS 200-10] Creating and opening solution '/tmp/finn_dev_jacopo/code_gen_ipgen_StreamingDataflowPartition_0_IODMA_0_holym8ev/project_StreamingDataflowPartition_0_IODMA_0/sol1'.
INFO: [HLS 200-1505] Using default flow_target 'vivado'
Resolution: For help on HLS 200-1505 see www.xilinx.com/cgi-bin/docs/rdoc?v=2022.1;t=hls+guidance;d=200-1505.html
INFO: [HLS 200-435] Setting 'open_solution -flow_target vivado' configuration: config_interface -m_axi_latency=0
INFO: [HLS 200-1510] Running: set_part xc7z020clg400-1
INFO: [HLS 200-1611] Setting target device to 'xc7z020-clg400-1'
INFO: [HLS 200-1510] Running: config_compile -disable_unroll_code_size_check -pipeline_style flp
WARNING: [XFORM 203-506] Disable code size check when do loop unroll.
WARNING: [HLS 200-643] The 'config_compile -disable_unroll_code_size_check' hidden command is deprecated and will be removed in a future release.
INFO: [HLS 200-1510] Running: config_interface -m_axi_addr64
INFO: [HLS 200-1510] Running: config_rtl -module_auto_prefix
INFO: [HLS 200-1510] Running: config_rtl -deadlock_detection none
INFO: [HLS 200-1510] Running: create_clock -period 10 -name default
INFO: [SYN 201-201] Setting up clock 'default' with a period of 10ns.
INFO: [HLS 200-1510] Running: csynth_design
INFO: [HLS 200-111] Finished File checks and directory preparation: CPU user time: 0.01 seconds. CPU system time: 0 seconds. Elapsed time: 0 seconds; current allocated memory: 1.214 GB.
INFO: [HLS 200-10] Analyzing design file '/tmp/finn_dev_jacopo/code_gen_ipgen_StreamingDataflowPartition_0_IODMA_0_holym8ev/top_StreamingDataflowPartition_0_IODMA_0.cpp' ...
WARNING: [HLS 207-5567] Invalid Directive: for current device, RAM_S2P + URAM is invalid combination for BIND_STORAGE's option 'type + impl' (/home/jacopo/git/finn/deps/finn-hlslib/slidingwindow.h:116:49)
ERROR: [HLS 207-3504] static_assert failed "" (/home/jacopo/git/finn/deps/finn-hlslib/dma.h:139:3)
INFO: [HLS 207-4235] in instantiation of function template specialization 'Mem2Stream<64, 0>' requested here (/home/jacopo/git/finn/deps/finn-hlslib/dma.h:170:7)
INFO: [HLS 207-4235] in instantiation of function template specialization 'Mem2Stream_Batch<64, 0>' requested here (/tmp/finn_dev_jacopo/code_gen_ipgen_StreamingDataflowPartition_0_IODMA_0_holym8ev/top_StreamingDataflowPartition_0_IODMA_0.cpp:25:1)
INFO: [HLS 200-111] Finished Command csynth_design CPU user time: 1.35 seconds. CPU system time: 0.07 seconds. Elapsed time: 0.71 seconds; current allocated memory: 0.867 MB.

    while executing
"source /tmp/finn_dev_jacopo/code_gen_ipgen_StreamingDataflowPartition_0_IODMA_0_holym8ev/hls_syn_StreamingDataflowPartition_0_IODMA_0.tcl"
    invoked from within
"hls::main /tmp/finn_dev_jacopo/code_gen_ipgen_StreamingDataflowPartition_0_IODMA_0_holym8ev/hls_syn_StreamingDataflowPartition_0_IODMA_0.tcl"
    ("uplevel" body line 1)
    invoked from within
"uplevel 1 hls::main {*}$newargs"
    (procedure "hls_proc" line 16)
    invoked from within
"hls_proc [info nameofexecutable] $argv"
INFO: [HLS 200-112] Total CPU user time: 2.86 seconds. Total CPU system time: 0.48 seconds. Total elapsed time: 1.94 seconds; peak allocated memory: 1.215 GB.
INFO: [Common 17-206] Exiting vitis_hls at Thu Jul 21 13:35:24 2022...

I installed Vitis HLS version 2022.1.

jacopoabramo avatar Jul 21 '22 13:07 jacopoabramo

WARNING: [HLS 207-5567] Invalid Directive: for current device, RAM_S2P + URAM is invalid combination for BIND_STORAGE's option 'type + impl' (/home/jacopo/git/finn/deps/finn-hlslib/slidingwindow.h:116:49)

From this warning it looks like URAM is selected. If I see it correctly, your target device is a PYNQ-Z2; that board doesn't have URAM, so you will need to select a different ram_style for that node.
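One possible way to do that (a sketch, assuming FINN is available and that the offending node is the one instantiating slidingwindow.h; the op type and attribute values are the usual FINN ones, but check your own graph in Netron first) is to set the node's ram_style attribute before running synthesis:

```python
from qonnx.custom_op.registry import getCustomOp

for node in model.graph.node:
    # slidingwindow.h is instantiated by the ConvolutionInputGenerator op;
    # adjust the op_type check to whichever node your warning points at.
    if node.op_type == "ConvolutionInputGenerator":
        inst = getCustomOp(node)
        # "ultra" (URAM) is invalid on the PYNQ-Z2's xc7z020; fall back to BRAM.
        # Other accepted values are "auto" and "distributed".
        inst.set_nodeattr("ram_style", "block")
```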

auphelia avatar Jul 21 '22 13:07 auphelia

WARNING: [HLS 207-5567] Invalid Directive: for current device, RAM_S2P + URAM is invalid combination for BIND_STORAGE's option 'type + impl' (/home/jacopo/git/finn/deps/finn-hlslib/slidingwindow.h:116:49)

Apparently the entire folding step I copied from the original source was incorrect; after removing it, synthesis succeeds. Where in the documentation can I find updated information on how to properly select the folding parameters?

EDIT: Is there any way to increase or control the number of processes involved in the synthesis step?

jacopoabramo avatar Jul 21 '22 14:07 jacopoabramo

You can find documentation about the folding factors here: https://github.com/Xilinx/finn/blob/github-pages/docs/finn-sheduling-and-folding.pptx

Some of the FINN transformations (e.g. HLSSynthIP) can be parallelized using the environment variable NUM_DEFAULT_WORKERS (https://finn.readthedocs.io/en/latest/getting_started.html#environment-variables); the default is 4.
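For example (assuming the stock run-docker.sh workflow; the value 8 is just an illustration), the worker count is picked up from the environment when the container is launched:

```shell
# Allow more parallel HLS synthesis jobs (default is 4);
# set this before entering the FINN Docker container.
export NUM_DEFAULT_WORKERS=8
./run-docker.sh notebook
```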

auphelia avatar Jul 27 '22 10:07 auphelia

Closing issue as it is now solved.

jacopoabramo avatar Sep 16 '22 14:09 jacopoabramo