dace
dace copied to clipboard
Fix stream allocation scoping
There was an issue where streams would be allocated globally (to a state) and locally. This should not happen. The expected behaviour is to never allocate streams locally to a scope.
This PR fixes this issue by never including streams in the scope transient analysis.
Codecov Report
Merging #766 (b6e1c96) into master (ea8aeed) will decrease coverage by
21.66%
. The diff coverage is100.00%
.
@@ Coverage Diff @@
## master #766 +/- ##
===========================================
- Coverage 80.49% 58.83% -21.66%
===========================================
Files 222 224 +2
Lines 39796 40707 +911
===========================================
- Hits 32031 23949 -8082
- Misses 7765 16758 +8993
Impacted Files | Coverage Δ | |
---|---|---|
dace/codegen/targets/fpga.py | 13.96% <ø> (-55.28%) |
:arrow_down: |
dace/sdfg/utils.py | 66.02% <100.00%> (-17.45%) |
:arrow_down: |
dace/libraries/pblas/__init__.py | 0.00% <0.00%> (-100.00%) |
:arrow_down: |
dace/libraries/pblas/nodes/__init__.py | 0.00% <0.00%> (-100.00%) |
:arrow_down: |
dace/libraries/pblas/environments/__init__.py | 0.00% <0.00%> (-100.00%) |
:arrow_down: |
dace/libraries/pblas/nodes/pgeadd.py | 0.00% <0.00%> (-94.81%) |
:arrow_down: |
dace/frontend/octave/ast_expression.py | 11.39% <0.00%> (-82.67%) |
:arrow_down: |
dace/transformation/interstate/transient_reuse.py | 12.80% <0.00%> (-81.60%) |
:arrow_down: |
dace/transformation/testing.py | 0.00% <0.00%> (-79.66%) |
:arrow_down: |
dace/codegen/targets/sve/unparse.py | 12.24% <0.00%> (-76.87%) |
:arrow_down: |
... and 147 more |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update ea8aeed...b6e1c96. Read the comment docs.
I think the approach is still not the right one. What we want is to make sure that FPGA streams are always allocated in exactly one place, which is at the beginning of kernel generation. My intuition would be that we have to:
- Make sure that all streams are allocated when generating the FPGA kernel.
- Never allocate local streams when inside an FPGA kernel.
Okay - I found something interesting. The place where the stream allocation "would" be suppressed [here],(https://github.com/spcl/dace/blob/62ae0f0a43352af1c88dea4d457dac392203e790/dace/codegen/targets/fpga.py#L703) it is explicitly not suppressed because of something with the Intel FPGA backend. Can someone explain to me why this is?
Okay - I found something interesting. The place where the stream allocation "would" be suppressed [here],(
https://github.com/spcl/dace/blob/62ae0f0a43352af1c88dea4d457dac392203e790/dace/codegen/targets/fpga.py#L703
) it is explicitly not suppressed because of something with the Intel FPGA backend. Can someone explain to me why this is?
Honestly I already find it a bit weird that we are suppressing these allocations here. Have you checked the callsite of where allocate_array
is being called from in the case that is causing problems for you? Could you post the call stack here?
I think the approach is still not the right one. What we want is to make sure that FPGA streams are always allocated in exactly one place, which is at the beginning of kernel generation. My intuition would be that we have to:
- Make sure that all streams are allocated when generating the FPGA kernel.
- Never allocate local streams when inside an FPGA kernel.
exactly this. This PR should be changed to focus on making sure the FPGA codegen allocates all streams in one place.
@definelicht This would be the stack trace (made it throw an exception if an already defined stream reaches this bit of code).
It is called by dispatch_allocate
. That's the same code allocation "top-level local" data.
Traceback (most recent call last):
File "tests/stream_allocation_scope_test.py", line 98, in <module>
test_stream_allocation_scope()
File "tests/stream_allocation_scope_test.py", line 91, in test_stream_allocation_scope
assert sdfg.generate_code()[2].clean_code.count("dace::SetNames(AtoB") == 1
File "/home/widmmarc/k3-dace/dace/sdfg/sdfg.py", line 2249, in generate_code
program_code = codegen.generate_code(sdfg)
File "/home/widmmarc/k3-dace/dace/codegen/codegen.py", line 178, in generate_code
used_environments) = frame.generate_code(sdfg, None)
File "/home/widmmarc/k3-dace/dace/codegen/targets/framecode.py", line 552, in generate_code
states_generated = self.generate_states(sdfg, global_stream,
File "/home/widmmarc/k3-dace/dace/codegen/targets/framecode.py", line 463, in generate_states
cft.as_cpp(self.dispatcher.defined_vars, sdfg.symbols), sdfg)
File "/home/widmmarc/k3-dace/dace/codegen/control_flow.py", line 195, in as_cpp
expr += elem.as_cpp(defined_vars, symbols)
File "/home/widmmarc/k3-dace/dace/codegen/control_flow.py", line 126, in as_cpp
expr += self.dispatch_state(self.state)
File "/home/widmmarc/k3-dace/dace/codegen/targets/framecode.py", line 440, in dispatch_state
self._dispatcher.dispatch_state(sdfg, state, global_stream, stream)
File "/home/widmmarc/k3-dace/dace/codegen/dispatcher.py", line 310, in dispatch_state
satisfied_dispatchers[0].generate_state(sdfg, state,
File "/home/widmmarc/k3-dace/dace/codegen/targets/fpga.py", line 211, in generate_state
self.generate_kernel(sdfg, state, kernel_name, subgraphs,
File "/home/widmmarc/k3-dace/dace/codegen/targets/fpga.py", line 1399, in generate_kernel
self.generate_kernel_internal(sdfg, state, kernel_name, subgraphs,
File "/home/widmmarc/k3-dace/dace/codegen/targets/xilinx.py", line 798, in generate_kernel_internal
self.generate_modules(sdfg, state, kernel_name, subgraphs,
File "/home/widmmarc/k3-dace/dace/codegen/targets/fpga.py", line 1450, in generate_modules
self.generate_module(sdfg, state, module_name, subgraph,
File "/home/widmmarc/k3-dace/dace/codegen/targets/xilinx.py", line 732, in generate_module
self._dispatcher.dispatch_subgraph(sdfg,
File "/home/widmmarc/k3-dace/dace/codegen/dispatcher.py", line 351, in dispatch_subgraph
self.dispatch_scope(v.map.schedule, sdfg, scope_subgraph,
File "/home/widmmarc/k3-dace/dace/codegen/dispatcher.py", line 403, in dispatch_scope
self._map_dispatchers[map_schedule].generate_scope(
File "/home/widmmarc/k3-dace/dace/codegen/targets/fpga.py", line 510, in generate_scope
self.generate_node(sdfg, dfg_scope, state_id,
File "/home/widmmarc/k3-dace/dace/codegen/targets/fpga.py", line 1043, in generate_node
getattr(self, method_name)(sdfg, dfg, state_id, node,
File "/home/widmmarc/k3-dace/dace/codegen/targets/fpga.py", line 1345, in _generate_MapEntry
self._dispatcher.dispatch_allocate(sdfg, dfg, state_id, child, None,
File "/home/widmmarc/k3-dace/dace/codegen/dispatcher.py", line 424, in dispatch_allocate
dispatcher.allocate_array(sdfg, dfg, state_id, node, function_stream,
File "/home/widmmarc/k3-dace/dace/codegen/targets/fpga.py", line 543, in allocate_array
raise Exception
I will also need to ensure that all streams are allocated once (at the top-level). At the moment, only data shared between two or more subgraphs are allocated "top-level locally". (Missed that bit before.) This should be rather straightforward - just look for all streams in the subgraphs and allocate them when it allocates the data shared between subgraphs.
As a first step, make sure that all these streams are allocated where we want them to be allocated (i.e., at the beginning of generating an FPGA kernel). Then we can see if we need to prevent them from being reallocated later, but we might not need to, because of that code you linked.
Now all streams are allocated "top-level locally". The FPGA codegen now allocates data at that level if it shared between two subgraphs (status-quo) or if it is a stream.
There is still the issue with them being declared locally due to that explicit check mentioned before.https://github.com/spcl/dace/blob/62ae0f0a43352af1c88dea4d457dac392203e790/dace/codegen/targets/fpga.py#L703 @TizianoDeMatteis It was added by this commit. Could you explain to me why this check is needed? I am not familiar with the concept of views in DaCe.
@TizianoDeMatteis It was added by this commit. Could you explain to me why this check is needed? I am not familiar with the concept of views in DaCe.
This is not directly related to views. We need to allocate anyway streams to keep track of their names across the SDFGs hierarchy (a stream may have a certain name in the top-level scope, then another one in a nested SDFG and so on...)