
NNgen log ignores lazy_reshape, which is the output layer, and shows the previous operator as the output layer.

Open · RyusukeYamano opened this issue 3 years ago • 1 comment

When a lazy_reshape is the output layer, the NNgen log ignores it and reports the preceding operator as the output layer instead.

The test case:

  • /nngen/tests/matrix_reshape/test_matrix_reshape_int16.py
from __future__ import absolute_import
from __future__ import print_function

import os
import sys

# the next line can be removed after installation
sys.path.insert(0, os.path.dirname(os.path.dirname(
    os.path.dirname(os.path.abspath(__file__)))))

import nngen as ng
import veriloggen

import matrix_reshape


a_shape = (4, 16)
b_shape = (16, 4)
a_dtype = ng.int16
b_dtype = ng.int16
axi_datawidth = 32


def test(request, silent=True):
    veriloggen.reset()

    simtype = request.config.getoption('--sim')

    rslt = matrix_reshape.run(a_shape, b_shape,
                              a_dtype, b_dtype,
                              axi_datawidth, silent,
                              filename=None, simtype=simtype,
                              outputfile=os.path.splitext(os.path.basename(__file__))[0] + '.out')

    verify_rslt = rslt.splitlines()[-1]
    assert(verify_rslt == '# verify: PASSED')


if __name__ == '__main__':
    rslt = matrix_reshape.run(a_shape, b_shape,
                              a_dtype, b_dtype,
                              axi_datawidth, silent=False,
                              filename='tmp.v',
                              outputfile=os.path.splitext(os.path.basename(__file__))[0] + '.out')
    print(rslt)

Its log is below:

[Schedule Table]
(Stage 0)
(Stage 1)
  <_lazy_reshape None dtype:int16 shape:(16, 4) alias_of:a default_addr:128 g_index:2 word_alignment:2 aligned_shape:(16, 4) scale_factor:1.000000>
  | <placeholder a dtype:int16 shape:(4, 16) default_addr:128 g_index:2 word_alignment:2 aligned_shape:(4, 16) scale_factor:1.000000>
(Stage 2)
  <sub output_sub_0 dtype:int16 shape:(16, 4) default_addr:0 g_index:1 word_alignment:2 aligned_shape:(16, 4) scale_factor:1.000000>
  | <add None dtype:int16 shape:(16, 4) chained default_addr:0 word_alignment:2 aligned_shape:(16, 4) scale_factor:1.000000>
  | | <_lazy_reshape None dtype:int16 shape:(16, 4) alias_of:a default_addr:128 g_index:2 word_alignment:2 aligned_shape:(16, 4) scale_factor:1.000000>
  | | <_lazy_reshape None dtype:int16 shape:(16, 4) alias_of:a default_addr:128 g_index:2 word_alignment:2 aligned_shape:(16, 4) scale_factor:1.000000>
  | <_lazy_reshape None dtype:int16 shape:(16, 4) alias_of:a default_addr:128 g_index:2 word_alignment:2 aligned_shape:(16, 4) scale_factor:1.000000>
(Stage 3)
  <_lazy_reshape None dtype:int16 shape:(4, 16) alias_of:output_sub_0 default_addr:0 g_index:1 word_alignment:2 aligned_shape:(4, 16) scale_factor:1.000000>
  | <sub output_sub_0 dtype:int16 shape:(16, 4) default_addr:0 g_index:1 word_alignment:2 aligned_shape:(16, 4) scale_factor:1.000000>
[RAM (spec: num)]
  16-bit 512-entry 2-port 2-bank RAM: 4
[Substream (spec: num)]
[Stream (spec: num)]
  (((<class 'nngen.operator.basic._lazy_reshape'>, <dtype int16>), <dtype int16>, 1), True): 1
  ((<class 'nngen.operator.basic.sub'>, ((<class 'nngen.operator.basic.add'>, <dtype int16>, <dtype int16>), <dtype int16>, 1), <dtype int16>), <dtype int16>, 1): 1
[State IDs in main_fsm]
  (3, 4, 'a', 'None')
  (5, 6, None, 'None')
  (12, 14, 'output_sub_0', 'control_sub_3')
  (15, 16, None, 'None')
[Control (name (# states: num))]
  main_fsm (# states: 22)
  control_sub_3 (# states: 59)
[Register Map]
    0 (R ): header0 (default: 0x00000000)
    4 (R ): header1 (default: 0x00000000)
    8 (R ): header2 (default: 0x00000000)
   12 (R ): header3 (default: 0x00000000)
   16 ( W): Start (set '1' to run)
   20 (R ): Busy (returns '1' when running)
   24 ( W): Reset (set '1' to initialize internal logic)
   28 (R ): Opcode from extern objects to SW (returns '0' when idle)
   32 ( W): Resume extern objects (set '1' to resume)
   36 (R ): Interrupt Status Register
   40 ( W): Interrupt Enable Register
   44 ( W): Interrupt Acknowledge Register
   48 (R ): State Counter
   52 ( W): Count Target
   56 ( W): Count Divider
   60 (  ): Reserved ...
  120 (  ): ... Reserved
  124 (R ): Address space amount
  128 (RW): Global address offset (default: 0)
  132 (RW): Address of temporal storages (size: 0B)
  136 (RW): Address of output (sub) 'output_sub_0' (size: 128B, dtype: int16, shape: (16, 4), alignment: 2 words (4 bytes)), aligned shape: (16, 4)
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Expected here instead: "output (_lazy_reshape) 'output__lazy_reshape_0' (size: 128B, dtype: int16, shape: (4, 16), alignment: 2 words (4 bytes)), aligned shape: (4, 16)".
  140 (RW): Address of placeholder 'a' (size: 128B, dtype: int16, shape: (4, 16), alignment: 2 words (4 bytes)), aligned shape: (4, 16)
[Default Memory Map (start - end)] (entire range: [0 - 255], size: 256B)
  [  0 - 127]: output (sub) 'output_sub_0' (size: 128B, dtype: int16, shape: (16, 4), alignment: 2 words (4 bytes)), aligned shape: (16, 4)
  [128 - 255]: placeholder 'a' (size: 128B, dtype: int16, shape: (4, 16), alignment: 2 words (4 bytes)), aligned shape: (4, 16)
# start
# end
# execution cycles:        1721
# verify: PASSED
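
As an analogy for why only the label is wrong and not the data: in the log above, the final _lazy_reshape is listed with alias_of:output_sub_0 and the same default_addr:0 as the sub operator, so both names refer to the same 128 bytes. The sketch below uses NumPy (not NNgen) to illustrate the same idea; the variable names are made up for illustration.

```python
import numpy as np

# Stand-in for the buffer written by the 'sub' operator: 16x4 int16 values.
sub_out = np.arange(16 * 4, dtype=np.int16).reshape(16, 4)

# Stand-in for the trailing lazy reshape: a view on the same buffer, not a copy.
reshaped = sub_out.reshape(4, 16)

same_buffer = np.shares_memory(sub_out, reshaped)  # both names alias one buffer
size_bytes = reshaped.nbytes                       # 4 * 16 elements * 2 bytes

print(same_buffer, size_bytes, reshaped.shape)
```

The size is identical either way; what the user sees differently is the shape and name attached to that region, which is what the register map and memory map should report.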

The following change seems to work well:

diff --git a/nngen/verilog.py b/nngen/verilog.py
index 034a68e..390149a 100644
--- a/nngen/verilog.py
+++ b/nngen/verilog.py
@@ -500,8 +500,8 @@ def set_storage_name(objs):
             obj.name = 'input_%d' % tmp_input
             tmp_input += 1
         elif obj.is_output and obj.name is None:
-            while bt.is_view(obj) or bt.is_removable_reshape(obj):
-                obj = obj.args[0]
+            #while bt.is_view(obj) or bt.is_removable_reshape(obj):
+            #    obj = obj.args[0]
 
             obj.name = 'output_%s_%d' % (obj.__class__.__name__, tmp_output)
             tmp_output += 1
@@ -1148,6 +1154,7 @@ def make_addr_map(config, objs, saxi):
     for obj in sorted(objs, key=lambda x: x.object_id):
         if obj.is_output and obj.global_index is None:
 
+            org_obj = obj
             while bt.is_view(obj) or bt.is_removable_reshape(obj):
                 obj = obj.args[0]
 
@@ -1165,16 +1172,16 @@ def make_addr_map(config, objs, saxi):
                   "(size: %s, dtype: %s, shape: %s, "
                   "alignment: %d words (%d bytes)), "
                   "aligned shape: %s") %
-                 (obj.__class__.__name__,
-                  "'%s'" % obj.name if obj.name is not None else 'None',
+                 (org_obj.__class__.__name__,
+                  "'%s'" % org_obj.name if org_obj.name is not None else 'None',
                   size_str(space_size),
-                  obj.dtype.to_str() if obj.dtype is not None else 'None',
-                  (str(obj.shape)
-                   if isinstance(obj.shape, (tuple, list)) else '()'),
-                  obj.get_word_alignment(),
-                  bt.to_byte(obj.get_word_alignment() * obj.get_ram_width()),
-                  (str(tuple(obj.get_aligned_shape()))
-                   if isinstance(obj.shape, (tuple, list)) else '()')))
+                  org_obj.dtype.to_str() if org_obj.dtype is not None else 'None',
+                  (str(org_obj.shape)
+                   if isinstance(org_obj.shape, (tuple, list)) else '()'),
+                  org_obj.get_word_alignment(),
+                  bt.to_byte(org_obj.get_word_alignment() * org_obj.get_ram_width()),
+                  (str(tuple(org_obj.get_aligned_shape()))
+                   if isinstance(org_obj.shape, (tuple, list)) else '()')))

With this change, the log is as follows:

[Schedule Table]
(Stage 0)
(Stage 1)
  <_lazy_reshape None dtype:int16 shape:(16, 4) alias_of:a default_addr:128 g_index:2 word_alignment:2 aligned_shape:(16, 4) scale_factor:1.000000>
  | <placeholder a dtype:int16 shape:(4, 16) default_addr:128 g_index:2 word_alignment:2 aligned_shape:(4, 16) scale_factor:1.000000>
(Stage 2)
  <sub None dtype:int16 shape:(16, 4) default_addr:0 g_index:1 word_alignment:2 aligned_shape:(16, 4) scale_factor:1.000000>
  | <add None dtype:int16 shape:(16, 4) chained default_addr:0 word_alignment:2 aligned_shape:(16, 4) scale_factor:1.000000>
  | | <_lazy_reshape None dtype:int16 shape:(16, 4) alias_of:a default_addr:128 g_index:2 word_alignment:2 aligned_shape:(16, 4) scale_factor:1.000000>
  | | <_lazy_reshape None dtype:int16 shape:(16, 4) alias_of:a default_addr:128 g_index:2 word_alignment:2 aligned_shape:(16, 4) scale_factor:1.000000>
  | <_lazy_reshape None dtype:int16 shape:(16, 4) alias_of:a default_addr:128 g_index:2 word_alignment:2 aligned_shape:(16, 4) scale_factor:1.000000>
(Stage 3)
  <_lazy_reshape output__lazy_reshape_0 dtype:int16 shape:(4, 16) alias_of:<sub> default_addr:0 g_index:1 word_alignment:2 aligned_shape:(4, 16) scale_factor:1.000000>
  | <sub None dtype:int16 shape:(16, 4) default_addr:0 g_index:1 word_alignment:2 aligned_shape:(16, 4) scale_factor:1.000000>
[RAM (spec: num)]
  16-bit 512-entry 2-port 2-bank RAM: 4
[Substream (spec: num)]
[Stream (spec: num)]
  (((<class 'nngen.operator.basic._lazy_reshape'>, <dtype int16>), <dtype int16>, 1), True): 1
  ((<class 'nngen.operator.basic.sub'>, ((<class 'nngen.operator.basic.add'>, <dtype int16>, <dtype int16>), <dtype int16>, 1), <dtype int16>), <dtype int16>, 1): 1
[State IDs in main_fsm]
  (3, 4, 'a', 'None')
  (5, 6, None, 'None')
  (12, 14, None, 'control_sub_3')
  (15, 16, 'output__lazy_reshape_0', 'None')
[Control (name (# states: num))]
  main_fsm (# states: 22)
  control_sub_3 (# states: 59)
[Register Map]
    0 (R ): header0 (default: 0x00000000)
    4 (R ): header1 (default: 0x00000000)
    8 (R ): header2 (default: 0x00000000)
   12 (R ): header3 (default: 0x00000000)
   16 ( W): Start (set '1' to run)
   20 (R ): Busy (returns '1' when running)
   24 ( W): Reset (set '1' to initialize internal logic)
   28 (R ): Opcode from extern objects to SW (returns '0' when idle)
   32 ( W): Resume extern objects (set '1' to resume)
   36 (R ): Interrupt Status Register
   40 ( W): Interrupt Enable Register
   44 ( W): Interrupt Acknowledge Register
   48 (R ): State Counter
   52 ( W): Count Target
   56 ( W): Count Divider
   60 (  ): Reserved ...
  120 (  ): ... Reserved
  124 (R ): Address space amount
  128 (RW): Global address offset (default: 0)
  132 (RW): Address of temporal storages (size: 0B)
  136 (RW): Address of output (_lazy_reshape) 'output__lazy_reshape_0' (size: 128B, dtype: int16, shape: (4, 16), alignment: 2 words (4 bytes)), aligned shape: (4, 16)
  140 (RW): Address of placeholder 'a' (size: 128B, dtype: int16, shape: (4, 16), alignment: 2 words (4 bytes)), aligned shape: (4, 16)
[Default Memory Map (start - end)] (entire range: [0 - 255], size: 256B)
  [  0 - 127]: output (_lazy_reshape) 'output__lazy_reshape_0' (size: 128B, dtype: int16, shape: (4, 16), alignment: 2 words (4 bytes)), aligned shape: (4, 16)
  [128 - 255]: placeholder 'a' (size: 128B, dtype: int16, shape: (4, 16), alignment: 2 words (4 bytes)), aligned shape: (4, 16)
# start
# end
# execution cycles:        1721
# verify: PASSED
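
The idea behind the change to make_addr_map can be sketched in isolation: keep a reference to the original output object for the log text, and walk through views only to find the object that owns the storage. The following is a minimal sketch, not NNgen code; the Node class, resolve_storage, and describe_output are hypothetical names invented for illustration, and the message format is abbreviated.

```python
class Node:
    """Hypothetical stand-in for an operator object (not the NNgen class)."""

    def __init__(self, kind, shape, args=(), is_view=False, name=None,
                 itemsize=2):
        self.kind = kind          # e.g. 'sub' or '_lazy_reshape'
        self.shape = shape
        self.args = args          # input operators
        self.is_view = is_view    # True if this node only aliases its input
        self.name = name
        self.itemsize = itemsize  # bytes per element (2 for int16)

    def size_bytes(self):
        n = 1
        for dim in self.shape:
            n *= dim
        return n * self.itemsize


def resolve_storage(obj):
    """Follow view nodes down to the node that actually owns the memory."""
    while obj.is_view:
        obj = obj.args[0]
    return obj


def describe_output(obj):
    """Size comes from the storage owner; label and shape from the original."""
    org_obj = obj
    storage = resolve_storage(obj)
    return "output (%s) '%s' (size: %dB, shape: %s)" % (
        org_obj.kind, org_obj.name, storage.size_bytes(), org_obj.shape)


sub = Node('sub', (16, 4))
out = Node('_lazy_reshape', (4, 16), args=(sub,), is_view=True,
           name='output__lazy_reshape_0')
message = describe_output(out)
print(message)
```

Separating "which object owns the memory" from "which object the user asked for" keeps the address allocation unchanged while the reported name and shape match the user's output.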

RyusukeYamano · Jul 07 '21 05:07

This issue has been resolved in commit 32e8d503bb819edcd48adccfb44784aa081cc8e4.

shtaxxx · Dec 08 '22 23:12