[clustering] Possibly incorrect implementation of get_weight_from_layer
Describe the bug
Problem with clustering the weights of a custom layer. When a layer implements ClusterableLayer, it should override get_clusterable_weights, but a later call to get_weight_from_layer raises an AttributeError.
System information
OS: Linux Mint 19.3 Tricia x86_64
Host: Z390 AORUS MASTER
Kernel: 5.4.0-81-generic
Uptime: 12 hours, 21 mins
Packages: 4694
Shell: bash 4.4.20
Resolution: 3840x2160, 3840x2160
DE: Cinnamon 4.4.8
WM: Mutter (Muffin)
WM Theme: Linux Mint (Mint-Y-Dark)
Theme: Mint-Y-Dark [GTK2/3]
Icons: Mint-Y [GTK2/3]
Terminal: gnome-terminal
CPU: Intel i9-9900K (16) @ 5.000GHz
GPU: NVIDIA GeForce GTX 1080 Ti
Memory: 23072MiB / 64320MiB
TensorFlow version (installed from source or binary):
Installed with pip, tensorflow-gpu 2.6.0
TensorFlow Model Optimization version (installed from source or binary):
Installed with pip, tensorflow-model-optimization 0.6.0
Python version: Python 3.6.9
Describe the expected behavior
The custom layer implementing ClusterableLayer is wrapped and its weights are clustered without error.
Describe the current behavior
cluster_weights raises AttributeError: 'MyCustomLayer' object has no attribute 'my_custom_layer/kernel:0' (full traceback below).
Code to reproduce the issue
import numpy as np
import tensorflow as tf
from tensorflow import keras
import tensorflow_model_optimization as tfmot

cluster_weights = tfmot.clustering.keras.cluster_weights
CentroidInitialization = tfmot.clustering.keras.CentroidInitialization

clustering_params = {
    'number_of_clusters': 3,
    'cluster_centroids_init': CentroidInitialization.DENSITY_BASED
}

class MyCustomLayer(keras.layers.Layer, tfmot.clustering.keras.ClusterableLayer):
    def __init__(self, output_dim, **kwargs):
        self.output_dim = output_dim
        super(MyCustomLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        self.kernel = self.add_weight(
            name='kernel',
            shape=(input_shape[1], self.output_dim),
            initializer='normal',
            trainable=True
        )
        super(MyCustomLayer, self).build(input_shape)

    def call(self, input_data):
        return keras.backend.dot(input_data, self.kernel)

    def compute_output_shape(self, input_shape):
        return (input_shape[0], self.output_dim)

    def get_clusterable_weights(self):
        clusterable_weights = []
        for weight in self.trainable_weights:
            clusterable_weights.append((weight.name, weight.read_value()))
        return clusterable_weights

def get_model():
    # Create a simple model.
    model = keras.Sequential(
        [
            keras.Input(shape=(32,)),
            MyCustomLayer(32, input_shape=(32,)),
            keras.layers.Dense(2, activation="relu", name="layer1"),
            keras.layers.Dense(3, activation="relu", name="layer2"),
            keras.layers.Dense(4, name="layer3"),
        ]
    )
    model.compile(optimizer="adam", loss="mean_squared_error")
    return model

model = get_model()

# Train the model.
test_input = np.random.random((128, 32))
test_target = np.random.random((128, 1))
model.fit(test_input, test_target)
print(model.summary())

# Print all weights in model.
for weight in model.weights:
    print(weight.name)  # , weight.read_value())

clustered_model = cluster_weights(model, **clustering_params)
clustered_model.summary(line_length=180, positions=[0.25, 0.60, 0.70, 1.0])
The output is:
(bug) dellboy@thunderstruck:~/git/tisma/tf-learn$ python simple_model.py
2021-08-23 18:40:26.443329: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-23 18:40:26.465328: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-23 18:40:26.465628: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-23 18:40:26.466063: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-08-23 18:40:26.466461: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-23 18:40:26.466787: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-23 18:40:26.467059: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-23 18:40:27.063134: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-23 18:40:27.063747: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-23 18:40:27.064148: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-23 18:40:27.064580: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 6866 MB memory: -> device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1
2021-08-23 18:40:27.214358: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
4/4 [==============================] - 0s 664us/step - loss: 0.2719
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
my_custom_layer (MyCustomLay (None, 32) 1024
_________________________________________________________________
layer1 (Dense) (None, 2) 66
_________________________________________________________________
layer2 (Dense) (None, 3) 9
_________________________________________________________________
layer3 (Dense) (None, 4) 16
=================================================================
Total params: 1,115
Trainable params: 1,115
Non-trainable params: 0
_________________________________________________________________
None
my_custom_layer/kernel:0
layer1/kernel:0
layer1/bias:0
layer2/kernel:0
layer2/bias:0
layer3/kernel:0
layer3/bias:0
Traceback (most recent call last):
File "simple_model.py", line 69, in <module>
clustered_model = cluster_weights(model, **clustering_params)
File "/home/dellboy/git/tisma/tf-learn/bug/lib/python3.6/site-packages/tensorflow_model_optimization/python/core/clustering/keras/cluster.py", line 133, in cluster_weights
**kwargs)
File "/home/dellboy/git/tisma/tf-learn/bug/lib/python3.6/site-packages/tensorflow_model_optimization/python/core/clustering/keras/cluster.py", line 261, in _cluster_weights
to_cluster, input_tensors=None, clone_function=_add_clustering_wrapper)
File "/home/dellboy/git/tisma/tf-learn/bug/lib/python3.6/site-packages/keras/models.py", line 449, in clone_model
model, input_tensors=input_tensors, layer_fn=clone_function)
File "/home/dellboy/git/tisma/tf-learn/bug/lib/python3.6/site-packages/keras/models.py", line 332, in _clone_sequential_model
cloned_model = Sequential(layers=layers, name=model.name)
File "/home/dellboy/git/tisma/tf-learn/bug/lib/python3.6/site-packages/tensorflow/python/training/tracking/base.py", line 530, in _method_wrapper
result = method(self, *args, **kwargs)
File "/home/dellboy/git/tisma/tf-learn/bug/lib/python3.6/site-packages/keras/engine/sequential.py", line 134, in __init__
self.add(layer)
File "/home/dellboy/git/tisma/tf-learn/bug/lib/python3.6/site-packages/tensorflow/python/training/tracking/base.py", line 530, in _method_wrapper
result = method(self, *args, **kwargs)
File "/home/dellboy/git/tisma/tf-learn/bug/lib/python3.6/site-packages/keras/engine/sequential.py", line 217, in add
output_tensor = layer(self.outputs[0])
File "/home/dellboy/git/tisma/tf-learn/bug/lib/python3.6/site-packages/keras/engine/base_layer.py", line 977, in __call__
input_list)
File "/home/dellboy/git/tisma/tf-learn/bug/lib/python3.6/site-packages/keras/engine/base_layer.py", line 1115, in _functional_construction_call
inputs, input_masks, args, kwargs)
File "/home/dellboy/git/tisma/tf-learn/bug/lib/python3.6/site-packages/keras/engine/base_layer.py", line 848, in _keras_tensor_symbolic_call
return self._infer_output_signature(inputs, args, kwargs, input_masks)
File "/home/dellboy/git/tisma/tf-learn/bug/lib/python3.6/site-packages/keras/engine/base_layer.py", line 886, in _infer_output_signature
self._maybe_build(inputs)
File "/home/dellboy/git/tisma/tf-learn/bug/lib/python3.6/site-packages/keras/engine/base_layer.py", line 2659, in _maybe_build
self.build(input_shapes) # pylint:disable=not-callable
File "/home/dellboy/git/tisma/tf-learn/bug/lib/python3.6/site-packages/tensorflow_model_optimization/python/core/clustering/keras/cluster_wrapper.py", line 160, in build
original_weight = self.get_weight_from_layer(weight_name)
File "/home/dellboy/git/tisma/tf-learn/bug/lib/python3.6/site-packages/tensorflow_model_optimization/python/core/clustering/keras/cluster_wrapper.py", line 146, in get_weight_from_layer
return getattr(self.layer, weight_name)
AttributeError: 'MyCustomLayer' object has no attribute 'my_custom_layer/kernel:0'
But if I run this snippet:
# Print all weights in model.
for weight in model.weights:
    print(weight.name)  # , weight.read_value())

print([layer.name for layer in model.layers])

# Example weight that should be returned from the model.
weight_name = "my_custom_layer/kernel:0"

# This is the correct way of getting it (layers[0] is used as an example).
for weight in model.layers[0].weights:
    if weight.name == weight_name:
        print("FOUND WEIGHT: ", weight.name, weight.read_value())

It finds the weight:
my_custom_layer/kernel:0
layer1/kernel:0
layer1/bias:0
layer2/kernel:0
layer2/bias:0
layer3/kernel:0
layer3/bias:0
['my_custom_layer', 'layer1', 'layer2', 'layer3']
FOUND WEIGHT: my_custom_layer/kernel:0 tf.Tensor(
[[-0.01710013 0.08641329 0.00445064 ... -0.00947034 -0.02543414
-0.02332742]
[-0.0146245 -0.002941 -0.01422382 ... 0.02857029 -0.04331051
-0.00299862]
[-0.07127763 0.07367716 -0.06753001 ... -0.06001836 0.04888764
0.1081293 ]
...
[ 0.04297659 0.0334582 -0.09708535 ... 0.00098922 0.05463797
-0.0092663 ]
[ 0.03690836 -0.061338 0.01662921 ... -0.03843782 -0.08734126
0.00209901]
[ 0.10478324 0.07971404 0.05170573 ... 0.05777165 -0.08564453
0.04021074]], shape=(32, 32), dtype=float32)
My assumption is that the implementation of get_weight_from_layer(self, weight_name) at https://github.com/tensorflow/model-optimization/blob/18e87d262e536c9a742aef700880e71b47a7f768/tensorflow_model_optimization/python/core/clustering/keras/cluster_wrapper.py#L144-L145 is incorrect.
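To illustrate that assumption: the line shown in the traceback reduces to getattr(self.layer, weight_name), which cannot resolve a full variable name such as my_custom_layer/kernel:0. A minimal standalone sketch of the name-based lookup used in the snippet above (find_weight_by_name is a hypothetical helper, not part of tfmot):

def find_weight_by_name(layer, weight_name):
    # Hypothetical helper: resolve a weight by its full variable name
    # (e.g. 'my_custom_layer/kernel:0') instead of getattr(layer, weight_name),
    # which is the lookup the traceback shows failing.
    for weight in layer.weights:
        if weight.name == weight_name:
            return weight
    raise AttributeError(
        f"{layer.__class__.__name__} object has no weight named {weight_name!r}")

With the model above, find_weight_by_name(model.layers[0], 'my_custom_layer/kernel:0') returns the kernel variable, matching the manual loop in the snippet above.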
Hi @tisma,
Thanks for your report and especially for the code to reproduce the issue. We are looking into this.
Hi @wwwind, do you think you can take a look at this issue?
Hi @tisma, The function get_clusterable_weights in your implementation does not return what is expected by the clustering algorithm.
It should be:

def get_clusterable_weights(self):
    return [('kernel', self.kernel)]
We have a tutorial for ClusterableLayer here.
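For completeness, applied to the reproduction script above the fixed layer would look roughly like this (a sketch; only get_clusterable_weights changes relative to the original definition):

from tensorflow import keras
import tensorflow_model_optimization as tfmot

class MyCustomLayer(keras.layers.Layer, tfmot.clustering.keras.ClusterableLayer):
    def __init__(self, output_dim, **kwargs):
        self.output_dim = output_dim
        super(MyCustomLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        self.kernel = self.add_weight(
            name='kernel',
            shape=(input_shape[1], self.output_dim),
            initializer='normal',
            trainable=True)
        super(MyCustomLayer, self).build(input_shape)

    def call(self, input_data):
        return keras.backend.dot(input_data, self.kernel)

    def compute_output_shape(self, input_shape):
        return (input_shape[0], self.output_dim)

    def get_clusterable_weights(self):
        # Attribute name on the layer paired with the variable itself.
        return [('kernel', self.kernel)]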
Oh, I see the problem in my implementation of get_clusterable_weights:

def get_clusterable_weights(self):
    clusterable_weights = []
    for weight in self.trainable_weights:
        clusterable_weights.append((weight.name, weight.read_value()))
    return clusterable_weights
The first problem: the return value of weight.read_value() is of type <class 'tensorflow.python.framework.ops.EagerTensor'> instead of <class 'tensorflow.python.ops.resource_variable_ops.ResourceVariable'>, so the tuple should contain the weight itself rather than weight.read_value().
The second, trickier problem is that weight.name is not simply the variable name: it carries the layer-name prefix and a ':0' suffix (e.g. my_custom_layer/kernel:0). So if I don't want to specify the names manually and prefer a more generic way of adding clusterable weights, I have to do something like
weight.name[weight.name.find("/") + 1 : weight.name.find(":")]
to extract just the variable name, as in the sketch below.
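Put together, that generic variant would be a drop-in replacement for get_clusterable_weights in MyCustomLayer (a sketch; it assumes every trainable weight is stored under a matching attribute on the layer, as kernel is here):

def get_clusterable_weights(self):
    clusterable_weights = []
    for weight in self.trainable_weights:
        # Pair the bare attribute name with the variable itself, not read_value():
        # 'my_custom_layer/kernel:0' -> 'kernel'
        name = weight.name[weight.name.find("/") + 1:weight.name.find(":")]
        clusterable_weights.append((name, weight))
    return clusterable_weights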
@wwwind This was just a simple example of a model. What if I have a more complex model that is composed of several layers? How can I add the weights that belong to those nested layers to the list of clusterable weights?
This is the output of model.summary() together with all the weights that belong to the stereo_net layer:

# layers[9] -> StereoNet
for weight in model.layers[9].weights:
    print(weight.name)
Model: "model"
________________________________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
========================================================================================================================
Left (InputLayer) [(None, None, None, 3)] 0
________________________________________________________________________________________________________________________
Right (InputLayer) [(None, None, None, 3)] 0
________________________________________________________________________________________________________________________
cam_fx (InputLayer) [(None, 1)] 0
________________________________________________________________________________________________________________________
cam_baseline (InputLayer) [(None, 1)] 0
________________________________________________________________________________________________________________________
cam_proj_l (InputLayer) [(None, 3, 4)] 0
________________________________________________________________________________________________________________________
cam_proj_r (InputLayer) [(None, 3, 4)] 0
________________________________________________________________________________________________________________________
targets (InputLayer) [(None, None, None)] 0
________________________________________________________________________________________________________________________
ious (InputLayer) [(None, None, None)] 0
________________________________________________________________________________________________________________________
labels_map (InputLayer) [(None, None, None)] 0
________________________________________________________________________________________________________________________
stereo_net (StereoNet) [(None, None, None), (2, 233472, 6), (2, 4085752 Left[0][0]
Right[0][0]
cam_fx[0][0]
cam_baseline[0][0]
cam_proj_l[0][0]
cam_proj_r[0][0]
targets[0][0]
ious[0][0]
labels_map[0][0]
________________________________________________________________________________________________________________________
depth (PassThrough) (None, None, None) 0 stereo_net[0][0]
________________________________________________________________________________________________________________________
bbox_cls (PassThrough) (2, 233472, 6) 0 stereo_net[0][1]
________________________________________________________________________________________________________________________
bbox_reg (PassThrough) (2, 10, 3, 6) 0 stereo_net[0][2]
________________________________________________________________________________________________________________________
bbox_centerness (PassThrough) (2, 233472, 6) 0 stereo_net[0][3]
========================================================================================================================
Total params: 4,085,752
Trainable params: 4,085,480
Non-trainable params: 272
________________________________________________________________________________________________________________________
conv2d/kernel:0
batch_normalization/gamma:0
batch_normalization/beta:0
conv2d_1/kernel:0
batch_normalization_1/gamma:0
batch_normalization_1/beta:0
conv2d_2/kernel:0
batch_normalization_2/gamma:0
batch_normalization_2/beta:0
conv2d_4/kernel:0
batch_normalization_4/gamma:0
batch_normalization_4/beta:0
conv2d_5/kernel:0
batch_normalization_5/gamma:0
batch_normalization_5/beta:0
conv2d_3/kernel:0
batch_normalization_3/gamma:0
batch_normalization_3/beta:0
conv2d_6/kernel:0
batch_normalization_6/gamma:0
batch_normalization_6/beta:0
conv2d_7/kernel:0
batch_normalization_7/gamma:0
batch_normalization_7/beta:0
conv2d_8/kernel:0
batch_normalization_8/gamma:0
batch_normalization_8/beta:0
conv2d_9/kernel:0
batch_normalization_9/gamma:0
batch_normalization_9/beta:0
conv2d_11/kernel:0
group_normalization_1/gamma:0
group_normalization_1/beta:0
conv2d_12/kernel:0
group_normalization_2/gamma:0
group_normalization_2/beta:0
conv2d_10/kernel:0
group_normalization/gamma:0
group_normalization/beta:0
conv2d_13/kernel:0
group_normalization_3/gamma:0
group_normalization_3/beta:0
conv2d_14/kernel:0
group_normalization_4/gamma:0
group_normalization_4/beta:0
conv2d_15/kernel:0
group_normalization_5/gamma:0
group_normalization_5/beta:0
conv2d_16/kernel:0
group_normalization_6/gamma:0
group_normalization_6/beta:0
conv2d_17/kernel:0
group_normalization_7/gamma:0
group_normalization_7/beta:0
conv2d_18/kernel:0
group_normalization_8/gamma:0
group_normalization_8/beta:0
conv2d_20/kernel:0
group_normalization_10/gamma:0
group_normalization_10/beta:0
conv2d_21/kernel:0
group_normalization_11/gamma:0
group_normalization_11/beta:0
conv2d_19/kernel:0
group_normalization_9/gamma:0
group_normalization_9/beta:0
conv2d_22/kernel:0
group_normalization_12/gamma:0
group_normalization_12/beta:0
conv2d_23/kernel:0
group_normalization_13/gamma:0
group_normalization_13/beta:0
conv2d_24/kernel:0
group_normalization_14/gamma:0
group_normalization_14/beta:0
conv2d_25/kernel:0
group_normalization_15/gamma:0
group_normalization_15/beta:0
conv2d_26/kernel:0
group_normalization_16/gamma:0
group_normalization_16/beta:0
conv2d_27/kernel:0
group_normalization_17/gamma:0
group_normalization_17/beta:0
conv2d_28/kernel:0
group_normalization_18/gamma:0
group_normalization_18/beta:0
conv2d_29/kernel:0
group_normalization_19/gamma:0
group_normalization_19/beta:0
conv2d_30/kernel:0
group_normalization_20/gamma:0
group_normalization_20/beta:0
conv2d_31/kernel:0
group_normalization_21/gamma:0
group_normalization_21/beta:0
conv2d_33/kernel:0
group_normalization_23/gamma:0
group_normalization_23/beta:0
conv2d_34/kernel:0
group_normalization_24/gamma:0
group_normalization_24/beta:0
conv2d_32/kernel:0
group_normalization_22/gamma:0
group_normalization_22/beta:0
conv2d_35/kernel:0
group_normalization_25/gamma:0
group_normalization_25/beta:0
conv2d_36/kernel:0
group_normalization_26/gamma:0
group_normalization_26/beta:0
conv2d_37/kernel:0
group_normalization_27/gamma:0
group_normalization_27/beta:0
conv2d_38/kernel:0
group_normalization_28/gamma:0
group_normalization_28/beta:0
conv2d_39/kernel:0
group_normalization_29/gamma:0
group_normalization_29/beta:0
conv2d_40/kernel:0
group_normalization_30/gamma:0
group_normalization_30/beta:0
conv2d_41/kernel:0
group_normalization_31/gamma:0
group_normalization_31/beta:0
conv2d_42/kernel:0
group_normalization_32/gamma:0
group_normalization_32/beta:0
conv2d_43/kernel:0
group_normalization_33/gamma:0
group_normalization_33/beta:0
conv2d_44/kernel:0
conv3d/kernel:0
group_normalization_34/gamma:0
group_normalization_34/beta:0
conv3d_1/kernel:0
group_normalization_35/gamma:0
group_normalization_35/beta:0
conv3d_2/kernel:0
group_normalization_36/gamma:0
group_normalization_36/beta:0
conv3d_3/kernel:0
group_normalization_37/gamma:0
group_normalization_37/beta:0
conv3d_transpose/kernel:0
group_normalization_38/gamma:0
group_normalization_38/beta:0
conv3d_transpose_1/kernel:0
group_normalization_39/gamma:0
group_normalization_39/beta:0
conv3d_4/kernel:0
group_normalization_40/gamma:0
group_normalization_40/beta:0
conv3d_5/kernel:0
batch_normalization/moving_mean:0
batch_normalization/moving_variance:0
batch_normalization_1/moving_mean:0
batch_normalization_1/moving_variance:0
batch_normalization_2/moving_mean:0
batch_normalization_2/moving_variance:0
batch_normalization_4/moving_mean:0
batch_normalization_4/moving_variance:0
batch_normalization_5/moving_mean:0
batch_normalization_5/moving_variance:0
batch_normalization_3/moving_mean:0
batch_normalization_3/moving_variance:0
batch_normalization_6/moving_mean:0
batch_normalization_6/moving_variance:0
batch_normalization_7/moving_mean:0
batch_normalization_7/moving_variance:0
batch_normalization_8/moving_mean:0
batch_normalization_8/moving_variance:0
batch_normalization_9/moving_mean:0
batch_normalization_9/moving_variance:0
print(dir(model.layers[9]))
['CV_X_MAX', 'CV_X_MIN', 'CV_Y_MAX', 'CV_Y_MIN', 'CV_Z_MAX', 'CV_Z_MIN', 'GRID_SIZE', 'RPN3D_INPUT_DIM', 'VOXEL_X_SIZE', 'VOXEL_Y_SIZE', 'VOXEL_Z_SIZE', 'X_MAX', 'X_MIN', 'Y_MAX', 'Y_MIN', 'Z_MAX', 'Z_MIN', '_TF_MODULE_IGNORED_PROPERTIES', '__abstractmethods__', '__call__', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_abc_cache', '_abc_negative_cache', '_abc_negative_cache_version', '_abc_registry', '_activity_regularizer', '_add_trackable', '_add_variable_with_custom_getter', '_auto_track_sub_layers', '_autocast', '_autographed_call', '_build_input_shape', '_call_accepts_kwargs', '_call_arg_was_passed', '_call_fn_arg_defaults', '_call_fn_arg_positions', '_call_fn_args', '_call_full_argspec', '_callable_losses', '_cast_single_input', '_checkpoint_dependencies', '_clear_losses', '_compute_dtype', '_compute_dtype_object', '_dedup_weights', '_default_training_arg', '_deferred_dependencies', '_dtype', '_dtype_policy', '_dynamic', '_eager_losses', '_expects_mask_arg', '_expects_training_arg', '_flatten', '_flatten_layers', '_functional_construction_call', '_gather_children_attribute', '_gather_saveables_for_checkpoint', '_get_call_arg_value', '_get_existing_metric', '_get_input_masks', '_get_node_attribute_at_index', '_get_save_spec', '_get_trainable_state', '_handle_activity_regularization', '_handle_deferred_dependencies', '_handle_weight_regularization', '_inbound_nodes', '_inbound_nodes_value', '_infer_output_signature', '_init_call_fn_args', '_init_set_name', '_initial_weights', '_input_spec', '_instrument_layer_creation', '_instrumented_keras_api', '_instrumented_keras_layer_class', '_instrumented_keras_model_class', '_is_layer', '_keras_api_names', '_keras_api_names_v1', '_keras_tensor_symbolic_call', '_layers', '_list_extra_dependencies_for_serialization', '_list_functions_for_serialization', '_lookup_dependency', '_losses', '_map_resources', '_maybe_build', '_maybe_cast_inputs', '_maybe_create_attribute', '_maybe_initialize_trackable', '_metrics', '_metrics_lock', '_must_restore_from_config', '_name', '_name_based_attribute_restore', '_name_based_restores', '_name_scope', '_no_dependency', '_non_trainable_weights', '_obj_reference_counts', '_obj_reference_counts_dict', '_object_identifier', '_outbound_nodes', '_outbound_nodes_value', '_preload_simple_restoration', '_preserve_input_structure_in_config', '_restore_from_checkpoint_position', '_saved_model_inputs_spec', '_self_name_based_restores', '_self_saveable_object_factories', '_self_setattr_tracking', '_self_unconditional_checkpoint_dependencies', '_self_unconditional_deferred_dependencies', '_self_unconditional_dependency_names', '_self_update_uid', '_set_call_arg_value', '_set_connectivity_metadata', '_set_dtype_policy', '_set_mask_keras_history_checked', '_set_mask_metadata', '_set_save_spec', '_set_trainable_state', '_set_training_mode', '_setattr_tracking', '_should_cast_single_input', '_single_restoration_from_checkpoint_position', '_split_out_first_arg', '_stateful', '_supports_masking', '_symbolic_call', '_tf_api_names', '_tf_api_names_v1', '_thread_local', '_track_trackable', '_trackable_saved_model_saver', '_tracking_metadata', '_trainable', '_trainable_weights', 
'_unconditional_checkpoint_dependencies', '_unconditional_dependency_names', '_update_uid', '_updates', 'activity_regularizer', 'add_loss', 'add_metric', 'add_update', 'add_variable', 'add_weight', 'anchor_angles', 'apply', 'box_corner_parameters', 'build', 'built', 'call', 'cat_disp', 'cat_img_feature', 'cat_right_img_feature', 'centerness4class', 'cfg', 'class4angles', 'classif1', 'compute_dtype', 'compute_mask', 'compute_output_shape', 'compute_output_signature', 'coord_rect', 'count_params', 'dispregression', 'downsample_disp', 'dres0', 'dtype', 'dtype_policy', 'dynamic', 'feature_extraction', 'fix_centerness_bug', 'from_config', 'get_clusterable_algorithm', 'get_clusterable_weights', 'get_config', 'get_input_at', 'get_input_mask_at', 'get_input_shape_at', 'get_losses_for', 'get_output_at', 'get_output_mask_at', 'get_output_shape_at', 'get_updates_for', 'get_weights', 'hg_cv', 'hg_firstconv', 'hg_rpn_conv', 'hg_rpn_conv3d', 'img_feature_attentionbydisp', 'inbound_nodes', 'input', 'input_mask', 'input_shape', 'input_spec', 'losses', 'maxdisp', 'metrics', 'name', 'name_scope', 'non_trainable_variables', 'non_trainable_weights', 'num_3dconvs', 'num_angles', 'num_classes', 'num_convs', 'outbound_nodes', 'output', 'output_mask', 'output_shape', 'rpn3d_conv_kernel', 'set_weights', 'stateful', 'submodules', 'supports_masking', 'trainable', 'trainable_variables', 'trainable_weights', 'updates', 'upsample0', 'valid_classes', 'variable_dtype', 'variables', 'voxel_attentionbydisp', 'weights', 'with_name_scope']
Hi @tisma, To tell the clustering algorithm what should be clustered in your layer, you need to look at the attributes that hold the weights. This is advanced usage, so it might not be very convenient, but I would set a breakpoint to see where the weights are stored. For example, the MHA layer has 4 types of weights. To pass them for clustering, I would re-define the function like this:
def get_clusterable_weights_mha(self):
    return [('_query_dense.kernel', self._query_dense.kernel),
            ('_key_dense.kernel', self._key_dense.kernel),
            ('_value_dense.kernel', self._value_dense.kernel),
            ('_output_dense.kernel', self._output_dense.kernel)]
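Following the same pattern for a composite layer like StereoNet, one possible starting point (an untested sketch; the attribute names come from the dir() output above, and whether the wrapper accepts dotted 'attribute.kernel' names as in the MHA example should be verified against the tutorial) would be:

def get_clusterable_weights(self):
    # Hypothetical: collect kernels from a few named sub-layers of StereoNet.
    # The attribute names below (classif1, dres0, upsample0) are taken from the
    # dir() output above; extend the tuple with the sub-layers to be clustered.
    clusterable = []
    for attr_name in ('classif1', 'dres0', 'upsample0'):
        sub_layer = getattr(self, attr_name, None)
        if sub_layer is not None and hasattr(sub_layer, 'kernel'):
            clusterable.append((attr_name + '.kernel', sub_layer.kernel))
    return clusterable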