
3 Possible Essential Features for `model.summary()`

innat opened this issue 4 years ago • 5 comments

System information.

  • TensorFlow version (you are using): 2.4.1
  • Are you willing to contribute it (Yes/No): No


Describe the feature and the current behavior/state.

In total, 3 feature requests, illustrated below.

[Feature Request 1]: Add Model Memory Usage

It's often necessary to know how much GPU memory a DL model will consume, to avoid potential out-of-memory errors either at the beginning of training or in the middle of it.

So in model.summary(), (we think) it would be better to have functionality that gives a rough estimate of the memory usage of the DL model.

We've found a nice solution from @ZFTurbo here that addresses such a request, and I think it would be nice to have such a mechanism inside model.summary(memory_usage_info=True/False). Here is the starting gist for the contributor.
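For reference, a minimal sketch of such an estimator, following the idea in @ZFTurbo's gist (simplified; it assumes the model is built so that output shapes are known, and counts activations plus parameters at the active float precision):

import numpy as np
from tensorflow import keras
from tensorflow.keras import backend as K

def get_model_memory_usage(batch_size, model):
    # Rough training-memory estimate in GB: sum all layer output
    # activations (times batch_size) plus all parameter counts,
    # scaled by the byte size of the active float dtype.
    shapes_mem_count = 0
    internal_model_mem = 0.0
    for layer in model.layers:
        if isinstance(layer, keras.Model):  # nested sub-model: recurse
            internal_model_mem += get_model_memory_usage(batch_size, layer)
            continue
        out_shape = layer.output_shape
        if isinstance(out_shape, list):     # layers with multiple outputs
            out_shape = out_shape[0]
        single_layer_mem = 1
        for s in out_shape:
            if s is not None:               # skip the unknown batch dim
                single_layer_mem *= s
        shapes_mem_count += single_layer_mem

    trainable = int(np.sum([K.count_params(w) for w in model.trainable_weights]))
    non_trainable = int(np.sum([K.count_params(w) for w in model.non_trainable_weights]))

    number_size = {"float16": 2.0, "float64": 8.0}.get(K.floatx(), 4.0)
    total_bytes = number_size * (batch_size * shapes_mem_count + trainable + non_trainable)
    return np.round(total_bytes / (1024.0 ** 3), 3) + internal_model_mem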

~~[Feature Request 2]: Layer Range to Display Model Summary~~ It has been addressed here: https://github.com/keras-team/keras/pull/16458

details of feature request 2

Say we want to print the summary of the following model but are only interested in the trainable and non-trainable parameter counts, or we simply want to print the summary for a specific range of layers; in that case, it's not convenient to get such a LONG output.

from tensorflow import keras
from tensorflow.keras import Input, Model
from tensorflow.keras import layers

inputs = Input((224, 224, 3))
base = keras.applications.MobileNetV3Small(weights=None,
                                           include_top=False,
                                           input_tensor=inputs)
features   = layers.GlobalAveragePooling2D()(base.output)
classifier = layers.Dense(10, activation="sigmoid")(features)
model      = Model(base.input, classifier)
model.summary()
# lots of layers get printed
# ....

Instead, a new parameter can be introduced, e.g. model.summary(layer_range=['layer_a', 'layer_z']), to print a sub-graph (like we have now for plot_model). Also, a parameter that allows the .summary function to print only the following output:

# No Long Layer Prints ...
Total params: 1,540,218
Trainable params: 1,528,106
Non-trainable params: 12,112

It becomes a bit painful if we need to print the model summary more than once: because of the long layer printout, we have to scroll for a while, which is inconvenient. During transfer learning, where we freeze/unfreeze layers and quickly check the parameter counts, this pain becomes even more visible.
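Since the linked PR was merged, summary() accepts a layer_range argument. A small usage sketch with the model built above (the two entries are layer names; 'layer_a' and 'layer_z' are placeholders here):

model.summary(layer_range=['layer_a', 'layer_z'])
# Prints only the layers between 'layer_a' and 'layer_z' (both inclusive),
# plus the usual parameter-count footer.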

~~[Feature Request 3]: Print Nested Model / Layers~~ It has been addressed here: https://github.com/keras-team/keras/issues/15250

details of feature request 3

Let's say we've defined a model as follows:

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential(
    [
        keras.applications.MobileNetV3Small(
            input_shape=(224, 224, 3),
            include_top=False,
            weights="imagenet"
            ),
        layers.GlobalAveragePooling2D(),
        layers.Dense(10, activation="sigmoid"),
    ]
)

If we print the model summary, we will get:

model.summary()
Model: "sequential_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
MobilenetV3small (Functional (None, 1, 1, 1024)        1529968   
_________________________________________________________________
global_average_pooling2d_13  (None, 1024)              0         
_________________________________________________________________
dense_16 (Dense)             (None, 10)                10250     
=================================================================
Total params: 1,540,218
Trainable params: 1,528,106
Non-trainable params: 12,112

The problem here is that the MobileNet model becomes nested and acts as a single layer. Regarding this, a question had been asked on SO, and there is a workaround to overcome this issue. However, if instead of printing the model summary we want to plot the model, we can use expand_nested=True to plot the whole model, including that base model.
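With the merged changes, both utilities accept this flag in recent Keras versions; a quick sketch using the model defined above:

from tensorflow import keras

# Plot the full graph, expanding the nested MobileNetV3Small sub-model.
keras.utils.plot_model(model, expand_nested=True, show_shapes=True)

# summary() now accepts the same flag and prints the nested layers too.
model.summary(expand_nested=True)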


Will this change the current api? How? Yes, it will strengthen the model utility functions.

Who will benefit from this feature? DL researchers or engineers who use Keras.

Contributing

  • Do you want to contribute a PR? (yes/no): no (but happy to share a starter for any contributor).
  • If yes, please read this page for instructions
  • Briefly describe your candidate solution (if contributing):

innat avatar Aug 24 '21 15:08 innat

I am implementing the expand_nested option in model.summary(). I will make a PR when I'm done!

krishrustagi avatar Aug 25 '21 10:08 krishrustagi

I can do feature requests #1 and #2. I will make a PR when I finish.

chrislevn avatar Dec 15 '21 06:12 chrislevn

Hi, I was looking for an easy contribution just to get started with Keras and I found this issue. As far as I can see, features 1 and 3 have already been implemented, so I decided to work on the 2nd one, taking inspiration from this similar issue that has already been solved. I'm happy to hear any suggestions; I'll post my PR as soon as possible!

federicopisanu avatar Apr 24 '22 14:04 federicopisanu

OK, my pull request has been merged in https://github.com/keras-team/keras/commit/953d13bd1f5f149848daab3b54065b1ddcccd99e, so the second feature request has also been fulfilled. Thanks all!

federicopisanu avatar Jun 22 '22 15:06 federicopisanu

I can work on the first feature request (the last one to be completed) 👌 I'll create an issue soon to track progress and discussion, and create the PR when ready.

EDIT: As of Nov 30, I'm still working on this feature request. I'll submit a PR this week with the current progress (only the nested-model memory count and the total memory count are missing).

aaossa avatar Nov 09 '22 13:11 aaossa

@aaossa As you're working, please keep the following cases in your tests.

  • test_sequential and functional model (usually you'll pick these.)
  • test_sub-class model (a minimal example is sketched after this list)
  • test_hub.KerasLayer
  • test_huggingface.vision_model
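For the sub-class case, a minimal hypothetical model to exercise could look like this (note that for subclassed models the output shapes are unknown until the model is built, which is exactly what makes the memory computation tricky):

from tensorflow import keras

class SmallClassifier(keras.Model):
    # Hypothetical minimal subclassed model for the test case.
    def __init__(self, num_classes=10):
        super().__init__()
        self.backbone = keras.layers.Dense(64, activation="relu")
        self.head = keras.layers.Dense(num_classes, activation="sigmoid")

    def call(self, inputs):
        return self.head(self.backbone(inputs))

model = SmallClassifier()
model.build(input_shape=(None, 784))  # shapes are unknown until built
model.summary()  # output shapes may print as "multiple" for subclassed models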

For example, if I build a Keras model as follows, will it compute layer-wise memory as it should?

import tensorflow as tf
import tensorflow_hub as hub
from tensorflow import keras

model = keras.Sequential(
    [
        keras.layers.InputLayer(input_shape=INP_SIZE + (3,)),
        hub.KerasLayer(
            model_url,                  # URL/handle of a TF-Hub model
            input_shape=INP_SIZE + (3,),
            trainable=True,
            dtype=tf.float32,
            load_options=load_locally,  # e.g. a tf.saved_model.LoadOptions
        ),
    ]
)

You can use the starter script I shared above. If you run it, you will get:

gbytes = get_model_memory_usage(batch_size=32, model=cnn_model, verbose=1)
print('Approximate Required GPU-RAM {} GB'.format(gbytes))

Memory for input_1 layer in MB is: 0.75
Memory for rescaling layer in MB is: 0.75
Memory for normalization layer in MB is: 0.75
Memory for stem_conv_pad layer in MB is: 0.7529325485229492
....
Memory for global_average_pooling2d layer in MB is: 0.001220703125
Memory for predictions layer in MB is: 9.5367431640625e-07
Approximate Required GPU-RAM 15.589 GB
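For illustration, one way such per-layer numbers could be computed (a hypothetical helper, not necessarily the exact formula of the starter script):

def layer_memory_mb(layer, batch_size=1, bytes_per_number=4):
    # Memory (MB) of one layer's output activations, assuming float32
    # (4 bytes per number) by default.
    out_shape = layer.output_shape
    if isinstance(out_shape, list):  # layers with multiple outputs
        out_shape = out_shape[0]
    numbers = batch_size
    for s in out_shape:
        if s is not None:            # skip the unknown batch dimension
            numbers *= s
    return numbers * bytes_per_number / (1024.0 ** 2)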

Now, to add this to the summary method, it might be better to present the above information as follows.

_________________________________________________________________
 Layer (type)                Output Shape              Param #    Memory Usage
=================================================================
 input_1 (InputLayer)        [(None, 784)]             0          (in MB)

 dense (Dense)               (None, 64)                50240      (in MB)

 dense_1 (Dense)             (None, 64)                4160       (in MB)

 dense_2 (Dense)             (None, 10)                650        (in MB)

=================================================================
Total params: 55,050
Trainable params: 55,050
Non-trainable params: 0
Approximate Required GPU-RAM [] GB
_________________________________________________________________

innat avatar Dec 08 '22 13:12 innat