Consider upgrading to TensorFlow 2.0

Open tkng opened this issue 5 years ago • 11 comments

TensorFlow 2.0 was released. We should consider upgrading our TF dependency to 2.0.

However, since this is a major version upgrade, we don't yet know how much source code needs to be rewritten.

Fortunately, there's a script, tf_upgrade_v2.py, to help upgrade to TF 2.0. First, we should try this script and estimate how difficult the upgrade is.
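For reference, a minimal sketch of how a whole-directory trial run could be assembled. It assumes the tf_upgrade_v2 command that ships with the TensorFlow pip package is on PATH; the output directory name blueoil_v2 and the report file name are placeholders I made up, not anything decided in this issue.

```python
def build_upgrade_command(src_dir, out_dir, report="report.txt"):
    """Return the argv list for converting a whole source tree.

    --intree / --outtree / --reportfile are the tree-mode flags described
    in the official upgrade guide. The directory and report names passed in
    are hypothetical.
    """
    return [
        "tf_upgrade_v2",
        "--intree", src_dir,
        "--outtree", out_dir,
        "--reportfile", report,
    ]


cmd = build_upgrade_command("blueoil", "blueoil_v2")
print(" ".join(cmd))

# To actually run it (requires TensorFlow to be installed):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Writing to a separate output tree keeps the original sources untouched, so the result can be diffed to estimate the amount of change.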

tkng avatar Oct 02 '19 09:10 tkng

There is an official guide just for reference: https://www.tensorflow.org/guide/upgrade

yd8534976 avatar Oct 02 '19 09:10 yd8534976

Identification of targets and estimation of difficulty (for Future PRs)

By running tf_upgrade_v2.py, which rewrites calls to use tf.compat.v1, we can keep using the 1.x features under TF 2. I wanted to identify the files that need to change and check how difficult the changes are.
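To make the tf.compat.v1 idea concrete, here is a toy sketch of that kind of rename. It is not how tf_upgrade_v2 works internally (the real script edits the AST and has a much larger rename table); the four symbols below are a hand-picked illustrative subset.

```python
import re

# Hand-picked TF 1.x symbols that are reachable through tf.compat.v1 in TF 2.
# This list is illustrative only, not the script's real rename table.
V1_ONLY = ("placeholder", "Session", "variable_scope", "get_variable")

_PATTERN = re.compile(r"\btf\.(%s)\b" % "|".join(V1_ONLY))


def to_compat_v1(source):
    """Prefix the listed tf.* symbols with tf.compat.v1."""
    return _PATTERN.sub(r"tf.compat.v1.\1", source)


before = "x = tf.placeholder(tf.float32)\nwith tf.Session() as sess:\n    pass\n"
after = to_compat_v1(before)
print(after)
```

Running the function on already-converted text leaves it unchanged, because the pattern only matches a symbol that directly follows "tf.".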

Files and directories under the "blueoil" directory are the targets. I applied tf_upgrade_v2.py directly to each directory and file under blueoil to identify what changes and to gauge the difficulty of the change.

Conversion targets

  • [ ] blueoil/blocks.py (*1) #895
  • [x] blueoil/cmd #875
  • [x] blueoil/configs #899
  • ~~blueoil/converter: (*2)~~
  • ~~blueoil/data_processor.py: (*2)~~
  • [x] blueoil/datasets: (*2) #894
  • [ ] blueoil/layers (*1) #927
  • [x] blueoil/metrics #913
  • [ ] blueoil/networks (*1)
    • [x] networks/base.py #1067
    • [x] networks/classification/base.py #1073
    • [x] networks/classification/darknet.py #1074
    • [ ] networks/classification/lm_resnet.py
    • [x] networks/classification/lmnet_v0.py #1075
    • [ ] networks/classification/lmnet_v1.py
    • [ ] networks/classification/mobilenet_v2.py
    • [x] networks/classification/quantize_example.py #1080
    • [x] networks/classification/resnet.py #1069
    • [ ] networks/classification/vgg16.py #1081
    • [x] networks/keypoint_detection #1077
      • [x] networks/keypoint_detection/base.py
      • [x] networks/keypoint_detection/lm_single_pose_v1.py
    • [ ] networks/lmnet_multi.py
    • [ ] networks/object_detection/lm_fyolo.py
    • [ ] networks/object_detection/yolo_v1.py
    • [ ] networks/object_detection/yolo_v2.py
    • [ ] networks/segmentation/__init__.py
    • [ ] networks/segmentation/base.py
    • [ ] networks/segmentation/lm_bisenet.py
    • [ ] networks/segmentation/lm_segnet_v0.py
    • [ ] networks/segmentation/lm_segnet_v1.py
  • [x] blueoil/quantizations #915
  • ~~blueoil/templates (*3)~~
  • [x] blueoil/utils #914
  • [x] blueoil/generate_lmnet_config.py #902
  • [x] output_template (output_template/python/lmnet/tensorflow_graph_runner.py) #916
  • [ ] tests/unit/conftest.py
  • [ ] tests/unit/executor_tests
  • [ ] tests/unit/fixtures
  • [ ] tests/unit/metrics_tests
  • [ ] tests/unit/networks_tests
  • [ ] tests/unit/test_layers.py
  • [ ] tests/unit/test_post_processor.py
  • [ ] tests/unit/test_quantizations.py

Not conversion targets

The following are files that have not been modified by running the script and are not subject to change:

  • blueoil/__init__.py
  • blueoil/common.py
  • blueoil/data_augmentor.py
  • blueoil/environment.py
  • ~~blueoil/generate_lmnet_config.py~~
  • blueoil/post_processor.py
  • blueoil/pre_processor.py
  • blueoil/visualize.py
  • blueoil/converter
  • blueoil/data_processor.py

Requires manual adjustment

These files can be converted automatically, but some manual adjustment is required. For contrib, the operators need to be replaced carefully. The others may not be too difficult.

(*1) contrib is not available in tf2 (tf.contrib was removed), and tf_upgrade_v2 does not make any changes to it. We need to switch to other layers manually. For example, the following contrib layers are used:

  • tf.contrib.layers.batch_norm
  • tf.contrib.layers.fully_connected
  • tf.contrib.layers.dropout
  • tf.contrib.layers.separable_conv2d

Simply replacing contrib with a similar layer may produce a graph that dlk cannot convert, so we need to choose layers that dlk supports. Accuracy may also decrease because of the change, so we need to re-train the models and check the accuracy after the replacements. This part can be difficult.
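Since the script leaves contrib untouched, the remaining usages have to be found some other way. A minimal sketch of a text scan that counts tf.contrib references; the sample source is made up for illustration and is not taken from blueoil.

```python
import re
from collections import Counter

# Matches dotted names rooted at tf.contrib, e.g. tf.contrib.layers.batch_norm.
CONTRIB_RE = re.compile(r"\btf\.contrib(?:\.\w+)+")


def count_contrib_usages(source):
    """Count each distinct tf.contrib.* name appearing in the source text."""
    return Counter(CONTRIB_RE.findall(source))


# Made-up sample, only to show the output shape.
sample = (
    "net = tf.contrib.layers.batch_norm(net)\n"
    "net = tf.contrib.layers.fully_connected(net, 10)\n"
    "net = tf.contrib.layers.batch_norm(net)\n"
)
usages = count_contrib_usages(sample)
for name, n in sorted(usages.items()):
    print(name, n)
```

Because this works on plain text rather than the AST, it also works on files that fail to parse, but it will count occurrences inside comments and strings too.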

(*2) Converting these three directories and files stopped with the following error:

pasta.base.annotate.AnnotationError: Expected ':' but found 'metaclass'
line 24: class Base(metaclass=ABCMeta):

They can be converted by removing the metaclass and running the script again.
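A rough sketch of that workaround, for a temporary copy of the file only: strip the metaclass keyword from class headers before running the script, then restore it by hand afterwards. The regex is a simplification I am assuming is good enough for headers like the one in the error; it does not handle base classes that contain commas themselves.

```python
import re

# A single-line class header with a parenthesised argument list.
_CLASS_HEADER = re.compile(r"^([ \t]*class[ \t]+\w+)\((.*?)\)[ \t]*:", re.M)


def strip_metaclass(source):
    """Drop any metaclass=... keyword from class headers in the source."""
    def fix(match):
        head, args = match.group(1), match.group(2)
        kept = [a.strip() for a in args.split(",")
                if a.strip() and not a.strip().startswith("metaclass")]
        if kept:
            return "%s(%s):" % (head, ", ".join(kept))
        return head + ":"
    return _CLASS_HEADER.sub(fix, source)


print(strip_metaclass("class Base(metaclass=ABCMeta):\n    pass\n"))
```

The stripped file is only an input for the conversion run; the metaclass has to be put back in the converted output.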

(*3) tf_upgrade_v2 does not seem to support *.tpl.py files: the template placeholders are invalid Python syntax, so processing stops. We need to check the changes in the configs folder and apply them manually.

ERROR: Failed to parse.
Traceback (most recent call last):
  File "/home/fukasawa/.pyenv/versions/3.6.8/envs/blueoil_tf2_convert/lib/python3.6/site-packages/tensorflow_core/tools/compatibility/ast_edits.py", line 916, in update_string_pasta
    t = pasta.parse(text)
  File "/home/fukasawa/.pyenv/versions/3.6.8/envs/blueoil_tf2_convert/lib/python3.6/site-packages/pasta/__init__.py", line 23, in parse
    t = ast_utils.parse(src)
  File "/home/fukasawa/.pyenv/versions/3.6.8/envs/blueoil_tf2_convert/lib/python3.6/site-packages/pasta/base/ast_utils.py", line 56, in parse
    tree = ast.parse(sanitize_source(src))
  File "/home/fukasawa/.pyenv/versions/3.6.8/lib/python3.6/ast.py", line 35, in parse
    return compile(source, filename, mode, PyCF_ONLY_AST)
  File "<unknown>", line 20
    from blueoil.networks.classification.{{network_module}} import {{network_class}}
                                         ^
SyntaxError: invalid syntax
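The failure above comes from the template placeholders not being valid Python. A minimal sketch of a pre-check that tells whether a piece of source would parse at all, using only the standard ast module:

```python
import ast


def is_parseable(source):
    """Return True if the text is syntactically valid Python."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# The line from the traceback, and an ordinary import for comparison.
template_line = (
    "from blueoil.networks.classification.{{network_module}} "
    "import {{network_class}}\n"
)
plain_line = "import tensorflow as tf\n"

print(is_parseable(template_line))  # False
print(is_parseable(plain_line))     # True
```

Files that fail this check could be skipped in the automatic run and handled manually.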

ytfksw avatar Mar 02 '20 06:03 ytfksw

I changed the targets described above.

blueoil/templates is not a target since it has no tf function calls (it only imports tensorflow). On the other hand, we have to modify blueoil/generate_lmnet_config.py.

generate_lmnet_config.py is a module that writes values to the tpl file, and the written values (functions) need to be changed manually.

Since the values (functions) are stored as strings and not code, tf_upgrade_v2 does not convert them.
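A small sketch of why strings are invisible to an AST-based tool: the same dotted name shows up as attribute nodes when written as code, but as a single opaque constant when written as a string. The optimizer name is just an example value, not necessarily what generate_lmnet_config.py writes.

```python
import ast


def attribute_names(source):
    """Collect the attribute names an AST-based tool could see."""
    tree = ast.parse(source)
    return sorted({n.attr for n in ast.walk(tree) if isinstance(n, ast.Attribute)})


# Example value only; the real strings in the module may differ.
as_code = "optimizer = tf.train.MomentumOptimizer\n"
as_string = 'optimizer = "tf.train.MomentumOptimizer"\n'

print(attribute_names(as_code))    # ['MomentumOptimizer', 'train']
print(attribute_names(as_string))  # []
```

So any TF symbol stored as a string has to be found and renamed by hand, for example with a text search.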

ytfksw avatar Mar 06 '20 09:03 ytfksw

@ytfksw Thank you for the update 👍

iizukak avatar Mar 09 '20 04:03 iizukak

Since converter and data_processor.py use metaclass, we knew we couldn't convert them automatically.

I temporarily removed the metaclass and ran the conversion, but nothing changed, so converter and data_processor.py are not targets.

I modified the table above.

ytfksw avatar Mar 12 '20 07:03 ytfksw

I have realized that output_template has a Python file using TensorFlow (output_template/python/lmnet/tensorflow_graph_runner.py).

I updated the table above.

ytfksw avatar Mar 13 '20 03:03 ytfksw

I have also realized that the tests directory has a lot of Python files using TensorFlow.

We are converting the code to be TF 2 compatible, so it would be better to convert the test code after completing that change.

Since there are a lot of changes, it seems better to change them module by module, as we did before.

I updated the table above.

For future reference, I tried applying tf_upgrade_v2 to tests. Only tests/unit changed: 21 files changed (out of 68 Python files in total), 63 insertions(+), 63 deletions(-)

 tests/unit/conftest.py                                                       |  2 +-
 tests/unit/executor_tests/test_output_event.py                               |  6 +++---
 tests/unit/fixtures/configs/for_build_tfds_classification.py                 |  2 +-
 tests/unit/fixtures/configs/for_build_tfds_object_detection.py               |  2 +-
 tests/unit/fixtures/configs/for_export.py                                    |  2 +-
 tests/unit/fixtures/configs/for_predict_classification.py                    |  2 +-
 tests/unit/fixtures/configs/for_predict_object_detection.py                  |  2 +-
 tests/unit/fixtures/configs/for_profile.py                                   |  2 +-
 tests/unit/fixtures/configs/for_train.py                                     |  2 +-
 tests/unit/metrics_tests/test_mean_average_precision.py                      | 26 +++++++++++++-------------
 tests/unit/networks_tests/classification_test/test_darknet.py                |  2 +-
 tests/unit/networks_tests/classification_test/test_lm_resnet_quantize.py     |  2 +-
 tests/unit/networks_tests/classification_test/test_lmnet_quantize.py         |  2 +-
 tests/unit/networks_tests/keypoint_detection_tests/test_lm_single_pose_v1.py |  2 +-
 tests/unit/networks_tests/object_detection_tests/test_yolo_v1.py             |  4 ++--
 tests/unit/networks_tests/object_detection_tests/test_yolo_v2.py             | 20 ++++++++++----------
 tests/unit/networks_tests/object_detection_tests/test_yolo_v2_quantize.py    |  2 +-
 tests/unit/networks_tests/segmentation_tests/test_lm_bisenet.py              |  4 ++--
 tests/unit/test_layers.py                                                    |  4 ++--
 tests/unit/test_post_processor.py                                            |  4 ++--
 tests/unit/test_quantizations.py                                             | 32 ++++++++++++++++----------------
 21 files changed, 63 insertions(+), 63 deletions(-)

No errors blocked the conversion, and six warnings were reported. Five warnings are for tests/unit/create_images.py. They tell us that the specification for saving a tf.keras model has changed, but this is not relevant to us since we are not using a tf.keras model.

One warning is for tests/unit/test_post_processor.py. tf_upgrade_v2 tries to change tf.image.resize_bilinear to tf.image.resize, and the warning says that the align_corners option cannot be used there. But tf.compat.v1.image.resize_bilinear also exists, so it may be better to use that to avoid unnecessary confusion.

TensorFlow 2.0 Upgrade Script
-----------------------------
Converted 68 files
Detected 6 issues that require attention
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
File: tests/unit/create_images.py
--------------------------------------------------------------------------------
tests/unit/create_images.py:40:4: WARNING: *.save requires manual check. (This warning is only applicable if the code saves a tf.Keras model) Keras model.save now saves to the Tensorflow SavedModel format by default, instead of HDF5. To continue saving to HDF5, add the argument save_format='h5' to the save() function.
tests/unit/create_images.py:61:4: WARNING: *.save requires manual check. (This warning is only applicable if the code saves a tf.Keras model) Keras model.save now saves to the Tensorflow SavedModel format by default, instead of HDF5. To continue saving to HDF5, add the argument save_format='h5' to the save() function.
tests/unit/create_images.py:78:4: WARNING: *.save requires manual check. (This warning is only applicable if the code saves a tf.Keras model) Keras model.save now saves to the Tensorflow SavedModel format by default, instead of HDF5. To continue saving to HDF5, add the argument save_format='h5' to the save() function.
tests/unit/create_images.py:95:4: WARNING: *.save requires manual check. (This warning is only applicable if the code saves a tf.Keras model) Keras model.save now saves to the Tensorflow SavedModel format by default, instead of HDF5. To continue saving to HDF5, add the argument save_format='h5' to the save() function.
tests/unit/create_images.py:112:4: WARNING: *.save requires manual check. (This warning is only applicable if the code saves a tf.Keras model) Keras model.save now saves to the Tensorflow SavedModel format by default, instead of HDF5. To continue saving to HDF5, add the argument save_format='h5' to the save() function.
--------------------------------------------------------------------------------
File: tests/unit/test_post_processor.py
--------------------------------------------------------------------------------
tests/unit/test_post_processor.py:259:13: WARNING: tf.image.resize_bilinear called with align_corners argument, requires manual check: align_corners is not supported by tf.image.resize, the new default transformation is close to what v1 provided. If you require exactly the same transformation as before, use compat.v1.image.resize_bilinear.
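To show why the align_corners warning matters, here is a plain-Python sketch of how an output pixel index maps back to an input coordinate under each convention. The formulas are the coordinate mappings as I understand them from the resize documentation, so treat them as an assumption; clamping to the image bounds and the interpolation itself are left out.

```python
def source_coordinate(dst, in_size, out_size, mode):
    """Map an output pixel index to a fractional input coordinate."""
    if mode == "v1_align_corners":
        # TF 1.x resize_bilinear(align_corners=True): corner pixels coincide.
        return dst * (in_size - 1) / (out_size - 1)
    if mode == "v1_default":
        # TF 1.x resize_bilinear(align_corners=False): plain scaling.
        return dst * in_size / out_size
    if mode == "v2_half_pixel":
        # TF 2.x tf.image.resize: pixel centres are aligned.
        return (dst + 0.5) * in_size / out_size - 0.5
    raise ValueError("unknown mode: %s" % mode)


in_size, out_size = 4, 8
for mode in ("v1_align_corners", "v1_default", "v2_half_pixel"):
    coords = [source_coordinate(d, in_size, out_size, mode) for d in range(out_size)]
    print(mode, coords)
```

The three mappings sample different input positions for the same output pixel, which is why a test that compares exact pixel values can change after switching to tf.image.resize.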

ytfksw avatar Mar 13 '20 03:03 ytfksw

@ytfksw Thank you for checking the test code!!

iizukak avatar Mar 15 '20 23:03 iizukak

I applied tf_upgrade_v2 to the networks folder to see which files would be changed. The files to be changed are as follows. Because there are many changes and the networks contain tf.contrib, we should create a PR for each file.

	modified:   networks/base.py
	modified:   networks/classification/base.py
	modified:   networks/classification/darknet.py
	modified:   networks/classification/lm_resnet.py
	modified:   networks/classification/lmnet_v0.py
	modified:   networks/classification/lmnet_v1.py
	modified:   networks/classification/mobilenet_v2.py
	modified:   networks/classification/quantize_example.py
	modified:   networks/classification/resnet.py
	modified:   networks/classification/vgg16.py
	modified:   networks/keypoint_detection/base.py
	modified:   networks/keypoint_detection/lm_single_pose_v1.py
	modified:   networks/lmnet_multi.py
	modified:   networks/object_detection/lm_fyolo.py
	modified:   networks/object_detection/yolo_v1.py
	modified:   networks/object_detection/yolo_v2.py
	modified:   networks/segmentation/__init__.py
	modified:   networks/segmentation/base.py
	modified:   networks/segmentation/lm_bisenet.py
	modified:   networks/segmentation/lm_segnet_v0.py
	modified:   networks/segmentation/lm_segnet_v1.py

ytfksw avatar May 29 '20 05:05 ytfksw

@ytfksw Thanks for working on this issue. I'm curious how much work is left here. Can you give me an estimate?

tsawada avatar Jul 21 '20 05:07 tsawada

@tsawada Current status is here https://github.com/blue-oil/blueoil/issues/479#issuecomment-593248601

I heard @ytfksw -san is busy now, so I removed the assignee. In the comment above, 25 checks are not completed yet. I think we can solve 2 checks/day, so we can solve this issue in about 2 weeks.

iizukak avatar Jul 27 '20 00:07 iizukak