OpenROAD-flow-scripts

Unable to run AutoTuner for gcd

Open vijayank88 opened this issue 2 years ago • 17 comments

Subject

[Flow] for any util, flow Makefile, or flow script issues.

Describe the bug

python3 distributed.py --design gcd --platform sky130hd \
                       --config ../designs/sky130hd/gcd/autotuner.json \
                       tune

AutoTuner fails to complete for the gcd design when following the documentation: https://openroad-flow-scripts.readthedocs.io/en/latest/user/InstructionsForAutoTuner.html#how-to-use

Expected Behavior

Complete AutoTuner successfully.

Environment

Git commit: 318f3042c43d9ac0c4eea675fc5f0f709b58fcba
kernel: Linux 3.10.0-1160.90.1.el7.x86_64
os: CentOS Linux 7 (Core)
cmake version 3.24.2
-- The CXX compiler identification is GNU 8.3.1
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/rh/devtoolset-8/root/usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- OpenROAD version: v2.0-9980-g318f3042c
-- System name: Linux
-- Compiler: GNU 8.3.1
-- Build type: RELEASE
-- Install prefix: /usr/local
-- C++ Standard: 17
-- C++ Standard Required: ON
-- C++ Extensions: OFF
-- The C compiler identification is GNU 8.3.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /opt/rh/devtoolset-8/root/usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Found Python: /usr/local/bin/python3.9 (found version "3.9.6") found components: Interpreter 
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
-- Performing Test C_COMPILER_SUPPORTS__-Wall
-- Performing Test C_COMPILER_SUPPORTS__-Wall - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wall
-- Performing Test CXX_COMPILER_SUPPORTS__-Wall - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-array-bounds
-- Performing Test C_COMPILER_SUPPORTS__-Wno-array-bounds - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-array-bounds
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-array-bounds - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-nonnull
-- Performing Test C_COMPILER_SUPPORTS__-Wno-nonnull - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-nonnull
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-nonnull - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-maybe-uninitialized
-- Performing Test C_COMPILER_SUPPORTS__-Wno-maybe-uninitialized - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-maybe-uninitialized
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-maybe-uninitialized - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-format-overflow
-- Performing Test C_COMPILER_SUPPORTS__-Wno-format-overflow - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-format-overflow
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-format-overflow - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-unused-variable
-- Performing Test C_COMPILER_SUPPORTS__-Wno-unused-variable - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-unused-variable
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-unused-variable - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-unused-function
-- Performing Test C_COMPILER_SUPPORTS__-Wno-unused-function - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-unused-function
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-unused-function - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-write-strings
-- Performing Test C_COMPILER_SUPPORTS__-Wno-write-strings - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-write-strings
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-write-strings - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-sign-compare
-- Performing Test C_COMPILER_SUPPORTS__-Wno-sign-compare - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-sign-compare
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-sign-compare - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-deprecated
-- Performing Test C_COMPILER_SUPPORTS__-Wno-deprecated - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-deprecated
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-deprecated - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-c++11-narrowing
-- Performing Test C_COMPILER_SUPPORTS__-Wno-c++11-narrowing - Failed
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-c++11-narrowing
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-c++11-narrowing - Failed
-- Performing Test C_COMPILER_SUPPORTS__-Wno-register
-- Performing Test C_COMPILER_SUPPORTS__-Wno-register - Failed
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-register
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-register - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-format
-- Performing Test C_COMPILER_SUPPORTS__-Wno-format - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-format
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-format - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-reserved-user-defined-literal
-- Performing Test C_COMPILER_SUPPORTS__-Wno-reserved-user-defined-literal - Failed
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-reserved-user-defined-literal
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-reserved-user-defined-literal - Failed
-- Performing Test C_COMPILER_SUPPORTS__-fpermissive
-- Performing Test C_COMPILER_SUPPORTS__-fpermissive - Failed
-- Performing Test CXX_COMPILER_SUPPORTS__-fpermissive
-- Performing Test CXX_COMPILER_SUPPORTS__-fpermissive - Success
-- Performing Test C_COMPILER_SUPPORTS__-x
-- Performing Test C_COMPILER_SUPPORTS__-x - Failed
-- Performing Test CXX_COMPILER_SUPPORTS__-x
-- Performing Test CXX_COMPILER_SUPPORTS__-x - Failed
-- Performing Test C_COMPILER_SUPPORTS__c++
-- Performing Test C_COMPILER_SUPPORTS__c++ - Failed
-- Performing Test CXX_COMPILER_SUPPORTS__c++
-- Performing Test CXX_COMPILER_SUPPORTS__c++ - Failed
-- Performing Test C_COMPILER_SUPPORTS__-Wno-unused-but-set-variable
-- Performing Test C_COMPILER_SUPPORTS__-Wno-unused-but-set-variable - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-unused-but-set-variable
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-unused-but-set-variable - Success
-- TCL library: /usr/lib64/libtcl.so
-- TCL header: /usr/include/tcl.h
-- TCL readline library: /usr/lib64/libtclreadline.so
-- TCL readline header: /usr/include
-- Found SWIG: /usr/bin/swig (found suitable version "4.1.0", minimum required is "3.0")  
-- Using SWIG >= 4.1.0 -flatstaticmethod flag for python
-- Found Boost: /usr/local/lib/cmake/Boost-1.72.0/BoostConfig.cmake (found version "1.72.0")  
-- boost: 1.72.0
-- Found Python3: /usr/local/include/python3.9 (found version "3.9.6") found components: Development Development.Module Development.Embed 
-- Found ZLIB: /usr/lib64/libz.so (found version "1.2.7") 
-- spdlog: 1.8.1
-- Found BISON: /usr/bin/bison (found version "3.0.4") 
-- Found Doxygen: /usr/bin/doxygen (found version "1.8.5") found components: doxygen dot 
-- STA version: 2.4.0
-- STA git sha: 7cf916ba205115a06c4531a044ced481f1ff8f12
-- System name: Linux
-- Compiler: GNU 8.3.1
-- Build type: RELEASE
-- Build CXX_FLAGS: -O3 -DNDEBUG
-- Install prefix: /usr/local
-- Found FLEX: /usr/local/bin/flex (found version "2.6.4") 
-- TCL library: /usr/lib64/libtcl.so
-- TCL header: /usr/include/tcl.h
-- Check if compiler accepts -pthread
-- Check if compiler accepts -pthread - yes
-- SSTA: 0
-- STA executable: /home/vijayan/OPENROAD_FLOW/ORFS_LOCAL/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/app/sta
-- GPU is not enabled
-- TCL library: /usr/lib64/libtcl.so
-- TCL header: /usr/include/tcl.h
-- Found re2: /opt/or-tools/lib64/cmake/re2/re2Config.cmake (found version "9.0.0") 
-- Found Clp: /opt/or-tools/lib64/cmake/Clp/ClpConfig.cmake (found version "1.17.7") 
-- Found Cbc: /opt/or-tools/lib64/cmake/Cbc/CbcConfig.cmake (found version "2.10.7") 
-- Found Eigen3: /usr/local/share/eigen3/cmake/Eigen3Config.cmake (found version "3.3.90") 
-- Found SCIP: /opt/or-tools/lib/cmake/scip/scip-config.cmake (found version "8.0.1") 
-- GUI is enabled
-- Charts widget is not enabled
-- Found Boost: /usr/local/lib/cmake/Boost-1.72.0/BoostConfig.cmake (found version "1.72.0") found components: serialization 
-- Found OpenMP_CXX: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- Could NOT find VTune (missing: VTune_LIBRARIES VTune_INCLUDE_DIRS) 
-- Found Boost: /usr/local/lib/cmake/Boost-1.80.0/BoostConfig.cmake (found suitable version "1.80.0", minimum required is "1.78")  
-- TCL library: /usr/lib64/libtcl.so
-- TCL header: /usr/include/tcl.h
-- Found Boost: /usr/local/lib/cmake/Boost-1.80.0/BoostConfig.cmake (found version "1.80.0") found components: serialization system thread 
-- TCL readline enabled
-- Tcl Extended disabled
-- Python3 enabled
-- Configuring done
-- Generating done
-- Build files have been written to: /tmp/tmp.CEz96EKs67

To Reproduce

cd flow/util
python3 distributed.py --design gcd --platform sky130hd \
                       --config ../designs/sky130hd/gcd/autotuner.json \
                       tune

Relevant log output

Traceback (most recent call last):
  File "/home/vijayan/OPENROAD_FLOW/ORFS_LOCAL/OpenROAD-flow-scripts/flow/util/distributed.py", line 41, in <module>
    import ray
  File "/home/vijayan/.local/lib/python3.9/site-packages/ray/__init__.py", line 91, in <module>
    import ray._raylet  # noqa: E402
  File "python/ray/_raylet.pyx", line 115, in init ray._raylet
  File "/home/vijayan/.local/lib/python3.9/site-packages/ray/exceptions.py", line 7, in <module>
    from ray.core.generated.common_pb2 import RayException, Language, PYTHON
  File "/home/vijayan/.local/lib/python3.9/site-packages/ray/core/generated/common_pb2.py", line 15, in <module>
    from . import runtime_env_common_pb2 as src_dot_ray_dot_protobuf_dot_runtime__env__common__pb2
  File "/home/vijayan/.local/lib/python3.9/site-packages/ray/core/generated/runtime_env_common_pb2.py", line 36, in <module>
    _descriptor.FieldDescriptor(
  File "/home/vijayan/.local/lib/python3.9/site-packages/google/protobuf/descriptor.py", line 561, in __new__
    _message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
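
Since the traceback points at a mismatch between Ray's generated `_pb2.py` files and the installed protobuf runtime, a first diagnostic step is to list which versions are actually installed for the interpreter that runs distributed.py. The following is a minimal sketch using only the standard library; the package list and the helper name `installed_versions` are assumptions chosen for illustration, not part of ORFS.

```python
from importlib.metadata import PackageNotFoundError, version

# Package names mentioned in this thread (assumed list; adjust as needed).
PACKAGES = ("ray", "protobuf", "grpcio", "tensorboardX", "tensorboard")


def installed_versions(names=PACKAGES):
    """Return {package: version string, or None if it is not installed}."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Run it with the same interpreter as the failing command (here python3), so that it inspects the same site-packages directory.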

Screenshots

No response

Additional Context

No response

vijayank88 avatar Aug 24 '23 06:08 vijayank88

cc/ @vvbandeira

vijayank88 avatar Aug 24 '23 06:08 vijayank88

@vijayank88 Looks like a version issue. Did you try the suggestions?

If you cannot immediately regenerate your protos, some other possible workarounds are:

  1. Downgrade the protobuf package to 3.20.x or lower.
  2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

vvbandeira avatar Aug 24 '23 10:08 vvbandeira

I tried to downgrade protobuf, but pip reports dependency conflicts.

$ pip3.9 install protobuf==3.20.*
Defaulting to user installation because normal site-packages is not writeable
Collecting protobuf==3.20.*
  Downloading protobuf-3.20.3-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.whl (1.0 MB)
     \u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501 1.0/1.0 MB 26.8 MB/s eta 0:00:00
Installing collected packages: protobuf
  Attempting uninstall: protobuf
    Found existing installation: protobuf 4.24.0
    Uninstalling protobuf-4.24.0:
      Successfully uninstalled protobuf-4.24.0
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorboardx 2.5.1 requires protobuf<=3.20.1,>=3.8.0, but you have protobuf 3.20.3 which is incompatible.
ray 1.11.0 requires grpcio<=1.43.0,>=1.28.1, but you have grpcio 1.57.0 which is incompatible.
Successfully installed protobuf-3.20.3

After setting export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python, I am able to run AutoTuner.
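
The same workaround can be applied from inside a Python entry point instead of the shell, provided the variable is set before anything imports protobuf. This is a sketch only: the variable name and value come from the error message, while the helper name `use_pure_python_protobuf` is hypothetical. As the error message notes, the pure-Python backend is much slower.

```python
import os

ENV_VAR = "PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"


def use_pure_python_protobuf():
    """Select protobuf's pure-Python backend unless a value is already set."""
    return os.environ.setdefault(ENV_VAR, "python")


if __name__ == "__main__":
    backend = use_pure_python_protobuf()
    print(f"{ENV_VAR}={backend}")
    try:
        import ray  # imported only after the variable is in place
    except ImportError:
        print("ray is not installed in this environment")
    else:
        print(f"ray {ray.__version__} imported")
```

Using `setdefault` keeps any value the user has already exported, so the sketch does not silently override an explicit choice.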

vijayank88 avatar Aug 24 '23 12:08 vijayank88

tensorboardx 2.5.1 requires protobuf<=3.20.1

@luarss Could you try this on your end so we can update the AT docs PR? Maybe pip3.9 install protobuf==3.20.1 would work.
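
The two conflicts pip reported can be checked by hand against the quoted constraints. The sketch below is a simplified comparison that handles only plain dotted integer versions; it is not a full PEP 440 implementation, and the helper names `parse` and `satisfies` are illustrative.

```python
import operator

OPS = {"<=": operator.le, ">=": operator.ge, "==": operator.eq,
       "<": operator.lt, ">": operator.gt}


def parse(version):
    """Turn '3.20.1' into (3, 20, 1); only dotted integers are handled."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version, spec):
    """Check a version against a comma-separated spec like '<=3.20.1,>=3.8.0'."""
    for clause in spec.split(","):
        clause = clause.strip()
        # Try two-character operators first so '<=' is not read as '<'.
        for symbol in ("<=", ">=", "==", "<", ">"):
            if clause.startswith(symbol):
                bound = parse(clause[len(symbol):])
                if not OPS[symbol](parse(version), bound):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


if __name__ == "__main__":
    # Constraints copied from the pip output in this thread.
    print(satisfies("3.20.3", "<=3.20.1,>=3.8.0"))   # tensorboardx vs protobuf
    print(satisfies("3.20.1", "<=3.20.1,>=3.8.0"))   # proposed pin
    print(satisfies("1.57.0", "<=1.43.0,>=1.28.1"))  # ray 1.11.0 vs grpcio
```

Under these constraints, protobuf 3.20.3 falls outside tensorboardx 2.5.1's range while 3.20.1 sits exactly on its upper bound, which is why pinning 3.20.1 is the candidate fix.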

vvbandeira avatar Aug 24 '23 12:08 vvbandeira

@vvbandeira It seems pip3.9 install protobuf==3.20.* installs protobuf-3.20.3 by default. With that version I am able to run AutoTuner without setting PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python.

vijayank88 avatar Aug 24 '23 12:08 vijayank88

@vijayank88 I see, but from the message above, a user might have trouble running tensorboard (a feature we claim to support and mention in the docs) if they keep protobuf-3.20.3.

vvbandeira avatar Aug 24 '23 12:08 vvbandeira

I am able to launch tensorboard from our GCP instance, with the following warnings:

$ ~/.local/bin/tensorboard --logdir=../logs/sky130hd/gcd/test-tune-2023-08-24-12-14-47/
2023-08-24 12:24:32.848620: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-08-24 12:24:33.106386: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-08-24 12:24:33.107344: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-08-24 12:24:34.920876: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/home/vijayan/.local/lib/python3.9/site-packages/tensorboard_data_server/bin/server: /lib64/libc.so.6: version `GLIBC_2.25' not found (required by /home/vijayan/.local/lib/python3.9/site-packages/tensorboard_data_server/bin/server)
/home/vijayan/.local/lib/python3.9/site-packages/tensorboard_data_server/bin/server: /lib64/libc.so.6: version `GLIBC_2.18' not found (required by /home/vijayan/.local/lib/python3.9/site-packages/tensorboard_data_server/bin/server)
/home/vijayan/.local/lib/python3.9/site-packages/tensorboard_data_server/bin/server: /lib64/libc.so.6: version `GLIBC_2.29' not found (required by /home/vijayan/.local/lib/python3.9/site-packages/tensorboard_data_server/bin/server)
/home/vijayan/.local/lib/python3.9/site-packages/tensorboard_data_server/bin/server: /lib64/libc.so.6: version `GLIBC_2.33' not found (required by /home/vijayan/.local/lib/python3.9/site-packages/tensorboard_data_server/bin/server)
/home/vijayan/.local/lib/python3.9/site-packages/tensorboard_data_server/bin/server: /lib64/libc.so.6: version `GLIBC_2.28' not found (required by /home/vijayan/.local/lib/python3.9/site-packages/tensorboard_data_server/bin/server)
/home/vijayan/.local/lib/python3.9/site-packages/tensorboard_data_server/bin/server: /lib64/libc.so.6: version `GLIBC_2.34' not found (required by /home/vijayan/.local/lib/python3.9/site-packages/tensorboard_data_server/bin/server)
/home/vijayan/.local/lib/python3.9/site-packages/tensorboard_data_server/bin/server: /lib64/libc.so.6: version `GLIBC_2.32' not found (required by /home/vijayan/.local/lib/python3.9/site-packages/tensorboard_data_server/bin/server)
Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.13.0 at http://localhost:6006/ (Press CTRL+C to quit)
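
The GLIBC lines above say that the bundled tensorboard_data_server binary was linked against newer glibc symbol versions than /lib64/libc.so.6 provides; TensorBoard still starts serving afterwards, as the last two lines show. A small sketch for comparing the system glibc against the versions named in those warnings follows. The helper name `missing_glibc_versions` is hypothetical and the `REQUIRED` list is copied from the log.

```python
import platform

# GLIBC symbol versions named in the warnings above.
REQUIRED = ("2.18", "2.25", "2.28", "2.29", "2.32", "2.33", "2.34")


def as_tuple(version):
    """Turn '2.17' into (2, 17) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def missing_glibc_versions(system_version, required=REQUIRED):
    """Return the required versions that are newer than the system glibc."""
    have = as_tuple(system_version)
    return [v for v in required if as_tuple(v) > have]


if __name__ == "__main__":
    lib, ver = platform.libc_ver()
    if lib == "glibc" and ver:
        print(f"system glibc {ver}; missing: {missing_glibc_versions(ver)}")
    else:
        print("could not determine a glibc version on this system")
```

An empty list would mean the system glibc is new enough for every symbol version the data server asked for.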

vijayank88 avatar Aug 24 '23 12:08 vijayank88

In my AutoTuner GCD notebook I got it working without touching the protobuf dependency: I only upgraded Ray to 2.0 and used Python 3.10. Do we currently track AutoTuner's metrics?

@vijayank88 Here's the notebook https://colab.research.google.com/drive/1wye0osn34YVWPvTrfBTftjOfGOtF3ABe?authuser=2#scrollTo=ETEAUaDzEQn2

My dependencies: image

luarss avatar Aug 25 '23 14:08 luarss

@vvbandeira What is the recommended version of python for autotuner?

vijayank88 avatar Aug 25 '23 15:08 vijayank88

@vvbandeira What is the recommended version of python for autotuner?

Anything above 3.9 should work fine.

vvbandeira avatar Aug 25 '23 17:08 vvbandeira

@vijayank88 Can you post your requirements.txt file? I can check whether it works in my Colab environment. If so, we can update the official repo file.

luarss avatar Aug 26 '23 03:08 luarss

@luarss I don't have a requirements.txt, but I ran the pip commands below alongside ORFS to set up AutoTuner.

pip3.9 install -U --user 'ray[default,tune]==1.11.0' ax-platform hyperopt nevergrad optuna pandas
pip3.9 install -U --user colorama==0.4.4 bayesian-optimization==1.4.0

As per Vitor's suggestion, I used Python 3.9. tensorboard also has to be installed to view AutoTuner results.

vijayank88 avatar Aug 26 '23 09:08 vijayank88

I tested with pip install protobuf==3.20.1 and it seems to work without any special setting.

@vijayank88 In https://github.com/The-OpenROAD-Project/OpenROAD-flow-scripts/pull/1319 we will have a requirements.txt file as follows

image

luarss avatar Aug 26 '23 10:08 luarss

Yes, that covers all the pip packages. What about tensorboard installation instructions?

vijayank88 avatar Aug 26 '23 11:08 vijayank88

Just updated the tensorboard version. It is slightly different from yours (2.13.3). Warnings aside, is the tensorboard able to show data?

luarss avatar Aug 26 '23 12:08 luarss

Just updated the tensorboard version. It is slightly different from yours (2.13.3). Warnings aside, is the tensorboard able to show data?

yes

vijayank88 avatar Aug 26 '23 12:08 vijayank88

@vvbandeira This issue should be fixed with the recent PRs.

luarss avatar May 25 '24 10:05 luarss

The PRs with the new autotuner.json config files were merged. Please reopen this issue if you find that we did not address it, or file new issues if other designs do not complete the flow under AT.

vvbandeira avatar May 31 '24 13:05 vvbandeira