nanoGPT Update dependencies

Fixed error "AttributeError: module 'numpy' has no attribute 'typeDict'"
Fixed error "TypeError: Descriptors cannot not be created directly."

Feb 04 '23 18:02 tombenj

I don't understand what's happening here, where is the error coming from?

Feb 04 '23 19:02 karpathy

NumPy versions newer than 1.21 raise the following error when training:

    from tensorflow.python.tools import module_util as _module_util
  File "/usr/lib/python3/dist-packages/tensorflow/python/__init__.py", line 45, in <module>
    from tensorflow.python.feature_column import feature_column_lib as feature_column
  File "/usr/lib/python3/dist-packages/tensorflow/python/feature_column/feature_column_lib.py", line 18, in <module>
    from tensorflow.python.feature_column.feature_column import *
  File "/usr/lib/python3/dist-packages/tensorflow/python/feature_column/feature_column.py", line 143, in <module>
    from tensorflow.python.layers import base
  File "/usr/lib/python3/dist-packages/tensorflow/python/layers/base.py", line 16, in <module>
    from tensorflow.python.keras.legacy_tf_layers import base
  File "/usr/lib/python3/dist-packages/tensorflow/python/keras/__init__.py", line 25, in <module>
    from tensorflow.python.keras import models
  File "/usr/lib/python3/dist-packages/tensorflow/python/keras/models.py", line 22, in <module>
    from tensorflow.python.keras.engine import functional
  File "/usr/lib/python3/dist-packages/tensorflow/python/keras/engine/functional.py", line 32, in <module>
    from tensorflow.python.keras.engine import training as training_lib
  File "/usr/lib/python3/dist-packages/tensorflow/python/keras/engine/training.py", line 52, in <module>
    from tensorflow.python.keras.saving import hdf5_format
  File "/usr/lib/python3/dist-packages/tensorflow/python/keras/saving/hdf5_format.py", line 37, in <module>
    import h5py
  File "/usr/lib/python3/dist-packages/h5py/__init__.py", line 46, in <module>
    from ._conv import register_converters as _register_converters
  File "h5py/h5t.pxd", line 14, in init h5py._conv
  File "h5py/h5t.pyx", line 293, in init h5py.h5t
  File "/home/ubuntu/.local/lib/python3.8/site-packages/numpy/__init__.py", line 284, in __getattr__
    raise AttributeError("module {!r} has no attribute "
AttributeError: module 'numpy' has no attribute 'typeDict'

Training also fails with newer versions of protobuf:

Traceback (most recent call last):
  File "train.py", line 189, in <module>
    model = torch.compile(model) # requires PyTorch 2.0
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/__init__.py", line 1412, in compile
    import torch._dynamo
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/_dynamo/__init__.py", line 1, in <module>
    from . import allowed_functions, convert_frame, eval_frame, resume_execution
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/_dynamo/convert_frame.py", line 29, in <module>
    from .output_graph import OutputGraph
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/_dynamo/output_graph.py", line 23, in <module>
    from . import config, logging as torchdynamo_logging, variables
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/_dynamo/variables/__init__.py", line 43, in <module>
    from .torch import TorchVariable
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/_dynamo/variables/torch.py", line 106, in <module>
    import transformers
  File "/home/ubuntu/.local/lib/python3.8/site-packages/transformers/__init__.py", line 30, in <module>
    from . import dependency_versions_check
  File "/home/ubuntu/.local/lib/python3.8/site-packages/transformers/dependency_versions_check.py", line 17, in <module>
    from .utils.versions import require_version, require_version_core
  File "/home/ubuntu/.local/lib/python3.8/site-packages/transformers/utils/__init__.py", line 34, in <module>
    from .generic import (
  File "/home/ubuntu/.local/lib/python3.8/site-packages/transformers/utils/generic.py", line 33, in <module>
    import tensorflow as tf
  File "/usr/lib/python3/dist-packages/tensorflow/__init__.py", line 37, in <module>
    from tensorflow.python.tools import module_util as _module_util
  File "/usr/lib/python3/dist-packages/tensorflow/python/__init__.py", line 37, in <module>
    from tensorflow.python.eager import context
  File "/usr/lib/python3/dist-packages/tensorflow/python/eager/context.py", line 29, in <module>
    from tensorflow.core.framework import function_pb2
  File "/usr/lib/python3/dist-packages/tensorflow/core/framework/function_pb2.py", line 14, in <module>
    from tensorflow.core.framework import attr_value_pb2 as tensorflow_dot_core_dot_framework_dot_attr__value__pb2
  File "/usr/lib/python3/dist-packages/tensorflow/core/framework/attr_value_pb2.py", line 14, in <module>
    from tensorflow.core.framework import tensor_pb2 as tensorflow_dot_core_dot_framework_dot_tensor__pb2
  File "/usr/lib/python3/dist-packages/tensorflow/core/framework/tensor_pb2.py", line 14, in <module>
    from tensorflow.core.framework import resource_handle_pb2 as tensorflow_dot_core_dot_framework_dot_resource__handle__pb2
  File "/usr/lib/python3/dist-packages/tensorflow/core/framework/resource_handle_pb2.py", line 14, in <module>
    from tensorflow.core.framework import tensor_shape_pb2 as tensorflow_dot_core_dot_framework_dot_tensor__shape__pb2
  File "/usr/lib/python3/dist-packages/tensorflow/core/framework/tensor_shape_pb2.py", line 34, in <module>
    _descriptor.FieldDescriptor(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/google/protobuf/descriptor.py", line 560, in __new__
    _message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
ERROR:torch.distribut
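
For anyone who cannot change the installed packages, workaround 2 from the protobuf message above could be sketched roughly like this. The helper name `force_pure_python_protobuf` is made up for illustration; only the environment variable name and value come from the error text. It has to run before anything imports `google.protobuf`, and per the message it will be much slower.

```python
import os

# Variable name and value are taken verbatim from the protobuf error message.
PROTOBUF_IMPL_VAR = "PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"


def force_pure_python_protobuf(environ=None):
    """Select the pure-Python protobuf implementation (hypothetical helper).

    Returns the previous value of the variable, or None if it was unset.
    Must be called before google.protobuf is imported anywhere.
    """
    if environ is None:
        environ = os.environ
    previous = environ.get(PROTOBUF_IMPL_VAR)
    environ[PROTOBUF_IMPL_VAR] = "python"
    return previous


# Intended usage, at the very top of the training script:
#   force_pure_python_protobuf()
#   import torch
```

Pinning protobuf to 3.20.x or lower (workaround 1) avoids the slowdown, which is the route this PR takes.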

Feb 04 '23 20:02 tombenj

Something can't be right here. How is tensorflow even involved?

Feb 04 '23 21:02 karpathy

PyTorch dynamo currently imports transformers if the latter is available in the environment
https://github.com/pytorch/pytorch/blob/master/torch/_dynamo/variables/torch.py#L106

which in turn imports tensorflow if it is available as well, e.g. https://github.com/huggingface/transformers/blob/main/src/transformers/utils/generic.py#L71

@tombenj you probably already have tensorflow and an incompatible protobuf in your env
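
A quick way to check whether an environment is exposed to this import chain might look like the sketch below. It only uses the standard library and does not import torch, transformers, or tensorflow; the function names and the choice of packages to report are assumptions for illustration, not something from nanoGPT.

```python
import importlib.util
from importlib import metadata


def is_importable(module_name):
    """Return True if a top-level module can be located, without importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        return False


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def environment_report():
    """Collect the facts relevant to the transformers -> tensorflow import chain."""
    return {
        "importable": {
            name: is_importable(name) for name in ("transformers", "tensorflow")
        },
        "versions": {
            name: installed_version(name) for name in ("numpy", "protobuf", "h5py")
        },
    }


if __name__ == "__main__":
    report = environment_report()
    for section, entries in report.items():
        for name, value in entries.items():
            print(f"{section:10s} {name:12s} {value}")
```

If both transformers and tensorflow show up as importable, the tracebacks above are plausible whenever the installed numpy or protobuf is newer than what that tensorflow build expects.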

Feb 06 '23 14:02 lantiga

Yep, makes sense. Lambda Labs added it by default. Closing PR.

Feb 06 '23 14:02 tombenj