python-bigquery-dataframes

bigframes 1.5.0 example showing module 'numpy' has no attribute 'dtypes'

Open wazi55 opened this issue 1 year ago • 2 comments

Environment details

  • OS type and version:
  • Python version: python --version
  • pip version: pip --version
  • bigframes version: pip show bigframes

import sys
import bigframes
import google.cloud.bigquery
import ibis
import pandas
import pyarrow
import sqlglot

print(f"Python: {sys.version}")
print(f"bigframes=={bigframes.__version__}")
print(f"google-cloud-bigquery=={google.cloud.bigquery.__version__}")
print(f"ibis=={ibis.__version__}")
print(f"pandas=={pandas.__version__}")
print(f"pyarrow=={pyarrow.__version__}")
print(f"sqlglot=={sqlglot.__version__}")

Python: 3.10.9 (main, Mar  1 2023, 12:33:47) [Clang 14.0.6 ]
bigframes==1.5.0
google-cloud-bigquery==3.22.0
ibis==8.0.0
pandas==1.5.3
pyarrow==12.0.1
sqlglot==20.11.0

Steps to reproduce

  1. Run https://github.com/googleapis/python-bigquery-dataframes/blob/main/notebooks/generative_ai/large_language_models.ipynb
  2. Cell 5 fails with AttributeError: module 'numpy' has no attribute 'dtypes'

Code example

import pandas as pd
import bigframes.pandas

df = pd.DataFrame(
        {
            "prompt": ["What is BigQuery?", "What is BQML?", "What is BigQuery DataFrame?"],
        })
bf_df = bigframes.pandas.read_pandas(df)

Stack trace

# example

wazi55 avatar May 14 '24 14:05 wazi55

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[5], line 5
      1 df = pd.DataFrame(
      2         {
      3             "prompt": ["What is BigQuery?", "What is BQML?", "What is BigQuery DataFrame?"],
      4         })
----> 5 bf_df = bigframes.pandas.read_pandas(df)

File /usr/local/anaconda3/lib/python3.10/site-packages/bigframes/pandas/__init__.py:604, in read_pandas(pandas_dataframe)
    603 def read_pandas(pandas_dataframe: Union[pandas.DataFrame, pandas.Series, pandas.Index]):
--> 604     return global_session.with_default_session(
    605         bigframes.session.Session.read_pandas,
    606         pandas_dataframe,
    607     )

File /usr/local/anaconda3/lib/python3.10/site-packages/bigframes/core/global_session.py:113, in with_default_session(func, *args, **kwargs)
    112 def with_default_session(func: Callable[..., _T], *args, **kwargs) -> _T:
--> 113     return func(get_global_session(), *args, **kwargs)

File /usr/local/anaconda3/lib/python3.10/site-packages/bigframes/session/__init__.py:974, in Session.read_pandas(self, pandas_dataframe)
    970     return self._read_pandas(
    971         pandas.DataFrame(index=pandas_dataframe), "read_pandas"
    972     ).index
    973 if isinstance(pandas_dataframe, pandas.DataFrame):
...
    309     return Tester
--> 311 raise AttributeError("module {!r} has no attribute "
    312                      "{!r}".format(__name__, attr))

AttributeError: module 'numpy' has no attribute 'dtypes'

wazi55 avatar May 14 '24 14:05 wazi55

Hello,

It seems that the issue might be related to specific environmental factors since setting up an identical package version environment did not reproduce the issue. To better diagnose and potentially resolve the problem, could you please provide additional information?

  1. Complete Traceback: The provided traceback appears to be truncated, missing some crucial details. A full traceback would be very helpful as it contains the complete sequence of calls that led to the error, including all files and line numbers involved.

  2. NumPy Version: Knowing the exact version of NumPy you are using could be useful, since this error is related to NumPy.

These details will greatly aid in pinpointing the root cause of the issue. Thank you for your cooperation!
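
For anyone gathering the details requested above, a minimal sketch along these lines may help. The helper name `numpy_report` is made up for illustration; it only reports which NumPy the interpreter actually imports and whether that build exposes `numpy.dtypes`.

```python
import importlib


def numpy_report():
    """Return the imported NumPy's version, file location, and whether
    numpy.dtypes exists. Returns None if NumPy cannot be imported."""
    try:
        np = importlib.import_module("numpy")
    except ImportError:
        return None
    return {
        "version": np.__version__,
        "location": np.__file__,  # shows which site-packages copy is in use
        "has_dtypes": hasattr(np, "dtypes"),
    }


print(numpy_report())
```

Running this in the same notebook kernel that raised the error matters, since a kernel can point at a different environment than the shell where `pip show` was run.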

Genesis929 avatar May 15 '24 05:05 Genesis929

Hi @wazi55, are you able to provide these details? In addition, it would help if you could share which operating system you are running. Thanks.

shobsi avatar Aug 19 '24 18:08 shobsi

Hey, that was resolved a while back! We can close this ticket.

wazi55 avatar Aug 19 '24 18:08 wazi55

That's great to know, @wazi55! We would appreciate it if you could post the resolution steps here, to help anyone who runs into this issue in the future. Thank you!

shobsi avatar Aug 19 '24 18:08 shobsi

I'm able to reproduce this issue after updating my local test environment on Python 3.9. It doesn't appear to affect fresh environments. The installed packages:

$ pip freeze
aiohttp==3.8.5
aiosignal==1.3.1
anyio==3.7.1
argon2-cffi==21.3.0
argon2-cffi-bindings==21.2.0
arrow==1.2.3
asttokens==2.2.1
async-lru==2.0.4
async-timeout==4.0.3
atpublic==3.1.1
attrs==22.2.0
Babel==2.12.1
backcall==0.2.0
beautifulsoup4==4.12.2
bidict==0.22.1
-e git+ssh://git@github.com/googleapis/python-bigquery-dataframes.git@adfaddcc0fd9f495368e56eead4c1983d1cdf434#egg=bigframes&subdirectory=../../../bigframes-2
bleach==6.0.0
cachetools==5.3.0
certifi==2022.12.7
cffi==1.15.1
charset-normalizer==2.0.12
click==8.1.3
click-plugins==1.1.1
cligj==0.7.2
cloudpickle==2.0.0
comm==0.1.4
contourpy==1.2.1
coverage==7.2.2
cycler==0.12.1
db-dtypes==1.1.1
debugpy==1.6.7.post1
decorator==5.1.1
defusedxml==0.7.1
entrypoints==0.4
et-xmlfile==1.1.0
exceptiongroup==1.1.1
execnet==1.9.0
executing==1.2.0
fastjsonschema==2.18.0
filelock==3.10.7
Fiona==1.9.4.post1
fonttools==4.53.1
fqdn==1.5.1
frozenlist==1.4.0
fsspec==2023.3.0
gcsfs==2023.3.0
geopandas==0.12.2
google-api-core==2.19.1
google-auth==2.15.0
google-auth-oauthlib==1.0.0
google-cloud-bigquery==3.16.0
google-cloud-bigquery-connection==1.12.0
google-cloud-bigquery-storage==2.19.1
google-cloud-bigtable==2.24.0
google-cloud-core==2.3.2
google-cloud-functions==1.12.0
google-cloud-iam==2.12.1
google-cloud-pubsub==2.21.4
google-cloud-resource-manager==1.10.3
google-cloud-storage==2.0.0
google-cloud-testutils==1.3.3
google-crc32c==1.5.0
google-resumable-media==2.4.1
googleapis-common-protos==1.59.0
greenlet==2.0.2
grpc-google-iam-v1==0.12.6
grpcio==1.53.0
grpcio-status==1.48.2
humanize==4.6.0
ibis-framework==8.0.0
idna==3.4
importlib-metadata==6.1.0
importlib_resources==6.4.4
iniconfig==2.0.0
ipykernel==6.25.1
ipython==8.14.0
ipython-genutils==0.2.0
ipywidgets==7.7.1
isoduration==20.11.0
jedi==0.19.0
jellyfish==0.8.9
Jinja2==3.1.2
joblib==1.3.2
json5==0.9.14
jsonpointer==2.4
jsonschema==4.19.0
jsonschema-specifications==2023.7.1
jupyter-events==0.7.0
jupyter-lsp==2.2.0
jupyter_client==7.4.9
jupyter_core==5.3.1
jupyter_server==2.7.0
jupyter_server_terminals==0.4.4
jupyterlab==4.0.4
jupyterlab-pygments==0.2.2
jupyterlab-widgets==3.0.8
jupyterlab_server==2.24.0
kiwisolver==1.4.5
markdown-it-py==2.2.0
MarkupSafe==2.1.2
matplotlib==3.7.1
matplotlib-inline==0.1.6
mdurl==0.1.2
mistune==3.0.1
mock==5.0.1
multidict==6.0.4
multipledispatch==0.6.0
nbclassic==1.1.0
nbclient==0.8.0
nbconvert==7.7.3
nbformat==5.9.2
nest-asyncio==1.5.7
notebook==6.5.7
notebook_shim==0.2.3
numpy==1.24.2
oauthlib==3.2.2
openpyxl==3.1.2
overrides==7.4.0
packaging==23.0
pandas==1.5.0
pandas-gbq==0.19.0
pandocfilters==1.5.0
parso==0.8.3
parsy==2.1
pexpect==4.8.0
pickleshare==0.7.5
pillow==10.4.0
platformdirs==3.2.0
pluggy==1.0.0
pooch==1.7.0
prometheus-client==0.17.1
prompt-toolkit==3.0.39
proto-plus==1.24.0
protobuf==3.20.3
psutil==5.9.5
ptyprocess==0.7.0
pure-eval==0.2.2
pyarrow==8.0.0
pyarrow-hotfix==0.6
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycparser==2.21
pydata-google-auth==1.8.2
Pygments==2.14.0
pyparsing==3.1.4
pyproj==3.6.0
pytest==7.2.2
pytest-cov==4.0.0
pytest-retry==1.1.0
pytest-timeout==2.1.0
pytest-xdist==3.2.1
python-dateutil==2.8.2
python-json-logger==2.0.7
pytz==2023.3
PyYAML==6.0
pyzmq==25.1.1
referencing==0.30.2
requests==2.27.1
requests-oauthlib==1.3.1
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rich==13.3.3
rpds-py==0.9.2
rsa==4.9
scikit-learn==1.2.2
scipy==1.11.1
Send2Trash==1.8.2
shapely==2.0.1
six==1.16.0
sniffio==1.3.0
soupsieve==2.4.1
SQLAlchemy==1.4.0
sqlglot==20.8.0
stack-data==0.6.2
tabulate==0.9.0
terminado==0.17.1
threadpoolctl==3.2.0
tinycss2==1.2.1
tomli==2.0.1
toolz==0.12.0
tornado==6.3.3
tqdm==4.65.0
traitlets==5.9.0
typing_extensions==4.5.0
uri-template==1.3.0
urllib3==1.26.15
wcwidth==0.2.6
webcolors==1.13
webencodings==0.5.1
websocket-client==1.6.1
widgetsnbextension==3.6.5
xarray==2023.7.0
xxhash==3.2.0
yarl==1.9.2
zipp==3.15.0
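
The list above pins numpy==1.24.2, and the `numpy.dtypes` module was only added in NumPy 1.25. As a rough sketch (the helpers `parse_pins` and `version_tuple` are hypothetical, and the sample is a three-line excerpt of the list above), a freeze listing can be checked for this mismatch like so:

```python
def parse_pins(freeze_text):
    """Map package name -> version for 'name==version' lines; skip editable
    installs, comments, and shell prompts."""
    pins = {}
    for line in freeze_text.splitlines():
        line = line.strip()
        if "==" in line and not line.startswith(("-e", "#", "$")):
            name, _, version = line.partition("==")
            pins[name.lower()] = version
    return pins


def version_tuple(version):
    """Turn '1.24.2' into (1, 24, 2); only plain numeric segments are kept."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


sample = """numpy==1.24.2
pandas==1.5.0
pyarrow==8.0.0"""

pins = parse_pins(sample)
# numpy.dtypes is available from NumPy 1.25 onward.
has_numpy_dtypes = version_tuple(pins["numpy"]) >= (1, 25)
print(pins["numpy"], has_numpy_dtypes)  # prints: 1.24.2 False
```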

Failure:

$ py.test --quiet -n=20 --timeout=900 --durations=20 --junitxml=system_3.9_sponge_log.xml --cov=bigframes --cov=tests/system/small --cov-append --cov-config=.coveragerc --cov-report=term-missing --cov-fail-under=0 tests/system/small -x
bringing up nodes...
======================================================================= ERRORS ========================================================================
________________________________________________ ERROR collecting tests/system/small/test_dataframe.py ________________________________________________
tests/system/small/test_dataframe.py:2169: in <module>
    (bf_indexes.Index([1000, 2000, 3000])),
bigframes/core/indexes/base.py:89: in __new__
    block = df.DataFrame(pd_df, session=session)._block
bigframes/core/log_adapter.py:56: in wrapper
    return method(*args, **kwargs)
bigframes/dataframe.py:120: in __init__
    if dtype in {numpy.dtypes.ObjectDType, "object"}:
.nox/system-3-9/lib/python3.9/site-packages/numpy/__init__.py:320: in __getattr__
    raise AttributeError("module {!r} has no attribute "
E   AttributeError: module 'numpy' has no attribute 'dtypes'
...
=============================================================== short test summary info ===============================================================
ERROR tests/system/small/test_dataframe.py - AttributeError: module 'numpy' has no attribute 'dtypes'
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! xdist.dsession.Interrupted: stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
300 warnings, 1 error in 10.89s

tswast avatar Aug 27 '24 15:08 tswast

Since we are using numpy directly, we need to declare it as a dependency here: https://github.com/googleapis/python-bigquery-dataframes/blob/main/setup.py and also make sure that our minimum supported version is pinned here: https://github.com/googleapis/python-bigquery-dataframes/blob/main/testing/constraints-3.9.txt

tswast avatar Aug 27 '24 15:08 tswast

Per https://numpy.org/neps/nep-0029-deprecation_policy.html, we should support at least numpy 1.24.x+
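
Supporting numpy 1.24.x means the check seen in the traceback (`dtype in {numpy.dtypes.ObjectDType, "object"}`) cannot rely on `numpy.dtypes`, which first appeared in NumPy 1.25. One possible way to keep the same membership test on older releases is sketched below. This is an illustration only, not necessarily the change made in bigframes, and `is_object_dtype_request` is a hypothetical name.

```python
import numpy

# Assumption: numpy.dtypes exists only in NumPy 1.25+. On older releases,
# derive the object DType class from a dtype instance instead.
try:
    ObjectDType = numpy.dtypes.ObjectDType
except AttributeError:
    ObjectDType = type(numpy.dtype("O"))


def is_object_dtype_request(dtype) -> bool:
    """Hypothetical helper mirroring the membership test from the traceback."""
    return dtype in {ObjectDType, "object"}


print(is_object_dtype_request("object"))
print(is_object_dtype_request("int64"))
```

The fallback works because the class of `numpy.dtype("O")` is the same object DType class that newer releases expose as `numpy.dtypes.ObjectDType`.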

tswast avatar Aug 27 '24 15:08 tswast