gcsfs
"TypeError: from_buffer() cannot return the address of the raw string within a bytes or unicode object" when creating a GCSFileSystem object
What happened:
When I tried to load a GCSFileSystem object, I got the exception TypeError: from_buffer() cannot return the address of the raw string within a bytes or unicode object.
What you expected to happen:
I expect the GCSFileSystem object to instantiate and be usable. This has worked in the past. To my knowledge, nothing relevant on my machine has changed since it did. This works on a different machine using the same token file.
Minimal Complete Verifiable Example:
from gcsfs import GCSFileSystem
fs = GCSFileSystem(token="my-token.json")
I get the following exception.
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/opt/anaconda3/envs/deepnlp/lib/python3.6/site-packages/gcsfs-0.7.1-py3.6.egg/gcsfs/core.py in _dict_to_credentials(self, token)
    326             token = service_account.Credentials.from_service_account_info(
--> 327                 token, scopes=[self.scope]
    328             )
/opt/anaconda3/envs/deepnlp/lib/python3.6/site-packages/google/oauth2/service_account.py in from_service_account_info(cls, info, **kwargs)
    210         signer = _service_account_info.from_dict(
--> 211             info, require=["client_email", "token_uri"]
    212         )
/opt/anaconda3/envs/deepnlp/lib/python3.6/site-packages/google/auth/_service_account_info.py in from_dict(data, require)
     54     # Create a signer.
---> 55     signer = crypt.RSASigner.from_service_account_info(data)
     56 
/opt/anaconda3/envs/deepnlp/lib/python3.6/site-packages/google/auth/crypt/base.py in from_service_account_info(cls, info)
    113         return cls.from_string(
--> 114             info[_JSON_FILE_PRIVATE_KEY], info.get(_JSON_FILE_PRIVATE_KEY_ID)
    115         )
/opt/anaconda3/envs/deepnlp/lib/python3.6/site-packages/google/auth/crypt/_cryptography_rsa.py in from_string(cls, key, key_id)
    133         private_key = serialization.load_pem_private_key(
--> 134             key, password=None, backend=_BACKEND
    135         )
/opt/anaconda3/envs/deepnlp/lib/python3.6/site-packages/cryptography-3.1-py3.6-macosx-10.9-x86_64.egg/cryptography/hazmat/primitives/serialization/base.py in load_pem_private_key(data, password, backend)
     17     backend = _get_backend(backend)
---> 18     return backend.load_pem_private_key(data, password)
     19 
/opt/anaconda3/envs/deepnlp/lib/python3.6/site-packages/cryptography-3.1-py3.6-macosx-10.9-x86_64.egg/cryptography/hazmat/backends/openssl/backend.py in load_pem_private_key(self, data, password)
   1248             data,
-> 1249             password,
   1250         )
/opt/anaconda3/envs/deepnlp/lib/python3.6/site-packages/cryptography-3.1-py3.6-macosx-10.9-x86_64.egg/cryptography/hazmat/backends/openssl/backend.py in _load_key(self, openssl_read_func, convert_func, data, password)
   1440     def _load_key(self, openssl_read_func, convert_func, data, password):
-> 1441         mem_bio = self._bytes_to_bio(data)
   1442 
/opt/anaconda3/envs/deepnlp/lib/python3.6/site-packages/cryptography-3.1-py3.6-macosx-10.9-x86_64.egg/cryptography/hazmat/backends/openssl/backend.py in _bytes_to_bio(self, data)
    663         """
--> 664         data_ptr = self._ffi.from_buffer(data)
    665         bio = self._lib.BIO_new_mem_buf(data_ptr, len(data))
TypeError: from_buffer() cannot return the address of the raw string within a bytes or unicode object
my-token.json contains a dictionary with keys like type, project_id, private_key, etc.
When I step into the code I see that the value of data in _bytes_to_bio is of type bytes and is equal to the private_key value in my JSON token file.
Anything else we need to know?:
I see the same bug if I pass a dictionary to the GCSFileSystem constructor instead of a path to a token file.
The failing function is called by GCSFileSystem._dict_to_credentials inside a try/except block. When the try body fails, the except block tries to read the credential information directly from the token. For me, the except block then fails with a KeyError because my token does not contain a refresh_token field. So other people may be hitting this bug without realizing it, because their tokens do contain a refresh_token field.
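To illustrate the masking effect described above, here is a minimal sketch of the control flow. It is not the gcsfs source: `load_service_account` and `dict_to_credentials` are hypothetical stand-ins, and the first one simply raises the TypeError from my traceback instead of calling into google-auth.

```python
def load_service_account(token):
    """Hypothetical stand-in for the service-account path that fails in my environment."""
    raise TypeError(
        "from_buffer() cannot return the address of the raw string "
        "within a bytes or unicode object"
    )


def dict_to_credentials(token):
    """Simplified stand-in for the try/except pattern described above."""
    try:
        return load_service_account(token)
    except Exception:
        # Fallback: read user-credential fields directly from the token.
        # A service-account token has no "refresh_token", so this raises
        # KeyError; the original TypeError is kept as the chained context.
        return {
            "refresh_token": token["refresh_token"],
            "client_id": token["client_id"],
            "client_secret": token["client_secret"],
        }


if __name__ == "__main__":
    service_account_token = {"type": "service_account", "private_key": "..."}
    try:
        dict_to_credentials(service_account_token)
    except KeyError as err:
        print("KeyError:", err)
        print("original cause:", repr(err.__context__))
```

With a token that does contain refresh_token, client_id and client_secret, the fallback in this sketch succeeds and the TypeError is never surfaced, which is why I suspect others may not notice the problem.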
Environment:
- Dask version: dask-glm==0.2.0, dask-ml==1.7.0, gcsfs==0.7.1, argon2-cffi==20.1.0, cffi==1.7.0, cryptography==3.1
- Python version: 3.6.12
- Operating System: OS X, 10.15.7
- Install method (conda, pip, source): conda
 
This has worked in the past. To my knowledge, nothing relevant on my machine has changed since it did.
Do you know which version this was? That particular piece of the code hasn't changed. I would also appreciate it if, when you have the time, you would try against the current main branch. I'm not sure how much we can do about something that is apparently happening deep within the cryptography package.
I put all the version numbers I thought were relevant at the bottom of the original report. Is there another one you need?
I'll try installing from master when I get a chance.
You said that the same thing used to work with gcsfs, so I mean the gcsfs version for which that was true. Also the versions of the Google packages: google-auth, google-auth-oauthlib, google-api-core, google-api-python-client.
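In case it helps, one way to collect those versions in a single step is sketched below. It assumes `importlib.metadata` is available (Python 3.8+); on an older interpreter such as 3.6 it falls back to the `importlib_metadata` backport, which would need to be installed separately. The `PACKAGES` list and the `installed_versions` helper are illustrative names, not part of gcsfs.

```python
try:
    from importlib import metadata  # Python 3.8+
except ImportError:
    # Assumption: the importlib_metadata backport is installed on older Pythons.
    import importlib_metadata as metadata

# Packages mentioned in this thread; extend the list as needed.
PACKAGES = [
    "gcsfs",
    "google-auth",
    "google-auth-oauthlib",
    "google-api-core",
    "google-api-python-client",
    "cryptography",
    "cffi",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}=={version}" if version else f"{name}: not installed")
```

Running this in both the working and the failing environment and comparing the output should show which of these packages differ.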