Transformers4Rec
0.1.16 version creates import errors
- The following import raises a ModuleNotFoundError, because utils.data_utils does not exist:
  from transformers4rec.utils.data_utils import save_time_based_splits
  This can be solved by using:
  from transformers4rec.data.preprocessing import save_time_based_splits
- The following import raises TypeError: __new__() missing 1 required positional argument: 'task':
  from transformers4rec import torch as tr
  The traceback goes explicitly through:
  2.1. from .model.prediction_task import BinaryClassificationTask, NextItemPredictionTask, RegressionTask
  2.2. class BinaryClassificationTask(PredictionTask):
  2.3. and at last tm.Precision(num_classes=2),
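A minimal sketch of a version-tolerant import, assuming only the two module paths named in the first point. The helper name import_first_available is hypothetical and not part of Transformers4Rec; it tries each candidate module in order and returns the first one that exposes the requested attribute.

```python
import importlib


def import_first_available(candidates, attr):
    """Return `attr` from the first importable module in `candidates`.

    Hypothetical helper for illustration; not part of Transformers4Rec.
    """
    for module_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # ModuleNotFoundError is a subclass of ImportError, so a missing
            # module (or missing parent package) is skipped here.
            continue
        if hasattr(module, attr):
            return getattr(module, attr)
    raise ImportError(f"{attr!r} not found in any of: {', '.join(candidates)}")


# Example with the two paths mentioned in this issue (newer path first):
# save_time_based_splits = import_first_available(
#     ["transformers4rec.data.preprocessing", "transformers4rec.utils.data_utils"],
#     "save_time_based_splits",
# )
```

This only works around the first error. It does not address the TypeError raised by from transformers4rec import torch as tr.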
@enislalmi can you please tell us about your environment? How did you install Transformers4Rec and the other Merlin libraries? With a Docker image? If yes, which Docker image?
Can you share the output of pip list? What are your torch and torchmetrics versions?
You can see that the save_time_based_splits function is here. Is your branch up to date with the main branch?
This PR (new() missing) should have already fixed the TypeError: __new__() missing 1 required positional argument: 'task' issue.
thanks.
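For anyone hitting the same errors, here is a small sketch that collects the versions asked about above without pasting a whole pip list. The package names are the ones mentioned in this thread, and installed_versions is a hypothetical helper that relies only on the standard library's importlib.metadata.

```python
from importlib import metadata

# Distribution names taken from this thread; adjust the list as needed.
PACKAGES = [
    "transformers4rec",
    "nvtabular",
    "merlin-core",
    "merlin-models",
    "torch",
    "torchmetrics",
]


def installed_versions(names):
    """Return {name: version string, or None if the package is not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name:20s} {version or 'not installed'}")
```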
Hey @rnyak, these are my pip commands:
!pip install nvtabular
!pip install -q transformers4rec[pytorch,nvtabular, dataloader]
I am not using a Docker image. I am using a virtual environment in Google Colab/Kaggle. The new version 0.1.16 gives me this error. If I try the previous version 0.1.15, I need to use nvtabular==1.3.3 to get rid of it. However, I still have issues with from transformers4rec import torch as tr. Thanks!
@enislalmi which GPU does Colab provide? You can take a look at this Merlin blog post about how to install Merlin on Colab.
If you are not using Docker, you need to install cudf and dask-cudf first. For that, you can do something like this:
pip install rmm-cu11==22.10.0 cudf-cu11==22.10.0 dask-cudf-cu11==22.10.0 --extra-index-url=https://pypi.nvidia.com/
So please start with a clean environment. Once you have installed cudf and dask-cudf, you need to install all the other libraries:
pip install nvtabular
pip install core
pip install merlin-core
pip install merlin-dataloader
pip install transformers4rec
pip install merlin-systems
Then you need to install the GPU version of torch:
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116
Please try this and let us know whether it works.
@rnyak thanks a lot. I got everything to work until cell 12 of /02-session-based-XLNet-with-PyT.ipynb, which raises the following error:
196     try:
--> 197         from nvtabular.io.dataset import Dataset
   198     except ImportError:
ModuleNotFoundError: No module named 'nvtabular.io.dataset'; 'nvtabular.io' is not a package
The Colab GPU I'm using is: NVIDIA T4 Tensor Core
My pip list is as follows:
`Package                       Version
----------------------------- --------------------
absl-py                       1.4.0
aeppl                         0.0.33
aesara                        2.7.9
aiohttp                       3.8.4
aiosignal                     1.3.1
alabaster                     0.7.13
albumentations                1.2.1
altair                        4.2.2
appdirs                       1.4.4
argon2-cffi                   21.3.0
argon2-cffi-bindings          21.2.0
arviz                         0.12.1
astor                         0.8.1
astropy                       4.3.1
astunparse                    1.6.3
async-timeout                 4.0.2
atomicwrites                  1.4.1
attrs                         22.2.0
audioread                     3.0.0
autograd                      1.5
Babel                         2.12.1
backcall                      0.2.0
backports.zoneinfo            0.2.1
beautifulsoup4                4.6.3
betterproto                   1.2.5
bleach                        6.0.0
blis                          0.7.9
bokeh                         2.4.3
branca                        0.6.0
bs4                           0.0.1
CacheControl                  0.12.11
cachetools                    5.3.0
catalogue                     2.0.8
certifi                       2022.12.7
cffi                          1.15.1
cftime                        1.6.2
chardet                       4.0.0
charset-normalizer            3.0.1
click                         8.1.3
clikit                        0.6.2
cloudpickle                   2.2.1
cmake                         3.22.6
cmdstanpy                     1.1.0
colorcet                      3.0.1
colorlover                    0.3.0
community                     1.0.0b1
confection                    0.0.4
cons                          0.4.5
contextlib2                   0.5.5
convertdate                   2.4.0
crashtest                     0.3.1
crcmod                        1.7
cubinlinker-cu11              0.3.0.post1
cuda-python                   11.7.0
cudf-cu11                     22.10.0
cufflinks                     0.17.3
cupy-cuda115                  10.6.0
cupy-cuda11x                  11.0.0
cvxopt                        1.3.0
cvxpy                         1.2.3
cycler                        0.11.0
cymem                         2.0.7
Cython                        0.29.33
dask                          2022.9.2
dask-cudf-cu11                22.10.0
datascience                   0.17.6
db-dtypes                     1.0.5
dbus-python                   1.2.16
debugpy                       1.6.4
decorator                     4.4.2
defusedxml                    0.7.1
distributed                   2022.9.2
dlib                          19.24.0
dm-tree                       0.1.8
dnspython                     2.3.0
docutils                      0.16
dopamine-rl                   1.0.5
earthengine-api               0.1.342
easydict                      1.10
ecos                          2.0.12
editdistance                  0.5.3
en-core-web-sm                3.4.1
entrypoints                   0.4
ephem                         4.1.4
et-xmlfile                    1.1.0
etils                         1.0.0
etuples                       0.3.8
fa2                           0.3.5
fastai                        2.7.11
fastcore                      1.5.28
fastdownload                  0.0.7
fastdtw                       0.3.4
fastjsonschema                2.16.3
fastprogress                  1.0.3
fastrlock                     0.8.1
feather-format                0.4.1
filelock                      3.9.0
firebase-admin                5.3.0
fix-yahoo-finance             0.0.22
Flask                         2.2.3
flatbuffers                   23.1.21
folium                        0.12.1.post1
fonttools                     4.38.0
frozenlist                    1.3.3
fsspec                        2022.5.0
future                        0.16.0
gast                          0.4.0
GDAL                          3.3.2
gdown                         4.4.0
gensim                        3.6.0
geographiclib                 1.52
geopy                         1.17.0
gin-config                    0.5.0
glob2                         0.7
google                        2.0.3
google-api-core               2.11.0
google-api-python-client      2.70.0
google-auth                   2.16.1
google-auth-httplib2          0.1.0
google-auth-oauthlib          0.4.6
google-cloud-bigquery         3.4.2
google-cloud-bigquery-storage 2.18.1
google-cloud-core             2.3.2
google-cloud-datastore        2.11.1
google-cloud-firestore        2.7.3
google-cloud-language         2.6.1
google-cloud-storage          2.7.0
google-cloud-translate        3.8.4
google-colab                  1.0.0
google-crc32c                 1.5.0
google-pasta                  0.2.0
google-resumable-media        2.4.1
googleapis-common-protos      1.58.0
googledrivedownloader         0.4
graphviz                      0.10.1
greenlet                      2.0.2
grpcio                        1.51.3
grpcio-status                 1.48.2
grpclib                       0.4.3
gspread                       3.4.2
gspread-dataframe             3.0.8
gym                           0.25.2
gym-notices                   0.0.8
h2                            4.1.0
h5py                          3.1.0
HeapDict                      1.0.1
hijri-converter               2.2.4
holidays                      0.20
holoviews                     1.14.9
hpack                         4.0.0
html5lib                      1.0.1
httpimport                    0.5.18
httplib2                      0.17.4
httpstan                      4.6.1
huggingface-hub               0.12.1
humanize                      0.5.1
hyperframe                    6.0.1
hyperopt                      0.1.2
idna                          2.10
imageio                       2.9.0
imagesize                     1.4.1
imbalanced-learn              0.8.1
imblearn                      0.0
imgaug                        0.4.0
implicit                      0.6.2
importlib-metadata            6.0.0
importlib-resources           5.12.0
imutils                       0.5.4
inflect                       2.1.0
intel-openmp                  2023.0.0
ipykernel                     5.3.4
ipython                       7.9.0
ipython-genutils              0.2.0
ipython-sql                   0.3.9
ipywidgets                    7.7.1
itsdangerous                  2.1.2
jax                           0.4.4
jaxlib                        0.4.4+cuda11.cudnn82
jieba                         0.42.1
Jinja2                        3.1.2
joblib                        1.2.0
jsonschema                    4.3.3
jupyter-client                6.1.12
jupyter-console               6.1.0
jupyter_core                  5.2.0
jupyterlab-pygments           0.2.2
jupyterlab-widgets            3.0.5
kaggle                        1.5.12
keras                         2.11.0
keras-vis                     0.4.1
kiwisolver                    1.4.4
korean-lunar-calendar         0.3.1
langcodes                     3.3.0
libclang                      15.0.6.1
librosa                       0.8.1
lightfm                       1.16
lightgbm                      2.2.3
llvmlite                      0.39.1
lmdb                          0.99
locket                        1.0.0
logical-unification           0.4.5
LunarCalendar                 0.0.9
lxml                          4.9.2
Markdown                      3.4.1
MarkupSafe                    2.1.2
marshmallow                   3.19.0
matplotlib                    3.5.3
matplotlib-venn               0.11.9
merlin-core                   0.9.0+51.g2767618
merlin-dataloader             0.0.2+41.gdbf8816
merlin-models                 0.11.0
miniKanren                    1.0.3
missingno                     0.5.2
mistune                       0.8.4
mizani                        0.8.1
mkl                           2019.0
mlxtend                       0.14.0
more-itertools                9.1.0
moviepy                       0.2.3.5
mpmath                        1.2.1
msgpack                       1.0.4
multidict                     6.0.4
multipledispatch              0.6.0
multitasking                  0.0.11
murmurhash                    1.0.9
music21                       5.5.0
natsort                       5.5.0
nbclient                      0.7.2
nbconvert                     6.5.4
nbformat                      5.7.3
netCDF4                       1.6.2
networkx                      3.0
nibabel                       3.0.2
nltk                          3.7
notebook                      6.3.0
numba                         0.56.4
numexpr                       2.8.4
numpy                         1.22.4
nvtabular                     1.6.0+42.g9b186ee9
nvtx                          0.2.5
oauth2client                  4.1.3
oauthlib                      3.2.2
opencv-contrib-python         4.6.0.66
opencv-python                 4.6.0.66
opencv-python-headless        4.7.0.72
openpyxl                      3.0.10
opt-einsum                    3.3.0
osqp                          0.6.2.post0
packaging                     23.0
palettable                    3.3.0
pandas                        1.3.5
pandas-datareader             0.9.0
pandas-gbq                    0.17.9
pandas-profiling              1.4.1
pandocfilters                 1.5.0
panel                         0.14.3
param                         1.12.3
parso                         0.8.3
partd                         1.3.0
pastel                        0.2.1
pathlib                       1.0.1
pathy                         0.10.1
patsy                         0.5.3
pep517                        0.13.0
pexpect                       4.8.0
pickleshare                   0.7.5
Pillow                        8.4.0
pip                           22.0.4
pip-tools                     6.6.2
platformdirs                  3.0.0
plotly                        5.5.0
plotnine                      0.10.1
pluggy                        0.7.1
pooch                         1.7.0
portpicker                    1.3.9
prefetch-generator            1.0.3
preshed                       3.0.8
prettytable                   3.6.0
progressbar2                  3.38.0
prometheus-client             0.16.0
promise                       2.3
prompt-toolkit                2.0.10
prophet                       1.1.2
proto-plus                    1.22.2
protobuf                      3.20.3
psutil                        5.4.8
psycopg2                      2.9.5
ptxcompiler-cu11              0.7.0.post1
ptyprocess                    0.7.0
py                            1.11.0
pyarrow                       9.0.0
pyasn1                        0.4.8
pyasn1-modules                0.2.8
pycocotools                   2.0.6
pycparser                     2.21
pyct                          0.5.0
pydantic                      1.10.5
pydata-google-auth            1.7.0
pydot                         1.3.0
pydot-ng                      2.0.0
pydotplus                     2.0.2
PyDrive                       1.3.1
pyerfa                        2.0.0.1
Pygments                      2.6.1
PyGObject                     3.36.0
pylev                         1.4.0
pymc                          4.1.4
PyMeeus                       0.5.12
pymongo                       4.3.3
pymystem3                     0.2.0
pynvml                        11.5.0
PyOpenGL                      3.1.6
pyparsing                     3.0.9
pyrsistent                    0.19.3
pysimdjson                    3.2.0
PySocks                       1.7.1
pystan                        3.3.0
pytest                        3.6.4
python-apt                    2.0.1
python-dateutil               2.8.2
python-louvain                0.16
python-slugify                8.0.1
python-utils                  3.5.2
pytz                          2022.7.1
pyviz-comms                   2.2.1
PyWavelets                    1.4.1
PyYAML                        6.0
pyzmq                         23.2.1
qdldl                         0.1.5.post3
qudida                        0.0.4
regex                         2022.6.2
requests                      2.25.1
requests-oauthlib             1.3.1
requests-unixsocket           0.2.0
resampy                       0.4.2
rmm-cu11                      22.10.0
rpy2                          3.5.5
rsa                           4.9
sacremoses                    0.0.53
scikit-image                  0.19.3
scikit-learn                  1.2.1
scipy                         1.10.1
screen-resolution-extra       0.0.0
scs                           3.2.2
seaborn                       0.11.2
Send2Trash                    1.8.0
setuptools                    57.4.0
shapely                       2.0.1
six                           1.15.0
sklearn-pandas                2.2.0
smart-open                    6.3.0
snowballstemmer               2.2.0
sortedcontainers              2.4.0
soundfile                     0.12.1
spacy                         3.4.4
spacy-legacy                  3.0.12
spacy-loggers                 1.0.4
Sphinx                        3.5.4
sphinxcontrib-applehelp       1.0.4
sphinxcontrib-devhelp         1.0.2
sphinxcontrib-htmlhelp        2.0.1
sphinxcontrib-jsmath          1.0.1
sphinxcontrib-qthelp          1.0.3
sphinxcontrib-serializinghtml 1.1.5
SQLAlchemy                    1.4.46
sqlparse                      0.4.3
srsly                         2.4.6
statsmodels                   0.13.5
stringcase                    1.2.0
sympy                         1.7.1
tables                        3.7.0
tabulate                      0.8.10
tblib                         1.7.0
tenacity                      8.2.2
tensorboard                   2.11.2
tensorboard-data-server       0.6.1
tensorboard-plugin-wit        1.8.1
tensorflow                    2.11.0
tensorflow-datasets           4.8.3
tensorflow-estimator          2.11.0
tensorflow-gcs-config         2.11.0
tensorflow-hub                0.12.0
tensorflow-io-gcs-filesystem  0.31.0
tensorflow-metadata           1.12.0
tensorflow-probability        0.19.0
termcolor                     2.2.0
terminado                     0.13.3
text-unidecode                1.3
textblob                      0.15.3
thinc                         8.1.7
threadpoolctl                 3.1.0
tifffile                      2023.2.27
tinycss2                      1.2.1
tokenizers                    0.12.1
toml                          0.10.2
tomli                         2.0.1
toolz                         0.12.0
torch                         1.13.1+cu116
torchaudio                    0.13.1+cu116
torchmetrics                  0.11.3
torchsummary                  1.5.1
torchtext                     0.14.1
torchvision                   0.14.1+cu116
tornado                       6.1
tqdm                          4.64.1
traitlets                     5.7.1
transformers                  4.18.0
transformers4rec              0.1.14+58.g34dbd575
tweepy                        3.10.0
typeguard                     2.7.1
typer                         0.7.0
typing_extensions             4.5.0
tzlocal                       1.5.1
uritemplate                   4.1.1
urllib3                       1.26.14
vega-datasets                 0.9.0
wasabi                        0.10.1
wcwidth                       0.2.6
webargs                       8.2.0
webencodings                  0.5.1
Werkzeug                      2.2.3
wheel                         0.38.4
widgetsnbextension            3.6.2
wordcloud                     1.8.2.2
wrapt                         1.15.0
xarray                        2022.12.0
xarray-einstats               0.5.1
xgboost                       1.7.4
xkit                          0.0.0
xlrd                          1.2.0
xlwt                          1.3.0
yarl                          1.8.2
yellowbrick                   1.5
zict                          2.2.0
zipp                          3.15.0`
@jperez999 and @nv-alaiacano I can reproduce this issue. Any idea why ModuleNotFoundError: No module named 'nvtabular.io.dataset'; 'nvtabular.io' is not a package occurs?
@enislalmi can you do pip install merlin-models as well and test again?
@rnyak the issue remains the same even after running pip install merlin-models:
Requirement already satisfied: merlin-models in /usr/local/lib/python3.8/dist-packages (0.11.0)
@enislalmi Please can you share details of how you got the notebooks into your environment?
It's important that the version of the notebooks you are running matches the version of the Transformers4Rec package installed. The last published version of Transformers4Rec was 0.1.16, with git tag v0.1.16. If using git clone, you can use --branch v0.1.16. Or, if you already have a cloned copy of the repository, you can switch to the tagged version with git checkout v0.1.16.
git clone --branch v0.1.16 git@github.com:NVIDIA-Merlin/Transformers4Rec
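As a rough way to check whether an installed build corresponds to a release tag, the sketch below splits a version string at the "+" that starts a PEP 440 local version segment. The helper names are hypothetical, and the "v" prefix is assumed from the v0.1.16 tag mentioned above. A string such as 0.1.14+58.g34dbd575 (the transformers4rec entry in the pip list earlier in this thread) has a local segment, which suggests a development build rather than a tagged release.

```python
def split_version(version):
    """Split 'base+local' into (base, local); local is '' when absent."""
    base, _, local = version.partition("+")
    return base, local


def matching_git_tag(version):
    """Return the assumed release tag ('v' + version), or None for dev builds.

    Assumption: release tags follow the 'v0.1.16' pattern quoted in this thread.
    """
    base, local = split_version(version)
    if local:
        # A local segment such as '58.g34dbd575' means this is not a plain release.
        return None
    return f"v{base}"


if __name__ == "__main__":
    for version in ("0.1.16", "0.1.14+58.g34dbd575"):
        print(version, "->", matching_git_tag(version))
```

If the second helper returns None for the installed transformers4rec version, the notebooks from a release tag may not match the installed code.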
I think the error is related to a mismatch of versions between nvtabular, transformers4rec and merlin models.
196     try:
--> 197         from nvtabular.io.dataset import Dataset
   198     except ImportError:
ModuleNotFoundError: No module named 'nvtabular.io.dataset'; 'nvtabular.io' is not a package
I had the same issue, and installing merlin-models from source (pulling the latest main branch) solved it for me. I think the issue is that we deprecated io.dataset in nvtabular.
This commit fixes the import errors: https://github.com/NVIDIA-Merlin/models/commit/47e195202237f4819783c2857266b00cd8d89bd6
Hey @bschifferer, @oliverholworthy! I am using Transformers4Rec 0.1.16. @bschifferer is right, the main reason it doesn't work is that there is a version mismatch. However, the from nvtabular.io.dataset import Dataset doesn't solve it for me. I cannot use Docker, as I am doing this in Colab.
@enislalmi the error about nvtabular.io.dataset will only show up if running the development version of Transformers4Rec from the development (main) branch. It sounds like your environment is picking up that version of the code instead of the tagged version 0.1.16. Can you share more details on how you install Transformers4Rec into your Google Colab Environment?
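One way to see which copy of a package Python is actually picking up is to ask importlib where it would load it from. This is a minimal sketch using only the standard library; module_origin and is_package are hypothetical helper names. The second helper relates to the "'nvtabular.io' is not a package" part of the error: a plain module (a single .py file) cannot contain submodules such as nvtabular.io.dataset.

```python
import importlib.util


def _find_spec(name):
    """Return the module spec for `name`, or None if it cannot be located."""
    try:
        # For dotted names, find_spec imports the parent package first,
        # which raises ModuleNotFoundError if the parent is missing.
        return importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None


def module_origin(name):
    """Return the file path a module would be loaded from, or None."""
    spec = _find_spec(name)
    return spec.origin if spec is not None else None


def is_package(name):
    """True if `name` resolves to a package (something that can hold submodules)."""
    spec = _find_spec(name)
    return spec is not None and spec.submodule_search_locations is not None


if __name__ == "__main__":
    for name in ("transformers4rec", "nvtabular", "nvtabular.io"):
        print(name, module_origin(name), is_package(name))
```

If the printed path for transformers4rec points into a cloned repository rather than site-packages or dist-packages, the environment is probably using a source checkout instead of the released package.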