dbt-fal
Successfully installed catboost wants to import as _catboost #40
Describe the bug
Attempt to import catboost results in error that module _catboost
cannot be found -- a leading underscore is picked up somewhere.
Your environment
- OS: Xubuntu 20.04
- Package Versions:
dbt:1.5.9
fal:1.5.4
- Adapter being used:
postgres:1.5.9
How to reproduce
I'm trying to use a model trained outside dbt to predict labels via Python under dbt-fal.
fal-project.yml:
environments:
  - name: ml
    type: venv
    requirements:
      - scipy
      - pandas
      - numpy
      - statsmodels
      - catboost
catboost was just added to this list; other models that use the other libraries listed work well. The first run of the model below produced a long installation log on stdout, ending with
[builder] [info] Successfully installed [...] catboost-1.2.2 [...]
Running the Python model below with dbt run --select ... gives me the error that follows.
from catboost import CatBoostRegressor
from pandas import concat
def model(dbt, fal):
    dbt.config(fal_environment="ml")
    df: pandas.DataFrame = dbt.ref("tr_rep_gentrification_prediction_inputs")
    X = df\
        .drop(['col0', 'col1', 'col2'], axis=1)\
        .fillna(0.0)
    catb = CatBoostRegressor()
    catb.load_model('cb_model.cbm')
    pred = catb.predict(X)
    results = concat([df, pred], axis=0)
    return(results)
stdout:
No module named '_catboost'
22:55:01 1 of 1 ERROR creating python table model trans.tr_rep_gentrification_prediction_outputs [ERROR in 42.02s]
22:55:02
22:55:02 Finished running 1 table model in 0 hours 0 minutes and 58.89 seconds (58.89s).
22:55:02
22:55:02 Completed with 1 error and 0 warnings:
22:55:02
22:55:02 No module named '_catboost'
If I remove catboost from the fal-project.yml file, I get the same error (as expected), but the leading underscore no longer appears.
I also tried as recommended by @mederka at https://github.com/fal-ai/fal/issues/40#issuecomment-1898466134 to import within the model function instead, but I get the same error.
Expected behavior
I expect catboost to be imported the same as every other library.
Actual behavior
The model fails to run owing to _catboost not being found -- a leading underscore is being added.
Screenshots
None
Additional context
Also posted Here in case there's a more generally obvious solution.
It seems that _catboost is a library that ships inside the catboost package (see the link below), so I think this is more about how catboost installs than about dbt-fal itself.
https://github.com/catboost/catboost/blob/d6172a4e4b11f485c416368461feae3f3ce98745/catboost/python-package/catboost/_catboost.pyx
Hmm. It installs fine outside of dbt-fal though.
CatBoostRegressor appears to be exported from core.py via the package-level __init__.py. I don't see why a Cython source file in the same directory would interfere?
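For context, a .pyx file is Cython source that is compiled into a binary extension module at build time, so the error may mean that the Python package is present but its compiled part cannot be located by the interpreter that runs the model. A rough diagnostic sketch, not taken from dbt-fal or catboost: describe_module is a hypothetical helper, and the three names checked at the bottom are guesses at where the compiled module might live.

```python
import importlib.util
import sys


def describe_module(name):
    """Report whether `name` can be located by the current interpreter.

    For a dotted name, find_spec imports the parent package first, so a
    broken parent package is reported here as "not found".
    """
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # ImportError: the parent package is missing or failed to import.
        # ValueError: the module is loaded but carries no usable spec.
        spec = None
    return {
        "name": name,
        "found": spec is not None,
        "origin": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    # Shows which interpreter (and therefore which environment) is in use.
    print("interpreter:", sys.executable)
    # Assumed candidate names; adjust to whatever the traceback mentions.
    for candidate in ("catboost", "catboost._catboost", "_catboost"):
        print(describe_module(candidate))
```

Running this once with the interpreter where catboost "installs fine" and once from inside the ml environment (for example by temporarily calling describe_module from model and printing the result) would show whether the two environments disagree about where, or whether, the compiled module exists.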
Can you add more details around
[builder] [info] Successfully installed [...] catboost-1.2.2 [...]
to see if we can find a hint there?
I had a look over this too, nothing jumped out at me, but I'm not an expert.
This log ended with a silly error on my part when trying to run the python model -- after fixing the obvious, I get the errors as quoted in the bug report.
Can you try to build it with a conda environment instead?
environments:
  - name: ml
    type: conda
    packages:
      - scipy
      - pandas
      - numpy
      - statsmodels
      - catboost
(
@chamini2 Noted, but I'm seriously struggling to get conda working. I've tried so many things. Should this be a no-brainer? Or does this actually give you info?
No matter what I try, I get
Could not find conda executable. If conda executable is not available by default, please point isolate to the path where conda binary is available 'ISOLATE_CONDA_HOME'.
)
You need to have conda installed to be able to use this, but I think it will make your use case work.
Yeah, I installed conda, tried setting the env var to every level of the install location, and activated it in the same shell, all with no joy. Great to hear that it sounds positive for the venv type.
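The error quoted earlier says a conda binary is looked for either by default or at the location named by ISOLATE_CONDA_HOME. A minimal sketch for checking both from the same shell, assuming (a guess from the wording of the message, not confirmed against isolate's source) that the variable should name the directory that directly contains the conda executable; locate_conda is a hypothetical helper:

```python
import os
import shutil


def locate_conda(environ=None):
    """Return (source, path) for the first conda executable found, else None.

    Looks in the directory named by ISOLATE_CONDA_HOME first (assumed
    meaning: the directory holding the binary), then falls back to PATH.
    """
    environ = os.environ if environ is None else environ
    search_order = []
    conda_home = environ.get("ISOLATE_CONDA_HOME")
    if conda_home:
        search_order.append(("ISOLATE_CONDA_HOME", conda_home))
    search_order.append(("PATH", environ.get("PATH", "")))

    for source, search_path in search_order:
        # shutil.which only returns files that exist and are executable.
        found = shutil.which("conda", path=search_path)
        if found:
            return source, found
    return None


if __name__ == "__main__":
    print(locate_conda())
```

If this prints None in the shell that launches dbt, the process cannot see a conda binary in either place, which would be consistent with the error; if it prints a path, the variable may need to point somewhere other than what this sketch assumes.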
[...], but I think will make your use case work.
Any luck here?