micromlgen
XGBoost Port Code requires access to Temp Files
As the title says, the XGBoost port code creates a temporary JSON file in APPDATA/LOCAL. This is not mentioned anywhere in the documentation. On the 3 systems I tested, the file was never generated, because the Jupyter Notebook does not have access to the APPDATA/LOCAL folder. Running with admin rights or trusting the notebook made no difference: the file still could not be created.
This is the type of error generated:
XGBoostError: [14:36:23] C:\Users\Administrator\workspace\xgboost-win64_release_1.0.0\dmlc-core\src\io\local_filesys.cc:209: Check failed: allow_null: LocalFileSystem::Open "C:\Users\ZW\AppData\Local\Temp\tmp_mu9qwkg": Permission denied
I have checked the xgboost.py file. The original code is:
def port_xgboost(clf, tmp_file=None, **kwargs):
    if tmp_file is None:
        with NamedTemporaryFile('w+', suffix='.json', encoding='utf-8') as tmp:
            clf.save_model(tmp.name)
            tmp.seek(0)
            decoded = json.load(tmp)
    else:
        clf.save_model(tmp_file)
        with open(tmp_file, encoding='utf-8') as file:
            decoded = json.load(file)
    trees = [format_tree(tree) for tree in decoded['learner']['gradient_booster']['model']['trees']]
    return jinja('xgboost/xgboost.jinja', {
        'n_classes': int(decoded['learner']['learner_model_param']['num_class']),
        'trees': trees,
    }, {
        'classname': 'XGBClassifier'
    }, **kwargs)
SOLUTION:
Remove the None default from the signature:
def port_xgboost(clf, tmp_file=None, **kwargs):
The user then has to pass tmp_file explicitly. They can pass None if they prefer a temp file in APPDATA/LOCAL (and it works on their system), or they can pass the full path of a file ending in .json in a directory of their choice:
print(port(xgb, tmp_file="C:\\Users\\*username*\\Desktop\\test.json"))
They can also use the same code shown in the DecisionTree/RandomForest example to create a .h file:
with open('XGBoostClassifier.h', 'w') as file:
    file.write(port(xgb, tmp_file="C:\\Users\\*username*\\Desktop\\test.json"))
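If keeping the default is preferred, here is a rough sketch of an alternative. It rests on an assumption I have not verified against this error: on Windows, a NamedTemporaryFile generally cannot be opened a second time by name while it is still open, which could explain why save_model fails. The sketch writes into a temporary directory instead, so no handle is held open while the model is saved. The helper name dump_model_json and its directory parameter are hypothetical and not part of micromlgen; the only thing it expects from clf is a save_model(path) method.

```python
import json
import os
import tempfile


def dump_model_json(clf, directory=None):
    """Save clf to a temporary .json file and return the parsed JSON.

    Hypothetical helper, not part of micromlgen. Assumes clf exposes
    save_model(path) and writes JSON when the path ends in .json.
    """
    # Use a temporary directory instead of a NamedTemporaryFile, so that
    # no open handle exists on the file while save_model writes to it.
    # `directory` lets the caller choose a writable parent folder;
    # None falls back to the system default temp location.
    with tempfile.TemporaryDirectory(dir=directory) as tmp_dir:
        path = os.path.join(tmp_dir, 'model.json')
        clf.save_model(path)
        with open(path, encoding='utf-8') as file:
            return json.load(file)
```

Inside port_xgboost, the tmp_file is None branch could then call something like decoded = dump_model_json(clf), while the explicit tmp_file branch stays as it is.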
Please update the library and document both the temp file behavior and the option to specify a location.
Furthermore, please list all classes in the documentation, so that users know exactly how to use the namespace:
Example given: Eloquent::ML::Port::RandomForestRegressor regressor;
Correct namespace call for other ML types:
Eloquent::ML::Port::SVM name_to_be_used_in_code;
Eloquent::ML::Port::OneClassSVM name_to_be_used_in_code;
Eloquent::ML::Port::SEFR name_to_be_used_in_code;
Eloquent::ML::Port::DecisionTreeClassifier name_to_be_used_in_code;
Eloquent::ML::Port::DecisionTreeRegressor name_to_be_used_in_code;
Eloquent::ML::Port::RandomForestClassifier name_to_be_used_in_code;
Eloquent::ML::Port::GaussianNB name_to_be_used_in_code;
Eloquent::ML::Port::LogisticRegression name_to_be_used_in_code;
Eloquent::ML::Port::PCA name_to_be_used_in_code;
Eloquent::ML::Port::PrincipalFFT name_to_be_used_in_code;
Eloquent::ML::Port::LinearRegression name_to_be_used_in_code;
Eloquent::ML::Port::XGBClassifier name_to_be_used_in_code;
Thank you and take care!