
Failure to convert a torch model (generated by hummingbird) to torchscript

Open ShulinChen opened this issue 2 years ago • 7 comments

Context: I have a use case which requires me to first train an XGB model, convert it to a torch model through hummingbird, and later convert the torch model to torchscript. Note that converting XGB directly to a torchscript model is not an option due to my use case. Pseudo-code:

from hummingbird.ml import convert
import torch

torch_model = convert(xgb_model, 'torch', sample).model  # convert XGB -> torch through hummingbird
script_model = torch.jit.script(torch_model)  # convert torch model (generated by hummingbird) to torchscript

Stacktrace:

 script_model = torch.jit.script(model)
  File "/usr/lib/python3.6/site-packages/torch/jit/_script.py", line 1258, in script
    obj, torch.jit._recursive.infer_methods_to_compile
  File "/usr/lib/python3.6/site-packages/torch/jit/_recursive.py", line 451, in create_script_module
    return create_script_module_impl(nn_module, concrete_type, stubs_fn)
  File "/usr/lib/python3.6/site-packages/torch/jit/_recursive.py", line 513, in create_script_module_impl
    script_module = torch.jit.RecursiveScriptModule._construct(cpp_module, init_fn)
  File "/usr/lib/python3.6/site-packages/torch/jit/_script.py", line 587, in _construct
    init_fn(script_module)
  File "/usr/lib/python3.6/site-packages/torch/jit/_recursive.py", line 491, in init_fn
    scripted = create_script_module_impl(orig_value, sub_concrete_type, stubs_fn)
  File "/usr/lib/python3.6/site-packages/torch/jit/_recursive.py", line 513, in create_script_module_impl
    script_module = torch.jit.RecursiveScriptModule._construct(cpp_module, init_fn)
  File "/usr/lib/python3.6/site-packages/torch/jit/_script.py", line 587, in _construct
    init_fn(script_module)
  File "/usr/lib/python3.6/site-packages/torch/jit/_recursive.py", line 491, in init_fn
    scripted = create_script_module_impl(orig_value, sub_concrete_type, stubs_fn)
  File "/usr/lib/python3.6/site-packages/torch/jit/_recursive.py", line 513, in create_script_module_impl
    script_module = torch.jit.RecursiveScriptModule._construct(cpp_module, init_fn)
  File "/usr/lib/python3.6/site-packages/torch/jit/_script.py", line 587, in _construct
    init_fn(script_module)
  File "/usr/lib/python3.6/site-packages/torch/jit/_recursive.py", line 491, in init_fn
    scripted = create_script_module_impl(orig_value, sub_concrete_type, stubs_fn)
  File "/usr/lib/python3.6/site-packages/torch/jit/_recursive.py", line 463, in create_script_module_impl
    method_stubs = stubs_fn(nn_module)
  File "/usr/lib/python3.6/site-packages/torch/jit/_recursive.py", line 732, in infer_methods_to_compile
    stubs.append(make_stub_from_method(nn_module, method))
  File "/usr/lib/python3.6/site-packages/torch/jit/_recursive.py", line 66, in make_stub_from_method
    return make_stub(func, method_name)
  File "/usr/lib/python3.6/site-packages/torch/jit/_recursive.py", line 51, in make_stub
    ast = get_jit_def(func, name, self_name="RecursiveScriptModule")
  File "/usr/lib/python3.6/site-packages/torch/jit/frontend.py", line 264, in get_jit_def
    return build_def(parsed_def.ctx, fn_def, type_line, def_name, self_name=self_name, pdt_arg_types=pdt_arg_types)
  File "/usr/lib/python3.6/site-packages/torch/jit/frontend.py", line 302, in build_def
    param_list = build_param_list(ctx, py_def.args, self_name, pdt_arg_types)
  File "/usr/lib/python3.6/site-packages/torch/jit/frontend.py", line 330, in build_param_list
    raise NotSupportedError(ctx_range, _vararg_kwarg_err)
torch.jit.frontend.NotSupportedError: Compiled functions can't take variable number of arguments or use keyword-only arguments with defaults:
  File "/usr/lib/python3.6/site-packages/hummingbird/ml/_executor.py", line 65
    def forward(self, *inputs):
                      ~~~~~~~ <--- HERE
        with torch.no_grad():
            assert len(self._input_names) == len(inputs) or (
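
If I understand the error correctly, this is TorchScript's general restriction on variadic forward signatures rather than something hummingbird-specific. A minimal reproduction outside hummingbird (illustrative sketch only):

import torch

class VarArgModule(torch.nn.Module):
    # Mimics the variadic signature of hummingbird's Executor.forward
    def forward(self, *inputs):
        return inputs[0] + 1

torch.jit.script(VarArgModule())  # raises torch.jit.frontend.NotSupportedError, same as above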

Things I have tried: instead of relying on scripting to convert the torch model to torchscript, I have also tried tracing with

script_model = torch.jit.trace(model, (sample, ))

However, this doesn't work for me because the model's inputs/outputs have to be Union[Tensor, Tuple[Tensor]] to be traceable.

ShulinChen · Oct 26 '22

Hi! Yeah, tracing should definitely work, because that is what we do internally when we generate TorchScript. We tried scripting some time ago and it was hard to make it work in general. If you know exactly how many inputs you have, maybe you can patch Hummingbird and get rid of all the variable arguments. I will think a little bit more about this and see if there is another solution.
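
For example, one way to pin the signature down for tracing (untested sketch; it assumes a single-tensor input and reuses `torch_model` and `sample` from your pseudo-code, with `sample` a numpy array):

import torch

class SingleInputWrapper(torch.nn.Module):
    # Hypothetical wrapper: exposes a fixed single-tensor forward around the
    # hummingbird-generated module, so the tracer sees a non-variadic signature.
    def __init__(self, hb_model):
        super().__init__()
        self.hb_model = hb_model

    def forward(self, x):
        return self.hb_model(x)

wrapped = SingleInputWrapper(torch_model)
example = torch.from_numpy(sample).float()   # torch.jit.trace needs tensor inputs
traced = torch.jit.trace(wrapped, (example,))

Note that torch.jit.script(wrapped) would still fail, because scripting recursively compiles the inner forward(self, *inputs); making the model scriptable would require changing that signature inside Hummingbird itself.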

interesaaat · Oct 26 '22

@interesaaat does this mean that the "torch" model generated by hummingbird is fundamentally non-scriptable?

atomic · Oct 27 '22

As it is now, probably yes. We can trace it, but not script it. I will take a look and see if we can solve this and make the models scriptable as well.

interesaaat · Oct 27 '22

Hi @interesaaat, thanks for the confirmation. Would it be possible for you to provide a successful torch->torchscript conversion example through tracing? I tried to leverage the provided hummingbird example (https://github.com/microsoft/hummingbird/blob/main/notebooks/XGB-example.ipynb), but so far I haven't had luck getting tracing to work. See the attached screenshot for how I tried to do the torch->torchscript conversion. [Screenshot attached: Screen Shot 2022-10-27 at 11 12 23 PM]

ShulinChen · Oct 28 '22

Ah, this should work out of the box if, instead of using ‘torch’ when converting the model, you put ‘torch.jit’.
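
Something like this (sketch, reusing `xgb_model` and `sample` from your pseudo-code; the container runs the traced TorchScript model under the hood):

from hummingbird.ml import convert

ts_container = convert(xgb_model, 'torch.jit', sample)  # Hummingbird traces internally, using `sample` as the example input
preds = ts_container.predict(sample)                    # sklearn-style API on top of TorchScript
scripted_model = ts_container.model                     # the underlying TorchScript module, if you need it directly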

interesaaat · Oct 28 '22

@interesaaat Oh, I do know that hummingbird can convert XGB to torchscript directly without issues. However, per what I mentioned earlier:

I have a use case which requires me to first train an XGB model, convert it to a torch model through hummingbird, and later convert the torch model to torchscript. Note that converting XGB directly to a torchscript model is not an option due to my use case.

A bit more context on my use case: the reason I first convert XGB to torch is that it lets me stack the hummingbird-converted torch model with another torch model (think model ensembling). I am working on ensembling classical models with DL torch models, leveraging hummingbird as the bridge to convert XGB to torch, so that they can be combined into one torch model.
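
Roughly, the kind of stacking I have in mind looks like this (illustrative sketch only; `dl_model` and the probability averaging are placeholders, and it assumes the hummingbird classifier's forward returns a (labels, probabilities) tuple):

import torch

class HybridEnsemble(torch.nn.Module):
    # Illustrative ensemble: averages the probabilities of the hummingbird-converted
    # XGB model and a separate DL torch model over the same input.
    def __init__(self, hb_xgb_model, dl_model):
        super().__init__()
        self.hb_xgb_model = hb_xgb_model
        self.dl_model = dl_model

    def forward(self, x):
        _, xgb_probs = self.hb_xgb_model(x)              # assumed (labels, probabilities) output
        dl_probs = torch.softmax(self.dl_model(x), dim=1)
        return (xgb_probs + dl_probs) / 2.0

ensemble = HybridEnsemble(torch_model, dl_model)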

ShulinChen · Oct 28 '22

I see. I have not tried it, but maybe if you trace the xgb model first and then trace the traced xgb model together with your other DL model it works? Another option could be for you to look at how we do tracing of the model within Hummingbird and copy the code externally to trace both the xgb model and your DL model. I think the problem you are having with tracing externally is that you are not passing the input correctly.
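
In other words, something along these lines might work (untested sketch; it reuses the HybridEnsemble idea from above and assumes the example input is passed as a float tensor rather than a numpy array):

import torch

example = torch.from_numpy(sample).float()   # pass a tensor, not a numpy array

# Trace the hummingbird XGB model on its own first...
traced_xgb = torch.jit.trace(torch_model, (example,))

# ...then build the combined module around the traced submodule and trace the whole thing.
ensemble = HybridEnsemble(traced_xgb, dl_model)
traced_ensemble = torch.jit.trace(ensemble, (example,))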

interesaaat · Oct 28 '22