Force stopping app.

Open sunn-e opened this issue 6 years ago • 18 comments

Downloaded the assets and tried running on an Android device. Log:

java.lang.RuntimeException: Unable to start activity ComponentInfo{org.pytorch.helloworld/org.pytorch.helloworld.MainActivity}: com.facebook.jni.CppException: false CHECK FAILED at ../torch/csrc/jit/import.cpp (deserialize at ../torch/csrc/jit/import.cpp:178)
(no backtrace available)
    at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:2946)
    at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3081)
    at android.app.servertransaction.LaunchActivityItem.execute(LaunchActivityItem.java:78)
    at android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:108)
    at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:68)
    at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1831)
    at android.os.Handler.dispatchMessage(Handler.java:106)
    at android.os.Looper.loop(Looper.java:201)
    at android.app.ActivityThread.main(ActivityThread.java:6806)
    at java.lang.reflect.Method.invoke(Native Method)
    at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:547)
    at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:873)
Caused by: com.facebook.jni.CppException: false CHECK FAILED at ../torch/csrc/jit/import.cpp (deserialize at ../torch/csrc/jit/import.cpp:178)
(no backtrace available)
    at org.pytorch.Module$NativePeer.initHybrid(Native Method)
    at org.pytorch.Module$NativePeer.<init>(Module.java:70)
    at org.pytorch.Module.<init>(Module.java:25)
    at org.pytorch.Module.load(Module.java:21)
    at org.pytorch.helloworld.MainActivity.onCreate(MainActivity.java:39)
    at android.app.Activity.performCreate(Activity.java:7224)
    at android.app.Activity.performCreate(Activity.java:7213)
    at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1272)
    at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:2926)
    ... 11 more

sunn-e avatar Oct 11 '19 10:10 sunn-e

@sunn-e This error usually happens when the model was serialized and interpreted with different versions of libtorch. Did it happen with the models from this repository, or did you retrace/rescript 'model.pt' as in the instructions?

If you traced it yourself and use the Gradle dependency 'org.pytorch:pytorch_android:1.3.0', please check that your Python torch version is 1.3.0:

└─ $ python -c 'import torch; print(torch.version.__version__)'
1.3.0
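
For example, a minimal guard you could add at the top of the tracing script (just a sketch; the expected version string is whatever pytorch_android version is declared in build.gradle):

import torch

# Must match the 'org.pytorch:pytorch_android' version used by the app.
EXPECTED_TORCH_VERSION = "1.3.0"

assert torch.__version__.startswith(EXPECTED_TORCH_VERSION), (
    f"torch {torch.__version__} does not match pytorch_android "
    f"{EXPECTED_TORCH_VERSION}; retrace the model with a matching build"
)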

IvanKobzarev avatar Oct 11 '19 17:10 IvanKobzarev

Thanks for your reply.

jiayong avatar Oct 29 '19 03:10 jiayong

@IvanKobzarev Although I used 1.3.0 to convert my customized model, I see the same error message in Android Studio (screenshots attached).

Are there any unsupported layers or data types, such as fp16 layers? Is there any other checklist for solving this?
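
One quick check that can rule out half-precision or CUDA parameters before export (just a sketch, assuming 'model' is the module being converted):

import torch

# Sketch of a pre-export check: force everything to float32 on the CPU before
# tracing, then confirm no fp16 or CUDA parameters remain.
model = model.float().cpu().eval()
print({p.dtype for p in model.parameters()})        # expect {torch.float32}
print({p.device.type for p in model.parameters()})  # expect {'cpu'}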

Yeongtae avatar Nov 01 '19 03:11 Yeongtae

Getting the same error.

torch==1.3.0 torchvision==0.4.1

The data is just in torch.float:

x_train = x_train.to(device=device, dtype=torch.float)
y_train = y_train.to(device=device, dtype=torch.float)
x_test = x_test.to(device=device, dtype=torch.float)
y_test = y_test.to(device=device, dtype=torch.float)

The model is pretty standard:

class Model(torch.nn.Module):

    def __init__(self):
        super(Model, self).__init__()

        self.input_1 = torch.nn.Linear(n_features, layer_1_size)
        self.prelu_1 = torch.nn.PReLU()
        self.hidden_2 = torch.nn.Linear(layer_1_size, layer_2_size)
        self.prelu_2 = torch.nn.PReLU()
        self.hidden_3 = torch.nn.Linear(layer_2_size, layer_3_size)
        self.prelu_3 = torch.nn.PReLU()
        self.out_4 = torch.nn.Linear(layer_3_size, n_classes)

        self.drop = torch.nn.Dropout(0.25)

    def forward(self, x):
        x = self.prelu_1(self.input_1(x))
        x = self.drop(x)
        x = self.prelu_2(self.hidden_2(x))
        x = self.drop(x)
        x = self.prelu_3(self.hidden_3(x))
        x = self.drop(x)
        x = self.out_4(x)

        return x

The model export is based on the example, ensuring dtype=torch.float:

model.eval()
example = torch.rand(1, num_features, dtype=torch.float)
traced_script_module = torch.jit.trace(model, example)
traced_script_module.save("../xxx/app/src/main/assets/model.pt")
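
As a quick sanity check (just a sketch reusing the names above), the saved file can be reloaded in Python and run on the example input; note this only confirms the file round-trips in the same Python environment and will not catch a version mismatch with the Android runtime:

import torch

# Reload the TorchScript file that was just saved and run the same example
# through it to confirm the export itself is well-formed.
reloaded = torch.jit.load("../xxx/app/src/main/assets/model.pt")
out = reloaded(torch.rand(1, num_features, dtype=torch.float))
print(out.shape)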

dev-michael-schmidt avatar Nov 01 '19 16:11 dev-michael-schmidt

Following up on my earlier comment: when I use the file name 'asr.pt' I get the same error, but it works fine with another file name (screenshot attached).

Yeongtae avatar Nov 05 '19 03:11 Yeongtae

Same error for me, and my Python torch version is 1.3.0.

antonlebedjko avatar Nov 08 '19 00:11 antonlebedjko

Again, 1.3.0 for both pytorch_android and pytorch:

2019-11-08 14:05:49.886 9870-9870/com.dummy.app W/com.dummy.app: Got a deoptimization request on un-deoptimizable method com.facebook.jni.HybridData org.pytorch.Module$NativePeer.initHybrid(java.lang.String)
2019-11-08 14:05:51.531 9870-9870/com.dummy.app D/AndroidRuntime: Shutting down VM
2019-11-08 14:05:51.542 9870-9870/com.dummy.app E/AndroidRuntime: FATAL EXCEPTION: main
    Process: com.dummy.app, PID: 9870
    java.lang.RuntimeException: Unable to start activity ComponentInfo{com.dummy.app/com.dummy.app.MainActivity}: com.facebook.jni.CppException: false CHECK FAILED at aten/src/ATen/Functions.h (empty at aten/src/ATen/Functions.h:3535)
    (no backtrace available)
        at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3270)
        at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3409)
        at android.app.servertransaction.LaunchActivityItem.execute(LaunchActivityItem.java:83)
        at android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:135)
        at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:95)
        at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2016)
        at android.os.Handler.dispatchMessage(Handler.java:107)
        at android.os.Looper.loop(Looper.java:214)
        at android.app.ActivityThread.main(ActivityThread.java:7356)
        at java.lang.reflect.Method.invoke(Native Method)
        at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:492)
        at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:930)
     Caused by: com.facebook.jni.CppException: false CHECK FAILED at aten/src/ATen/Functions.h (empty at aten/src/ATen/Functions.h:3535)
    (no backtrace available)
        at org.pytorch.Module$NativePeer.initHybrid(Native Method)
        at org.pytorch.Module$NativePeer.<init>(Module.java:70)
        at org.pytorch.Module.<init>(Module.java:25)
        at org.pytorch.Module.load(Module.java:21)
        at com.dummy.app.MainActivity.onCreate(MainActivity.java:27)
        at android.app.Activity.performCreate(Activity.java:7802)
        at android.app.Activity.performCreate(Activity.java:7791)
        at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1299)
        at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3245)
        	... 11 more

dev-michael-schmidt avatar Nov 08 '19 20:11 dev-michael-schmidt

Using the following C++ loader:

#include <torch/script.h> // One-stop header.

#include <iostream>
#include <memory>

int main(int argc, const char* argv[]) {

  if (argc != 2) {
    std::cerr << "usage: example-app <path-to-exported-script-module>\n";
    return -1;
  }


  torch::jit::script::Module module;
  try {
    // Deserialize the ScriptModule from a file using torch::jit::load().
    module = torch::jit::load(argv[1]);
  }
  catch (const c10::Error& e) {
    std::cerr << "error loading the model\n";
    return -1;
  }

  std::cout << "ok\n";
}

I am able to get:

./example-app ../../MyApplication/app/src/main/assets/real_model.pt
ok

dev-michael-schmidt avatar Nov 08 '19 20:11 dev-michael-schmidt

Hi there,

I had the same problem and, in my case, I solved it.

I created the traced PyTorch model using the provided tracing script. The output filename was traced_model.pt, but when I added it to the assets folder of the Android HelloWorld demo I changed the filename to model_2.pth. In that scenario the app crashed with the stack trace above.

Leaving the filename as traced_model.pt, exactly as it was created, solved the issue. Strange, huh?

josecyn avatar Nov 13 '19 09:11 josecyn

I also have the same problem and tried the filename fix described in the comment above, but it does not always work for me. So strange.

peterzhang2029 avatar Nov 22 '19 07:11 peterzhang2029

@josecyn , @Yeongtae Sorry for my late reply.

My guess is that the behavior change after renaming the asset could be caused by the latest assets not being re-uploaded to the device/emulator. Do you have the same problem if you fully uninstall and then reinstall the application on the device/emulator after renaming?

Sometimes I had an issue where adb install apk.apk did not reinstall native libraries, even when the Gradle dependencies were updated and the Java part was reinstalled on the device. In those cases a manual uninstall of the app helped me.

IvanKobzarev avatar Nov 27 '19 22:11 IvanKobzarev

Hello @MichaelSchmidt82 Sorry for my late reply.

I checked your model on the latest nightly builds and it worked OK for me: it loads, and forward() works. Does that error still happen for you with the latest PyTorch Android nightlies? (You might need to retrace/script your model with the latest Python nightlies so that the .pt file is aligned with the pytorch_android version. We have had several major fixes since you reported the issue.)

To use nightlies (Gradle has the --refresh-dependencies argument to force refreshing dependencies):

repositories {
    maven {
        url "https://oss.sonatype.org/content/repositories/snapshots"
    }
}

dependencies {
    ...
    implementation 'org.pytorch:pytorch_android:1.4.0-SNAPSHOT'
    implementation 'org.pytorch:pytorch_android_torchvision:1.4.0-SNAPSHOT'
    ...
}
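
Retracing on the nightly Python build looks the same as before; the main thing to verify is that the installed torch is a 1.4.0 dev build so it matches the SNAPSHOT runtime (a sketch, assuming 'model' and 'example' as in the earlier comments):

import torch

print(torch.__version__)  # expect something like 1.4.0.devYYYYMMDD
traced = torch.jit.trace(model, example)
traced.save("app/src/main/assets/model.pt")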

IvanKobzarev avatar Nov 27 '19 23:11 IvanKobzarev

I will try it later.

dev-michael-schmidt avatar Nov 28 '19 03:11 dev-michael-schmidt

@IvanKobzarev I use 1.4.0-SNAPSHOT, but I get "A/libc: Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR)" when loading the module.

hcflrl avatar Dec 20 '19 12:12 hcflrl

FYI: we ran into this exception, which halted us for a day. Eventually we came to the conclusion that it worked with our locally trained models but not with our cloud-trained models, because the cloud-trained ones had CUDA enabled. We disabled CUDA and voilà, it works again.

CUDA incompatibility should produce a clear exception message; the same goes for all the PyTorch Mobile exceptions I've had so far :)

paramsen avatar Jan 17 '20 12:01 paramsen

Missing detail in error messages is a known problem in the 1.3 release. Check out the just-released PyTorch 1.4!

dreiss avatar Jan 17 '20 19:01 dreiss

Disabling CUDA, as described in the comment above, helped me too. I was trying to serialize a model to load on mobile, and that answer worked. In case somebody needs it, just move your model to the CPU and then use TorchScript to serialize it, like so:

cpu_model = gpu_model.cpu()
sample_input_cpu = sample_input_gpu.cpu()
traced_cpu = torch.jit.trace(cpu_model, sample_input_cpu)
torch.jit.save(traced_cpu, "cpu.pth")

ref: https://pytorch.org/docs/master/jit.html#creating-torchscript-code

andreybicalho avatar May 19 '20 15:05 andreybicalho

I fixed my issue by changing the path in trace_model.py from app/src/main/assets/model.pt to app/src/main/assets/model.pth

dingusagar avatar Sep 29 '21 19:09 dingusagar