AWS Lambda C++ runtime segfaults (but works locally)

Open jamesrobertwilliams opened this issue 5 years ago • 1 comments

I am trying to get a simple example running on the AWS Lambda C++ runtime, but it segfaults whenever I invoke the Lambda function. The example from https://aws.amazon.com/blogs/compute/introducing-the-c-lambda-runtime/ works fine on its own. I then extended it to fetch objects from an S3 bucket, like so:

    #include <aws/core/Aws.h>
    #include <aws/core/client/ClientConfiguration.h>
    #include <aws/core/auth/AWSCredentialsProvider.h>
    #include <aws/s3/S3Client.h>
    #include <aws/s3/model/GetObjectRequest.h>
    #include <awsdoc/s3/s3_examples.h>
    #include <vector>
    #include <rapidcsv.h>
    #include <torch/torch.h>
    #include <torch/script.h> // One-stop header.
    #include <iostream>
    #include <math.h>
    #include <memory>
    #include <aws/lambda-runtime/runtime.h>
    
    using namespace torch::indexing;
    using namespace Aws::Utils;
    using namespace aws::lambda_runtime;
    
    char const TAG[] = "LAMBDA_ALLOC";
    
    static invocation_response my_handler(invocation_request const &req)
    {
       if (req.payload.length() > 42)
       {
          return invocation_response::failure("error message here" /*error_message*/,
                                              "error type here" /*error_type*/);
       }
    
       Aws::SDKOptions options;
       Aws::InitAPI(options);
    
       const Aws::String bucket_name("buckname");
       const Aws::String object_name("fname");
    
       Aws::S3::S3Client s3_client;
       Aws::S3::Model::GetObjectRequest object_request;
       object_request.SetBucket(bucket_name);
       object_request.SetKey(object_name);
       Aws::S3::Model::GetObjectOutcome get_object_outcome =
          s3_client.GetObject(object_request);
       auto &retrieved_file = get_object_outcome.GetResultWithOwnership().GetBody();
       rapidcsv::Document doc(retrieved_file, rapidcsv::LabelParams(-1, -1)); /// this is where the problem is.
    
       const Aws::String objectKey2("traced_pointnet_model.tar");
       Aws::S3::S3Client s3_client2;
       Aws::S3::Model::GetObjectRequest object_request2;
       object_request2.SetBucket(bucket_name);
       object_request2.SetKey(objectKey2);
       Aws::S3::Model::GetObjectOutcome get_object_outcome2 =
          s3_client2.GetObject(object_request2);
       auto &retrieved_file2 = get_object_outcome2.GetResultWithOwnership().GetBody();
    
       torch::jit::script::Module module;
       module = torch::jit::load(retrieved_file2);  /// this is also where the problem is.
       std::cout << "Model Load ok\n";
    
       Aws::ShutdownAPI(options);
    
       return invocation_response::success(tensor_string /*payload*/,
                                           "application/json" /*MIME type*/);
    }
    
    int main()
    {
    
       run_handler(my_handler);
       return 0;
    }

The thing is, it compiles and works as expected when I run it locally (Debian, CMake), but when I build it for the AWS Lambda runtime, it segfaults.

I have narrowed down the problem to two lines:

    rapidcsv::Document doc(retrieved_file, rapidcsv::LabelParams(-1, -1)); /// this is where the problem is.

and also,

    module = torch::jit::load(retrieved_file2); /// this is also where the problem is.

I use the rapidcsv header-only library from https://github.com/d99kris/rapidcsv and, as I said, it works fine locally.

I suspect the runtime is somehow unable to find rapidcsv, but that still fails to explain why it also segfaults on module = torch::jit::load(retrieved_file2); :(
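Since both failing lines read from a stream returned by S3, one thing worth ruling out is that the stream is empty or in a failed state inside Lambda (for example, if the download itself did not succeed there). The following is only a minimal sketch using the standard library: `stream_has_data` is a hypothetical helper, not part of rapidcsv, torch, or the AWS SDK. In the handler above it could be called on `retrieved_file` and `retrieved_file2` before they are parsed, alongside a check of `get_object_outcome.IsSuccess()`.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: returns true only if the stream is in a good state
// and has at least one byte available. peek() does not consume that byte,
// but it does set eofbit when the stream is empty.
bool stream_has_data(std::istream& in)
{
    if (!in.good()) {
        return false;
    }
    return in.peek() != std::istream::traits_type::eof();
}
```

If the guard returns false in Lambda but true locally, the segfault is more likely related to the download (credentials, region, permissions) than to the parsing libraries themselves. Returning `invocation_response::failure(...)` in that case would at least turn the crash into a readable error.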

Any pointers would be great!

Nov 21 '20 12:11 jamesrobertwilliams

Check whether torch loads libraries dynamically (via dlopen, for example) and whether those libraries are missing from the deployment package. See https://github.com/awslabs/aws-lambda-cpp#common-pitfalls-with-packaging
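One way to test this from inside the function is to ask the dynamic loader directly and log what it reports. This is a rough sketch, not something taken from the runtime or from torch: `try_load` is a hypothetical helper, and which library names to pass is an assumption you would need to fill in from your own build (for example, from the `ldd` output of the local binary). `dlopen`, `dlerror`, and `dlclose` are the standard POSIX calls from `<dlfcn.h>`; on older glibc versions the program must be linked with `-ldl`.

```cpp
#include <dlfcn.h>  // dlopen, dlerror, dlclose (POSIX)
#include <string>

// Hypothetical diagnostic helper: tries to load a shared library by name.
// Returns an empty string on success, or the loader's error message if the
// library (or one of its dependencies) cannot be loaded.
std::string try_load(const char* soname)
{
    dlerror();  // clear any stale error from a previous call
    // RTLD_NOW resolves all symbols immediately, so missing dependencies
    // show up here instead of at first use.
    void* handle = dlopen(soname, RTLD_NOW | RTLD_LOCAL);
    if (handle == nullptr) {
        const char* err = dlerror();
        return err != nullptr ? std::string(err)
                              : std::string("unknown dlopen failure");
    }
    dlclose(handle);
    return std::string();
}
```

Calling `try_load` at the top of the handler for each library the local binary depends on, and writing any non-empty result to `std::cerr`, should show in the function's logs which libraries did not make it into the zip.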

Dec 10 '20 07:12 marcomagdy