
FlowSensitive analysis does not resolve indirect jumps.

Open tregua87 opened this issue 1 year ago • 4 comments

Hi! I am using the latest SVF commit f889cfbf7a4694183abbb3417f81887a44acab29 and I noticed that the FlowSensitive analysis does not seem to resolve indirect jumps.

Consider this C snippet.

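#include <stddef.h>
#include <stdint.h>

/* Assumed, reduced definition (not shown in the original post):
   only the three fields this snippet uses. */
struct vpx_read_bit_buffer {
  const uint8_t *bit_buffer;
  const uint8_t *bit_buffer_end;
  size_t bit_offset;
};

int vpx_rb_read_bit(struct vpx_read_bit_buffer *rb) {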
  const size_t off = rb->bit_offset;
  const size_t p = off >> 3;
  const int q = 7 - (int)(off & 0x7);
  if (rb->bit_buffer + p < rb->bit_buffer_end) {
    const int bit = (rb->bit_buffer[p] >> q) & 1;
    rb->bit_offset = off + 1;
    return bit;
  } else {
    return 0;
  }
}

int (*read_bit)(struct vpx_read_bit_buffer *rb) = &vpx_rb_read_bit;

int decoder_peek_si_internal(const uint8_t *data, unsigned int data_sz) {
    struct vpx_read_bit_buffer rb = { data, data + data_sz, 0};
    return read_bit(&rb);
}

When working on the ICFG and SVFG in combination with the FlowSensitive analysis, the indirect call read_bit(&rb); does not point to anything. Here is a minimal example:

LLVMModuleSet* llvmModuleSet = LLVMModuleSet::getLLVMModuleSet();
SVFModule* svfModule = LLVMModuleSet::getLLVMModuleSet()->buildSVFModule(moduleNameVec);
/// Build Program Assignment Graph (SVFIR)
SVFIRBuilder builder(svfModule);
SVFIR* pag = builder.build();
ICFG* icfg = pag->getICFG();

// TRY THIS OR THE OTHER WAY
// Andersen* point_to_analysys = AndersenWaveDiff::createAndersenWaveDiff(pag);
FlowSensitive* point_to_analysys = FlowSensitive::createFSWPA(pag);
point_to_analysys->analyze();

PTACallGraph* callgraph = point_to_analysys->getPTACallGraph();
builder.updateCallGraph(callgraph);
icfg = pag->getICFG();
icfg->updateCallGraph(callgraph);


icfg->dump("icfg");

/// Sparse value-flow graph (SVFG)
SVFGBuilder svfBuilder;
SVFG* svfg = svfBuilder.buildFullSVFG(point_to_analysys);
svfg->updateCallGraph(point_to_analysys);

svfg->dump("svf");

When using AndersenWaveDiff, things seem to work. But with FlowSensitive, both ICFG and SVFG do not resolve the indirect call at read_bit(&rb);.

I attach the .bc and the driver for reference: files.zip

tregua87 avatar Aug 14 '23 16:08 tregua87

This might be because decoder_peek_si_internal is an uncalled function. Could you rename it to main and try the flow-sensitive analysis again? Andersen is fine in both scenarios because it does not care about the order in which functions are executed.

yuleisui avatar Aug 14 '23 22:08 yuleisui

The above example was an attempt to minimize something more complex. Let me show you the whole picture, in the hope that you can advise me.

The problem occurs when analyzing the libvpx library. I attach the .bc for reference (libvpx.a.bc.zip).

When analyzing vpx_codec_peek_stream_info, SVF does not infer the target functions for iface->dec.peek_si(data, data_sz, si);.

The peek_si function pointer is initialized in a global structure whose address is returned by the function vpx_codec_vp8_dx (the definition is provided below).

I tried three points-to analyses:

// TEST 1
// Andersen* point_to_analysys = AndersenWaveDiff::createAndersenWaveDiff(pag);

// TEST 2
// FlowSensitive* point_to_analysys = FlowSensitive::createFSWPA(pag);

// TEST 3
//  TypeAnalysis* point_to_analysys = new TypeAnalysis(pag);
point_to_analysys->analyze();

In all three cases, that indirect call has no target function (no outgoing edges). Can you help me understand why SVF does not work correctly in this case?

I am working with SVF commit f889cfbf7a4694183abbb3417f81887a44acab29, but older commits show the same behavior.

One last remark: this is a library, so there is no main function. I need to analyze the possible indirect calls starting from each API (accepting over-approximation, of course). So far, the FlowSensitive analysis has been quite precise, and I can't understand why it does not work in this case. Thanks!

vpx_codec_err_t vpx_codec_peek_stream_info(vpx_codec_iface_t *iface,
                                           const uint8_t *data,
                                           unsigned int data_sz,
                                           vpx_codec_stream_info_t *si) {
  vpx_codec_err_t res;

  if (!iface || !data || !data_sz || !si ||
      si->sz < sizeof(vpx_codec_stream_info_t))
    res = VPX_CODEC_INVALID_PARAM;
  else {
    /* Set default/unknown values */
    si->w = 0;
    si->h = 0;

    res = iface->dec.peek_si(data, data_sz, si);
  }

  return res;
}

#define CODEC_INTERFACE(id)                          \
  vpx_codec_iface_t *id(void) { return &id##_algo; } \
  vpx_codec_iface_t id##_algo

CODEC_INTERFACE(vpx_codec_vp8_dx) = {
  "WebM Project VP8 Decoder" VERSION_STRING,
  VPX_CODEC_INTERNAL_ABI_VERSION,
  VPX_CODEC_CAP_DECODER | VP8_CAP_POSTPROC | VP8_CAP_ERROR_CONCEALMENT |
      VPX_CODEC_CAP_INPUT_FRAGMENTS,
  /* vpx_codec_caps_t          caps; */
  vp8_init,     /* vpx_codec_init_fn_t       init; */
  vp8_destroy,  /* vpx_codec_destroy_fn_t    destroy; */
  vp8_ctf_maps, /* vpx_codec_ctrl_fn_map_t  *ctrl_maps; */
  {
      vp8_peek_si,   /* vpx_codec_peek_si_fn_t    peek_si; */
      vp8_get_si,    /* vpx_codec_get_si_fn_t     get_si; */
      vp8_decode,    /* vpx_codec_decode_fn_t     decode; */
      vp8_get_frame, /* vpx_codec_frame_get_fn_t  frame_get; */
      NULL,
  },
  {
      /* encoder functions */
      0, NULL, /* vpx_codec_enc_cfg_map_t */
      NULL,    /* vpx_codec_encode_fn_t */
      NULL,    /* vpx_codec_get_cx_data_fn_t */
      NULL,    /* vpx_codec_enc_config_set_fn_t */
      NULL,    /* vpx_codec_get_global_headers_fn_t */
      NULL,    /* vpx_codec_get_preview_frame_fn_t */
      NULL     /* vpx_codec_enc_mr_get_mem_loc_fn_t */
  }
};

// vpx_codec_iface_t definition
struct vpx_codec_iface {
  const char *name;                   /**< Identification String  */
  int abi_version;                    /**< Implemented ABI version */
  vpx_codec_caps_t caps;              /**< Decoder capabilities */
  vpx_codec_init_fn_t init;           /**< \copydoc ::vpx_codec_init_fn_t */
  vpx_codec_destroy_fn_t destroy;     /**< \copydoc ::vpx_codec_destroy_fn_t */
  vpx_codec_ctrl_fn_map_t *ctrl_maps; /**< \copydoc ::vpx_codec_ctrl_fn_map_t */
  struct vpx_codec_dec_iface {
    vpx_codec_peek_si_fn_t peek_si; /**< \copydoc ::vpx_codec_peek_si_fn_t */
    vpx_codec_get_si_fn_t get_si;   /**< \copydoc ::vpx_codec_get_si_fn_t */
    vpx_codec_decode_fn_t decode;   /**< \copydoc ::vpx_codec_decode_fn_t */
    vpx_codec_get_frame_fn_t
        get_frame;                   /**< \copydoc ::vpx_codec_get_frame_fn_t */
    vpx_codec_set_fb_fn_t set_fb_fn; /**< \copydoc ::vpx_codec_set_fb_fn_t */
  } dec;
  struct vpx_codec_enc_iface {
    int cfg_map_count;
    vpx_codec_enc_cfg_map_t
        *cfg_maps;                /**< \copydoc ::vpx_codec_enc_cfg_map_t */
    vpx_codec_encode_fn_t encode; /**< \copydoc ::vpx_codec_encode_fn_t */
    vpx_codec_get_cx_data_fn_t
        get_cx_data; /**< \copydoc ::vpx_codec_get_cx_data_fn_t */
    vpx_codec_enc_config_set_fn_t
        cfg_set; /**< \copydoc ::vpx_codec_enc_config_set_fn_t */
    vpx_codec_get_global_headers_fn_t
        get_glob_hdrs; /**< \copydoc ::vpx_codec_get_global_headers_fn_t */
    vpx_codec_get_preview_frame_fn_t
        get_preview; /**< \copydoc ::vpx_codec_get_preview_frame_fn_t */
    vpx_codec_enc_mr_get_mem_loc_fn_t
        mr_get_mem_loc; /**< \copydoc ::vpx_codec_enc_mr_get_mem_loc_fn_t */
  } enc;
};

tregua87 avatar Aug 15 '23 09:08 tregua87

I will take a look at this. You haven’t said whether my previous suggestion fixes your tiny example. Have you tried it?

yuleisui avatar Aug 15 '23 10:08 yuleisui

Thank you!

I just tried it by adding a main that invokes decoder_peek_si_internal, like the one below.

With this setting, FlowSensitive can resolve the indirect jump.

int main(int argc, char **argv) {
  const uint8_t data[1000] = { 0 };
  unsigned int data_sz = 10;
  return decoder_peek_si_internal(data, data_sz);
}
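
The same workaround might carry over to the library case, although nothing in this thread confirms it. In vpx_codec_peek_stream_info the callee is loaded from the iface parameter, so if no caller in the module passes a codec table, the analysis may have nothing flowing into iface. Below is a minimal, self-contained sketch of a driver that connects the accessor to the API. All names in it (toy_codec, toy_peek_si, codec_peek_stream_info, api_driver) are hypothetical stand-ins, not libvpx or SVF identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-ins for the libvpx types (not the real ones). */
typedef struct stream_info {
  unsigned int w;
  unsigned int h;
} stream_info_t;

typedef int (*peek_si_fn_t)(const uint8_t *data, unsigned int data_sz,
                            stream_info_t *si);

typedef struct codec_iface {
  const char *name;
  struct {
    peek_si_fn_t peek_si;
  } dec;
} codec_iface_t;

/* Toy implementation stored in the global table below. */
static int toy_peek_si(const uint8_t *data, unsigned int data_sz,
                       stream_info_t *si) {
  si->w = data_sz ? data[0] : 0;
  si->h = data_sz;
  return 0;
}

/* Same shape as CODEC_INTERFACE: a global table plus an accessor
   that returns its address. */
static codec_iface_t toy_codec_algo = { "toy", { toy_peek_si } };
codec_iface_t *toy_codec(void) { return &toy_codec_algo; }

/* API under analysis: the indirect call target comes from a parameter. */
int codec_peek_stream_info(codec_iface_t *iface, const uint8_t *data,
                           unsigned int data_sz, stream_info_t *si) {
  assert(iface != NULL && si != NULL);
  si->w = 0;
  si->h = 0;
  return iface->dec.peek_si(data, data_sz, si);
}

/* Driver: passes the accessor's result into the API, so the global table
   reaches iface through an explicit call chain. */
int api_driver(void) {
  const uint8_t data[4] = { 7, 0, 0, 0 };
  stream_info_t si;
  return codec_peek_stream_info(toy_codec(), data,
                                (unsigned int)sizeof data, &si);
}
```

For analysis, api_driver would be called from main (or renamed to main) so that it serves as the entry point, as suggested earlier in the thread. A real harness would need one such call per API and per codec accessor, which is an over-approximation of how clients may use the library.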

tregua87 avatar Aug 15 '23 11:08 tregua87