Serving: float16 data type support issues

While using Paddle Serving and adapting it to a custom hardware type, we ran into some problems and questions about the float16 data type in C++ Serving:

Take resnet50 in examples/C++/PaddleClas/imagenet as an example.

Client float16 input problem:

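The client neither checks nor converts the dtype of the actual numpy input. If the prototxt declares the input/output type as float16 but the actual input is numpy.float32, the results are inaccurate.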

prototxt definition:

feed_var {
  name: "image"
  alias_name: "image"
  is_lod_tensor: false
  feed_type: 5
  shape: 3
  shape: 224
  shape: 224
}
fetch_var {
  name: "softmax_0.tmp_0"
  alias_name: "score"
  is_lod_tensor: false
  fetch_type: 5
  shape: 1000
}

Actual calling code:

    img = seq(image_file) # img is float32 numpy array here.
    fetch_map = client.predict(
        feed={"image": img}, fetch=["score"], batch=False)

In client.py, because the proto declares the float16 data type, img is converted to string data:

https://github.com/PaddlePaddle/Serving/blob/cf9ad1d9d6667974ecbff6917b0d74c11a25109d/python/paddle_serving_client/client.py#L438

general_model.cpp then sets the string data on the tensor and later sends it to the server:

https://github.com/PaddlePaddle/Serving/blob/cf9ad1d9d6667974ecbff6917b0d74c11a25109d/core/general-client/src/general_model.cpp#L335

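However, img itself is float32, so its buffer is twice the size of a float16 buffer. It cannot be passed to the server without conversion; otherwise the tensor that Paddle builds contains wrong data. gdb shows that although the input type is declared as float16, the buffer size is still 4*3*224*224 = 602112 bytes, where a float16 tensor of this shape would need 2*3*224*224 = 301056 bytes: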

(gdb) p string_feed[vec_idx].size()
$4 = 602112
(gdb) p string_shape[vec_idx]
$5 = std::vector of length 3, capacity 3 = {3, 224, 224}
(gdb) p 3*224*224
$6 = 150528
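
The size mismatch seen in gdb can be reproduced without a server. The following is a minimal sketch, not the Serving client's actual code: `expected_nbytes`, `to_float16_bytes` and `check_payload` are hypothetical helpers, and only the mapping `feed_type: 5` to float16 comes from the prototxt above (the entry for float32 is an assumption). It uses the standard `struct` module, whose `e` format is IEEE 754 half precision.

```python
import struct
from math import prod

# Bytes per element, keyed by feed_type. 5 -> float16 is taken from the
# prototxt above; 1 -> float32 is an assumption about the numbering.
ITEMSIZE = {1: 4, 5: 2}


def expected_nbytes(shape, feed_type):
    """Bytes a dense tensor of this shape and type should occupy."""
    return prod(shape) * ITEMSIZE[feed_type]


def to_float16_bytes(values):
    """Pack Python floats as little-endian IEEE 754 half-precision bytes."""
    values = list(values)
    return struct.pack(f"<{len(values)}e", *values)


def check_payload(payload, shape, feed_type):
    """Hypothetical guard: reject a buffer whose size contradicts the prototxt."""
    want = expected_nbytes(shape, feed_type)
    if len(payload) != want:
        raise ValueError(f"payload has {len(payload)} bytes, expected {want}")
    return payload


shape = (3, 224, 224)
float32_payload = bytes(4 * prod(shape))  # same size as the unconverted float32 image
float16_payload = to_float16_bytes([0.0] * prod(shape))

check_payload(float16_payload, shape, feed_type=5)  # sizes agree
try:
    check_payload(float32_payload, shape, feed_type=5)  # 602112 vs 301056
except ValueError as err:
    print(err)
```

A guard of this kind on the client side would turn the silent data corruption into an explicit error; converting the array to float16 before serializing it is the other option.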

Client float16 output problem:

In the case above, if the user code does convert the input, a pybind decoding error occurs instead. Example client code:

    img = seq(image_file) # img is float32 numpy array here.
    fetch_map = client.predict(
        feed={"image": img.astype(np.float16)}, fetch=["score"], batch=False) # convert img to float16

Error reported:

'utf-8' codec can't decode byte

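This happens because float16 data is passed between client and server as a string, and pybind requires any string returned from C++ to Python to be decodable as UTF-8, unless the binding explicitly returns py::bytes to skip the conversion.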

So should the following code be changed to: return py::bytes(self.get_string_by_name_with_rv(model_idx, name));

https://github.com/PaddlePaddle/Serving/blob/cf9ad1d9d6667974ecbff6917b0d74c11a25109d/core/general-client/src/pybind_general_model.cpp#L63

pybind reference: https://github.com/pybind/pybind11/blob/master/docs/advanced/cast/strings.rst
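
Why raw float16 bytes trip the UTF-8 decode can be shown in plain Python. This is only an illustration of the decoding step, not of the pybind binding itself; the value -1.0 is chosen because its half-precision encoding contains a byte that is not valid UTF-8.

```python
import struct

# -1.0 in IEEE 754 half precision is 0xBC00; the little-endian bytes are 00 BC.
raw = struct.pack("<e", -1.0)

try:
    # Roughly what an implicit std::string -> Python str conversion attempts.
    raw.decode("utf-8")
    decoded = True
except UnicodeDecodeError:
    # 0xBC is a continuation byte with no lead byte, so decoding fails.
    decoded = False

# Kept as bytes, the same buffer round-trips without loss.
(value,) = struct.unpack("<e", raw)
print(decoded, value)
```

Returning the buffer as bytes, as proposed above, avoids the decode step entirely and leaves interpretation of the data to the caller.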

Also, is there a working float16 example that we could use as a reference?

Aug 04 '22 07:08 czr-gc

This is a good question. Our code does have a problem here, and I will take a look when I have time. Thanks!

Aug 15 '22 11:08 HexToString

Thanks!

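In addition, with the async interface, fp16 leads to empty data or null pointers. The cause is that the following function does not handle float16, so the allocated memory size is 0. A float16 check needs to be added: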

    if (dtype == paddle::PaddleDType::FLOAT16) {
      return sizeof(float)/2;
    }

https://github.com/PaddlePaddle/Serving/blob/48540ef2cbdb976603a58225572a507943b35625/core/predictor/framework/bsf.h#L209

https://github.com/PaddlePaddle/Serving/blob/48540ef2cbdb976603a58225572a507943b35625/core/predictor/framework/bsf.h#L1089
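
The effect of the missing branch can be sketched in Python. This mirrors the logic only: the function names and dtype labels below are placeholders, and the set of dtypes the real function in bsf.h handles is not reproduced here.

```python
# Placeholder dtype labels standing in for paddle::PaddleDType values.
FLOAT32, INT64, FLOAT16 = "float32", "int64", "float16"


def elem_size_before(dtype):
    """Element size lookup with no float16 branch (illustrative)."""
    if dtype == FLOAT32:
        return 4
    if dtype == INT64:
        return 8
    return 0  # unhandled dtype falls through to 0


def elem_size_after(dtype):
    """Same lookup with the proposed float16 branch added."""
    if dtype == FLOAT16:
        return 2  # sizeof(float) / 2
    return elem_size_before(dtype)


numel = 1000  # element count of the "score" output in the prototxt above
print(numel * elem_size_before(FLOAT16))  # buffer size without the fix
print(numel * elem_size_after(FLOAT16))   # buffer size with the fix
```

With a zero element size, the computed buffer length is zero regardless of the shape, which is consistent with the empty data seen on the async path.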

Aug 16 '22 07:08 czr-gc
