wuffs significantly slower than OpenCV 4.9.0 when decoding PNGs for 7680x4320 image

Open zchrissirhcz opened this issue 2 months ago • 9 comments

Problem

When decoding a big image (height=4320, width=7680, channels=4, data type = uint8_t), wuffs is much slower than OpenCV 4.9.0 on an Apple M1 (Mac mini).

Time cost

7680x4320 image

decoder                           time cost
opencv 4.9.0                      270 ms
wuffs latest ("unsupported.c")    370 ms
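Timings like these can be sensitive to disk caching and warm-up effects, so one option is to repeat each decode several times and report the median. The following is a minimal sketch, assuming a hypothetical helper `median_ms` (not part of Wuffs, OpenCV, or the code in this issue) built on `std::chrono::steady_clock`:

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical helper (an assumption for illustration, not a Wuffs or OpenCV
// API): calls `fn` exactly `runs` times and returns the median wall-clock
// duration of a single call, in milliseconds.
template <typename Fn>
double median_ms(Fn&& fn, int runs) {
  assert(runs > 0);
  std::vector<double> samples;
  samples.reserve(static_cast<size_t>(runs));
  for (int i = 0; i < runs; i++) {
    const auto t0 = std::chrono::steady_clock::now();
    fn();
    const auto t1 = std::chrono::steady_clock::now();
    samples.push_back(
        std::chrono::duration<double, std::milli>(t1 - t0).count());
  }
  // Sort so that the middle element is the median (upper median for even N).
  std::sort(samples.begin(), samples.end());
  return samples[samples.size() / 2];
}
```

For example, `median_ms([&] { src2 = cv::imread(filename); }, 5)` would time five OpenCV decodes, and the same wrapper could be applied to the Wuffs-based reader.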

OpenCV 4.9.0 details

brew install opencv

which is built on libpng 1.6.43:

  Media I/O: 
    ZLib:                        /Library/Developer/CommandLineTools/SDKs/MacOSX14.sdk/usr/lib/libz.tbd (ver 1.2.12)
    JPEG:                        /opt/homebrew/lib/libjpeg.dylib (ver 80)
    WEBP:                        /opt/homebrew/lib/libwebp.dylib (ver encoder: 0x020f)
    PNG:                         /opt/homebrew/lib/libpng.dylib (ver 1.6.43)
    TIFF:                        /opt/homebrew/lib/libtiff.dylib (ver 42 / 4.6.0)
    JPEG 2000:                   OpenJPEG (ver 2.5.2)
    OpenEXR:                     OpenEXR::OpenEXR (ver 3.2.4)
    HDR:                         YES
    SUNRASTER:                   YES
    PXM:                         YES
    PFM:                         YES

The exact code I use

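#include <algorithm>
#include <cstdio>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

#include <opencv2/opencv.hpp>
// Note: ncv::read_png and birch::AutoTimer are not defined in this snippet;
// they presumably come from project headers that are not shown here.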
// Copyright 2023 The Wuffs Authors.
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// https://www.apache.org/licenses/LICENSE-2.0> or the MIT license
// <LICENSE-MIT or https://opensource.org/licenses/MIT>, at your
// option. This file may not be copied, modified, or distributed
// except according to those terms.
//
// SPDX-License-Identifier: Apache-2.0 OR MIT

// ----------------

/*
toy-aux-image demonstrates using the wuffs_aux::DecodeImage C++ function to
decode an in-memory compressed image. In this example, the compressed image is
hard-coded to a specific image: a JPEG encoding of the first frame of the
test/data/muybridge.gif animated image.

To run:

$CXX toy-aux-image.cc && ./a.out; rm -f a.out

for a C++ compiler $CXX, such as clang++ or g++.

The expected output:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@X@@@@XX@@@@@@@@@@X
XXXXX@@XXX@@@@@@@II@@@X@X@@@@@
XXXXX@@XX@@X@@@XO+XXX@XX@@@X@@
XXXXXXXX@XX@X@XI=I@@XXI+OXX@XX
XXXXXXXXXXXXXXX+=+OXO+=::OXX@X
XXXXXXXXXXXXXXXXXX=+==:::=XXXX
XXXXXXXXO+:::::+OO+===+OI=+XXX
XXXO::=++:::==+++XI+++X@XXO@XX
XXXO=X@X+::=::::+O++=I@XX@XXXX
XXXXX@XXX=:::::::::=+@XXXX@XXX
XXXXXXXX@O::IXO=::::O@@XXXXXXX
XXXXXXXXO=X+X@@XX::O@@XXXXXXXX
XXXXXXXXXOO=X@X@X+OIXXXXXXXXXX
XXXXXXXXXXX+IIXX+X@OX@XXXXXXXX
XXXXXXXXX@XXOI+IIOOOXXXXXXXXXX
XXXXXXXXXXX@XXXXX@XXXXXXXXXXXX
XXXXXXXXXXXXXXXXX@XXXXXXXXXXXX
OOOOXXXXXXXXXXOXXXXXXXXXXXXOOO
=+++IIIIIIIOOOOOOOOOOIIIIIIII+
*/

// Wuffs ships as a "single file C library" or "header file library" as per
// https://github.com/nothings/stb/blob/master/docs/stb_howto.txt
//
// To use that single file as a "foo.c"-like implementation, instead of a
// "foo.h"-like header, #define WUFFS_IMPLEMENTATION before #include'ing or
// compiling it.
#define WUFFS_IMPLEMENTATION

// Defining the WUFFS_CONFIG__STATIC_FUNCTIONS macro is optional, but when
// combined with WUFFS_IMPLEMENTATION, it demonstrates making all of Wuffs'
// functions have static storage.
//
// This can help the compiler ignore or discard unused code, which can produce
// faster compiles and smaller binaries. Other motivations are discussed in the
// "ALLOW STATIC IMPLEMENTATION" section of
// https://raw.githubusercontent.com/nothings/stb/master/docs/stb_howto.txt
#define WUFFS_CONFIG__STATIC_FUNCTIONS

// Defining the WUFFS_CONFIG__MODULE* macros are optional, but it lets users of
// release/c/etc.c choose which parts of Wuffs to build. That file contains the
// entire Wuffs standard library, implementing a variety of codecs and file
// formats. Without this macro definition, an optimizing compiler or linker may
// very well discard Wuffs code for unused codecs, but listing the Wuffs
// modules we use makes that process explicit. Preprocessing means that such
// code simply isn't compiled.
/*
#define WUFFS_CONFIG__MODULES
#define WUFFS_CONFIG__MODULE__AUX__BASE
#define WUFFS_CONFIG__MODULE__AUX__IMAGE
#define WUFFS_CONFIG__MODULE__BASE
#define WUFFS_CONFIG__MODULE__JPEG
*/
#define WUFFS_CONFIG__MODULES
#define WUFFS_CONFIG__MODULE__AUX__BASE
#define WUFFS_CONFIG__MODULE__AUX__IMAGE
#define WUFFS_CONFIG__MODULE__ADLER32
#define WUFFS_CONFIG__MODULE__BASE
#define WUFFS_CONFIG__MODULE__CRC32
#define WUFFS_CONFIG__MODULE__DEFLATE
#define WUFFS_CONFIG__MODULE__PNG
#define WUFFS_CONFIG__MODULE__ZLIB

// Defining the WUFFS_CONFIG__DST_PIXEL_FORMAT__ENABLE_ALLOWLIST (and the
// associated ETC__ALLOW_FOO) macros are optional, but can lead to smaller
// programs (in terms of binary size). By default (without these macros),
// Wuffs' standard library can decode images to a variety of pixel formats,
// such as BGR_565, BGRA_PREMUL or RGBA_NONPREMUL. The destination pixel format
// is selectable at runtime. Using these macros essentially makes the selection
// at compile time, by narrowing the list of supported destination pixel
// formats. The FOO in ETC__ALLOW_FOO should match the pixel format passed (as
// part of the wuffs_base__image_config argument) to the decode_frame method.
//
// If using the wuffs_aux C++ API, without overriding the SelectPixfmt method,
// the implicit destination pixel format is BGRA_PREMUL.
#define WUFFS_CONFIG__DST_PIXEL_FORMAT__ENABLE_ALLOWLIST
#define WUFFS_CONFIG__DST_PIXEL_FORMAT__ALLOW_BGRA_PREMUL

// If building this program in an environment that doesn't easily accommodate
// relative includes, you can use the script/inline-c-relative-includes.go
// program to generate a stand-alone C file.
//#include "wuffs-v0.4.c"
//#include "wuffs-v0.3.c"
#include "wuffs-unsupported-snapshot.c"

//static std::string decode()
cv::Mat ncv::read_png(const std::string filename)
{
  // Call wuffs_aux::DecodeImage, which is the entry point to Wuffs' high-level
  // C++ API for decoding images. This API is easier to use than Wuffs'
  // low-level C API but the low-level one (1) handles animation, (2) handles
  // asynchronous I/O, (3) handles metadata and (4) does no dynamic memory
  // allocation, so it can run under a `SECCOMP_MODE_STRICT` sandbox.
  // Obviously, if you don't need any of those features, then these simple
  // lines of code here suffice.
  //
  // This example program doesn't explicitly use Wuffs' low-level C API but, if
  // you're curious to learn more, the wuffs_aux::DecodeImage implementation in
  // internal/cgen/auxiliary/*.cc uses it, as does the example/convert-to-nia C
  // program. There's also documentation at doc/std/image-decoders.md
  //
  // If you also want metadata like EXIF orientation and ICC color profiles,
  // script/print-image-metadata.cc has some example code. It uses Wuffs'
  // low-level API but it's a C++ program to use Wuffs' shorter convenience
  // methods: `decoder->decode_frame_config(NULL, &src)` instead of C's
  // `wuffs_base__image_decoder__decode_frame_config(decoder, NULL, &src)`.
  std::ifstream file(filename, std::ios::binary | std::ios::ate);
  if (!file.is_open())
  {
    std::cerr << "failed to open file " << filename << "\n";
    return cv::Mat();
  }
  std::streampos filesize = file.tellg();
  file.seekg(0, std::ios::beg);
  std::vector<char> buffer(filesize);
  if (!file.read(buffer.data(), filesize))
  {
    std::cerr << "error: could not read file content.\n";
    return cv::Mat();
  }
  file.close();

  wuffs_aux::DecodeImageCallbacks callbacks;
  wuffs_aux::sync_io::MemoryInput input(buffer.data(), buffer.size());
  wuffs_aux::DecodeImageResult result =
      wuffs_aux::DecodeImage(callbacks, input);
  if (!result.error_message.empty()) {
    std::cerr << "error: " << result.error_message << "\n";
    return cv::Mat();
  }
  // If result.error_message is empty then the DecodeImage call succeeded. The
  // decoded image is held in result.pixbuf, backed by memory that is released
  // when result.pixbuf_mem_owner (a std::unique_ptr) is destroyed. In this
  // example program, this happens at the end of this function.

  wuffs_base__table_u8 table = result.pixbuf.plane(0);
  //printf("table: %p, %zu, %zu, %zu\n", table.ptr, table.width, table.height, table.stride);

  // print result.pixbuf.pixcfg
//   printf("bpp: %d\n", result.pixbuf.pixcfg.pixel_format().bits_per_pixel());
//   printf("human readable: height=%zu, width=%zu, channel=%zu\n", 
//     result.pixbuf.pixcfg.height(),
//     result.pixbuf.pixcfg.width(),
//     result.pixbuf.pixcfg.pixel_format().bits_per_pixel() / 8
//   );

  cv::Size size;
  size.height = result.pixbuf.pixcfg.height();
  size.width = result.pixbuf.pixcfg.width();
  int channels = result.pixbuf.pixcfg.pixel_format().bits_per_pixel() / 8;
  cv::Mat image(size, CV_8UC(channels));
  std::copy_n(table.ptr, size.width * size.height * channels, image.data);

  return image;
}





int main()
{
    std::cout << "OpenCV version (runtime): " << cv::getVersionString() << std::endl;

    //const std::string filename = "/Users/zz/data/peppers.png";
    const std::string filename = "/Users/zz/data/ASRDebug_0_7680x4320.png";
    cv::Mat src2;
    {
        birch::AutoTimer timer1("cv::imread");
        src2 = cv::imread(filename);
    }
    printf("src2: rows=%d, cols=%d\n", src2.rows, src2.cols);
    //cv::imwrite("result2.png", src2);

    cv::Mat src1;
    {
        birch::AutoTimer timer1("ncv::read_png");
        src1 = ncv::read_png(filename);
    }
    //cv::imwrite("result1.png", src1);
    printf("src1: rows=%d, cols=%d\n", src1.rows, src1.cols);

    std::cout << cv::getBuildInformation() << std::endl;

    return 0;
}
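Note that `ncv::read_png` copies the decoded pixels into the `cv::Mat` with a single `std::copy_n` of `width * height * channels` bytes, which is only correct when rows are tightly packed, that is, when `table.stride == width * channels`. If that assumption does not hold, a row-by-row copy is needed. A minimal sketch of the idea, using a hypothetical stand-alone helper `copy_strided_rows` (an illustration only, independent of both Wuffs and OpenCV):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (an assumption for illustration): copies `height` rows
// of `row_bytes` bytes each from a source buffer whose rows start
// `src_stride` bytes apart into a tightly packed destination vector.
std::vector<uint8_t> copy_strided_rows(const uint8_t* src, size_t src_stride,
                                       size_t row_bytes, size_t height) {
  // The stride must be at least as large as the payload of one row.
  assert(src_stride >= row_bytes);
  std::vector<uint8_t> dst(row_bytes * height);
  for (size_t y = 0; y < height; y++) {
    // Skip any padding bytes at the end of each source row.
    std::copy_n(src + y * src_stride, row_bytes, dst.data() + y * row_bytes);
  }
  return dst;
}
```

In the code above, the corresponding inputs would be `table.ptr`, `table.stride`, `size.width * channels` and `size.height`. This copy also runs inside the timed region, so it is part of the measured 370 ms.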

zchrissirhcz • Jun 09 '24 07:06