Slower performance of JP2KAK in comparison with JP2OpenJPEG.
Expected behavior and actual behavior.
I compiled GDAL with the Kakadu library so that I could read my JP2 files with the JP2KAK driver. Contrary to expectations, JP2KAK is slower than JP2OpenJPEG. I expected two advantages from JP2KAK: it should not need to load the entire JP2 file to read the pixels for the area of interest, and it issues about half as many requests (with the same CONFIG settings in the mapfile).
Steps to reproduce the problem.
I compiled GDAL with Kakadu as described in this ticket: https://github.com/OSGeo/gdal/issues/8417 Then I compiled MapServer against that GDAL build using this Dockerfile:
# syntax=docker/dockerfile:experimental
FROM mariush98/kdu_gdal:test as gdal
FROM gdal as builder
LABEL maintainer Camptocamp "[email protected]"
SHELL ["/bin/bash", "-o", "pipefail", "-cux"]
RUN --mount=type=cache,target=/var/cache,sharing=locked \
--mount=type=cache,target=/root/.cache \
apt-get update \
&& apt-get upgrade --assume-yes \
&& DEBIAN_FRONTEND=noninteractive apt-get install --assume-yes --no-install-recommends bison \
flex python3-lxml libfribidi-dev swig \
cmake librsvg2-dev colordiff libpq-dev libpng-dev libjpeg-dev libgif-dev libgeos-dev libgd-dev \
libfreetype6-dev libfcgi-dev libcurl4-gnutls-dev libcairo2-dev libxml2-dev \
libxslt1-dev python3-dev php-dev libexempi-dev lcov lftp ninja-build git curl \
clang libprotobuf-c-dev protobuf-c-compiler libharfbuzz-dev libcairo2-dev librsvg2-dev \
&& ln -s /usr/local/lib/libproj.so.* /usr/local/lib/libproj.so
ARG MAPSERVER_BRANCH=branch-8-0
ARG MAPSERVER_REPO=https://github.com/mapserver/mapserver
RUN git clone ${MAPSERVER_REPO} --branch=${MAPSERVER_BRANCH} --depth=100 /src
COPY checkout_release /tmp
RUN cd /src \
&& /tmp/checkout_release ${MAPSERVER_BRANCH}
COPY instantclient /tmp/instantclient
ARG WITH_ORACLE=OFF
RUN --mount=type=cache,target=/var/cache,sharing=locked \
--mount=type=cache,target=/root/.cache \
(if test "${WITH_ORACLE}" = "ON"; then \
apt-get update && \
apt-get install --assume-yes --no-install-recommends \
libarchive-tools libaio-dev && \
mkdir -p /usr/local/lib && \
cd /usr/local/lib && \
(for i in /tmp/instantclient/*.zip; do bsdtar --strip-components=1 -xvf "$i"; done) && \
ln -s libnnz19.so /usr/local/lib/libnnz18.so; \
fi )
WORKDIR /src/build
RUN if test "${WITH_ORACLE}" = "ON"; then \
export ORACLE_HOME=/usr/local/lib; \
fi; \
cmake .. \
-GNinja \
-DCMAKE_C_FLAGS="-O2 -DPROJ_RENAME_SYMBOLS" \
-DCMAKE_CXX_FLAGS="-O2 -DPROJ_RENAME_SYMBOLS" \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_PREFIX=/usr/local \
-DWITH_CLIENT_WMS=1 \
-DWITH_CLIENT_WFS=1 \
-DWITH_OGCAPI=1 \
-DWITH_KML=1 \
-DWITH_SOS=1 \
-DWITH_XMLMAPFILE=1 \
-DWITH_CAIRO=1 \
-DWITH_RSVG=1 \
-DUSE_PROJ=1 \
-DUSE_WFS_SVR=1 \
-DUSE_OGCAPI_SVR=1 \
-DWITH_ORACLESPATIAL=${WITH_ORACLE}
RUN ninja install \
&& if test "${WITH_ORACLE}" = "ON"; then rm -rf /usr/local/lib/sdk; fi
FROM gdal as runner
LABEL maintainer Camptocamp "[email protected]"
SHELL ["/bin/bash", "-o", "pipefail", "-cux"]
# Let's copy a few of the settings from /etc/init.d/apache2
ENV APACHE_CONFDIR=/etc/apache2 \
APACHE_ENVVARS=/etc/apache2/envvars \
# And then a few more from $APACHE_CONFDIR/envvars itself
APACHE_RUN_USER=www-data \
APACHE_RUN_GROUP=www-data \
APACHE_RUN_DIR=/tmp/apache2 \
APACHE_PID_FILE=/tmp/apache2/apache2.pid \
APACHE_LOCK_DIR=/var/lock/apache2 \
APACHE_LOG_DIR=/var/log/apache2 \
MS_MAP_PATTERN=^\\/etc\\/mapserver\\/([^\\.][-_A-Za-z0-9\\.]+\\/{1})*([-_A-Za-z0-9\\.]+\\.map)$
RUN --mount=type=cache,target=/var/cache,sharing=locked \
--mount=type=cache,target=/root/.cache \
apt-get update \
&& apt-get upgrade --assume-yes \
&& apt-get install --assume-yes --no-install-recommends ca-certificates apache2 libapache2-mod-fcgid \
libfribidi0 librsvg2-2 libpng16-16 libgif7 libfcgi0ldbl \
libxslt1.1 libprotobuf-c1 libaio1 glibc-tools
RUN a2enmod fcgid headers status \
&& a2dismod -f auth_basic authn_file authn_core authz_user autoindex dir \
&& rm /etc/apache2/mods-enabled/alias.conf \
&& mkdir --mode=go+w --parent ${APACHE_RUN_DIR} ${APACHE_LOCK_DIR} \
&& mkdir --parent /etc/mapserver \
&& chmod o+w /var/lib/apache2/fcgid /var/lib/apache2/fcgid/sock \
&& find "$APACHE_CONFDIR" -type f -exec sed -ri ' \
s!^(\s*CustomLog)\s+\S+!\1 /proc/self/fd/1!g; \
s!^(\s*ErrorLog)\s+\S+!\1 /proc/self/fd/2!g; \
' '{}' ';' \
&& sed -ri 's!LogFormat "(.*)" combined!LogFormat "%{us}T %{X-Request-Id}i \1" combined!g' /etc/apache2/apache2.conf \
&& echo 'ErrorLogFormat "%{X-Request-Id}i [%l] [pid %P] %M"' >> /etc/apache2/apache2.conf \
&& sed -i -e 's/<VirtualHost \*:80>/<VirtualHost *:8080>/' /etc/apache2/sites-available/000-default.conf \
&& sed -i -e 's/Listen 80$/Listen 8080/' /etc/apache2/ports.conf \
&& rm -rf /etc/apache2/conf-enabled/other-vhosts-access-log.conf
EXPOSE 8080
COPY --from=builder /usr/local/bin /usr/local/bin/
COPY --from=builder /usr/local/lib /usr/local/lib/
COPY --from=builder /usr/local/share/mapserver /usr/local/share/mapserver/
COPY --from=builder /src/share/ogcapi/templates/html-bootstrap4 /usr/local/share/mapserver/ogcapi/templates/html-bootstrap4/
COPY runtime /
RUN ldconfig
ENV MS_DEBUGLEVEL=0 \
MS_ERRORFILE=stderr \
MAPSERVER_CONFIG_FILE=/etc/mapserver.conf \
MAPSERVER_BASE_PATH= \
MAX_REQUESTS_PER_PROCESS=1000 \
MIN_PROCESSES=1 \
MAX_PROCESSES=5 \
BUSY_TIMEOUT=300 \
IDLE_TIMEOUT=300 \
IO_TIMEOUT=40 \
APACHE_LIMIT_REQUEST_LINE=8190 \
GET_ENV=env
CMD ["/usr/local/bin/start-server"]
WORKDIR /etc/mapserver
Afterwards, I deployed MapServer on Kubernetes and created two mapfiles that differ in a single line: CONFIG "GDAL_SKIP" "JP2KAK". After each request I deleted the pod and created a new one, so that GDAL would not keep skipping the JP2KAK driver in the next request and so that I could read the full log. Here are the logs: pod_open.log pod_kak.log
And here is the mapfile I'm using:
MAP
NAME "Cloudless Mozaik"
CONFIG "AWS_S3_ENDPOINT" "***"
CONFIG "AWS_ACCESS_KEY_ID" "***"
CONFIG "AWS_SECRET_ACCESS_KEY" "***"
CONFIG "CPL_CURL_VERBOSE" "YES"
CONFIG "AWS_HTTPS" "YES"
CONFIG "AWS_VIRTUAL_HOSTING" "FALSE"
CONFIG "GDAL_HTTP_TCP_KEEPALIVE" "YES"
CONFIG "CPL_VSIL_CURL_CHUNK_SIZE" "10485760"
CONFIG "GDAL_HTTP_UNSAFESSL" "YES"
CONFIG "GDAL_INGESTED_BYTES_AT_OPEN" "16384"
# CONFIG "GDAL_SKIP" "JP2KAK"
CONFIG "CPL_DEBUG" "ON"
DEBUG 5
CONFIG "PROJ_DEBUG" "ON"
EXTENT -20037508.34 -20048966.1 20037508.34 20048966.1
UNITS METERS
SIZE 1024 1024
IMAGETYPE PNG
SHAPEPATH "/data/"
PROJECTION
"init=epsg:3857"
END
WEB
IMAGEPATH "/tmp/"
IMAGEURL "/tmp/"
METADATA
"wms_title" "Cloudless Mozaik"
"wms_onlineresource" "http://64.225.139.136.nip.io/?map=/etc/mapserver/next2.map" #must change mapfile path
"wms_srs" "EPSG:3857 EPSG:4326 EPSG:2180"
"wms_enable_request" "*"
"wms_server_version" "1.3.0"
"wms_feature_info_mime_type" "text/html"
"wms_include_items" "all"
"wms_getcapabilities_version" "1.3.0"
#"wms_allow_getmap_without_styles" "true"
"wms_timeitem" "timestamp"
END
END
OUTPUTFORMAT
NAME "cairopng"
DRIVER CAIRO/PNG
MIMETYPE "image/png"
IMAGEMODE RGB
EXTENSION "png"
END
OUTPUTFORMAT
NAME "GTiff"
DRIVER GDAL/GTiff
MIMETYPE "image/tiff"
IMAGEMODE RGB
EXTENSION "tif"
END
############################################################################
# Tile Index
LAYER
DEBUG 5
STATUS OFF
NAME "time_idx"
TYPE POLYGON
CONNECTIONTYPE postgis
CONNECTION "***"
DATA 'geometry from (select * from s22 order by maxcc desc) as subquerry using unique unique_id using srid=3857'
PROJECTION
"init=epsg:3857"
END
VALIDATION
'maxCC' '^[0-9]{1,3}$'
'default_maxCC' '100'
END
METADATA
"wms_title" "tile-index-cloud"
"wms_timeextent" "2023-11-15/2023-11-15/P1D"
"wms_timeitem" "timestamp"
"wms_timedefault" "2023-11-15"
"wms_enable_request" "!*"
END
PROCESSING "CLOSE_CONNECTION=DEFER"
END
############################################################################
LAYER
DEBUG 5
NAME "S2 Masking"
TYPE RASTER
STATUS OFF
DEBUG ON
PROJECTION
"+init=epsg:3857"
END
METADATA
"wms_timeextent" "2023-11-15/2023-11-15/P1D"
"wms_timeitem" "timestamp"
"wms_timedefault" "2023-11-15"
"wms_enable_request" "*"
END
TILEITEM "location_10"
TILEINDEX "time_idx"
TILESRS "epsg"
#PROCESSING "NODATA=0"
PROCESSING "RESAMPLE=BILINEAR"
PROCESSING "CLOSE_CONNECTION=DEFER"
FILTER (`[timestamp]` = `2023-11-15`)
END
#############################################################################
END
JP2OpenJPEG: [warn] [pid 16] mod_fcgid: stderr: msDrawMap() total time: 4.333s
JP2KAK: [warn] [pid 15] mod_fcgid: stderr: msDrawMap() total time: 9.017s
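To compare the two runs without reading the logs by hand, the total render time can be pulled out of each line with a small script. This is only a sketch: the two sample lines are modelled on the log output quoted above, and the regular expression assumes the "total time: N.NNNs" suffix seen there.

```python
import re

# Sample lines modelled on the mod_fcgid output quoted above.
LOG_LINES = {
    "JP2OpenJPEG": "[warn] [pid 16] mod_fcgid: stderr: msDrawMap() total time: 4.333s",
    "JP2KAK": "[warn] [pid 15] mod_fcgid: stderr: msDrawMap() total time: 9.017s",
}

# Assumption: every line of interest ends with "total time: <seconds>s".
TOTAL_TIME = re.compile(r"total time:\s*([0-9.]+)s")


def total_seconds(line):
    """Return the total time in seconds reported on a log line."""
    match = TOTAL_TIME.search(line)
    if match is None:
        raise ValueError(f"no total time found in: {line!r}")
    return float(match.group(1))


timings = {driver: total_seconds(line) for driver, line in LOG_LINES.items()}
ratio = timings["JP2KAK"] / timings["JP2OpenJPEG"]
print(f"JP2KAK took {ratio:.2f}x the time of JP2OpenJPEG")
```

A single request per driver is a small sample, so the ratio is an indication rather than a benchmark.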
Operating system / MapServer version and installation method
Docker, MapServer version 8.0.1, PROJ version 9.4, GDAL version 3.9, Kakadu 8_3
What kind of JPEG2000 images do you have as source? You probably have kdu_show; what does File - Properties show? Is your image fast to view with kdu_show?
You should try to isolate things a bit. For example, just try with gdal_translate. That said, I've indeed seen (rare) situations where OpenJPEG can be faster than Kakadu. It all depends on the exact JPEG2000 formulation.
root@80e024210320:/kakadu/bin/Linux-x86-64-gcc# ls
kdu_buffered_compress kdu_hyperdoc kdu_merge kdu_stream_expand kdu_v_compress simple_example_c
kdu_buffered_expand kdu_jp2info kdu_render kdu_stream_send kdu_v_expand simple_example_d
kdu_compress kdu_makeppm kdu_server kdu_text_extractor kdu_vcom_fast
kdu_expand kdu_maketlm kdu_server_admin kdu_transcode kdu_vex_fast
Hmm, it seems kdu_show is missing. Is that possible, given that GDAL is using the JP2KAK driver?
Perhaps kdu_show is only for Windows? The name of "kdu_jp2info" feels promising, have you tried it? Alternatively, can you share/make test data?
Yup, I will give it a try in a moment.
By the way, I'm targeting Sentinel-2 L2A TCI composites stored on S3. Here is a sample file (from the official ESA repository), feel free to download it: https://s3.waw3-1.cloudferro.com/swift/v1/demo/T34TET_20231115T094159_TCI_10m.jp2
Hmm, it complains that the libkdu_v83R.so shared library is missing, although it is clearly located in the lib directory:
/kakadu/bin/Linux-x86-64-gcc/kdu_jp2info: error while loading shared libraries: libkdu_v83R.so: cannot open shared object file: No such file or directory
root@80e024210320:~# ls /kakadu/lib/Linux-x86-64-gcc/
libkdu.a libkdu_a83R.so libkdu_aux.a libkdu_jni.so libkdu_v83R.so
I guess GDAL is aware of this because it was configured with:
-DKDU_ROOT=/kakadu/ \
-DKDU_INCLUDE_DIR=/kakadu/ \
-DKDU_LIBRARY=/kakadu/lib/Linux-x86-64-gcc/libkdu_v83R.so \
-DKDU_AUX_LIBRARY=/kakadu/lib/Linux-x86-64-gcc/libkdu_a83R.so \
EDIT: okay, I figured it out with this:
root@80e024210320:~# export LD_LIBRARY_PATH="/kakadu/lib/Linux-x86-64-gcc:$LD_LIBRARY_PATH"
root@80e024210320:~# echo $LD_LIBRARY_PATH
/kakadu/lib/Linux-x86-64-gcc:
root@80e024210320:~# /kakadu/bin/Linux-x86-64-gcc/kdu_jp2info -i /T36UVD_20230701T085601_TCI_10m.jp2
-------------
Kakadu Error:
Input file is neither a raw codestream nor a box-structured file. Not a
JPEG2000 file.
So nothing special in the image. Lossless 8-bit RGB, LRCP progression, with 1024x1024 tiles and includes precincts. Image is fast to use with kdu_show and with QGIS (JP2OpenJPEG).
But I noticed now that you are reading JPEG2000 images directly from the cloud through /vsis3/. If it takes 4 vs 9 seconds to render a map from 2 source images, I guess that about 0.2 seconds of the total is spent in the OpenJPEG or Kakadu libraries and the rest on something else.
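One way to check where the rest of the time goes is to count the HTTP range requests and the bytes fetched in each of the two debug logs (pod_open.log and pod_kak.log). The sketch below assumes that, with CPL_DEBUG enabled, GDAL's curl-based file systems log each fetch as "Downloading START-END (URL)..."; that line format, the sample lines and the example.invalid URL are assumptions, so adjust the pattern to whatever the real logs contain.

```python
import re

# Assumption: each range request appears in the debug log as
# "Downloading <first byte>-<last byte> (<url>)...".
RANGE = re.compile(r"Downloading (\d+)-(\d+) \(")


def summarize_range_requests(log_text):
    """Return (number of range requests, total bytes requested)."""
    count = 0
    total_bytes = 0
    for match in RANGE.finditer(log_text):
        first, last = int(match.group(1)), int(match.group(2))
        count += 1
        total_bytes += last - first + 1  # byte ranges are inclusive
    return count, total_bytes


# Hypothetical sample: a 16384-byte header read followed by one
# 10485760-byte chunk, matching the CONFIG values in the mapfile.
sample_log = """\
S3: Downloading 0-16383 (https://example.invalid/bucket/a.jp2)...
S3: Downloading 16384-10502143 (https://example.invalid/bucket/a.jp2)...
"""

requests, fetched = summarize_range_requests(sample_log)
print(f"{requests} requests, {fetched} bytes")
```

Running this on both pod logs would show whether the slower driver issues more requests, fetches more bytes, or neither, in which case the difference lies elsewhere.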
Yup, it is just as you said:
root@3a4f43fa4b33:/# /kakadu/bin/Linux-x86-64-gcc/kdu_jp2info -i /Desktop/T34TET_20231115T094159_TCI_10m.jp2
<JP2_family_file>
<ftyp name="file-type box" header="8" body="12" pos="12">
<brand> "jp2_" 0x6A703220 </brand>
<minor_version> 0 </minor_version>
<compatible_brand> "jp2_" 0x6A703220 </compatible_brand>
</ftyp>
<jp2h name="JP2-header box" header="8" body="37" pos="32">
<ihdr name="image-header box" header="8" body="14" pos="40"></ihdr>
<colr name="colour box" header="8" body="7" pos="62"></colr>
</jp2h>
<asoc name="association box" header="16" body="2,272" pos="77">
<lbl_ name="label box" header="8" body="8" pos="93"></lbl_>
<asoc name="association box" header="8" body="2,248" pos="109">
<lbl_ name="label box" header="8" body="17" pos="117"></lbl_>
<xml_ name="xml box" header="8" body="2,215" pos="142"></xml_>
</asoc>
</asoc>
<jp2c name="contiguous-codestream box" header="8" body="rubber" pos="2,365">
<codestream>
<width> 10980 </width>
<height> 10980 </height>
<components> 3 </components>
<tiles> 121 </tiles>
</codestream>
</jp2c>
</JP2_family_file>
root@3a4f43fa4b33:/#
I was kind of hoping that Kakadu would allow me to work with JP2 in a COG way (extracting only a sub-element of the overview, so I wouldn't have to download all of the data).
I guess it's just not possible then?
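For reference, the tile arithmetic behind that hope can be sketched as follows. The 10980 x 10980 size and the 121 tiles come from the kdu_jp2info output above, and the 1024 x 1024 tile size from the earlier description of the image; the sketch assumes the tile grid starts at pixel (0, 0), and the example window is made up. It only shows which tiles a pixel window touches. How many bytes a reader must actually fetch to decode them depends on the codestream layout and on the driver, which this does not model.

```python
import math


def tile_grid(width, height, tile_w, tile_h):
    """Return (columns, rows) of the tile grid covering the image."""
    return math.ceil(width / tile_w), math.ceil(height / tile_h)


def tiles_for_window(xoff, yoff, xsize, ysize, tile_w, tile_h):
    """Return (row, col) indices of every tile a pixel window touches.

    Assumes the tile grid origin is at pixel (0, 0).
    """
    first_col = xoff // tile_w
    last_col = (xoff + xsize - 1) // tile_w
    first_row = yoff // tile_h
    last_row = (yoff + ysize - 1) // tile_h
    return [
        (row, col)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]


cols, rows = tile_grid(10980, 10980, 1024, 1024)
print(cols * rows)  # 11 x 11 grid, matching the <tiles> value above

# Hypothetical 1024 x 1024 pixel window that is not aligned to the grid.
touched = tiles_for_window(2000, 3000, 1024, 1024, 1024, 1024)
print(touched)
```

An unaligned window of one tile's size touches four tiles here, so even in the best case a reader needs a few of the 121 tiles rather than exactly one.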
You are using JPEG2000 in a way that the format is not designed for. If OpenJPEG is faster I would use that. Remote access was planned to happen through the JPIP protocol. But if you have enough interest you may try to reach David Taubman and ask what he thinks about your use case. A long time ago there used to be a user group in Yahoo but all that I found about the community support now is this:
Kakadu Software does not provide specific support however technical support can be found at the blog https://htj2k.com/technical-forum/
Who knows if the JPEG2000 folks will recognize the need for cloud-friendly access and develop a new variant for that. They have already made HTJ2K.
Following the suggestion to isolate things with gdal_translate, I tried:
OPEN
root@3a4f43fa4b33:/# time gdal_translate -projwin 2447696.08356068469583988 7018564.54120731819421053 2464081.17749316245317459 7010831.97032264340668917 -projwin_srs EPSG:3857 -of MEM --config GDAL_NUM_THREADS 1 --config GDAL_HTTP_TCP_KEEPALIVE YES --config AWS_S3_ENDPOINT ******* --config AWS_ACCESS_KEY_ID ******* --config AWS_SECRET_ACCESS_KEY ******* --config AWS_HTTPS YES --config AWS_VIRTUAL_HOSTING FALSE --config GDAL_SKIP JP2KAK /vsis3/DIAS/Sentinel-2/MSI/L2A/2023/05/12/S2B_MSIL2A_20230512T094549_N0509_R079_T34UED_20230512T112355.SAFE/GRANULE/L2A_T34UED_A032280_20230512T094549/IMG_DATA/R10m/T34UED_20230512T094549_TCI_10m.jp2 /dev/null
Input file size is 10980, 10980
0...10...20...30...40...50...60...70...80...90...100 - done.
real 0m4.330s
user 0m1.286s
sys 0m0.231s
KAK
root@193596c77daf:/# time gdal_translate -projwin 2447696.08356068469583988 7018564.54120731819421053 2464081.17749316245317459 7010831.97032264340668917 -projwin_srs EPSG:3857 -of MEM --config GDAL_NUM_THREADS 1 --config GDAL_HTTP_TCP_KEEPALIVE YES --config AWS_S3_ENDPOINT ******* --config AWS_ACCESS_KEY_ID ******* --config AWS_SECRET_ACCESS_KEY ******* --config AWS_HTTPS YES --config AWS_VIRTUAL_HOSTING FALSE /vsis3/DIAS/Sentinel-2/MSI/L2A/2023/05/12/S2B_MSIL2A_20230512T094549_N0509_R079_T34UED_20230512T112355.SAFE/GRANULE/L2A_T34UED_A032280_20230512T094549/IMG_DATA/R10m/T34UED_20230512T094549_TCI_10m.jp2 /dev/null
Input file size is 10980, 10980
0...10...20...30...40...50...60...70...80...90...100 - done.
real 0m2.949s
user 0m1.322s
sys 0m0.115s
Here Kakadu seems to do better than OpenJPEG.
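Since each driver was timed only once and the reads go over the network, repeating the comparison a few times would make it more convincing. The following is a minimal sketch of such a harness, not a definitive benchmark: the function names are made up for illustration, the gdal_translate options are the ones used in the commands above, and it assumes the S3 endpoint and credentials are supplied through environment variables rather than --config options.

```python
import subprocess
import time


def build_command(source, projwin, skip_driver=None):
    """Build a gdal_translate call like the ones above.

    skip_driver, if given, is passed as GDAL_SKIP so that another
    driver (for example JP2OpenJPEG) opens the file instead.
    """
    cmd = ["gdal_translate", "-of", "MEM"]
    cmd += ["-projwin", *[str(value) for value in projwin]]
    cmd += ["-projwin_srs", "EPSG:3857"]
    cmd += ["--config", "GDAL_NUM_THREADS", "1"]
    if skip_driver:
        cmd += ["--config", "GDAL_SKIP", skip_driver]
    cmd += [source, "/dev/null"]
    return cmd


def time_runs(cmd, repeats=5):
    """Run the command several times and return wall-clock durations."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations
```

Calling time_runs(build_command(path, window)) and time_runs(build_command(path, window, skip_driver="JP2KAK")) with the same /vsis3/ path and window, then comparing the median of each list, gives a fairer picture than one run per driver.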
@jratike80 @rouault are there any known implementations of a JPIP server as a data source in MapServer? If so, is there any way of combining JPIP with S3 as the data storage system and the tile-index approach?
Maybe someone tried before me to combine these?
Sorry, my knowledge is very outdated. I ran a JPIP server myself back when Kakadu had a different license, and another JPIP server made by Kodak when the company still had a space division. It was an awfully long time ago; I believe Kodak sold the division in 2004.
Hi,
Maybe the following script will help you find the problem. In my specific case, MapServer also performs a transformation (why?). GDAL and MapServer are configured with the JP2KAK driver. Maybe someone can run this script with the JP2OpenJPEG driver:
curl -sSLO https://www.opengeodata.nrw.de/produkte/geobasis/lusat/akt/dop/dop_jp2_f10/dop10rgbi_32_280_5653_1_nw_2021.jp2
cat <<EOF > mapserv.conf
CONFIG
END
EOF
cat <<EOF > jp2kak_test.map
MAP
LAYER
NAME "raster_test"
TYPE RASTER
STATUS ON
DATA "dop10rgbi_32_280_5653_1_nw_2021.jp2"
END
END
EOF
map2img -m jp2kak_test.map -conf mapserv.conf -i GTiff -e 280000 5653000 281000 5654000 -s 10000 10000 -all_debug 5 raster_test > /dev/null
msLoadMap(): 0.000s
msDrawMap(): rendering using outputformat named GTiff (GDAL/GTiff).
msDrawMap(): WMS/WFS set-up and query, 0.000s
msDrawRasterLayerLow(raster_test): entering.
msDrawRasterLayerLow(raster_test): Filename is: dop10rgbi_32_280_5653_1_nw_2021.jp2
msDrawRasterLayerLow(raster_test): Path is: /home/user/./dop10rgbi_32_280_5653_1_nw_2021.jp2
msDrawRasterLayerGDAL(): Entering transform.
msDrawRasterLayerGDAL(): src=0,0,10000,10000, dst=0,0,9999,9999
msDrawRasterLayerGDAL(): source raster PL (-0.500,-0.500) for dst PL (0,0).
msDrawRasterLayerGDAL(): red,green,blue,alpha bands = 1,2,3,0
msDrawMap(): Layer 0 (raster_test), 1.410s
msDrawMap(): Drawing Label Cache, 0.000s
msDrawMap() total time: 1.502s
msSaveImage(stdout) total time: 0.391s
msFreeMap(): freeing map at 0x19080b0.
freeLayer(): freeing layer at 0x190d4d0.
map2img total time: 1.896s
time gdal_translate dop10rgbi_32_280_5653_1_nw_2021.jp2 /dev/null -of GTIFF
Input file size is 10000, 10000
real 0m0.036s
user 0m0.030s
sys 0m0.007s
@MathewNWSH, for your sample file T34TET_20231115T094159_TCI_10m.jp2 I get a similar transform output:
msDrawRasterLayerGDAL(): Entering transform.
msDrawRasterLayerGDAL(): src=0,0,10980,10980, dst=0,0,10000,10000
msDrawRasterLayerGDAL(): source raster PL (-0.549,-0.549) for dst PL (0,0).
msDrawRasterLayerGDAL(): red,green,blue,alpha bands = 1,2,3,0
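As an observation on the two transform logs: the logged "source raster PL" offsets equal minus half the ratio between source and destination size (10000 / 10000 for the first file, 10980 / 10000 for the second). The sketch below only reproduces that arithmetic. Whether MapServer derives the value this way is an assumption, and the function name is made up for illustration.

```python
def source_offset_for_first_pixel(src_size, dst_size):
    """Source coordinate matching destination coordinate 0, assuming
    a plain rescale with a half-pixel shift (an assumption, not taken
    from the MapServer source code)."""
    scale = src_size / dst_size
    return -0.5 * scale


# First log: 10000 source pixels drawn into a 10000 pixel wide map.
print(round(source_offset_for_first_pixel(10000, 10000), 3))

# Second log: 10980 source pixels drawn into a 10000 pixel wide map.
print(round(source_offset_for_first_pixel(10980, 10000), 3))
```

If this reading is right, the second case implies a resampling from 10980 to 10000 pixels, while the first is a 1:1 copy that still passes through the transform code path.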