php-vips

add writeToArray (was: is find_trim slow?)

Open MastroBoy opened this issue 6 years ago • 8 comments

Hi @jcupitt,

I would like to ask why find_trim is so slow in my tests compared to the ImageMagick equivalent.

Environment: Ubuntu 16, PHP 7.2, vips headers 8.8.1, vips library 8.8.1, gcc 9

Source file: a big black PNG with transparency (88 KB, 4900x2300). It's a sort of 2D height field from which I need to take some height values. I don't need to save it or any part of it.

I need to split the image into N vertical strips (N = 1000) and get the height of each one. What I am doing is:

$im = Vips\Image::newFromFile($file);
$im = $im->extract_band(3);

// one iteration per vertical strip, N = 1000
for ($i = 0; $i < 1000; $i++) {
    $cloned = $im->crop($i * $stripeWidth, 0, $stripeWidth, $im->height);
    $trimArea = $cloned->find_trim(['threshold' => 0, 'background' => 0]);
}

This way I am getting the right results, but it is taking a lot of time:

libvips: 9.35 sec, ImageMagick: 1.80 sec

I guess I am doing something wrong. Would you be so kind as to help me, please?

Thanks in advance, Marco

MastroBoy avatar Aug 26 '19 22:08 MastroBoy

You could try to load the main image in streaming mode (['access' => 'sequential']) and copy the cropped parts ($cloned) into a memory image (->copyMemory()), then from that, search for the trim area ($trimArea).

For example (untested):

$im = Vips\Image::newFromFile($file, ['access' => 'sequential']);
$im = $im->extract_band(3);

for ($i = 0; $i < 1000; $i++) {
    $cloned = $im->crop($i * $stripeWidth, 0, $stripeWidth, $im->height);
    $trimArea = $cloned->copyMemory()->find_trim(['threshold' => 0, 'background' => 0]);
}

kleisauke avatar Sep 16 '19 17:09 kleisauke

Hello Marco, I'm very sorry for not replying, this accidentally dropped off my to-do list.

IM and libvips are computing different things here. The libvips find_trim operator is doing quite a bit of filtering so that it will work correctly with noisy photographic images, while your IM example is probably just searching for non-zero pixels.

I guess you want to search down from the top of each strip for the first row with a non-zero pixel, is that right?

The quickest way to do this is to use project to do a horizontal sum, then use profile to find the first non-zero.

Suppose you have a tall, thin image, 10 pixels across and 1000 pixels down. project will calculate two images: a 10 x 1 image where each pixel is the sum of the corresponding column, and a 1 x 1000 image where each pixel is the sum of the corresponding row. You want the second, row sum image.

profile is similar: if you give it a M x N image it'll calculate a M x 1 image where each pixel is the index of the first non-zero pixel in that column, and a 1 x N image where each pixel is the index of the first non-zero pixel in that row.

Put the two together and you can find your trim area quickly. Flip the row sum up-down and profile a second time to trim upwards.
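To make the two steps concrete, here is a minimal sketch in plain Python rather than libvips calls. The helpers project and first_nonzero are hypothetical stand-ins that only model what the paragraphs above describe; returning the length when there is no non-zero entry is an assumption of this sketch, not something taken from the libvips documentation.

```python
def project(image):
    """Return (column_sums, row_sums) for a 2D list of numbers."""
    column_sums = [sum(col) for col in zip(*image)]
    row_sums = [sum(row) for row in image]
    return column_sums, row_sums


def first_nonzero(values):
    """Index of the first non-zero entry (assumption: len(values) if none)."""
    for index, value in enumerate(values):
        if value != 0:
            return index
    return len(values)


# a strip 3 pixels across and 6 pixels down; rows 2, 3 and 4 hold non-zero pixels
strip = [
    [0, 0, 0],
    [0, 0, 0],
    [0, 9, 0],
    [7, 9, 0],
    [7, 0, 0],
    [0, 0, 0],
]

column_sums, row_sums = project(strip)

# search down from the top of the row sums
top = first_nonzero(row_sums)

# flip the row sums up-down and search again to trim upwards
bottom = len(row_sums) - first_nonzero(row_sums[::-1])

print(f"top = {top}, bottom = {bottom}, trimmed height = {bottom - top}")
```

Here bottom is the index one past the last non-zero row, so bottom - top is the trimmed height of the strip.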

jcupitt avatar Sep 17 '19 09:09 jcupitt

I had a quick go in pyvips (just because I have it set up on this laptop, it's almost the same in php):

#!/usr/bin/python3

import sys
import pyvips

im = pyvips.Image.new_from_file(sys.argv[1])

# just use alpha
im = im[3]

n_strips = 1000
strip_width = im.width // n_strips

for x in range(0, im.width, strip_width):
    strip = im.crop(x, 0, strip_width, im.height)
    
    # find the row and col sums
    col_sum, row_sum = strip.project()
    
    # search down from the top of the row sums for the first non-zero pixel
    col_profile, row_profile = row_sum.profile()
    top = col_profile(0, 0)[0]
    
    # flip the sums up-down and search again
    col_profile, row_profile = row_sum.flipver().profile()
    bottom = im.height - col_profile(0, 0)[0]
    
    print(f"strip at {x}: top = {top}, bottom = {bottom}")

And made a test image:

[image: heightmap]

Is that the kind of image you are searching?

That py program is rather slow. It would probably be quicker to run profile on the whole of the alpha, then slice into strips.

If you can provide a sample image and a complete runnable benchmark I could try again.

jcupitt avatar Sep 17 '19 10:09 jcupitt

Hello @jcupitt and @kleisauke,

Thanks so much for your efforts. The aim of my script (part of a bigger non-proportional sizing algorithm) is to divide the image you see below into vertical strips. For each extracted sub-area I need its coordinates (x, y, width and trimmed height).

I will post my code here:

$start = microtime(true);
$file = realpath('test.png');
$numSegments = 1000;
$im = Vips\Image::newFromFile($file);
// here i am making a transparent border of 1 pixel around the input image
$im = $im->embed(1, 1, $im->width+2, $im->height+2, ['background' => '0 0 0 0']);
$im = $im->extract_band(3);
$width = $im->width;
$height = $im->height;
$stripeWidth = ($width / $numSegments);

$coords = array();

for ($i=0; $i < $numSegments; $i++) {
  $cloned = $im->extract_area($i * $stripeWidth, 0, $stripeWidth, $im->height);
  $trimArea = $cloned->find_trim();
  $cloned = $cloned->crop($trimArea["left"],$trimArea["top"],$trimArea["width"],$trimArea["height"]);
  $coords[] = array(
    'x' => $i * $stripeWidth,
    'y' => $im->height - $cloned->height,
    'w' => $cloned->width,
    'h' => $cloned->height,
  );
  $cloned = null;
}

$time_elapsed_secs = microtime(true) - $start;
echo 'time elapsed for file is '.$time_elapsed_secs.'<br>';

This is the image I am reading from:

[image: sample]

Thank you so much for your time! Marco

ps. "height field" was not the most correct word to use sorry ^_^'

MastroBoy avatar Sep 17 '19 11:09 MastroBoy

Ah, OK. Yes, you need to run profile on the whole alpha. In python:

#!/usr/bin/python3
  
import sys
import pyvips
import numpy as np

im = pyvips.Image.new_from_file(sys.argv[1], access="sequential")

# just use alpha
im = im[3]

# search down from the top for the first non-zero pixel in each column
col_profile, _ = im.profile()

# write out as a memory array, then wrap a python array around that
mem = col_profile.write_to_memory()
top = np.ndarray(buffer=mem, dtype=np.int32, shape=im.width)

# for each vertical strip, find the lowest value for top
n_strips = 1000
strip_width = im.width // n_strips
for left in range(0, im.width, strip_width):
    print(f"left = {left}, top = {min(top[left:left + strip_width])}")

On this laptop for your sample image I see:

$ time ./trim7.py ~/pics/bump.png 
real	0m0.695s
user	0m0.945s
sys	0m0.275s

Unfortunately, php-vips does not have an image -> array function (writeToArray?) so I think you'd need to read out pixels one at a time, which is extremely slow.

Something like writeToArray would be necessary for this to work well, I think. Let's tag this issue as an enhancement.

jcupitt avatar Sep 17 '19 12:09 jcupitt

Got it! thank you a lot for your support @jcupitt, much appreciated :)

MastroBoy avatar Sep 17 '19 18:09 MastroBoy

Let's leave this open as an enhancement issue.

jcupitt avatar Sep 17 '19 20:09 jcupitt

I added writeToArray. You'll need the latest vips extension (pecl install vips) and php-vips 1.0.5.

This program now works:

#!/usr/bin/php 
<?php

require __DIR__ . '/vendor/autoload.php';
use Jcupitt\Vips;

$im = Vips\Image::newFromFile($argv[1], ["access" => "sequential"]);

# just use alpha
$im = $im[3];
  
# search down from the top for the first non-zero pixel in each column
$column_profile = $im->profile()["columns"];

# turn into an array ... so $im->width numbers
$arr = $column_profile->writeToArray();

# for each strip, find the lowest value for top
$n_strips = 1000;
$strip_width = intval($im->width / $n_strips);
for ($x = 0; $x < $im->width; $x += $strip_width) {
  echo("left = " . $x .  ", " .
       "top = " . min(array_slice($arr, $x, $strip_width)) . "\n");
} 

And should be pretty quick.
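The program above reports only the top of each strip. As a rough sketch of how the original request (x, y, width and trimmed height per strip) could be assembled from two such arrays, here is a plain-Python version. The function strip_boxes and the sample numbers are made up for illustration; bottom_gap is assumed to be the column profile of a vertically flipped copy of the alpha, and empty columns are assumed to report the full image height.

```python
def strip_boxes(top, bottom_gap, image_height, n_strips):
    """Turn per-column profiles into one (x, y, w, h) box per strip.

    top[i]        distance from the top edge to the first non-zero pixel
    bottom_gap[i] the same distance measured on the vertically flipped image
    """
    # guard against a zero step when there are more strips than columns
    strip_width = max(1, len(top) // n_strips)
    boxes = []
    for x in range(0, len(top), strip_width):
        columns = slice(x, x + strip_width)
        y = min(top[columns])
        bottom = image_height - min(bottom_gap[columns])
        width = len(top[columns])
        # an empty strip has bottom < y, so clamp its height to zero
        boxes.append((x, y, width, max(0, bottom - y)))
    return boxes


# made-up profiles for an image 6 pixels across and 10 pixels down;
# the last two columns are empty (assumption: reported as the image height)
top = [4, 3, 5, 5, 10, 10]
bottom_gap = [0, 0, 1, 2, 10, 10]

for box in strip_boxes(top, bottom_gap, image_height=10, n_strips=3):
    print(box)
```

Both inputs are ordinary lists of integers, which is the shape writeToArray returns for a one-band profile image, so the same loop should translate to PHP with array_slice and min.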

jcupitt avatar Sep 26 '19 15:09 jcupitt