sharp
sharp copied to clipboard
Discussion: Operations and pipelines external API
This "issue" is to gather feedback on potential API improvements that would result in logically splitting existing methods into operations vs options, then allowing image processing pipelines to be constructed from JavaScript.
This would more closely mirror the way libvips works underneath, allowing a cleaner way of exposing more of its features, and make this module more of a complete binding (whilst still providing the bridge from JavaScript <-> V8 <-> libuv-managed threads <-> libvips).
I've no idea right now how much of an API breaker this will turn out to be. At best it will simply complement the existing API. As always there will be a deprecation route for any methods that have to move out of the way.
@LinusU has made some excellent suggestions about this - see https://github.com/lovell/sharp/issues/235#issuecomment-115148217
Reposting my comment here
I'm so used to chaining now that it feels really fluid and natural to me.
That's good for you but I think that it would hold true for even the most awkward api. Use it enough and it'll start feeling natural. Now, I'm not saying that chaining is that bad, but it does have a number of drawbacks.
Having a "pipeline" would be cool but that is not at all supported at the moment. The only one that actually works happens to be the one you specified extract => resize => extract
; and that's only because it's actually preExtract => resize => postExtract
. You can't for example do resize => extract => resize
, or even rotate => extract => rotate
.
Sometimes a chaining api can be a nice way to specify things and that could certainly stay, but I would propose that we do it like this.
1. All the functions that currently only manipulate options
would instead push a new object to the "transform pipeline".
(writing in ES6, because faster)
class Sharp {
constructor (input) {
this.input = input
this.queue = []
this.transformPipeline = []
}
rotate (angle) {
this.transformPipeline.push({ op: 'rotate', angle: angle })
return this
}
// ...
}
export default function sharp (input) {
return new Sharp(input)
}
2. We have a function to run the chained commands
// ...
run (cb) {
this.queue.push({ ops: this.transformPipeline, cb: cb })
this.transformPipeline = []
this.scheduleWork()
}
// ...
3. We also provide a function that runs an entire pipeline.
// ...
transform (ops, cb) {
this.queue.push({ ops: ops, cb: cb })
this.scheduleWork()
}
// ...
Usage example:
let linus = sharp('linus.jpg')
let thumbnailOps = [
{ op: 'resize', size: '100x100' },
{ op: 'normalize' }
]
linus.transform(thumbnailOps, function (err, out) {
if (err) throw err
// out is your image :)
})
4. All of this can be run using the method for paralellization that I posted earlier
// ...
scheduleWork () {
if (this.workScheduled) return
this.workScheduled = true
setImmediate(() => {
let queue = this.queue
this.queue = []
this.workScheduled = false
// Call out to C++ and handle all the lists in `queue` with the input `this.input`
})
}
// ...
Example usage
let linus = sharp('linus.jpg')
let thumbnailOps = [
{ op: 'resize', size: '100x100', type: 'crop' },
{ op: 'normalize' }
]
let iphoneOps = [
{ op: 'resize', size: '800x800', type: 'max' }
]
let desktopOps = [
{ op: 'resize', size: '2048x2048', type: 'max' },
]
function handleImage (err, out) {
if (err) throw err
// out is your image :)
}
linus.transform(thumbnailOps, handleImage)
linus.transform(iphoneOps, handleImage)
linus.transform(desktopOps, handleImage)
Wow that was a bit longer than I originally intended :) What do you guys think?
I understand that this is quite large changes but it's just to get the discussion started. I think that we can accomplish something very great here! :tada:
I think I would be pro chaining, as long as it's stateless.
libvips is supposed to have a functional, or stateless API: you never modify objects, you only ever create new objects. There's no state in the sense that there are no objects whose value changes over time.
sharp has chaining plus state at the moment (I think, hope I have this right). You chain together a set of methods like resize
or crop
, but the methods don't actually do those actions, instead they set fields on a global object for this request. Right at the end, sharp looks at that set of parameters and constructs a libvips pipeline to implement it. In other words, sharp methods set parameters for a fixed pipeline.
My instinct would be to make a JS binding like the Python or Ruby ones, where image.crop(x, y, w, h)
really does call vips_crop()
, and then to build sharp's nice, high-level resizing API on top of that.
This would maybe give you the best of both worlds. You'd get high-level resize operations, but also have the ability to change the pipeline if you needed to.
It would be a fair amount of work, unfortunately :( The Ruby and Python bindings use gobject-introspection to generate the binding automatically and at run time, but sadly (as far as I know) there's no working GOI for node, only for spidermonkey. The simplest approach might be to use a little Python or Ruby to generate the C++ for the methods.
I've done an experimental PHP binding for libvips here:
https://github.com/jcupitt/php-vips-base
It's in C and is about 1500 lines of code for the whole thing. There are no dependencies except libvips, so no gobject-introspection or anything like that.
It's very low-level. There are three functions to make new images from files, buffers and arrays, two writers (to files and buffers), get and set header fields, and call operation.
Call operation is the only tricky one. You can call any vips operation like this:
$result = vips_call("min", $image, ["x" => true, "y" => true, "x_array" => true]);
$result
is an array of output objects. The final array argument is the set of options. Required input arguments go after the operation name. It handles things like conversion from PHP types to vips types, constant expansion, reference counts, and so on.
There's a pure PHP package here that builds on php-vips-base
to try to make a nice API. This is another 800 lines of code.
https://github.com/jcupitt/php-vips
With this thing you can write PHP like:
$im = Vips\ImageClass::newFromFile($argv[1], ["access" => "sequential"]);
$im = $im->crop(100, 100, $im->width - 200, $im->height - 200);
$im = $im->reduce(1.0 / 0.9, 1.0 / 0.9, ["kernel" => "linear"]);
$mask = Vips\ImageClass::newFromArray(
[[-1, -1, -1],
[-1, 16, -1],
[-1, -1, -1]], 8);
$im = $im->conv($mask);
$im->writeToFile($argv[2]);
I think I'd suggest something similar for sharp. Make a low-level thing in C or C++ implementing a general vips_call()
function, then layer some JS on top of that to make it nice to use. Finally, build a specialized resizing framework on top, compatible with the system you have now.
The advantage would be that, apart from a fairly small amount of C/C++ that wouldn't change much, you'd have the whole thing written in JS, making development a lot faster and simpler. And your JS layer could use any part of libvips, so you could add features, like perhaps entropy cropping, without having to add more hard-to-maintain native code.
There's been some movement on the libuv threadpool discussion, with talk of a possible pluggable API - see https://github.com/libuv/leps/pull/4 - I expect a "full" binding may depend upon this.
I'm slowly moving options-as-functions that only apply to one operation to become a parameter of the relevant operation instead, e.g. https://github.com/lovell/sharp/pull/569#issuecomment-246344296
Hello, I've had a go at a full node binding for libvips:
https://github.com/jcupitt/node-vips
It's based on node-ffi, so there's no native code, though obviously node-ffi is native. Node-ffi supports async and libuv and all that, so it should be compatible with sharp.
It's dynamic, so it opens the libvips shared library and uses the libvips introspection facilities to discover operations. It seems to work: it has 60 or so tests which pass cleanly and with no leaks. It can run code like this:
vips = require('vips');
var image = vips.Image.new_from_file('k2.jpg');
var image = vips.Image.new_from_file('k2.jpg', {shrink: 2});
var pixel = Image.black(1, 1).add(value).cast(this.format);
var image = pixel.embed(0, 0, this.width, this.height,
{extend: 'copy'});
image = image.copy({
interpretation: this.interpretation,
xres: this.xres,
yres: this.yres,
xoffset: this.xoffset,
yoffset: this.yoffset
});
image.write_to_file('x.png');
So basically the same API as the php / python / ruby / lua bindings.
Unfortunately, I know nothing about node, so I've probably done lots of things wrong (naming, packaging, etc. etc.). It really needs a node expert to look it over and fix it up.
I plan to do a little bit more work on it, but then leave it for someone else to pick up and get production-ready (hopefully).
Anyway, any feedback would be very welcome.
(oh, and it's 1,300 lines of JS, so not so difficult to work on, I hope)
@jcupitt Hey, that's great John, I'm happy to help see what needs to be done to get this more production-ready and therefore sharp-ready.
Attempting to switch to the async version of ffi might prove tricky as each operation within the same pipeline might be run on different worker threads from the libuv pool, but definitely worth experimenting with.
That's great @lovell, I'm sure most of my code will make you wtf :( please fix anything. I'd be very happy to hand the whole thing over to you, if you'd take it. Sharp would be a good home for it.
As far as I know it should work with async -- operations just add to the pipeline, they don't execute, so it doesn't matter if they are added from different threads. I don't have any detailed knowledge though, of course.
I've finished as much of the TODO on node-vips as I can. It's complete now and seems fast and stable (to me), but I'm no node programmer and it really needs an expert's eye. There are a couple of examples:
https://github.com/jcupitt/node-vips/tree/master/example
I know you're only one man @lovell -- but any feedback regarding this topic? Any idea on perf or memory delta between the approaches?
@asilvas If you're referring to John's node-vips module then I've not yet had a proper chance to look at it so I'm unsure what performance overhead, if any, the use of FFI will add.
I took a look at this the other day. Mainly for the approach suggested by @LinusU. From what I can tell this would be a huge change.
The issues that I've found, that need to be resolved:
- The Options argument in
constructor.js
is a mismatch of options. Some of which are used by multiple commands. - Deprecated functions, like
crop
, will modify past chained items. (Example following). -
pipeline.cc
has operations in an almost random order. With some parts of 1 function taking place before and after other functions.
For # 1.
I believe starting here is a good place to begin. It will let us identify which arguments are used where, and by which functions. Using something like Typescript would also greatly help for static checking...
This will help with the native code, as we will need to separate command options into their own structs. I think this leads nicely into having each command be isolated from each other.
For # 2.
In this example:
sharp(someImage)
.resize(300, 400)
.crop()
Crop will modify the arguments of the resize command. Namely the position
and canvas
options. A few strategies I see around this...
- Break backwards compatibility.
- Have the
crop
function go back through thetransformPipeline
modifying any existingresize
commands. - Have
crop
only modify future calls to resize. Thus also breaking backwards compatibility, but in a lesser extent.
For # 3.
This is a much harder one to crack.
For the resize command we start doing scaling calculations right at the beginning of Execute()
in pipeline.cc
. Including things like 'shrink on load', 'will we rotate this image', 'should we rotate this image now'. We will then do operations like invert, gamma, greyscale, overlay.. Then we will then do the actual resize, followed by other commands. This makes a bit complicated to just isolate the resize function.
I also don't know how much of the order of the code in the PipelineWorker
is that way for performance reasons, or just placed in different spots because that's how it evolved.
The other factor to consider with this is any change of the order of operations in the PipelineWorker
will have an impact of backwards compatibility. Eg. as it stands today invert is always applied before gamma.
Extra...
I did take a quick stab at this refactor, but it quickly got more and more complicated. I think another part of helping to simplify this task, is ensuring the baton
is immutable once passed into the PipelineWorker
. That will help with its random argument reassignments in the function.
Thanks!