We are exploring parallel tracks for cloud-based MapKnitter exporting, and one option is a JavaScript-based process.

The base idea is to run the export process as a scalable web service, possibly "serverless" or REST, in Google Cloud and/or other cloud providers like Amazon AWS Lambda (primarily Google Cloud but compatible with others). Comments/suggestions/eurekas welcome! 🎉

Importantly, either track would ideally present the same API so that we could compare their performance.

JavaScript track

In this track, more experimentally, we'd use Image Sequencer, possibly with the webgl-distort library.

The major challenges here, I'd guess, would be:

handling very big image files (up to 8 MB each?) in memory
serious speed improvements in IS, such as the proposed WebAssembly or WebGL adapters
figuring out the best way to persist images for later access, and how to integrate the exporter with this (passing a callback function to upload them to a given store? Credentials?)
trying to duplicate or integrate GDAL's generation of a giant combined GeoTIFF (just really huge images to manage in memory?)
trying to duplicate or integrate GDAL's generation of TMS-formatted map tiles

For these last two, see #296 where there are some JS options to experiment with.

Also, we would try to develop this track in such a way as to make it possible to run locally in the browser, natively or in an Electron-style local JS app.

Ruby/ImageMagick/GDAL track

A more traditional approach is being explored here: #258, where we take the exporting sections currently featured in MapKnitter, and duplicate them in a minimal Ruby container that can be run on-demand.

Spec

To guide the development of both tracks, we're imagining a basic common behavior of:

receiving a collection of image URLs or data-URLs of images AND a scale (cm/px or final pixel size)
outputting a combined JPG image at a given scale or pixel size
advanced versions might cut tiles or output GeoTiffs (see challenges in JS version above)

Links and resources are being compiled here: https://github.com/publiclab/mapknitter/issues/296

What have I missed? @tech4GT @icarito would you mind adding any questions, clarifications?

Update: diagrams

I've put together a diagram of the current exporter workflow, which I hope is helpful. It's also largely ported into a standalone Ruby library in #341 -- soon to potentially be a Gem:

[Image: diagram of the current exporter workflow]

Image Sequencer should allow us to parallelize this, and improve its speed, as illustrated in this diagram:

[Image: diagram of the planned, parallelized Image Sequencer export workflow]

Jan 16 '19 16:01 jywarren

@jywarren this looks really nice! Things immediately make a lot more sense!🎉

Jan 16 '19 17:01 tech4GT

I've just spent some time deploying a learning project to Google Cloud Platform (App Engine) as a Docker container. I've got a better understanding now of what is required! Thanks!

Jan 22 '19 07:01 icarito

Awesome. I have a more system-wide planning issue drafted and will post soon. But these container tests can start whenever you both are ready. Sebastian, do you think getting the GDAL and ImageMagick containers working will be pretty straightforward too?

Thanks again!

Jan 22 '19 14:01 jywarren

OK, i'd like to add in an overview of the export system step by step; i've left notes where we might make changes or improvements as well, and will link to lines of code where these things currently happen!

Also -- a couple ideas:

Idea: produce separate GeoTiffs to skip the
Idea: produce TMS tiles from any collection of images, given tile coordinates and image sources (with known corner coordinates)
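
To make the TMS idea a bit more concrete, here is a minimal sketch of working out which tiles an image touches from its corner coordinates. It uses the standard Web Mercator tile formula and flips the y axis for TMS; the function names and the `[lon, lat]` corner format are assumptions for illustration, not anything that exists in MapKnitter or Image Sequencer today.

```javascript
// Largest latitude representable in Web Mercator.
const MAX_LAT = 85.05112878;

// Convert a WGS84 longitude/latitude to TMS tile indices at a zoom level.
// XYZ ("slippy map") tiles count y from the top; TMS counts from the bottom.
function lonLatToTmsTile(lon, lat, zoom) {
  const n = 2 ** zoom; // number of tiles along each axis
  const clampedLat = Math.max(-MAX_LAT, Math.min(MAX_LAT, lat));
  const latRad = (clampedLat * Math.PI) / 180;
  const x = Math.min(n - 1, Math.floor(((lon + 180) / 360) * n));
  const yXyz = Math.min(
    n - 1,
    Math.floor(
      ((1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2) * n
    )
  );
  return { x, y: n - 1 - yXyz, z: zoom };
}

// List every tile in the bounding box of an image's corners.
// `corners` is assumed to be an array of [lon, lat] pairs.
function tilesForCorners(corners, zoom) {
  const tiles = corners.map(([lon, lat]) => lonLatToTmsTile(lon, lat, zoom));
  const xs = tiles.map((t) => t.x);
  const ys = tiles.map((t) => t.y);
  const result = [];
  for (let x = Math.min(...xs); x <= Math.max(...xs); x++) {
    for (let y = Math.min(...ys); y <= Math.max(...ys); y++) {
      result.push({ x, y, z: zoom });
    }
  }
  return result;
}
```

A bounding box over-counts tiles for a strongly rotated image, so a real version would probably want to test each tile against the warped quadrilateral.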

Breaking down the export process

Separable steps

1. collect set of image URLs and their corner coordinates
2. for each image (could do this from existing Ruby code or in npm module):
   - determine image pixel dimensions
   - convert corner coordinates to pixel positions
3. for each image (using existing Ruby/ImageMagick code or in remote Image Sequencer container):
   - current code at https://github.com/publiclab/mapknitter/blob/main/app/models/warpable.rb#L153-L336 generate_perspectival_distort
   - perform image distortion from original dimensions to new pixel corner positions
   - embed exif data for corner coordinates
   - save and return URL for download
   - (optional, could do later) produce GeoTiffs of each image
   - (optional, could do later) produce TMS tileset of each image
4. given collection of warped images, calculate pixel positions of image collection relative to each other (Ruby code exists)
   - (optional alternative) produce SVG or PDF containing images at relative positions (less memory use)
   - current code appears in https://github.com/publiclab/mapknitter/blob/main/app/models/map.rb#L231 in run_export, distort_warpables, generate_composite_tiff, generate_tiles, generate_jpg
   - produce composite/merged image using this data
   - save and return URL of combined image for download
   - (optional) produce GeoTiff of combined image
   - (optional) pass GeoTiff to GDAL for conversion into traditional TMS tileset

Possible next steps:
- produce merged TMS of step 3 per-image TMS tiles instead of generating from step 4's giant GeoTiff
- produce single TMS from combination of per-image GeoTiffs from end of step 3
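
As a rough illustration of the "convert corner coordinates to pixel positions" step, here is a minimal sketch that maps lat/lon corners to pixel offsets at a given cm/px scale. It uses a simple flat-earth approximation (about 111,320 m per degree of latitude, scaled by the cosine of the latitude for longitude), which is an assumption made here for brevity and is not how the existing Ruby code does it. The function name and the `{ lat, lon }` corner format are also hypothetical.

```javascript
// Approximate length of one degree of latitude in meters (assumption:
// a spherical, locally flat earth, good enough for small map extents).
const METERS_PER_DEGREE = 111320;

// Convert corner coordinates to pixel positions relative to the
// north-west corner of their bounding box.
// corners: array of { lat, lon }; cmPerPx: output scale in cm per pixel.
function cornersToPixels(corners, cmPerPx) {
  const north = Math.max(...corners.map((c) => c.lat));
  const west = Math.min(...corners.map((c) => c.lon));
  const midLat = corners.reduce((sum, c) => sum + c.lat, 0) / corners.length;
  // A degree of longitude shrinks with the cosine of the latitude.
  const metersPerDegLon = METERS_PER_DEGREE * Math.cos((midLat * Math.PI) / 180);
  const pxPerMeter = 100 / cmPerPx;
  return corners.map((c) => ({
    x: Math.round((c.lon - west) * metersPerDegLon * pxPerMeter),
    y: Math.round((north - c.lat) * METERS_PER_DEGREE * pxPerMeter),
  }));
}

// Example: a roughly 111 m square near the equator at 100 cm/px (1 m/px).
const examplePixels = cornersToPixels(
  [
    { lat: 0.001, lon: 0 },
    { lat: 0.001, lon: 0.001 },
    { lat: 0, lon: 0.001 },
    { lat: 0, lon: 0 },
  ],
  100
);
console.log(examplePixels);
```

The same function could feed both the per-image distortion (new corner positions) and the later compositing step (relative offsets), if all images share one bounding box.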

Jan 28 '19 17:01 jywarren

@SidharthBansal @tech4GT @icarito just so you see this additional note breaking down the export process. There are portions that could be accomplished with traditional ImageMagick/GDAL combo just breaking out the Ruby-controlled code in our codebase (see #296 but i'll copy in more here), but I am hoping we can accomplish a lot in stand-alone containers in a serverless or at least remote REST model.

Jan 28 '19 20:01 jywarren

Starting work now! @icarito Can you please share some of the resources you have been going through, that would be a big help for me :)

Feb 07 '19 06:02 tech4GT

@jywarren @icarito I'll start with a basic Express configuration that takes an image URL and a sequencer string and returns the final output. Can we create a repository for this on publiclab, or should I make it on my GitHub?

Feb 07 '19 07:02 tech4GT

Okay a couple of things here

We should add a flag to the run config which allows us to disable the progress logs (it'll unnecessarily slow down the server otherwise)
I have some ideas in mind to speed up the pixelManipulation API which will in turn speed up most modules
Should we return the output as a data URI or use the imgur service like we originally planned? Or maybe we can have a parameter in the request which allows both options

/* Request Body */
{
  'url': <String>,      // URL of the input image
  'sequence': <String>, // The sequence string which will be imported into sequencer
  'upload': <Boolean>   // Whether to upload to imgur and return that URL instead of returning the data URI
}

How does this sound @jywarren @icarito ??
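
To illustrate, a minimal sketch of how the server might validate that request body before handing it to the sequencer. The function name `parseRequestBody` and the choice to default `upload` to `false` are assumptions, and the sequence string in the example is only a placeholder, not a real Image Sequencer sequence.

```javascript
// Validate and normalize the proposed request body.
// Throws a TypeError describing the first problem found.
function parseRequestBody(body) {
  if (typeof body !== 'object' || body === null) {
    throw new TypeError('request body must be a JSON object');
  }
  // Assumption: `upload` is optional and defaults to returning a data URI.
  const { url, sequence, upload = false } = body;
  if (typeof url !== 'string' || url.length === 0) {
    throw new TypeError("'url' must be a non-empty string");
  }
  if (typeof sequence !== 'string' || sequence.length === 0) {
    throw new TypeError("'sequence' must be a non-empty string");
  }
  if (typeof upload !== 'boolean') {
    throw new TypeError("'upload' must be a boolean");
  }
  return { url, sequence, upload };
}

// Example with placeholder values.
const exampleOptions = parseRequestBody({
  url: 'https://example.com/image.jpg',
  sequence: 'placeholder-sequence',
});
console.log(exampleOptions.upload); // false
```

In an Express route this would run on the parsed JSON body, with a thrown TypeError mapped to a 400 response.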

Feb 07 '19 08:02 tech4GT

How about https://github.com/publiclab/image-sequencer-app, which we created a while back?

Let's start with a dataurl, but we should also plan to have an abstract way to "put" the image somewhere.

Great!
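
One possible shape for that abstract "put", as a minimal sketch: the exporter only ever calls `store.put(name, data)` and gets back a URL, so an imgur, cloud bucket, or local-disk store could be swapped in later. The `put` contract, the `memory://` URLs, and both function names are hypothetical.

```javascript
// An in-memory store, mainly useful for tests and local runs.
// Contract (assumed): put(name, data) resolves to a URL string.
function createMemoryStore() {
  const items = new Map();
  return {
    async put(name, data) {
      items.set(name, data);
      return `memory://${name}`;
    },
    get(name) {
      return items.get(name);
    },
  };
}

// The exporter depends only on the store's `put` method, so credentials
// and provider details stay inside whichever store is passed in.
async function saveResult(store, name, dataUrl) {
  const url = await store.put(name, dataUrl);
  return { name, url };
}
```

Starting with a data URL still fits this shape: the default "store" can simply hand the data URL straight back.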

Feb 07 '19 19:02 jywarren

@jywarren Ok pushing the most basic setup now!

Feb 07 '19 19:02 tech4GT

@jywarren Can you please grant me push access to the repository? 😅

Feb 07 '19 19:02 tech4GT

doing so now, thanks!!!

Feb 07 '19 20:02 jywarren

@jywarren One more thing, do you want me to get cracking on the optimizations for sequencer first or deploy the container first?

Feb 07 '19 20:02 tech4GT

Let's get the container working first -- but we can also encourage people in IS to tackle some of the optimizations, and point at this container to show why it'll be important!

Feb 07 '19 20:02 jywarren

Okay I'll try to deploy the container with a very basic setup tomorrow, and then I'll raise an issue for the optimizations, maybe I can document some of my ideas over there too! Also on a different note I tried out the app locally and it works like a charm ✌️

Feb 07 '19 20:02 tech4GT

oh wow!!! very cool.

Check out the various "projects" several of which are optimization related: https://github.com/publiclab/image-sequencer/labels/project

Feb 07 '19 20:02 jywarren

One thing I am concerned about, though: if we do switch to WebAssembly, which parts of the main code would we need to rewrite, or should we just switch to something like OpenCV entirely? I think we can start by making optimizations in JavaScript and then move towards WebAssembly if that gets unmanageable. What do you think?

Feb 07 '19 20:02 tech4GT

Yeah, I am not sure about it. I think we can follow multiple paths to optimize and should probably discuss that in the IS repo. Switching several modules to OpenCV would be powerful and flexible. So would a WebAssembly version of pixelManipulation.

Feb 07 '19 20:02 jywarren

I think you are right, also please do have a look at the repository, I have pushed the basic file I wrote earlier today, will be extending this A LOT but I think this gives us a start.

Feb 07 '19 20:02 tech4GT

Just a note that Google App Engine has a Standard Environment and a Flexible Environment, and Ruby seems to be supported only on the Flexible Environment, which is significantly more expensive: https://cloud.google.com/appengine/docs/standard/appengine-generation

Feb 08 '19 20:02 icarito

Oh, but is-app will be pure Node.js, so I guess that's not a problem!

Feb 09 '19 10:02 tech4GT

Hi @tech4GT @icarito @sashadev-sky and others -- i just uploaded diagrams above in which i tried to very clearly articulate the current and planned export workflows. Please have a look!

Feb 18 '19 17:02 jywarren

However, in both cases, we should think about the points at which we report status in a status.json file, which MapKnitter users could poll from JavaScript as their export runs to see what stage their work is in.

This, and other aspects such as the parallel running and the image pairing during compositing, make me think we really need to consider a new layer, a mapknitter-exporter-runner that could persist a bit longer, run in a container itself, but could persist a status.json file for the entire export run.
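
To sketch what the runner's status reporting could look like: a small tracker that records the state of each step and hands the serialized JSON to a `persist` callback every time something changes. The step names, the state strings, and the `createStatusTracker` API are all hypothetical. In practice `persist` might write status.json to disk or to a bucket; here it just keeps the latest JSON in memory.

```javascript
// Track per-step state for one export run and persist it on every change.
// steps: array of step names; persist: function receiving the JSON string.
function createStatusTracker(steps, persist) {
  const status = {
    steps: steps.map((name) => ({ name, state: 'pending' })),
    updatedAt: null,
  };
  function update(name, state) {
    const step = status.steps.find((s) => s.name === name);
    if (!step) throw new Error(`unknown step: ${name}`);
    step.state = state;
    status.updatedAt = new Date().toISOString();
    // This string is what a status.json poller would eventually see.
    persist(JSON.stringify(status, null, 2));
  }
  return { update };
}

// Example: keep the latest status in memory instead of writing a file.
let latestStatusJson = '';
const exampleTracker = createStatusTracker(
  ['distort', 'composite', 'tiles'],
  (json) => {
    latestStatusJson = json;
  }
);
exampleTracker.update('distort', 'running');
console.log(latestStatusJson);
```

Because the tracker only knows about a callback, the same code could run inside a long-lived runner container or a short cloud function.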

We could even think more broadly and develop it as an image-sequencer-runner which can handle complex branching image sequencer runs. @tech4GT maybe this is where the full express-based image-sequencer-app comes in, since the simpler individual steps seem to be possible using just cloud functions? Love to hear your thoughts on all this.

Feb 18 '19 17:02 jywarren

Haha awesome label @sashadev-sky -- i'll respond more completely later today i hope!

Just noting that @icarito has created a Dockerfile for the GDAL/ImageMagick container track: https://github.com/publiclab/mapknitter/pull/349

Feb 20 '19 20:02 jywarren

Hi @jywarren, I was thinking about the is-runner, and maybe we can base it on the Node.js cluster API? https://nodejs.org/api/cluster.html Also, how do we want to divide up the work inside these processes exactly? I mean, is it specified by the user, or do we want some kind of algorithm to decide?

Mar 15 '19 07:03 tech4GT

Hi Varun!!! Interesting. I guess we could start by noting for each step what previous steps must be complete for it to run, and we could point each at prior step references. We might track their state, but also trigger a re-assessment of whether all are complete, using an event listener? I think it might be worth writing this out step by step. Like:

1. receive multiple image URLs and coordinates
2. run one process for each image, /concurrently/
3. once all images are done with step 2, begin next step
4. go through images one by one and add them /sequentially/ to the previous image (using coordinate offsets) to make a big combined image
5. ...

at some of these steps, we would need to know a) what triggers the step to check if it can begin, and b) what conditions must be met for it to start, right?
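
The flow described here (process every image concurrently, wait for all of them, then add them to the composite one at a time) could be sketched with plain promises. This is only a minimal sketch: `runExport`, `processImage`, and `addToComposite` are hypothetical names, and the two callbacks stand in for whatever Image Sequencer or ImageMagick steps end up doing the real work.

```javascript
// Run the per-image step concurrently, then composite sequentially.
// processImage(image) -> Promise of a processed image
// addToComposite(composite, image) -> Promise of the updated composite
async function runExport(images, processImage, addToComposite) {
  // Concurrent stage: Promise.all resolves only once every image is done,
  // which is the condition for the next stage to begin.
  const processed = await Promise.all(images.map((img) => processImage(img)));

  // Sequential stage: each image is added to the result of the previous one.
  let composite = null;
  for (const img of processed) {
    composite = await addToComposite(composite, img);
  }
  return composite;
}

// Example with stub steps that just record what happened.
runExport(
  [{ url: 'a.jpg' }, { url: 'b.jpg' }],
  async (img) => ({ ...img, warped: true }),
  async (composite, img) => (composite || []).concat(img.url)
).then((result) => console.log(result));
```

Here the "trigger" for compositing is simply the resolution of `Promise.all`; an event-listener design would make the same condition explicit for more complex, branching runs.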

Mar 15 '19 18:03 jywarren

This makes sense Jeff, let me write up some code and see if it works, let’s build this into is-app.

Mar 16 '19 17:03 tech4GT

That's great. You could even start with a sequence that doesn't yet do distortion, but something simpler that already works. Then we'll have the shell of the system in the right format and can focus on just getting the internal modules to work.

Thanks!

Mar 16 '19 17:03 jywarren

Yeah that’s what I was thinking, we can plug in the distortion part later since there are a couple of options I need to explore there and I don't want to slow this down because of that!

Mar 16 '19 17:03 tech4GT

@jywarren Is there any is-module which stitches the images together? Or do I need to write one?

Mar 17 '19 06:03 tech4GT

MapKnitterExporter architecture discussion
