nimib icon indicating copy to clipboard operation
nimib copied to clipboard

Multiple `nbCodeToJsInit` can't be interlaced

Open HugoGranstrom opened this issue 1 year ago • 12 comments

As has been discussed between me and @pietroppeter, the file size of the final HTML file can grow quite a bit if many nbCodeToJs are used as they will have duplicate code between them that could be reduced. For this, we have the nbCodeToJsInit + addCodeToJs API. But there is one problem with it, namely that we are using a global table https://github.com/pietroppeter/nimib/blob/631ea5520917146ab92c6b5b11ee39fbd8257668/src/nimib/jsutils.nim#L12 which is reset each time nbCodeToJsInit is called (or nbCodeToJs which calls it in turn). Its job is to for every script, rewrite gensymed symbols to a form the compiler will accept in the external nim file. And to make sure different ones are rewritten to the same one, eg a'gensym1 and a'gensym2 are both rewritten to a_1234567. But if the table is reset it won't know that both of them should be rewritten to the same symbol. So this means this code won't work inside a template:

template foo() =
  let script = nbCodeToJsInit:
    let a = 1
  # a will in general be gensymed as it is defined in a template
  nbCodeToJs: # here the table is reset
    discard
  script.addCodeToJs:
    echo a # so it has no idea that a variable `a` was defined in the previous block here

The hard part is that this must be done at compiletime so we can't just assign a random id to each script, because the macro has no idea what script each code block comes from.

HugoGranstrom avatar Jul 14 '22 13:07 HugoGranstrom

I haven't tested it yet but it could perhaps work if we pass in the script variable to nimToJsString and use it as a key for a table containing the existing table for each script. Each script variable, given that it isn't defined in a proc, should then get a unique key to the table. In theory... :sweat_smile:

HugoGranstrom avatar Jul 14 '22 16:07 HugoGranstrom

to be honest this does not seem a big issue for me (but I understand optimistically half of what you say when you talk gensym, so maybe you are onto something good and keep going... ;)).

to recap, the discussion was about the typewriter widget for nimislides, done with karax. My view on that is:

  • karax should be used more for complete apps that take most of the page (and it is nice to be able to have multiple karax apps on a page mostly to document and explain karax).
  • for widgets I would say it would be better to use vanilla ~js~ nim

In particular, for a typewriter widget (which is not necessarily specific to nimislides, could be used elsewhere in nimib too), I think a better approach would be to use something like typewriterjs. After all, another advantage of nim is an easy FFI :)

unrelated, can I try turning this into a discussion?

pietroppeter avatar Jul 15 '22 07:07 pietroppeter

to be honest this does not seem a big issue for me (but I understand optimistically half of what you say when you talk gensym, so maybe you are onto something good and keep going... ;)).

It means that if nimib chooses to use nbCodeToJsInit + addCodeToJs for a component anytime in the future, it would be forbidden for any library or user who wanted to use that feature to use any of nbCodeToJsInit , nbCodeToJs, nbKarax as it would invalidate adding more of that component. So to me, it is a big issue. but yeah I don't understand fully all the stuff I say about gensym either :rofl:

Also something I don't particularly like about the nbCodeToJsInit is that you have to call script.addToDocAsJs at the end. And if we have a component we there is really anywhere we can call it behind the scenes, so we basically have to add a nbBeforeSave template or something that the user must remember to call before nbSave. Unless we create a script-mechanism of some sort like adding a private field nb.scripts which you can add your scripts to with nb.addScript(script) and in nbSave it will just loop through them and add them to the document.

karax should be used more for complete apps that take most of the page (and it is nice to be able to have multiple karax apps on a page mostly to document and explain karax).

I'm not really agreeing, but fair enough :+1: In my case it is really unnecessary to use Karax though.

or widgets I would say it would be better to use vanilla js nim

If it can be easily achievable, then yes :+1:

In particular, for a typewriter widget (which is not necessarily specific to nimislides, could be used elsewhere in nimib too), I think a better approach would be to use something like typewriterjs. After all, another advantage of nim is an easy FFI :)

The reason I rolled my own version was that I wanted to integrate it fully with Reveal's animation system so it only starts when reveal is showing it and fast forwards if one steps forward. But this library seems like it could actually work alongside Reveal as it has quite a lot of methods, so I'll investigate it. Thanks for the tip :D

unrelated, can I try turning this into a discussion?

Sure, go ahead :)

HugoGranstrom avatar Jul 15 '22 07:07 HugoGranstrom

I'm still not sure how to deal with this. My proposed solution probably works, but it is a bit fragile if the user chooses to do anything fancy with the script variable. Our recent discussions about removing the typed branch of the macro don't change this sadly as it's about name mangling. I'll try and implement it in the PR with all the other changes, as it at least would be better than the current state of affairs.

HugoGranstrom avatar Aug 02 '22 17:08 HugoGranstrom

I've hit an obstacle, we are defining the NbBlock first with result = NbBlock() in the nbJsFromCodeInit and the returned one is stored in script. So my original idea of just comparing the symbols won't work :(

HugoGranstrom avatar Aug 04 '22 09:08 HugoGranstrom

Ah, sorry about that. Not sure how can I help I understand very little of this. :/

pietroppeter avatar Aug 04 '22 12:08 pietroppeter

No worries, I'll try and think something out during your vacation. I'm honestly a bit tempted to make a separate non-mangled version for everything that doesn't need to be compiled in separate files (basically everything except Karax). It would instead be using block: to separate different scripts and all scripts would be compiled in the same file.

HugoGranstrom avatar Aug 04 '22 12:08 HugoGranstrom

Yes I'm starting to convince myself the non-mangling block version should be the norm and Karax the exception (it's hidden inside nbKaraxCode so it doesn't matter if it's uglier). So The general structure would be:

# imports here

block: # script1
  Here we paste everything that would have been in the separate file for script1
block: # script2
  Same here

One problem would be that imports must happen at the top-level and not inside a block. So either the init code is in the top-level and users are urged not to define global variables there (easy) or we try and extract the imports from the code blocks and paste them at the top. (hard if the import is in a non-trivial place like inside a template)

Another problem would be how to represent this big script in nimib. Perhaps by having a field nb.scripts: seq[NbBlock] which are processed accordingly in nbSave. This way the user shouldn't have to add the script to doc which is quite nice. A script is "registered" in the init call and added to the seq. Then we can modify the script as we like and the changes will be reflected in the seq as NbBlock is a ref type.

HugoGranstrom avatar Aug 04 '22 20:08 HugoGranstrom

Something that's against the block-approach is that I don't know how {.exportc.} works if inside a block, perhaps it's not visible to the JS library then :thinking:

(I'm just brain-dumping everything I can think of right now so I don't forget it)

HugoGranstrom avatar Aug 05 '22 08:08 HugoGranstrom

New idea:

  • Remove the init + addScript machinery
  • All nbJsFromCode are compiled in the same file and added to separate blocks. All global code (like imports and definitions of global variables) must be done inside nbJsGlobalFromCode.
  • nbJsFromCode shouldn't need the advanced gensym logic, it only has to make all gensym'd variables into valid identifiers.
    • It should be enough to just remove the gensym of these entirely? The blocks should guard us against any conflicts, I think. Verify this!
  • nbKaraxFromCode uses the old advanced gensym logic as it's compiled in different files.

HugoGranstrom avatar Oct 20 '22 08:10 HugoGranstrom

I like the idea but I am still not understanding some (likely basic) stuff:

  • what is keep us from being able to have nbJsFromCode do the simplest pipeline:
    • dump all the content of this single block in a single file, compile to js and then put that js in a script tag?
    • is it because in that way to identical nbJsFromCode would share the variables?
    • and then having them in a single file in separate blocks would allow us to have gensymed stuff?
    • I think the "natural" pipeline has a lot of appeal (maybe not the most powerful but easier to reason about) and maybe we can have that as default and add some opt-in features to change and go with a more complicated way (there was a forum thread where juan carlos shared some interesting stuff, but cannot find).
  • why would nbKaraxFromCode need to be compiled anyway in different files? same reason as for nbJsFromCode but in karax case the block thing would not work?

as for reference this was partly stimulated by the fact that the expectation that a simple three_js.nim example does not work with the current setup: https://pietroppeter.github.io/nblog/drafts/three_js.html

pietroppeter avatar Oct 21 '22 10:10 pietroppeter

dump all the content of this single block in a single file, compile to js and then put that js in a script tag? is it because in that way to identical nbJsFromCode would share the variables?

Yes exactly, Nim is very predictable in its naming-scheme so if we have this:

template:
  nbJSFromCode:
    var x: int

then if we compile this piece of code 10x, the variable x will have the same name in all of them, creating collisions/shadowing/chaos. The way we are solving it at the moment is to gensym all identifiers ourselves, so we knew all of them were unique. And thus avoiding collisions. This is the step that causes all of our troubles at the moment as well. And this step will only be needed for nbKarax if all of this works out.

and then having them in a single file in separate blocks would allow us to have gensymed stuff?

The opposite, having them in the same file in separate blocks allows us to "undo" all the gensym that might have been thrown upon us and let Nim redo it when compiling it. The biggest problem with gensym'd variables is that they aren't valid identifiers if you try to compile it. Example of how we won't need gensym:

var longVariableName = 1
block:
  var longVariableName = 2
block:
  var longVariableName = 3

This generates this JS:

var longVariableName_520093697 = [1];
Label1: {
  var longVariableName_520093698 = [2];
};
Label2: {
  var longVariableName_520093699 = [3];
};

They are all different variables without us having to gensym anything.

I think the "natural" pipeline has a lot of appeal (maybe not the most powerful but easier to reason about) and maybe we can have that as default and add some opt-in features to change and go with a more complicated way (there was a forum thread where juan carlos shared some interesting stuff, but cannot find).

The simplest pipeline possible is the following:

  1. Read the code from AST.
  2. Remove all gensyms
  3. Turn code to string
  4. Add code string to the "big js file"
  5. "The big js" file is compiled when we do nbSave and added in a script tag.

why would nbKaraxFromCode need to be compiled anyway in different files? same reason as for nbJsFromCode but in karax case the block thing would not work?

Because to have multiple karax instances in the same page, you need to compile them all using different "karax ids". An you specify this id when you compile the file thorugh -d:kxi=id or something like that. So each karax component needs its own file because they all need unique compiler arguments passed to each and every one of them.

Ask more if anything still seems unclear :smile:

HugoGranstrom avatar Oct 21 '22 12:10 HugoGranstrom

I just got an idea, it's crazy but might have some merit to it. The reason we need the crazy gensym-logic when working with Karax is that Nim will compile scripts in a template to the exact same variable names in the generated JS. This causes conflict if we have multiple of the same components in the same document. Now to the juicy part: what if we managed to get Nim to generate different names for these somehow by for example adding a bunch of no-ops at the top of the document and hoping it causes it to generate new variable names? And if we add different amounts of no-ops to each file, perhaps it could work? I'm going to try at least...

HugoGranstrom avatar Oct 26 '22 13:10 HugoGranstrom

This might actually work! This Nim code produces the below javascript (notice the suffix of numbers for the variables):

let a = 1
let b = 2
let c = 3
var a_452984833 = 1;
var b_452984834 = 2;
var c_452984835 = 3;

Now if we add a new line at the top:

let h = 0

let a = 1
let b = 2
let c = 3

we get this js:

var h_452984833 = 0;
var a_452984834 = 1;
var b_452984835 = 2;
var c_452984836 = 3;

And notice the difference here: a_452984834 vs a_452984833

By adding an extra line at the top, we have caused the gensym to change and these two files wouldn't conflict with each other. (Unless we are reeeeeaaally unlucky that it shifts the gensym to match another variable with the same name)

HugoGranstrom avatar Oct 26 '22 13:10 HugoGranstrom

And the simplest solution seems to be to create a macro, which generates a new gensym, thus increasing the "counter":

macro bumpGensym() =
  let _ = gensym()

Then all of these files would generate difference variable names for its variables:

bumpGensym()
let a = 1 # a_452984839
bumpGensym()
bumpGensym()
let a = 1 # a_452984840
bumpGensym()
bumpGensym()
bumpGensym()
let a = 1 # a_452984841

HugoGranstrom avatar Oct 26 '22 13:10 HugoGranstrom

I'm rambling like a mad man, but the summary of this rant is: We can throw the crazy gensym logic in the trash bin! And instead, we insert a unique number of bumpGensym() into each file. We could still get into trouble if we have two variables named the same that are close to each other. But that is in my opinion still better than what we currently have.

HugoGranstrom avatar Oct 26 '22 13:10 HugoGranstrom

I like it! It seems to be much better and easier to reason about! Will we apply this also to the JsCode pieces and avoid the global thing?

Is there a way to know what the last id from a previous block is and make sure that there are no overlaps? If we add a parameter startFromGensymId that we could use when compiling to js and we could also retrieve last id, we could be sure to avoid also the unfortunate collisions.

pietroppeter avatar Oct 26 '22 14:10 pietroppeter

Anyway the idea of bumpGensym is also easier to understand and use by the user for the occasional collision

pietroppeter avatar Oct 26 '22 14:10 pietroppeter

Will we apply this also to the JsCode pieces and avoid the global thing?

No, this "only" affects code written in different files. We still need the blocks when packing multiple nbJs into the same nim file. And if we use blocks, we will need the global block. And we do want to put them into the same file to reduce the final file size and most importantly the compile time.

Is there a way to know what the last id from a previous block is and make sure that there are no overlaps?

If we insert the bumpGensym() into the code string directly (instead of inserting it in the AST in macro), then I think we should in theory at least, be able to compute it from the generated javascript. It won't be trivial and it won't be perfect, but we would reduce the chance of collisions. Good idea! :D

Anyway the idea of bumpGensym is also easier to understand and use by the user for the occasional collision

Definitely :+1:

HugoGranstrom avatar Oct 26 '22 14:10 HugoGranstrom

Is there a way to know what the last id from a previous block is and make sure that there are no overlaps?

I gave it a few more minutes of thought now, and I'm not so sure that this is feasible. Because we are starting from such random numbers, we don't know at which number we will start when we compile the next file. So we can't really know the needed bump. If we had just a single template, then we would get the same code every time we compiled it. So then it would work. But given that you might define more than one different karax components, we can't know anything about the numbers that will be used sadly

HugoGranstrom avatar Oct 26 '22 14:10 HugoGranstrom