nimib
nimib copied to clipboard
Multiple `nbCodeToJsInit` can't be interlaced
As has been discussed between me and @pietroppeter, the file size of the final HTML file can grow quite a bit if many nbCodeToJs
are used as they will have duplicate code between them that could be reduced. For this, we have the nbCodeToJsInit
+ addCodeToJs
API. But there is one problem with it, namely that we are using a global table https://github.com/pietroppeter/nimib/blob/631ea5520917146ab92c6b5b11ee39fbd8257668/src/nimib/jsutils.nim#L12 which is reset each time nbCodeToJsInit
is called (or nbCodeToJs
which calls it in turn). Its job is to for every script, rewrite gensymed symbols to a form the compiler will accept in the external nim file. And to make sure different ones are rewritten to the same one, eg a'gensym1
and a'gensym2
are both rewritten to a_1234567
. But if the table is reset it won't know that both of them should be rewritten to the same symbol. So this means this code won't work inside a template:
template foo() =
let script = nbCodeToJsInit:
let a = 1
# a will in general be gensymed as it is defined in a template
nbCodeToJs: # here the table is reset
discard
script.addCodeToJs:
echo a # so it has no idea that a variable `a` was defined in the previous block here
The hard part is that this must be done at compiletime so we can't just assign a random id to each script, because the macro has no idea what script each code block comes from.
I haven't tested it yet but it could perhaps work if we pass in the script variable to nimToJsString
and use it as a key for a table containing the existing table for each script. Each script variable, given that it isn't defined in a proc, should then get a unique key to the table. In theory... :sweat_smile:
to be honest this does not seem a big issue for me (but I understand optimistically half of what you say when you talk gensym, so maybe you are onto something good and keep going... ;)).
to recap, the discussion was about the typewriter widget for nimislides, done with karax. My view on that is:
- karax should be used more for complete apps that take most of the page (and it is nice to be able to have multiple karax apps on a page mostly to document and explain karax).
- for widgets I would say it would be better to use vanilla ~js~ nim
In particular, for a typewriter widget (which is not necessarily specific to nimislides, could be used elsewhere in nimib too), I think a better approach would be to use something like typewriterjs. After all, another advantage of nim is an easy FFI :)
unrelated, can I try turning this into a discussion?
to be honest this does not seem a big issue for me (but I understand optimistically half of what you say when you talk gensym, so maybe you are onto something good and keep going... ;)).
It means that if nimib chooses to use nbCodeToJsInit
+ addCodeToJs
for a component anytime in the future, it would be forbidden for any library or user who wanted to use that feature to use any of nbCodeToJsInit
, nbCodeToJs
, nbKarax
as it would invalidate adding more of that component. So to me, it is a big issue. but yeah I don't understand fully all the stuff I say about gensym either :rofl:
Also something I don't particularly like about the nbCodeToJsInit
is that you have to call script.addToDocAsJs
at the end. And if we have a component we there is really anywhere we can call it behind the scenes, so we basically have to add a nbBeforeSave
template or something that the user must remember to call before nbSave
. Unless we create a script-mechanism of some sort like adding a private field nb.scripts
which you can add your scripts to with nb.addScript(script)
and in nbSave
it will just loop through them and add them to the document.
karax should be used more for complete apps that take most of the page (and it is nice to be able to have multiple karax apps on a page mostly to document and explain karax).
I'm not really agreeing, but fair enough :+1: In my case it is really unnecessary to use Karax though.
or widgets I would say it would be better to use vanilla js nim
If it can be easily achievable, then yes :+1:
In particular, for a typewriter widget (which is not necessarily specific to nimislides, could be used elsewhere in nimib too), I think a better approach would be to use something like typewriterjs. After all, another advantage of nim is an easy FFI :)
The reason I rolled my own version was that I wanted to integrate it fully with Reveal's animation system so it only starts when reveal is showing it and fast forwards if one steps forward. But this library seems like it could actually work alongside Reveal as it has quite a lot of methods, so I'll investigate it. Thanks for the tip :D
unrelated, can I try turning this into a discussion?
Sure, go ahead :)
I'm still not sure how to deal with this. My proposed solution probably works, but it is a bit fragile if the user chooses to do anything fancy with the script variable. Our recent discussions about removing the typed branch of the macro don't change this sadly as it's about name mangling. I'll try and implement it in the PR with all the other changes, as it at least would be better than the current state of affairs.
I've hit an obstacle, we are defining the NbBlock first with result = NbBlock()
in the nbJsFromCodeInit
and the returned one is stored in script
. So my original idea of just comparing the symbols won't work :(
Ah, sorry about that. Not sure how can I help I understand very little of this. :/
No worries, I'll try and think something out during your vacation. I'm honestly a bit tempted to make a separate non-mangled version for everything that doesn't need to be compiled in separate files (basically everything except Karax). It would instead be using block:
to separate different scripts and all scripts would be compiled in the same file.
Yes I'm starting to convince myself the non-mangling block version should be the norm and Karax the exception (it's hidden inside nbKaraxCode
so it doesn't matter if it's uglier). So The general structure would be:
# imports here
block: # script1
Here we paste everything that would have been in the separate file for script1
block: # script2
Same here
One problem would be that imports must happen at the top-level and not inside a block. So either the init code is in the top-level and users are urged not to define global variables there (easy) or we try and extract the imports from the code blocks and paste them at the top. (hard if the import is in a non-trivial place like inside a template)
Another problem would be how to represent this big script in nimib. Perhaps by having a field nb.scripts: seq[NbBlock]
which are processed accordingly in nbSave
. This way the user shouldn't have to add the script to doc which is quite nice. A script is "registered" in the init call and added to the seq. Then we can modify the script as we like and the changes will be reflected in the seq as NbBlock is a ref type.
Something that's against the block-approach is that I don't know how {.exportc.}
works if inside a block
, perhaps it's not visible to the JS library then :thinking:
(I'm just brain-dumping everything I can think of right now so I don't forget it)
New idea:
- Remove the
init
+addScript
machinery - All
nbJsFromCode
are compiled in the same file and added to separate blocks. All global code (like imports and definitions of global variables) must be done insidenbJsGlobalFromCode
. -
nbJsFromCode
shouldn't need the advanced gensym logic, it only has to make all gensym'd variables into valid identifiers.- It should be enough to just remove the gensym of these entirely? The blocks should guard us against any conflicts, I think. Verify this!
-
nbKaraxFromCode
uses the old advanced gensym logic as it's compiled in different files.
I like the idea but I am still not understanding some (likely basic) stuff:
- what is keep us from being able to have
nbJsFromCode
do the simplest pipeline:- dump all the content of this single block in a single file, compile to js and then put that js in a script tag?
- is it because in that way to identical
nbJsFromCode
would share the variables? - and then having them in a single file in separate blocks would allow us to have gensymed stuff?
- I think the "natural" pipeline has a lot of appeal (maybe not the most powerful but easier to reason about) and maybe we can have that as default and add some opt-in features to change and go with a more complicated way (there was a forum thread where juan carlos shared some interesting stuff, but cannot find).
- why would
nbKaraxFromCode
need to be compiled anyway in different files? same reason as fornbJsFromCode
but in karax case theblock
thing would not work?
as for reference this was partly stimulated by the fact that the expectation that a simple three_js.nim example does not work with the current setup: https://pietroppeter.github.io/nblog/drafts/three_js.html
dump all the content of this single block in a single file, compile to js and then put that js in a script tag? is it because in that way to identical nbJsFromCode would share the variables?
Yes exactly, Nim is very predictable in its naming-scheme so if we have this:
template:
nbJSFromCode:
var x: int
then if we compile this piece of code 10x, the variable x
will have the same name in all of them, creating collisions/shadowing/chaos. The way we are solving it at the moment is to gensym all identifiers ourselves, so we knew all of them were unique. And thus avoiding collisions. This is the step that causes all of our troubles at the moment as well. And this step will only be needed for nbKarax
if all of this works out.
and then having them in a single file in separate blocks would allow us to have gensymed stuff?
The opposite, having them in the same file in separate blocks allows us to "undo" all the gensym that might have been thrown upon us and let Nim redo it when compiling it. The biggest problem with gensym'd variables is that they aren't valid identifiers if you try to compile it. Example of how we won't need gensym:
var longVariableName = 1
block:
var longVariableName = 2
block:
var longVariableName = 3
This generates this JS:
var longVariableName_520093697 = [1];
Label1: {
var longVariableName_520093698 = [2];
};
Label2: {
var longVariableName_520093699 = [3];
};
They are all different variables without us having to gensym anything.
I think the "natural" pipeline has a lot of appeal (maybe not the most powerful but easier to reason about) and maybe we can have that as default and add some opt-in features to change and go with a more complicated way (there was a forum thread where juan carlos shared some interesting stuff, but cannot find).
The simplest pipeline possible is the following:
- Read the code from AST.
- Remove all gensyms
- Turn code to string
- Add code string to the "big js file"
- "The big js" file is compiled when we do
nbSave
and added in a script tag.
why would nbKaraxFromCode need to be compiled anyway in different files? same reason as for nbJsFromCode but in karax case the block thing would not work?
Because to have multiple karax instances in the same page, you need to compile them all using different "karax ids". An you specify this id when you compile the file thorugh -d:kxi=id
or something like that. So each karax component needs its own file because they all need unique compiler arguments passed to each and every one of them.
Ask more if anything still seems unclear :smile:
I just got an idea, it's crazy but might have some merit to it. The reason we need the crazy gensym-logic when working with Karax is that Nim will compile scripts in a template to the exact same variable names in the generated JS. This causes conflict if we have multiple of the same components in the same document. Now to the juicy part: what if we managed to get Nim to generate different names for these somehow by for example adding a bunch of no-ops at the top of the document and hoping it causes it to generate new variable names? And if we add different amounts of no-ops to each file, perhaps it could work? I'm going to try at least...
This might actually work! This Nim code produces the below javascript (notice the suffix of numbers for the variables):
let a = 1
let b = 2
let c = 3
var a_452984833 = 1;
var b_452984834 = 2;
var c_452984835 = 3;
Now if we add a new line at the top:
let h = 0
let a = 1
let b = 2
let c = 3
we get this js:
var h_452984833 = 0;
var a_452984834 = 1;
var b_452984835 = 2;
var c_452984836 = 3;
And notice the difference here:
a_452984834
vs a_452984833
By adding an extra line at the top, we have caused the gensym to change and these two files wouldn't conflict with each other. (Unless we are reeeeeaaally unlucky that it shifts the gensym to match another variable with the same name)
And the simplest solution seems to be to create a macro, which generates a new gensym, thus increasing the "counter":
macro bumpGensym() =
let _ = gensym()
Then all of these files would generate difference variable names for its variables:
bumpGensym()
let a = 1 # a_452984839
bumpGensym()
bumpGensym()
let a = 1 # a_452984840
bumpGensym()
bumpGensym()
bumpGensym()
let a = 1 # a_452984841
I'm rambling like a mad man, but the summary of this rant is:
We can throw the crazy gensym logic in the trash bin!
And instead, we insert a unique number of bumpGensym()
into each file.
We could still get into trouble if we have two variables named the same that are close to each other. But that is in my opinion still better than what we currently have.
I like it! It seems to be much better and easier to reason about! Will we apply this also to the JsCode pieces and avoid the global thing?
Is there a way to know what the last id from a previous block is and make sure that there are no overlaps? If we add a parameter startFromGensymId that we could use when compiling to js and we could also retrieve last id, we could be sure to avoid also the unfortunate collisions.
Anyway the idea of bumpGensym is also easier to understand and use by the user for the occasional collision
Will we apply this also to the JsCode pieces and avoid the global thing?
No, this "only" affects code written in different files. We still need the blocks when packing multiple nbJs
into the same nim file. And if we use blocks, we will need the global block. And we do want to put them into the same file to reduce the final file size and most importantly the compile time.
Is there a way to know what the last id from a previous block is and make sure that there are no overlaps?
If we insert the bumpGensym()
into the code string directly (instead of inserting it in the AST in macro), then I think we should in theory at least, be able to compute it from the generated javascript. It won't be trivial and it won't be perfect, but we would reduce the chance of collisions. Good idea! :D
Anyway the idea of bumpGensym is also easier to understand and use by the user for the occasional collision
Definitely :+1:
Is there a way to know what the last id from a previous block is and make sure that there are no overlaps?
I gave it a few more minutes of thought now, and I'm not so sure that this is feasible. Because we are starting from such random numbers, we don't know at which number we will start when we compile the next file. So we can't really know the needed bump. If we had just a single template, then we would get the same code every time we compiled it. So then it would work. But given that you might define more than one different karax components, we can't know anything about the numbers that will be used sadly