embed-page
Diary - JS scope insulation
EPA / embed-page
notes on JS scope, window and application security of scope insulation.
- anonymous: the default scope, makes complete insulation
- none: a global scope, no insulation. Useful for HTML include
- scope="xxx": named. Variables and API are shared between scopes with the same name
How to hide the page scope(global) JS objects and substitute those with own implementation?
eval() with closure-defined global overrides appeared to be a way to insulate embed-page content from the host page. Here are the notes on the implementation evolution.
How to make the window properties exposed as global ( in EPA scope ) objects? The top level objects are window and document. The trick is to make the window objects exposed as part of the scope.
The pill is sweetened by the fact that the initial window content could be used for populating the scope. Later changes to window and global variables still need handling, but that is not a usual case, so it could be ignored for now. From a security standpoint it does not weaken the insulation, except for the ability to detect the EPA environment, which is possible in multiple other ways anyway.
The with() statement will do the exposure trick.
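A minimal sketch of the exposure trick, assuming the scope is a plain object standing in for the EPA window. The helper name runInScope and the EPA_scope parameter are hypothetical, not part of embed-page.

```javascript
// Hypothetical helper: runs `code` with the properties of `scope` visible as bare names.
function runInScope(code, scope) {
  // Function-constructor bodies are sloppy mode, so `with` is permitted even
  // when the surrounding file is a module or uses "use strict".
  const runner = new Function('EPA_scope', 'with (EPA_scope) { ' + code + ' }');
  return runner(scope);
}

// Stand-in for the object that plays the role of `window` inside the scope.
const fakeWindow = { title: 'embedded', answer: 41 };

// `answer` and `title` resolve to fakeWindow properties, not to real globals.
const result = runInScope('answer = answer + 1; return title + ":" + answer;', fakeWindow);
console.log(result, fakeWindow.answer);
```

Only names that already exist on the scope object are captured this way, which matches the note above about populating the scope from the initial window content.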
How to trap global location object assignment? The location object is tricky, as it should be identical between window.location, document.location and the global location object. The need to override the assignment of a string to the object itself ( to reset embed-page content ) is a special challenge.
While window.location="someUrl" could be overridden using a setter on the window object property, the "global" location object in the EPA scope could not have a setter associated, unless the name resolution involves the with statement:
with(window) { location='abc'; /* resolves to window.location='abc' */ }
At this stage the solution chosen for location=url is to convert it into window.location=url during script load. It does not solve the eval( "location=url" ) case, which will complain about an attempt to override the const variable. Other cases like location.href, location.assign, etc. seem to work fine and are covered by unit tests.
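The setter idea can be sketched as follows, assuming a plain object plays the role of the EPA window. makeScope and onNavigate are hypothetical names used only for illustration; the real embed-page code may differ.

```javascript
// Hypothetical factory: builds a window-like scope whose `location`
// property traps string assignment through a setter.
function makeScope(onNavigate) {
  const loc = { href: 'about:blank' };
  const win = {};
  Object.defineProperty(win, 'location', {
    get() { return loc; },
    set(url) {
      loc.href = String(url);
      onNavigate(loc.href); // here embed-page would reset its content
    },
    enumerable: true,
  });
  win.window = win; // window.location and bare location share one object
  return win;
}

const visited = [];
const scope = makeScope((url) => visited.push(url));

// Inside with(), a bare `location = ...` resolves to the scope property
// and therefore hits the setter, same as `window.location = ...`.
new Function(
  'EPA_scope',
  'with (EPA_scope) { location = "page2.html"; window.location = "page3.html"; }'
)(scope);

console.log(visited, scope.location.href);
```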
While thinking about window management in <embed-page>, the convention of cross-app grouping has been revised again. JS frames ( aka windows ) reference each other via name or are obtained directly during frame creation by window.open(). Note, the frame content could be (re-)set using a target attribute on <a> or <form> which matches the frame name. Security hole: it gives a 3rd party app the ability to compromise the visual content of an identified session.
Frame names are scoped by the browser identity session, which is usually either global or incognito. That is definitely insufficient, as multiple identities could be in use and had better not correlate with each other.
To avoid cross-identity session overlap, in <embed-page> ( and eventually in the browser ) the target attribute could be used for additional scoping of frames ( instances of embed-page ). That way a new window/embed-page instance will inherit the target and will be available over window.frames[], window.parent,... in the same scope. The windows with another target will not have access to the given scope and as a result will not be able to manipulate its content, like replacing content URLs, closing windows or opening new ones.
The main window will have a collection of top level targets and the ability to operate on all app instances under the same target: close, freeze, save, (re-)open. It is similar to a group operation over a bookmarks folder, except that it executes on the host web page with microapplication content within embed-page.
The target attribute on <embed-page> is dedicated to the "identity session".
It does not mean the hyperlinks and forms will be targeting the specific frame; rather it defines the scope where those instances will resolve frame names, including named ones or special names like _top. That way it would be possible to create two or more "targets" serving different cross-site accounts, like an app authorized by FaceBook showing some FB pages along with disqus threads associated with the same identity. If you need to create another set of windows with a different identity, it would be done by starting the "identity session" with its own target value.
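The target-scoped name resolution could be pictured with a small registry. This is only a sketch under the assumption that frames are looked up by a ( target, name ) pair; FrameRegistry and its methods are hypothetical and not an embed-page API.

```javascript
// Hypothetical registry: frame names resolve only within their own target.
class FrameRegistry {
  constructor() {
    this.byTarget = new Map(); // target -> Map(frameName -> frame)
  }
  register(target, name, frame) {
    if (!this.byTarget.has(target)) this.byTarget.set(target, new Map());
    this.byTarget.get(target).set(name, frame);
  }
  // A frame under another target is invisible, even with the same name.
  resolve(target, name) {
    const frames = this.byTarget.get(target);
    return frames && frames.has(name) ? frames.get(name) : null;
  }
  // Group operation over all app instances of one identity session.
  closeAll(target) {
    const frames = this.byTarget.get(target);
    const count = frames ? frames.size : 0;
    this.byTarget.delete(target);
    return count;
  }
}

const registry = new FrameRegistry();
registry.register('identityA', 'comments', { url: 'a/comments.html' });
registry.register('identityB', 'comments', { url: 'b/comments.html' });

console.log(registry.resolve('identityA', 'comments').url);
console.log(registry.resolve('identityA', 'missing'));
```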
All scripts inside of window.xxx
functions and variables. JS engine does not give access to enumeration of scope variables, so there is no direct way to
- hide <embed-page> globals from the container window.
- enumerate the variables/functions from EPA scripts
- detect changes to variable values referenced as a global variable or as window.xxx between the beginning and the end of a script section.
To insulate the scope from the container window, scripts are executed within <script type="module">, which automatically scopes es6-style declared variables. Still, var or undeclared variable assignment infects the container window. The latter could be trapped by preserving the container window state, adding container window changes to epa.globals and finally restoring the original window state.
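The preserve / collect / restore cycle might look like the sketch below. It assumes a plain object stands in for the container window and that only own enumerable properties matter; runWithWindowSnapshot is a hypothetical name.

```javascript
// Hypothetical helper: runs `script`, moves whatever it added or changed on
// the container window into epaGlobals, then restores the window state.
function runWithWindowSnapshot(containerWindow, epaGlobals, script) {
  const before = new Map(Object.entries(containerWindow)); // preserved state
  try {
    script(containerWindow);
  } catch (ex) {
    console.error(ex); // a broken script must not block the following ones
  } finally {
    for (const key of Object.keys(containerWindow)) {
      const changed = !before.has(key) || before.get(key) !== containerWindow[key];
      if (changed) epaGlobals[key] = containerWindow[key]; // leak becomes an EPA global
      if (!before.has(key)) delete containerWindow[key];   // undo the pollution
    }
    for (const [key, value] of before) containerWindow[key] = value; // restore
  }
}

const hostWindow = { hostVar: 1 };
const epaGlobals = {};
runWithWindowSnapshot(hostWindow, epaGlobals, (w) => {
  w.leaked = 'x'; // what an undeclared assignment would do
  w.hostVar = 2;  // overriding a container global
});
console.log(hostWindow, epaGlobals);
```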
In order to share the same scope, each sub-script or element.onxxx event handler has to be executed within a common script section.
poc/global-scope.html covers that behavior.
In order to reuse the scope, the content of script tags and event handlers should be executed in the same scope. That could be achieved either by
- running all code within a single SCRIPT tag associated with embed-page.
- wrapping each script individually.
The 1st method will use a shared set of global variables; only the sync of window.xxx assignments with variables is needed.
CONS:
- each SCRIPT could fail but it should not prevent others from running. That requires surrounding each code section with try/catch
- delay until all scripts are loaded
2nd method PROS:
- SCRIPT type=module could be handled differently, without sync to window props. Which actually makes the implementation even more difficult in comparison with the 1st method.
- script execution could be done before the following scripts are loaded. Inapplicable.
- script execution could be done simultaneously with fetching the following scripts. Performance optimization. Inapplicable due to the need to collect all variables before running any script.
CONS:
- all variables from each script should be declared in each section
Common:
- variables from each SCRIPT should be declared ahead of any SCRIPT execution to avoid container window props pollution. That could be achieved by collecting all scripts and extracting all words and window object properties (with exception of keywords and embed-page specific variables).
- after execution of each script all "window" properties should be assigned to "globals" in each SCRIPT(one or many) scope.
- after execution of each script all "global" variables should become a window object property.
Global script handling will try to implement the 1st option:
1. Collect all scripts code, extract all variables as keywords with the exception of
   - JS keywords
   - "clean" window properties from a blank iframe window
   - EPA_ prefixed variables
   - EpaWindow properties
2. in the rendered script declare all collected variables.
3. clone all EpaWindow properties into variables.
For each script code
4. temporarily clear container window properties to avoid leaking container globals into the embed-page scope
   - preserve "unclean" window properties
   - remove those properties from window
5. in try{ section insert code }
6. catch(ex){ console.error(ex)} will permit running the following SCRIPTs
7. finally{}
   - move properties added to the container window into EpaWindow ( detect by comparing with the reference iframe )
   - restore container window properties
8. For each onXXX attribute
   - in try section set event handler
   - in event handler body insert code as in 4-7
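The first step above ( collecting candidate names from all scripts ) could be sketched like this. It is a deliberately naive word scan, as the notes describe, so property names and words inside strings are collected too. collectCandidateNames, the shortened keyword list and the cleanWindowProps set are assumptions made for the example.

```javascript
// Shortened keyword list for illustration; a real list would be complete.
const JS_KEYWORDS = new Set([
  'var', 'let', 'const', 'function', 'return', 'if', 'else', 'for', 'while',
  'new', 'this', 'typeof', 'try', 'catch', 'finally', 'true', 'false', 'null',
]);

// Hypothetical helper: every identifier-like word becomes a candidate global,
// except keywords, "clean" window properties and EPA_ prefixed names.
function collectCandidateNames(scripts, cleanWindowProps) {
  const names = new Set();
  for (const code of scripts) {
    for (const word of code.match(/[A-Za-z_$][\w$]*/g) || []) {
      if (JS_KEYWORDS.has(word)) continue;
      if (cleanWindowProps.has(word)) continue; // e.g. taken from a blank iframe
      if (word.startsWith('EPA_')) continue;
      names.add(word);
    }
  }
  return [...names];
}

const candidates = collectCandidateNames(
  ['var counter = 1; console.log(counter);', 'EPA_sync(); total = counter + 1;'],
  new Set(['console'])
);
console.log(candidates);
```

Over-collection ( here the word log ) is harmless for correctness but adds to the overhead discussed later in these notes.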
Async/defer scripts
Worth special treatment. Globals in "detached" scripts most likely would not expose variables for reuse in external scripts. In this case execution could be delayed until after page load. There are exceptions though. Analytics on one side resides in a defer script, but custom code is inside of page scripts. Globals in this case serve as the joint. Need to collect the most popular globals to include in the vars list.
Most popular global variables
- jQuery, $, query versions
- '_' lodash
- dojo
- d3
- Sortable, Effect, ...( http://script.aculo.us )
- shaka ( Shaka Player )
- spf ( spfjs )
- swfobject, THREE, WebFont,
- React
- Vue
- Backbone
- Hammer
- $$, Class, Element, Request (Mootools.net)
- zawgyiDetector,? (Myanmar tools)
- s, t(), s_account (Adobe analytics)
- other
  - analytics
  - feedback
A good start to collect JS libs with globals is the CDN list at https://developers.google.com/speed/libraries
Unifying JS under a single script allows sharing a single variables list. But concatenation of multiple files conflicts with import statements, which are allowed ONLY at the top level of a module, definitely not inside a try{} block ( the finally section is needed for container window recovery and emitting the load event ).
What could be done further?
- try/finally could be substituted with setTimeout(x, 0) in the beginning of the concatenated script
  - problem with a broken script: it would prevent loading all other scripts, even if those are valid
- inject scripts individually; try/finally could be substituted with setTimeout(x, 0)
  - problem with common variable sets synchronized for all scripts
  - a bit of a hassle to collect the completion of all scripts to emit the DOMContentLoaded and load events.
  - on the positive side, broken JS scripts would not prevent other scripts from executing
Intermediate ( though more complex ) solution, matching the browser script loading convention:
- collect all non-module scripts into concatenated text. Such scripts are assumed not to have import statements and could use a try/catch enclosure for each section.
  - still suffers from a broken script preventing others from loading
- load type="module" scripts individually, synchronizing globals in the beginning of the script.
  - globals modified asynchronously would be changed only in their own scope. Which means cross-module data sharing would need to be done over window.XXX instead of direct XXX use. That could be tolerated in most cases.
According to MDN, the forms of static import are limited to:
import defaultExport from "module-name";
import * as name from "module-name";
import { export1 } from "module-name";
import { export1 as alias1 } from "module-name";
import { export1 , export2 } from "module-name";
import { foo , bar } from "module-name/path/to/specific/un-exported/file";
import { export1 , export2 as alias2 , [...] } from "module-name";
import defaultExport, { export1 [ , [...] ] } from "module-name";
import defaultExport, * as name from "module-name";
import "module-name";
Individual script loading seems to be quite attractive. PROS:
- broken scripts would not break others
- no conflicts on import variables between otherwise concatenated scripts
- development could use saved to FS rendered content for easy debug, in prod content would be embedded.
CONS: synchronization of globals across all script scopes
- upon completion of each.
- for async/await module code
With try/finally, or with code appended to the end ( when the script executes without exceptions ), synchronization is not an issue: globals would be immediately populated to epa.globals and propagated into each script scope ( meaning scopes should be exposed to embed-page ). Unfortunately try/finally is not an option for scripts with static imports.
SOLUTION: Sync code exposed as local method and called
- in the end of section
- on timeout(0) to cover the case of error or async code, with check whether sync has already been done.
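The two call sites of the sync code can be sketched as below, assuming the sync routine itself is passed in as a function. runSection and the injectable schedule parameter are hypothetical; the default scheduler is setTimeout(fn, 0) as in the notes.

```javascript
// Hypothetical wrapper: sync is called at the end of the section, and a
// timeout(0) fallback covers the case where the section throws first.
function runSection(body, sync, schedule = (fn) => setTimeout(fn, 0)) {
  let synced = false;
  const syncOnce = () => {
    if (synced) return; // check whether sync has already been done
    synced = true;
    sync();
  };
  schedule(syncOnce); // fallback for errors or async code
  body();             // may throw; the scheduled fallback still runs later
  syncOnce();         // normal path: end of section
}

const log = [];
runSection(() => log.push('body'), () => log.push('sync'));
console.log(log);
```

The flag keeps the fallback from syncing a second time when the section completes normally.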
The list of variables would be collected by loading all scripts before execution, saved into epa.globals, then listed and initialized from epa.globals in the beginning of each script.
The scope would register itself in epa.scripts and expose
- syncFrom( globals ) method which would copy globals to scope variables.
- syncTo( globals ) method to be called in the end of script, on timeout, or after async code ( event handlers, promise, etc. ).
The EPA_sync() method would be available in scope for implicit call by epa modules to populate epa.globals and propagate them into each scope.
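A rough sketch of such a scope, generated from a list of variable names. The epa object here is a plain stand-in, renderScript is a hypothetical name, and the names are assumed to be valid identifiers; the actual scriptTemplate in embed-page may be organized differently.

```javascript
// Hypothetical generator: wraps `code` in a function that declares the
// simulated globals as locals and exposes syncFrom / syncTo / EPA_sync.
function renderScript(names, code) {
  const decl = names.length ? 'var ' + names.join(', ') + ';' : '';
  const from = names.map((n) => `${n} = g.${n};`).join(' ');
  const to = names.map((n) => `g.${n} = ${n};`).join(' ');
  return new Function('epa', `
    ${decl}
    var scope = {
      syncFrom: function (g) { ${from} },
      syncTo: function (g) { ${to} }
    };
    function EPA_sync() {
      scope.syncTo(epa.globals);              // populate epa.globals
      epa.scripts.forEach(function (s) {      // propagate into other scopes
        if (s !== scope) s.syncFrom(epa.globals);
      });
    }
    epa.scripts.push(scope);                  // the scope registers itself
    scope.syncFrom(epa.globals);
    ${code}
    EPA_sync();                               // end-of-script sync
    return scope;
  `);
}

const epa = { globals: { counter: 1 }, scripts: [] };
const first = renderScript(['counter'], 'counter = counter + 1;')(epa);
const second = renderScript(['counter'], 'counter = counter * 10;')(epa);
console.log(epa.globals.counter, epa.scripts.length);
```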
Rather than using setTimeout(0), need to evaluate https://github.com/YuzuJS/setImmediate or https://github.com/medikoo/next-tick
Sequential execution of SCRIPT type="module" saved the hassle of hooking into the last script execution. The 'load' event is emitted on embed-page within a SCRIPT appended to the HTML body.
Since each SCRIPT has its own context with a simulated list of global variables, functions would reside in their own script closure. When functions are used from another module, globals would be visible as local to the script scope. Which means
- before the function is invoked the globals have to be copied into the function scope. But how to retrieve them from the caller scope first?
- after the function call the locals should be populated back to globals ( EmbedPage.globals ). But how to populate the caller's local variables from globals?
The top level SCRIPT functions would be trapped by the SCRIPT wrapper ( scriptTemplate ) and at the exit point from the script would be wrapped with code which performs the globals-to-locals sync before/after the function call.
The question of populating the scope of the caller is open for now. As an idea, an additional wrapper within the scope would populate locals.
Apparently, when calling the function from another scope, the marshaling of variables is needed in both scopes. Otherwise, upon return from the caller scope the callee scope variables are not updated.
The call sequence would be:
- caller scope: EPA_vars2globals()
- callee ( function ) scope: EPA_StartScope()
- call function()
- finally{ EPA_EndScope() }
- caller scope: EPA_globals2Vars()
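The callee side of this sequence could be sketched as a wrapper around a top level function. The scope object with syncFrom / syncTo follows the shape described earlier; wrapTopLevelFunction and makeCalleeScope are hypothetical names, and the caller side is simulated by writing to globals directly.

```javascript
// Hypothetical wrapper: syncs globals into the callee scope before the call
// and back to globals afterwards, even if the function throws.
function wrapTopLevelFunction(fn, scope, globals) {
  return function EPA_wrapped(...args) {
    scope.syncFrom(globals);       // plays the role of EPA_StartScope()
    try {
      return fn.apply(this, args); // call function()
    } finally {
      scope.syncTo(globals);       // plays the role of EPA_EndScope()
    }
  };
}

// Stand-in for a script closure holding a simulated global `total`.
function makeCalleeScope() {
  let total;
  const scope = {
    syncFrom(g) { total = g.total; },
    syncTo(g) { g.total = total; },
  };
  function add(n) { total += n; return total; }
  return { scope, add };
}

const globals = { total: 0 };
const callee = makeCalleeScope();
const add = wrapTopLevelFunction(callee.add, callee.scope, globals);

globals.total = 5;         // the caller has done its EPA_vars2globals()
const returned = add(3);   // the callee sees total = 5
console.log(returned, globals.total);
```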
Each top level function ( assuming it is not a dynamically changed variable ) needs to be surrounded by a wrapper in each scope only once.
- In the callee ( function declaration ) scope wrapping is done before script execution ( the function is available by name in the beginning of the module ).
- In other ( caller ) scopes wrapping should be done on demand in the EPA_globals2Vars() call. To avoid populating back into epa.globals, the wrapper would be marked as EPA_callerWrapper.
embed-page.loadCount
A SCRIPT, once added to DOM, is not going to be removed from execution even if it is removed or its content cleared. That creates difficulties if the embed-page content is changed before the previous content is executed.
The way around is to keep a flag specific to the loaded page within the generated script and compare it to the embed-page current flag value before execution. If the flag is not the same, it means the content has been changed and the script is not valid anymore, hence needs to be skipped.
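The flag check might be sketched as follows, assuming the flag is a counter incremented on every load ( the heading suggests embed-page.loadCount ). FakeEmbedPage is a stand-in class, not the real element.

```javascript
// Stand-in for the element; only the load counter logic is modeled.
class FakeEmbedPage {
  constructor() {
    this.loadCount = 0;
  }
  // Returns runnable wrappers; each remembers the loadCount of its own page.
  load(scripts) {
    const stamp = ++this.loadCount;
    return scripts.map((script) => () => {
      if (stamp !== this.loadCount) return undefined; // stale content: skip
      return script();
    });
  }
}

const page = new FakeEmbedPage();
const ran = [];
const stale = page.load([() => ran.push('old content')]);
const fresh = page.load([() => ran.push('new content')]); // content replaced

stale.forEach((run) => run()); // skipped, the page has moved on
fresh.forEach((run) => run());
console.log(ran);
```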
Scripts inlining
The none scope assumes no insulation, but each microapplication could expect the module script to execute at least once in its life cycle, which conflicts with the container behavior where a script runs only once in the page life cycle.
The use of imports and variables insulation are the perks of a module script, but due to the 'run only once' policy it is not possible to reuse such a script in different contexts of individual microapplications.
embed-page scope="none" would not honor the 'run only once' policy for top level scripts, module or not, keeping this policy only for modules loaded via the import JS statement.
The scoped embed-page has the same challenges with top level modules, which have to be invoked in each embed-page instance. The difference is in the implementation of scripts inlining.
document.currentScript
The browser gives current script access only to non-module scripts. That excludes the use of modern ES6 modular dependencies in content-aware scripts.
embed-page breaks this pattern by making document.currentScript available to both module and non-module top level scripts. Which gives
- ES6 modules dependencies capabilities
- access to SCRIPT tag parameters
- access to the DOM location where the SCRIPT is injected ( along with each occurrence execution ), making the scriptlets pattern useful.
Tricks to implement.
- as a script tag is not executed in a synchronous manner, the setting of currentScript should happen in the first lines of code. That is possible only in inlined code where the source is available as text to be changed. The scripts injection routine knows the sequence, adds the embed-page as a variable and uses the script index in code to set document.currentScript.
- mode=none would require replacement of the document object with a proxy which overrides currentScript
- In scoped embed-page the document is an EpaDocument instance which gives the setCurrentScript() method.
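The proxy variant could look like the sketch below. It assumes a plain object in place of the real document so it runs outside a browser; makeDocumentProxy is a hypothetical name, and exposing setCurrentScript() on the proxy borrows the EpaDocument method name purely for illustration.

```javascript
// Hypothetical factory: a document proxy whose currentScript is settable by
// the injection routine while everything else passes through.
function makeDocumentProxy(realDocument) {
  let current = null;
  return new Proxy(realDocument, {
    get(target, prop) {
      if (prop === 'currentScript') return current;
      if (prop === 'setCurrentScript') return (script) => { current = script; };
      // Read through to the real object; bind methods so `this` stays valid.
      const value = Reflect.get(target, prop, target);
      return typeof value === 'function' ? value.bind(target) : value;
    },
  });
}

// Stand-in for `document`; in a browser the real document would be passed.
const fakeDocument = {
  title: 'host page',
  currentScript: null,
  getTitle() { return this.title; },
};

const doc = makeDocumentProxy(fakeDocument);
doc.setCurrentScript({ id: 'script-0', type: 'module' }); // first line of inlined code
console.log(doc.currentScript.id, doc.title, doc.getTitle());
```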
Performance issues due to unified variables treatment
While in rev 0.0.20 the global variables defined in script and in event attribute level are handled more or less properly, the implementation suffers from quite a bit of overhead:
- variables list is literally whole bundle from all variables (internal and global)
- variables list is compiled from all scripts
- applying all variables even if those are not used in particular script.
- does not make a distinction between VAR, CONST, Functions resulting in exception trapping on each initialization attempt. (try/catch is a costly method if applied on all vars on each script and event handler)
To make global variables sync more efficient:
- extract global vars from AST into separate buckets ( per script or event attribute ), as opposed to sniffing all top level and scoped vars:
  - const,
  - let,
  - var,
  - function,
  - import vars
  - html vars
- AST buckets extraction per each script and event handler, as opposed to collecting all names for all scripts.
- in each script/event handler sync only variables which are actually used by this script, as opposed to syncing all vars.
- in script/event handler initialize locally used variables from globals only if they are not locally declared as let, const, function
- after script/event handler sync back only locally declared globals
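The bucket extraction could be sketched over an ESTree-shaped AST. The sketch assumes the tree comes from some ESTree-compatible parser, which is not included; here a hand-written Program node is used instead. bucketTopLevelNames is a hypothetical name, and html vars are left out since they come from the DOM rather than from the AST.

```javascript
// Hypothetical helper: sorts top level declared names into buckets by kind.
function bucketTopLevelNames(program) {
  const buckets = { const: [], let: [], var: [], function: [], import: [] };
  for (const node of program.body) {
    if (node.type === 'VariableDeclaration' && buckets[node.kind]) {
      for (const d of node.declarations) {
        // Destructuring patterns are skipped in this sketch.
        if (d.id.type === 'Identifier') buckets[node.kind].push(d.id.name);
      }
    } else if (node.type === 'FunctionDeclaration') {
      buckets.function.push(node.id.name);
    } else if (node.type === 'ImportDeclaration') {
      for (const s of node.specifiers) buckets.import.push(s.local.name);
    }
  }
  return buckets;
}

const id = (name) => ({ type: 'Identifier', name });
const decl = (kind, name) => ({
  type: 'VariableDeclaration',
  kind,
  declarations: [{ type: 'VariableDeclarator', id: id(name), init: null }],
});

// Hand-written AST for:
//   import { html } from "lib"; const A = 1; let b; var c; function d() {}
const program = {
  type: 'Program',
  body: [
    { type: 'ImportDeclaration', specifiers: [{ type: 'ImportSpecifier', local: id('html'), imported: id('html') }] },
    decl('const', 'A'),
    decl('let', 'b'),
    decl('var', 'c'),
    { type: 'FunctionDeclaration', id: id('d'), params: [], body: { type: 'BlockStatement', body: [] } },
  ],
};

const buckets = bucketTopLevelNames(program);
console.log(buckets);
```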
Variables handling sequence
- sanitize epa.globals_removable from epa.globals
- load HTML
- extract globals from DOM by id="XXX" into epa.globals & epa.globals_removable
- extract onXXX event handlers from DOM into a not executed SCRIPT
- load scripts body.
- extract globals from AST into script.globals & epa.globals
  - from variables declarations ( var, let, const, function )
  - from globals defined as window.XXX and window['XXX'] ( only for valid variable names )
  - from undeclared globals assigned without declaration XXX=abc;
- prepare and execute the scripts in the order defined by DOM, one at a time
  - declare and init var globals var XXX=epa_globals.XXX, YYY=epa_globals.YYY... from epa.globals, except for vars in script.globals ( those are initialized within the script itself )
  - define sync back from vars to epa.globals from the script.globals list
  - define a wrapper for global assignment window.XXX= to fill epa.document.currentScript.globals
  - execute the script
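The last group of steps ( preamble, sync back, execute ) can be sketched as text generation around the script body. renderPreamble and renderSyncBack are hypothetical helper names, epa_globals follows the naming used in the list above, and the window.XXX= wrapper step is omitted.

```javascript
// Hypothetical: declare every known global as a local var, except the ones
// this script declares itself ( those are initialized by the script ).
function renderPreamble(allGlobals, scriptGlobals) {
  const own = new Set(scriptGlobals);
  const inits = allGlobals
    .filter((name) => !own.has(name))
    .map((name) => `${name}=epa_globals.${name}`);
  return inits.length ? `var ${inits.join(', ')};` : '';
}

// Hypothetical: copy only the script's own globals back to epa_globals.
function renderSyncBack(scriptGlobals) {
  return scriptGlobals.map((name) => `epa_globals.${name}=${name};`).join(' ');
}

const epa_globals = { price: 21, total: 0 };
const scriptGlobals = ['total'];           // as extracted from the script AST
const code = 'var total = price * 2;';     // the script body

const rendered = [
  renderPreamble(Object.keys(epa_globals), scriptGlobals),
  code,
  renderSyncBack(scriptGlobals),
].join('\n');

new Function('epa_globals', rendered)(epa_globals); // execute the script
console.log(rendered);
console.log(epa_globals);
```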
Import vars as globals
In many cases the import with variables is sufficient to identify APIs meant to be used in global scope. They are treated in the particular script in a similar fashion to const variables, i.e. not allowed to be defined before and overridden later.
Access to such APIs is valuable in event handlers, but there is not much use in duplicating the import statement in SCRIPT and within an inline event handler.
Hence, an event handler could make good use of an import declaration when located
- below the import statement
- after all SCRIPT tags are executed
To minimize the number of SCRIPT tags, the event handlers could fit into the end of the last SCRIPT within the finally section.
Globals in event handlers
Undeclared variable assignment is a popular case: onclick="isClicked=true"
It means the body of the event handler should be scanned for globals in the same fashion as SCRIPT content. By injecting the event handler into the tail of the last script, all imports are made available from within.
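A rough illustration of scanning a handler body for undeclared assignments. The notes elsewhere call for AST-based extraction; this sketch swaps in a regular expression only to keep the example self-contained, so it can be fooled by strings and comments. findUndeclaredAssignments is a hypothetical name.

```javascript
// Hypothetical scanner: names assigned with `=` that are not declared with
// var/let/const in the same body and are not property assignments.
function findUndeclaredAssignments(body) {
  const declared = new Set();
  for (const m of body.matchAll(/\b(?:var|let|const)\s+([A-Za-z_$][\w$]*)/g)) {
    declared.add(m[1]);
  }
  const found = new Set();
  // Identifier not preceded by "." or a word char, followed by "=" but not
  // by "==" or "=>" ( comparison and arrow function are not assignments ).
  for (const m of body.matchAll(/(?<![.\w$])([A-Za-z_$][\w$]*)\s*=(?![=>])/g)) {
    if (!declared.has(m[1])) found.add(m[1]);
  }
  return [...found];
}

const handlerGlobals = findUndeclaredAssignments(
  "isClicked=true; count = count + 1; if (a == b) el.value = 'x'; var local = 1"
);
console.log(handlerGlobals);
```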