engine Bundles Refactor

Fixes #5626

Refactor and improve behavior of Bundles:

Loading an asset is now consistent and behaves the same whether or not it is in a bundle: if it is, loading the asset triggers loading of the bundle.
Loading bundles now uses fetch instead of XMLHttpRequest, which provides a ReadableStream as the source.
The bundle tar is now stream parsed, so each asset triggers its load as soon as it is extracted, before the whole bundle has downloaded. This spreads parsing time and improves overall loading speed.
Various text assets (e.g. text, json, template, ..) no longer create a Blob. This avoids polluting the network tab with Blob URL requests and loads such assets synchronously as soon as they become available from the bundle. A similar approach can be applied to other binary formats (separate PR).
Loading a bundle now triggers load for all of its assets once they are downloaded.
As loading and untar'ing a bundle takes little time, this PR removes the WebWorker path.
When loading an asset, if there are suitable bundles, loaded bundles are prioritized by default, followed by not-loaded bundles sorted by size (smallest first). A developer can control this logic using bundlesFilter.
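
To illustrate the default priority described in the last point, here is a minimal sketch. It is not the engine's implementation: orderBundles is a hypothetical helper, and the `loaded` and `size` fields are assumed names for plain objects standing in for bundle assets.

```javascript
// Hypothetical helper: orders candidate bundles as the description states.
// `loaded` and `size` are assumed field names for this sketch, not engine API.
function orderBundles(bundles) {
    return [...bundles].sort((a, b) => {
        if (a.loaded !== b.loaded) return a.loaded ? -1 : 1; // loaded bundles first
        if (a.loaded) return 0; // keep the relative order of loaded bundles
        return a.size - b.size; // not-loaded bundles: smallest first
    });
}

const candidates = [
    { name: 'big', loaded: false, size: 900 },
    { name: 'ready', loaded: true, size: 500 },
    { name: 'small', loaded: false, size: 100 }
];

const ordered = orderBundles(candidates).map((b) => b.name);
console.log(ordered.join(', ')); // ready, small, big
```

A custom bundlesFilter would replace this ordering with whatever choice the application needs.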

This makes working with bundles much easier and does not require any changes to existing projects that use bundles. It is actually a simplification: developers can load bundles and their assets asynchronously without custom code. It also slightly speeds up loading, as parsing is spread across the download.

New APIs:

// pc.AssetRegistry
assets.load(asset, { // optional options
    bundlesIgnore: true, // force asset loading not from a bundle
    bundlesFilter: (bundles) => { // called with all suitable bundles, if there are any
        return bundles[0]; // you can control which bundle to load asset from
    }
})

Benchmarks https://github.com/playcanvas/engine/pull/5675#issuecomment-1735230229

In case of no cache, loading is 18-35% faster depending on latency.

Worse latency = more benefits from bundles. Larger bundles (more assets in them) = more benefit from bad latency. Preloaded vs post-loaded assets improvements might vary depending on bundles contents. When loading from cache, there is no effect (marginal error).

I confirm I have read the contributing guidelines and signed the Contributor License Agreement.

Sep 25 '23 17:09 Maksims

Is it possible to have some perf numbers in the PR description please? I think the concern before was that bundles didn't reliably give better startup times than unbundled assets. So if we can now demonstrate that, I think I'd be much happier to promote bundle usage to users.

Sep 25 '23 17:09 willeastcott

Given the following example, it's not totally clear to me whether bundlesIgnore takes precedence or if it loads from the first bundle specified in the bundlesFilter.

// pc.AssetRegistry
assets.load(asset, {
    bundlesIgnore: true,
    bundlesFilter: (bundles) => bundles[0]
})

This could potentially be simplified with something along the following lines:

// pc.AssetRegistry
assets.load(asset, { // optional options object
    // By default (undefined), or if true, and multiple bundles contain the asset, it chooses an already loaded bundle or, if none is available, the smallest bundle by file size. If false, the asset loads directly.
    useBundle: (bundles) => {
        return bundles[0]; // provide a custom function to control this logic; a falsy return value behaves as if useBundle: false
    }
})

Examples would be:

assets.load(asset) // Asset loads from bundle if it exists in one
assets.load(asset, { useBundle : true }) // Same as above
assets.load(asset, { useBundle : false }) // Asset does not load from a bundle and loads directly
assets.load(asset, { useBundle : bundles => bundles[0] }) // Asset loads from a specific bundle if it exists

Sep 26 '23 07:09 marklundin

Below are benchmarks of a specific project that I believe is a good case for bundles.
It preloads some assets to show a menu screen with some interaction (it is renderable at that moment), then post-loads more assets so the application is fully playable. Assets (/files/ directory) are hard cached (will load from local cache). Timings start from when the DOM is loaded. Averages are from 5 runs for each case.

Conclusions:

In case of no cache, loading is 18-35% faster depending on latency.

Worse latency = more benefits from bundles. Larger bundles (more assets in them) = more benefit from bad latency. Preloaded vs post-loaded assets improvements might vary depending on bundles contents. When loading from cache, there is no effect (marginal error).

Test Types:

no bundles, no cache: 312 requests, 12.3Mb download
no bundles, cache: 309 requests, 4.1Kb download
bundles, no cache: 65 requests (163 blobs), 12.4Mb download
bundles, cache: 62 requests (163 blobs), 4.1Kb download

Timing Names:

First Frame - when first update is called.
Renderable - when assets are preloaded and shaders compiled (second frame usually).
Playable - when preload and post load assets are loaded and app is fully playable.

Timings (good latency, Latvia - Frankfurt):

Platform	Bundles	Cache	First Frame	Renderable	Playable	Advantage (higher is better)
PC (fibre)			3.46s	4.63s	5.71s	-
PC (fibre)	:white_check_mark:		2.32s +33%	3.60s +22%	4.63s	+19%
PC (fibre)		:white_check_mark:	1.83s	3.14s	3.81s	-
PC (fibre)	:white_check_mark:	:white_check_mark:	1.83s +0%	3.10s +1%	3.96s	-4%
Mobile (5G)			5.45s	7.69s	9.42s	-
Mobile (5G)	:white_check_mark:		4.19s +23%	6.52s +15%	7.69s	+18%
Mobile (5G)		:white_check_mark:	3.46s	5.82s	6.88s	-
Mobile (5G)	:white_check_mark:	:white_check_mark:	3.88s -12%	6.21s -7%	7.32s	-6%
Mobile (4G)			6.62s	8.93s	11.45s	-
Mobile (4G)	:white_check_mark:		4.49s +22%	7.3s +18%	9.24s	+19%
Mobile (3G)			7.84s	10.11s	14.46s	-
Mobile (3G)	:white_check_mark:		6.42s +18%	8.7s +14%	13.88s	+4%

Timings (bad latency 150ms+, Latvia - San Francisco):

Platform	Bundles	Cache	First Frame	Renderable	Playable	Advantage (higher is better)
PC (fibre)			7.84s	9.02s	12.06s	-
PC (fibre)	:white_check_mark:		4.09s +48%	5.45s +40%	7.81s	+35%
PC (fibre)		:white_check_mark:	1.87s	3.11s	3.96s	-
PC (fibre)	:white_check_mark:	:white_check_mark:	1.98s -6%	3.3s -6%	3.89s	+2%
Mobile (4G)			13.31s	15.6s	20.67s	-
Mobile (4G)	:white_check_mark:		7.89s +41%	10.17s +35%	13.39s	+35%
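
The percentages in both tables appear to be the relative time saved against the matching non-bundled row. That formula is an assumption inferred from the numbers, not something stated in the thread, and advantage is a hypothetical helper name. A small sketch that reproduces a few of the figures:

```javascript
// Assumed formula: time saved relative to the non-bundled baseline, in percent.
function advantage(baselineSeconds, bundledSeconds) {
    return Math.round(((baselineSeconds - bundledSeconds) / baselineSeconds) * 100);
}

// PC (fibre), no cache, good latency: First Frame and Playable.
const pcFirstFrame = advantage(3.46, 2.32);
const pcPlayable = advantage(5.71, 4.63);

// Mobile (4G), no cache, bad latency: Playable.
const mobilePlayable = advantage(20.67, 13.39);

console.log(pcFirstFrame, pcPlayable, mobilePlayable); // 33 19 35
```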

Sep 26 '23 10:09 Maksims

@marklundin When bundlesIgnore is provided, bundlesFilter is not called, as there are no bundles to filter. The same happens if the asset is not in a bundle and bundlesFilter is provided: it will not be called, as there are no bundles to filter. bundlesFilter is called only when there are suitable bundles (ones that contain the asset) and custom behavior is required (I still haven't come up with a case for it, tbh). Within the engine, we try to avoid multi-typed arguments (boolean | function), as they make the API harder to learn and IDE tooltips harder to read.

Here is how it works in the current PR:

// default to be loaded from bundle if any is suitable
assets.load(asset);

// force asset to be loaded not from a bundle
assets.load(asset, {
    bundlesIgnore: true
});

// choose a random bundle (if any suitable) to load from
assets.load(asset, {
    bundlesFilter: (bundles) => {
        const ind = Math.floor(Math.random() * bundles.length); // floor keeps the index within 0..length-1
        return bundles[ind];
    }
});
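
The precedence between the two options can be summarized with a rough sketch. resolveBundle is a hypothetical function written for this comment, not the engine's code, and bundles are represented by plain strings; the default choice is simplified to the first candidate.

```javascript
// Hypothetical resolver showing the precedence described above.
function resolveBundle(suitableBundles, options = {}) {
    if (options.bundlesIgnore) return null; // load the asset directly, filter is never called
    if (suitableBundles.length === 0) return null; // asset is not in any bundle, nothing to filter
    if (options.bundlesFilter) return options.bundlesFilter(suitableBundles);
    return suitableBundles[0]; // stand-in for the default choice
}

let filterCalls = 0;
const pickLast = (bundles) => {
    filterCalls++;
    return bundles[bundles.length - 1];
};

const ignored = resolveBundle(['a.tar', 'b.tar'], { bundlesIgnore: true, bundlesFilter: pickLast });
const notBundled = resolveBundle([], { bundlesFilter: pickLast });
const filtered = resolveBundle(['a.tar', 'b.tar'], { bundlesFilter: pickLast });

console.log(ignored, notBundled, filtered, filterCalls); // null null b.tar 1
```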

Sep 26 '23 10:09 Maksims

Just wanted to ask another question about bundles. Obviously, these work fine in the Editor. But are they also useful for engine-only developers? So I believe I originally recommended the TAR format because anyone can open a terminal window and tar up some files. But is that all a bundle is? A vanilla TAR file? Or is there extra metadata that would make it hard for engine-only users to create?

Sep 26 '23 13:09 willeastcott

It is just a tar file. Tests run over a bundle that I've produced using command line tar. The only rule is that asset.file.url of each asset in the bundle should match the folder structure in the tar. Also, the bundle asset should have a list of asset ids in asset.data.assets. So they are perfectly usable by engine-only users, and easy to create/manage.
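
To show why the entry paths matter, here is a minimal sketch of reading entry names and sizes from a plain tar held in memory. It relies only on the standard tar header layout (512-byte blocks, name at byte 0, octal size at byte 124). It is not the engine's parser: listTarEntries and makeTar are hypothetical helpers, and the sketch ignores the ustar prefix field, long-name extensions, and checksums.

```javascript
const BLOCK = 512;
const textDecoder = new TextDecoder();

// Read a NUL-terminated ASCII field from a header block.
function readField(bytes, offset, length) {
    const field = bytes.subarray(offset, offset + length);
    const end = field.indexOf(0);
    return textDecoder.decode(end === -1 ? field : field.subarray(0, end));
}

// List { name, start, size } for each entry of an in-memory tar archive.
function listTarEntries(bytes) {
    const entries = [];
    let offset = 0;
    while (offset + BLOCK <= bytes.length) {
        const name = readField(bytes, offset, 100);
        if (name === '') break; // an all-zero block marks the end of the archive
        const size = parseInt(readField(bytes, offset + 124, 12).trim(), 8) || 0;
        entries.push({ name, start: offset + BLOCK, size });
        offset += BLOCK + Math.ceil(size / BLOCK) * BLOCK; // data is padded to whole blocks
    }
    return entries;
}

// Build a tiny single-file archive in memory to exercise the parser.
function makeTar(name, text) {
    const encoder = new TextEncoder();
    const data = encoder.encode(text);
    const tar = new Uint8Array(BLOCK + Math.ceil(data.length / BLOCK) * BLOCK + 2 * BLOCK);
    tar.set(encoder.encode(name), 0);
    tar.set(encoder.encode(data.length.toString(8).padStart(11, '0')), 124);
    tar.set(data, BLOCK);
    return tar;
}

const tar = makeTar('assets/data.json', '{"a":1}');
const entries = listTarEntries(tar);
const first = entries[0];
const content = textDecoder.decode(tar.subarray(first.start, first.start + first.size));
console.log(first.name, first.size, content);
```

The entry name recovered here is what asset.file.url would need to match.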

Sep 26 '23 13:09 Maksims

It'd be great to have an engine example demonstrating its use.

Sep 26 '23 13:09 mvaligursky

Added an engine-only example. It describes the process of creating a bundle, which is as simple as:

cd engine/examples/
tar cvf bundle.tar assets/models/geometry-camera-light.glb assets/models/torus.png

Then make sure asset.file.url matches that tar's folder structure.

Sep 28 '23 10:09 Maksims

This PR is looking great @Maksims!

One thing I'd like to address is the new asset openBinary function. We already have similar functionality using fetchArrayBuffer which is used for loading embedded images from glb files.

fetchArrayBuffer will either return asset.file.contents if it is provided, or it will download and return the file data using the url in the normal way. The helper means that callers don't have to worry either way. This fast path results in much less packing and unpacking of data when it's already available, much like openBinary. One advantage of asset.file.contents, though, is that users can also provide the asset file contents themselves.

So yeah, ideally the asset system would have just one method for this fast path that works everywhere.

Oct 03 '23 10:10 slimbuck

@slimbuck I've looked into fetchArrayBuffer and could not figure out how to combine the Bundle and fetchArrayBuffer ways of opening data. First of all, opening data from a DataView differs per asset type handler, so either way we need custom "pre-processing" per handler type, e.g. for JSON it is: JSON.parse(this.decoder.decode(data)); where we have to decode it first and then parse it. Also, if fetchArrayBuffer fails to use asset.file.contents, it will make an arraybuffer request, which is not desirable.
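
As a rough illustration of that per-handler step for JSON, here is a standalone sketch. openJsonBinary is a hypothetical function name for this comment, not the handler's actual method, and the decoder is created lazily on first use.

```javascript
let jsonDecoder = null; // created on first use

// Hypothetical per-handler step: turn a DataView over bundle bytes into parsed JSON.
function openJsonBinary(dataView) {
    if (!jsonDecoder) jsonDecoder = new TextDecoder('utf-8');
    return JSON.parse(jsonDecoder.decode(dataView)); // decode bytes to text, then parse
}

// Simulate a slice of bundle data containing a JSON asset.
const bytes = new TextEncoder().encode('{"name":"torus","lods":[0,1]}');
const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);

const parsed = openJsonBinary(view);
console.log(parsed.name, parsed.lods.length); // torus 2
```

A text or template handler would stop after the decode step, which is why the pre-processing cannot be shared as-is across asset types.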

Oct 07 '23 14:10 Maksims

What effect will it have on the size of the application config file? We have some apps with thousands of assets, which creates a config JSON of several megabytes. The browser takes too long just to read such a large file, and that is before the PlayCanvas application parses it. It would be great if bundles were a solution to this. If each asset is still referenced in the config file, then it won't help us. If this adds another field for each asset in the config file, increasing the final size, it will even make it worse.

Fewer HTTP requests is still good, though. We currently mitigate those by creating archives, then unpacking and generating assets at runtime. It's only suitable for some assets, though, e.g. we can't do it for materials, etc. I suppose bundles would help us in this regard?

I guess what I am trying to ask is whether an asset of any type can be invisible to the config file, live only in a bundle, and be loaded only by referencing the correct bundle at load time?

Oct 17 '23 14:10 LeXXik

Hi @LeXXik. Bundles only pack the actual files in a tar. The metadata of an asset (json data in config.json) is not affected. The bundle's meta json (data in config.json) only stores asset IDs as an array, so the impact on config.json size is minimal.

I had a similar challenge to the one you describe, and I solved it by pre-processing config.json after build, where I used tags to mark some assets in a way that allowed me to split one config.json into multiple asset-meta.json files, which I then loaded based on application logic. In another case with a similar challenge, I actually used a database with an API to query for lists of assets based on application logic.

Oct 17 '23 15:10 Maksims

We've tested this PR deployed with the engine in a production project, and it all performed as expected. @willeastcott please let me know what is holding up the review.

Nov 01 '23 08:11 Maksims

Is there any update on this? This could potentially increase the C2P values of our web games.

Jan 30 '24 17:01 devcem

Hey, @slimbuck and @mvaligursky - I want us to set aside some time this week to properly review this PR and finally get it merged. 🙏

Feb 04 '24 20:02 willeastcott

Hi @Maksims - we're keen to get to review this / finalize this. When you get a chance, could you please merge main to it and resolve the conflicts.

Feb 27 '24 10:02 mvaligursky

Updated this PR:

Example
TextDecoder is now lazily initialized same as in the other places in the engine.
Added docs where needed. openBinary should be added only on specific handlers where it is necessary, so not adding it to ResourceHandler.

Tested on existing project - works as before.

Mar 04 '24 14:03 Maksims
