Improve TDocStd_Document Memory Management via Lazy Loading
Description
Currently, the TDocStd_Document API loads the entire dataset into memory upon initialization. This approach can quickly lead to severe memory consumption and reduced performance, especially when dealing with very large assemblies. In many real-world applications, the .xbf files can reach tens of gigabytes, which makes the current eager loading strategy unsuitable for systems with limited RAM.
Use Case
Implementing a lazy loading mechanism would allow users to navigate and interact with the Hierarchical Assembly Graph (HAG) independently of the underlying BRep data.
Scenario: Users iterating over a large assembly tree could access structural relationships and metadata without incurring the cost of loading all the BRep geometries into memory.
On-Demand Loading: When a particular node’s detailed geometry is required, the corresponding BRep data would be loaded dynamically, significantly reducing the upfront memory footprint.
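To make the request concrete, here is a minimal sketch of the behaviour described above, written against the C++ standard library only. `Geometry`, `AssemblyNode`, and `countNodes` are hypothetical names invented for illustration; none of them are OCCT types, and this is not how TDocStd_Document works today. The tree (names and children) is always resident, while each node's geometry is produced by a loader callback on first access.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for heavy BRep data; not an OCCT type.
struct Geometry {
  std::vector<double> coords;
};

// Hypothetical assembly node: the name and children are always in memory,
// the geometry is fetched through a loader callback on first access.
class AssemblyNode {
public:
  using Loader = std::function<std::shared_ptr<Geometry>()>;

  AssemblyNode(std::string name, Loader loader)
      : name_(std::move(name)), loader_(std::move(loader)) {
    assert(loader_ && "a node needs a callable geometry loader");
  }

  const std::string& name() const { return name_; }
  bool isLoaded() const { return geometry_ != nullptr; }

  AssemblyNode& addChild(std::string name, Loader loader) {
    children_.push_back(
        std::make_unique<AssemblyNode>(std::move(name), std::move(loader)));
    return *children_.back();
  }

  const std::vector<std::unique_ptr<AssemblyNode>>& children() const {
    return children_;
  }

  // Invokes the loader on the first call, then returns the cached result.
  const Geometry& geometry() {
    if (!geometry_) {
      geometry_ = loader_();
      if (!geometry_) throw std::runtime_error("loader returned no geometry");
    }
    return *geometry_;
  }

private:
  std::string name_;
  Loader loader_;
  std::shared_ptr<Geometry> geometry_;
  std::vector<std::unique_ptr<AssemblyNode>> children_;
};

// Walks the structure only; no loader is invoked during traversal.
std::size_t countNodes(const AssemblyNode& node) {
  std::size_t n = 1;
  for (const auto& child : node.children()) n += countNodes(*child);
  return n;
}
```

With this shape, iterating over the tree with `countNodes` never touches geometry; only an explicit `geometry()` call on a node pays the loading cost, and only once for that node.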
Benefits
Reduced Memory Usage: By decoupling the assembly structure from the heavy BRep geometry data, applications can load and manipulate larger documents efficiently even on systems with 8GB to 64GB of available RAM.
Improved Performance: Lazy loading can lead to faster startup times and smoother navigation through complex assemblies, as only the necessary parts of the file are loaded at any given time.
Enhanced User Experience: Developers working with large data sets will benefit from a more responsive system that can handle complex assemblies without exhausting system resources.
Scalability: This approach paves the way for supporting even larger assemblies in the future, thus extending the applicability of the toolkit to more demanding industrial applications.
Additional Context
Industry Standards: Modern CAD and modeling systems increasingly rely on lazy loading techniques to balance resource allocation, and incorporating a similar strategy in OpenCascade could enhance its competitiveness.
@GabrielJMS have you investigated the partial loading mechanism for OCAF-based documents via PCDM_ReaderFilter, introduced by 0031918: Application Framework - New binary format for fast reading part of OCAF document?
I suppose that it was designed specifically for such scenarios - for loading structure first, and then loading only necessary data later, though I've never used it myself.
We have multiple projects that rely on filtering, so it works well enough. But I would recommend using XBF version 11 or lower. As for XBF versus native CAD formats: XBF is not a CAD format; it stores application session information and can hold a wide range of data. XCAF is only a CAD-specific part of that, which I find very unbalanced, and there are plans for some reorganization during the development of 8.0.0.
OCAF by default has no information about the purpose it is being used for. How partial loading should work depends on what the user or developer needs. In that context, users must build their own filter based on the class mentioned by gkv311. In summary, there is no way to implement this request at the XBF level, because OCAF is too flexible. We could certainly prepare an implementation for XCAF, solely for XDE, but it could lead to issues with custom attributes or complex relationships with shapes, which are accepted by OCAF but not expected by XBF. I still hope for a completely new internal data exchange format targeted purely at CAD data (and backwards compatible with XCAF).
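For readers unfamiliar with the idea of a reader filter, the following is a toy model of what such a filter decides, assuming labels are addressed by colon-separated entries such as "0:1:1:2". `ReadFilter` and its methods are hypothetical and written only for illustration; this is not the PCDM_ReaderFilter interface, whose actual methods should be checked in the OCCT reference documentation.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical read filter; a toy model, not the PCDM_ReaderFilter interface.
class ReadFilter {
public:
  // Attributes of this type are not read at all (e.g. heavy shape data).
  void skipAttribute(const std::string& typeName) { skipped_.insert(typeName); }

  // Restrict reading to the sub-tree rooted at this label entry.
  void addPath(const std::string& entry) {
    assert(!entry.empty() && "a sub-tree path must name a label entry");
    paths_.push_back(entry);
  }

  bool attributePassed(const std::string& typeName) const {
    return skipped_.count(typeName) == 0;
  }

  // True if the label lies inside a requested sub-tree
  // (or if no sub-tree restriction was set).
  bool labelPassed(const std::string& entry) const {
    if (paths_.empty()) return true;
    for (const auto& root : paths_)
      if (isSameOrDescendant(entry, root)) return true;
    return false;
  }

  // True if a reader has to descend through this label
  // to reach one of the requested sub-trees.
  bool labelOnTheWay(const std::string& entry) const {
    for (const auto& root : paths_)
      if (isSameOrDescendant(root, entry)) return true;
    return false;
  }

private:
  // Compares whole tags, so "0:1:1:20" is not a descendant of "0:1:1:2".
  static bool isSameOrDescendant(const std::string& entry,
                                 const std::string& root) {
    if (entry == root) return true;
    return entry.size() > root.size() &&
           entry.compare(0, root.size(), root) == 0 &&
           entry[root.size()] == ':';
  }

  std::set<std::string> skipped_;
  std::vector<std::string> paths_;
};
```

A structure-first pass would skip the shape-carrying attribute type everywhere, and a later pass would read a single sub-tree including its shapes. Which attribute types are safe to skip depends on the application, which is the point made above about OCAF being purpose-agnostic.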
@gkv311 @dpasukhi Thank you for the insights.
I believe the core issue lies in the need to store a full TopoDS_Compound for each assembly prototype that participates in an assembly, in order to register it within a TDocStd_Document. In practical applications, this approach results in all TopoDS_Shape objects being loaded into memory, which becomes extremely costly in terms of memory usage for large assemblies.
A more scalable and efficient framework would allow us to define the assembly structure by creating part prototypes and referencing their instances (with transformation data) without needing to hold their full BRep geometry in memory upfront. Instead, the BRep data for each prototype should be stored externally (e.g., in a file or streaming format) and loaded on demand when a user or application explicitly requires it. A mesh could also be stored in the prototype so that the assembly can still be visualized without loading the BRep.
This decoupling of structure and geometry aligns with modern practices in CAD systems and would significantly improve scalability when dealing with large assemblies.
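A rough sketch of the data model I have in mind, using only the C++ standard library. All names here (`BRepBlob`, `Mesh`, `Prototype`, `Instance`, `Assembly`, and the `fetch` callback) are hypothetical and are not part of OCCT; the external storage is abstracted as a callback that maps a key to geometry.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct BRepBlob { std::string bytes; };        // stand-in for heavy BRep data
struct Mesh { std::vector<float> vertices; };  // light, kept resident for display
using Transform = std::array<double, 12>;      // 3x4 row-major placement

struct Prototype {
  std::string geometryKey;         // where the BRep lives externally
  Mesh previewMesh;                // available without loading the BRep
  std::weak_ptr<BRepBlob> cached;  // does not keep the geometry alive
};

struct Instance {
  std::size_t prototypeIndex;
  Transform placement;
};

class Assembly {
public:
  using Fetch = std::function<std::shared_ptr<BRepBlob>(const std::string&)>;

  explicit Assembly(Fetch fetch) : fetch_(std::move(fetch)) {
    assert(fetch_ && "an external geometry source is required");
  }

  std::size_t addPrototype(std::string key, Mesh mesh) {
    prototypes_.push_back(Prototype{std::move(key), std::move(mesh), {}});
    return prototypes_.size() - 1;
  }

  void addInstance(std::size_t prototypeIndex, const Transform& placement) {
    assert(prototypeIndex < prototypes_.size());
    instances_.push_back(Instance{prototypeIndex, placement});
  }

  const std::vector<Instance>& instances() const { return instances_; }

  // All instances of a prototype share one loaded copy. The copy is freed
  // when the last caller drops it, and fetched again on the next request.
  // The result is whatever the fetch callback returned, possibly null.
  std::shared_ptr<BRepBlob> geometryOf(const Instance& instance) {
    assert(instance.prototypeIndex < prototypes_.size());
    Prototype& proto = prototypes_[instance.prototypeIndex];
    if (auto alive = proto.cached.lock()) return alive;
    auto loaded = fetch_(proto.geometryKey);
    proto.cached = loaded;
    return loaded;
  }

private:
  Fetch fetch_;
  std::vector<Prototype> prototypes_;
  std::vector<Instance> instances_;
};
```

Holding the geometry through a `std::weak_ptr` is one possible design choice: the assembly never pins BRep data in memory by itself, so the resident set is bounded by what callers are actively using. A real implementation might prefer an explicit cache with an eviction policy to avoid repeated fetches.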