SQLite database

Open kozintsev opened this issue 8 years ago • 24 comments

Hello!

Why do you use Esent rather than SQLite? Do you have plans to use SQLite?

kozintsev avatar Oct 24 '16 16:10 kozintsev

Hello, this is a long-standing desire of mine, but we needed to port to IFC4 first. What we need to ensure is that efficiency is similar to the Esent engine, and this is not obvious. We would probably have to cache a pool of SQLite commands and keep them open to speed up data access (so that the SQL syntax is pre-compiled). We are happy to accept help on this.
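As a rough illustration of the "pre-compiled commands" idea (a sketch only, not xBIM code): Python's standard `sqlite3` module keeps a per-connection cache of compiled statements, so issuing the same parameterised SQL text repeatedly avoids re-parsing it. The `entity` table, its columns and the `get_entity` helper are hypothetical names invented for this example.

```python
import sqlite3

# cached_statements sets how many compiled statements the connection keeps.
conn = sqlite3.connect(":memory:", cached_statements=64)

# Hypothetical layout: one row per entity, keyed by its entity label.
conn.execute("CREATE TABLE entity (label INTEGER PRIMARY KEY, type TEXT, data TEXT)")
conn.executemany(
    "INSERT INTO entity VALUES (?, ?, ?)",
    [(1, "IFCWALL", "w1"), (2, "IFCDOOR", "d1"), (3, "IFCWALL", "w2")],
)

# Keeping the SQL text constant and passing values as parameters lets the
# connection reuse the compiled statement instead of compiling it per call.
SELECT_BY_LABEL = "SELECT type, data FROM entity WHERE label = ?"


def get_entity(label):
    """Fetch one entity row by label using the cached statement."""
    return conn.execute(SELECT_BY_LABEL, (label,)).fetchone()


door = get_entity(2)
missing = get_entity(99)
```

In .NET the equivalent would presumably be to hold on to prepared command objects; whether that matches Esent's access speed is exactly the open question raised above.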

CBenghi avatar Oct 24 '16 20:10 CBenghi

I'll keep this open as a reminder.

CBenghi avatar Nov 15 '16 15:11 CBenghi

Why do you want a SQL database for this kind of data? Isn't VelocityDB a much better choice for a fast and efficient object cache store?

I personally vote to use an object database as the cache store.

Based on those benchmarks, an object database is much faster than traditional SQL databases. Benchmarks are provided here; as you can see, VelocityDB is almost 20 times faster than SQLite. And I don't think this stone-age Esent DB is faster than SQLite.

And there is another huge advantage in moving to another cache DB: if we drop Esent, then we can make xBIM cross-platform.

NPOI also has some cross-platform replacements.

steve3d avatar Nov 07 '18 17:11 steve3d

As interesting as it sounds, VelocityDB is not free. The minimum cost for a perpetual license is $200 per user. This is certainly not an option for xBIM. But if you purchase a license for your product, you are free to write an IModel implementation based on VelocityDB.

martin1cerny avatar Nov 08 '18 07:11 martin1cerny

I got it, but I still have some questions about this Esent store.

What is this Esent store for? Is it an internal caching method or just another data store file format? I googled this but found nothing.

steve3d avatar Nov 08 '18 09:11 steve3d

And what will happen if I strip out the entire Esent support to create a cross-platform version of XbimEssentials?

If this is just another internal file format, then I will definitely not use it. But if it is an important internal caching method, what will stripping it out cause?

steve3d avatar Nov 08 '18 09:11 steve3d

If you hold on a few days I'll publish this. There will be more granular packages where most of Essentials will target .NET Standard 2 and .NET 4.7. Esent will still be there but will only be available for .NET 4.7.

martin1cerny avatar Nov 08 '18 09:11 martin1cerny

And to answer your question: Esent is an embedded database which is part of every installation of Windows. It is used as a storage format in cases where IFC data is to be accessed repeatedly, where it provides constant access time, or in scenarios where memory consumption is important: data is kept on the drive in the DB and is only read as transient objects, so the memory footprint is kept low even for extensive model operations (like geometry processing).

martin1cerny avatar Nov 08 '18 09:11 martin1cerny

More info on Esent: https://en.wikipedia.org/wiki/Extensible_Storage_Engine

Yes, we need to remove this dependency to support cross platform use cases.

EsentModel is an implementation of IModel that's employed heuristically when models exceed a certain size. You could remove it, but performance on larger models will be horrible if you just use the MemoryModel.

We're looking into other options, but I'm pretty sure we'll want a storage layer that's compatible with the open source licence we use. The IModel is pluggable, so it's possible we could provide plugins that support commercial storage engines.

andyward avatar Nov 08 '18 16:11 andyward

Well, I've seen you guys work fast. As for this cache model, why don't we switch to System.Runtime.Caching? It provides MemoryCache as a pre-made cache provider, so by using this, a user can supply any kind of provider to use as a cache, and I think this will also get rid of the Esent DB. It's the same idea as when you switched to Microsoft.Extensions.Logging.

steve3d avatar Nov 19 '18 04:11 steve3d

Esent is actually quite the opposite of a memory cache. It is a persistent database which makes it possible to process large models with a minimal memory footprint, allowing large models to be processed simultaneously or with limited memory resources. There is an IModel interface, and you are free to provide any implementation of it; it will work with all the schema implementations.
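To make the "any implementation of the interface" point concrete, here is a minimal sketch of a pluggable storage contract with an in-memory backend and a disk-capable SQLite backend. It is written in Python for brevity; `EntityStore`, `MemoryStore`, `SqliteStore` and `count_walls` are hypothetical names and do not reflect xBIM's actual IModel API.

```python
import sqlite3
from typing import Iterator, Protocol


class EntityStore(Protocol):
    """Hypothetical storage contract that calling code depends on."""

    def add(self, label: int, type_name: str) -> None: ...

    def of_type(self, type_name: str) -> Iterator[int]: ...


class MemoryStore:
    """Keeps every entity in RAM; simple, but memory grows with the model."""

    def __init__(self):
        self._entities = {}

    def add(self, label, type_name):
        self._entities[label] = type_name

    def of_type(self, type_name):
        return (lbl for lbl, t in self._entities.items() if t == type_name)


class SqliteStore:
    """Keeps entities in SQLite; pass a file path to persist them on disk."""

    def __init__(self, path):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS entity (label INTEGER PRIMARY KEY, type TEXT)"
        )

    def add(self, label, type_name):
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute("INSERT INTO entity VALUES (?, ?)", (label, type_name))

    def of_type(self, type_name):
        rows = self._conn.execute(
            "SELECT label FROM entity WHERE type = ? ORDER BY label", (type_name,)
        )
        return (row[0] for row in rows)


def count_walls(store: EntityStore) -> int:
    """Calling code only sees the contract, never the backend."""
    return sum(1 for _ in store.of_type("IFCWALL"))


results = []
for store in (MemoryStore(), SqliteStore(":memory:")):
    store.add(1, "IFCWALL")
    store.add(2, "IFCDOOR")
    store.add(3, "IFCWALL")
    results.append(count_walls(store))
```

Both backends give the same answer to `count_walls`; only the memory and persistence characteristics differ, which is the trade-off being discussed here.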

martin1cerny avatar Nov 19 '18 07:11 martin1cerny

I know that. Maybe I didn't make myself clear: I mean that if we switch to System.Runtime.Caching and use MemoryCache to replace the current memory model, then you don't need to provide any non-memory cache, because there are some non-memory cache providers on NuGet. If users of XbimEssentials need other ways of caching, they can just use an existing disk cache from NuGet, or stick to the Caching standard to create their own disk/object/SQL or whatever-based cache provider.

By switching to System.Runtime.Caching, you will only need to care about how to use the standard way of caching, not the caching implementation. So basically I'm suggesting a move from handling both what to do and how to do it, to handling only what to do.

steve3d avatar Nov 19 '18 10:11 steve3d

And this is exactly the same as when you changed from log4net to Microsoft.Extensions.Logging, right?

By using Microsoft.Extensions.Logging, you only need to care about what to log, and leave how to log to the log provider.

Same idea: by using System.Runtime.Caching, you only need to care about what to cache and when to cache it. You don't need to care about how the cache is stored; that is the cache provider's job.

steve3d avatar Nov 19 '18 10:11 steve3d

A cache may work for some scenarios, but not all. A cache is by definition ephemeral - reboot the server and it's gone - and obviously items may be purged based on usage/lack of usage etc.

The Esent IModel implementation was designed for a couple of purposes:

  1. As a long-term persistent store of the IFC data in a structure that can be queried interactively without a significant conversion overhead. This is particularly of benefit for retaining the geometry - you don't want to be recalculating it every time a cache is purged (especially in a web context). Just as importantly, parse/scan is also not cheap on bigger models.
  2. It supports indexing, which the Linq queries can use to optimise querying of large models without loading the whole dataset into RAM. When models get into the hundreds of MBs you don't want to be holding them in memory, especially if you're providing some kind of web service. The InMemory model is quite naive when evaluating queries, so it will basically brute-force enumerate the results, which is inefficient when scanning through millions of entities.
  3. If you're editing/creating data you want a persisted ACID record of the changes. We could save back to a new IFC-SPF but it would be heavyweight.
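The difference between brute-force enumeration and an indexed lookup (point 2) can be sketched with SQLite standing in for a persistent, indexed store. This is an illustrative Python sketch only, not how Esent or xBIM work internally; the `entity` table and `idx_entity_type` index are hypothetical names.

```python
import sqlite3

# ":memory:" keeps the example self-contained; a file path would persist it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entity (label INTEGER PRIMARY KEY, type TEXT)")
conn.execute("CREATE INDEX idx_entity_type ON entity (type)")

# 10,000 fake entities, of which every hundredth is a wall.
rows = [
    (i, "IFCWALL" if i % 100 == 0 else "IFCPROPERTYSINGLEVALUE")
    for i in range(1, 10001)
]
with conn:  # one transaction: either all rows are stored or none are
    conn.executemany("INSERT INTO entity VALUES (?, ?)", rows)

# Brute force: every entity is held in memory and inspected once per query.
in_memory_hits = [label for label, type_name in rows if type_name == "IFCWALL"]

# Indexed: the database seeks straight to the matching index entries.
QUERY = "SELECT label FROM entity WHERE type = ? ORDER BY label"
indexed_hits = [row[0] for row in conn.execute(QUERY, ("IFCWALL",))]

# Ask SQLite how it would run the query; the last column is the description.
plan = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("IFCWALL",)).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)
```

Both approaches return the same labels; `plan_text` should mention `idx_entity_type`, showing that the query is answered from the index rather than by scanning every row.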

So fundamentally the Esent store is about scalability, long term durable persistence and performance - what you might need when your service has many large models in use on a platform.

I think we should do a bit of analysis on how we make IfcStore more 'pluggable', and maybe look at alternate persistent storage solutions. The provider approach sounds right for this, but I don't think the Cache contract is.

andyward avatar Nov 19 '18 14:11 andyward

Dear! Maybe worth a look at LiteDB? http://www.litedb.org/

kozintsev avatar Feb 23 '19 16:02 kozintsev

Hi @kozintsev. LiteDB looks very interesting. Feel free to contribute an IModel implementation backed by LiteDB. It would certainly be appreciated by many members of the xBIM community.

martin1cerny avatar Feb 25 '19 08:02 martin1cerny

Interested in what the status of this is, more than three years after the last message in the thread.

Currently experiencing some out-of-memory problems with IFC files that have a huge number of PSets (and ESENT is not an option as we're running this on Linux).

I could contribute part of the effort needed to not depend on MemoryModel (i.e. available physical RAM) on Linux, just for parsing PSets.

Also, saw that this persistent KV storage for C# looks really promising...

  • https://microsoft.github.io/FASTER/

... but I don't know if it fits as an ESENT replacement in Xbim 🙂

If you @martin1cerny can provide some additional feedback about the state of this issue-thread, and if you would be available to guide some collaboration here, I would remain available for that 🤗

tmarti avatar Apr 06 '22 10:04 tmarti

I suspect this project is abandoned now. At least it is not actively maintained.

And this Esent thing has finally driven me away from this project, because I need cross-platform support.

steve3d avatar Apr 07 '22 16:04 steve3d

Oh but please @steve3d, let's first ask and then draw conclusions 🤗

@CBenghi @martin1cerny @SteveLockley do you think it's realistic to expect the possibility (if external contributions are also an option here) of landing a non-ESENT and non-Memory based IModel implementation in this project?

tmarti avatar Apr 08 '22 12:04 tmarti

I'd better start by saying this project is definitely not abandoned! Lots of organisations have built their services on it, including ourselves (where three of the main project contributors work).

Currently the implementation is fairly stable (at least until IFC4.3 drops) so you won't see much activity aside from fixes. The last year or so we've been exceptionally busy on commercial projects, but we are planning to spend some more time on the open source toolkit, on both major projects like IFC4.3 and engineering work like full cross-platform support - but also bug fixes.

In terms of external contributions, we'd welcome PRs from contributors, both for features and bug fixes. A key reason we made xbim open source in the first place was to try not to be dependent on just the goodwill of a few developers.

On this specific Esent replacement issue, at xbim Ltd we do have a couple of unpublished libraries that might help here. We have two private IModel implementations that support netstandard/netcore and work cross-platform.

  1. We have an implementation backed by SQLite. This employs a graph data model to fit Express entities into a relational schema. This obviously makes use of indexing etc. This is quite a new library so I'm not sure if it's quite ready for showtime.

  2. We have an IModel implementation we call FileModel which after generating an index file, essentially provides an efficient means of random access to contents of a STEP file with minimal memory overhead. This employs secondary indexing for efficient querying. This is a pretty mature library we've 'battle-hardened'.
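The FileModel library described in item 2 is unpublished, so the following is only a guess at the general technique, sketched in Python: scan the STEP file once to record each entity's byte offset (the primary index) and its type (a secondary index), then seek directly to a record on demand. All names here are hypothetical, and the sketch assumes one entity per line, which real STEP files do not guarantee.

```python
import os
import re
import tempfile

# Matches the start of a STEP record such as: #12=IFCWALL(...);
ENTITY_RE = re.compile(rb"#(\d+)\s*=\s*([A-Z0-9_]+)")


def build_index(path):
    """Scan the file once, recording where each entity record lives."""
    offsets = {}  # entity label -> (byte offset, byte length)
    by_type = {}  # secondary index: type name -> list of labels
    with open(path, "rb") as f:
        pos = 0
        for line in f:
            match = ENTITY_RE.match(line)
            if match:
                label = int(match.group(1))
                offsets[label] = (pos, len(line))
                type_name = match.group(2).decode("ascii")
                by_type.setdefault(type_name, []).append(label)
            pos += len(line)
    return offsets, by_type


def read_entity(path, offsets, label):
    """Random access: seek to one record without loading the whole file."""
    offset, length = offsets[label]
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length).decode("ascii").rstrip()


sample = (
    b"ISO-10303-21;\nDATA;\n"
    b"#1=IFCWALL('w1');\n#2=IFCDOOR('d1');\n#3=IFCWALL('w2');\n"
    b"ENDSEC;\n"
)
with tempfile.NamedTemporaryFile(delete=False, suffix=".ifc") as tmp:
    tmp.write(sample)
    sample_path = tmp.name

offsets, by_type = build_index(sample_path)
door_record = read_entity(sample_path, offsets, 2)
wall_records = [read_entity(sample_path, offsets, lbl) for lbl in by_type["IFCWALL"]]
os.remove(sample_path)
```

Only the two index dictionaries stay in memory; the entity data itself remains on disk. A real implementation would presumably persist the index to a file of its own, as the comment above suggests, rather than rebuilding it on every run.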

We're going to review whether it's appropriate to make either of these libraries available via the open source licence as an alternative to EsentModel and whether we can collaborate with other organisations.

We're also working on a replacement geometry engine that is cross platform - again this is currently private, but compatible with the public interfaces.

andyward avatar Apr 08 '22 16:04 andyward

That's great news @andyward! So many thanks!

I'm keen to try your two still-unpublished SQLiteModel and FileModel pieces!

If publishing them depends on you needing more hands, please tell! 😀

If it instead depends on internal dynamics at Xbim Ltd, just know that there is quite a bunch of people already eating popcorn until it happens! 🍿

tmarti avatar Apr 09 '22 14:04 tmarti

Hi @tmarti - cheers, we're discussing internally and will be in touch to see if we can isolate the pieces and/or make them available via other means. Will be in touch.

andyward avatar Apr 13 '22 09:04 andyward

Looking forward to it @andyward, just mention me again in this thread if that's the case 🤗

tmarti avatar Apr 19 '22 10:04 tmarti

Hello @andyward 😃, did you talk internally about the collaboration possibility?

Would love to help here as it's the last mile preventing us from processing huge models (>2GB IFC files) on Linux with Xbim 🤗

tmarti avatar Jun 29 '22 18:06 tmarti