                        genjidb vs sqlite?
The storage engine evolved as follows:
Use a KV store that supports ZSTD to achieve maximum compression → use genjidb for easier access over that KV engine.
However, since we are no longer using ZSTD for block-level compression but instead compress each profile individually, genjidb + badger is no longer the only choice. For example, SQLite, the most widely deployed database engine, may be a better choice.
I ran a simple test comparing genjidb and SQLite; here are the results:
write & read performance
genjidb total_write_size: 1301.977 MB, cost: 2.286s
genjidb total_read_size: 1301.977 MB, cost: 474.471211ms
sqlite total_write_size: 1301.977 MB, cost: 7.697s
sqlite total_read_size: 1301.977 MB, cost: 955.562619ms
# sqlite with batch commit when writing data
sqlite total_write_size: 1301.977 MB, cost: 3.824s
sqlite total_read_size: 1301.977 MB, cost: 1.115740125s
In this test genjidb is faster than sqlite: roughly 3x on writes (about 1.7x against the batched-commit variant) and about 2x on reads.
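For context, here is a minimal sketch of what the batched-commit SQLite write path in such a benchmark might look like, using database/sql with the mattn/go-sqlite3 driver. The schema, row count, and profile size are invented for illustration; the actual benchmark code is not shown above.

```go
package main

import (
	"database/sql"
	"log"
	"time"

	_ "github.com/mattn/go-sqlite3" // SQLite driver, registered via blank import
)

// writeBatched inserts all profiles inside a single transaction, so SQLite
// commits (and fsyncs) once per batch instead of once per row.
func writeBatched(db *sql.DB, profiles [][]byte) error {
	tx, err := db.Begin()
	if err != nil {
		return err
	}
	stmt, err := tx.Prepare("INSERT INTO profiles (ts, data) VALUES (?, ?)")
	if err != nil {
		tx.Rollback()
		return err
	}
	defer stmt.Close()
	for i, p := range profiles {
		if _, err := stmt.Exec(i, p); err != nil {
			tx.Rollback()
			return err
		}
	}
	return tx.Commit() // single commit for the whole batch
}

func main() {
	db, err := sql.Open("sqlite3", "foo.db")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	if _, err := db.Exec("CREATE TABLE IF NOT EXISTS profiles (ts INTEGER, data BLOB)"); err != nil {
		log.Fatal(err)
	}

	profiles := make([][]byte, 1000)
	for i := range profiles {
		profiles[i] = make([]byte, 1<<20) // 1 MiB dummy profile
	}

	start := time.Now()
	if err := writeBatched(db, profiles); err != nil {
		log.Fatal(err)
	}
	log.Printf("write cost: %s", time.Since(start))
}
```

Batching matters because SQLite commits once per transaction rather than once per statement, which is consistent with the batched run above being roughly twice as fast as the unbatched one.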
data compression
▶ du -h profile-data-file    -- original profile data directory
1.3G    profile-data-file
▶ du -h /tmp/badger          -- genjidb (badger) data directory
1.3G    /tmp/badger
▶ du -h foo.db               -- sqlite data file
1.3G    foo.db
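The near-identical sizes are expected: each profile is compressed individually before it reaches the storage engine, so every engine ends up storing essentially incompressible blobs. Below is a minimal sketch of that per-profile step, assuming the klauspost/compress/zstd package (whether ng-monitoring uses this exact package is an assumption).

```go
package main

import (
	"fmt"
	"log"

	"github.com/klauspost/compress/zstd"
)

// compressProfile compresses a single profile before it is handed to the
// storage engine, so the on-disk size is roughly the same no matter which
// engine stores the resulting blob.
func compressProfile(enc *zstd.Encoder, raw []byte) []byte {
	return enc.EncodeAll(raw, make([]byte, 0, len(raw)))
}

func main() {
	enc, err := zstd.NewWriter(nil)
	if err != nil {
		log.Fatal(err)
	}
	defer enc.Close()

	profile := []byte("...pprof-encoded profile bytes...")
	blob := compressProfile(enc, profile)
	fmt.Printf("raw=%d bytes, compressed=%d bytes\n", len(profile), len(blob))
	// blob is what gets written to genjidb/badger or sqlite; the engine's
	// own block compression then sees already-compressed data.
}
```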
@crazycs520 Doing a benchmark is a good attempt; however, it does not help answer the problems I raised:
- The workload you are testing with is not a real-world workload. Conprof never writes bulk profiling data continuously; for example, there is no difference between a write taking 2s and 20s, because Conprof only writes at a 1-minute interval.
- Even with a real-world workload, the duration metric is trivial when other important aspects are not considered. For example, I could implement a simple db that beats both genjidb and sqlite with a 0.0001s write latency simply by performing the write to a memory buffer. The following questions should at least be checked:
  - What is the crash assurance for genjidb? What happens when a write completes and then the power is lost? Will there be data loss or corruption? (See the durability sketch after this list.)
  - How is the memory consumption?
  - How do they perform when the process runs for a long time?
  - How is the stability?
- There are also other aspects that need to be evaluated when comparing different solutions. To name a few:
  - Code quality
  - Feature sets
  - Behavior under our possible future workloads
  - etc.
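To make the durability question concrete: a write-latency number only means something once the durability level is pinned down, since an engine that skips fsync is effectively the memory-buffer strawman above. Here is a minimal sketch of the kind of control that matters on the SQLite side (genjidb/badger would need equivalent scrutiny; the pragma choices are illustrative, not a recommendation):

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/mattn/go-sqlite3" // SQLite driver, registered via blank import
)

// openDurable opens a SQLite database with explicit durability settings, so
// that a benchmark compares like with like. synchronous=FULL fsyncs on every
// commit; synchronous=OFF trades crash safety for speed.
func openDurable(path string) (*sql.DB, error) {
	db, err := sql.Open("sqlite3", path)
	if err != nil {
		return nil, err
	}
	pragmas := []string{
		"PRAGMA journal_mode = WAL", // crash-safe write-ahead log
		"PRAGMA synchronous = FULL", // fsync on every commit
	}
	for _, p := range pragmas {
		if _, err := db.Exec(p); err != nil {
			db.Close()
			return nil, err
		}
	}
	return db, nil
}

func main() {
	db, err := openDurable("foo.db")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
	log.Println("opened sqlite with WAL + synchronous=FULL")
}
```

badger exposes an analogous sync-on-write option; whatever settings were used on each side of the benchmark should be stated, otherwise the latency comparison is not apples to apples.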