rondb icon indicating copy to clipboard operation
rondb copied to clipboard

Documentation about differences between InnoDB and RonDB storage engines

Open zarere opened this issue 1 year ago • 19 comments

Is there a documentation for unsupported features between InnoDB and RonDB ? Something like a markdown table showing what is not supported in RonDB but supported in InnnoDB. That will help a lot with migrations from InnoDB to RonDB. Thank you in advance.

zarere avatar Mar 28 '24 09:03 zarere

Since RonDB is based on MySQL NDB Cluster, you can use the descriptions of changes between InnoDB and NDB Cluster. The RonDB changes should not change anything related to how one moves from InnoDB to to RonDB, one should do the same as when moving from InnoDB to NDB. Here is one page in the NDB docs: https://dev.mysql.com/doc/mysql-cluster-excerpt/8.3/en/mysql-cluster-ndb-innodb-engines.html

mronstro avatar Mar 28 '24 15:03 mronstro

Foreign Keys are supported by both InnoDB and RonDB, there are minor differences in how they are operating. There is one thing that RonDB does that InnoDB doesn't do and RonDB is a little bit different in using FOREIGN_KEY_CHECKS flag.

mronstro avatar Mar 28 '24 15:03 mronstro

MySQL Replication is supported by both, however RonDB uses epochs as documented in RonDB Documentation and InnoDB can use MySQL Group Replication.

mronstro avatar Mar 28 '24 15:03 mronstro

Another thing to consider is that VIEWs, Triggers and Stored Procedures are local to a MySQL Server, so if you have multiple MySQL Servers in a RonDB cluster, you need to define them in all MySQL Servers. For user management one can use NDB_STORED_USER which makes the users global to all MySQL Servers in the RonDB cluster.

mronstro avatar Mar 28 '24 15:03 mronstro

I was aware with all the information that you wrote above before posting this question here but some important points remained unanswered for me so I will post them here:

  1. Is there a way to connect RonDB as a replica of a source that is using InnoDB storage engine ?
  2. Since I am 99% sure that the above point can not be done because of the epochs in RonDB my next question is:
  • Is there a way to migrate to RonDB ( from InnoDB ) with minimal downtime ? I saw that the data can be exported ( from another storage engine) and then imported in RonDB but that will take enormous amount of time. Even that is accomplished a new data will still be hitting the source storage engine ( since the apps will be still running ) and the data can not be the same ( in both storage engines ) in order to point the apps to the new storage engine.

So is there an option to connect RonDB with InnoDB ( without stopping the traffic for so long ) so the data can be the same in the both storage engines ? Thanks again for your reply.

zarere avatar Mar 28 '24 18:03 zarere

This Quora issue explains how to do it, it will work for RonDB as well. So essentially the answer is yes, you can connect an InnoDB instance and replicate to RonDB. The epochs are translated to transactions, so this means that the MySQL slave can replicate both binlogs coming from RonDB and from InnoDB.

https://www.quora.com/Can-I-switch-from-normal-mysql-to-mysql-cluster-without-losing-data-or-doing-a-restart

Interestingly the documentation has no information on this special topic, neither MySQL nor RonDB. I will put it on our TODO list to develop such RonDB documentation.

mronstro avatar Mar 29 '24 13:03 mronstro

Thank you very much for the information. So the replication can be made only with old way using binlog position and not with GTIDs right or the replication can work also with GTIDs in RonDB ? In the ndb documentation I see that GTIDs are not supported.

zarere avatar Mar 29 '24 13:03 zarere

It is correct that GTIDs cannot be used with NDB Cluster, it was discussed when I was at Oracle a number of times, but never decided to be done, haven't heard anything about it since I joined Hopsworks. But using binlog positions should work fine.

mronstro avatar Mar 29 '24 15:03 mronstro

Thank you very much for the information Mikael.

zarere avatar Mar 29 '24 15:03 zarere

Good luck with the migration and let us know how it works.

mronstro avatar Mar 29 '24 16:03 mronstro

Hello @mronstro

I just found RonDB and it seems fascinating. I am trying to understand if someone can migrate from InnoDB to RonDB. It would be welcome to have something like that written in your documentation.

Also another case i try to underestand is, can i choose RonDB over Galera Cluster? Will our applications scale better with RonDB? What advantages RonDB has over Galera Cluster or InnoDB Cluster? What is the compatibility level among these two?

It would be great to have such comparisons in your docs.

MichalisDBA avatar May 23 '24 12:05 MichalisDBA

I had a chapter in my book about MySQL NDB Cluster about the differences between InnoDB Cluster and NDB Cluster. We are planning to integrate this also into the RonDB documentation.

There are a number of differences between InnoDB Cluster and RonDB. InnoDB Cluster is a cluster of single-machine DBMSs that can replicate data between themselves. There is no concept of sharding and partitioning in InnoDB Cluster. InnoDB is a disk-based storage engine with a clustered index that can be used both for primary keys and ordered indexes.

RonDB is by nature a clustered DBMS, this means that it is highly available, it can grow as online operation and the algorithms in RonDB are very much designed for high availability and no downtime for management operations and node failures. RonDB is an distributed in-memory database with a hash index for primary keys and unique keys and a T-tree for ordered indexes. Non-indexed data or the value part of a key-value store can be on disk as well. Data in RonDB is recoverable and new nodes can join the cluster as online operations. It is possible to dynamically grow and shrink the number of replicas in the RonDB cluster.

Galera Cluster is similar to InnoDB Cluster.

One difference is that InnoDB Cluster replicates transactions after completion in the primary replica. This limits the number of transactions per second to tens of thousands whereas RonDB can handle up to 20 million transactions per second since all transactions can happen in parallel.

Hope this helps while waiting for us to add the chapter on a comparison between RonDB and InnoDB Cluster.

Which cluster to use as usual depends on the use case. Feel free to ask more questions if you need more details.

mronstro avatar May 24 '24 18:05 mronstro

Hi @mronstro, i am really grateful for your reply. I didn't know you had a book written about NDB. Definitely i am going to purchase it to deep dive on NDB's internals to learn more. I really want to deploy RonDB to production and migrate our databases there if it meets our expectations. The hardest part is when businesses i work for, push the business logic to the database instead of the application and makes the migration process harder. Databases are for storing, retrieving and updating data. I hate when they use triggers, foreign constrains, etc to do the application's job.

MichalisDBA avatar May 25 '24 12:05 MichalisDBA

I understand your concern about foreign keys and triggers. I participated in a project at MySQL where we wanted to implement foreign keys in the MySQL Server, it was interesting how complex it was and how different DBMSs did things differently. The report on it was more than 100 pages long, so more or less beyond human understanding :)

Triggers are very useful in implementing internal DBMS functions like keeping indexes up-to-date, scaling up with more nodes, recovery and many more parts of DBMS internals, we use it heavily inside RonDB, but using it in the application is not something we recommend.

mronstro avatar May 27 '24 17:05 mronstro

The new chapter on comparison between RonDB and InnoDB should be online now in docs.rondb.com

mronstro avatar Jun 18 '24 10:06 mronstro

Hi @mronstro

I just read the chapter for the comparison between RonDB and InnoD and i am truly amazed about the level of details you wrote. I am just starting to learn about NDB Cluster. I have used only InnoDB in my career and really want to reach a level of knowledge so i can deploy RonDB to our production systems. I have also ordered your book and can't wait to read it.

One thing it can't seem to click in my mind is, RonDB stores data in memory with non indexed columns to hard disks in data nodes. With InnoDB you can have a MySQL server, for example, with 64GB of RAM and 500GB of data on disk. If someone wanted to migrate the same data to RonDB, should he deploy data nodes with at least 500GB of RAM each? What is the analogy in this situation? How to calculate the correct specs the data nodes must have from InnoDB to RonDB?

Also in RonDB documentation i can't find how to setup RonDB as a service. Do i have to write a systemd file? Is there an example template to use? Usually when you install mysql server dnf install mysql-community-server there is a systemd file created that then you can use systemctl start mysqld and systemctl enable mysqld.

Is there something similar in RonDB installation?

Thank you.

MichalisDBA avatar Jun 18 '24 13:06 MichalisDBA

I spent 17 years working in MySQL and some of those years I worked both in MySQL Server and even did a number of patches for InnoDB and I worked with many developers with inside knowledge about how InnoDB works. One of our RonDB developers in Hopsworks also have several years of experience with InnoDB, so there is a lot to write about :)

How much memory one needs when replacing InnoDB with RonDB has no simple formula unfortunately. All indexes in RonDB are stored in memory (around 15-16 bytes per row for hash indexes and about 10 bytes per row for ordered indexes). If the application has information that isn't required to index, that is very suitable to place in the disk columns. For example in the Hopsworks Feature store we can usually store all the features in disk columns. At the same time many applications use RonDB to achieve low latency in which case the in-memory columns are preferrable. So it depends a bit on your application how much memory one would require. But it is not uncommon that one can store 10x more data in disk columns than in memory.

mronstro avatar Jun 18 '24 14:06 mronstro

Regarding how to run RonDB as a service there are many solutions. Hopsworks have a cloud service for our customer that runs in AWS, Azure and Google cloud, we can install also RonDB and Hopsworks in on-prem situations. Very soon there will also be a Kubernetes solution available to manage RonDB and Hopsworks installations.

Oracle also have MySQL Cluster Manager for managing NDB, and I think that they also have a Kubernetes operator that I am a bit uncertain of how stable it is.

Finally there is also a set of open source scripts in github.com/logicalclocks/ndb-chef. In that tree you will find the systemd scripts we use to start and stop various RonDB services.

I also developed a framework for running RonDB benchmarks that can be found in: github.com/logicalclocks/dbt2-0.37.50 and github.com/logicalclocks/sysbench-0.4.12

mronstro avatar Jun 18 '24 14:06 mronstro

For trying out RonDB on a local machine one can also use github.com/logicalclocks/rondb-docker

In the RonDB tree there is also a test facility called MTR, it is usually used to run tests, but it can also be used to quickly spawn up a RonDB cluster.

cd RONDB_PATH/build/mysql-test ./mtr --suite=ndb --start-and-exit A cluster comes up within about 30 seconds killall -9 mysqld ndbmtd ndb_mgmd to shut it down again. Edit RONDB_PATH/mysql-test/include/default_ndbd.cnf to change the configuration of the cluster.

Alternatively use: ./mtr --suite=ndbcrunch --defaults-extra-file=suite/ndbcrunch/cpubind.cnf --start-and-exit and edit RONDB_PATH/suite/ndbcrunch/my.cnf and cpubind.cnf

mronstro avatar Jun 18 '24 15:06 mronstro