horaedb
Release v0.3
Description
We plan to release v0.3 at the end of August. Here is the feature list:
- Release multi-language clients, including Java, Rust and Python.
- Support static cluster mode, and keep pushing toward a full-featured dynamic distributed version (related project: Distributed CeresDB).
- Extend the supported SQL (tag: A-SQL).
- Implement the hybrid storage format, and support reading from both formats.
Feel free to suggest or discuss other features you would like to add :heart:
Will CeresDB support multiple data sources? For example, reading records from MySQL's redo log and restructuring them into CeresDB's own storage format.
> Will ceresdb support multiple data sources?

This sounds like data ingest. Do you mean bulk load?
> > Will ceresdb support multiple data sources?
>
> This sounds like data ingest. Do you mean bulk load?

Yes, meaning that CeresDB could import data from the files of other existing commercial databases. I don't know much about this, so I'm not sure about the terminology.
I think bulk ingest is an important feature for easy adoption. Prometheus and InfluxDB both support it, and so will we.
This might cover three scenarios. Let's narrow our discussion:
- For offline data migration, our persistent format is relatively straightforward: only a small amount of metadata plus data in the Parquet format, all stored in OSS. We can achieve this in a few ways. For common formats like CSV or standard Parquet generated by other systems, we can also support them directly.
- Online data ingestion, on the other hand, would be a little more complicated. We may need to add support for consuming data from streaming systems like Kafka, Flink, Pulsar or others. They have splendid ecosystems, and by supporting them we can easily be integrated into various systems as a downstream warehouse.
- The last one is querying from other databases. This may be a little off-topic but let me mention it as well. CeresDB is only a query frontend in this situation. In some cases I can imagine there are other projects that can do this. So I'll assign a low priority to this.
Offline migration implementations differ case by case, so we can support the needed upstreams on demand. Online ingestion also has a few candidate upstreams, but I believe there is a common pattern among them. We can choose one to support first if we decide to work on this. It can take a lot of effort, and we need to discuss it further.
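To make the offline CSV case a bit more concrete, here is a rough sketch of the conversion step such a tool might start with. It is only an illustration: the column layout (`ts`, `host`, `cpu`, `mem`), the point dictionary shape, and the helper names `read_points` and `batches` are all assumptions made up for this example, not CeresDB's actual format or client API.

```python
import csv
import io
from typing import Dict, Iterable, Iterator, List


def read_points(csv_text: str, timestamp_col: str, tag_cols: List[str]) -> Iterator[Dict]:
    """Turn CSV rows into point dicts with a timestamp, tags, and numeric fields.

    Hypothetical layout: every column that is neither the timestamp nor a tag
    is treated as a float-valued field.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        yield {
            "timestamp": int(row[timestamp_col]),
            "tags": {c: row[c] for c in tag_cols},
            "fields": {
                k: float(v)
                for k, v in row.items()
                if k != timestamp_col and k not in tag_cols
            },
        }


def batches(points: Iterable[Dict], size: int) -> Iterator[List[Dict]]:
    """Group points into fixed-size batches; the last batch may be smaller."""
    batch: List[Dict] = []
    for point in points:
        batch.append(point)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


# Made-up sample data, with timestamps in milliseconds.
SAMPLE = """ts,host,cpu,mem
1660000000000,web-1,0.5,1024
1660000001000,web-1,0.7,1030
1660000002000,web-2,0.2,512
"""

for batch in batches(read_points(SAMPLE, "ts", ["host"]), size=2):
    # A real tool would hand each batch to a database client here.
    print(len(batch), batch[0]["tags"])
```

Batching keeps memory bounded for large files and would map naturally onto a batched write call, whichever client ends up doing the writing.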
Thanks for the summary @waynexia. I will give some additional comments on these scenarios.
- For data migration or data initialization from an external data source, there could be some tools. But as far as I know, demand for this scenario is not that frequent. This feature can be implemented as an independent binary, like the tools in the MySQL ecosystem. We can discuss this feature later.
- Online data ingestion is a much more complex topic. If we start working on this, we should consider latency, consistency, transformation and other aspects of real-time computing. These requirements are commonly implemented using a stream-computing framework like Apache Flink. So, in my opinion, the CeresDB project will focus more on the core features of a time-series database itself.
- For the scenario of querying from other databases, there is a better choice: Presto. So we will not work in this direction.
Released https://github.com/CeresDB/ceresdb/releases/tag/v0.3.0