ErXi issues

Results: 11 issues of ErXi

[Hadoop_Notes] What do you need to master about HDFS?

This issue is for recording HDFS-related content.

hadoop

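[FLINK_CDC] Investigate FLINK_CDC

The current project has many data sources. At present, all metrics are stored in MySQL and may later be synced to Hive, Hudi, ClickHouse, and other databases. Full synchronization of MySQL data to Hive currently uses DataX, but because DataX supports relatively few data sources, a new data integration and synchronization framework needs to be investigated. After an initial comparison of flink_cdc and seatunnel, and considering the barrier to entry, flink_cdc will be investigated first.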

flink_cdc

**FINISH:**
- [Spark-Core](https://github.com/QuakeWang/BigData-Notes/tree/main/code/SparkTutorial/spark-core): write a blog post on using RDDs to compute popular products
- [SparkSQL](https://github.com/QuakeWang/BigData-Notes/tree/main/code/SparkTutorial/spark-sql): improve the coverage of DataSet usage with reference to *Spark: The Definitive Guide*
- [SparkStreaming](https://github.com/QuakeWang/BigData-Notes/tree/main/code/SparkTutorial/spark-streaming)

spark

Update Jupyter notebook快速上手.md

Fix the command for launching Jupyter Notebook, changing `jupter notebook` to `jupyter notebook`.

feat(catalog): add catalog API

fix: #52

ut: add serialize/deserialize tests for spec

part of #26

Implement the Catalog API

When #27 is ready, we can add the [Catalog](https://github.com/apache/paimon/blob/release-0.8.2/paimon-core/src/main/java/org/apache/paimon/catalog/Catalog.java) API. We could add the struct definitions first; the functions and their detailed behavior can be implemented in later PRs.

ErXi

[Hadoop_Notes] What do you need to master about HDFS?

[FLINK_CDC] Investigate FLINK_CDC

[Spark_Notes] Spark Study Guide

Update Jupyter notebook快速上手.md

feat(catalog): add catalog API

ut: add serialize/deserialize tests for spec

Implement the Catalog API

spec: Implement Changelog File

[Doris] Doris Getting Started Guide

chore: For better reading experience