Release 2.4.0

Open foreverneverer opened this issue 3 years ago • 2 comments

New Module

Starting with this version, more project modules will join the Apache Pegasus project. This version includes the following projects:

New architecture

In this version, we removed the shared log to improve Pegasus performance. Related pull requests:

  • https://github.com/XiaoMi/rdsn/pull/993
  • https://github.com/XiaoMi/rdsn/pull/994
  • https://github.com/XiaoMi/rdsn/pull/999
  • https://github.com/XiaoMi/rdsn/pull/1019
  • https://github.com/XiaoMi/rdsn/pull/1022
  • https://github.com/XiaoMi/rdsn/pull/1048
  • https://github.com/XiaoMi/rdsn/pull/1028
  • https://github.com/apache/incubator-pegasus/pull/890

New Feature

Replica-factor update

This version supports a flexible replica count. Previously, the replica factor was immutable once a table was created. Now, users can dynamically adjust the replica factor of a specified table. Related pull requests:

  • https://github.com/XiaoMi/rdsn/pull/1061
  • https://github.com/XiaoMi/rdsn/pull/1072
  • https://github.com/XiaoMi/rdsn/pull/1077
  • https://github.com/XiaoMi/rdsn/pull/1087
  • https://github.com/XiaoMi/rdsn/pull/1109
  • https://github.com/XiaoMi/rdsn/pull/1110
  • https://github.com/XiaoMi/rdsn/pull/1115
  • https://github.com/apache/incubator-pegasus/pull/914
  • https://github.com/apache/incubator-pegasus/pull/999
  • https://github.com/apache/incubator-pegasus/pull/1035
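
As a minimal sketch of what adjusting the replica factor within bounds might look like, the helper below checks a requested replica count against a lower and upper limit. The function name `validate_replica_count` is hypothetical; the default bounds mirror the `min_allowed_replica_count = 1` and `max_allowed_replica_count = 5` entries listed under Config-Update, and any further checks the server may perform (for example against the number of alive nodes) are not modelled here.

```python
def validate_replica_count(requested, min_allowed=1, max_allowed=5):
    """Return `requested` if it lies within [min_allowed, max_allowed].

    Hypothetical helper for illustration only; it is not part of Pegasus.
    """
    if not isinstance(requested, int) or isinstance(requested, bool):
        raise TypeError("replica count must be an integer")
    if not (min_allowed <= requested <= max_allowed):
        raise ValueError(
            f"replica count {requested} is outside "
            f"[{min_allowed}, {max_allowed}]"
        )
    return requested


if __name__ == "__main__":
    print(validate_replica_count(3))
    try:
        validate_replica_count(6)
    except ValueError as err:
        print("rejected:", err)
```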

Read Request Limiter

Previously, only write requests could be limited. This version adds a limiter for read requests:

  • https://github.com/XiaoMi/rdsn/pull/941
  • https://github.com/XiaoMi/rdsn/pull/939
  • https://github.com/apache/incubator-pegasus/pull/829
  • https://github.com/XiaoMi/rdsn/pull/948
  • https://github.com/XiaoMi/rdsn/pull/947
  • https://github.com/XiaoMi/rdsn/pull/946

Jemalloc Support

  • https://github.com/XiaoMi/rdsn/pull/910
  • https://github.com/apache/incubator-pegasus/pull/1050

Build Feature

We have placed some restrictions on the compilation environment and added support for macOS and aarch64:

  • https://github.com/XiaoMi/rdsn/pull/1041
  • https://github.com/XiaoMi/rdsn/pull/1034
  • https://github.com/XiaoMi/rdsn/pull/1097
  • https://github.com/apache/incubator-pegasus/pull/1049

New BatchGetAPI

Previously, batchGet was implemented on top of singleGet. The latest version aggregates the individual requests before sending them, which improves performance:

  • https://github.com/apache/incubator-pegasus/pull/897
  • https://github.com/XiaoMi/pegasus-java-client/pull/175
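
The aggregation idea can be illustrated roughly as follows: keys are grouped by their target partition so that each partition receives one batched request instead of one request per key. This is only a sketch under stated assumptions. The routing function (`zlib.crc32` modulo the partition count), the partition count of 8, and the helper names are made up for illustration and are not the Pegasus client API or its real routing logic.

```python
import zlib
from collections import defaultdict


def partition_of(hash_key: bytes, partition_count: int) -> int:
    # Hypothetical routing function; NOT the hash Pegasus actually uses.
    return zlib.crc32(hash_key) % partition_count


def group_by_partition(keys, partition_count):
    """Group (hash_key, sort_key) pairs so each partition gets one batch."""
    batches = defaultdict(list)
    for hash_key, sort_key in keys:
        pidx = partition_of(hash_key, partition_count)
        batches[pidx].append((hash_key, sort_key))
    return dict(batches)


keys = [(b"user1", b"a"), (b"user2", b"b"), (b"user1", b"c")]
batches = group_by_partition(keys, partition_count=8)
for pidx, batch in sorted(batches.items()):
    # One request per partition would be sent here instead of one per key.
    print(pidx, batch)
```

Keys that share a hash key always land in the same batch, so the number of requests is bounded by the number of partitions touched rather than by the number of keys.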

Task Queue limiter

  • https://github.com/XiaoMi/rdsn/pull/902
  • https://github.com/apache/incubator-pegasus/pull/831

Feature enhancement

Bulkload

We improved the bulkload feature to reduce the I/O load of downloading and ingesting. We also offer better interfaces and failure-handling logic. Related pull requests:

  • https://github.com/XiaoMi/rdsn/pull/952
  • https://github.com/XiaoMi/rdsn/pull/958
  • https://github.com/XiaoMi/rdsn/pull/959
  • https://github.com/XiaoMi/rdsn/pull/960
  • https://github.com/XiaoMi/rdsn/pull/962
  • https://github.com/XiaoMi/rdsn/pull/964
  • https://github.com/XiaoMi/rdsn/pull/967
  • https://github.com/XiaoMi/rdsn/pull/1004
  • https://github.com/XiaoMi/rdsn/pull/1011
  • https://github.com/XiaoMi/rdsn/pull/1018
  • https://github.com/XiaoMi/rdsn/pull/1027
  • https://github.com/XiaoMi/rdsn/pull/1031
  • https://github.com/XiaoMi/rdsn/pull/1035
  • https://github.com/XiaoMi/rdsn/pull/1039
  • https://github.com/XiaoMi/rdsn/pull/1102
  • https://github.com/XiaoMi/rdsn/pull/1103
  • https://github.com/XiaoMi/rdsn/pull/1104
  • https://github.com/XiaoMi/rdsn/pull/1105
  • https://github.com/XiaoMi/rdsn/pull/1069
  • https://github.com/XiaoMi/rdsn/pull/1074
  • https://github.com/XiaoMi/rdsn/pull/1009
  • https://github.com/apache/incubator-pegasus/pull/881
  • https://github.com/apache/incubator-pegasus/pull/888
  • https://github.com/apache/incubator-pegasus/pull/968
  • https://github.com/apache/incubator-pegasus/pull/975
  • https://github.com/XiaoMi/rdsn/pull/1002
  • https://github.com/apache/incubator-pegasus/pull/868

Duplication

Previously, duplication had two shortcomings: it depended on a remote filesystem to sync the checkpoint, and the synchronization of plog data sent only a single mutation per RPC. This version addresses both problems (for the detailed design, see https://github.com/apache/incubator-pegasus/issues/892). Related pull requests:

  • https://github.com/XiaoMi/rdsn/pull/1038
  • https://github.com/XiaoMi/rdsn/pull/1040
  • https://github.com/XiaoMi/rdsn/pull/1045
  • https://github.com/XiaoMi/rdsn/pull/1046
  • https://github.com/XiaoMi/rdsn/pull/1049
  • https://github.com/XiaoMi/rdsn/pull/1051
  • https://github.com/XiaoMi/rdsn/pull/1053
  • https://github.com/XiaoMi/rdsn/pull/1055
  • https://github.com/XiaoMi/rdsn/pull/1056
  • https://github.com/XiaoMi/rdsn/pull/1059
  • https://github.com/XiaoMi/rdsn/pull/1060
  • https://github.com/XiaoMi/rdsn/pull/1063
  • https://github.com/XiaoMi/rdsn/pull/1064
  • https://github.com/XiaoMi/rdsn/pull/1065
  • https://github.com/apache/incubator-pegasus/pull/917
  • https://github.com/XiaoMi/rdsn/pull/1066
  • https://github.com/XiaoMi/rdsn/pull/1067
  • https://github.com/apache/incubator-pegasus/pull/919
  • https://github.com/XiaoMi/rdsn/pull/1071
  • https://github.com/XiaoMi/rdsn/pull/1076
  • https://github.com/XiaoMi/rdsn/pull/1080
  • https://github.com/apache/incubator-pegasus/pull/930
  • https://github.com/XiaoMi/rdsn/pull/1084
  • https://github.com/apache/incubator-pegasus/pull/935
  • https://github.com/apache/incubator-pegasus/pull/936
  • https://github.com/XiaoMi/rdsn/pull/1085
  • https://github.com/apache/incubator-pegasus/pull/940
  • https://github.com/XiaoMi/rdsn/pull/1121
  • https://github.com/apache/incubator-pegasus/pull/1007
  • https://github.com/apache/incubator-pegasus/pull/1008
  • https://github.com/XiaoMi/rdsn/pull/976
  • https://github.com/apache/incubator-pegasus/pull/1065
  • https://github.com/apache/incubator-pegasus/pull/1078
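
To make the difference between sending one mutation per RPC and sending batches concrete, here is a small sketch that groups mutations by accumulated size. The threshold parameter is modelled on the `duplicate_log_batch_bytes` option from the Config-Update section, where 0 means no batching before sending. The flush policy (send as soon as the accumulated size reaches the threshold) and the function name `batch_mutations` are assumptions for illustration, not the actual implementation.

```python
def batch_mutations(mutations, batch_bytes):
    """Yield lists of mutations; each list stands for one RPC payload."""
    if batch_bytes <= 0:
        # 0 means no batching: every mutation is sent on its own.
        for m in mutations:
            yield [m]
        return
    batch, size = [], 0
    for m in mutations:
        batch.append(m)
        size += len(m)
        if size >= batch_bytes:
            # Assumed policy: flush once the threshold is reached.
            yield batch
            batch, size = [], 0
    if batch:
        # Flush whatever is left at the end of the log.
        yield batch


mutations = [b"x" * 40, b"y" * 40, b"z" * 40, b"w" * 10]
print([len(b) for b in batch_mutations(mutations, 0)])
print([len(b) for b in batch_mutations(mutations, 64)])
```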

PerfCounter

In this version, we support a new metric implementation to optimize performance:

  • https://github.com/XiaoMi/rdsn/pull/1033
  • https://github.com/XiaoMi/rdsn/pull/1070
  • https://github.com/XiaoMi/rdsn/pull/1073
  • https://github.com/XiaoMi/rdsn/pull/1075
  • https://github.com/XiaoMi/rdsn/pull/1081
  • https://github.com/apache/incubator-pegasus/pull/1074

Manual Compaction

  • https://github.com/XiaoMi/rdsn/pull/989
  • https://github.com/XiaoMi/rdsn/pull/987
  • https://github.com/XiaoMi/rdsn/pull/983
  • https://github.com/apache/incubator-pegasus/pull/854
  • https://github.com/XiaoMi/rdsn/pull/981

Learn with NFS

To reduce the impact of data migration on I/O load while still ensuring the migration rate, data transmission now supports disk-level speed limits:

  • https://github.com/XiaoMi/rdsn/pull/944
  • https://github.com/XiaoMi/rdsn/pull/943
  • https://github.com/XiaoMi/rdsn/pull/985
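
One common way to enforce a per-disk speed limit is a token bucket per disk, sketched below. This is an assumption about the general technique, not a description of how Pegasus implements it; the class names, the disk paths, and the rate of 100 bytes per second are all hypothetical. A fake clock is injected so the example behaves deterministically.

```python
import time


class TokenBucket:
    """Allow at most `rate` bytes per second, with a burst of one second."""

    def __init__(self, rate, now=time.monotonic):
        self.rate = rate
        self.capacity = rate
        self.tokens = float(rate)
        self.now = now
        self.last = now()

    def try_consume(self, nbytes):
        t = self.now()
        # Refill in proportion to the elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (t - self.last) * self.rate)
        self.last = t
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


class PerDiskLimiter:
    """Keep an independent bucket for every disk."""

    def __init__(self, rate, now=time.monotonic):
        self.rate = rate
        self.now = now
        self.buckets = {}

    def try_copy(self, disk, nbytes):
        if disk not in self.buckets:
            self.buckets[disk] = TokenBucket(self.rate, self.now)
        return self.buckets[disk].try_consume(nbytes)


clock = [0.0]  # fake clock for a deterministic demo
limiter = PerDiskLimiter(rate=100, now=lambda: clock[0])
first = limiter.try_copy("/data1", 80)       # fits in the initial budget
second = limiter.try_copy("/data1", 80)      # exceeds what is left on /data1
other_disk = limiter.try_copy("/data2", 80)  # /data2 has its own budget
clock[0] += 1.0
after_refill = limiter.try_copy("/data1", 80)
print(first, second, other_disk, after_refill)
```

Because each disk has its own bucket, a busy disk does not throttle transfers that touch a different disk.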

Latency Tracer

The latest latency tracer supports perf-counters and fixes some bugs:

  • https://github.com/XiaoMi/rdsn/pull/1029
  • https://github.com/XiaoMi/rdsn/pull/1023
  • https://github.com/XiaoMi/rdsn/pull/951
  • https://github.com/XiaoMi/rdsn/pull/945
  • https://github.com/XiaoMi/rdsn/pull/965

Other important

  • https://github.com/XiaoMi/rdsn/pull/1086
  • https://github.com/XiaoMi/rdsn/pull/988
  • https://github.com/XiaoMi/rdsn/pull/979
  • https://github.com/XiaoMi/rdsn/pull/978
  • https://github.com/XiaoMi/rdsn/pull/963
  • https://github.com/apache/incubator-pegasus/pull/870
  • https://github.com/apache/incubator-pegasus/pull/852
  • https://github.com/apache/incubator-pegasus/pull/907
  • https://github.com/apache/incubator-pegasus/pull/1009
  • https://github.com/apache/incubator-pegasus/pull/1085
  • https://github.com/apache/incubator-pegasus/pull/1061

Java Client

  • https://github.com/XiaoMi/pegasus-java-client/pull/184
  • https://github.com/XiaoMi/pegasus-java-client/pull/181
  • https://github.com/XiaoMi/pegasus-java-client/pull/180
  • https://github.com/XiaoMi/pegasus-java-client/pull/177
  • https://github.com/apache/incubator-pegasus/pull/1019
  • https://github.com/apache/incubator-pegasus/pull/1000
  • https://github.com/apache/incubator-pegasus/pull/973

Go Client

  • https://github.com/XiaoMi/pegasus-go-client/pull/110
  • https://github.com/XiaoMi/pegasus-go-client/pull/109
  • https://github.com/XiaoMi/pegasus-go-client/pull/107
  • https://github.com/XiaoMi/pegasus-go-client/pull/105
  • https://github.com/XiaoMi/pegasus-go-client/pull/104
  • https://github.com/XiaoMi/pegasus-go-client/pull/102
  • https://github.com/XiaoMi/pegasus-go-client/pull/101
  • https://github.com/XiaoMi/pegasus-go-client/pull/99

Python Client

  • https://github.com/apache/incubator-pegasus/pull/977

Admin Cli

  • support nodes migrator
    • https://github.com/pegasus-kv/admin-cli/pull/62
    • https://github.com/pegasus-kv/admin-cli/pull/60
    • https://github.com/pegasus-kv/admin-cli/pull/59
    • https://github.com/pegasus-kv/admin-cli/pull/54
    • https://github.com/pegasus-kv/admin-cli/pull/53
    • https://github.com/pegasus-kv/admin-cli/pull/52
    • https://github.com/pegasus-kv/admin-cli/pull/51
    • https://github.com/pegasus-kv/admin-cli/pull/50
  • https://github.com/pegasus-kv/admin-cli/pull/58
  • https://github.com/pegasus-kv/admin-cli/pull/57
  • https://github.com/pegasus-kv/admin-cli/pull/56
  • https://github.com/pegasus-kv/admin-cli/pull/55
  • https://github.com/apache/incubator-pegasus/pull/987
  • https://github.com/apache/incubator-pegasus/pull/969
  • https://github.com/apache/incubator-pegasus/pull/958
  • https://github.com/apache/incubator-pegasus/pull/1006
  • https://github.com/apache/incubator-pegasus/pull/976

Pegasus Docker

  • https://github.com/apache/incubator-pegasus/pull/1011

Code Refactor

  • https://github.com/XiaoMi/rdsn/pull/1068
  • https://github.com/XiaoMi/rdsn/pull/1062
  • https://github.com/XiaoMi/rdsn/pull/1050
  • https://github.com/XiaoMi/rdsn/pull/1032
  • https://github.com/XiaoMi/rdsn/pull/1030
  • https://github.com/XiaoMi/rdsn/pull/1015
  • https://github.com/XiaoMi/rdsn/pull/1012
  • https://github.com/XiaoMi/rdsn/pull/1010
  • https://github.com/XiaoMi/rdsn/pull/1003
  • https://github.com/XiaoMi/rdsn/pull/1000
  • https://github.com/XiaoMi/rdsn/pull/998
  • https://github.com/XiaoMi/rdsn/pull/997
  • https://github.com/XiaoMi/rdsn/pull/996
  • https://github.com/XiaoMi/rdsn/pull/995
  • https://github.com/XiaoMi/rdsn/pull/992
  • https://github.com/XiaoMi/rdsn/pull/974
  • https://github.com/XiaoMi/rdsn/pull/968
  • https://github.com/XiaoMi/rdsn/pull/950
  • https://github.com/XiaoMi/rdsn/pull/942
  • https://github.com/apache/incubator-pegasus/pull/1021
  • https://github.com/apache/incubator-pegasus/pull/1005
  • https://github.com/apache/incubator-pegasus/pull/921
  • https://github.com/apache/incubator-pegasus/pull/916

Bug Fix

Core

  • https://github.com/XiaoMi/rdsn/pull/1016
  • https://github.com/XiaoMi/rdsn/pull/1008
  • https://github.com/XiaoMi/rdsn/pull/1099
  • https://github.com/XiaoMi/rdsn/pull/1017
  • https://github.com/XiaoMi/rdsn/pull/1052

Common

  • https://github.com/XiaoMi/rdsn/pull/1047
  • https://github.com/XiaoMi/rdsn/pull/1088
  • https://github.com/XiaoMi/rdsn/pull/1044
  • https://github.com/XiaoMi/rdsn/pull/1037
  • https://github.com/XiaoMi/rdsn/pull/1036
  • https://github.com/XiaoMi/rdsn/pull/1001
  • https://github.com/XiaoMi/rdsn/pull/990
  • https://github.com/XiaoMi/rdsn/pull/984
  • https://github.com/XiaoMi/rdsn/pull/982
  • https://github.com/XiaoMi/rdsn/pull/980
  • https://github.com/apache/incubator-pegasus/pull/828
  • https://github.com/apache/incubator-pegasus/pull/984
  • https://github.com/apache/incubator-pegasus/pull/909
  • https://github.com/apache/incubator-pegasus/pull/911
  • https://github.com/apache/incubator-pegasus/pull/833

Performance

For this benchmark we used new machines. To make the comparison more reasonable, we also re-ran the benchmark on Pegasus Server 2.3:

  • Machine parameters: DDR4 16G * 8 | Intel Silver4210*2 2.20GHz/3.20GHz | SSD 480G * 8 SATA
  • Cluster Server: 3 * MetaServerNode 5 * ReplicaServerNode
  • YCSB Client: 3 * ClientNode
  • Request Length: 1KB(set/get)
  • CentOS7 5.4.54-2.0.4.std7c.el7.x86_64

Pegasus Server 2.3

| Case | client and thread | R:W | R-QPS | R-Avg | R-P99 | W-QPS | W-Avg | W-P99 |
|------|-------------------|-----|-------|-------|-------|-------|-------|-------|
| Write Only | 3 clients * 15 threads | 0:1 | - | - | - | 48805 | 919 | 2124 |
| Read Only | 3 clients * 50 threads | 1:0 | 370068 | 402 | 988 | - | - | - |
| Read Write | 3 clients * 30 threads | 1:1 | 50762 | 532 | 5859 | 50759 | 1233 | 4162 |
| Read Write | 3 clients * 15 threads | 1:3 | 14471 | 443 | 3869 | 43425 | 884 | 1899 |
| Read Write | 3 clients * 15 threads | 1:30 | 1583 | 473 | 3432 | 47551 | 928 | 2066 |
| Read Write | 3 clients * 30 threads | 3:1 | 119093 | 406 | 3367 | 39693 | 1035 | 2581 |
| Read Write | 3 clients * 50 threads | 30:1 | 322904 | 435 | 1034 | 10762 | 882 | 1392 |

Pegasus Server 2.4

| Case | client and thread | R:W | R-QPS | R-Avg | R-P99 | W-QPS | W-Avg | W-P99 |
|------|-------------------|-----|-------|-------|-------|-------|-------|-------|
| Write Only | 3 clients * 15 threads | 0:1 | - | - | - | 56953 | 787 | 1786 |
| Read Only | 3 clients * 50 threads | 1:0 | 360642 | 413 | 984 | - | - | - |
| Read Write | 3 clients * 30 threads | 1:1 | 62572 | 464 | 5274 | 62561 | 985 | 3764 |
| Read Write | 3 clients * 15 threads | 1:3 | 16844 | 372 | 3980 | 50527 | 762 | 1551 |
| Read Write | 3 clients * 15 threads | 1:30 | 1861 | 381 | 3557 | 55816 | 790 | 1688 |
| Read Write | 3 clients * 30 threads | 3:1 | 140484 | 351 | 3277 | 46822 | 856 | 2044 |
| Read Write | 3 clients * 50 threads | 30:1 | 336106 | 419 | 1221 | 11203 | 763 | 1276 |
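
To compare the two runs, the relative QPS change per case can be computed directly from the tables above. The snippet below is a small helper for that arithmetic; the numbers are copied from the 2.3 and 2.4 tables, while the helper names `pct_change` and `fmt` are made up for this example.

```python
# (case, R:W, R-QPS 2.3, R-QPS 2.4, W-QPS 2.3, W-QPS 2.4)
# None marks the "-" cells in the tables.
ROWS = [
    ("Write Only", "0:1", None, None, 48805, 56953),
    ("Read Only", "1:0", 370068, 360642, None, None),
    ("Read Write", "1:1", 50762, 62572, 50759, 62561),
    ("Read Write", "1:3", 14471, 16844, 43425, 50527),
    ("Read Write", "1:30", 1583, 1861, 47551, 55816),
    ("Read Write", "3:1", 119093, 140484, 39693, 46822),
    ("Read Write", "30:1", 322904, 336106, 10762, 11203),
]


def pct_change(old, new):
    """Relative change from old to new in percent, or None if not measured."""
    if old is None or new is None:
        return None
    return (new - old) / old * 100.0


def fmt(p):
    return "-" if p is None else f"{p:+.1f}%"


for case, ratio, r_old, r_new, w_old, w_new in ROWS:
    r = fmt(pct_change(r_old, r_new))
    w = fmt(pct_change(w_old, w_new))
    print(f"{case:<10} {ratio:<5} R-QPS {r:>7}  W-QPS {w:>7}")
```

A positive value means 2.4 served more requests per second than 2.3 in that case; a negative value means fewer.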

Config-Update

+ [pegasus.server]
+ rocksdb_max_log_file_size = 8388608
+ rocksdb_log_file_time_to_roll = 86400
+ rocksdb_keep_log_file_num = 32

+ [replication]
+ plog_force_flush = false
  
- mutation_2pc_min_replica_count = 2
+ mutation_2pc_min_replica_count = 0 # 0 means the value is based on the table's max replica count
  
+ enable_direct_io = false # Whether to enable direct I/O when downloading files from HDFS, default false
+ direct_io_buffer_pages = 64 # Number of pages for the direct I/O buffer, default 64, which was the recommended value in my tests
+ max_concurrent_manual_emergency_checkpointing_count = 10
  
+ enable_latency_tracer_report = false
+ latency_tracer_counter_name_prefix = trace_latency
  
+ hdfs_read_limit_rate_mb_per_sec = 200
+ hdfs_write_limit_rate_mb_per_sec = 200
  
+ duplicate_log_batch_bytes = 0 # 0 means no batch before sending
  
+ [nfs]
- max_copy_rate_megabytes = 500
+ max_copy_rate_megabytes_per_disk = 0
- max_send_rate_megabytes = 500
+ max_send_rate_megabytes_per_disk = 0
  
+ [meta_server]
+ max_reserved_dropped_replicas = 0
+ bulk_load_verify_before_ingest = false
+ bulk_load_node_max_ingesting_count = 4
+ bulk_load_node_min_disk_count = 1
+ enable_concurrent_bulk_load = false
+ max_allowed_replica_count = 5
+ min_allowed_replica_count = 1
  
+ [task.LPC_WRITE_REPLICATION_LOG_SHARED]
+ enable_trace = true # true means the latency of this task is traced when global tracing is enabled
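
As a quick way to inspect a few of these options, the sketch below reads a fragment of the new settings with Python's standard `configparser` and converts them to typed values. It is only an illustration: Pegasus uses its own configuration loader, the fragment contains just some of the keys listed above, and treating `#` as an inline comment marker is an assumption made to match the listing.

```python
import configparser

# A fragment of the new 2.4 options, copied from the listing above.
FRAGMENT = """
[replication]
plog_force_flush = false
mutation_2pc_min_replica_count = 0 # 0: based on table max replica count
duplicate_log_batch_bytes = 0 # 0 means no batch before sending

[nfs]
max_copy_rate_megabytes_per_disk = 0
max_send_rate_megabytes_per_disk = 0

[meta_server]
max_allowed_replica_count = 5
min_allowed_replica_count = 1
"""

# Inline "#" comments are stripped only when inline_comment_prefixes is set.
parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
parser.read_string(FRAGMENT)

plog_force_flush = parser.getboolean("replication", "plog_force_flush")
min_2pc = parser.getint("replication", "mutation_2pc_min_replica_count")
max_replicas = parser.getint("meta_server", "max_allowed_replica_count")
min_replicas = parser.getint("meta_server", "min_allowed_replica_count")

print(plog_force_flush, min_2pc, min_replicas, max_replicas)
```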

Contributors

acelyc111 cauchy1988 empiredan foreverneverer GehaFearless GiantKing happydongyaoyao hycdong levy5307 lidingshengHHU neverchanje padmejin Smityz totalo WHBANG xxmazha ZhongChaoqiang

foreverneverer, Jul 05 '22 06:07

Several problems have been found according to the checklist for Incubator release:

  • In LICENSE and .licenserc.yaml, some file paths have not been updated after the refactor;
  • The year in NOTICE is still last year's.

Thus some PRs have been committed to fix these problems as follows:

  • [x] https://github.com/apache/incubator-pegasus/pull/1123
  • [x] https://github.com/apache/incubator-pegasus/pull/1121
  • [x] https://github.com/apache/incubator-pegasus/pull/1119

These PRs should be cherry-picked to v2.4 to meet the requirements for Incubator release.

empiredan, Aug 17 '22 06:08

Since more commits have been cherry-picked into v2.4, some of which involve the cmake module, I re-ran the benchmark as suggested by @acelyc111:

  • CentOS7 3.10.0-1160.1.0.el7.x86_64

Pegasus Server 2.4

| Case | client and thread | R:W | R-QPS | R-Avg | R-P99 | W-QPS | W-Avg | W-P99 |
|------|-------------------|-----|-------|-------|-------|-------|-------|-------|
| Write Only | 3 clients * 15 threads | 0:1 | - | - | - | 55490 | 808 | 3540 |
| Read Only | 3 clients * 50 threads | 1:0 | 361112 | 414 | 997 | - | - | - |
| Read Write | 3 clients * 30 threads | 1:1 | 63581 | 469 | 5447 | 63580 | 939 | 4959 |
| Read Write | 3 clients * 15 threads | 1:3 | 16559 | 396 | 4228 | 49664 | 769 | 3987 |
| Read Write | 3 clients * 15 threads | 1:30 | 1730 | 413 | 3669 | 51966 | 849 | 4735 |
| Read Write | 3 clients * 30 threads | 3:1 | 135091 | 376 | 3007 | 45304 | 842 | 4753 |
| Read Write | 3 clients * 50 threads | 30:1 | 319519 | 444 | 1442 | 10643 | 819 | 2691 |

For some reason, I could not run the benchmark under CentOS7 5.4.54-2.0.4.std7c.el7.x86_64, which may also lead to some differences from the previous results. I will retest some previous versions before the official release.

foreverneverer, Aug 23 '22 09:08

Some more license issues have been resolved:

  • https://github.com/apache/incubator-pegasus/pull/1173
  • https://github.com/apache/incubator-pegasus/pull/1176

acelyc111, Sep 28 '22 10:09

https://github.com/apache/incubator-pegasus/releases/tag/v2.4.0

acelyc111, Nov 01 '22 03:11