DM: support PCRE regexp syntax for routing

Open takaidohigasi opened this issue 1 year ago • 7 comments

Is your feature request related to a problem?

I'd like to consolidate many schemas with a simple route config.

Describe the feature you'd like

If the source has databases like

  • mercari_a
  • mercari_b
  • mercari_c
  • mercari_d
  • mercari_e

the following config does not work

  route-a:
    schema-pattern: "mercari_a"
    target-schema: "mercari"
  route-others:
    schema-pattern: "mercari_(?!a).*"
    target-schema: "mercari-other"

Describe alternatives you've considered

No response

Teachability, Documentation, Adoption, Migration Strategy

No response

takaidohigasi avatar Oct 03 '24 15:10 takaidohigasi

I may have created this issue in the wrong repository: https://github.com/pingcap/tidb/blob/master/pkg/util/table-router/router.go

takaidohigasi avatar Oct 03 '24 15:10 takaidohigasi

I now understand that Golang's regexp supports RE2 syntax, not PCRE.

takaidohigasi avatar Oct 03 '24 16:10 takaidohigasi

I understand there isn't a lightweight, safe, proper library for this in Golang...

takaidohigasi avatar Oct 03 '24 16:10 takaidohigasi

You can use mercari_[^a] in RE2 syntax. I think it's confusing to have two RE flavors, and one can be converted to the other in most cases.

lance6716 avatar Oct 08 '24 05:10 lance6716

The actual wildcard syntax is mercari_[!a]. You need to prefix the pattern with ~ to use the regexp syntax. However, the doc still seems to recommend only the wildcards * and ? at the end.

schema-pattern: 'mercari_[!a]'
# or
schema-pattern: '~^mercari_[^a]'
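
To check the regexp form against the schema names from the original post, here is a minimal Go sketch. It assumes the part after the `~` prefix is handed to Go's standard `regexp` package; `matchesRoute` is a hypothetical helper for illustration, not DM's router code.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchesRoute reports whether schema matches the regexp part of a
// '~'-prefixed pattern. Hypothetical helper, not DM's actual router.
func matchesRoute(pattern, schema string) bool {
	return regexp.MustCompile(pattern).MatchString(schema)
}

func main() {
	schemas := []string{"mercari_a", "mercari_b", "mercari_c", "mercari_d", "mercari_e"}
	for _, s := range schemas {
		// Only mercari_a should be rejected by the negated class [^a].
		fmt.Printf("%s -> %v\n", s, matchesRoute(`^mercari_[^a]`, s))
	}
}
```

Note that `[^a]` inspects a single character, so any schema whose name continues with `a` right after the underscore (for example a hypothetical `mercari_abc`) would be excluded as well.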

kennytm avatar Oct 08 '24 06:10 kennytm

Sorry, my example was a bit wrong, and I don't think the existing RE2 syntax can handle my case.

  • mercari_sample
  • mercari_sample2
  • mercari_sample_hoge
  • mercari_asample
  • mercari_bsample
  • ...

the following config does not work

  route-a:
    schema-pattern: "mercari_sample"
    target-schema: "mercari"
  route-others:
    schema-pattern: "^mercari_(?!sample$).*$"

takaidohigasi avatar Oct 22 '24 10:10 takaidohigasi

It won't work because lookahead assertions such as (?!x) are not "regular" and thus not part of RE2 syntax. There is no plan to support regex flavors other than Golang's built-in one.
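
As a quick way to see this, the following sketch tries to compile both patterns from this thread with Go's standard `regexp` package. The helper name `compiles` is made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether Go's regexp package (RE2 syntax) accepts the pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// Negative lookahead is PCRE-only, so compilation is rejected.
	fmt.Println(compiles(`^mercari_(?!sample$).*$`))
	// A negated character class is plain RE2 and compiles fine.
	fmt.Println(compiles(`^mercari_[^a]`))
}
```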

Route-rules do not allow overlapping matches[^1], but if the "other" schemas are a finite set you could always do

routes:
  route_a:
    schema-pattern: mercari_sample
    target-schema: mercari
  route_others:
    schema-pattern: '~^mercari_(sample2|sample_hoge|asample|bsample|...etc...)$'
    target-schema: mercari_others

[^1]: I think this can be relaxed: given that route-rules are an ordered array, we could easily break ties by favoring the first matching rule.

kennytm avatar Oct 22 '24 11:10 kennytm