OpenGFW icon indicating copy to clipboard operation
OpenGFW copied to clipboard

Add GeoIP and GeoSite support for expr

Open KujouRinka opened this issue 1 year ago • 18 comments

I add geoip and geosite support:

- name: geosite block bilibili
  action: block
  expr: geosite(string(tls?.req?.sni), "bilibili")

- name: geoip block some ip
  action: block
  expr: geoip(string(ip.dst), "cn")

But I think it's better to make this as a draft, for more things that might have further discussion:

  • Most of files are directly copied from github.com/apernet/hysteria/extras/acl, I think only part of them are used so I decide to remove code that don't be used;
  • I have no idea about could I organize file like this and file naming convention;
  • More geo-like function could be added, such as CIDR support.

KujouRinka avatar Jan 28 '24 14:01 KujouRinka

Here's what I do:

  • Remove most of files but keep v2geo and matchers_v2geo.go;
  • Rename matcher.go to host_matchers.go for I think this is helpful;
  • I only keep compileHostMatcher function from compile.go and move it to geo_matcher.go;
  • Add Mutex to protect field in GeoMatcher, but I think it may be better to use RwLock instead.

So it looks like:

ruleset
    ├── buildins
    │   └── geo
    │       ├── geo_loader.go
    │       ├── geo_matcher.go
    │       ├── host_matchers.go
    │       ├── interface.go
    │       ├── matchers_v2geo.go
    │       └── v2geo/
    ├── expr.go
    └── interface.go

If no further problem, I will continue working on this:

Need a way to expose config options for users to specify local geoip/geosite db files, similar to Hysteria.

But may I ask that it's better to put this option to config.yaml in rules.yaml?

Thanks for your reading.

KujouRinka avatar Jan 29 '24 02:01 KujouRinka

👍 The changes look good but maybe remove compileHostMatcher entirely (extract the code that handles geoip/geosite into separate methods in GeoMatcher). btw "buildins" should be "builtins".

But may I ask that it's better to put this option to config.yaml in rules.yaml?

In config.yaml, perhaps create a new section for ruleset settings as we may have more in the future

tobyxdd avatar Jan 29 '24 03:01 tobyxdd

To specify local geoip/geosite db files:

# config.yaml

# other field ....

geo:
  geoip: geoip.dat
  geosite: geosite.dat

KujouRinka avatar Jan 29 '24 07:01 KujouRinka

Just tested it and can confirm that it works great 💯

I just have one last nitpicking to make - can you make it only download GeoIP/GeoSite db if at least one rule uses these functions? (lazy loading similar to ACL in Hysteria)

We already have AST visitor for analyzer dependency check, so it should be easy to implement

tobyxdd avatar Jan 29 '24 20:01 tobyxdd

I didn't find way to lazily build geosite and geoip by only calling expr.Compile() so I use parser.Parse() and ast.Walk to walk on AST before compile. I also move original expr.Compile() 's argument to depVisitor but I am not sure is this right.

https://github.com/KujouRinka/OpenGFW/blob/c26e04b5a6aa5ee7f4b9405702b873a0c4b4accd/ruleset/expr.go#L104-L123

I think geosite and geoip should differ from Analyzer so I add a function func isBuiltInFunction(name string) bool. If I am wrong please correct me.

KujouRinka avatar Jan 30 '24 03:01 KujouRinka

Is it possible to always provide these functions when calling Compile, but only actually load the geo files when the identifier is seen by the visitor during compilation? I think this works the same, but can keep the code the way it was before

tobyxdd avatar Jan 30 '24 03:01 tobyxdd

I agree with you, but I have a more question that loading geo files may counter with internet error, while ast.Visitor interface doesn't provide us how to return this. I think we should introduce a field in depVisitor for recording.

KujouRinka avatar Jan 30 '24 05:01 KujouRinka

Yes, feel free to modify depVisitor as you need.

tobyxdd avatar Jan 30 '24 05:01 tobyxdd

Or maybe just keep depVisitor a visitor (only record identifiers) and do the loading in the outer compile function?

tobyxdd avatar Jan 30 '24 05:01 tobyxdd

I rebase it from 1eb52f7 for making change clear.

KujouRinka avatar Jan 30 '24 05:01 KujouRinka

Or maybe just keep depVisitor a visitor (only record identifiers) and do the loading in the outer compile function?

You mean mark whether geosite and geoip is needed in depVisitor, then after compile function to decide whether to load? I think this is better, for we have no need to record geo.GeoMatcher in depVisitor. Just regard it as "only record identifiers" you mentioned. 😊

KujouRinka avatar Jan 30 '24 05:01 KujouRinka

Yes, maybe depVisitor can have a Functions similar to Analyzers

tobyxdd avatar Jan 30 '24 05:01 tobyxdd

I introduce UseGeoSite and UseGeoIp to depVisitor to mark this.

KujouRinka avatar Jan 30 '24 05:01 KujouRinka

I think I should rebase these commits from first commit of this PR right?

KujouRinka avatar Jan 30 '24 11:01 KujouRinka

I think I should rebase these commits from first commit of this PR right?

What do you mean

tobyxdd avatar Jan 30 '24 17:01 tobyxdd

I think these commits are a bit messy, so before merging, maybe I need to use "git rebase" to merge all commits into one.

KujouRinka avatar Jan 30 '24 23:01 KujouRinka

That's fine, we can use squash merge.

tobyxdd avatar Jan 31 '24 00:01 tobyxdd

I think this draft is about to be completed so I convert it to a PR. If have any problem, please feel free to call me. ❤️

KujouRinka avatar Jan 31 '24 00:01 KujouRinka

Thanks again for the effort. I plan to release v0.0.5 with this & WireGuard analyzer

tobyxdd avatar Jan 31 '24 01:01 tobyxdd