migrate pcre

Open andyli opened this issue 4 years ago • 14 comments

See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1000117

andyli avatar Nov 18 '21 13:11 andyli

I'm guessing this is more complicated than just changing libpcre3 to libpcre2 in some places?

Simn avatar Nov 23 '21 10:11 Simn

I've heard that the APIs are different, so prepare to have some code changes.

andyli avatar Nov 26 '21 08:11 andyli

At times like this I realize that GitHub needs a crying emoji...

Simn avatar Nov 26 '21 08:11 Simn

I think the best info is (will be) over https://github.com/PhilipHazel/pcre2/issues/51

andyli avatar Nov 26 '21 08:11 andyli

Recently I did this for mac https://github.com/HaxeFoundation/haxe/commit/62ee6a9a8e727de30ac7c8f6c82a3e8f42b9a22b And it works.

RealyUniqueName avatar Nov 26 '21 09:11 RealyUniqueName

Recently I did this for mac 62ee6a9 And it works.

hmm... Shouldn't pcre2 use pcre2.h instead of pcre.h? How come it compiles...

andyli avatar Nov 26 '21 09:11 andyli

Maybe the worker has it pre-installed now and our manual compilation of pcre is obsolete?

RealyUniqueName avatar Nov 26 '21 09:11 RealyUniqueName

Maybe the worker has it pre-installed now and our manual compilation of pcre is obsolete?

Probably. Then it is still not using pcre2.

Should also check whether the haxe binary dynamically depends on pcre or not. I think we build and install our own pcre because we want to statically link it.
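One way to run that check is sketched below. It assumes `ldd` is available on Linux and `otool -L` on macOS; the helper names `dynamic_deps` and `pcre_lines` are made up for illustration and are not part of the haxe build scripts.

```python
import shutil
import subprocess
import sys


def dynamic_deps(binary):
    """Return the raw shared-library listing for `binary`.

    Uses `otool -L` on macOS and `ldd` elsewhere (assumption: one of
    these tools is on PATH). A statically linked binary can make the
    tool exit non-zero, so the exit status is deliberately ignored.
    """
    tool = ["otool", "-L"] if sys.platform == "darwin" else ["ldd"]
    if shutil.which(tool[0]) is None:
        raise RuntimeError(tool[0] + " not found on PATH")
    result = subprocess.run(tool + [binary], capture_output=True, text=True)
    return result.stdout


def pcre_lines(listing):
    """Return the lines of a dependency listing that mention libpcre."""
    return [line.strip() for line in listing.splitlines() if "libpcre" in line]
```

Calling `pcre_lines(dynamic_deps(path_to_haxe))` with the path of the built binary would then list any dynamically linked pcre or pcre2 library. An empty list would suggest the library is either statically linked or not linked at all, which this check alone cannot distinguish.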

andyli avatar Nov 26 '21 10:11 andyli

Compared to Neko and Hashlink, migrating eval to pcre2 is a bit more involved. The current ocaml pcre library used by haxe is just a copy of: https://github.com/mmottl/pcre-ocaml (besides minor edits, most of which were lost when updating).

Unfortunately, there does not seem to be a pcre2 equivalent of this library for ocaml yet. See https://github.com/mmottl/pcre-ocaml/issues/25.

Unless such a library is released, if we were to do this ourselves it would require writing:

  • a pcre2 equivalent for https://opam.ocaml.org/packages/conf-libpcre/ (the easy bit)
  • a pcre2 port of https://github.com/mmottl/pcre-ocaml. Although, for haxe's purpose this may not be too big a job.
  • Then, any potential ocaml api changes

tobil4sk avatar Apr 23 '22 08:04 tobil4sk

@tobil4sk Since a comprehensive pcre2 ocaml binding does not seem to be forthcoming, we could build our own with ctypes. The downside is that we would have to write our own binding, but the upside is that we would only need to flesh out enough of it to make our haxe eval port work, instead of trying to make a comprehensive ocaml binding.

Uzume avatar May 09 '22 15:05 Uzume

Indeed, while writing OCaml bindings isn't much fun, we're only using like 5 or so PCRE functions for which we would need bindings.
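To give a feel for how small such a binding surface could be, here is a rough sketch using Python's `ctypes` as a stand-in for OCaml ctypes (this is not the eval binding itself). The six functions bound are the real 8-bit PCRE2 entry points, but treating them as the minimum set is an assumption, not a list of what haxe eval actually calls. The helper names `load_pcre2` and `first_match` are hypothetical.

```python
import ctypes
import ctypes.util


def load_pcre2():
    """Return a handle to the 8-bit PCRE2 library, or None if it is not installed."""
    name = ctypes.util.find_library("pcre2-8")
    if name is None:
        return None
    try:
        lib = ctypes.CDLL(name)
    except OSError:
        return None
    vp, sz = ctypes.c_void_p, ctypes.c_size_t
    # Exported symbols carry the code unit width as a suffix (_8).
    lib.pcre2_compile_8.restype = vp
    lib.pcre2_compile_8.argtypes = [
        ctypes.c_char_p, sz, ctypes.c_uint32,
        ctypes.POINTER(ctypes.c_int), ctypes.POINTER(sz), vp,
    ]
    lib.pcre2_match_data_create_from_pattern_8.restype = vp
    lib.pcre2_match_data_create_from_pattern_8.argtypes = [vp, vp]
    lib.pcre2_match_8.restype = ctypes.c_int
    lib.pcre2_match_8.argtypes = [vp, ctypes.c_char_p, sz, sz, ctypes.c_uint32, vp, vp]
    lib.pcre2_get_ovector_pointer_8.restype = ctypes.POINTER(sz)
    lib.pcre2_get_ovector_pointer_8.argtypes = [vp]
    lib.pcre2_match_data_free_8.restype = None
    lib.pcre2_match_data_free_8.argtypes = [vp]
    lib.pcre2_code_free_8.restype = None
    lib.pcre2_code_free_8.argtypes = [vp]
    return lib


def first_match(lib, pattern, subject):
    """Return the first match of `pattern` in `subject` (both bytes), or None."""
    err = ctypes.c_int(0)
    err_offset = ctypes.c_size_t(0)
    code = lib.pcre2_compile_8(
        pattern, len(pattern), 0, ctypes.byref(err), ctypes.byref(err_offset), None
    )
    if not code:
        raise ValueError(
            "compile failed: error %d at offset %d" % (err.value, err_offset.value)
        )
    match_data = lib.pcre2_match_data_create_from_pattern_8(code, None)
    try:
        rc = lib.pcre2_match_8(code, subject, len(subject), 0, 0, match_data, None)
        if rc < 0:
            return None  # no match (or another error; not distinguished here)
        ovector = lib.pcre2_get_ovector_pointer_8(match_data)
        # ovector[0] and ovector[1] are the start and end of the whole match.
        return subject[ovector[0]:ovector[1]]
    finally:
        lib.pcre2_match_data_free_8(match_data)
        lib.pcre2_code_free_8(code)
```

Note that PCRE2 returns match offsets through a separate match data block rather than a caller-supplied int array as in PCRE1, which is one of the API differences a port has to account for.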

Simn avatar May 09 '22 16:05 Simn

* a pcre2 equivalent for https://opam.ocaml.org/packages/conf-libpcre/ (the easy bit)

Well, I submitted ocaml/opam-repository#21349, it was merged and we now at least have conf-libpcre2-8 and should be able to build some sort of binding on top of it.

Uzume avatar May 10 '22 12:05 Uzume

@RealyUniqueName:

Recently I did this for mac 62ee6a9 And it works.

I do not think this actually does work. And I am not sure 56bb846c725d383629460db9161d7e471ce85c28 was the right thing to do either. If we needed mbedtls why was ac399e227b45ad0b137053987b396c3d7b957276 done? Are we trying to move away from homebrew for macos? I notice we are still depending on: conf-libpcre, conf-zlib and conf-neko, each of which specifies homebrew as the system installation source for macos (despite using the HaxeFoundation build of neko and building our own libz during our own build process for ci and binary releases, etc.).

hmm... Shouldn't pcre2 use pcre2.h instead of pcre.h? How come it compiles...

Yes, we should be and the library should be changed from -lpcre, libpcre-1.dll and /usr/local/lib/libpcre.a to -lpcre2-8, libpcre2-8-0.dll and /usr/local/lib/libpcre2-8.a, etc. But that will only break the build until the code is ported to the pcre2 API specified in pcre2.h.
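The renames listed above can be written down as a small lookup, shown here as an illustrative sketch only. The helper name `migrate_link_args` is hypothetical and does not exist in the haxe build scripts; the mapping contains just the three names mentioned in this comment.

```python
# PCRE1 -> PCRE2 (8-bit) library names, as listed in the comment above.
PCRE1_TO_PCRE2 = {
    "-lpcre": "-lpcre2-8",
    "libpcre-1.dll": "libpcre2-8-0.dll",
    "/usr/local/lib/libpcre.a": "/usr/local/lib/libpcre2-8.a",
}


def migrate_link_args(args):
    """Return a copy of `args` with known PCRE1 names replaced.

    Arguments that are not in the mapping pass through unchanged.
    """
    return [PCRE1_TO_PCRE2.get(arg, arg) for arg in args]
```

Swapping the names is only the linker half of the job: as noted, the build stays broken until the C code itself is ported to the functions declared in pcre2.h.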

Maybe the worker has it pre-installed now and our manual compilation of pcre is obsolete?

Probably for two reasons. One, maybe we ran the tests earlier and the lua tests install pcre (not pcre2) including from homebrew on macos here. opam users (I notice opam is way behind for haxe) will fail upon install because we depend on conf-libpcre (see above).

Probably. Then it is still not using pcre2.

Should also check whether the haxe binary dynamically depends on pcre or not. I think we build and install our own pcre because we want to statically link it.

Exactly, and building pcre2 during the haxe build just wastes time because it is not actually used. We should be using a system install of that anyway, and I recommend using homebrew, as we tell our users to do in the build instructions.

Uzume avatar May 14 '22 22:05 Uzume

@tobil4sk: as per my recent comments, you can probably tell I have been poking about to understand how haxe uses pcre in order to understand how to port to pcre2. There are at least two other uses of pcre: the php and lua targets.

I am not sure we can do anything about php. We directly use the binding in php itself, which already changed in php 7.3. Users have noticed a few minor functional changes, but apparently the php API is stable despite the changes under the hood.

As for the lua target, that might prove pretty easy (but I need to study the interfaces a bit more). Apparently haxe's main use for pcre is to implement an EReg interface, and for lua we use lrexlib-pcre. lrexlib already has a binding for pcre2, so maybe we can just switch to lrexlib-pcre2 if the interfaces are similar enough (see manual). I might have a go at that before considering tackling haxe eval.

EDIT: Okay, this is a pain in the butt. It seems our lua test infrastructure (see tests/runci/targets/Lua.hx) likely has not been updated in quite a while. It properly system installs libpcre3-dev via apt on Linux and pcre via brew on macos (I am not sure what it does for Windows; I can only assume that is not tested at all). Then it installs lua and luarocks via python hererocks. So far so good, but then it installs the lua rocks haxe-deps and luasec. luasec isn't so bad, but haxe-deps is empty save for requiring other dependencies, among them lrexlib-pcre (yay, we found it). It appears haxe-deps was created to help users install all the dependencies needed for haxe lua, but it seems quite outdated despite jdonaldson/haxe-deps#1, a pull request with updates by @Aurel300 that has been unmerged since 2021-09-10. So it is out of date, not taking updates, and does not include luasec (which we are installing separately). I am not sure how to subsume a luarock and I am not sure I really want to fork it, so perhaps I should just drop it in favor of directly installing all the dependencies in the test script. We might want to document this somewhere for user support, though.

EDIT: Thankfully this has now been straightened out (for now) by bypassing haxe-deps with #10916.

Of course php and lua are source targets and as such are not direct targets of haxe itself but rather of the code it generates. So from our perspective, that should only change dependencies here for tests. php takes care of itself (you cannot install php without pcre, or pcre2 for 7.3+, being installed), but we will have to update the lua target tests here to specify a pcre2 system install instead of pcre, to account for the move from lrexlib-pcre to lrexlib-pcre2.

Sadly, pcre-ocaml (which our eval binding is based upon) appears to export many lower level details of pcre (such as the study interface that disappears in pcre2; see API doc). I doubt we use that much of it to implement EReg though. We might be able to get by with our own minimized update of the binding, as suggested earlier.

Uzume avatar May 14 '22 23:05 Uzume

Now with #11030 and #11032 merged, hxcpp should be the only thing left that references PCRE1.

Uzume avatar Mar 25 '23 13:03 Uzume