libgd
libgd copied to clipboard
Issue-185 changes
These changes should resolve https://github.com/libgd/libgd/issues/185 Please, test and let me know if any other changes need to be made. It is my intent not to break things for other people.
please undo all the whitespace changes. they don't look necessary or related and make it very hard to understand what the patch is actually doing.
I completely replaced 1 function with 3, but I can add the extra whitespaces everywhere if you like (maintaining the style used in the replaced function) ? The file as a whole is already mixed style.
I'll try to clarify. entities.h was updated to include all current html5 entities instead of only html4. The gdTcl_UtfToUniChar function was removed. The comp_entities function was removed. gdTcl_UtfToUniChar became gd_Entity_To_Unicode (no longer requires comp_entities), gd_JISX0208_To_Unicode (2-byte encoding not utf-8), and gd_UTF8_To_Unicode (now accepts 4-byte utf8 for the official unicode limit). The reasoning behind splitting the functions is because JISX0208 and UTF8 are mutually exclusive and overlap. The entities all start with &, so it kind of piggy backs regardless of the encoding, i've maintained that behavior.
I considered adding JISX0213, but that's a different matter.
the indentation in src/entities.h is incorrect.
it's unclear how you're generating this, but it isn't with the entities.tcl that we've used historically. however it's done, entities.tcl needs to be updated or replaced.
I wrote a C program to do it, as the format of the website changed and it no longer works. I can remove the extra tab, but I figured it wouldn't matter. http://sterlingdesktops.com/pub/test/entities.c
In an ideal situation there would be an entities.h with the prototypes and an entities.c containing the function and the lookup table. jisx0208.h as well. I didn't want to modify the build system or take things that far.
We're kinda just putting a bandade on things, without a complete rewrite of gdft.c. There are a lot of things that could be improved.
we want that logic in the repo so we can verify & keep in sync easier. and we don't want to keep around old scripts that are known to be outdated.
I'd have to add the header portion of the printing and make it output the file as a whole. Right now it is just outputting the relevant lines. I suppose I can do that. I never really used tcl, so I can't be certain of how to make that perfect.
I was wondering if there was an irc channel for dev discussion ? Is everything done here on github ?
There is another way I can write gd_UTF8_To_Unicode that uses half as many lines of code as well.
Ok, i've got the entities_gen.c to replace entities.tcl. What else should I do ?
My use of FT_Select_Charmap(face, charmap->encoding); is not ideal. When the user calls the gdImageStringFTEx function with a selected encoding, would the use of FT_Select_Charmap(face, FT_ENCODING_UNICODE); break things for anyone ?
It might be best to rewrite that entire section to map the selected encoding to an FT_Select_Charmap call ie: FT_Select_Charmap(face, FT_ENCODING_ADOBE_CUSTOM); Then we can check the return for error (maybe something for a future pull).
I think only FT_Set_Charmap existed when gdft.c was written.
If you want to cancel this pull request I don't mind. I can reopen again later once i'm done with all the other changes I need to make. It'll likely be in a broken state for a few days.
shrug it doesn't cost us anything to leave open
I think it works now. I'll have to do some extensive testing and look everything over, but i'm close.
As far as I know this is done, unless someone else sees a problem.
Wouldn't it be easier to just merge then we/you can change the minor stuff ? Many of these are style preference changes.
having 30 tiny CLs that just shuffle style & bits around is not an improvement. having a single commit that implements a single thing makes history management a hell of a lot easier.
so yes, there is a lot of style feedback mixed in, but that's because this PR is going to be squashed into a single commit when it's merged.
I guess i'm just getting kinda slow or worn out on this thing at this point. Looks like i'll have to work up the energy for the python updates tomorrow. Thanks.
I think I addressed all/most of those requests. I guess we need a decisions on src/Makefile.am and .gitignore.
utf8-to-uni-01.c is 24% faster than utf8-to-uni-02.c when both compiled with -O2 -march=native utf8-to-uni-01.c is 31% faster than utf8-to-uni-02.c when both compiled with -O3 -march=native
also 15 lines less code. Do you think I should swap out that function ?
http://sterlingdesktops.com/pub/test/utf8/utf8-to-uni-01.c http://sterlingdesktops.com/pub/test/utf8/utf8-to-uni-02.c
It seems kinda silly, the function probably only takes less than 1% of the runtime in libgd. The entities function is the one that can be made much better.
Ok, what do you think now ?
These should address all those requests.
Should we remove the old "* gdTcl_UtfToUniChar is borrowed from Tcl ..." comment as well ? That is 41 lines. Or are we still sharing enough to mention it ?
Should we remove the old "* gdTcl_UtfToUniChar is borrowed from Tcl ..." comment as well ?
i'm indifferent. if you think the amount of borrowed code is still significant, leaving the comment sounds reasonable.
Is this done now ?
What is the status on this PR?
It works for me, although there have been a handfull of changes in master since april. I don't know if there are issues for other people.
@sterlingpickens could you apply a change (whatever, white space etc) to run the CI again? Last one it was broken. We also improved CI now using github actions on Windows, Mac and linux, intel and ARM. Thanks :)
I tried to sync with main, hopefully that didn't break it. I really don't understand this CI stuff. A problem with gd_entities ? Maybe something that got merged wasn't supposed to happen. This entire PR has dragged out too long and i'm lost. It might have to be started over from scratch.
Ok, it looks like adding entities.c to CMakeLists.txt was needed.
Thanks @sterlingpickens :)
About the CI, ono every new PR, or push to master or a PR, the builds&tests are ran on the various configurations. You can see the result in the PR or directly here.
@sterlingpickens And my apologies, it should have been merged long ago. I can give you a hand to get it done.
@vapier what are the remaining issues? The Python script, as long as it works, is not very important. It can always be improved later or use jq to create the header file from Json :)