squid icon indicating copy to clipboard operation
squid copied to clipboard

Refactor some URL functions to SBuf

Open yadij opened this issue 6 years ago • 4 comments

Refactor urlCanonicalCleanWithoutRequest() API to SBuf to remove stringHasCntl() dependency between libanyp.la and String.cc.

  • stringHasCntl() checked for the odd range of 0x80-0x9F as control characters. But the extended-ASCII / UTF-8 set is 0x80-0xFF and RFC 2616/7230 require encoding for the full UTF-8 range.

  • Add simple unit test for this function. Initially to check the strip_query_terms behaviour persists through this refactor. Additional tests for correct %-encoding are still needed, but encoding is not changed at this time.

Refactor for HttpRequest::canonicalCleanUrl() and urlCanonicalFakeHttps() to avoid having to reallocate for return values derived from urlCanonicalCleanWithoutRequest() output.

Refactor AnyP::Uri::cleanup() API to SBuf in order to avoid performance regression from double-reallocation when using the new cleaning functions and methods.

  • Add simple unit test for this function. Initially to check the uri_whitespace behaviour persists through this refactor. Useful expansions to the test are marked as TODO/XXX.

  • The existing code has some odd inconsistencies due to the mixed use of w_space, isspace(), RFC1738_ESCAPE_NOSPACE and hard-coded character scanning. Fixing that is marked with XXX and left for future as it involves UI and behaviour changes.

  • Use of Tokenizer for internal processing is also left out of scope for this update as doing that right will require the inconsistencies be fixed first and refactoring of the main urlParse method may require significant changes to this function design.

yadij avatar Nov 25 '18 18:11 yadij

I have resolved change requests addressed as of commit https://github.com/squid-cache/squid/pull/335/commits/f31f3fae0a920fbf8687d74d5a9867c73540b56f. There are approximately four change requests that remain unresolved, and I have commented on most of them. All of these come from https://github.com/squid-cache/squid/pull/335#pullrequestreview-178528635. I will do another (hopefully more comprehensive) review after the remaining old change requests are addressed.

rousskov avatar Dec 31 '18 17:12 rousskov

I think this is all the requests handled now. Please check.

yadij avatar Dec 09 '19 14:12 yadij

I think this is all the requests handled now. Please check.

I found three unaddressed old change requests: 1, 2, and 3.

rousskov avatar Dec 26 '19 22:12 rousskov

AFAICS the outstanding review issues are now all done. Waiting now on re-review. If this is ready for merge I can move on to completing the AnyP::Uri::parse() method conversion I have underway in the background.

yadij avatar Feb 09 '24 10:02 yadij