uriparse-rs

Use smallstring variants for `Segment` str

Open damooo opened this issue 2 years ago • 8 comments

Hello, thanks for the great work.

It would be great if the crate used a small-string type as the backing storage for Segment.

Example crates are smol_str, smartstring, kstring, etc.

kstring also provides a KStringCow struct, which fits the existing structure of the code.

damooo avatar Feb 26 '22 12:02 damooo

I vaguely recall having given this a try using smallvec I think, but it introduced some type errors that I could not get around. Will look at some of the ones you suggested.

sgodwincs avatar Mar 18 '22 20:03 sgodwincs

Is the concern here with converting Segment to an owned Segment and you want to avoid that allocation? That doesn't happen unless you explicitly make it owned. The only allocation in this library that is mandatory is the Vec allocation in Path.

sgodwincs avatar Mar 18 '22 20:03 sgodwincs

Yes, but dealing with lifetimes can be difficult in some involved cases. It would be great if an owned variant were available without many allocations.

Thanks for your work

damooo avatar Mar 20 '22 18:03 damooo

It may also help where normalization is necessary. Currently, segment normalization seems to allocate if the segment is not already normalized.

The KStringCow type from kstring may fit well with the existing semantics of the code. It provides a Cow-like type where the borrowed variant is a &str and the owned variant is a KString, which inlines up to 22 bytes.
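To make the shape of such a type concrete, here is a minimal std-only sketch. `SmallCow`, `INLINE_CAP`, and all method names are hypothetical and chosen for illustration; this is not kstring's API or implementation, only the general idea of a Cow-like enum whose owned form stays inline for short strings.

```rust
// Hypothetical inline capacity, mirroring the 22 bytes mentioned above.
const INLINE_CAP: usize = 22;

// Cow-like type: borrowed data stays a `&str`; owned data is stored inline
// when it fits and only falls back to the heap for longer strings.
#[derive(Debug, Clone)]
enum SmallCow<'a> {
    Borrowed(&'a str),
    Inline { len: u8, buf: [u8; INLINE_CAP] },
    Heap(Box<str>),
}

impl<'a> SmallCow<'a> {
    // Converts to a value with no borrowed data. Short strings are copied
    // into the inline buffer, so no heap allocation happens for them.
    fn into_owned(self) -> SmallCow<'static> {
        match self {
            SmallCow::Borrowed(s) if s.len() <= INLINE_CAP => {
                let mut buf = [0u8; INLINE_CAP];
                buf[..s.len()].copy_from_slice(s.as_bytes());
                SmallCow::Inline { len: s.len() as u8, buf }
            }
            SmallCow::Borrowed(s) => SmallCow::Heap(s.into()),
            SmallCow::Inline { len, buf } => SmallCow::Inline { len, buf },
            SmallCow::Heap(b) => SmallCow::Heap(b),
        }
    }

    fn as_str(&self) -> &str {
        match self {
            SmallCow::Borrowed(s) => *s,
            SmallCow::Inline { len, buf } => std::str::from_utf8(&buf[..*len as usize])
                .expect("inline bytes were copied from a valid &str"),
            SmallCow::Heap(b) => &**b,
        }
    }

    // True only when the contents live in a heap allocation.
    fn is_heap(&self) -> bool {
        matches!(self, SmallCow::Heap(_))
    }
}
```

With this sketch, `SmallCow::Borrowed("users").into_owned()` yields an inline value with no heap allocation, while a segment longer than `INLINE_CAP` bytes ends up in the `Heap` variant.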

damooo avatar Mar 22 '22 12:03 damooo

@sgodwincs, currently, to get an owned URI that is not static, one has to allocate for each segment in the path. An owned version may be required when certain traits require values with 'static lifetimes.

If small-string variants are used, each segment's data can mostly be stored inline while still being owned and dynamic.
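As a rough illustration of the per-segment cost, the sketch below uses a simplified stand-in for a parsed path. `PathSketch` and `require_static` are hypothetical names, and storing segments as `Cow<str>` is an assumption made for the example, not a description of uriparse's actual types.

```rust
use std::borrow::Cow;

// Simplified stand-in for a parsed path; not uriparse's actual `Path` type.
struct PathSketch<'a> {
    segments: Vec<Cow<'a, str>>,
}

impl<'a> PathSketch<'a> {
    // Parsing only borrows from the input: no per-segment allocation.
    fn parse(input: &'a str) -> Self {
        PathSketch {
            segments: input.split('/').map(Cow::Borrowed).collect(),
        }
    }

    // Making the path owned turns every borrowed segment into its own
    // `String`, so each non-empty segment costs a heap allocation.
    fn into_owned(self) -> PathSketch<'static> {
        PathSketch {
            segments: self
                .segments
                .into_iter()
                .map(|s| Cow::Owned(s.into_owned()))
                .collect(),
        }
    }

    // Number of segments that hold their own `String`.
    fn owned_count(&self) -> usize {
        self.segments
            .iter()
            .filter(|s| matches!(s, Cow::Owned(_)))
            .count()
    }
}

// Stand-in for an API that only accepts values without borrowed data.
fn require_static<T: 'static>(value: T) -> T {
    value
}
```

A path parsed from a temporary `String` cannot be passed to `require_static` until `into_owned` has been called, and at that point every segment holds its own `String`.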

damooo avatar Apr 14 '22 03:04 damooo

@sgodwincs, if you are open to the idea, I can make a PR that introduces no breaking changes and touches only a little code around the Segment struct.

damooo avatar Apr 14 '22 14:04 damooo

My concern is the increased size of Segment. Each one will be at least 22 bytes (guessing higher with alignment) even if they're just pointers to the heap. If you have a path with many segments and you never make it owned, then you have 22*(number of segments) bytes of wasted space excluding the actual data on the heap.

sgodwincs avatar Apr 15 '22 21:04 sgodwincs

@sgodwincs, we would follow the same pattern as the existing Segment implementation, that is, a Cow-like type: if it is not owned, it is just a &str; otherwise it is an owned smol_str/kstring/... string instead of a String. The optimization applies only to the owned case. By default, an owned String also takes 24 bytes on the stack (on 64-bit targets), plus the content size on the heap.
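The stack sizes in question can be checked with a small std-only sketch. `stack_sizes` is a hypothetical helper name; the values it reports depend on the target and compiler version, so no specific output is claimed here beyond what the comments state.

```rust
use std::borrow::Cow;
use std::mem::size_of;

// Reports the stack footprint of the string types under discussion.
// On typical 64-bit targets `&str` is 16 bytes (pointer + length) and
// `String` is 24 bytes (pointer + capacity + length); the size of
// `Cow<str>` is at least that of `String`.
fn stack_sizes() -> [(&'static str, usize); 3] {
    [
        ("&str", size_of::<&str>()),
        ("String", size_of::<String>()),
        ("Cow<str>", size_of::<Cow<'static, str>>()),
    ]
}
```

Printing the result of `stack_sizes()` gives a quick way to compare these against the size of whichever small-string type is being considered.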

kstring provides a KStringCow by default, too.

In many cases where a struct field takes a URI or URIReference and expects it to be owned, as in the Url/http::Uri crates, it currently allocates for every segment, and again for every clone.

damooo avatar Apr 16 '22 11:04 damooo