fastutil icon indicating copy to clipboard operation
fastutil copied to clipboard

asString and asCharSequenceView for CharLists

Open techsy730 opened this issue 3 years ago • 9 comments

It may be useful to have asString (returns a new String with the elements of the CharCollection as its characters) and asCharSequenceView (presents a view of the CharList as a CharSequence, no copying) in CharList. Perhaps asString would make sense in CharCollection too as there are some ordered non-list collections where it might make sense (SortedSet, LinkedHashSet, etc). CharSequence is an inherently indexed based api (well, so is String, but that one will be a copy, not a view), so it only makes sense on CharLists

I'm not sure if elementsToString is really a good name. contentsToString and stringValue also come to mind. I do like asCharSequence though. EDIT: Per discussion on the bug, the names decided on are asString and asCharSequenceView

The motivation for this is to:

  1. Let certain subclasses provide a faster way to dumping into a String (like if they are already backed by an array, we can just pass that array to String.copyValueOf instead of a loop copying into another array that is copied again by String's constructors).
  2. Provide a CharSequence view for interfacing with String based APIs without introducing an extra intermediary copy. Also gives an easy way to provide conversions to codepoints (CharSequence comes with default method APIs for that)

techsy730 avatar Jan 19 '21 15:01 techsy730

I've got a PR in progress but I would like to source some feedback before I go too much further with it.

techsy730 avatar Jan 19 '21 15:01 techsy730

The conversion to string sounds more like a helper method for CharCollections. For the view, I'd use charSequenceView(), which makes clear that no copying is performed.

vigna avatar Jan 19 '21 15:01 vigna

The conversion to string sounds more like a helper method for CharCollections.

Yes, but then things like CharArrayList won't be able to avoid an extra copy converting to a String. We have to copy once into an accumulating array, and then once again done by String construction (String always copies arrays it is handed). By letting array based subclasses to provide their own implementation, one copy step can be removed. We can pass the array straight to String (with offset and length) which will make its own copy, but now we are down to one copy instead of two.

I could special case ArrayList and ArraySet in the static method, but that seems dirty. :P

For the view, I'd use charSequenceView(), which makes clear that no copying is performed.

Yeah, that sounds good

techsy730 avatar Jan 19 '21 16:01 techsy730

Good point, if you can do much better than a method might be a good idea. Maybe you can default it in CharCollection and then have improved overrides.

Presently I'm more focused on reviewing the code and #162 , as I want either to complete it (with functions) or at least be sure we won't get into trouble in the future when we complete it (e.g., method calls becoming ambiguous).

vigna avatar Jan 19 '21 16:01 vigna

Thoughts on the name of the method? elementsToString, contentsToString, or stringValue?

And yeah, I wouldn't mind if this even gets pushed to 8.6.0, not a high priority feature by any means.

techsy730 avatar Jan 19 '21 16:01 techsy730

How does asString() sound?

vigna avatar Jan 19 '21 16:01 vigna

asString sounds good. Concise, but easy enough to remember what the difference is between that and toString once you learn about it. We would need some good Javadoc on the interface's asString and toString methods what the difference is (the former is the contents of the list into a string, the latter is a representation of the state of the list, usually the contents represented as a list type format like [a, b, c, ...]).

EDIT: Edited the OP with these new method names

techsy730 avatar Feb 02 '21 16:02 techsy730

Just a heads up, I have a branch halfway done implementing this, but IRL work got busy before I could clean it up for submission

techsy730 avatar Feb 20 '21 17:02 techsy730

Ok, I have the type I want to expose for unmodifiable view (basically a plain char sequence but also supporting many of the non-mutation methods StringBuilder provides, like getChars) The issue now comes from how many mutation based methods I want to provide, like java.lang.Appendable.

I may limit my scope to an unmodifiable view first, and then add in optional mutable views in later PRs

techsy730 avatar Mar 21 '21 20:03 techsy730