seastar
seastar copied to clipboard
sstring causes interoperability problems with other libraries
The reason why seastar has its own "sstring" type instead of using std::string is historical, due to C++ library implementation when the Seastar projects started: In gcc before version 5, std::string used to use atomic operations which are bad for many-core applications like Seastar. As the nice blog post https://shaharmike.com/cpp/std-string/ explains why this was the case in pre-C++-11 and why gcc for ABI backward-compatibility reasons left this implementation long after 2011. But since gcc 5, this old implementation is gone. Today's std::string implementations in gcc and clang use exactly the same techniques as seastar::sstring does: no atomic operations, and in-structure storage for small strings (a.k.a. "small-string optimization") - the aforementioned blog points out that clang's implementation is particularly clever.
For this reason, seastar::sstring is no longer needed and we can switch to using std::string.
Since Seastar does not care about ABI compatibility (we do not yet support a Seastar shared library - see issue #467), and only about source-code compatibility, it should be easy to change Seastar so that seastar::sstring is merely an alias to std::string, without breaking compatibility for existing applications. Applications can then start gradually changing their own code to use std::string instead of seastar::sstring.
@avikivity pointed out in the mailing list, "I don't think it's simple. The interfaces are not identical, sstring has several features beyond std::string.".
One example I found is the null-termination template parameter introduced in commit 7328d17337523ae1e64dbd07cc5aa14628a022f0 which std::basic_string doesn't support, but seastar::basic_sstring does.
Another difference is that seastar::sstring
has a constructor taking sstring::initialized_later()
and a size, which std::string
doesn't have. This constructor allows creating an uninitialized string of a certain length, and only then fill its data. We also have a convenience function make_sstring()
to take a bunch of arguments and concatenate them into one new string. The closest constructor that std::string
has is one taking an iterator pair. It's not as general as make_sstring()
allowing parameters of different type, but I don't think this is a critical feature (of course there's the issue of backward compatibility, though....).
Yet another feature we have and std::string doesn't is sstring::release()
, introduced in commit a2ca5568369c2f782477de890b468b0650ac2a33. This allows taking out the internal buffer held by sstring so it can be held by another owner without copying.
Another gratuitous difference is that seastar::sstring::begin()
returns a bare character pointer, while std::string::begin()
returns an std::string::iterator
. This breaks for example a call like seastar::output_stream
That's really a problem in output_stream::write(), it should support iterator+len and iterator pair.
In commit fc0750bf5c0e16abc8423f38a88118c3640ca75d, a compile-time flag SEASTAR_SSTRING was added. Our cmake sets it by default, but if not set, seastar::sstring
becomes just an alias for std::string
. This doesn't solve any of the above differences in the APIs, but should make it easier to check what happens if we were to do that conversion.
Another difference is that
seastar::sstring
has a constructor takingsstring::initialized_later()
and a size, whichstd::string
doesn't have. This constructor allows creating an uninitialized string of a certain length, and only then fill its data.
I just noticed that although std::string
doesn't have a way to create an uninitialized string with a certain length and fill it later, it does a push_back()
method (which we are lacking, see https://github.com/scylladb/seastar/issues/2104) which is guaranteed to have (amortized) complexity O(1), so one can create a string of length N in O(N) by incrementally calling push_back()
.
Even better, std::string
also has a reserve()
method (which I believe we're also lacking!) which will make such a loop even more efficient. Even more efficient can be to use append()
or the new C++23 append_range()
after a reserve()
instead of doing push_back()
one character at a time.
I don't think that sstring::initialized_later()
gives us anything in features or performance that these standard solutions can't.