poco icon indicating copy to clipboard operation
poco copied to clipboard

Supporting string_view

Open aleks-f opened this issue 1 year ago • 3 comments

Discussed in https://github.com/pocoproject/poco/discussions/4331

Originally posted by teksturi December 10, 2023 Hello.

We use Poco and are quite happy with it. However, we have already converted our entire internal codebase to use std::string_views. In many places, we now need to convert string_views to strings, as Poco only accepts the latter. These are the same places where we previously needed to create new strings, so it's not a catastrophic issue.

I'm still interested in whether we could support string_views in Poco and would like to know if this is something you would consider. I'm happy to start working on this, but first, we need a plan to determine if this will move forward. I understand that the minimum version Poco supports is C++17. However, there are still some places in the CMake files where C++14 is checked, but I wonder if that's just an oversight?

My proposal for proceeding with this is to start by converting to std::string_views in internal classes and source files. This should not affect external users at all. After that, we should decide whether to make a breaking change at some point and switch to using only string_views. Alternatively, we could take an intermediate step where we make this a compiler option. I've created a concept on Godbolt 1. We could get away of lot of things which currently use string iterators. These could in many places made just with string_views.

This is just an initial post on this subject. I have also done some prototyping with real code. If necessary, I can make a PR for a small portion, and maybe it would be easier to discuss it there.

aleks-f avatar Dec 10 '23 20:12 aleks-f

Here is some ideas so we can start planning. Some of these things might be obvious to you but I do not know your codeing strandards or are not totally familier with update/breaking policiyes.

https://godbolt.org/z/8TqWoYKcM

#include <string_view>
#include <cctype>
#include <fmt/core.h>
#include <type_traits>

#define POCO_USE_STRING_VIEW
#if defined(POCO_USE_STRING_VIEW)
 using StringRef = std::string_view;
 using StringRefConstIt = std::string_view::const_iterator;

 using StringRefSizeType = std::string_view::size_type;
#else
 using StringRef = const std::string&;
 using StringRefConstIt = std::string::const_iterator;

 using StringRefSizeType = std::string::size_type;
#endif

// Note that these are examples. Many things might be ambiguous.

class Foo
{
public:
    // This is current way we do in lot of places. We have couple option here
    // what to do.
    // Option 1 is to keep these as they are.
    Foo(const std::string& name) : _name(name) {}

    // Option 2 Just use string_view. This is easy but not as effiefent.
    Foo(std::string_view name)
        : _name(std::string(name))
    {}

    // Option 3 Use move idiom. It seems that it is actually already used in
    // some places. This is my personal choice.
    Foo(std::string&& name)
        : _name(std::move(name))
    {}

    // Other X. Of course there are more ways so please comment if there are
    // ways we should consider. Of course in some places string_view might be
    // totally good option if we process data but if we just store something
    // like _name then things are different.


    // We cannot change this without breaking lot of users code. For this we
    // probably need to use "StringRef".
    virtual void func1(const std::string& str);

    // Now we do not break existing user right away. They can however start
    // using string_views if they compile Poco with that option enabled.
    virtual void func1(StringRef str);

    // Iterators does not work so well. Happily with string_views many of these
    // kind become obsolete. As one can do this
    //
    //  func1(std::string_view(str).substr(4, 10));
    //
    // So maybe we should just leave these untouched for first and worry them
    // later on. This way existan user code will still work.

    // Option 1: Just keep old std::string::const_iterator untouched and silenty
    //           obsolete those over time.
    void func1(std::string::const_iterator it) {}

    // Option 2: Add also string_view version
    void func1(std::string_view::const_iterator it) {}

    // Option 3: Use some template magic. Does not seem very Poco way.
    template <typename T>
    struct is_string_or_string_view_iterator {
        static constexpr bool value = std::is_same_v<T, typename std::string::const_iterator> ||
                                    std::is_same_v<T, typename std::string_view::const_iterator>;
    };

    template <typename IteratorType, typename = std::enable_if_t<is_string_or_string_view_iterator<IteratorType>::value>>
    void func1(IteratorType it) {}

    // Option 4: Just remove iterator functions which become "obsolete".

    // Option 5: Use StringRefConstIt or something like that where it makes
    //           sense.
    void func1(StringRefConstIt it, StringRefConstIt end) {}

    // I think Option 1 or 5 are best options.


    // One thing to notice is that now many places (about 100) we do
    // 
    //   const std::string::const_iterator&
    //
    // It does not make sense to pass const_iterator as const reference.
    // Should we fix this also?


    // One thing I would like to ask if it is ok to use auto with those
    // iterator types? This will make sense as at some point StringRefConstIt
    // might go away so we do not need to touch those lines anymore.
    //
    // auto newIt = it;
    // std::string_view::const_iterator newIt = it;
    // StringRefConstIt newIt = it;

private:
    std::string _name;
};

int main() {
    Foo f(std::string("haha"));

    std::string name_str{ "william" };
    std::string_view name_sv{ "george"};

    f.func1(name_str.begin());

    f.func1(name_sv.begin());

    return 0;
}

teksturi avatar Dec 11 '23 18:12 teksturi

We also need to discuss about static strings. Here is code

https://godbolt.org/z/s3WqPd718

#include <string_view>
#include <string>

#define POCO_USE_STRING_VIEW
#if defined(POCO_USE_STRING_VIEW)
 using StringRef = std::string_view;
 using StringRefConstIt = std::string_view::const_iterator;

 using StringRefSizeType = std::string_view::size_type;

 using StringView = std::string_view;
#else
 using StringRef = const std::string&;
 using StringRefConstIt = std::string::const_iterator;

 using StringRefSizeType = std::string::size_type;

 using StringView = std::string;
#endif

// Note that these are examples. Many things might be ambiguous.

class Foo
{
public:
    // This was old way. We have couple option with these also.
    static const std::string HTTP_REASON_CONTINUE;

    // Start using string_view. Existing user might not be happy.
    // Maybe still private ones can be made like this.
    static const std::string_view HTTP_REASON_CONTINUE1;

    // Utilizing C++17's static inline feature for variable
    // declarations in header files. This enhances the clarity
    // and expressiveness of the code.
    static inline const std::string HTTP_REASON_CONTINUE3 = "Continue";

    // Those style combined we get.
    static inline const std::string_view HTTP_REASON_CONTINUE4 = "Continue";

    // Of course could also use eather of these also
    static const StringView HTTP_REASON_CONTINUE5;
    static inline const StringView HTTP_REASON_CONTINUE6 = "Continue";

    // 6 would be my personal choice here.
private:
    
};

const std::string Foo::HTTP_REASON_CONTINUE  = "Continue";
const std::string_view Foo::HTTP_REASON_CONTINUE1 = "Continue";
const StringView Foo::HTTP_REASON_CONTINUE5  = "Continue";

int main() {

    return 0;
}

teksturi avatar Dec 11 '23 19:12 teksturi

@teksturi, do you have some concrete places in mind where you think supporting string_view would be particularly useful?

matejk avatar Oct 03 '24 11:10 matejk