pugixml
pugixml copied to clipboard
std::string and std::string_view support in the same way for setters and constructors as for const char *
Hi and thanks for amazing library!
Do you plan to add std::string and std::string_view overloads for most setters/constructors? It makes code usage much more convenient.
This might become much more interesting when the standard library will have std::string_view
Here's some previous discussion on the topic: https://code.google.com/p/pugixml/issues/detail?id=49
That thread also has different implementations of a possible solution.
I experimented a bit with string_arg approach (the link above appears to be dead, but basically one way to implement this is to replace all existing const char_t*
arguments with string_arg
which would be a custom wrapper type for a char_t pointer with implicit ctor from const char_t*
and const std::basic_string<char_t>&
.
The biggest issue apart from breaking binary compatibility (which would mean I'd like to defer this until pugixml 2.0 to avoid potential issues with existing applications that link to pugixml dynamically) is the overload issues with xml_attribute::set_value
/ xml_text::set
, where bool
overload is now selected instead of string_arg
one (which is probably fixable by keeping the const char_t*
overload...)
Overall this seems like a good solution. There's obviously extra overhead in debug builds but there is no cost if compiler inlines the helper (which it should when optimizations are enabled).
Do you plan to add std::string overloads for most setters/constructors? It makes code usage much more convenient.
It would be better to support std::string_view, it will cover any strings and will solve problem with using unterminated strings. Actualy with modern project which uses std::string_view or boost::string_view instead of raw char const * every time i need to pass data to pugi i have to create temporary string for null termination
Btw - great library i love projects which have stl compatibile interface
A smaller but still useful change would be to have a
bool xml_node::set_value(const char_t* rhs, size_t strlength );
and just not do the impl::strlength(rhs) call in the implementation. The existing set_value can just be
bool xml_node::set_value(const char_t* rhs ) { return set_value( rhs, impl::strlength(rhs) ); }
I think std::(w)string_view is much more desirable as std::(w)string. And constructing a string_view from a std::string is very cheap, however constructing a string from a string_view is very expensive, so I would skip the std::string implementation and favor the std::string_view implementation.
@air2 The question you would need to ask yourself is if you need a null-terminated string or not. Because if that is the case then std::string_view
would not be the way to go per-say. It's a difficult question. However I think in this case I can't think of a reason why std::string_view
couldn't be used. However, this topic may be long dead as of now since it hasn't been updated for a while.
The other issue would require those who use this library to require a compliant C++17 compiler. With const std::string&
it would only require C++11 which I feel a lot more compilers have as their default now.
The other issue would require those who use this library to require a compliant C++17 compiler. With
const std::string&
it would only require C++11 which I feel a lot more compilers have as their default now.
It is not required to use C++17. Libraries such as fmt and spdlog (which uses fmt) use a custom tailored std::string_view
for pre-C++17, because std::string_view
uses/exploits no C++17 specific features and could thus be used pre-C++17.
@matt77hias But that would require the dependencies of an external library which I feel defeats the purpose of this library having to work on it's own without any dependencies except for the C headers that it uses.
I did not mean to add a dependency. A string_view is just a data pointer and data count providing some implicit/explicit constructors and cast operators. It is trivial and lightweight to implement yourself (one does not need all the other functionality) and one can replace it with std::string_view once C++17 is supported.
On Fri, Feb 14, 2020, 00:42 Raul Ramirez [email protected] wrote:
@matt77hias https://github.com/matt77hias But that would require the dependencies of an external library which I feel defeats the purpose of this library having to work on it's own without any dependencies except for the C headers that it uses.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/zeux/pugixml/issues/73?email_source=notifications&email_token=AASZSE653KI7UDH5XHRN7DLRCXLHJA5CNFSM4BW5HIAKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOELXA5EY#issuecomment-586026643, or unsubscribe https://github.com/notifications/unsubscribe-auth/AASZSEZURN4YTEDVGRGNJNLRCXLHJANCNFSM4BW5HIAA .
Is there any progress on this? As mentioned in https://github.com/zeux/pugixml/issues/73#issuecomment-551153613 there is no need to depend on any new functionality, just provide an overload which accepts size parameter (which already is done internally). This doesn't break any binary compatibility either: in worst case (app built with newer version load old version of pugixml shared library) there will be a dynamic linker error.
It is not always input value has terminating null and adding one means creating an unnecessary copy on heap.
The biggest issue ... is the overload issues with xml_attribute::set_value / xml_text::set, where bool overload is now selected instead of string_arg one (which is probably fixable by keeping the const char_t* overload...)
That's why you need std::enable_if
overloads to narrow down the set. But this probably better to postpone to 2.0 where you can just straight to concepts 😄
This one https://github.com/halx99/pugixml/pull/2 done setter/getter with string_view and other improvements, but the PUGIXML_COMPACT
was broken.
7 years passed :) Is it feasible to propose a pull request with the necessary changes? Will it be accepted?
pugixml performance will be better with string and string_view, why it wasn't considered yet?
@zeux maybe there should be separate branch for possibly breaking string_view-friendly API for people who need it?
I am geniunely consider dropping or forking pugixml due to being unable to use string_view
in my own API that calls pugixml under the hood.
A C++ library should have a modern and fast C++ interface...
Please keep the comments on topic and substantive.
In a recent change, several methods (like set_value and xml_text::set) gained support for size_t argument. To what extent adding string_view support just for these, and keeping functions that deal with attribute/element names as they are, would satisfy the need here?
It's of course possible to add string_view overloads for every method. That doubles the interface size for the library, and requires alternate implementations in certain cases where currently the internals assume null terminated inputs - this is obviously possible, but also not ideal.
The original idea for this issue has been to replace all methods with something that uses a wrapper, but given the binary compatibility concerns and uncertain timeline for 2.0 it's not clear that this is actually optimal.
The reason why names might be special is that code is generally more likely to use string literals than dynamic strings for them.
In a recent change, several methods (like set_value and xml_text::set) gained support for size_t argument. To what extent adding string_view support just for these, and keeping functions that deal with attribute/element names as they are, would satisfy the need here?
@zeux For me personally, I don't really care about supporting std::string_view
per se because I use custom string_view
anyway. My issue is about using non-0-terminated strings (coming from my code) in pugixml API in general.
Perfect solution for me would be to have optional size_t
parameter for each function consuming const char*
, or any functionally equivalent solution (like having wrapper class, or just having std::string_view)
As mentioned above, certain functions like xml_node::set_value
/ xml_text::set
have overloads that accept size_t
, see https://github.com/zeux/pugixml/pull/490. This is not part of a versioned release yet. Do you have examples of the code for which these APIs do not suffice?
Do you have examples of the code for which these APIs do not suffice?
I have XMLElement::GetChild(name)
, where XMLElement
is a wrapper on top of pugixml that provides utility functions useful in my code.
Then I have virtual Archive::Serialize(name, value)
which may or may not use XML in the implementation.
Both Archive
and XMLElement
are public API in my library.
My issue is that I am effectively forbidden from using string_view
in any API built on top of pugixml. I am forced to use const char*
in my public API because otherwise I cannot work with pugixml hidden deep in the implementation details w/o doing string allocation.
For the wrapper library use case it feels fairly easy to implement GetChild(name) as something like
for (pugi::xml_node child : node)
if (child.name() == name)
return wrap(child);
(I do think that for direct use for people who are used to string_view it would be more ergonomic to support it directly as right now you need to use .data()/.size() and call the overload that supports size which isn’t ideal)
edit: the above doesn’t cover set_name, so that probably does need an extra overload (now part of master).