imgui
imgui copied to clipboard
ImStrv: Use a string span for consuming strings in API
EDIT Moved to features/string_view branch
WE EXPECT TO MERGE THIS IN 2021 BUT ARE WAITING FOR (1) NEW BINDING GENERATOR (2) INVESTIGATING SWITCHING TO STB_PRINTF IN ORDER TO SUPPORT NON-ZERO TERMINATED FORMATS
This PR implements a custom ImStr
string span class and uses it in the API instead of const char* text
and const char* text, const char* text_end
. PR is a spiritual successor of #683.
Rationale: this change facilitates use of Dear ImGui from languages that do not zero-terminate their strings. So basically rust. Even though rust does have zero-terminated c strings - they are not default and therefore transparent interop between rust strings and this library would require string copying.
Things to note:
- Variadic functions are not covered. Use of custom class breaks static format string analysis of GCC. Ideally these variadic functions should get non-variadic counterparts and user should do formatting on their end (like
Text()
vsTestUnformatted()
. - Most functions from
imgui_internal.h
that are not expected to be used by vast majority of users are not covered. - Some functions from
imgui_internal.h
that takeconst char* format
are not covered for performance reasons. - Some functions taking arrays of
const char*
or callbacks are not covered. -
Input*()
,Sider*()
andScalar*()
functions are covered, but that limits size of format parameter to 63 bytes. Format is copied to a buffer on stack because internal functions expect it to be zero-terminated. - Bunch of functions taking
const char* text, const char* text_end
were obsoleted, becauseImStr
version does exactly same. -
ImStr
is implicitly constructible fromconst char*
. -
ImStr
is implicitly convertible to bool, yieldsfalse
whenImStr
has no string attached. Yieldstrue
if any string is attached, even if string length is 0. This is essentially same as checking if string pointer is null.
This looks awesome, we also use IMGUI in Rust where we have string slices instead of null-terminated string pointers. And we also use IMGUI inside a WASM sandbox where we explicitly have to transfer the exact right memory between WASM and the host and where string slices with explicit length are a lot easier, faster, and safer than loose null-terminated string pointers where one have to dynamically determine the length of them for the marshalling.
So would love to see this integrated in to core IMGUI!
Good to hear! Would you care to try this branch with your application? This PR may have issues or things missing. Real world experience would be invaluable for finishing this PR and merging it.
Unfortunately it would be a bit tricky for us to try it I think because we use IMGUI through cimgui
through imgui-rs
which we have our own WASM fork of. So quite a few layers of indirection.
But @Jasper-Bekkers and @MaikKlein in our team here at Embark have more experience with how that works and may be able to give feedback about this PR
No need to immediately act upon that in PR, but as C++20 is aiming at inconveniently making UTF-8 literal u8"some text"
not be const char*
anymore but const char8_t*
, adding the extra constructor taking const char8_t*
would be an added incentive for adopting this.
Another one is that it would facilitate a path where wchar_t could be supported as the default compile-time target, to support e.g. Unreal Engine without utf-8 > wchar_t conversions.
@ocornut Is this PR something you'd consider merge in in the near future?
Hello,
Not super “near” future as I don’t have resources to review/tweak it thoroughly the way I’d like (considering concerns such as cognitive and debug overhead for c++ users) but I imagine we should be able to tackle it in 2020, which is the reason I asked Rokas to revive the old PR.
Real world feedback would also be helpful.
If there’s traction, one possibility is to first merge the changed function signatures which constitute the bulk of the diff, so the remaining diff gets easier to update for as long as this is not fulled solved.
Getting the updated function signatures in would already help us quite a bit. Having strings the way they are now has had quite an impact on imgui-s for us so investigating changes there would also take some time.
Since the PR only mentions Rust, I just wanted to mention that this would also really benefit my usecase of using imgui from D.
"Real world feedback would also be helpful." I did not go trough all the changes, but I think it can make imgui usable with C++17 and later which are now more and more used. I found this PR when I was considering if we can use imgui for our next project or not. So my question is if it will be merged any time soon, or we should seach for another library? Thanks in advance for the answer.
This would also be helpful for the Java binding I maintained.
(moved message from #494, sane to resume talking in the newer issue)
I have a pushed an updated features/string_view
branch https://github.com/ocornut/imgui/tree/features/string_view based over latest. Some changes over Rokups's recent is that I've enforced the end pointer to be valid if the begin pointer is valid. This allows us to remove late validation code, we now just expect that both pointers are set together by the constructor.
That gives the extra benefit that with the single-param constructor: literals which are frequent with raw C/C++ usage can now using compile-times strlen.
That's over the general benefit that pulling data from another language or string type which has constant time access to length can now avoid the strlen() alltogether - that's another benefit of this branch.
Some quick measurement done with our testing framework gives it a nice edge in Release build, and Debug build (with all VS security checks) gave it a minor 2-3% slow-down. This is before I made some other tweaks which perhaps made it better. Will do more thorough measurement later.
Things I would like to be able to move forward:
- [ ] 1. general testing + thorough performance measurements (Rokups and I will work on that)
- [ ] 2. looking at this from the pov public/internal api breaking changes. current version is designed to minimize those (to confirm?) but I believe some of the internal api would be saner to break. (Rokups and I can work on that, but any feedback welcome)
- [ ] 3. reestablish a working proof of concept for Rust users and try to have 1-3 people toying with it even quickly.
- [ ] 4. figure out how to integrate this with cimgui
Point (3) will need involved Rust users.
Point (4) is interesting to discuss: @bitshifter old cimgui fork replaced all cimgui api with ImStr: https://github.com/cimgui/cimgui/compare/master...bitshifter:imstr I do believe this would however be problematic for actual C user. We could have a compile-time option for cimgui to use const char* or ImStr but my gut feeling is that may be create hurdle with people pulling from package manager and not building themselves?
Otherwise cimgui could provide two api with a different prefix. One e.g. igButton()
would take const char* + perform the strlen() itself, one e.g. igsButton()
would take ImStr or simply two const char*. Since those functions are super thin wrapper I don't feel it would be problematic for main cimgui API. But this scheme would need to be adjusted for non ImGui:: functions a swell: ImDrawList_AddText
, `
Either way this needs a solution. As a general feeling I believe that only direct C users would want to use const char*
and every indirect users (LUA, C#, Rust, language bindings) would be better off with ImStr.
It's too late for 1.80 but tentatively if the right things are in place we could aim for it by 1.81 or 1.82. The branch being in place it would be ideal to start now working on nailing 3/4 so we have some real world feedback and then by the time we are done with 1.80 we can look at that.
It has been a long time since I looked at this. It looks like cimgui
is now autogenerated rather than hand written like it was when I forked it. So it will be interesting to see if that "just works" . On the Rust side of things, I think either a #[repr(C)]
ImStr
struct or a start and end const char*
pointers could work, most likely it could be hidden in the imgui-rs
FFI layer. I think the reason I went with ImStr
was it was less changes than adding start and end pointer parameters everywhere and a bit easier to see where my changes were. At the time there were a few dear imgui methods that already took an end pointer which would have made it a bit harder to keep track of what I'd changed.
As a general feeling I believe that only direct C users would want to use
const char*
and every indirect users (LUA, C#, Rust, language bindings) would be better off with ImStr.
In LuaJIT-ImGui I would prefer
ig.Text("some_text")
over
ig.Text(ImStr("some_text"))
In LuaJIT-ImGui I would prefer
Yes of course. From pretty much any language you would do Text("hello") and the bindings would convert that to the ImStr function implicitly/automatically or explicitly (preferable as it can leverage knowing string length without a strlen call).
In cimgui ImStr would just be
typedef struct ImStr ImStr;
struct ImStr
{
const char* b, *e;
};
In LuaJIT
local ffi = require"ffi"
-- C definition
ffi.cdef[[
typedef struct ImStr ImStr;
struct ImStr
{
const char* b, *e;
};
]]
-- LuaJIT wrapping
local ImStr= {}
function ImStr.__new(ctype,a,b)
b = b or ffi.new("const char*",a) + #a
return ffi.new(ctype,a,b)
end
function ImStr.__tostring(is)
return ffi.string(is.b,is.e~=nil and is.e-is.b or nil)
end
ImStr.__index = ImStr
ImStr = ffi.metatype("ImStr",ImStr)
-- use it
local is = ImStr("hello")
print(is,is.b,is.e)
would print hello cdata<const char *>: 0x002b7a1c cdata<const char *>: 0x002b7a21
And the burden would be in all cimgui functions taking one or more ImStr: When an ImStr typed argument is needed it will have to be wrapped inside ImStr()
now is
function M.TextUnformatted(text,text_end)
text_end = text_end or nil
return lib.igTextUnformatted(text,text_end)
end
after it will be
function M.TextUnformatted(text)
return lib.igTextUnformatted(ImStr(text))
end
using cimgui from C could be even more difficult.
ImStr str;
str.b = "hello";
str.e = str.b + strlen(str.b);
igTextUnformatted(str);
ImStr str; str.b = "hello"; str.e = str.b + strlen(str.b); igTextUnformatted(str);
This strikes me as the wrong way to surface ImStr
in cimgui.
I would expect cimgui to generate ImStr
and non-ImStr
versions. The average C user would use the non-ImStr versions and bindings from languages which naturally use ImStr
-like string representation would use those.
For example, I'd expect something like this from cimgui for SmallButton
:
CIMGUI_API bool igSmallButton(const char* label)
{
return ImGui::SmallButton(ImStr(label));
}
CIMGUI_API bool igSmallButton_ImStr(ImStr label)
{
return ImGui::SmallButton(label);
}
A Lua binding might not even surface the _ImStr
variants, whereas a binding for .NET (where strings are length-prefixed) would likely use them under the hood exclusively.
@PathogenDavid I agree 100% I opened a thread in cimgui repo to discuss this https://github.com/cimgui/cimgui/issues/175
@ocornut @PathogenDavid I mostly agree. Only a few naming issues.
Take for example in imgui.h
IMGUI_API void PushID(ImStr str_id);
IMGUI_API void PushID(const char* str_id_begin, const char* str_id_end = NULL);
IMGUI_API void PushID(const void* ptr_id);
IMGUI_API void PushID(int int_id);
That would be in cimgui.h
void igPushIDStr(ImStr str_id);
void igPushIDChptChpt(const char* str_id_begin,const char* str_id_end);
void igPushIDPtr(const void* ptr_id);
void igPushIDInt(int int_id);
and the overloaded function taking ImStr would trig generation of another function
void igPushIDChpt(const char *str_id)
{
ImGui::PushID(str_id); //will use implicit conversion
}
That would behave as the old igPushIDStr
@rokups I think that this should be tested with the changes in this PR. I am not sure how this should be done: your brach https://github.com/rokups/imgui/tree/rk/ImStr is 4 commits ahead, 300 commits behind ocornut:master. Is this a problem?
The PR has been moved to this repository, features/string_view
It is up to date
Done pull imgui/features/string_view and generation after modification avoiding name clash in overloadings (char* was Str now is Chpt, ImStr is now Str) in https://github.com/cimgui/cimgui/tree/string_view
soon will work on making const char *
versions
@ocornut Just finished and examples working but I have a question Is there any reason for using const in Button but not in SmallButton
IMGUI_API bool Button(const ImStr label, const ImVec2& size = ImVec2(0, 0)); // button
IMGUI_API bool SmallButton(ImStr label); // button with
While working on LuaJIT wrapping I realized that ImStr to const char* susbtitution would clash with already present substitutions on imgui. So when the signature resulting from the substitution is already present in imgui the const char*
version is not added now.
The place where this substitution is made has changed: now the new signature is added before computation of overloading names so that naming conventions have changed (updated in the message above).
Is there any reason for using const in Button but not in SmallButton
It's a bug! Will patch/repush the branch now.
Let's try to discuss the rest in https://github.com/cimgui/cimgui/issues/175
Rebased this branch on master and renamed ImStr
to ImStrv
in history, so we have a basic type that convey "string view" (lack of ownership) better.
FYI this is maintained as features/string_view branch and we are starting to push users to give it a try (@emoon is trying to write a new backend).
One thing that annoys me in our ImStrv branch is we are forced to copy and zero-terminate format-string because we are relying on printf functions. So currently functions like SliderScalar(), DragScalar() etc need to do that:
const char* format;
char format_buf[64];
if (!format_p)
{
// Default format string when passing NULL
format = DataTypeGetInfo(data_type)->PrintFmt;
}
else
{
// Copy format string (format may not be zero-terminated)
format = format_buf;
IM_ASSERT(format_p.End - format_p.Begin < IM_ARRAYSIZE(format_buf));
ImStrncpy(format_buf, format_p, IM_ARRAYSIZE(format_buf));
}
And Text(), TreeNode() etc don't yet support format string as non-zero terminated for the same reason.
It's particularly annoying because many format strings would ALREADY be zero-terminated but we cannot assume it (and I don't think it would be legal to read at the *format.end
address, probably wouldn't crash but would trigger asan, etc.)
So I think we'd need a printf implementation which support format as a non-zero terminated string-view.... May look into stb_printf..... It doesn't support it but we could envision making that change.
Any other idea?
(Embedding our own printf implementation would also standardize localisation behaviors such as #2278)
printf("%.*s", length, string);
works (though will end the string at nulls inside the non-null-terminated string).
@hrydgard I am talking about the format string itself not being null-terminated.
printf("%.*s", length, string);
works (though will end the string at nulls inside the non-null-terminated string).
This doesn't help: printing a string without termination is not the problem, having the Format-String itself not being terminated is the problem.
Doh, right :/ Yeah, will have to go for a custom implementation indeed.