getargs icon indicating copy to clipboard operation
getargs copied to clipboard

RFC: A new approach to OsStr support

Open EliteTK opened this issue 1 year ago • 5 comments

I would like to preface this by saying that I am happy to just create a fork of getargs with my ideas and leave this project be.

I have prepared a set of patches of progressive improvements which culminate in the addition of &OsStr as Argument support.

I think more tests would be useful to have etc. But this was mainly a proof of concept for myself.

Additionally the patch set adds the feature of not negative numbers as options taken from argparse in python. This is optional.

Really, I am mostly interested in comments and feedback from others who have worked on this project as I think they may be best suited to evaluate the commits.

I am curious as to what I could have done better and if this could ever be merged.

Edit: It's also worth mentioning that this change does introduce some minor (10%-50%) performance regressions. For me this is acceptable given the performance is still above and beyond anything which I would ever be concerned by. But I would be curious to know if this can be improved on.

EliteTK avatar Oct 02 '24 00:10 EliteTK

Interesting! I have thought about changing the API before, mostly thinking about refactoring the code to mimic what regex does: have separate types for bytes vs strings. For example, Options<&str, I> becomes getargs::Options<I>, and Options<&OsStr, I> becomes getargs::os_str::Options<I>. This would be less confusing since it removes a generic parameter.

Also, currently, the Result type is really weird, since it's not Result<Success, Error>, it's Result<Argument, Success>. This could also be improved with the above refactoring.

I haven't looked at your code yet, since I'm currently quite busy with university, but I'll try to get to it soon. Thanks for your contribution!!

j-tai avatar Oct 02 '24 14:10 j-tai

Interesting! I have thought about changing the API before, mostly thinking about refactoring the code to mimic what regex does: have separate types for bytes vs strings. For example, Options<&str, I> becomes getargs::Options<I>, and Options<&OsStr, I> becomes getargs::os_str::Options<I>. This would be less confusing since it removes a generic parameter.

This is an interesting idea. I will try to look into incorporating it as an alternative branch at some point in the near future.

Also, currently, the Result type is really weird, since it's not Result<Success, Error>, it's Result<Argument, Success>. This could also be improved with the above refactoring.

Yes I noticed this, but after working with the code a bit I realised that neither order made much sense. But your suggestion above would make this easier.

I haven't looked at your code yet, since I'm currently quite busy with university, but I'll try to get to it soon. Thanks for your contribution!!

No worries, take your time. I will add comments if I make any changes/updates but I do not expect you to review anything. I really published this more as a discussion piece at this point in time.

EliteTK avatar Oct 02 '24 23:10 EliteTK

Edit: It's also worth mentioning that this change does introduce some minor (10%-50%) performance regressions. For me this is acceptable given the performance is still above and beyond anything which I would ever be concerned by. But I would be curious to know if this can be improved on.

I ran valgrind on an epaper tablet to optimize the ever living fuck out of this crate. It is possible to make the improvements you've mentioned if you give up that goal of performance.

Also to address a few of your commit messages,

ae2bb9e53dec553476976e7b2653f57e9b32dd21: I'll give you the second one, 'expect' is indeed better. The first one was due to an assumption that others would also perceive 'option requires a value' as simply the enum variant rather than prose. I was coding for myself at the time and wasn't thinking about serving other neurotypes as users

5d3f623f6b7fc7981a6c0a0e273000982cbec5b8: starts delving more into behaviors that I would never personally need but that may be useful to serve users who think differently.

As for the rest, especially 574e253d834c7db356b9606134b4794bc80fe18b,

This breaks certain fundamental assertions of the crate such as not ever requiring or assuming UTF-8. For me those assertions are important, but they may not be as important for everyone.

Thank you for putting together this PR, it is well made.

LoganDark avatar May 21 '25 20:05 LoganDark

5d3f623 is a bit weird of a feature, but I think it might make more sense if I confess that I was trying to port some code from python's argparse and discovered that my original code was relying on this specific weird feature.

As for the rest, especially 574e253,

This breaks certain fundamental assertions of the crate such as not ever requiring or assuming UTF-8. For me those assertions are important, but they may not be as important for everyone.

So I think it's a fair complaint, for me it was a trade-off worth having. Although it's still worth noting that the patches never prevent end-users of the application from passing arbitrary byte sequences as option values or positional arguments. Only the option characters/names are forced into UTF-8.

Thank you for the comments though, I appreciate the review.

EliteTK avatar May 21 '25 20:05 EliteTK

Ah, trying to use Rust to replace Python is an interesting use-case that I definitely did not consider. That is fair.

LoganDark avatar May 21 '25 21:05 LoganDark