clap
clap copied to clipboard
[modular] User-driven validation
Please complete the following tasks
- [X] I have searched the discussions
- [X] I have searched the existing issues
Clap Version
master
Describe your use case
- We want to get clap smaller (#1365) and right now validation adds bulk regardless of what parts people use
- We could put parts behind a feature flag but that could end up getting messy
- In https://github.com/clap-rs/clap/discussions/3005 / #3135, a user requested specialized validation logic
Describe the solution you'd like
Define one or more App Validation trait
- Validate: Takes a built
App
andArgMatches
as an input, reports errors - Requires: Takes a built
App
,ArgMatches
, and args as input, reports which args are required -
(maybe) Default value: Takes a built
App
,ArgMatches
, and args as input, reports default values for them
With maybe a hint(&self, app: &App) -> Option<String>
(see #1695 and #3321)
- Should it also accept a
matches: Option<&ArgMatches>
for smart hints?
Steps
- Refactor all requires, conflicts, etc APIs to be implemented in terms of this
- Try to overcome weirdnesses in the
default_value_if
API with its use ofOption
by clarifying intent through builders
- Try to overcome weirdnesses in the
- Expose the trait and structs to the user in a
clap::validate
- Deprecate the old
Arg
functions so now only those using this API pay the cost due to dead code elimination
Alternatives, if applicable
No response
Additional Context
See also https://github.com/clap-rs/clap/discussions/2832
Risks are usability and performance. The most basic APIs, like required
, we probably need to keep baked in. Its the special inter-arg interactions that we should be focusing on.
I'm not quite understanding what the Validate
trait will look like. Is this a project-level trait that will be implemented on a certain Fn / FnMut
, meaning users don't need to worry about it, or will it be provided by a clap_validate
crate, and users need to include that crate and its trait to get App::validator_all
?
My general expectation for traits in this issue and related ones (e.g. https://github.com/clap-rs/clap/issues/2683) is they would live in clap
(we'd need access to them) and we will have an impl for Fn
signatures where they make sense for convinience. We'd work to duplicate clap's validation functionality in structs that implement these traits. These would most likely live in clap, relying on the compiler deleting dead code (this will be easier for a compiler to detect than the current under-used validation code).
All of this would start out behind an unstable-*
feature flag as we work to polish it up. Once we felt good with them, we'd remove the feature flag and deprecate the old functionality. On the next major release, we'd then remove the old functionality, pushing people to use the trait-based version.
This is all contingent on the design working out. We might find it just isn't worth completing this.
Wait, why would clap
need access to the trait?
The trait could just fit into clap
's builder pattern for App
, providing a function for App
which modifies App
.
Then you could call App::error
in the Validator
trait impl. clap
wouldn't need to know the validator exists so long as App
has the right struct members pub
or fn
's to get them.
The proposed validator trait would be called during parse so we have the context of the specific subcommand being processed.
Forgive me for not knowing the internals as well as you, but couldn't the Validator trait, instead of being called on App
, be called on ArgMatches
instead?
Perhaps as app.get_matches().validate()
.
It needs access to the relevant App
for any extra error reporting it wants to do, like
- Checking for additional context for whether validation has failed or not
- Showing usage
- Showing help if we made this replace
ArgsRequiredElseHelp
'
I see. Honestly, as much as I don't like it, perhaps you do have a strong case for pushing App::error
on downstream users, and we're just going to have to learn to live with it. ;-)
The primary motivation behind this is to see if we can make the cost of validation paid by those who use it rather than everyone (ie clap's existing validation logic like conflicts, special require rules, etc). A secondary motivation is to simplify the APIs for the more complex validation APIs in clap today. All other validation use cases would just be a bonus since we do have solutions like App::error
. This might work and be merged; it might fail and we reject this issue.
Hopefully you can avoid my horrible situation of the past week: adding an API (https://github.com/MFEK/glifparser.rlib/commit/6aaf88012709fce8fd21ccf0b61d1074a95307a3), realizing it's broken and trying to fix it (https://github.com/MFEK/glifparser.rlib/commit/d9aa702df074094f1509f071e15e11cc696a4ca3)…twice (https://github.com/MFEK/glifparser.rlib/commit/9b06a9422b796d25ab2b270abfd51500c6e5857b). Maybe it has no edge cases now, we shall see. Popular software can't afford such…unprofessionalism, I'm aware. :)
This would most likely be implemented via
- Refactor the existing implementation to use a
pub(crate) trait
(dog fooding) - Expose it as a
pub trait
behind a feature flag with a stablization process (e.g. https://github.com/clap-rs/clap/issues/2861) (collect public feedback) - Stablize the feature (point of no return ... until next major version)
- On next major version, remove parts of API not needed anymore
What about relying on serde
for custom validation instead? clap
would serialise it, and users who would write custom deserialisation, helped with the helper methods of serde. That would a be more modular than clap
implementing its own trait and visitor
I worry serde
would too much complexity for most cases since a lot of the cases people are interested in validating would require hand-writing serde
traits which can be burdensome while just providing a Vaidator
trait in clap that works on an ArgMatches
would be relatively trivial to implement. No visitors needed.
Not necessarily.
With serde_value, you get a Value
type that represents any Serde "object". These can be coerced into the user's structs because they also implement Deserialize
.
I think @kraktus' idea is a good middle ground.
With serde_value, you get a Value type that represents any Serde "object". These can be coerced into the user's structs because they also implement Deserialize.
I'm not seeing how that addressed the concern I raised. I'm assuming there is context missing that I could make guesses at but I would prefer to understand what you are getting at then understanding my assumptions :)
The other problem with serde is it assumes Paths are UTF-8 so naive handling of paths will be done incorrectly while clap is wanting to ensure paths are default handled as a bag of bytes rather than utf-8
I was thinking that you'd have Deserialize
on the whole ArgMatches
. It's already normal for me to have an Args
struct. If ArgMatches
could become a serde::Value
then it could become also an Args
, right?
How does users writing custom validation fit into that picture? If the user just wants to do a field-level validator, they can use the field attributes to provide their own custom deserializer. If they want to validate the entire struct, which is what this issue is focusing on, then they would need to hand write the entire Deserialize
trait which is not pretty to do by hand.
Compare that to just passing in a value that implements Validator
as described earlier. That is a large burden for the user to do one vs the other and so doesn't seem a viable way of solving this problem. We also have a custom derive that offers a lot of domain-specific value that can't be had with clap, which lowers the value add of going down this route for custom validation.
That said, it would be interesting for people to support serde support for ArgMatches
but I think that would best start out as an external crate so it gives people room to experiment and try out ideas more to see what works or doesn't before pulling it into clap. This will require a newtype due to the orphan rules. Some more features will be needed from ArgMatches
which I would like to provide but they are blocked on some v4 changes because of difficulties with our internal Id
type.
My problem was I was only considering my patch #3029 and not that you wanted to make every case work.
To refresh memories, my patch was focused on validating a single struct field that contains multiple values.
So I could use a Newtype and impl FromStr
.
But you're right this won't let you validate the whole struct.
Erm, or I could impl From<Vec<…>> for Newtype
.
Then when I call values_of
/ values_of_os
...I could quickly make the returned Vec into my Newtype.
I hadn't considered the other cases, my bad, I forgot this issue was not just about #3029
Where is work being done on this? Is there more discussion that must be had first?
For myself, I have tasks on other projects that I feel are higher priority on this. Someone else could do some exploratory work on this but the challenge is this is very open-ended work and someone working on it would need to keep us in the loop on decisions to make sure there is buy-in on the direction being taken.
It seems like what this is proposing is a validator trait on functions which can take a validator to make new validator. Then use Combinatory_logic to define all the validators. How is this proposed approach different from copying bpaf?
How is this proposed approach different from copying bpaf?
In bpaf
you would compose parsers instead of validators, see this article for a general idea: https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/
Disclaimer, I'm not an expert in clap
, but I can try to give some examples according some hopefully idiomatic existing code.
cargo-add
needs user to specify the source, so it defines https://github.com/rust-lang/cargo/blob/5b377cece0e0dd0af686cf53ce4637d5d85c2a10/src/bin/cargo/commands/add.rs#L85 a bunch of arguments with some validation - "tag" requires "git", "path" conflicts with "git", etc.
Values later extracted into variables https://github.com/rust-lang/cargo/blob/5b377cece0e0dd0af686cf53ce4637d5d85c2a10/src/bin/cargo/commands/add.rs#L226
And consumed with https://github.com/rust-lang/cargo/blob/5b377cece0e0dd0af686cf53ce4637d5d85c2a10/src/bin/cargo/commands/add.rs#L259
Biggest problem with that is that user code (parse_dependencies
) relies on validated invariants, not parsed ones.
With bpaf you would have those values constructed either by derive or construct!
macro.
struct Options {
...
source: Input,
...
}
enum Ref {
Branch { branch: String },
Tag { tag: String },
Rev { rev: String },
}
enum Input {
Crate { name: String },
File { path: PathBuf },
Git { uri: String, ref: Option<Ref> },
}
With parsers and validators operating on strictly typed Rust values.
For hypothetical input validator (let's say against crate blacklist) clap
version would have to deal with the same steps - getting fields, assuming some invariants been vitiated earlier (unreachable!("clap should ensure we have some source selected");
) and validating them. With bpaf
validator would get a val: &Input
or val: &Ref
depending on where it is placed.
Now say you want to add one extra input type, let's call it "interactive picker".
With clap you modify parser definition and then you have to go over all the places where ArgMatches
are used and update them, otherwise things will explode on runtime with unreachable!("clap should ensure we have some source selected");
With bpaf
compiler will tell you all the places where match val
needs to be updated to handle one more branch.
This becomes a problem when argument parser or parts of it are shared across multiple binaries, but even within a single crate it is easier to work with the right type
enum OutputFormat {
Intel,
Att,
Llvm,
LlvmInput,
Mir,
Wasm,
McaIntel,
McaAtt,
}
rather than trying to emulate it using product types:
struct OutputFormat {
intel: bool,
att: bool,
llvm: bool,
...
}