CAIP-358: Universal Payment Request Method

Open lukaisailovic opened this issue 7 months ago • 20 comments

This CAIP standardizes a wallet_pay JSON-RPC method that enables one-click cryptocurrency payments across wallets and dapps. It lets merchants specify multiple payment options (including cross-chain assets) in a single request, so the wallet can automatically select the best payment method based on the user's available assets. This eliminates the current multi-step payment flow (select token → select chain → generate address → manual transfer) by moving payment choice and execution to the wallet, reducing friction from 4-6 clicks to a single interaction.
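
As a rough illustration of the wallet-side selection described above, here is a minimal TypeScript sketch. The field names (orderId, acceptedPayments, recipient, asset, amount, expiry) follow the example payload quoted later in this thread. The selectPayment helper, the balance map keyed by CAIP-19 asset ID, and the "first affordable option wins" rule are assumptions for illustration, not part of the proposal; a real wallet would likely also weigh fees, bridging cost, and user preference.

```typescript
interface PaymentOption {
  recipient: string; // CAIP-10 account ID
  asset: string; // CAIP-19 asset ID
  amount: string; // hex-encoded amount in the asset's smallest unit
}

interface PayRequest {
  orderId: string;
  acceptedPayments: PaymentOption[];
  expiry: number; // Unix timestamp in seconds
}

// Hypothetical helper: returns the first option the user can afford, or
// undefined if the request has expired or no option is covered.
function selectPayment(
  request: PayRequest,
  balances: Map<string, bigint>, // keyed by CAIP-19 asset ID (assumption)
  nowSeconds: number,
): PaymentOption | undefined {
  if (nowSeconds >= request.expiry) return undefined;
  return request.acceptedPayments.find((option) => {
    const balance = balances.get(option.asset);
    // BigInt() parses "0x"-prefixed hex strings, so large amounts stay exact.
    return balance !== undefined && balance >= BigInt(option.amount);
  });
}
```

With this shape, the dapp only lists what it accepts; everything about which asset actually gets spent stays inside the wallet.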

lukaisailovic avatar May 26 '25 16:05 lukaisailovic

This seems quite focused on URL specifics for mobile app wallets. We likely need to address both mobile and desktop browser implementations too. For example, within the Brave browser on mobile, the user can interact with the wallet through the same interface that an extension uses on desktop. If we only focus on the QR code scanning interaction, we'll lose support for this when people are on a desktop browser (or, in limited circumstances, browsing in a wallet's WebView or relying on Brave's mobile wallet implementation).

This is one of the reasons I was looking towards a JSON-RPC that can be included in a specific script, in a way that gives it its own chain-agnostic wallet object.

For example, window.wallet() is the script I had in mind, because the intention is to migrate away from chain-specific logic. I would also expect this to occur before a CAIP-25 session setup flow, because a session isn't needed to submit the request. A session should only be needed when permission is necessary for lower-level RPC interactions.

kdenhartog avatar May 26 '25 22:05 kdenhartog

Also, as a side note, would you prefer that I publish my draft document or offer a PR onto yours so that we can account for it? We'll likely need both interactions to seamlessly make this work across platforms and interactions.

kdenhartog avatar May 26 '25 22:05 kdenhartog

Would it be a good idea to align the API to PaymentRequest#details? There are some properties in there which may be nice for this API too.

jxom avatar May 27 '25 05:05 jxom

@jxom Great idea! Makes a lot of sense IMO

lukaisailovic avatar May 27 '25 07:05 lukaisailovic

@kdenhartog I don't think this only focuses on the QR code scan. It defines the RPC but not the method of transfer. It can be via the QR Code (WalletConnect) or via the provider. I think it can work quite well for the browser extension wallets.

I'll create a discussion and we can follow up there

lukaisailovic avatar May 27 '25 07:05 lukaisailovic

@kdenhartog Discussion: https://github.com/ChainAgnostic/CAIPs/discussions/359

lukaisailovic avatar May 27 '25 07:05 lukaisailovic

Would it be a good idea to align the API to PaymentRequest#details? There are some properties in there which may be nice for this API too.

What does this mean, concretely? Does this mean renaming orderId to details.id, or just mentioning the latter in the definition of the former so people know they can pass the former in place of the latter? I'm all for using the W3C precedent, but I am curious how far y'all are thinking of "aligning" the format. It might be good to tag the other Yoav Weiss or someone else from Shopify's web/devrel team, as they might have deep insight into how that API gets used in the wild in "web2 ecommerce" that could inform this minimalist implementation, if you take the alignment far enough to be compliant!

bumblefudge avatar Jun 02 '25 16:06 bumblefudge

What does this mean, concretely?

It means exposing the interface (and possibly a lot of the spec itself!) to Wallets, and not just Browsers. Deduping and not reinventing too much would be ideal, as I see this spec having nearly identical intention.

jxom avatar Jun 02 '25 16:06 jxom

It means exposing the interface (and possibly a lot of the spec itself!) to Wallets, and not just Browsers.

Am I understanding properly that you want to polyfill on this? I think if we do this we're going to walk back into the 6963 problem, where multiple extensions will polyfill on top of this. This is one of the issues that has plagued passkeys too: multiple providers want to participate in the flow (extensions/browsers/OS).

While I'm a fan of mapping the logic to what they've already got, I'm a bit concerned about reintroducing the issues we just solved. There is also a possibility that a browser simply prevents overrides of the scripts it injects into the page, so we may want to be careful here so that wallets' participation in this flow doesn't get consumed by the underlying browser implementation. Brave specifically has taken the stance that we won't do this, and we have settings that allow the user to override it. However, after hearing what happened during the first go with the PaymentRequest API (plus hardly ever seeing it used), I'm cautious about how others may participate, since core business revenues are often built on capturing these flows and the data passed in them.

Having laid out all my cautions, I will say I'm actually quite supportive of this direction if we can figure out how to make it work. This seems like a good way to do adversarial interop on it and pick up adoption from the limited use of it so far.

kdenhartog avatar Jun 04 '25 04:06 kdenhartog

No, I purely mean just aligning the interface that can be used in the wallet_pay JSON-RPC request (utilizing 6963 😉).

jxom avatar Jun 04 '25 04:06 jxom

No, I purely mean just aligning the interface that can be used in the wallet_pay JSON-RPC request (utilizing 6963 😉).

To start, I agree this makes the most sense. You're going to hate me here though... but I want to expose this only with the 282 work so that we have an actually chain agnostic approach here. For example, it would be odd to call the following through window.ethereum since there's no certainty that the wallet can send to Solana:

{
  "orderId": "order-123456",
  "acceptedPayments": [
    {
      "recipient": "solana:5eykt4UsFv8P8NJdTREpY1vzqKqZKvdp:9WzDXwBbmkg8ZTbNMqUxvQRAyrZzDsGYdLVL9zYtAWWM",
      "asset": "solana:5eykt4UsFv8P8NJdTREpY1vzqKqZKvdp/slip44:501",
      "amount": "0x6F05B59D3B20000"
    }
  ],
  "expiry": 1709593200
}

The reason I'm suggesting that is that we now have window.ethereum and window.solana, plus window.injectedWeb3 (Polkadot) and window.cardano, and I think someone proposed something like window.bitcoin a while back too. I expect Phantom and MetaMask will encounter the same problem now that they both support window.ethereum and window.solana.
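
To make the mismatch concrete, here is a small TypeScript sketch that checks which payment options a namespace-specific provider could plausibly serve. It relies only on the fact that CAIP-10 and CAIP-19 identifiers start with a CAIP-2 namespace followed by a colon. The helper names caipNamespace and optionsForNamespace are hypothetical, as is the idea that a provider advertises exactly one namespace (for example "eip155" for window.ethereum).

```typescript
interface PaymentOption {
  recipient: string; // CAIP-10 account ID
  asset: string; // CAIP-19 asset ID
  amount: string; // hex-encoded amount
}

// Returns the namespace, i.e. everything before the first colon.
function caipNamespace(id: string): string {
  const separator = id.indexOf(":");
  if (separator <= 0) throw new Error(`Not a CAIP identifier: ${id}`);
  return id.slice(0, separator);
}

// Hypothetical helper: keeps only the options whose asset lives in the
// namespace that a chain-specific provider is assumed to serve.
function optionsForNamespace(
  options: PaymentOption[],
  namespace: string,
): PaymentOption[] {
  return options.filter((option) => caipNamespace(option.asset) === namespace);
}
```

For the Solana-only payload above, filtering by "eip155" leaves nothing, which is the situation a chain-agnostic entry point is meant to avoid.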

kdenhartog avatar Jun 05 '25 09:06 kdenhartog

To start, I agree this makes the most sense. You're going to hate me here though... but I want to expose this only with the https://github.com/ChainAgnostic/CAIPs/pull/282/

It's good that we agree there! I was only concerned about the interface part, not the transport part. From my understanding, this spec outlines a JSON-RPC format and the behavior of it, but not the underlying transport itself. Is it right to assume that this can be plugged into any JSON-RPC compatible transport? Whether it be EIP-6963, CAIP-282, window.ethereum, window.solana, window.mysickwallet, etc? Seems like the most backwards-compatible approach. I am not disagreeing with you, just want to understand the scope of this spec.

jxom avatar Jun 05 '25 10:06 jxom

To start, I agree this makes the most sense. You're going to hate me here though... but I want to expose this only with the https://github.com/ChainAgnostic/CAIPs/pull/282/

It's good that we agree there! I was only concerned about the interface part, not the transport part. From my understanding, this spec outlines a JSON-RPC format and the behavior of it, but not the underlying transport itself. Is it right to assume that this can be plugged into any JSON-RPC compatible transport? Whether it be EIP-6963, CAIP-282, window.ethereum, window.solana, window.mysickwallet, etc? Seems like the most backwards-compatible approach.

I am not disagreeing with you, just want to understand the scope of this spec.

@jxom You're exactly right. The spec intentionally doesn't define the transport layer, and any one of those you mentioned can be used.
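
A minimal TypeScript sketch of what "any JSON-RPC compatible transport" could look like from the dapp side. The JsonRpcTransport interface mirrors the request({ method, params }) shape used by EIP-1193 providers. The sendWalletPay helper is hypothetical, and wrapping the payment object in a one-element params array is an assumption; the thread does not show the exact params layout.

```typescript
interface RequestArguments {
  method: string;
  params?: unknown;
}

// Anything with an EIP-1193-style request method can act as the transport:
// an injected provider, a WalletConnect session, and so on.
interface JsonRpcTransport {
  request(args: RequestArguments): Promise<unknown>;
}

interface PayRequest {
  orderId: string;
  acceptedPayments: { recipient: string; asset: string; amount: string }[];
  expiry: number; // Unix timestamp in seconds
}

// Hypothetical helper: the call is identical regardless of how the
// transport delivers it to the wallet.
function sendWalletPay(
  transport: JsonRpcTransport,
  payment: PayRequest,
): Promise<unknown> {
  // Assumption: the payment object is passed as the single params entry.
  return transport.request({ method: "wallet_pay", params: [payment] });
}
```

Because the helper depends only on the transport interface, swapping window.ethereum for another provider would not change the calling code.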

lukaisailovic avatar Jun 05 '25 10:06 lukaisailovic

@jxom that's a great clarifying question. I hadn't reached that conclusion until we talked it through further in the separate discussion. OK, cool, sounds like we're all on the same page then, because that was what I was expecting too.

I think the other thing that might need to be defined is how the RPC request data can be encoded into a URI that can be fired and forgotten. The recipient would then poll the chain for data around it, or potentially make a callback to a webhook or something. This way we can make payment requests with the same format over QR codes/NFC/deeplinks/etc.
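
To illustrate the idea only, here is a TypeScript sketch of a round trip between a payment request and a URI. Everything about the URI layout is hypothetical: the "walletpay-example" scheme, the "request" query parameter, and the choice of percent-encoded JSON are placeholders, since no encoding has been defined in this thread.

```typescript
interface PayRequest {
  orderId: string;
  acceptedPayments: { recipient: string; asset: string; amount: string }[];
  expiry: number; // Unix timestamp in seconds
}

// Hypothetical scheme name, used purely for illustration.
const EXAMPLE_SCHEME = "walletpay-example";
const EXAMPLE_PREFIX = `${EXAMPLE_SCHEME}:?request=`;

// Serializes the request as percent-encoded JSON inside a single URI.
function encodePayUri(request: PayRequest): string {
  return EXAMPLE_PREFIX + encodeURIComponent(JSON.stringify(request));
}

// Reverses encodePayUri; throws if the URI does not use the example layout.
function decodePayUri(uri: string): PayRequest {
  if (!uri.startsWith(EXAMPLE_PREFIX)) {
    throw new Error("Unrecognized payment URI");
  }
  const json = decodeURIComponent(uri.slice(EXAMPLE_PREFIX.length));
  return JSON.parse(json) as PayRequest;
}
```

A string like this could be placed in a QR code or deeplink, but the sketch says nothing about how the merchant would learn the outcome, which is the backend polling or webhook cost raised in the reply below.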

The question I'd have for you @lukaisailovic is would this be better defined within the scope of this spec or better done as a separate spec?

kdenhartog avatar Jun 05 '25 11:06 kdenhartog

Also, I can open a PR onto lukaisailovic:wallet-pay-caip for security/privacy considerations hopefully next week.

kdenhartog avatar Jun 05 '25 11:06 kdenhartog

@kdenhartog URI encoding + webhook is significantly more complex, primarily for dapps, which would now need a backend to check payment status, but for the wallet as well. I specifically wanted to avoid that approach, as this one is a much lower lift to implement. I'd prefer to leave it out of this spec; if someone wants to do that, they can create a new spec and define exactly how it should be handled. Additionally, the same method can be transmitted via NFC if someone defines that as a universal transport for RPCs.

TL;DR: a separate spec for general transport is better IMO. No need to couple things unnecessarily.

lukaisailovic avatar Jun 05 '25 11:06 lukaisailovic

@lukaisailovic nice proposal! curious if you've considered a path where this use case could be handled as elegantly using EIP-5792. happy to provide more feedback on this API specifically, but i just wonder if we'd have more impact overall by leveraging momentum the space already has with wallet_sendCalls. as we've seen it takes a lot to coordinate all the necessary players 😅.

for reference, Coinbase is also working on this feature: https://docs.base.org/identity/smart-wallet/guides/profiles#profiles , which uses EIP-5792 and could serve things like express checkout well. so i'm curious if there's some middle ground to be found here

lukasrosario avatar Jun 05 '25 17:06 lukasrosario

@lukasrosario EIP-5792 is EVM-specific in its design, whereas this is intended to be a chain-agnostic design. Additionally, EIP-5792 is a low-level RPC call that presumes a DApp has requested access to the account address and will format the transaction. This takes a different approach, aiming to be a high-level RPC API that doesn't require a connection for the page to engage with the RPC.

The advantage here is that the wallet can receive payment requests without a connection request in the first place. The site simply detects the wallet via a channel like EIP-6963/CAIP-282 and submits a request, and the wallet handles the rest. This also reduces complexity for the site developer, who doesn't need to understand how to formulate a transaction specific to the blockchain in use or being bridged to.

With that in mind, I'd expect the wallet may use EIP-5792 transaction bundles when interacting with EVM chains rather than discrete transactions in order to avoid high latency, but this is an implementation choice of each wallet so would remain out of scope.

kdenhartog avatar Jun 06 '25 00:06 kdenhartog

Additionally, EIP-5792 is a low level RPC call that presumes a DApp has requested access to the account address and will format the transaction. This takes a different approach aiming for it to be a high level RPC API that doesn't require a connection for the page to engage with the RPC.

FWIW, ERC-5792 does not require a connection either (from is optional).

jxom avatar Jun 06 '25 06:06 jxom

TIL, thanks for the correction @jxom. In any case, I do think it still introduces too much complexity to the site for this specific case, because they need to know which protocols to interact with and their calldata formats, but in general I do think it fits in well as a powerful low level primitive that sits next to this.

kdenhartog avatar Jun 07 '25 00:06 kdenhartog

From today's WC UX Council meeting discussion of CAIP-358:

Side note: I wonder if we should have hints in the response to let the dapps know what to expect next. For example: wait for confirmation, wait for signing, tx confirmed, etc.

Should there be a return address field in the wallet response, in case the order can't be processed?

Does that have to be in v1, or is that a v2 feature?

Does it need to be an address actually involved in the payment/swap/intent-execution? Maybe not? Privacy quandary.

Would be helpful to see documentation on wallet_pay implementation guidance for EVM injected providers in addition to the WalletConnect flow...

bumblefudge avatar Jul 31 '25 15:07 bumblefudge

I think this is complete enough to publish as a draft, but I'd love to see test cases long before last call if possible, to help along alternate implementations

++ I think it makes sense for us to get the draft in and then we can iterate as necessary from here

kdenhartog avatar Aug 01 '25 01:08 kdenhartog

@bumblefudge I think you guys have to merge it right?

lukaisailovic avatar Aug 01 '25 15:08 lukaisailovic

This CAIP standardizes a wallet_pay JSON-RPC method enabling one-click cryptocurrency payments across wallets and dapps. Allows merchants to specify multiple payment options (cross-chain assets) in a single request, letting wallets automatically select the optimal payment method based on user's available assets. Eliminates the current multi-step payment flow (select token → select chain → generate address → manual transfer) by moving payment choice and execution to the wallet, reducing friction from 4-6 clicks to a single interaction.

kelvin7891 avatar Nov 13 '25 12:11 kelvin7891